Calculus

mli · May 22, 2020, 3:44am

http://d2l.ai/chapter_preliminaries/calculus.html

gpk2000 · August 25, 2020, 12:45pm

I think it would be better to add a note that you need to install these packages before running this code, such as in the code block below. Anaconda comes with these pre-installed but Miniconda doesn’t. I am saying this because the book suggested downloading Miniconda.

gpk2000 · August 25, 2020, 12:57pm

Also, in that example this didn’t work:

from d2l import mxnet as d2l

This works:

import mxnet as d2l

goldpiggy · September 3, 2020, 5:50pm

Hey @gpk2000, thanks for the suggestion about Anaconda. We recommend installing Miniconda because it has most of the necessary libraries but is not as heavyweight as Anaconda.

As for your question, please make sure you have the latest version of D2L and MXNet installed.
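A quick way to check the installation before retrying the import is sketched below. It uses only the standard library; the helper name `report` is made up for this illustration, and the distribution name for MXNet is an assumption (GPU builds are often published under a different name, in which case the version shows as "unknown"). Note that `import mxnet as d2l` only binds the MXNet package to the name `d2l`, so the book's `d2l` helper functions would not be available through it.

```python
from importlib import metadata, util


def report(packages=("d2l", "mxnet")):
    """Map each package name to its version, "unknown", or None if missing.

    `report` is a hypothetical helper for this thread, not part of D2L.
    """
    status = {}
    for name in packages:
        # find_spec on a top-level name does not import the package.
        if util.find_spec(name) is None:
            status[name] = None
            continue
        try:
            status[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but installed under another distribution name.
            status[name] = "unknown"
    return status


for name, version in report().items():
    print(name, "->", version if version else "not installed")
```

If `d2l` shows as not installed, `from d2l import mxnet as d2l` cannot work in that environment regardless of the MXNet version.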

smizerex · September 7, 2020, 11:13pm

Is exercise 4 in section 2.4 possible?
How would we do it?

goldpiggy · September 8, 2020, 8:20pm

Hey @smizerex, yes, the chain rule can be applied here. Feel free to share your ideas and discuss them here.

asadalam · September 11, 2020, 12:07am

Is the first part of the answer for Q.4 in 2.4 the following? du/da = (du/dx)(dx/da) + (du/dy)(dy/da) + (du/dz)(dz/da)

goldpiggy · September 11, 2020, 7:55pm

Hi @asadalam, I believe you are right!
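For anyone who wants to sanity-check the formula above numerically, here is a minimal sketch. The functions `x`, `y`, `z` and `f` are arbitrary smooth examples chosen for illustration (they are not from the book), and all derivatives are estimated with central differences rather than computed exactly.

```python
import math


# Hypothetical inner functions x(a, b), y(a, b), z(a, b); any smooth choice works.
def x(a, b):
    return a * b


def y(a, b):
    return a + b


def z(a, b):
    return a - 2 * b


# Hypothetical outer function f(x, y, z).
def f(x_val, y_val, z_val):
    return x_val * y_val + math.sin(z_val)


def u(a, b):
    """The composite u(a, b) = f(x(a, b), y(a, b), z(a, b))."""
    return f(x(a, b), y(a, b), z(a, b))


def partial(fn, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of fn w.r.t. args[i]."""
    hi, lo = list(args), list(args)
    hi[i] += h
    lo[i] -= h
    return (fn(*hi) - fn(*lo)) / (2 * h)


def du_da_chain(a, b):
    """du/da = (du/dx)(dx/da) + (du/dy)(dy/da) + (du/dz)(dz/da)."""
    point = (x(a, b), y(a, b), z(a, b))
    return sum(
        partial(f, point, i) * partial(inner, (a, b), 0)
        for i, inner in enumerate((x, y, z))
    )


a, b = 0.7, -1.3
direct = partial(u, (a, b), 0)  # differentiate the composite directly
chain = du_da_chain(a, b)       # assemble the same derivative via the chain rule
print(direct, chain)
```

The two printed numbers should agree up to finite-difference error. Replacing index `0` with `1` in `du_da_chain` gives the analogous check for du/db.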

anant_jain · September 25, 2020, 11:20am

Hey, can you please provide a deeper explanation of the matrix differentiation rules given?

goldpiggy · September 25, 2020, 9:26pm

Hi @anant_jain, great question. The math appendix covers this in more depth, for example in https://d2l.ai/chapter_appendix-mathematics-for-deep-learning/multivariable-calculus.html

Reza_Afra · October 9, 2020, 7:06pm

I just wanted to thank the authors of this book. This book is a godsend!

Abhinav_Raj · October 25, 2020, 5:34pm

http://d2l.ai/chapter_preliminaries/calculus.html#gradients
In the above section, how is the derivative defined? I may be wrong, but I think the derivative of Ax w.r.t. x should be A instead of A transpose, assuming the usual definitions of matrix differentiation. I did it in a similar way to Matrix Derivative (page 5).
Thanks in advance.

goldpiggy · October 26, 2020, 9:23pm

Hi @Abhinav_Raj, great question! There are two layouts in matrix calculus: numerator layout and denominator layout, and we are using the denominator layout. See the “Layout conventions” section of https://en.wikipedia.org/wiki/Matrix_calculus for more details.
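To make the two layouts mentioned above concrete, the sketch below estimates the derivative of f(x) = Ax by finite differences and arranges the same numbers both ways. The matrix `A`, the point `x0`, and the helper names are arbitrary choices for illustration, and plain Python lists are used so nothing beyond the standard library is assumed.

```python
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]  # m = 2 rows, n = 3 columns (arbitrary example values)


def matvec(M, v):
    """Matrix-vector product M v for a list-of-rows matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def transpose(M):
    return [list(row) for row in zip(*M)]


def jacobian(fn, v, h=1e-6):
    """Numerator layout: entry [i][j] approximates d fn(v)[i] / d v[j]."""
    columns = []
    for j in range(len(v)):
        hi, lo = list(v), list(v)
        hi[j] += h
        lo[j] -= h
        columns.append([(p - q) / (2 * h) for p, q in zip(fn(hi), fn(lo))])
    # columns[j][i] holds d f_i / d v_j, so transpose to index as [i][j].
    return transpose(columns)


x0 = [0.5, -1.0, 2.0]
J_num = jacobian(lambda v: matvec(A, v), x0)  # m x n, approximately A
J_den = transpose(J_num)                      # n x m, approximately A^T

print("numerator layout:  ", J_num)
print("denominator layout:", J_den)
```

Both arrays contain the same partial derivatives. Only the arrangement differs, which is why one convention reports A and the other reports its transpose.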

Abhinav_Raj · October 27, 2020, 5:11am

@goldpiggy, I see, thanks for clearing it up. So, basically we lay out y as a row vector and then calculate the gradient in the usual way (column-wise). Also, is there some specific reason for choosing this convention, say that it makes calculations easier or that it is how it’s done in MXNet/PyTorch/TF?
Thanks again.

goldpiggy · November 2, 2020, 9:32pm

Hey @Abhinav_Raj, it doesn’t matter that much for programming. The difference is whether the result is a row vector or a column vector, but both are just vectors in code.

shrza · April 28, 2021, 1:56am

http://d2l.ai/chapter_preliminaries/calculus.html#gradients
In the above section the gradient was defined for a function f: ℝⁿ → ℝ in (2.4.9),
but 𝗔𝘅 and 𝘅ᵀ𝗔 are column and row vectors respectively, so in these cases f is a map ℝⁿ → ℝᵐ. How do you define ∇ₓf(𝘅) in this case? Is it the transposed Jacobian?
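One reading that is consistent with the denominator-layout reply earlier in this thread is sketched below. It is an interpretation rather than a definition quoted from the book: the symbol ∇ₓ applied to a vector-valued f is taken to mean the transpose of the usual m × n Jacobian.

```latex
% Assumption: denominator layout, so entry (i, j) of the result is \partial f_j / \partial x_i.
\begin{align}
\nabla_{\mathbf{x}} \mathbf{f}(\mathbf{x})
  &= \left[ \frac{\partial f_j(\mathbf{x})}{\partial x_i} \right]_{i = 1, \ldots, n;\; j = 1, \ldots, m}
   = \left( \frac{\partial \mathbf{f}}{\partial \mathbf{x}} \right)^{\top}
   \in \mathbb{R}^{n \times m}, \\
\mathbf{f}(\mathbf{x}) = \mathbf{A}\mathbf{x},\ \mathbf{A} \in \mathbb{R}^{m \times n}:
  \quad f_j &= \sum_{k=1}^{n} a_{jk} x_k
  \;\Rightarrow\; \frac{\partial f_j}{\partial x_i} = a_{ji}
  \;\Rightarrow\; \nabla_{\mathbf{x}} \mathbf{A}\mathbf{x} = \mathbf{A}^{\top}, \\
\mathbf{f}(\mathbf{x}) = \mathbf{x}^{\top}\mathbf{A},\ \mathbf{A} \in \mathbb{R}^{n \times m}:
  \quad f_j &= \sum_{k=1}^{n} x_k a_{kj}
  \;\Rightarrow\; \frac{\partial f_j}{\partial x_i} = a_{ij}
  \;\Rightarrow\; \nabla_{\mathbf{x}} \mathbf{x}^{\top}\mathbf{A} = \mathbf{A}.
\end{align}
```

Under this reading, the results A transpose and A fall out of the index bookkeeping alone, and for m = 1 the expression reduces to the ordinary gradient of a scalar function.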

MucsanyiBalint · July 12, 2021, 10:33am

I may be wrong, but I believe that the nabla (or del) operator is not appropriate for the derivatives of Ax and x^TA, as it is used for gradient vectors, whereas these functions map R^n -> R^m, so their derivatives are matrices (Jacobian matrices). I think denoting these by ∂/∂x (Ax) and ∂/∂x (x^TA) would be better. The same could be said for the squared Frobenius norm at the end.

If my understanding is right, I will happily contribute. If not, at least I learned something new.

Def255 · July 25, 2021, 12:53pm

display.set_matplotlib_formats('svg') is deprecated. It is recommended to use matplotlib_inline.backend_inline.set_matplotlib_formats() instead.
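A minimal sketch of the replacement is below. It assumes a Jupyter environment with Matplotlib installed; the wrapper name `use_svg_display` and the fallback to the older IPython call are choices made for this illustration, not a statement of how the book's code is written.

```python
def use_svg_display():
    """Ask Jupyter to render Matplotlib figures as SVG.

    Prefers the matplotlib_inline API and falls back to the deprecated
    IPython.display call only when matplotlib_inline cannot be imported.
    """
    try:
        from matplotlib_inline import backend_inline
        backend_inline.set_matplotlib_formats('svg')
    except ImportError:
        from IPython import display
        display.set_matplotlib_formats('svg')


# In a notebook cell, call use_svg_display() once before plotting.
```

Keeping the imports inside the function means the definition itself loads on machines without Jupyter, and only calling it requires those packages.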