Calculus

mli · June 9, 2020, 8:33pm

http://d2l.ai/chapter_preliminaries/calculus.html

Diachrony · October 27, 2020, 5:17am

Exercise 1

The derivative of:

$x^3-\frac{1}{x}$
is:

$3x^2-\left(-\frac{1}{x^2}\right)=3x^2+\frac{1}{x^2}$

At x = 1 we get y = 0, and the tangent has a slope of 4.
The tangent line therefore has an intercept of -4, and its equation is y = 4x - 4.
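
As a quick sanity check of the slope and intercept above, here is a minimal sketch in plain Python. The helper names (`df`, `central_difference`) are my own and not from the book; the finite-difference step size is an arbitrary choice.

```python
def f(x):
    return x**3 - 1 / x

def df(x):
    # Analytic derivative derived above: 3x^2 + 1/x^2
    return 3 * x**2 + 1 / x**2

def central_difference(func, x, h=1e-5):
    # Numerical approximation of the derivative (hypothetical helper)
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 1.0
slope = df(x0)                      # slope of the tangent at x0
intercept = f(x0) - slope * x0      # from y = slope * x + intercept at (x0, f(x0))
numeric_slope = central_difference(f, x0)

print(slope, intercept, numeric_slope)
```

The analytic and numerical slopes should agree to several decimal places, which supports the tangent line y = 4x - 4.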

To plot:

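import numpy as np

def f(x):
    return x**3 - 1 / x

# f is undefined at x = 0 (division by zero), so start the range at x = 0.1
x = np.arange(0.1, 3, 0.1)
# plot is the helper function defined in the chapter notebook
fig = plot(x, [f(x), 4 * x - 4], 'x', 'f(x)', legend=['f(x)', 'Tangent line (x=1)'])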

The image will be:
[image: 2_Prelim 4_Calc 1_Ex]

To save this image add this line to the beginning of the plot function:

    fig = d2l.plt.figure()

and a return to the end:

    return fig

and this after making the plot:

fig.savefig("2_Prelim 4_Calc 1_Ex.jpg")

Diachrony · October 27, 2020, 7:02am

Exercise 2

$\nabla_{\mathbf{x}}f(\mathbf{x})=\bigg[6x_1,5e^{x_2}\bigg]^\top$
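
To double-check this gradient numerically, here is a small sketch. It assumes the exercise's function is f(x) = 3x_1^2 + 5e^{x_2} (taken from the book's exercise, since it is not restated in this post); `numeric_grad` and the test point are my own choices.

```python
import math

def f(x1, x2):
    # Assumed function from the exercise: 3*x1^2 + 5*exp(x2)
    return 3 * x1**2 + 5 * math.exp(x2)

def grad_f(x1, x2):
    # Analytic gradient from the answer above
    return [6 * x1, 5 * math.exp(x2)]

def numeric_grad(func, point, h=1e-6):
    # Central differences, one coordinate at a time (hypothetical helper)
    grads = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grads.append((func(*up) - func(*down)) / (2 * h))
    return grads

point = [1.5, 0.5]
analytic = grad_f(*point)
numeric = numeric_grad(f, point)
print(analytic, numeric)
```

The two lists should match closely at any test point, not just the one picked here.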

Diachrony · October 28, 2020, 4:09am

I appreciate any corrections!

Exercise 3

$\nabla_{\mathbf{x}}\|\mathbf{x}\|_2=\frac{\mathbf{x}}{\|\mathbf{x}\|_2}$

Explanation: $\|\mathbf{x}\|_2=\sqrt{\mathbf{x}^{\top}\mathbf{x}}$

Use the chain rule with $u=\mathbf{x}^{\top}\mathbf{x}$:

$\frac{\partial}{\partial u}u^{\frac{1}{2}}\cdot\frac{\partial}{\partial\mathbf{x}}\mathbf{x}^{\top}\mathbf{x}$

$=\left(\frac{1}{2}u^{-\frac{1}{2}}\right)(2\mathbf{x})=\left(\frac{1}{2}\frac{1}{\|\mathbf{x}\|_2}\right)(2\mathbf{x})=\frac{\mathbf{x}}{\|\mathbf{x}\|_2}$
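
A rough numerical check of this result, written in plain Python. The function names and the test vector are my own, and the formula only makes sense for a nonzero vector, since the norm is not differentiable at the origin.

```python
import math

def l2_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def grad_l2_norm(x):
    # Analytic gradient x / ||x||_2, valid only for x != 0
    n = l2_norm(x)
    return [xi / n for xi in x]

def numeric_grad(func, x, h=1e-6):
    # Central differences on a function that takes a list (hypothetical helper)
    grads = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grads.append((func(up) - func(down)) / (2 * h))
    return grads

x = [3.0, 4.0]                 # ||x||_2 = 5
analytic = grad_l2_norm(x)     # expected to be close to [0.6, 0.8]
numeric = numeric_grad(l2_norm, x)
print(analytic, numeric)
```

Note that the gradient always has unit length, which is a handy way to remember it.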

Exercise 4

$\frac{\partial{u}}{\partial{a}}=\frac{\partial{u}}{\partial{x}}\cdot\frac{\partial{x}}{\partial{a}}+\frac{\partial{u}}{\partial{y}}\cdot\frac{\partial{y}}{\partial{a}}+\frac{\partial{u}}{\partial{z}}\cdot\frac{\partial{z}}{\partial{a}}$

$\frac{\partial{u}}{\partial{b}}=\frac{\partial{u}}{\partial{x}}\cdot\frac{\partial{x}}{\partial{b}}+\frac{\partial{u}}{\partial{y}}\cdot\frac{\partial{y}}{\partial{b}}+\frac{\partial{u}}{\partial{z}}\cdot\frac{\partial{z}}{\partial{b}}$
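
To see these two formulas in action, here is a sketch with made-up concrete functions (the exercise itself leaves them abstract): u = x^2 + yz with x = ab, y = a + b, z = a - b. All of these choices are hypothetical and only serve to compare the chain rule against a direct finite-difference estimate.

```python
def u_of_ab(a, b):
    # Composite function u(x(a, b), y(a, b), z(a, b)) for the made-up example
    x, y, z = a * b, a + b, a - b
    return x**2 + y * z

def chain_rule_partials(a, b):
    x, y, z = a * b, a + b, a - b
    # Partials of u = x^2 + y*z with respect to x, y, z
    du_dx, du_dy, du_dz = 2 * x, z, y
    # Partials of x = a*b, y = a + b, z = a - b with respect to a and b
    dx_da, dy_da, dz_da = b, 1.0, 1.0
    dx_db, dy_db, dz_db = a, 1.0, -1.0
    du_da = du_dx * dx_da + du_dy * dy_da + du_dz * dz_da
    du_db = du_dx * dx_db + du_dy * dy_db + du_dz * dz_db
    return du_da, du_db

def numeric_partials(a, b, h=1e-6):
    # Central differences applied directly to the composite function
    du_da = (u_of_ab(a + h, b) - u_of_ab(a - h, b)) / (2 * h)
    du_db = (u_of_ab(a, b + h) - u_of_ab(a, b - h)) / (2 * h)
    return du_da, du_db

a, b = 2.0, 3.0
chain = chain_rule_partials(a, b)
numeric = numeric_partials(a, b)
print(chain, numeric)
```

For this example the composite simplifies to a^2 b^2 + a^2 - b^2, so the partials can also be verified by hand.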

Pronton2001 · February 1, 2021, 6:38am

Your solution is so brief and correct. Thank you

guyo13 · July 27, 2023, 5:05am

I think part (2.4.8) needs clarification.
The first bullet point states that for any matrix A in R^(m x n) and vector x in R^n, the gradient of Ax is A^T.
What does that mean? Do we treat A as a linear transformation from R^n to R^m? How do you take the gradient of that, and why does it equal A^T (which is in R^(n x m))? I missed the generalization from how the gradient was defined (a vector of partial derivatives of a function whose domain is R^n and whose range is R).
Moving on, I didn't understand how x^T can be multiplied by A. x^T has shape (1 x n) and A has shape (m x n), so the dimensions don't match.
The second bullet point mentions the gradient of the squared norm of x. I understand this is equivalent to x^T I x, with I the (n x n) identity, but again I don't understand the notation: it seems we are taking the gradient of a scalar (||x||^2 is a scalar, so how can we take its gradient?).
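
One possible reading, offered as an assumption rather than as the book's stated convention: for a vector-valued function, the "gradient" is the transpose of the Jacobian, so each column holds the gradient of one output component. Under that reading, the shapes can be checked with the sketch below. The helper names and the example matrix are hypothetical. Note also that x^T A is only defined when A has n rows, so that rule presumably needs A in R^(n x m). As for ||x||^2, it is a scalar-valued function of the vector x, so its gradient with respect to x is a vector.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def jacobian(func, x, h=1e-6):
    # Finite-difference Jacobian: J[i][j] approximates d func(x)[i] / d x[j]
    m = len(func(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        f_up, f_down = func(up), func(down)
        for i in range(m):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return J

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]            # m = 2 rows, n = 3 columns
x = [0.5, -1.0, 2.0]             # vector in R^3

J = jacobian(lambda v: matvec(A, v), x)   # 2 x 3, approximately A
grad_Ax = transpose(J)                    # 3 x 2, approximately A^T

# Squared norm returns a single number, wrapped in a list to reuse jacobian
sq_norm = lambda v: [sum(vi * vi for vi in v)]
grad_sq = transpose(jacobian(sq_norm, x)) # 3 x 1 column, approximately 2x

print(grad_Ax)
print(grad_sq)
```

With this convention the gradient of a scalar-valued function has the same shape as x, and the gradient of an R^m-valued function has shape n x m.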