Hi @akhil_teja, x is a vector, i.e., x = [x1, x2]^T.
Hi, does D2L provide a way to validate or check our solutions to the exercises?
Discussion is the only way now.
@rammy_vadlmudi
Hey @rammy_vadlamudi, yes! This discussion forum is a great way to share your thoughts and discuss the solutions. Feel free to voice them!
Hey guys, hope you are all doing well. I found this course today and it's quite interesting. I'm completing it in Python, which I'm learning mostly for machine learning and AI applications. I've also been learning how to use AWS SageMaker and cloud services. I wanted to ask a question about finding the gradient of the function in question 2. Is it possible to define a function like
import numpy as np
def f(x):  # x is a list, e.g. [x1, x2]
    return 3 * x[0]**2 + 5 * np.exp(x[1])
and then apply the numerical_limit function with the following parameters (f = f(x), x = [1, 1], h = 0.01),
returning a list by looping through each index of the list x = [1, 1]?
Or is this logic too dumb?
I'd appreciate it if you guys could help me.
I studied math in the past, but I don't know how to code it in the most modern and efficient way x)
Thanks in advance.
Hi @Luis_Ramirez, your logic is never dumb! In most DL frameworks, we decompose a complex function into directly differentiable steps and then apply the chain rule (i.e., we define all the derivative formulas in code and chain them together). Check https://d2l.ai/chapter_preliminaries/autograd.html for more details. Besides, if you would like to see how to code it from scratch, check here. Let me know if it helps!
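The looping idea from the question can be sketched in plain Python. This is only an illustration under a few assumptions, not the book's code: `numerical_gradient` is a hypothetical helper name, it uses a central difference with step `h`, and `math.exp` stands in for `np.exp` so the snippet has no dependencies.

```python
import math


def f(x):
    """f(x) = 3 * x1**2 + 5 * exp(x2), with x given as a list [x1, x2]."""
    return 3 * x[0] ** 2 + 5 * math.exp(x[1])


def numerical_gradient(f, x, h=1e-5):
    """Approximate the gradient of f at x, one coordinate at a time.

    Hypothetical helper: for each index i it uses the central difference
    (f(x + h*e_i) - f(x - h*e_i)) / (2*h).
    """
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


point = [1.0, 1.0]
approx = numerical_gradient(f, point)
# Analytic gradient for comparison: [6 * x1, 5 * exp(x2)]
exact = [6 * point[0], 5 * math.exp(point[1])]
print(approx, exact)
```

A finite-difference loop like this is handy for checking a hand-derived gradient, while automatic differentiation (as described in the reply above) is what frameworks typically rely on for larger models.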
Try adding this line to the top of the plot function:
fig = d2l.plt.figure()
and have the plot function
return fig
then:
def f(x):
    return x**3 - 1/x
x = np.arange(0.1, 3, 0.1)
fig = plot(x, [f(x), 4 * x-4], 'x', 'f(x)', legend=['f(x)', 'Tangent line (x=1)'])
fig.savefig("2_Prelim 4_Calc 1_Ex.jpg")
- Q1
hello,
I tried my code:
import torch
x = torch.arange(2.0)
x.requires_grad_(True)
x.grad
y = 3 * torch.dot(x,x) + 5 * torch.exp(x)
y
y.backward()
x.grad
Is this OK?
Hi, I'm looking for some clarification on this excerpt from the very end of Section 2.4.3:
Similarly, for any matrix $\mathbf{X}$, we have $\nabla_{\mathbf{X}} \|\mathbf{X}\|_F^2 = 2\mathbf{X}$.
Does this mean that for a matrix of any size, filled with m*n variables, the gradient of the squared norm of that matrix can be condensed to 2X?
Also, what does the subscript F denote in this case?
Thanks!
Hi, I just wanted to verify my solutions for the provided exercise questions:
- Find the gradient of the function $f(\mathbf{x}) = 3x_1^2 + 5e^{x_2}$
(Substituting y for x2, as I assumed x1 != x2)
f'(x) = 6x + 5e^y
- What is the gradient of the function $f(\mathbf{x}) = \|\mathbf{x}\|_2$
||x||_2 = [ (3x^2)^2 + (5e^y)^2 ]^0.5
(Calculating the Euclidean distance using the Pythagorean Theorem)
||x|| = ( 9x^4 + 25e^(2y) ) ^ 0.5
f'( ||x|| ) = ( 18x^3 + 25e^(2y) ) / ( 9x^4 + 25e^(2y) ) ^ 0.5
- Can you write out the chain rule for the case where $u = f(x, y, z)$, $x = x(a, b)$, $y = y(a, b)$, and $z = z(a, b)$?
Is this meant to be simplified to df/dx * (dx/da + dx/db) and so on for y, and z?
Thanks so much, and I apologise if my answers are completely misguided.
- Find the gradient of the function $f(\mathbf{x}) = 3x_1^2 + 5e^{x_2}$
x1/df = 6x + 5e^x2
x2/df = 52e^x2
∇_x f(x) = [6x + 5e^x2, 52e^x2]
Here is a picture (not sure if it's correct).
I'm confused about partial derivatives. Since for partial derivatives we can treat all other variables as constants, shouldn't the derivative vector be [6x_1, 5e^x_2]?
∂f/∂x_1 = ∂/∂x_1 (3x_1^2) + DC = 6x_1 + 0 = 6x_1 (C being a constant)
∂f/∂x_2 = DC + ∂/∂x_2 (5e^x_2) = 0 + 5e^x_2 = 5e^x_2
I believe the F implies the Frobenius Norm:
http://d2l.ai/chapter_preliminaries/linear-algebra.html?highlight=norms
I'm not clear on what the notation implies when there is both a subscript F and a superscript 2. The text reads as if the Frobenius norm is always the square root of the sum of its squared matrix elements, so the superscript should always be 2. Is this understanding incorrect?
Exercise 2:
∇f(x) = [6x1, 5e^x2]
Exercise 3:
f(x) = (x1² + x2² + … + xn²)^(1/2)
∇f(x) = x/f(x)
Exercise 4:
u = f(x,y,z), x = x(a,b), y = y(a,b), z = z(a,b)
du/da = (du/dx)(dx/da) + (du/dy)(dy/da) + (du/dz)(dz/da)
du/db = (du/dx)(dx/db) + (du/dy)(dy/db) + (du/dz)(dz/db)
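The answers to Exercises 3 and 4 above can be sanity-checked numerically. The sketch below is only an illustration: `partial` is a hypothetical central-difference helper, and the concrete functions `f`, `x_of`, `y_of`, and `z_of` are made-up examples, since the exercise leaves them abstract.

```python
import math


def partial(func, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of func w.r.t. args[i]."""
    up = list(args)
    down = list(args)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)


# Exercise 3: the gradient of the L2 norm should equal x / ||x|| (for x != 0).
def norm(*v):
    return math.sqrt(sum(c * c for c in v))


point = [3.0, 4.0]
norm_grad_numeric = [partial(norm, point, i) for i in range(len(point))]
norm_grad_formula = [c / norm(*point) for c in point]  # [0.6, 0.8]


# Exercise 4: hypothetical example functions, chosen only for illustration.
def f(x, y, z):
    return x * y + math.sin(z)


def x_of(a, b):
    return a * b


def y_of(a, b):
    return a + b ** 2


def z_of(a, b):
    return a - b


def u(a, b):
    return f(x_of(a, b), y_of(a, b), z_of(a, b))


a, b = 0.5, 1.5
xyz = [x_of(a, b), y_of(a, b), z_of(a, b)]
inner = [x_of, y_of, z_of]

# du/da is the sum over k in {x, y, z} of (du/dk)(dk/da); likewise for b.
du_da_chain = sum(partial(f, xyz, k) * partial(g, [a, b], 0) for k, g in enumerate(inner))
du_db_chain = sum(partial(f, xyz, k) * partial(g, [a, b], 1) for k, g in enumerate(inner))

# Differentiate the composed function directly for comparison.
du_da_direct = partial(u, [a, b], 0)
du_db_direct = partial(u, [a, b], 1)
print(norm_grad_numeric, norm_grad_formula)
print(du_da_chain, du_da_direct)
print(du_db_chain, du_db_direct)
```

Note that the derivatives with respect to a and b stay separate: each one is its own sum of three products, rather than a single expression with (dx/da + dx/db) factored out.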
The superscript 2 means you are squaring the Frobenius norm. So, the square root in the Frobenius norm disappears.
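Once the square root is gone, the squared Frobenius norm is just the sum of the squared entries, so differentiating with respect to any single entry x_ij gives 2 * x_ij, and collecting those entries gives 2X for a matrix of any shape. A minimal numerical check, assuming nested lists for the matrix and hypothetical helper names (`frobenius_sq`, `numerical_matrix_gradient`):

```python
def frobenius_sq(X):
    """Squared Frobenius norm: the sum of the squares of all entries of X."""
    return sum(v * v for row in X for v in row)


def numerical_matrix_gradient(func, X, h=1e-6):
    """Central-difference gradient of a scalar function w.r.t. each entry of X."""
    grad = []
    for i, row in enumerate(X):
        grad_row = []
        for j in range(len(row)):
            up = [list(r) for r in X]
            down = [list(r) for r in X]
            up[i][j] += h
            down[i][j] -= h
            grad_row.append((func(up) - func(down)) / (2 * h))
        grad.append(grad_row)
    return grad


# Arbitrary 2 x 3 example matrix (values chosen only for illustration).
X = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.5]]
grad = numerical_matrix_gradient(frobenius_sq, X)
two_X = [[2 * v for v in row] for row in X]
print(grad)
print(two_X)
```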
I ran into an issue when running the code below in PyTorch.
x = np.arange(0, 3, 0.1)
plot(x, [f(x), 2 * x - 3], 'x', 'f(x)', legend=['f(x)', 'Tangent line (x=1)'])
Thanks @zgpeace for raising this. I believe it was recently deprecated but shouldn't error out. You can try with an older version of ipython. In any case, we'll fix this in the next release: https://github.com/d2l-ai/d2l-en/pull/2065
For question one:
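import numpy as np

def f(x):
    return x ** 3 - 1.0 / x

def df(x):
    # Analytic derivative of f
    return 3 * x ** 2 + 1 / (x * x)

def tangentLine(x, x0):
    """x is the array of inputs; x0 is the point at which we compute the tangent line."""
    y0 = f(x0)
    a = df(x0)        # slope of the tangent at x0
    b = y0 - a * x0   # intercept, so that the line passes through (x0, y0)
    return a * x + b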
x = np.arange(0.1, 3, 0.1)
plot(x, [f(x), tangentLine(x, 2.1)], 'x', 'f(x)', legend=['f(x)', 'Tangent line (x=2.1)'])
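Before plotting, the tangent line can be checked without any plotting library. The sketch below restates the functions in plain Python (with a hypothetical `tangent_line` written in point-slope form) and compares the analytic slope with a central-difference estimate at x0 = 1, the point the exercise asks about.

```python
def f(x):
    return x ** 3 - 1.0 / x


def df(x):
    # The derivative of x**3 - 1/x is 3*x**2 + 1/x**2.
    return 3 * x ** 2 + 1.0 / (x * x)


def tangent_line(x, x0):
    """Value at x of the line tangent to f at x0: f(x0) + f'(x0) * (x - x0)."""
    return f(x0) + df(x0) * (x - x0)


x0 = 1.0
h = 1e-6
slope_numeric = (f(x0 + h) - f(x0 - h)) / (2 * h)
print(df(x0), slope_numeric)         # analytic vs. numerical slope at x0
print(tangent_line(x0, x0), f(x0))   # the line touches the curve at x0
print(tangent_line(2.0, x0))         # agrees with 4 * x - 4 evaluated at x = 2
```

For x0 = 1 this reproduces the line 4 * x - 4 used in the earlier plotting snippet in this thread.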