I do understand the concept of a gradient from these examples, but how do I relate it to a neural network, for example? How do I compute gradients for a trained model and plot them, or check for exploding/vanishing gradients? What are the parameters in such a case?
How to understand gradient descent?
Hey @pdhimate, great questions! I hardly see anyone plot the gradients, but if your loss curve is heading in the wrong direction, it is highly likely that the weight parameters are exploding.
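To make this concrete, here is a minimal sketch of how you might compute per-parameter gradients in TensorFlow 2 and eyeball their norms (the toy model, layer sizes, and random data below are hypothetical placeholders, not something from your setup). The "parameters" in your question are `model.trainable_variables`, i.e. each layer's kernel (weights) and bias:

```python
import tensorflow as tf

# Toy model: its trainable variables are the parameters
# whose gradients we want to inspect.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])
loss_fn = tf.keras.losses.MeanSquaredError()

x = tf.random.normal((32, 10))  # dummy batch of inputs
y = tf.random.normal((32, 1))   # dummy targets

# Record the forward pass so gradients can be computed.
with tf.GradientTape() as tape:
    loss = loss_fn(y, model(x, training=True))

grads = tape.gradient(loss, model.trainable_variables)

# Print the L2 norm of each gradient: norms that blow up over training
# suggest exploding gradients; norms that shrink toward zero
# (especially in early layers) suggest vanishing gradients.
for var, grad in zip(model.trainable_variables, grads):
    print(var.name, float(tf.norm(grad)))
```

You could also log these values over training steps (e.g. with `tf.summary.histogram`) to get TensorBoard histograms, but tracking the norms alone is usually enough for a first check.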
Thank you. But I wanted to know exactly which plot we should be checking when we look for exploding or vanishing gradients. I am basically stuck here: https://stackoverflow.com/questions/63514062/tensorflow-v2-gradients-not-shown-on-tensorboard-histograms
Dear All,
I do not understand the relationship between the eigenvalues of the Hessian matrix and extrema (see the worked examples after the list below):
- When the eigenvalues of the function’s Hessian matrix at the zero-gradient position are all positive, we have a local minimum for the function.
- When the eigenvalues of the function’s Hessian matrix at the zero-gradient position are all negative, we have a local maximum for the function.
- When the eigenvalues of the function’s Hessian matrix at the zero-gradient position are mixed (some negative, some positive), we have a saddle point for the function.
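For concreteness, here are two small examples of my own (not from the book) illustrating the first and third cases at the zero-gradient point $(0, 0)$:

$$
f(x, y) = x^2 + y^2, \quad
H_f = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \quad
\lambda_1 = \lambda_2 = 2 > 0 \;\Rightarrow\; (0, 0) \text{ is a local minimum.}
$$

$$
g(x, y) = x^2 - y^2, \quad
H_g = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}, \quad
\lambda_1 = 2 > 0,\ \lambda_2 = -2 < 0 \;\Rightarrow\; (0, 0) \text{ is a saddle point.}
$$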
Thank you!
I found the answer (in part 11.2, Convexity).
Sorry for the inconvenience!