Optimization and Deep Learning

https://d2l.ai/chapter_optimization/optimization-intro.html

I understand the concept of a gradient from these examples, but how do I relate it to a neural network, for example? How do I compute the gradients of a trained model and plot or check them for exploding/vanishing gradients? What are the parameters in such a case?

How to understand gradient descent?

Hey @pdhimate, great questions! I rarely see anyone plot the gradients directly, but if your loss plot is heading in the wrong direction, it is highly likely that the weight parameters are exploding.
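If you do want to inspect the gradients directly, here is a minimal sketch (PyTorch, with a hypothetical toy model) that prints the per-parameter gradient norms after a backward pass; norms that blow up, or shrink toward zero layer by layer, are the signal of exploding or vanishing gradients:

```python
import torch
import torch.nn as nn

# A tiny model just for illustration (hypothetical architecture).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()

x = torch.randn(64, 10)
y = torch.randn(64, 1)

loss = loss_fn(model(x), y)
loss.backward()  # populates .grad on every parameter

# The "parameters" here are the weight and bias tensors of each layer.
for name, p in model.named_parameters():
    print(f"{name}: grad norm = {p.grad.norm():.4e}")
```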

Hey @Mark_Fayuan, we elaborate on gradient descent here.
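As a bare-bones illustration (all constants below are arbitrary choices, not recommendations), gradient descent just repeatedly steps against the derivative:

```python
# Minimal 1-D gradient descent sketch: minimize f(x) = x**2.
def f_grad(x):
    return 2 * x  # derivative of x**2

x, eta = 10.0, 0.1  # starting point and learning rate
for step in range(25):
    x -= eta * f_grad(x)  # move opposite to the gradient
print(f"x after 25 steps: {x:.6f}")  # approaches the minimum at 0
```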

Thank you. But I wanted to know exactly which plot we should check when looking for exploding or vanishing gradients. I am basically stuck here: https://stackoverflow.com/questions/63514062/tensorflow-v2-gradients-not-shown-on-tensorboard-histograms
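Regarding the linked question: in TF2 the Keras TensorBoard callback no longer writes gradient histograms, so one workaround is a custom training loop that logs them yourself. A sketch, assuming a hypothetical toy model and log directory:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.MeanSquaredError()
writer = tf.summary.create_file_writer("logs/grads")  # hypothetical log dir

x = tf.random.normal((64, 10))
y = tf.random.normal((64, 1))

for step in range(100):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    # Write one histogram per variable; view them under the
    # "Histograms" tab in TensorBoard.
    with writer.as_default():
        for var, g in zip(model.trainable_variables, grads):
            tf.summary.histogram(f"grad/{var.name}", g, step=step)
```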

Dear All,
I do not understand the relationship between the eigenvalues of the Hessian matrix and the type of extremum.

  • When the eigenvalues of the function’s Hessian matrix at the zero-gradient position are all positive, we have a local minimum for the function.
  • When the eigenvalues of the function’s Hessian matrix at the zero-gradient position are all negative, we have a local maximum for the function.
  • When the eigenvalues of the function’s Hessian matrix at the zero-gradient position are a mix of negative and positive, we have a saddle point for the function.

Thank you!
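The connection comes from the second-order Taylor expansion: at a zero-gradient point $\mathbf{x}^*$, we have $f(\mathbf{x}^* + \boldsymbol{\epsilon}) \approx f(\mathbf{x}^*) + \frac{1}{2}\boldsymbol{\epsilon}^\top \mathbf{H} \boldsymbol{\epsilon}$, so the signs of the eigenvalues of $\mathbf{H}$ tell you whether $f$ increases or decreases along each eigendirection. A small NumPy sketch with two toy functions to make the cases concrete:

```python
import numpy as np

# Saddle: f(x, y) = x**2 - y**2 has zero gradient at (0, 0)
# and constant Hessian [[2, 0], [0, -2]].
H_saddle = np.array([[2.0, 0.0],
                     [0.0, -2.0]])
print(np.linalg.eigvalsh(H_saddle))  # [-2.  2.] -> mixed signs: saddle point

# Minimum: f(x, y) = x**2 + y**2 has constant Hessian [[2, 0], [0, 2]].
H_min = np.array([[2.0, 0.0],
                  [0.0, 2.0]])
print(np.linalg.eigvalsh(H_min))  # [2. 2.] -> all positive: local minimum
```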

I got the answer (in Section 11.2, Convexity).
Sorry for the inconvenience!