I do understand the concept of a gradient from these examples, but how do I relate it to a neural network, for example? How do I compute gradients for a trained model and plot them, or check for exploding/vanishing gradients? What are the parameters in such a case?
How to understand gradient descent?
Hey @pdhimate, great questions! I hardly see anyone plot the gradients, but if your loss curve is heading in the wrong direction, it is highly likely that the weight parameters are exploding.
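To make this concrete, here is a minimal sketch of how you might compute per-parameter gradients in TensorFlow 2 and eyeball their norms (the toy model, layer sizes, and random data below are hypothetical placeholders, not something from your setup). The "parameters" in your question are `model.trainable_variables`, i.e. each layer's kernel (weights) and bias:

```python
import tensorflow as tf

# Toy model: its trainable variables are the parameters
# whose gradients we want to inspect.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])
loss_fn = tf.keras.losses.MeanSquaredError()

x = tf.random.normal((32, 10))  # dummy batch of inputs
y = tf.random.normal((32, 1))   # dummy targets

# Record the forward pass so gradients can be computed.
with tf.GradientTape() as tape:
    loss = loss_fn(y, model(x, training=True))

grads = tape.gradient(loss, model.trainable_variables)

# Print the L2 norm of each gradient: norms that blow up over training
# suggest exploding gradients; norms that shrink toward zero
# (especially in early layers) suggest vanishing gradients.
for var, grad in zip(model.trainable_variables, grads):
    print(var.name, float(tf.norm(grad)))
```

You could also log these values over training steps (e.g. with `tf.summary.histogram`) to get TensorBoard histograms, but tracking the norms alone is usually enough for a first check.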
Thank you. But I wanted to know exactly which plot we should be checking when we look for exploding or vanishing gradients. I am basically stuck here: https://stackoverflow.com/questions/63514062/tensorflow-v2-gradients-not-shown-on-tensorboard-histograms
Dear All,
I do not understand the relationship between the eigenvalues of the Hessian matrix and extrema (see the worked examples after the list below):
- When the eigenvalues of the function’s Hessian matrix at the zero-gradient position are all positive, we have a local minimum for the function.
- When the eigenvalues of the function’s Hessian matrix at the zero-gradient position are all negative, we have a local maximum for the function.
- When the eigenvalues of the function’s Hessian matrix at the zero-gradient position are mixed (some negative, some positive), we have a saddle point for the function.
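For concreteness, here are two small examples of my own (not from the book) illustrating the first and third cases at the zero-gradient point $(0, 0)$:

$$
f(x, y) = x^2 + y^2, \quad
H_f = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \quad
\lambda_1 = \lambda_2 = 2 > 0 \;\Rightarrow\; (0, 0) \text{ is a local minimum.}
$$

$$
g(x, y) = x^2 - y^2, \quad
H_g = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}, \quad
\lambda_1 = 2 > 0,\ \lambda_2 = -2 < 0 \;\Rightarrow\; (0, 0) \text{ is a saddle point.}
$$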
Thank you!
I found the answer (in part 11.2, Convexity).
Sorry for the inconvenience!