Gradient Descent

https://d2l.ai/chapter_optimization/gd.html

Hi, I would like to ask a question about formula (11.3.12). Why do we take the gradient of the vector x instead of the function f(x)? Should it be "x ← x − η diag(H_f)^{−1} ∇f(x)" here? Thank you!
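For anyone following along, here is a minimal sketch of the preconditioned update the corrected formula describes, x ← x − η diag(H_f)^{−1} ∇f(x). The quadratic objective, step size, and function names below are my own illustration, not from the book:

```python
import numpy as np

def preconditioned_gd(grad, hess_diag, x0, eta=0.5, steps=20):
    """Gradient descent preconditioned by the diagonal of the Hessian:
    x <- x - eta * diag(H_f)^(-1) * grad f(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x - eta * grad(x) / hess_diag(x)
    return x

# Illustrative badly scaled quadratic: f(x) = x1^2 + 10 * x2^2.
grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
hess_diag = lambda x: np.array([2.0, 20.0])  # diagonal of the (constant) Hessian

x_star = preconditioned_gd(grad, hess_diag, x0=[5.0, -3.0])
```

Dividing each gradient coordinate by the corresponding Hessian diagonal entry rescales the two axes, so a single step size works for both the flat and the steep direction.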

Great catch @nxby! Would you like to post a PR and be a contributor?

Thank you @goldpiggy! I’ve just made a pull request.

Hi, I wonder if there is a typo in the sentence just below formula (11.3.11): "Plugging in the update equations leads to the following bound e_{k+1} <= e^2_k f'''(\xi_k)/f'(x_k)". Shouldn't it be "e_{k+1} <= \frac{1}{2} e^2_k f'''(\xi_k)/f''(x_k)"? Thanks a lot for your attention.
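A quick numerical sanity check of the quadratic convergence the corrected bound predicts: each error should be on the order of the square of the previous one. The test function f(x) = e^x − 2x (minimizer x* = ln 2) is my own choice for illustration:

```python
import math

def newton(f_prime, f_second, x0, steps=6):
    """Newton's method for minimization: x <- x - f'(x)/f''(x).
    Returns all iterates so the error decay can be inspected."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f_prime(x) / f_second(x))
    return xs

# f(x) = e^x - 2x, so f'(x) = e^x - 2 and f''(x) = e^x; minimizer at ln 2.
f_prime = lambda x: math.exp(x) - 2.0
f_second = lambda x: math.exp(x)

xs = newton(f_prime, f_second, x0=1.0)
errs = [abs(x - math.log(2.0)) for x in xs]
```

Printing `errs` shows each error roughly halving-then-squaring, consistent with e_{k+1} ≈ (1/2) e_k^2 f'''(ξ_k)/f''(x_k), since here f'''/f'' stays close to 1 near the minimizer.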

Thanks. I revised this part recently and made it slightly different from the previous version. Just let me know if you spot any issues.

The Peano remainder R_n of the Taylor expansion had one extra power, which was wrong.
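For reference, a sketch of the expansion in question: with the Peano form, the remainder is of the same order as the last retained term,

```latex
f(x + \epsilon) = \sum_{k=0}^{n} \frac{f^{(k)}(x)}{k!}\,\epsilon^{k} + o(\epsilon^{n}),
```

so writing the remainder as o(\epsilon^{n+1}) claims one order more accuracy than the expansion actually guarantees.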