Implementation of Multilayer Perceptron from Scratch

http://d2l.ai/chapter_multilayer-perceptrons/mlp-scratch.html

So we are estimating about 200k parameters (785x256 + 257x10, counting the biases) with only 60k training examples. Is that overfitting?
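
For reference, here is a quick sketch that checks the parameter count in the question. The layer sizes (784 inputs, 256 hidden units, 10 outputs) are assumed from the linked d2l section, and `count_params` is a hypothetical helper written for this post, not part of the d2l code.

```python
# Layer sizes assumed from the linked d2l section (Fashion-MNIST MLP).
num_inputs, num_hiddens, num_outputs = 784, 256, 10
num_train = 60_000  # number of training examples mentioned in the question


def count_params(sizes):
    """Count weights plus biases of a fully connected net with these layer sizes."""
    # Each layer has fan_in * fan_out weights and fan_out biases.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


total = count_params([num_inputs, num_hiddens, num_outputs])
print(total)              # 203530
print(total / num_train)  # parameters per training example
```

That works out to roughly 3.4 parameters per training example. The count by itself does not tell us whether the model overfits; that would show up as a gap between training and test accuracy.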