Network in Network (NiN)

https://d2l.ai/chapter_convolutional-modern/nin.html

Just out of curiosity, why do the training accuracy and loss per epoch differ when training on MXNet vs. PyTorch vs. TensorFlow?

After a few hours of trial runs, hyperparameter tuning produced these parameters with the best accuracy: