Network in Network (NiN)

https://d2l.ai/chapter_convolutional-modern/nin.html

Just out of curiosity, why do the training accuracy and loss per epoch differ when training on MXNet vs. PyTorch vs. TensorFlow?