Concise Implementation of Multilayer Perceptron

Can someone explain why there is a sudden dip in the plot?


I guess it is caused by accidentally testing on something the model hasn't learned well yet…

I want to know too. My sudden dip is different from yours.

According to that scale, that dip is a loss of accuracy of around 0.05.

After that epoch, the test set simply behaved differently under the updated parameters. By chance, an update can improve accuracy on the training set while reducing accuracy on the test set.

Where that dip happens (and whether it happens at all) depends on the initial values of your parameters.
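If you want to check this for yourself, here is a rough sketch in plain Python. It is not the MLP from the chapter: it assumes a tiny logistic-regression model on made-up noisy 2-D data, and the names `make_data`, `train`, and `largest_test_dip` are hypothetical helpers written just for this illustration. The seed passed to `train` controls both the initial weights and the shuffle order, and the script reports where the biggest epoch-to-epoch drop in test accuracy lands for each seed.

```python
import math
import random


def make_data(n, rng):
    """Hypothetical noisy 2-D binary classification data (an assumption)."""
    data = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y = 1 if x1 + x2 > 0 else 0
        if rng.random() < 0.15:  # flip 15% of the labels as noise
            y = 1 - y
        data.append((x1, x2, y))
    return data


def accuracy(w, data):
    """Fraction of examples whose predicted class matches the label."""
    correct = sum(
        ((w[0] * x1 + w[1] * x2 + w[2]) > 0) == (y == 1)
        for x1, x2, y in data
    )
    return correct / len(data)


def train(init_seed, train_set, test_set, epochs=10, lr=0.5):
    """Per-example SGD; returns one (train_acc, test_acc) pair per epoch."""
    rng = random.Random(init_seed)  # seed sets initial weights and shuffling
    w = [rng.gauss(0, 1) for _ in range(3)]
    order = list(train_set)
    history = []
    for _ in range(epochs):
        rng.shuffle(order)
        for x1, x2, y in order:
            z = w[0] * x1 + w[1] * x2 + w[2]
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            grad = 1.0 / (1.0 + math.exp(-z)) - y  # d(loss)/dz for log loss
            w[0] -= lr * grad * x1
            w[1] -= lr * grad * x2
            w[2] -= lr * grad
        history.append((accuracy(w, train_set), accuracy(w, test_set)))
    return history


def largest_test_dip(history):
    """Return (epoch, drop) for the biggest fall in test accuracy.

    Epochs are 1-based. A non-positive drop means test accuracy never fell.
    """
    drops = [
        (history[i - 1][1] - history[i][1], i + 1)
        for i in range(1, len(history))
    ]
    drop, epoch = max(drops)
    return epoch, drop


data_rng = random.Random(0)
train_set = make_data(200, data_rng)
test_set = make_data(100, data_rng)

for seed in (1, 2, 3):
    history = train(seed, train_set, test_set)
    epoch, drop = largest_test_dip(history)
    print(f"seed={seed}: largest test-accuracy drop {drop:.3f} at epoch {epoch}")
```

Comparing the two numbers in each `history` entry also shows whether training accuracy went up in the same epoch where test accuracy went down.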

To me, that seems close to the meaning of ‘overfit’.