According to https://arxiv.org/pdf/1512.03385.pdf, I think a 1000-d fully connected softmax layer is missing at the end of the model.
Hi Ehsan. Maybe I’m not correct about this, so take it with a grain of salt. The ResNet architecture was trained on the ImageNet 2012 classification dataset, which has 1000 different classes; that is why the last fully connected layer they used is a 1000-FC. In the book, we are using a 10-class dataset for faster training times, which is why we use a 10-FC layer instead. Hope this helps!
Regarding the softmax function, I don’t know why it was skipped.
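
To make the point about the layer size concrete, here is a rough pure-Python sketch (not the book's code and not the paper's implementation). The names `fully_connected`, `softmax`, and `make_head`, and the feature size of 64, are made up for illustration. It only shows that the final FC layer's output size follows the number of classes in the dataset (1000 for ImageNet, 10 here), and that applying softmax turns the scores into probabilities without changing which class scores highest.

```python
import math
import random


def fully_connected(features, weights, biases):
    """Return one raw score (logit) per class: dot(weights[k], features) + biases[k]."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_head(num_features, num_classes, seed=0):
    """Hypothetical helper: random weights for a num_classes-way FC layer."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(num_features)]
               for _ in range(num_classes)]
    biases = [0.0] * num_classes
    return weights, biases


NUM_FEATURES = 64  # assumed size of the pooled feature vector, for illustration only
rng = random.Random(1)
features = [rng.uniform(0.0, 1.0) for _ in range(NUM_FEATURES)]

# 1000 classes as in ImageNet 2012, 10 classes as in the smaller dataset.
for num_classes in (1000, 10):
    weights, biases = make_head(NUM_FEATURES, num_classes)
    logits = fully_connected(features, weights, biases)
    probs = softmax(logits)
    print(num_classes, "classes ->", len(probs), "outputs, sum of probs:", round(sum(probs), 6))
    print("  top class from logits:", logits.index(max(logits)),
          "| top class from softmax:", probs.index(max(probs)))
```

The rest of the network stays the same in both cases; only the number of rows in the last weight matrix changes.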