Image Classification Dataset

I have tested that the num_workers parameter in the PyTorch DataLoader does work: setting num_workers=4 cut the read time in half.
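A quick sketch to reproduce this comparison. The dataset here is a synthetic in-memory stand-in for Fashion-MNIST (an assumption, to avoid a download), so it measures only DataLoader overhead; with real disk-backed datasets the num_workers speedup is usually larger.

```python
import time
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for Fashion-MNIST images and labels (an assumption).
data = TensorDataset(torch.randn(10000, 1, 28, 28),
                     torch.randint(0, 10, (10000,)))

def read_time(num_workers, batch_size=256):
    loader = DataLoader(data, batch_size=batch_size, num_workers=num_workers)
    start = time.time()
    for X, y in loader:   # just iterate, the same way the book times reads
        pass
    return time.time() - start

if __name__ == "__main__":   # guard needed for worker processes on Windows
    for w in (0, 4):
        print(f"num_workers={w}: {read_time(w):.2f} sec")
```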

  1. batch size = 1, stochastic gradient descent (SGD)
    batch size = 256, mini-batch gradient descent (MBGD)
    Because the GPU reads data in parallel, MBGD is quicker.
    Reducing the batch_size makes overall read performance slower.
    :face_with_monocle: Is my guess right?
  2. I’m a Windows user. Try it next time!


I suggest using %%timeit -r1, which is a built-in magic command in Jupyter, instead of the d2l timer.

%%time is better. One run is enough :grinning:
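For reference, %%timeit -r1 is roughly a wrapper around the standard-library timeit module with repeat=1, while %%time runs the cell once. A plain-Python equivalent (the workload below is a placeholder, an assumption; in the chapter it would be iterating once over train_iter):

```python
import timeit

def workload():
    # Placeholder workload (an assumption), standing in for one pass
    # over the data iterator.
    return sum(i * i for i in range(10_000))

# repeat=1 mirrors the -r1 flag: a single repeat of `number` runs,
# reporting the average time per run.
best = min(timeit.repeat(workload, repeat=1, number=10)) / 10
print(f"{best * 1e3:.3f} ms per loop")
```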

Hi friends,
I don't understand the resize argument.
I can't show the images after resizing.

Read again.

@StevenJokess you need to change the arguments when calling the show_images() method according to the batch_size and resize arguments you chose in the load_data_fashion_mnist() method

Something like this:
show_images(X.reshape(32, 64, 64), 2, 9, scale=1.5, titles=get_fashion_mnist_labels(y))
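To see why the reshape above uses 64: with resize=64, each 28x28 image is scaled to 64x64 before batching, so the batch must be reshaped accordingly. A minimal sketch, with F.interpolate standing in for the Resize transform and a fake batch (both assumptions):

```python
import torch
import torch.nn.functional as F

# Fake batch of 32 single-channel 28x28 images (an assumption).
X = torch.rand(32, 1, 28, 28)

# resize=64 amounts to scaling each image up to 64x64 before batching.
X64 = F.interpolate(X, size=(64, 64), mode="bilinear", align_corners=False)
print(X64.shape)  # torch.Size([32, 1, 64, 64])
# hence: show_images(X64.reshape(32, 64, 64), 2, 9, ...)
```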

For q1, I don’t think SGD or MBGD affects the performance of reading the dataset, since reading has nothing to do with updating the parameters.
However, reading really is slower when batch_size is set to 1. Maybe the I/O limitation of data reading is the reason for the difference?
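One way to probe this: with synthetic in-memory data (an assumption, so there is no disk I/O at all), batch_size=1 is still far slower, which suggests per-batch Python and collation overhead, not just I/O, drives the gap.

```python
import time
import torch
from torch.utils.data import DataLoader, TensorDataset

# In-memory synthetic data (an assumption): removes disk I/O from the picture.
data = TensorDataset(torch.randn(10000, 1, 28, 28),
                     torch.randint(0, 10, (10000,)))

def read_time(batch_size):
    start = time.time()
    for X, y in DataLoader(data, batch_size=batch_size):
        pass
    return time.time() - start

# batch_size=1 pays the per-batch overhead 10,000 times;
# batch_size=256 pays it only ~40 times.
print(f"batch_size=1:   {read_time(1):.2f}s")
print(f"batch_size=256: {read_time(256):.2f}s")
```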