Image Classification Dataset

I have tested that the num_workers parameter in the torch DataLoader does work. Setting num_workers=4 cut the read time roughly in half.

  1. batch size = 1: stochastic gradient descent (SGD)
    batch size = 256: mini-batch gradient descent (MBGD)
    Because the GPU reads the data in parallel, MBGD is quicker.
    Reducing the batch_size makes overall read performance slower.
    :face_with_monocle: Is my guess right?
  2. I’m a Windows user. Try it next time!
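The guess in point 1 can be checked empirically. A minimal sketch of the measurement, using a synthetic stand-in for Fashion-MNIST so nothing is downloaded (the tensor sizes and the read_time helper are my assumptions, not from the thread):

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for Fashion-MNIST: 5,000 fake 28x28 grayscale images.
X = torch.randn(5000, 1, 28, 28)
y = torch.randint(0, 10, (5000,))
dataset = TensorDataset(X, y)

def read_time(batch_size):
    """Time one full pass over the dataset at the given batch size."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    start = time.time()
    for Xb, yb in loader:
        pass  # reading only; no parameter updates involved
    return time.time() - start

t1 = read_time(1)      # 5,000 tiny batches: per-batch overhead dominates
t256 = read_time(256)  # 20 batches: the overhead is amortized
print(f"batch_size=1: {t1:.3f}s, batch_size=256: {t256:.3f}s")
```

On a typical machine batch_size=1 reads far slower, which matches the observation even without any GPU in the loop.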


I suggest using %%timeit -r1, a built-in Jupyter magic, instead of the d2l timer.

%%time is better. One run is enough :grinning:
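Outside a Jupyter cell, the standard-library timeit module gives the same single-run measurement; a small sketch (the summation is just a toy workload I made up for illustration):

```python
import timeit

# number=1 means a single run with no repeats: the script-level analogue
# of %%time, or of %%timeit -r1 with one loop.
elapsed = timeit.timeit(lambda: sum(i * i for i in range(100_000)), number=1)
print(f"one run took {elapsed:.4f}s")
```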

Hi friends,
I don't understand the resize argument.
I can't show the images after resizing.

Read again.

@StevenJokess you need to change the arguments when calling the show_images() method according to the batch_size and resize arguments you chose in the load_data_fashion_mnist() method

something like this
show_images(X.reshape(32, 64, 64), 2, 9, scale=1.5, titles=get_fashion_mnist_labels(y))
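For anyone puzzled by the numbers in that call, a small sketch of the shape bookkeeping (assuming batch_size=32 and resize=64 were passed to load_data_fashion_mnist()):

```python
import torch

# Fake batch with the shape load_data_fashion_mnist(32, resize=64) would
# yield: (batch, channel, height, width).
X = torch.randn(32, 1, 64, 64)

imgs = X.reshape(32, 64, 64)  # drop the channel dim for plotting
assert imgs.shape == (32, 64, 64)
# show_images(imgs, 2, 9, ...) then draws the first 2 * 9 = 18 of the
# 32 images in a 2-row, 9-column grid.
```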

For q1, I don’t think SGD or MBGD would affect the performance of reading the dataset, since reading has nothing to do with updating the parameters.
However, reading data really is slower when batch_size is set to 1. Maybe the I/O limitation of data reading is the reason for the difference?
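One way to see where the slowdown comes from: the number of batches the loader iterates over per epoch, and hence the per-batch Python and collation overhead, grows by orders of magnitude as batch_size shrinks. A sketch with a dummy 1-D dataset of 60,000 items (the Fashion-MNIST training-set size):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# A 1-D tensor is enough here; we only want to count batches.
ds = TensorDataset(torch.arange(60000))

print(len(DataLoader(ds, batch_size=1)))    # 60000 batches per epoch
print(len(DataLoader(ds, batch_size=256)))  # 235 batches per epoch
```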

In PyTorch, when loading the dataset, there is a warning, and I find I can’t use the dataset.
Running “mnist_train[0][0].shape” gives an error:
TypeError: array() takes 1 positional argument but 2 were given

How can I solve this? :pensive:


X, y = next(iter(data.DataLoader(data.TensorDataset(mnist_train.data, mnist_train.targets), batch_size=18)))
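A self-contained version of that pattern, with random tensors standing in for mnist_train.data / mnist_train.targets (the fake_* names are mine, not from the thread):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins for mnist_train.data (uint8 images) and mnist_train.targets.
fake_data = torch.randint(0, 256, (100, 28, 28), dtype=torch.uint8)
fake_targets = torch.randint(0, 10, (100,))

loader = DataLoader(TensorDataset(fake_data, fake_targets), batch_size=18)
X, y = next(iter(loader))  # first batch of 18 images and labels
print(X.shape, y.shape)    # torch.Size([18, 28, 28]) torch.Size([18])
```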


What does num_workers mean here? Does it use the CPU or the GPU? I got a RuntimeError after setting num_workers > 0 (e.g. 4); there is no problem with num_workers = 0.

RuntimeError: DataLoader worker (pid 30141) exited unexpectedly with exit code
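num_workers > 0 spawns that many CPU worker processes that run dataset.__getitem__ and feed batches back to the main process; the GPU is not involved in loading. On Windows the workers are started with the "spawn" method, which re-imports your script, so multi-worker loading needs a __main__ guard; a missing guard is one common cause of this RuntimeError. A minimal sketch (the dataset and sizes are made up):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def count_batches():
    ds = TensorDataset(torch.randn(1000, 1, 28, 28), torch.zeros(1000))
    # num_workers=2 starts two CPU worker processes for data loading.
    loader = DataLoader(ds, batch_size=256, num_workers=2)
    return sum(1 for _ in loader)

if __name__ == "__main__":
    # On Windows (and macOS), workers are spawned by re-importing this
    # module; without this guard, that re-import would try to launch the
    # loop again and the workers exit unexpectedly.
    print(count_batches())  # 1000 / 256 -> 4 batches
```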