16 replies
Jun '20

StevenJokes

lr, num_epochs = 0.01, 10
d2l.train_ch6(net, train_iter, test_iter, num_epochs, lr)

training on cpu


No other results? No plot?

Why is it so slow?

1 reply
Aug '20 ▶ StevenJokes

ChenYangyao

Please use a GPU for deep nets. A CPU can be ~100 times slower than a GPU.

For AlexNet on Fashion-MNIST, a GPU takes roughly 20 seconds per epoch, which means a CPU would take about 2000 seconds, i.e. roughly 30 minutes, per epoch.
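A minimal sketch of moving the training loop onto a GPU with the d2l utilities (assuming net, train_iter, and test_iter are defined as in the chapter, and that your d2l version's train_ch6 takes a device argument):

    from d2l import torch as d2l

    # try_gpu() returns the first GPU if one is available, otherwise the CPU
    device = d2l.try_gpu()

    lr, num_epochs = 0.01, 10
    d2l.train_ch6(net, train_iter, test_iter, num_epochs, lr, device)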

2 replies
Aug '20 ▶ ChenYangyao

StevenJokes

@ChenYangyao Thanks for your reply. I don’t have any GPUs; I’m using Colab for learning.
And right now I can’t find any internships because of my finance undergraduate degree.

1 reply
Aug '20 ▶ StevenJokes

Nish

@StevenJokes I think Colab lets you use a GPU for free? Runtime -> Change runtime type -> choose ‘GPU’ under Hardware Accelerator
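After switching the runtime, a quick sanity check in a notebook cell (plain PyTorch) confirms the GPU is actually visible:

    import torch

    print(torch.cuda.is_available())          # should print True
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))  # name of the GPU Colab allocated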

1 reply
Aug '20

StevenJokes

@Nish
I already know about it, but the session disconnects after 12 hours. :mask:
You can also find out why Colab is not so friendly in my discussion:
http://d2l.ai/chapter_appendix-tools-for-deep-learning/colab.html

And there is an issue with Python 3.6:

1 reply
Aug '20 ▶ StevenJokes

Nish

Aha, I see. Maybe you can try Kaggle’s free kernels (kaggle.com)? I think they allow about 30 hours of GPU use per week for their contests.

1 reply
Aug '20 ▶ Nish

StevenJokes

I’ll try. @Nish
Thanks.

Jul '21 ▶ ChenYangyao

Rameen

Hi!
If I switch to a different dataset (for example, a dataset of animal images), how should I change the statement below?

train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size, resize=224)

Any suggestions?

1 reply
Aug '21

DXD

For Exercise 4, I’m guessing the dominant part of computation is the block of fully connected layers at the end. Is the dominant part of the memory footprint the largest convolution layer?

Aug '21 ▶ Rameen

fanbyprinciple

If you are using a custom dataset, you might have to subclass torch.utils.data.Dataset to create a custom dataset class. See: Writing Custom Datasets, DataLoaders and Transforms — PyTorch Tutorials 1.9.0+cu102 documentation
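A minimal sketch of such a class; the folder layout, label list, and the class name AnimalDataset here are hypothetical, so adapt them to your data:

    import os
    from PIL import Image
    from torch.utils.data import Dataset, DataLoader
    from torchvision import transforms

    class AnimalDataset(Dataset):
        def __init__(self, img_dir, labels, resize=224):
            # labels: list of (filename, class_index) pairs
            self.img_dir = img_dir
            self.labels = labels
            self.transform = transforms.Compose([
                transforms.Resize((resize, resize)),
                transforms.ToTensor()])

        def __len__(self):
            return len(self.labels)

        def __getitem__(self, idx):
            fname, label = self.labels[idx]
            img = Image.open(os.path.join(self.img_dir, fname)).convert('RGB')
            return self.transform(img), label

    # Build the iterators the same way load_data_fashion_mnist does internally:
    # train_iter = DataLoader(AnimalDataset('train/', train_labels), batch_size, shuffle=True)
    # test_iter = DataLoader(AnimalDataset('test/', test_labels), batch_size, shuffle=False)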

Aug '21

fanbyprinciple

Exercises

  1. Try increasing the number of epochs. Compared with LeNet, how are the results different? Why?

     • Finally was able to “train” it, but the training curve looks more like a heartbeat than anything.

  2. AlexNet may be too complex for the Fashion-MNIST dataset.

     1. Try simplifying the model to make the training faster, while ensuring that the accuracy does not drop significantly.

        • Done, but the accuracy still has not improved.

     2. Design a better model that works directly on 28 × 28 images.

        • Done.

  3. Modify the batch size, and observe the changes in accuracy and GPU memory.

  4. Analyze the computational performance of AlexNet.

     • Any idea how to do it? (A rough sketch follows after this post.)

     1. What is the dominant part for the memory footprint of AlexNet?

        • The linear layers and the 3 conv layers.

     2. What is the dominant part for computation in AlexNet?

        • The 3 conv layers.

     3. How about memory bandwidth when computing the results?

        • More than 100 MB.

  5. Apply dropout and ReLU to LeNet-5. Does it improve? How about preprocessing?

     lenet_5 = nn.Sequential(
         nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.ReLU(),
         nn.AvgPool2d(kernel_size=2, stride=2),
         nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(),
         nn.AvgPool2d(kernel_size=2, stride=2), nn.Flatten(),
         nn.Linear(16 * 5 * 5, 120), nn.ReLU(), nn.Dropout(p=0.5),
         nn.Linear(120, 84), nn.ReLU(), nn.Linear(84, 10))

     It is not training well either.

Finally able to train; here is the network summary:

It took me 6 min 21 s to reach an accuracy of 88.7% in just 3 epochs.
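On the computational-performance exercise above: a rough starting point, assuming the Sequential AlexNet-style model from this section is stored in a variable named net, is to walk through the layers printing output shapes and parameter counts, and to time a forward pass:

    import time
    import torch

    # Per-layer output shapes and parameter counts for a single 224 x 224 image
    X = torch.randn(1, 1, 224, 224)
    for layer in net:
        X = layer(X)
        n_params = sum(p.numel() for p in layer.parameters())
        print(f'{layer.__class__.__name__:12s} output {tuple(X.shape)}  params {n_params}')

    # Crude timing of a forward pass over a batch of 128 images
    X = torch.randn(128, 1, 224, 224)
    start = time.time()
    with torch.no_grad():
        net(X)
    print(f'forward pass: {time.time() - start:.3f} s')

The parameter counts tend to concentrate in the fully connected layers, while the largest activations come from the early convolution layers, which is one way to check the answers above.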

Feb '22

thainq

While learning about CNNs, the question always on my mind is: how do you design a good architecture, and how many filters and layers are enough?

Aug '23

pandalabme

My solutions to the exercises: 8.1

Sep '23

dingcurie

class AlexNet(d2l.Classifier):
    def __init__(self, lr=0.1, num_classes=10):
        super().__init__()
        self.save_hyperparameters()
        self.net = nn.Sequential(
            nn.LazyConv2d(96, kernel_size=11, stride=4, padding=1),
            nn.ReLU(), nn.MaxPool2d(kernel_size=3, stride=2),
            ...
            nn.LazyLinear(4096), nn.ReLU(), nn.Dropout(p=0.5),
            nn.LazyLinear(num_classes))
        self.net.apply(d2l.init_cnn)     # <-----

The last line has no effect since a dry run has not been performed yet and hence the lazy layers have not been reified into regular (non-lazy) layers.
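A possible workaround, sketched here as an assumption rather than the book's official recipe: run a dummy batch through the model first so the lazy layers are reified, and only then apply the initializer (the shape below assumes Fashion-MNIST resized to 224 × 224):

    import torch

    model = AlexNet(lr=0.01)
    # Dry run with a dummy batch forces the Lazy* layers to infer their shapes
    model(torch.zeros(1, 1, 224, 224))
    # Now the weights exist, so the initializer actually touches them
    model.net.apply(d2l.init_cnn)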

Jun '24

Mohamed_Ahmed_Naji

Why is the ReLU activation not applied after MaxPool2d? Would that save memory and computation in forward and backward propagation?

Also, what does the line “self.net.apply(d2l.init_cnn)” in the AlexNet class do?
Will it not be executed?

Sep '24

Juu_Hu

self.net.apply(d2l.init_cnn)

It seems that this line will not work, since the layer dimensions are not initialized until input data is passed through the network for the first time.

My solution is:

1. Define an init method in the class:

  def para_init(self):
      self.net.apply(d2l.init_cnn)

2. Run a forward pass first, then initialize the parameters:

  model = AlexNet(lr=0.01)
  data = d2l.FashionMNIST(batch_size=128, resize=(224, 224))
  model(next(iter(data.get_dataloader(True)))[0])
  model.para_init()