Convolutional Neural Networks (LeNet)

Q1: What happens if the average pooling layers are replaced with max pooling?
A1: A quick comparison shows that, relative to max pooling, average pooling overfits less here and gives slightly higher test accuracy.
Avg Pooling, epoch 10, lr 0.9: loss 0.465, train acc 0.825, test acc 0.793
Max Pooling, epoch 10, lr 0.9: loss 0.432, train acc 0.838, test acc 0.776
Avg Pooling, epoch 20, lr 0.9: loss 0.356, train acc 0.867, test acc 0.857
Max Pooling, epoch 20, lr 0.9: loss 0.316, train acc 0.883, test acc 0.849
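
For reference, a minimal sketch of how the two runs differ, assuming the chapter's LeNet and d2l.train_ch6; only the pooling class is swapped between runs:

from torch import nn
from d2l import torch as d2l

def lenet(pool=nn.AvgPool2d):
    # Same as the chapter's LeNet; pass nn.MaxPool2d for the other variant
    return nn.Sequential(
        nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
        pool(kernel_size=2, stride=2),
        nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
        pool(kernel_size=2, stride=2),
        nn.Flatten(),
        nn.Linear(16 * 5 * 5, 120), nn.Sigmoid(),
        nn.Linear(120, 84), nn.Sigmoid(),
        nn.Linear(84, 10))

batch_size, lr, num_epochs = 256, 0.9, 10
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size=batch_size)
d2l.train_ch6(lenet(nn.AvgPool2d), train_iter, test_iter, num_epochs, lr, d2l.try_gpu())
d2l.train_ch6(lenet(nn.MaxPool2d), train_iter, test_iter, num_epochs, lr, d2l.try_gpu())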

Q2: Try to build a more complex network based on LeNet to improve its accuracy.

# The model is constructed as follows
changed_net = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=5, padding=2), nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    # nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    # nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    # nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(32 * 3 * 3, 128), nn.Sigmoid(),
    nn.Linear(128, 64), nn.Sigmoid(),
    nn.Linear(64, 32), nn.Sigmoid(),
    nn.Linear(32, 10)
)

The network structure is as follows:

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1            [-1, 8, 28, 28]             208
              ReLU-2            [-1, 8, 28, 28]               0
         AvgPool2d-3            [-1, 8, 14, 14]               0
            Conv2d-4           [-1, 16, 14, 14]           1,168
              ReLU-5           [-1, 16, 14, 14]               0
         AvgPool2d-6             [-1, 16, 7, 7]               0
            Conv2d-7             [-1, 32, 7, 7]           4,640
              ReLU-8             [-1, 32, 7, 7]               0
         AvgPool2d-9             [-1, 32, 3, 3]               0
          Flatten-10                  [-1, 288]               0
           Linear-11                  [-1, 128]          36,992
          Sigmoid-12                  [-1, 128]               0
           Linear-13                   [-1, 64]           8,256
          Sigmoid-14                   [-1, 64]               0
           Linear-15                   [-1, 32]           2,080
          Sigmoid-16                   [-1, 32]               0
           Linear-17                   [-1, 10]             330
================================================================
Total params: 53,674
Trainable params: 53,674
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.19
Params size (MB): 0.20
Estimated Total Size (MB): 0.40
----------------------------------------------------------------
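
The summary above can be reproduced with torchsummary (an assumed tool choice; a simple forward-pass shape loop gives the same information):

from torchsummary import summary  # pip install torchsummary
summary(changed_net, input_size=(1, 28, 28), device='cpu')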

Avg Pooling, epoch 20, lr 0.6: loss 0.308, train acc 0.885, test acc 0.863

Q3: Try the improved network above on the MNIST dataset.

# Use the MNIST dataset
import torchvision
from torch.utils import data
from torchvision import transforms
from d2l import torch as d2l

batch_size = 256  # assumed; the original snippet relied on a batch_size defined earlier
trans = transforms.ToTensor()
mnist_train = torchvision.datasets.MNIST(
    root="../data", train=True, transform=trans, download=True)
mnist_test = torchvision.datasets.MNIST(
    root="../data", train=False, transform=trans, download=True)
train_iter = data.DataLoader(mnist_train, batch_size, shuffle=True,
                             num_workers=d2l.get_dataloader_workers())
test_iter = data.DataLoader(mnist_test, batch_size, shuffle=True,
                            num_workers=d2l.get_dataloader_workers())
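
Training on MNIST is then the same call as before (a sketch, assuming changed_net from Q2 and d2l.train_ch6):

lr, num_epochs = 0.6, 10
d2l.train_ch6(changed_net, train_iter, test_iter, num_epochs, lr, d2l.try_gpu())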

Final result: Avg Pooling, epoch 10, lr 0.6: loss 0.064, train acc 0.981, test acc 0.970

Q4: Display the activations of LeNet's first and second layers for different inputs (for example, sweaters and coats).
In train_ch6, between X, y = X.to(device), y.to(device) and y_hat = net(X), add the following code:

# Channel 1 of the first 9 samples after the first Sigmoid (net[0:2]): shape (9, 28, 28)
x_first_Sigmoid_layer = net[0:2](X)[0:9, 1, :, :]
d2l.show_images(x_first_Sigmoid_layer.reshape(9, 28, 28).cpu().detach(), 1, 9)
# Channel 1 of the first 9 samples after the second Sigmoid (net[0:5]): shape (9, 10, 10)
x_second_Sigmoid_layer = net[0:5](X)[0:9, 1, :, :]
d2l.show_images(x_second_Sigmoid_layer.reshape(9, 10, 10).cpu().detach(), 1, 9)
# d2l.plt.show()

The feature maps after the first Sigmoid (top) and the second Sigmoid (bottom) are shown below:
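
As noted further down in the thread, plotting inside train_ch6 produces figures on every call. A sketch that produces the same plots once after training instead, assuming the trained LeNet net and the Fashion-MNIST test_iter from this chapter:

X, y = next(iter(test_iter))
X = X.to(d2l.try_gpu())  # train_ch6 leaves the net on this device
d2l.show_images(net[0:2](X)[0:9, 1, :, :].reshape(9, 28, 28).cpu().detach(), 1, 9)
d2l.show_images(net[0:5](X)[0:9, 1, :, :].reshape(9, 10, 10).cpu().detach(), 1, 9)
d2l.plt.show()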




After switching to Adam with a learning rate of 0.01, about 50 epochs of training gave train_acc = 0.926 and test_acc = 0.877; beyond that it clearly overfits.

This should probably be added after the epoch loop; otherwise the two activation layers get plotted on every call.

After replacing Sigmoid with ReLU, the output after the second ReLU activation:

net3 = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.ReLU(),  # 16 * 5 * 5 = 400 flattened features
    nn.Linear(120, 84), nn.ReLU(),
    nn.Linear(84, 10)
)
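
A quick shape check (a sketch with a dummy 28x28 input) confirms that the flattened feature size feeding the first Linear layer is 16 * 5 * 5 = 400:

import torch
X = torch.rand((1, 1, 28, 28), dtype=torch.float32)
for layer in net3:
    X = layer(X)
    print(layer.__class__.__name__, 'output shape:\t', X.shape)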
Why does it look like this after changing the Sigmoid activations to ReLU?


But just replacing the Sigmoids after the convolutional layers with ReLU already gives slightly better results than the original.


After import matplotlib.pyplot as plt, calling plt.show() will display the figures.

I have the same problem. Did you solve it?

I solved it with this method.

If you have not solved this problem, you can try searching for a solution on the Internet.


Without xavier_uniform, the test accuracy dropped from 0.82 to 0.64. Surprising that initialization alone makes this much of a difference.
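
For context, the book's train_ch6 applies Xavier initialization to the weights before training, roughly as in this sketch:

from torch import nn

def init_weights(m):
    # Xavier (Glorot) uniform initialization for conv and fully connected layers
    if type(m) in (nn.Linear, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)

net.apply(init_weights)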

I'm not sure what you mean. Which question are you referring to specifically, Q4?

I tried changing it the way you suggested, but I still can't watch the curve update live; a single figure only appears at the end.

This way the figure also only appears at the end. What should I do to keep seeing the training curve update continuously?

I ran into this problem when running the code. Did you manage to solve it in the end?

Why does the first convolutional layer use 6 output channels when the input image has only 1 channel? I am curious how the values of these 6 channels are obtained.

lr, weight_decay, num_epochs = 0.1, 0, 80
net = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Flatten(),  # 28x28 -> 14x14 -> 7x7, so 16 * 7 * 7 flattened features
    nn.Linear(16 * 7 * 7, 256), nn.ReLU(), nn.Dropout(p=0.6),
    nn.Linear(256, 128), nn.ReLU(), nn.Dropout(p=0.6),
    nn.Linear(128, 10)  # 10 output classes
)

loss 0.166, train acc 0.939, test acc 0.921
That is the best test accuracy I have obtained. Dropout works far better than weight_decay in this example.
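
For comparison, weight decay would be passed to the optimizer rather than written into the network definition (a sketch, assuming the lr and weight_decay values defined above):

import torch
optimizer = torch.optim.SGD(net.parameters(), lr=lr, weight_decay=weight_decay)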


You can also add the following two lines after creating train_iter and test_iter:
train_iter.num_workers=0
test_iter.num_workers=0
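
An alternative sketch: force single-process loading before the iterators are built, so that d2l.load_data_fashion_mnist picks it up (this monkeypatches d2l.get_dataloader_workers, which the loader calls internally):

from d2l import torch as d2l
d2l.get_dataloader_workers = lambda: 0  # avoid multiprocessing workers (e.g. on Windows)
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size=256)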


I changed the kernel sizes of the two convolutional layers and the test accuracy reached 100%. What is going on?

pooling_Layer = nn.AvgPool2d(kernel_size=2, stride=2)
# pooling_Layer = nn.MaxPool2d(kernel_size=2, stride=2)
activation_Layer = nn.Sigmoid()
# activation_Layer = nn.ReLU()
net = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=6, padding=2),
    activation_Layer,
    pooling_Layer,

    nn.Conv2d(6, 16, kernel_size=4),
    activation_Layer,
    pooling_Layer,

    nn.Flatten(),  # flatten the feature maps before the MLP

    nn.Linear(16 * 5 * 5, 120),
    activation_Layer,

    nn.Linear(120, 84),
    activation_Layer,

    nn.Linear(84, 10)
)
X = torch.rand((1, 1, 28, 28), dtype=torch.float32)
for layer in net:
    X = layer(X)
    print(layer.__class__.__name__, 'output shape:\t', X.shape)
print('=' * 10)

batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size=batch_size)
lr, num_epochs = 0.8, 16
train_ch6(net, train_iter, test_iter, num_epochs, lr, d2l.try_gpu())


I was running the code and found this error occurred, and the graph is not drawn. How can I solve this problem?

Print the output shape of each layer; when you adjust or add convolutional layers, make sure each layer's shape lines up with the previous one.
