Dropout

coder-mtj · May 5, 2025, 10:29am

Homework submission

Question 1

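from torch import nn

# The concise implementation just defines the model as a simple nn.Sequential net.
# dropout1 and dropout2 are the two dropout probabilities, assumed to be defined beforehand.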
net = nn.Sequential(
        nn.Flatten(),
        nn.Linear(784, 256),
        nn.ReLU(),
        # Add a dropout layer after the first fully connected layer
        nn.Dropout(dropout1),
        nn.Linear(256, 256),
        nn.ReLU(),
        # Add a dropout layer after the second fully connected layer
        nn.Dropout(dropout2),
        nn.Linear(256, 10))

def init_weights(m):
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)

net.apply(init_weights);

Before swapping the dropout probabilities
After swapping the dropout probabilities

Question 2

With the hidden layers

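Something is putting on an act here: in the last few epochs, just as the run without the hidden layers looked set to beat the previous one, it started acting up.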

BlueberryPie-kyle · September 21, 2025, 9:32am

I recommend reading the original English text here:
In standard dropout regularization, one zeros out some fraction of the nodes in each layer and then debiases each layer by normalizing by the fraction of nodes that were retained (not dropped out).
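That is: in standard dropout regularization, a fraction of the nodes in each layer is zeroed out, and each layer is then debiased by dividing by the fraction of nodes that were retained.
Dividing by 1 - p is exactly that division by the retained fraction: it scales the surviving outputs up so that the expected total output of the layer matches what it would be without dropout.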
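I don't know who did the Chinese translation, but it reads as if it was not written by a human. Would a person really come up with "intermediate activity values" (中间活性值) and "retained node score" (保留节点分数)?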
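
For anyone who wants to check the 1 - p scaling numerically, here is a minimal sketch in plain Python. It is not the book's PyTorch implementation: the function name `dropout_layer`, the all-ones input, the sample size, and the seed are all assumptions made only for illustration.

```python
import random


def dropout_layer(values, p, rng):
    """Inverted dropout sketch: zero each value with probability p,
    then divide the survivors by the retained fraction (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    # rng.random() is uniform on [0, 1), so each value is kept with probability `keep`.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)          # fixed seed (assumption) so the run is repeatable
activations = [1.0] * 100_000   # hypothetical layer output, all ones for easy reading
p = 0.5                         # drop probability

dropped = dropout_layer(activations, p, rng)

mean_before = sum(activations) / len(activations)
mean_after = sum(dropped) / len(dropped)
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)

print("mean before dropout:", mean_before)
print("mean after dropout: ", mean_after)
print("fraction retained:  ", kept_fraction)
```

About half of the values are zeroed, the survivors become 1 / (1 - p) = 2, and the mean after dropout stays close to the mean before it, which is what "debiasing" refers to in the English text.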

peanut-hua · October 16, 2025, 2:57am

How do you use one-dimensional convolutions for time series forecasting? Has anyone done this?