Layers and Blocks

https://zh-v2.d2l.ai/chapter_deep-learning-computation/model-construction.html

"For example, the first fully connected layer in our model above takes input of arbitrary dimension, but returns an output of dimension 256."

Why does it say the input is of arbitrary dimension? Shouldn't the input be of shape (n, 20)?
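One plausible reading of "arbitrary dimension" (assuming the PyTorch version, where the first layer is nn.Linear(20, 256)): only the last dimension is constrained to 20, while any number of leading batch dimensions is accepted. A minimal check:

import torch
from torch import nn

layer = nn.Linear(20, 256)
print(layer(torch.randn(2, 20)).shape)     # torch.Size([2, 256])
print(layer(torch.randn(5, 3, 20)).shape)  # torch.Size([5, 3, 256]): leading dims are free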

        # Here, `block` is an instance of a `Module` subclass. We save it in the
        # member variable `_children` of the `Module` class. The type of `block`
        # is OrderedDict.
        self._modules[block] = block

Questions about the MySequential example code above:

  1. Is `_children` in the comment a typo? It should be `_modules`.
  2. Why is `_modules` defined as an OrderedDict? Wouldn't a plain list be enough?

I also felt that using `_modules` should be the same as using a list, but after implementing it both ways, the same input X produces different outputs, and I don't know why...


import torch
from torch import nn

class MySequential(nn.Module):
    def __init__(self, *args):
        super().__init__()
        self.sequential = []  # plain Python list holding the blocks
        for bk in args:
            self.sequential.append(bk)

    def forward(self, X):
        for i in range(len(self.sequential)):
            X = self.sequential[i](X)
        return X

class MySequential2(nn.Module):
    def __init__(self, *args):
        super().__init__()
        for bk in args:
            self._modules[bk] = bk  # store each block in _modules so it is registered

    def forward(self, X):
        for bk in self._modules.values():
            X = bk(X)
        return X

if __name__ == '__main__':
    net = MLP()  # MLP is the class defined earlier in the chapter
    X = torch.randn(2, 20)
    # print(net(X))
    net = MySequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
    net2 = MySequential2(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
    print(net(X))
    print(net2(X))

What exactly do you mean by "the results are different"? Is the printed network structure different, or are the parameters different?

X = F.relu(torch.mm(X, self.rand_weight) + 1)
In the `FixedHiddenMLP` implementation, what does this `+ 1` refer to? Is it a bias value?

I think you can roughly see it that way: it just adds 1 to the result of the matrix product. What it is meant to show is an operation that an ordinary Sequential cannot express.
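To make that concrete, here is a minimal sketch (assuming PyTorch; not the book's exact code) showing that the `+ 1` broadcasts a constant 1 onto every element of the matrix product, i.e., a fixed offset rather than a learned bias parameter:

import torch
import torch.nn.functional as F

X = torch.randn(2, 20)
rand_weight = torch.rand(20, 20)            # fixed weights, not trained in the chapter
out = F.relu(torch.mm(X, rand_weight) + 1)  # the scalar 1 is broadcast to every entry
# Equivalent to adding a constant matrix of ones, unlike nn.Linear's learnable bias:
print(torch.equal(out, F.relu(torch.mm(X, rand_weight) + torch.ones(2, 20))))  # True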

It's reasonable that the results don't match.
Please take a close look at your code, quoted below:
"
net = MySequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
net2 = MySequential2(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
"
As you can see, you instantiated four different nn.Linear layers, so their weights are not initialized identically. To verify this, simply print and compare the weights.
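As a quick sanity check (a sketch reusing the MySequential and MySequential2 classes from the post above): pass the same layer instances to both containers so the weights are shared, and the outputs do match:

layers = [nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10)]
net = MySequential(*layers)    # list-based container, same layer objects
net2 = MySequential2(*layers)  # _modules-based container, same layer objects
X = torch.randn(2, 20)
print(torch.allclose(net(X), net2(X)))  # True: shared weights give matching outputs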

The same result can be achieved by changing the way of storing blocks in MySequential to a Python list. So what is the difference between these two methods?

The main difference is that using _modules enables other PyTorch functions/methods to find the added layers automatically. In other words, the layers get registered. For example, if you want to print the parameters of the network, you can simply call state_dict(). But if a plain list is used, methods like state_dict() don't see the layers.
Code:
class MySequential(nn.Module):
    def __init__(self, *args):
        super().__init__()
        for i, block in enumerate(args):
            self._modules[str(i)] = block  # string keys register each block in order

    def forward(self, X):
        for block in self._modules.values():
            X = block(X)
        return X

net = MySequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
print(net.state_dict())  # works because the layers are registered
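For contrast, a sketch of the same container backed by a plain Python list (ListSequential is a hypothetical name) shows why registration matters: parameters() and state_dict() cannot see the layers:

class ListSequential(nn.Module):  # hypothetical list-backed variant, for comparison
    def __init__(self, *args):
        super().__init__()
        self.blocks = list(args)  # plain list: the layers are NOT registered

    def forward(self, X):
        for block in self.blocks:
            X = block(X)
        return X

net_list = ListSequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
print(len(list(net_list.parameters())))  # 0: nothing registered
print(len(net_list.state_dict()))        # 0: empty state dict
print(len(net.state_dict()))             # 4: weight and bias of each nn.Linear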


The outputs differ because the weights of the Linear layers are randomly initialized.

Friends, is this what exercise 2 means?
class myMLP(nn.Module):
    def __init__(self, *args):
        super().__init__()
        for block in args:
            self._modules[block] = block

    def forward(self, X):
        outputs = []
        for block in self._modules.values():
            outputs.append(block(X))  # run every block on the same input
        return outputs

net = myMLP(nn.Linear(20, 256), MySequential())
net(X)
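If exercise 2 is the parallel-block exercise (take two networks and return the concatenation of their outputs in forward propagation), a sketch under that reading could look like this (ParallelBlock is my name, not the book's):

import torch
from torch import nn

class ParallelBlock(nn.Module):
    def __init__(self, net1, net2):
        super().__init__()
        self.net1 = net1  # attribute assignment registers both sub-networks
        self.net2 = net2

    def forward(self, X):
        # Run both sub-networks on the same input, then concatenate along dim 1
        return torch.cat((self.net1(X), self.net2(X)), dim=1)

block = ParallelBlock(nn.Linear(20, 256), nn.Linear(20, 128))
print(block(torch.randn(2, 20)).shape)  # torch.Size([2, 384])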

A list isn't stored in order, so maybe that would cause problems?
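For reference, a Python list does preserve insertion order; a quick check (assuming PyTorch) shows that _modules is itself an OrderedDict and also iterates in insertion order, so the advantage over a list is parameter registration rather than ordering:

from collections import OrderedDict
from torch import nn

net = nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
print(isinstance(net._modules, OrderedDict))  # True
print(list(net._modules.keys()))              # ['0', '1', '2']: insertion order kept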