Layers and Modules

@sushmit86 forward is called inside the built-in method __call__. You can look at this function in the PyTorch source code and see that self.forward is called inside this method.

The reason for not calling forward explicitly via net.forward is that hooks are dispatched in the __call__ method.
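A minimal sketch of the difference, assuming a plain nn.Linear module and an illustrative hook function named report: a forward hook fires when the module is invoked as net(X), which goes through __call__, but not when net.forward(X) is called directly.

import torch
from torch import nn

net = nn.Linear(4, 2)

# A forward hook that announces when it runs.
def report(module, inputs, output):
    print("forward hook fired")

net.register_forward_hook(report)

X = torch.rand(3, 4)
net(X)          # goes through __call__, so the hook fires
net.forward(X)  # bypasses __call__, so the hook stays silent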


Thanks so much @anirudh

Hi,

class MySequential(nn.Module):
    def __init__(self, *args):
        super().__init__()
        for block in args:
            # Here, block is an instance of a Module subclass. We save it
            # in the member variable _modules of the Module class, and its
            # type is OrderedDict
            self._modules[block] = block

Is it a typo to use the module instance as both the key and the value at the same time, or is there a better reason for it?
Thanks!

It’s a typo. Keys are supposed to be ids of str type, not Modules.

class MySequential(nn.Module):
    def __init__(self, *args):
        super().__init__()
        for idx, block in enumerate(args):
            self._modules[str(idx)] = block
...
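For completeness, a sketch of how the rest of the class can look under that fix; the forward method here is a reconstruction that simply walks the registered blocks in insertion order:

import torch
from torch import nn

class MySequential(nn.Module):
    def __init__(self, *args):
        super().__init__()
        for idx, block in enumerate(args):
            # str keys, as in the fix above
            self._modules[str(idx)] = block

    def forward(self, X):
        # _modules is an OrderedDict, so the blocks run in the order
        # they were passed to the constructor
        for block in self._modules.values():
            X = block(X)
        return X

net = MySequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
net(torch.rand(2, 20)).shape  # torch.Size([2, 10])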

Thanks for raising the issue, this is now fixed here.


Hi,
I tried to use Python’s list to implement class MySequential. Here are the codes:

class MySequential_(nn.Module):
    def __init__(self, *args):
        super().__init__()
        self.list = [block for block in args]

    def forward(self, X):
        for block in self.list:
            X = block(X)
        return X

net = MySequential_(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
X = torch.rand(2, 20)  # same test input as in the book section
net(X)

tensor([[ 0.2324, 0.0579, -0.0106, -0.0143, 0.1208, -0.2896, -0.0271, -0.1762,
-0.0771, 0.0069],
[ 0.3362, 0.0312, -0.0852, -0.1253, 0.1525, -0.1945, 0.0685, 0.0335,
-0.1404, -0.0617]], grad_fn=)

And the output looks fine to me. I am curious: what is the problem with using a list to replace self._modules when implementing MySequential?


Hi! I found a typo in Section 5.1.1 of the PyTorch version. In the code snippet used to define class MLP, inside the __init__() function there is a comment that reads “# Call the constructor of the MLP parent class ‘Block’ to perform […]”. The correct name of the parent class is ‘Module’ (‘Block’ is the mxnet version).

Great book! Thanks!

What is the answer to Q1?

  1. What kinds of problems will occur if you change MySequential to store blocks in a Python list?

Q3

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(20, 256)
        self.out = nn.Linear(256,20)

    def forward(self, X):
        return self.out(F.relu(self.hidden(X)))

class Factory(nn.Module):
    def __init__(self, k):
        super().__init__()
        modules=[]
        for i in range(k):
            modules.append(MLP())
        self.net = nn.Sequential(*modules)
    def forward(self, X):
        return self.net(X)

net = Factory(3)
X = torch.rand(2,20)
out = net(X)

print(net)

Factory(
  (net): Sequential(
    (0): MLP(
      (hidden): Linear(in_features=20, out_features=256, bias=True)
      (out): Linear(in_features=256, out_features=20, bias=True)
    )
    (1): MLP(
      (hidden): Linear(in_features=20, out_features=256, bias=True)
      (out): Linear(in_features=256, out_features=20, bias=True)
    )
    (2): MLP(
      (hidden): Linear(in_features=20, out_features=256, bias=True)
      (out): Linear(in_features=256, out_features=20, bias=True)
    )
  )
)

Is this correct?


In section 5.1.1 it says:

For example, the first fully-connected layer in our model above ingests an input of arbitrary dimension but returns an output of dimension 256.

The “model above” is defined as: nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10)) .

From what I understand, the first fully connected layer in this model is nn.Linear(20, 256), and it takes an input of dimension exactly 20, no more and no less.

Why is it stated that this layer takes an input of arbitrary dimension? What am I missing here?
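To make the question concrete, a quick shape check (a sketch; the test tensors are illustrative): nn.Linear(20, 256) accepts any number of leading batch dimensions, but the last dimension must be exactly 20.

import torch
from torch import nn

layer = nn.Linear(20, 256)

layer(torch.rand(2, 20)).shape     # torch.Size([2, 256])
layer(torch.rand(5, 3, 20)).shape  # torch.Size([5, 3, 256]) -- extra leading dims are fine

# A last dimension other than 20 raises a shape-mismatch RuntimeError:
# layer(torch.rand(2, 30))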

Thanks, @gphilip, for raising this. Most parts of the book share common text, and we are trying to fix issues like these where the frameworks differ in design. Feel free to raise any other issues on the forum or the GitHub repo if you find something similar in other sections. Really appreciate it!

This will be fixed in #1838


Exercises and my silly answers

  1. What kinds of problems will occur if you change MySequential to store blocks in a Python list?

  • The MySequential implementation would be different. What else? Hmmm.

  2. Implement a block that takes two blocks as an argument, say net1 and net2, and returns the concatenated output of both networks in the forward propagation. This is also called a parallel block.


class parallel_mlp(nn.Module):
    def __init__(self, block1, block2):
        super().__init__()
        self.block1 = block1
        self.block2 = block2

    def forward(self, X):
        first = self.block1(X)
        second = self.block2(X)
        print(first, second)
        return torch.cat((first, second))

  3. Assume that you want to concatenate multiple instances of the same network. Implement a factory function that generates multiple instances of the same block and build a larger network from it.

For exercise 1:

I found that if I just use a list to store the modules, there is nothing in net.state_dict() or net.parameters().

Here is an example:
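A minimal, self-contained sketch of that observation (the class mirrors the list-based MySequential_ from earlier in the thread): a plain Python list does not register its contents as submodules, so their parameters are invisible to the Module machinery.

import torch
from torch import nn

class MySequential_(nn.Module):
    def __init__(self, *args):
        super().__init__()
        self.list = [block for block in args]  # plain list: blocks are not registered

    def forward(self, X):
        for block in self.list:
            X = block(X)
        return X

net = MySequential_(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
net(torch.rand(2, 20)).shape   # forward still works: torch.Size([2, 10])
len(list(net.parameters()))    # 0 -- an optimizer would see nothing to train
len(net.state_dict())          # 0 -- nothing would be saved either
print(net)                     # MySequential_() -- no child modules listed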


For exercise 3:

Here is my example:
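A sketch of one way to do it, assuming the MLP block from earlier in the thread; make_network is a hypothetical factory that stacks k fresh copies of the block inside an nn.Sequential:

import torch
from torch import nn
from torch.nn import functional as F

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(20, 256)
        self.out = nn.Linear(256, 20)

    def forward(self, X):
        return self.out(F.relu(self.hidden(X)))

def make_network(block_cls, k):
    # Each call to block_cls() creates a fresh instance with its own parameters.
    return nn.Sequential(*[block_cls() for _ in range(k)])

net = make_network(MLP, 3)
net(torch.rand(2, 20)).shape  # torch.Size([2, 20])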

Hi,
In the custom MySequential class, why do we need idx to be a str in ‘self._modules[str(idx)] = module’?
And in the comment on that line, you meant that ‘_modules’ is of type OrderedDict, right? Not the module itself.

Q2

class parallel_block(nn.Module):
    def __init__(self, net1, net2):
        super().__init__()
        self.net1 = net1
        self.net2 = net2
    
    def forward(self, X):
        return torch.cat((self.net1(X), self.net2(X*2)), 0)

parallel_block(MLP(), MLP())
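A quick usage check for this block, assuming the MLP class defined earlier in the thread (20 inputs, 20 outputs): concatenating along dim 0 stacks the two outputs along the batch dimension.

import torch

block = parallel_block(MLP(), MLP())
X = torch.rand(2, 20)
block(X).shape  # torch.Size([4, 20]) -- two batches of 2 stacked along dim 0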

My solutions to the exs: 6.1

My Solution to Q3:


class DaisyX(nn.Module):
    def __init__(self, genericModule, chain_length=5):
        # genericModule is a Module subclass (a class, not an instance)
        super().__init__()
        for idx in range(chain_length):
            self.add_module(str(idx) + genericModule.__name__, genericModule())
    
    def forward(self, X):
        for m in self.children():
            X = m(X)
        return X
    

class Increment(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, X):
        return (X + 1)
    
net = DaisyX(Increment, 5)
X = torch.zeros((2, 2))
net(X)

My exercise:

  1. I don’t really understand this question. We use the self.add_module method in nn.Module to store modules (?), so if we want to store modules in a Python list, does that mean we have to rewrite the whole structure?

  2. class parallelModule(nn.Module):
         def __init__(self, net1, net2, dim):
             super().__init__()
             self.net1 = net1
             self.net2 = net2
             self.dim = dim

         def forward(self, X):
             output1 = self.net1(X)
             output2 = self.net2(X)
             return torch.cat((output1, output2), dim=self.dim)

  3. My layer factory:

     def layer_factory(num_layers):
         layers = []
         for _ in range(num_layers):
             layers.append(MLP())
         return layers

My deep network:

class DeepMLP(nn.Module):
    def __init__(self, num_layers):
        super().__init__()
        layers = layer_factory(num_layers)
        for idx, layer in enumerate(layers):
            self.add_module(str(idx), layer)

    def forward(self, X):
        for module in self.children():
            X = module(X)
        return X
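A quick usage sketch, assuming the MLP class from earlier in the thread (20 features in, 20 out, so the blocks chain cleanly):

import torch

net = DeepMLP(num_layers=3)
X = torch.rand(2, 20)
net(X).shape  # torch.Size([2, 20])
print(net)    # the three MLP blocks appear as children '0', '1', '2'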

Hi, can I ask about the return of your forward function: why do you need to wrap the concatenated (X, X2) inside self.linear?

My implementation just returns torch.cat((X, X2), dim=1).