Custom Layers

http://d2l.ai/chapter_deep-learning-computation/custom-layer.html

Not a big deal, but MyLinear should be renamed MyDense in the PyTorch example, for consistency with the text.

In the first PyTorch code chunk of Section 5.4.2, we should use self.weight and self.bias rather than self.weight.data and self.bias.data if we want gradients to exist for backpropagation.
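
A quick illustration of the difference (my own minimal sketch, not the book's code): going through .data bypasses autograd, so the parameters never receive gradients.

import torch
from torch import nn

lin = nn.Linear(3, 1)
x = torch.randn(2, 3)

# Using the Parameters keeps the computation in the autograd graph:
loss = (x @ lin.weight.t() + lin.bias).sum()
loss.backward()
print(lin.weight.grad is not None)  # True

# Using .data detaches from autograd: the result does not require
# grad, so no gradients would ever reach the parameters.
out = (x @ lin.weight.data.t() + lin.bias.data).sum()
print(out.requires_grad)  # False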

I could not understand the meaning of the formula

y_k = \sum_{i, j} W_{ijk} x_i x_j

which computes a tensor reduction. I don't know the shapes of the input and output.
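
For what it's worth, one consistent reading of the shapes (an assumption on my part, since the exercise doesn't spell them out): x is a vector of length n, W is an n × n × k weight tensor, and y is a vector of length k, with each y_k a bilinear form in x. torch.einsum expresses the reduction directly:

import torch

n, k = 4, 3
x = torch.randn(n)
W = torch.randn(n, n, k)

# y_k = sum over i, j of W[i, j, k] * x[i] * x[j]
y = torch.einsum('ijk,i,j->k', W, x, x)
print(y.shape)  # torch.Size([3])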

Personally, I find this set of exercises very difficult to understand. Any leads on these?

Exercises

1. Design a layer that takes an input and computes a tensor reduction, i.e., it returns y_k = \sum_{i, j} W_{ijk} x_i x_j.
2. Design a layer that returns the leading half of the Fourier coefficients of the data.
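
For Exercise 2, here is a minimal sketch of one possible reading, assuming "leading half" means the first half of the FFT output along the last axis (the class name HalfFourier is my own):

import torch
from torch import nn

class HalfFourier(nn.Module):
    def forward(self, x):
        # Complex Fourier coefficients along the last dimension.
        coeffs = torch.fft.fft(x, dim=-1)
        # Keep only the leading half of the coefficients.
        return coeffs[..., : x.shape[-1] // 2]

layer = HalfFourier()
print(layer(torch.randn(2, 8)).shape)  # torch.Size([2, 4])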

Exercises and my silly answers

  1. Design a layer that takes an input and computes a tensor reduction, i.e., it returns y_k = \sum_{i, j} W_{ijk} x_i x_j.
  • Not sure what is expected, but is this the answer? (An einsum-based alternative is sketched after this list.)
import torch
from torch import nn
from torch.nn import functional as F

class LayerOne(nn.Module):
    def __init__(self, first, second):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(first, second))
        self.bias = nn.Parameter(torch.randn(second))  # note: unused in forward
    
    def forward(self, X1, X2):
        # Bilinear product of two inputs: X1 @ W @ X2
        out = torch.matmul(X1, self.weight)
        out = torch.matmul(out, X2)
        return F.relu(out)
  2. Design a layer that returns the leading half of the Fourier coefficients of the data.
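
For comparison with LayerOne above, here is a sketch of what Exercise 1 may have in mind instead: a single layer with one 3D weight tensor, contracted twice against a single input (the name TensorReduction and the shape conventions are my assumptions):

import torch
from torch import nn

class TensorReduction(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # W has shape (in, in, out): one (in x in) matrix per output unit.
        self.weight = nn.Parameter(torch.randn(in_features, in_features, out_features))
    
    def forward(self, x):
        # For a batch of row vectors x of shape (batch, in):
        # y[b, k] = sum_{i,j} W[i, j, k] * x[b, i] * x[b, j]
        return torch.einsum('bi,ijk,bj->bk', x, self.weight, x)

layer = TensorReduction(4, 3)
print(layer(torch.randn(2, 4)).shape)  # torch.Size([2, 3])
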
I don't quite understand the meaning of Exercise 1. I read the question as follows:
x is a vector, and there are k weight matrices producing k outputs y_i, for i from 1 to k.

From the perspective of linear algebra, if x is a column vector, then y_k = x^T (W_k x). In machine learning we often treat each datum as a row vector, so I tried to implement the algorithm like this:

import torch
from torch import nn

class Layer(nn.Module):
    def __init__(self, N_X):
        super().__init__()
        # One (N_X x N_X) weight matrix per output dimension.
        self.weight = nn.Parameter(torch.randn(N_X, N_X, N_X))
    
    def forward(self, x):
        y = torch.zeros_like(x)
        for k in range(x.shape[-1]):
            # temp = x @ W_k @ x.T; its diagonal holds x_b^T W_k x_b per sample b.
            temp = torch.matmul(x, self.weight[k]) @ x.T
            y[:, k] = temp.diagonal()
        return y

Another implementation:

def forward(self, x):
    temp = []
    for k in range(x.shape[-1]):
        # One (1, batch, N_X) slice per output dimension k.
        temp.append(torch.matmul(x, self.weight[k]).unsqueeze(0))
    
    # Stack to (batch, N_X, N_X), then batch-multiply by x as a column vector.
    XW = torch.cat(temp, 0).permute((1, 0, 2))
    return torch.bmm(XW, x.unsqueeze(-1)).squeeze(-1)

I think there is still room for improvement. Discussion is welcome!
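
One possible simplification, offered as a sketch: the loop in either version can be collapsed into a single einsum contraction over the same (N_X, N_X, N_X) weight (the index order kij below matches the self.weight[k] indexing above):

def forward(self, x):
    # y[b, k] = sum_{i,j} x[b, i] * W[k, i, j] * x[b, j]
    return torch.einsum('bi,kij,bj->bk', x, self.weight, x)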