9 replies
Mar '21

orkunozturk

Not a big deal, but MyLinear should be renamed MyDense in the PyTorch example, for consistency with the text.

Jul '21

Aaron_L

In the first chunk of PyTorch code in 5.4.2, we should use ‘self.weight’ and ‘self.bias’ rather than ‘self.weight.data’ and ‘self.bias.data’ if we want gradients to exist for backpropagation.
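For reference, a minimal sketch of what the forward pass could look like with the parameters used directly (this roughly follows the section's MyLinear; the exact names are from memory and only illustrative):

import torch
import torch.nn.functional as F
from torch import nn

class MyLinear(nn.Module):
    def __init__(self, in_units, units):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(in_units, units))
        self.bias = nn.Parameter(torch.randn(units,))

    def forward(self, X):
        # Using self.weight and self.bias directly (not .data) keeps these
        # operations in the autograd graph, so gradients reach the parameters.
        return F.relu(torch.matmul(X, self.weight) + self.bias)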

Jul '21

Wind

I could not understand the meaning of the formula

y_k = \sum_{i, j} W_{ijk} x_i x_j

which computes a tensor reduction.
I don’t know the shapes of the inputs and the output.

Aug '21

fanbyprinciple

Personally, I find this set of exercises very difficult to understand. Any leads on these?

Exercises
1. Design a layer that takes an input and computes a tensor reduction, i.e., it returns y_k = \sum_{i,j} W_{ijk} x_i x_j.
2. Design a layer that returns the leading half of the Fourier coefficients of the data.

Aug '21

fanbyprinciple

Exercises and my silly answers

  1. Design a layer that takes an input and computes a tensor reduction, i.e., it returns y_k = \sum_{i,j} W_{ijk} x_i x_j.
import torch
import torch.nn.functional as F
from torch import nn

class LayerOne(nn.Module):
    def __init__(self, first, second):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(first, second))
        self.bias = nn.Parameter(torch.randn(second))

    def forward(self, X1, X2):
        # X1 @ W @ X2, followed by a ReLU
        out = torch.matmul(X1, self.weight)
        out = torch.matmul(out, X2)
        return F.relu(out)
  2. Design a layer that returns the leading half of the Fourier coefficients of the data.

Oct '21

Eli_Chen

I do not quite understand the meaning of exercise 1. I read the question as follows:
x is a vector, and there are k matrices (weights) producing k outputs y_i, for i from 1 to k.

From the perspective of linear algebra, if x is a column vector, then y_k = x.T (W_k x). In machine learning we often use a row vector as a datum, so I tried to implement the algorithm like this:

import torch
from torch import nn

class Layer(nn.Module):
    def __init__(self, N_X):
        super().__init__()
        # weight[k] is the k-th (N_X, N_X) matrix
        self.weight = nn.Parameter(torch.randn(N_X, N_X, N_X))

    def forward(self, x):
        # x has shape (batch, N_X); y[b, k] = x[b] @ weight[k] @ x[b]
        y = torch.zeros_like(x)
        for k in range(x.shape[-1]):
            temp = torch.matmul(x, self.weight[k]) @ x.T
            y[:, k] = temp.diagonal()
        return y

Another implementation:

def forward(self, x):
    # Stack x @ weight[k] for every k, then batch-multiply by x
    temp = []
    for k in range(x.shape[-1]):
        temp.append(torch.matmul(x, self.weight[k]).unsqueeze(0))

    XW = torch.cat(temp, 0).permute((1, 0, 2))  # (batch, N_X, N_X)
    return torch.bmm(XW, x.unsqueeze(-1)).squeeze(-1)

I think there is still more room for improvement. Welcome to discuss!
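One more variant, only as a sketch and using the same weight layout as the code above (the first axis of the weight indexes k): the whole reduction can be written as a single torch.einsum call.

import torch
from torch import nn

class EinsumLayer(nn.Module):
    def __init__(self, N_X):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(N_X, N_X, N_X))

    def forward(self, x):
        # y[b, k] = sum over i, j of weight[k, i, j] * x[b, i] * x[b, j]
        return torch.einsum('kij,bi,bj->bk', self.weight, x, x)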

Jan '22

guangye

Hello,

I agree that for Exercise 1 the formula calculates the quadratic form of the vector x (x.T * A * x) with respect to a specified number of square matrices A (indexed by k). I propose the following layer definition, taking the size of the vector x and the number of matrices A as parameters:

import torch
from torch import nn

class ReductionBlock(nn.Module):
    def __init__(self, size_in, size_out):
        super().__init__()
        # One (size_in, size_in) matrix per output component
        self.weight = nn.Parameter(torch.randn(size_out, size_in, size_in))

    def forward(self, x):
        # Compute x @ W_k for every k, then contract with x again: y_k = x.T W_k x
        out = torch.matmul(x, self.weight)
        out = torch.matmul(out, x)
        return out


my_reduction = ReductionBlock(2, 3)
x = torch.ones(2)
y = my_reduction(x)
print(y)
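Since x is all ones in this example, each y_k should equal the sum of all entries of self.weight[k], which gives a quick way to sanity-check the output.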
Aug '23

pandalabme

My solutions to the exs: 6.5

18 Feb

zhang2023-byte

My exercise answers:

  1. class MyReduction(nn.Module):
         def __init__(self, in_units, units):
             super().__init__()
             self.weight = nn.Parameter(torch.randn(in_units, in_units, units))

         def forward(self, X):
             # First contraction over the leading weight axis:
             # W_reduce_1[i, j] = sum_a self.weight[a, i, j] * X[a]
             W_reduce_1 = torch.zeros_like(self.weight[0])
             for i in range(self.weight.shape[1]):
                 for j in range(self.weight.shape[2]):
                     W_reduce_1[i][j] = self.weight[:, i, j] @ X

             # Second contraction with X, leaving only the output index k:
             # W_reduce_2[k] = sum_i W_reduce_1[i, k] * X[i]
             W_reduce_2 = torch.zeros_like(W_reduce_1[0])
             for k in range(W_reduce_1.shape[1]):
                 W_reduce_2[k] = W_reduce_1[:, k] @ X
             return W_reduce_2

  2. TBD (see the sketch below).
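For the "leading half of the Fourier coefficients" exercise, here is a minimal sketch of one possible reading: take the FFT along the last dimension and keep the first half of the coefficients. The class name is just illustrative.

import torch
from torch import nn

class HalfFourier(nn.Module):
    def forward(self, X):
        # Complex Fourier coefficients along the last dimension
        coeffs = torch.fft.fft(X)
        # Keep only the leading half of the coefficients
        return coeffs[..., :X.shape[-1] // 2]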