@oliver PTAL at my PR here, which should clear up your doubt. I've added comments to the code to make it clearer.
Let me know if it is still unclear.
Thanks for your reply, but I still don't get it. I think the .backward() method has a default gradient argument of torch.tensor(1) for scalar outputs, but when it is called on a non-scalar output the argument is required. Am I right? What's the difference between the built-in modules and the custom ones here?
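Here is the toy check I ran (my own example, not from the book), which seems to confirm the scalar vs non-scalar behaviour:

import torch

x = torch.ones(3, requires_grad=True)

# Scalar output: backward() needs no argument (the implicit gradient is 1.0)
s = (x * 2).sum()
s.backward()
print(x.grad)                    # tensor([2., 2., 2.])

# Non-scalar output: backward() requires an explicit gradient argument
x.grad = None
v = x * 2
# v.backward()                   # raises: grad can be implicitly created only for scalar outputs
v.backward(torch.ones_like(v))
print(x.grad)                    # tensor([2., 2., 2.])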
Hi @oliver, sorry for the late reply.
The inbuilt loss criterion in PyTorch used here automatically reduces the loss to a scalar value via the argument reduction="mean"/"sum" (the default is "mean"). You can check this out here. For our custom loss we need to achieve the same reduction, and hence we do l.sum() before calling backward().
I hope this clarifies the doubt.
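To make the contrast concrete, a rough sketch (my own illustration; y_hat and y are made-up tensors, not the chapter's data):

import torch
from torch import nn

y_hat = torch.randn(4, 3, requires_grad=True)
y = torch.tensor([0, 2, 1, 0])

# Built-in criterion: reduction='mean' (the default) already returns a scalar,
# so l.backward() works directly.
loss = nn.CrossEntropyLoss(reduction='mean')
l = loss(y_hat, y)
l.backward()

# Custom per-example loss: l has shape (4,), so we reduce it ourselves first.
y_hat.grad = None
probs = torch.softmax(y_hat, dim=1)
l = -torch.log(probs[range(len(y_hat)), y])   # one loss value per example
l.sum().backward()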
It's very fun to study with this material. It's quite amazing, a lot of good stuff.
I wanted to ask about 3.6.9, Solution 3: how to overcome the problem of overflow for the softmax probabilities. Since we're dealing with the exponential function, we normalize the inputs. I mean, take z_i = (x_i - mean(x)) / std(x) and plug that into the exponential function, so we can compute exp(z_i) without overflow.
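A rough sketch of that idea (my own code, not the book's solution). One caveat: dividing by the standard deviation rescales the logits and therefore changes the resulting probabilities, whereas the more common fix of subtracting the row-wise maximum is a pure shift and leaves the softmax unchanged:

import torch

def softmax_standardized(X):
    # Per-row standardization of the logits, as suggested above.
    # Note: the division by std changes the probabilities, it is not exact.
    Z = (X - X.mean(dim=1, keepdim=True)) / (X.std(dim=1, keepdim=True) + 1e-12)
    X_exp = torch.exp(Z)
    return X_exp / X_exp.sum(dim=1, keepdim=True)

def softmax_max_trick(X):
    # Subtracting the row max avoids overflow without changing the result,
    # since softmax is invariant to adding a constant to every logit.
    X_exp = torch.exp(X - X.max(dim=1, keepdim=True).values)
    return X_exp / X_exp.sum(dim=1, keepdim=True)

X = torch.tensor([[1000.0, 1001.0, 1002.0]])   # torch.exp(X) alone would overflow
print(softmax_standardized(X))
print(softmax_max_trick(X))                    # tensor([[0.0900, 0.2447, 0.6652]])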
Dear authors, I don't understand the use of enumerate in the loop of the function evaluate_accuracy. Why not use for X, y in data_iter: instead?
@goldpiggy In the accuracy function, why don't we simply use the mean instead of using the sum and dividing by the length later? I mean, use tf.math.reduce_mean(cmp.type(y.dtype)).
Great question. Here we just want to align the TF implementation with the others.
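A small sketch of the difference, in case it helps (my own example; the point about uneven batch sizes is my reading of the surrounding evaluate_accuracy code, not part of the reply above). Within one batch the mean and sum/len agree, but returning a count lets the Accumulator sum correct predictions over batches and divide by the total number of examples at the end:

import tensorflow as tf

y = tf.constant([0, 2, 1])
y_hat = tf.constant([[0.8, 0.1, 0.1],
                     [0.1, 0.2, 0.7],
                     [0.6, 0.3, 0.1]])
cmp = tf.cast(tf.argmax(y_hat, axis=1), y.dtype) == y

correct = tf.reduce_sum(tf.cast(cmp, tf.float32))        # 2.0, a count
print(float(correct) / len(y))                           # 0.666...
print(float(tf.reduce_mean(tf.cast(cmp, tf.float32))))   # same value for a single batch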
Hi,
def train_epoch_ch3(net, train_iter, loss, updater):  #@save
    """The training loop defined in Chapter 3."""
    # Set the model to training mode
    if isinstance(net, torch.nn.Module):
        net.train()
    # Sum of training loss, sum of training accuracy, no. of examples
    metric = Accumulator(3)
    for X, y in train_iter:
        # Compute gradients and update parameters
        y_hat = net(X)
        l = loss(y_hat, y)
        if isinstance(updater, torch.optim.Optimizer):
            # Using PyTorch in-built optimizer & loss criterion
            updater.zero_grad()
            l.backward()
            updater.step()
            metric.add(float(l) * len(y), accuracy(y_hat, y),
                       y.size().numel())
        else:
            # Using custom built optimizer & loss criterion
            l.sum().backward()
            updater(X.shape[0])
            metric.add(float(l.sum()), accuracy(y_hat, y), y.numel())
    # Return training loss and training accuracy
    return metric[0] / metric[2], metric[1] / metric[2]
In the above code, the custom updater path works fine. But it will raise an error if we use the in-built optimizer together with a custom (per-example) loss function, since the in-built-optimizer branch assumes the loss has already been reduced to a scalar by an in-built criterion. Please update the block for better clarification.
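For reference, a minimal sketch of the two pairings the branch seems to expect (my own toy example; the names below are rough stand-ins for the chapter's versions):

import torch
from torch import nn

# Pairing 1: in-built optimizer with in-built criterion.
# The criterion's default reduction='mean' already yields a scalar loss,
# so l.backward() and float(l) in the first branch both work.
net = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
loss = nn.CrossEntropyLoss()
updater = torch.optim.SGD(net.parameters(), lr=0.1)

# Pairing 2: scratch updater with a scratch per-example loss.
# Here the loss is a vector, so the second branch reduces it with l.sum().
def cross_entropy(y_hat, y):
    return -torch.log(y_hat[range(len(y_hat)), y])

# Mixing them (torch.optim.SGD with the per-example cross_entropy above) breaks
# the first branch: l.backward() then needs a gradient argument, and float(l)
# only accepts one-element tensors.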
What if isinstance(net, torch.nn.Module) returns False? The eval mode won't be set.
Hi, in the PyTorch implementation of train_epoch_ch3, is there any reason why we are using y.size().numel() instead of simply y.numel()?
if isinstance(updater, torch.optim.Optimizer):
    ...
    metric.add(float(l) * len(y), accuracy(y_hat, y),
               y.size().numel())
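As far as I can tell the two give the same number, since torch.Size also provides a numel() method; a quick check (my own snippet, with a made-up label batch):

import torch

y = torch.zeros(256)        # a made-up batch of labels
print(y.size().numel())     # 256  (torch.Size.numel() is the product of the dims)
print(y.numel())            # 256  (same value)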
y_hat means the output: a non-linear activation function applied to (bias + linear combination of inputs).
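In the chapter's notation, I believe that is:

\hat{\mathbf{y}} = \mathrm{softmax}(\mathbf{W}\mathbf{x} + \mathbf{b}), \qquad \mathrm{softmax}(\mathbf{o})_j = \frac{\exp(o_j)}{\sum_k \exp(o_k)}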
@StevenJokess Regarding if isinstance(net, torch.nn.Module): — when net is a plain Python function, this line returns False.
my answers discussion
y_hat[0] returns the first row [0.1, 0.3, 0.6], and y_hat[1] returns the second. We want the probability assigned to the true label of each row, i.e. 0.1 and 0.5; the true labels are [0, 2], hence the y. So y_hat[[0, 1], [0, 2]] returns the first probability of the first row and the third probability of the second row.
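A quick demonstration of that indexing (my own snippet, using the same example values):

import torch

y_hat = torch.tensor([[0.1, 0.3, 0.6],
                      [0.3, 0.2, 0.5]])
y = torch.tensor([0, 2])

# Pick, for each row i, the column y[i]: row 0 -> column 0, row 1 -> column 2
print(y_hat[[0, 1], y])        # tensor([0.1000, 0.5000])
print(y_hat[[0, 1], [0, 2]])   # same thing, with the labels written out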
@anirudh Why do we always check for the instance type (torch.nn.Module) of net? For training and evaluation couldn't we just directly use net.train() and net.eval() respectively? Am I missing something here? Thanks!
Hi @Debanjan_Das,
We also have a few models which are built from scratch, and those models do not have the train or eval methods since they do not subclass nn.Module. This is just a way to reuse the saved functions, making them compatible with both the scratch and the concise versions of the PyTorch code.
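To make that concrete, a minimal sketch (my own example; net_scratch is a stand-in for the chapter's from-scratch model):

import torch
from torch import nn

W = torch.normal(0, 0.01, size=(784, 10), requires_grad=True)
b = torch.zeros(10, requires_grad=True)

def net_scratch(X):                      # a plain function, not an nn.Module
    return torch.softmax(X.reshape(-1, 784) @ W + b, dim=1)

net_concise = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))

for net in (net_scratch, net_concise):
    if isinstance(net, torch.nn.Module):
        net.eval()                       # only the concise version has .eval()
    # The scratch version is skipped: net_scratch.eval() would raise AttributeError.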
How to change the range of the y axis?
I tried to change ylim in the train_ch3 function, but it didn’t work.
def train_ch3(net, train_iter, test_iter, loss, num_epochs, updater):
    """Train a model (defined in Chapter 3).

    Defined in :numref:`sec_softmax_scratch`"""
    animator = Animator(xlabel='epoch', xlim=[1, num_epochs], ylim=[0.0, 1.0],
                        legend=['train loss', 'train acc', 'test acc'])