谢谢老哥(zsbdzsbdzsbdzsbd
所以书里说的“#@save是特殊标记,会把语句保存在对d2l包中”是骗人的啊,还在奇怪这咋实现的
我在本地conda环境下运行notebook, 但是在最后一步训练的时候,ipynb的kernel似乎崩溃了,显示如下信息:
Kernel Restarting
The kernel for softmax-regression-concise.ipynb appears to have died. It will restart automatically.
请问运行这个notebook对硬件的要求是什么。我手上的笔记本有点老了。i7 6代处理器,24G内存,4G英伟达显卡,是否跑得动,谢谢
我也遇到这个问题,是要降低d2l的版本吗
在前一章节明明已经保存了train_ch3函数了,为什么在这一章节里还是module ‘d2l.torch’ has no attribute 'train_ch3’呢?我又尝试了重新保存再重新导入d2l包仍然不行。好奇怪。
d2l版本的问题,降低一下版本试试,我之前默认是1.0.3,也提示相关错误,现在换成0.17.0好了
d2l=1.0.3环境下运行代码 d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)
报错 module 'd2l.torch' has no attribute 'train_ch3'
似乎是新版本的包中缺少了 train_ch3
的相关实现,解决办法是在 d2l 的包中添加相关代码,也就是 3.6中的 evaluate_accurary(), train_epoch_ch3() 和 train_ch3(),具体步骤如下:
- 定位到环境下d2l包的位置,我使用conda虚拟环境,名称为d2l,python版本3.11,具体是在
/home/cutelemon6/miniconda3/envs/d2l/lib/python3.11/site-packages/d2l
- 删除原来的缓存
rm -r /home/cutelemon6/miniconda3/envs/d2l/lib/python3.11/site-packages/d2l/__pycache__
- 如果你使用pytorch,在
torch.py
中添加如下代码
def evaluate_accurary(net, data_iter: torch.utils.data.DataLoader):
if isinstance(net, torch.nn.Module):
net.eval()
metric = Accumulator(2)
with torch.no_grad():
for X, y in data_iter:
metric.add(accuracy(net(X), y), y.numel())
return metric[0] / metric[1]
def train_epoch_ch3(net, train_iter, loss, updater):
metrics = Accumulator(3)
if isinstance(net, torch.nn.Module):
net.train()
for X, y in train_iter:
y_hat = net(X)
l = loss(y_hat, y)
if isinstance(updater, torch.optim.Optimizer):
updater.zero_grad()
l.mean().backward()
updater.step()
else:
l.sum().backward()
updater(X.shape[0]) # number of X's samples
metrics.add(float(l.sum()), accuracy(y_hat, y), y.numel())
return metrics[0] / metrics[2], metrics[1] / metrics[2]
def train_ch3(net, train_iter, test_iter, loss, num_epochs, updater):
"""训练模型(定义见第3章)"""
animator = Animator(xlabel='epoch', xlim=[1, num_epochs], ylim=[0.3, 0.9],
legend=['train loss', 'train acc', 'test acc'])
for i in range(num_epochs):
train_metrics = train_epoch_ch3(net, train_iter, loss, updater)
test_acc = evaluate_accurary(net, test_iter)
animator.add(i + 1, train_metrics + (test_acc,))
train_loss, train_acc = train_metrics
assert train_loss < 0.5, train_loss
assert train_acc <= 1 and train_acc > 0.7, train_acc
assert test_acc <= 1 and test_acc > 0.7, test_acc
- 重启jupyter notebook kernel,全部运行代码即可
Hi, the reason why the trainloss is not displayed on the image is because the value of the loss is too large or the value of the loss is too small. You can check the code that calculates the loss to see if the loss added in is too large or too small.