# Linear Regression

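“Under the assumption of Gaussian noise, minimizing the mean squared error is equivalent to maximum likelihood estimation of a linear model.”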
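A sketch of the standard argument behind this statement. The notation is an assumption: the model is written as $y = \mathbf{w}^\top \mathbf{x} + b + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma^2)$, and the $n$ training examples are taken to be independent.

$$
\begin{aligned}
P(y \mid \mathbf{x}) &= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\left(y - \mathbf{w}^\top \mathbf{x} - b\right)^2}{2\sigma^2}\right), \\
-\log P(\mathbf{y} \mid \mathbf{X}) &= \sum_{i=1}^{n} \left[\frac{1}{2}\log\left(2\pi\sigma^2\right) + \frac{1}{2\sigma^2}\left(y^{(i)} - \mathbf{w}^\top \mathbf{x}^{(i)} - b\right)^2\right].
\end{aligned}
$$

The first term inside the sum does not depend on $\mathbf{w}$ or $b$, and for a fixed $\sigma$ the factor $\frac{1}{2\sigma^2}$ is a positive constant. Minimizing the negative log-likelihood over $\mathbf{w}$ and $b$ therefore gives the same solution as minimizing the sum of squared errors.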

Derivation from 3.1.9 to 3.1.10

``````
for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y)  # minibatch loss on X and y
        # l has shape (batch_size, 1) rather than being a scalar, so all of
        # its elements are summed before computing the gradient w.r.t. [w, b]
        l.sum().backward()
        sgd([w, b], lr, batch_size)  # update the parameters using their gradients
    train_l = loss(net(features, w, b), labels)
    print(f'epoch {epoch + 1}, loss {float(train_l.mean()):f}')
``````
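The loop above relies on names defined elsewhere (`data_iter`, `net`, `loss`, `sgd`, `features`, `labels`, and the hyperparameters). The following is a minimal self-contained sketch of how those pieces might fit together. The helper bodies, the synthetic data, and the hyperparameter values are all assumptions chosen for illustration, and the evaluation step is wrapped in `torch.no_grad()`, which is an addition not present in the snippet above.

```python
import random
import torch

torch.manual_seed(0)
random.seed(0)

# Synthetic data (illustrative values): y = Xw + b + small Gaussian noise.
true_w, true_b = torch.tensor([2.0, -3.4]), 4.2
features = torch.normal(0, 1, (1000, 2))
noise = torch.normal(0, 0.01, (1000,))
labels = (features @ true_w + true_b + noise).reshape(-1, 1)

def data_iter(batch_size, features, labels):
    """Yield shuffled minibatches of (features, labels)."""
    indices = list(range(len(features)))
    random.shuffle(indices)
    for i in range(0, len(indices), batch_size):
        batch = torch.tensor(indices[i:i + batch_size])
        yield features[batch], labels[batch]

def net(X, w, b):
    """Linear model: Xw + b."""
    return X @ w + b

def loss(y_hat, y):
    """Per-example squared loss with shape (batch_size, 1)."""
    return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2

def sgd(params, lr, batch_size):
    """One minibatch SGD step; gradients are reset afterwards."""
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size
            param.grad.zero_()

w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr, num_epochs, batch_size = 0.03, 3, 10

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y)    # fresh forward pass for every minibatch
        l.sum().backward()           # sum to a scalar before backpropagating
        sgd([w, b], lr, batch_size)
    with torch.no_grad():            # evaluation only, no graph is recorded
        train_l = loss(net(features, w, b), labels)
    print(f'epoch {epoch + 1}, loss {float(train_l.mean()):f}')
```

Because the loss is recomputed from `net(X, w, b)` inside the inner loop, each call to `backward()` runs on a newly built graph.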

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
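One common way to trigger this message is to call `backward()` twice on a result from a single forward pass. The sketch below reproduces that situation on a toy tensor; it is only an illustration and may not be the cause in the case reported above.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# Build the graph once, then try to backpropagate through it twice.
l = (w ** 2).sum()
l.backward()            # first call succeeds and frees the saved intermediates
try:
    l.backward()        # same graph again
    raised = False
except RuntimeError:
    raised = True
print("second backward raised RuntimeError:", raised)

# Running a fresh forward pass before each backward call avoids the error.
w.grad.zero_()
for _ in range(2):
    l = (w ** 2).sum()  # new forward pass, new graph
    l.backward()
print(w.grad)           # gradients accumulate across the two calls
```

If the same graph really does need to be traversed more than once, `retain_graph=True` can be passed to the first `backward()` call, as the error message suggests. In a training loop it is more typical to recompute the loss in every iteration.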

torch is the PyTorch framework itself; the package is simply named torch.

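“from d2l import torch as d2l” imports the torch submodule of the d2l package (which is a different thing from the torch framework itself). It is equivalent to “import d2l.torch as d2l”, where d2l is an alias.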
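The equivalence of the two import forms can be checked without installing d2l. The sketch below uses the standard-library package `os` and its submodule `os.path` as stand-ins for `d2l` and `d2l.torch`; the alias names `a` and `b` are arbitrary.

```python
import os
from os import path as a   # same shape as: from d2l import torch as d2l
import os.path as b        # same shape as: import d2l.torch as d2l

same_module = a is b       # both aliases refer to the same submodule object
alias_is_parent = a is os  # the alias is the submodule, not the parent package
print(same_module, alias_is_parent)
```

By the same reasoning, after `from d2l import torch as d2l` the name `d2l` refers to the submodule `d2l.torch` rather than the top-level `d2l` package.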

2.4: Probably because some problems cannot be solved with convex optimization methods, so gradient descent is used instead.

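“import torch”: imports PyTorch.
“from d2l import torch as d2l”: imports the PyTorch version of d2l. Because d2l is built on several different frameworks, it comes in multiple versions (there are also TensorFlow and Paddle versions, among others).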