# Linear Regression Implementation from Scratch


The sgd function has no return value; the variables inside it are all local temporaries.

`param -= lr * param.grad / batch_size`
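Even without a return value the update still takes effect, because the line above mutates each tensor in place. A minimal sketch of such an sgd function (names follow the book's from-scratch implementation; the toy usage below is an assumption for illustration):

```python
import torch

def sgd(params, lr, batch_size):
    """Minibatch SGD: update each parameter in place, return nothing."""
    with torch.no_grad():
        for param in params:
            # In-place "-=" mutates the very tensor the caller holds
            param -= lr * param.grad / batch_size
            param.grad.zero_()

w = torch.tensor([1.0], requires_grad=True)
(w * 2).sum().backward()        # w.grad is now 2.0
sgd([w], lr=0.1, batch_size=1)
print(w)                        # tensor([0.8000], requires_grad=True)
```

The caller's `w` changed even though sgd returned nothing, which is exactly why the function needs no return value.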


Python passes arguments by reference; you can think of it like a pointer in C. Have another look at Python's basic semantics.


The `-=` operator calls `__isub__`, while the `-` operator calls `__sub__`. For mutable objects, `-=` generally modifies `self` in place. For PyTorch it should end up calling `sub_` (I searched the source and traced it into the C code, but couldn't find the exact implementation).

```python
import torch

# ints are immutable: "-=" rebinds the loop variable to a new object,
# so neither x1, x2 nor the list change
x1 = 1
x2 = 2
params = [x1, x2]
for p in params:
    print(id(p), id(x1), id(x2))
    p -= 4
    print(id(p), id(x1), id(x2))
print(params)

# tensors are mutable: "-=" calls sub_ and modifies the object in place,
# so the tensors inside the list change too
x1 = torch.Tensor([1])
x2 = torch.Tensor([2])
params = [x1, x2]
for p in params:
    print(id(p), id(x1), id(x2))
    p -= 4
    print(id(p), id(x1), id(x2))
print(params)
```

```
9784896 9784896 9784928
9784768 9784896 9784928
9784928 9784896 9784928
9784800 9784896 9784928
[1, 2]
139752445458112 139752445458112 139752445458176
139752445458112 139752445458112 139752445458176
139752445458176 139752445458112 139752445458176
139752445458176 139752445458112 139752445458176
[tensor([-3.]), tensor([-2.])]
```

`p -= 4` is equivalent to `p.sub_(4)`: the mutable object modifies itself. If instead, as vin100 mentioned, you write `p = p - 4`, a constructor is called and a new object is returned, so the original mutable object is never touched.
The int did not change in place because it is an immutable object.
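The contrast can be made concrete in a few lines; a small sketch showing that `p = p - 4` rebinds the loop variable to a new tensor while `p -= 4` mutates the tensor the list holds:

```python
import torch

params = [torch.tensor([1.0]), torch.tensor([2.0])]
for p in params:
    p = p - 4          # __sub__ returns a new tensor; p is merely rebound
print(params)          # unchanged: [tensor([1.]), tensor([2.])]

for p in params:
    p -= 4             # sub_ mutates the tensor inside the list
print(params)          # [tensor([-3.]), tensor([-2.])]
```

This is why the book's sgd uses `param -= ...` rather than `param = param - ...`: only the former reaches the tensors the model actually holds.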


`param -= lr * param.grad / batch_size`


Why is the loss smaller with a smaller batch_size, all other parameters unchanged?

With batch_size=5:

```
epoch 1,loss 0.000701
epoch 2,loss 0.000053
epoch 3,loss 0.000051
w has diff: tensor([-1.3564e-03, -1.0967e-05, 1.7130e-04, -4.7803e-05],
```

With batch_size=10:

```
epoch 1,loss 0.029642
epoch 2,loss 0.000512
epoch 3,loss 0.000059
w has diff: tensor([-9.8300e-04, -6.9952e-04, -1.5647e-03, -8.0585e-05],
```

With batch_size=50:

```
epoch 1,loss 0.864528
epoch 2,loss 0.387332
epoch 3,loss 0.174557
w has diff: tensor([-0.0904, -0.1257, -0.1239, -0.1213], grad_fn=)
```

With batch_size=100:

```
epoch 1,loss 1.446548
epoch 2,loss 0.911614
epoch 3,loss 0.626752
w has diff: tensor([-0.1589, -0.1924, -0.2357, -0.2119], grad_fn=)
```

The total amount of training should be the same: batch_size × number of batches = total training data. With batch_size=100 there are only 10 batches, but still 1000 training samples.


When you select a larger batch_size, the total number of parameter updates shrinks sharply, so the model is under-fitting. You should try increasing the number of epochs so that the loss can reach its optimum.
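Counting the gradient steps makes the point concrete. As a rough sketch with the thread's numbers (1000 samples, 3 epochs, assuming batch_size divides the dataset evenly):

```python
num_examples, num_epochs = 1000, 3
for batch_size in (5, 10, 50, 100):
    # one parameter update per minibatch, per epoch
    updates = num_epochs * (num_examples // batch_size)
    print(f'batch_size={batch_size}: {updates} updates')
```

batch_size=5 yields 600 updates while batch_size=100 yields only 30, so with the same learning rate the larger batch simply hasn't taken enough steps yet after 3 epochs.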


`true_w = [2, -3.4]` — what does this mean?


```python
for X, y in data_iter(batch_size, features, labels):
    print(X, '\n', y)
    break
```
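For reference, the data_iter in the book's from-scratch implementation shuffles the indices and yields one minibatch at a time; a sketch along those lines (the toy features/labels below are an assumption for illustration):

```python
import random
import torch

def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)   # examples are read in random order
    for i in range(0, num_examples, batch_size):
        batch_indices = torch.tensor(indices[i: i + batch_size])
        yield features[batch_indices], labels[batch_indices]

features = torch.arange(20.0).reshape(10, 2)
labels = torch.arange(10.0)
for X, y in data_iter(3, features, labels):
    print(X.shape, y.shape)   # first batch: 3 examples; the last may be smaller
    break
```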


`train_l = loss(net(features, w, b), labels)`

params: `param -= lr * param.grad / batch_size`


w stands for weights; here it sets the true weights used to generate the data.

```python
net = linreg
loss = squared_loss
```

They are defined in Section 3.2.7.
