Jul '20

Irma_Ravkic

Hi, when doing standardization one needs to first calculate the mean and std of the training set, and then use that same mean and std to standardize the test set. Otherwise you leak information from the test data into training.
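A minimal numpy sketch of the correct procedure (the data here is synthetic, just for illustration): the statistics come from the training set only, and the identical statistics are reused on the test set.

```python
import numpy as np

# Hypothetical train/test split for illustration only.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
X_test = rng.normal(loc=5.0, scale=2.0, size=(20, 3))

# Fit the statistics on the training set only...
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

# ...then apply the SAME statistics to both sets, so nothing about
# the test distribution influences preprocessing.
X_train_std = (X_train - mu) / sigma
X_test_std = (X_test - mu) / sigma
```

Note that `X_test_std` will not have exactly zero mean or unit variance, and that is expected: only the training statistics are allowed to define the transform.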

Aug '20 ▶ Irma_Ravkic

goldpiggy

Hi @Irma_Ravkic, great catch! I agree with you about the information leak! Would you like to be a contributor?

Aug '20

Irma_Ravkic

Thanks. Yes, sure, I can change that section (and if I see something else on the way).

Irma

Sep '20

gpk2000

Is @Irma_Ravkic's suggestion implemented?

Sep '20

gpk2000

When I try to use SGD instead of Adam, I get NaN as the RMSE value at the end. I ran the code using the provided Google Colab link, so there is no implementation problem on my side. Why doesn't SGD work here?

Sep '20 ▶ gpk2000

goldpiggy

Hi @gpk2000, great question. When gradients have significant variance or the features are badly scaled, plain SGD with a fixed step size can diverge. That is why you saw the NaN at the end. Adam and other adaptive optimization methods alleviate the problem: https://d2l.ai/chapter_optimization/adam.html.
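A toy numpy sketch of the divergence mechanism (this is not the book's model, just a badly scaled quadratic): along a steep direction, a fixed gradient-descent step larger than 2/L (L being the largest curvature) makes the iterate grow without bound until it overflows to inf/NaN, while a smaller step, or a per-coordinate adaptive method like Adam, converges.

```python
import numpy as np

# Minimize f(w) = 0.5 * w' A w on an ill-conditioned quadratic.
A = np.diag([1.0, 100.0])            # condition number 100, L = 100
grad = lambda w: A @ w

# Step size 0.025 > 2/L = 0.02: the steep coordinate is multiplied by
# (1 - 0.025 * 100) = -1.5 each step and blows up to inf.
w_div = np.array([1.0, 1.0])
for _ in range(2000):
    w_div = w_div - 0.025 * grad(w_div)
print(np.isfinite(w_div).all())      # False: the iterate overflowed

# Step size 0.005 < 2/L: the same recursion contracts and converges.
w_ok = np.array([1.0, 1.0])
for _ in range(2000):
    w_ok = w_ok - 0.005 * grad(w_ok)
print(np.allclose(w_ok, 0.0, atol=1e-3))  # True
```

So besides switching to Adam, shrinking the SGD learning rate (or standardizing the inputs so the curvature is better conditioned) can also remove the NaN.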

Sep '20 ▶ goldpiggy

smizerex

In both the PDF and the Colab version it seems this issue was fixed, and the fix is executed in the "Data Preprocessing" passage of this section. Is that true?

Feb '21

six

What do you think about:

?
Apr '21

min_xu

For the information-leak problem, I changed the code as follows:
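Since the code block did not come through, here is one way such a fix might look. This is a sketch that assumes the section's pandas setup, where `all_features` stacks the `n_train` training rows first and the test rows after them; the frame below is a made-up stand-in with invented column names.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the section's `all_features` frame: training rows
# first, then test rows (values and column names are illustrative).
n_train = 4
all_features = pd.DataFrame({
    'LotArea':   [8450.0, 9600.0, 11250.0, 9550.0, 14260.0, 14115.0],
    'YearBuilt': [2003.0, 1976.0, 2001.0, 1915.0, 2000.0, 1993.0],
})

numeric_features = all_features.dtypes[all_features.dtypes != 'object'].index

# Fit mean/std on the training rows only, then apply them to every row,
# so no test-set statistics leak into preprocessing.
train_part = all_features[numeric_features].iloc[:n_train]
mu, sigma = train_part.mean(), train_part.std()
all_features[numeric_features] = (all_features[numeric_features] - mu) / sigma
```

After this, the training rows have zero mean and unit variance per column, while the test rows are merely shifted and scaled by the training statistics.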

Jan '22

Xarran_Bs

I submitted to Kaggle. Am I doing well?

Mar '22

ShibaK

I want to tune hyperparameters. Why do I get only one picture from d2l.plot when I call it in a for loop?