Predicting House Prices on Kaggle

mli · May 31, 2020, 2:50am

https://d2l.ai/chapter_multilayer-perceptrons/kaggle-house-price.html

Irma_Ravkic · July 22, 2020, 11:10pm

Hi, when standardizing, one should first compute the mean and standard deviation on the training set only, and then use those same statistics to standardize the test set. Otherwise, information leaks from the test data into training.
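
A minimal sketch of that idea, assuming the features are already numeric NumPy arrays. The function name `standardize_with_train_stats` and the toy numbers are hypothetical and not taken from the book's code:

```python
import numpy as np

def standardize_with_train_stats(train, test, eps=1e-12):
    """Standardize both splits using statistics from the training split only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    # Guard against constant columns, which would otherwise divide by zero.
    std = np.where(std < eps, 1.0, std)
    return (train - mean) / std, (test - mean) / std

# Toy numbers, invented for illustration only.
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
test = np.array([[4.0, 40.0]])
train_std, test_std = standardize_with_train_stats(train, test)
```

The test rows never contribute to `mean` or `std`, so nothing about the test distribution reaches the features the model trains on.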

goldpiggy · August 6, 2020, 5:58pm

Hi @Irma_Ravkic, great catch! I agree with you about the information leak! Would you like to be a contributor?

Irma_Ravkic · August 6, 2020, 11:30pm

Thanks. Yes, sure, I can change that section (and anything else I notice along the way).

Irma

gpk2000 · September 6, 2020, 10:02am

Has @Irma_Ravkic's suggestion been implemented?

gpk2000 · September 6, 2020, 11:23am

When I try to use SGD instead of Adam, I get NaN as the RMSE value at the end. I ran the code using the Google Colab link provided, so there is no implementation problem on my side. Why doesn’t SGD work here?

goldpiggy · September 8, 2020, 8:10pm

Hi @gpk2000, great question. When the gradients have significant variance, plain SGD can diverge. That is why you saw NaN at the end. Adam and other adaptive optimization methods alleviate the problem: https://d2l.ai/chapter_optimization/adam.html.
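
As a rough illustration of this kind of divergence, here is a toy sketch (not the house-price model): gradient descent on the hypothetical loss 0.5 * c * w**2 with a fixed step that is too large for the gradient scale, next to a from-scratch Adam update using the same learning rate. The constants `c`, `lr`, and `steps` are invented for the example.

```python
import math

def grad(w, c=1000.0):
    # Gradient of the toy loss 0.5 * c * w**2.
    return c * w

def run_sgd(w=1.0, lr=0.01, steps=400):
    # Each step maps w to (1 - lr * c) * w = -9 * w, so |w| grows until it overflows.
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def run_adam(w=1.0, lr=0.01, steps=400, beta1=0.9, beta2=0.999, eps=1e-8):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g       # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g   # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)          # bias correction
        v_hat = v / (1 - beta2 ** t)
        # The step is rescaled by the gradient magnitude, so it stays on the order of lr.
        w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

w_sgd = run_sgd()
w_adam = run_adam()
print(math.isnan(w_sgd), math.isfinite(w_adam))
```

In this toy setting the SGD iterate overflows to infinity and then becomes NaN, while the Adam iterate stays finite. Lowering the SGD learning rate is the other obvious thing to try; whether that is enough for the actual notebook is not something this sketch shows.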

smizerex · September 12, 2020, 5:45pm

In both the PDF and Colab versions, it seems that this issue has already been fixed and executed in the “Data Preprocessing” passage of this section. Is that true?

six · February 2, 2021, 9:58pm

What do you think about:

?

min_xu · April 13, 2021, 6:08am

For the information leak problem, I changed the code as follows:

Xarran_Bs · January 10, 2022, 5:33pm

I submitted to Kaggle. Am I doing well?

ShibaK · March 12, 2022, 5:14pm

I want to tune hyperparameters. Why do I get only one picture from d2l.plot when I call it inside a for loop?