Predicting House Prices on Kaggle

wtffqbpl · January 15, 2025, 1:16am

The way of converting np.ndarray to torch.tensor may have changed, so I use another method, torch.from_numpy, for the conversion.

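        # astype(float) yields float64, so cast to float32 to match the labels and the model weights
        train_features = torch.from_numpy(all_features[:n_train].values.astype(float)).to(dtype=torch.float32)
        test_features = torch.from_numpy(all_features[n_train:].values.astype(float)).to(dtype=torch.float32)
        train_labels = torch.from_numpy(self.train_data.SalePrice.values.reshape(-1, 1)).to(dtype=torch.float32)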
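
For anyone comparing the two conversion routes, here is a minimal sketch of how they differ. The array `values` is a made-up stand-in for `all_features[:n_train].values`, not data from the book. As far as I know, torch.from_numpy keeps the NumPy dtype and shares memory with the array, while torch.tensor(..., dtype=...) makes a copy in the requested dtype.

```python
import numpy as np
import torch

# Hypothetical stand-in for all_features[:n_train].values
values = np.array([[1, 2], [3, 4]]).astype(float)  # float64 on typical platforms

# from_numpy keeps the NumPy dtype and shares memory with the array
shared = torch.from_numpy(values)

# Casting to float32 creates a new tensor that no longer shares memory
converted = shared.to(dtype=torch.float32)

# torch.tensor copies the data and applies the requested dtype directly
copied = torch.tensor(values, dtype=torch.float32)

# Mutating the NumPy array only affects the memory-sharing tensor
values[0, 0] = 10.0

print(shared.dtype, converted.dtype, copied.dtype)
print(shared[0, 0].item(), converted[0, 0].item(), copied[0, 0].item())
```

Either route should work for this chapter as long as the features end up as float32, since the layer weights are float32 by default.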

zhang2023-byte · February 14, 2025, 7:46am

My exercise answers:

Average validation log mse=0.182, score=0.41115 for the textbook's naive linear regression.
No. There can be a selection effect: data for houses at one end of the price distribution may be hard to collect.
Tuned max_epochs=20 with the other hyperparameters fixed: log mse=0.12, score=0.34241.
Tuned max_epochs=50: log mse=0.068, score=0.26531. Training for longer did improve the model's performance.
Alternatively, tuned lr=0.03, max_epochs=30: log mse=0.057, score=0.22289. However, too large an lr degrades the model's performance.
MLP with one hidden layer (num_hiddens=256), lr=0.002, max_epochs=100: log mse=0.078, score=0.27778, better than linear regression with the same lr*max_epochs.
Added dropout (dropout=0.5): log mse=0.0944, score=0.2935. Does this mean our model is still underfitting?
Added weight decay (wd=0.1): log mse=0.0797, score=0.27613.
Non-positive values, so the log can't be taken.
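
A quick sanity check on the numbers above, as a sketch only. It assumes that "log mse" is the mean squared error on log prices averaged over the validation folds and that the Kaggle score is the root mean squared error on log prices; under those assumptions, sqrt(log mse) should land close to the score. The run labels are my own shorthand for the settings listed in the post.

```python
import math

# (validation log mse, Kaggle score) pairs copied from the post above;
# the labels are shorthand, not names from the book
runs = {
    "linear, textbook": (0.182, 0.41115),
    "linear, max_epochs=20": (0.12, 0.34241),
    "linear, max_epochs=50": (0.068, 0.26531),
    "linear, lr=0.03, max_epochs=30": (0.057, 0.22289),
    "MLP, num_hiddens=256": (0.078, 0.27778),
    "dropout=0.5": (0.0944, 0.2935),
    "wd=0.1": (0.0797, 0.27613),
}

# Gap between sqrt(validation log mse) and the leaderboard score
gaps = {}
for name, (log_mse, score) in runs.items():
    rmse = math.sqrt(log_mse)
    gaps[name] = rmse - score
    print(f"{name}: sqrt(log mse)={rmse:.3f}, score={score:.3f}, gap={gaps[name]:+.3f}")
```

If the assumptions hold, small gaps suggest the cross-validation estimate is tracking the leaderboard reasonably well, so tuning against the validation log mse is a sensible proxy for the score.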