Data Preprocessing

If the author wants to suggest pandas, then they should invoke more of the pandas API

inputs = data[['NumRooms', 'Alley']] # dataframe
outputs = data['Price'] # series

Also, calling mean on a whole dataframe will call a future warning unless you specify the operation is on numeric data


It doesn’t affect these examples, but readers should be aware of this.

You can call them directly as a series then numpy array.

# convert column to tensor
array = data[column_name]
tensor = torch.tensor(array)

By assuming we only want to drop input columns:

data = pd.read_csv(data_file)
inputs, outputs = data.iloc[:, 0:2], data.iloc[:, 2]
nas = inputs.isna().astype(int)
column_index = nas.sum(axis = 0).argmax()
inputs = inputs.drop(inputs.columns[column_index], axis=1)

Two line code

missingMostColumnIndex = data.count().argmin()