Implementation of Multilayer Perceptrons

mli · May 31, 2020, 2:46am

https://d2l.ai/chapter_multilayer-perceptrons/mlp-implementation.html

StevenJokes · June 23, 2020, 2:55pm

1. When test accuracy increases most quickly and reaches the highest value, can we say that this hyperparameter value is the best?

goldpiggy · June 23, 2020, 4:38pm

Unless you are sure the objective function being optimized is convex, we hardly ever speak of the “best” model or the “best” set of hyperparameters.

Kushagra_Chaturvedy · July 7, 2020, 6:48am

def net(X):
    X = X.reshape((-1, num_inputs))
    H = relu(X@W1 + b1)   # Here '@' stands for matrix multiplication
    return (H@W2 + b2)

In the last line shouldn’t we have applied the softmax function to the return value H@W2 + b2? Isn’t there a chance that this operation would yield a negative value or a value greater than 1?

When I do use the softmax function, the train loss disappears and the accuracy suddenly drops to 0. What could be the cause behind this?

goldpiggy · July 7, 2020, 11:42pm

Hi @Kushagra_Chaturvedy,

In the last line shouldn’t we have applied the softmax function to the return value H@W2 + b2? Isn’t there a chance that this operation would yield a negative value or a value greater than 1?

The loss function processes the raw output values of net(X) and applies softmax internally, so net(X) does not need to yield values in (0, 1).

When I do use the softmax function, the train loss disappears and the accuracy suddenly drops to 0. What could be the cause behind this?

Could you show us the code so that we can reproduce the results?

Kushagra_Chaturvedy · July 9, 2020, 4:58am

Here is my code:

manuel-arno-korfmann · July 9, 2020, 4:12pm

Hey,

The CrossEntropyLoss function already computes the Softmax.

Also see: https://d2l.ai/chapter_linear-networks/softmax-regression-concise.html#softmax-implementation-revisited
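
To illustrate the point, here is a minimal sketch in plain Python (no PyTorch) of a cross-entropy loss that takes raw logits, which is how I understand CrossEntropyLoss to behave. The helper names `softmax` and `cross_entropy_from_logits` and the example numbers are made up for illustration. It does not try to reproduce the exact symptom reported above; it only shows that feeding already-softmaxed outputs into such a loss changes the result, and that the loss can then no longer approach zero.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, label):
    # Equivalent to -log(softmax(logits)[label]), written in log-sum-exp form.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

logits = [2.0, -1.0, 0.5]   # hypothetical raw outputs of net(X) for one example
label = 0                   # hypothetical true class

# Intended usage: pass raw logits, softmax is applied inside the loss.
loss_once = cross_entropy_from_logits(logits, label)

# Softmax applied in the network AND again inside the loss.
loss_twice = cross_entropy_from_logits(softmax(logits), label)

# Even a perfectly confident, correct prediction cannot drive the
# double-softmax loss to zero, because its inputs are confined to [0, 1].
loss_floor = cross_entropy_from_logits([1.0, 0.0, 0.0], 0)

print(loss_once, loss_twice, loss_floor)
```

With softmax applied twice, the inputs to the loss are squeezed into [0, 1], so the predicted distribution is much flatter than intended.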

Kushagra_Chaturvedy · July 15, 2020, 4:04am

Oh I see. Thanks @manuel-arno-korfmann

StevenJokes · July 15, 2020, 1:55pm

What’s your IDE? I’m curious. Thanks.

Xiaomut · August 17, 2020, 11:15am

I think it’s VS Code with a plugin for viewing notebooks.

StevenJokes · August 17, 2020, 1:16pm

What’s your IDE?
@Xiaomut
I am using VS Code, and I can’t find this.
@Kushagra_Chaturvedy

ccpvirus · August 18, 2020, 5:48pm

How do you use softmax during testing? Since softmax is implemented in the loss function, net(X) does not apply softmax to its output, and the result of argmax is the maximum of the logits, not of the probabilities. I wonder how you use softmax when testing.

StevenJokes · August 19, 2020, 6:12am

@ccpvirus
We use softmax to calculate the probabilities first, and then pick the class with the highest probability.

goldpiggy · August 19, 2020, 8:45pm

Hi @ccpvirus, as @StevenJokes mentioned, we use the “maximum” value across the 10 class outputs as our final label. As softmax is just a monotonic “rescaling” function, it doesn’t affect whether a prediction output (i.e., a class label) is or isn’t the maximum over all classes. Let me know whether this is clear to you.
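
A small plain-Python check of this, using made-up logits and hypothetical helpers `softmax` and `argmax`: the index of the largest logit is also the index of the largest softmax probability, so taking argmax directly on the output of net(X) gives the same predicted label.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    # Index of the largest element.
    return max(range(len(values)), key=lambda i: values[i])

logits = [-1.2, 3.4, 0.7, 3.1]   # hypothetical outputs of net(X) for one example
probs = softmax(logits)

pred_from_logits = argmax(logits)
pred_from_probs = argmax(probs)
print(pred_from_logits, pred_from_probs)
```

Softmax is only needed at test time if you want the probabilities themselves, for example to report a confidence.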

Gavin · August 27, 2020, 3:15am

Dear all, may I know why we use torch.randn to initialize the parameters here instead of torch.normal, as in the softmax regression implementation? Are there any advantages? Or is there no real difference, so that we can use either? Thanks.

W1 = nn.Parameter(torch.randn(
    num_inputs, num_hiddens, requires_grad=True) * 0.01)

StevenJokes · August 27, 2020, 4:00am

@Gavin
For a standard normal distribution (i.e. mean=0 and variance=1), you can use torch.randn.
For a custom mean and std, you can use torch.normal.
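
The two approaches describe the same distribution once the standard normal samples are scaled: multiplying by 0.01 gives a standard deviation of 0.01. Here is a rough sketch using Python's standard `random` module as a stand-in for the torch calls (the analogy is `torch.randn(...) * 0.01` versus `torch.normal(0, 0.01, size=...)`); the sample count and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed, chosen arbitrarily
n = 10_000

# Analogue of torch.randn(...) * 0.01: standard normal samples, then scaled.
scaled = [rng.gauss(0.0, 1.0) * 0.01 for _ in range(n)]

# Analogue of torch.normal(0, 0.01, ...): sample with the target std directly.
direct = [rng.gauss(0.0, 0.01) for _ in range(n)]

scaled_mean, scaled_std = statistics.fmean(scaled), statistics.pstdev(scaled)
direct_mean, direct_std = statistics.fmean(direct), statistics.pstdev(direct)

print(scaled_mean, scaled_std)
print(direct_mean, direct_std)
```

Both lists should show a mean close to 0 and a standard deviation close to 0.01, so the choice between the two is mostly a matter of style.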

Abinash_Sahu · September 4, 2020, 1:44am

Hello. Can you please advise why 0.01 is being multiplied after generating the random numbers?
W1 = nn.Parameter(torch.randn(
    num_inputs, num_hiddens, requires_grad=True) * 0.01)

StevenJokes · September 4, 2020, 3:35am

@Abinash_Sahu
Multiplying by 0.01 sets the standard deviation of the initial weights to 0.01.

Luis_Ramirez · October 4, 2020, 10:35am

Hi, I have a question about the last exercise. What would be a smart way to search over hyperparameters? Is it possible to apply something like GridSearch or RandomGridSearch from scikit-learn?
If not, then how should we iterate through a set of hyperparameters?

Thanks in advance for any answers.
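
For concreteness, iterating through a set of hyperparameters by hand could look roughly like the sketch below. Everything in it is an assumption for illustration: `train_and_evaluate` is a hypothetical placeholder that returns a toy score (in practice it would train the MLP and return validation accuracy), and the candidate values in `grid` are arbitrary.

```python
import itertools
import math
import random

def train_and_evaluate(lr, num_hiddens):
    # Hypothetical placeholder: a toy score that peaks at lr=0.1 and
    # num_hiddens=256. Replace with real training plus validation accuracy.
    return -(math.log10(lr) + 1) ** 2 - ((num_hiddens - 256) / 256) ** 2

# Arbitrary candidate values for two hyperparameters.
grid = {"lr": [0.01, 0.1, 0.5], "num_hiddens": [64, 256, 1024]}

def grid_search(grid):
    # Try every combination of candidate values.
    names = list(grid)
    best_score, best_config = float("-inf"), None
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = train_and_evaluate(**config)
        if score > best_score:
            best_score, best_config = score, config
    return best_config, best_score

def random_search(grid, num_trials, seed=0):
    # Try a fixed number of randomly drawn combinations.
    rng = random.Random(seed)
    best_score, best_config = float("-inf"), None
    for _ in range(num_trials):
        config = {name: rng.choice(values) for name, values in grid.items()}
        score = train_and_evaluate(**config)
        if score > best_score:
            best_score, best_config = score, config
    return best_config, best_score

grid_config, grid_score = grid_search(grid)
rand_config, rand_score = random_search(grid, num_trials=5)
print(grid_config, grid_score)
print(rand_config, rand_score)
```

The score used for selection should come from a validation set rather than the test set, otherwise the reported test accuracy is optimistic.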

machine_machine · October 4, 2020, 2:51pm

Hi,

How can I download the book in HTML, as it appears on the website?
It is more interactive than the PDF version.
Does the d2l.ai book include all the course code in PyTorch?
Thanks, I am just starting it.