Dec '20

StevenJokess

Still stuck porting the GAN to TensorFlow…
Can anyone help?

Apr '21

StevenJokess

https://www.heywhale.com/mw/project/6062924694a58b00178fc4d6

Apr '21

StevenJokess

TF finished!
http://preview.d2l.ai/d2l-en/PR-1716/chapter_generative-adversarial-networks/gan.html

Jun '21

telegnosis

loss = nn.BCEWithLogitsLoss(reduction='sum')

With reduction='sum', I think the model does not take the data size (batch size) into account during gradient descent: the summed loss, and therefore the gradient, scales with the number of examples in the batch.

Isn’t it better to use

loss = nn.BCEWithLogitsLoss(reduction='mean')

together with

metric.add(update_D(X, Z, net_D, net_G, loss, trainer_D) * batch_size,
           update_G(Z, net_D, net_G, loss, trainer_G) * batch_size,
           batch_size)
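
A quick sanity check that the two reductions differ only by the batch-size factor; this is a minimal sketch with made-up logits, assuming one discriminator output per example:

import torch
from torch import nn

logits = torch.randn(8, 1)    # stand-in discriminator outputs
targets = torch.ones(8, 1)    # stand-in labels
loss_sum = nn.BCEWithLogitsLoss(reduction='sum')(logits, targets)
loss_mean = nn.BCEWithLogitsLoss(reduction='mean')(logits, targets)
print(torch.allclose(loss_sum, loss_mean * logits.numel()))  # True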

If this is right, could I commit this change?

Thanks

1 reply
Jun '21 ▶ telegnosis

telegnosis

@goldpiggy @astonzhang Thanks in advance 🙂

Jul '21

Liam

From the text:

“If the generator does a perfect job, then D(x′) ≈ 1 so the above loss near 0, which results the gradients are too small to make a good progress for the discriminator. So commonly we minimize the following loss:

min_G {−y log D(G(z))} = min_G {−log D(G(z))},

which is just feed x′ = G(z) into the discriminator but giving label y = 1.”

The sentences above look quite confusing to me… There might be some grammatical errors in there…
Could you please rewrite them so they read coherently and naturally?

1 reply
Jul '22

Todd_Northward

This sentence also confuses me a lot. Authors, please correct it, or at least point to the source of this sentence.

Oct '22

zincyxsnow

Same for me. I don’t understand why the above loss is near 0 and why the gradients become small.

1 reply
Mar '23

omarfarooq47

The equation states that we need the parameters of the generator that maximize the loss, and the parameters of the discriminator that minimize the loss. However, in the loss plot we see that the discriminator loss increases while the generator loss decreases. Can someone please clarify?

As the model gets trained, the discriminator loss increases because it is increasingly being fooled by the generator. The generator loss decreases because the generator’s outputs are increasingly being predicted as 1 (loss_G = loss(D(G(Z)), ones)), so loss_G gets smaller and smaller.
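
For reference, here is roughly what update_G looks like in the chapter’s PyTorch code (a sketch from memory, so names and details may differ slightly):

import torch

def update_G(Z, net_D, net_G, loss, trainer_G):
    """Update the generator: try to make D label the fakes as real (y = 1)."""
    batch_size = Z.shape[0]
    ones = torch.ones((batch_size,), device=Z.device)
    trainer_G.zero_grad()
    fake_X = net_G(Z)        # generate fakes from the noise Z
    fake_Y = net_D(fake_X)   # discriminator logits on the fakes
    loss_G = loss(fake_Y, ones.reshape(fake_Y.shape))
    loss_G.backward()
    trainer_G.step()
    return loss_G

So as the discriminator gets fooled more often, fake_Y drifts toward “real”, loss_G shrinks, and the discriminator’s own loss on those same fakes grows.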

4 Feb ▶ zincyxsnow

Denis_Kazakov

There is an error in the text. For the loss function (20.1.1) to work, D should be the probability that the data is real, and this is what the original paper by Goodfellow et al. says. So if the discriminator does a good job, D(G(z)) should be close to 0, not 1. Then the log will be close to 0 and the gradients will be small.
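
In symbols (following Goodfellow et al. (2014), with D(x) the probability that x is real), the minimax objective is:

\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

If D does its job, D(G(z)) ≈ 0, so log(1 − D(G(z))) ≈ log 1 = 0 and its gradient with respect to the generator saturates; that is why one usually trains G with the non-saturating loss min_G −log D(G(z)) instead. A tiny numeric check of the saturation, assuming only torch:

import torch

# A discriminator that confidently rejects a fake: D(G(z)) = sigmoid(-5) ≈ 0.007
logit = torch.tensor(-5.0, requires_grad=True)
torch.log(1 - torch.sigmoid(logit)).backward()
print(logit.grad)  # ≈ -0.0067: the saturating loss gives G almost no signal

logit = torch.tensor(-5.0, requires_grad=True)
(-torch.log(torch.sigmoid(logit))).backward()
print(logit.grad)  # ≈ -0.9933: the non-saturating loss keeps a healthy gradient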