Generative Adversarial Networks

astonzhang · September 17, 2020, 5:09am

https://d2l.ai/chapter_generative-adversarial-networks/gan.html

The PyTorch adaptation of this section was initially contributed by @StevenJokess, then reviewed and revised by @anirudh. The PR may not show up on GitHub because the former's account was suspended.

StevenJokess · December 29, 2020, 10:37am

Still stuck on porting the GAN to TensorFlow…
Can anyone help?

github.com

StevenJokess/d2l-en-read/blob/moreme/chapter-generative-adversarial-networks/gan-tf6.ipynb


StevenJokess · April 11, 2021, 3:34pm

https://www.heywhale.com/mw/project/6062924694a58b00178fc4d6

StevenJokess · April 11, 2021, 4:23pm

TF finished!
http://preview.d2l.ai/d2l-en/PR-1716/chapter_generative-adversarial-networks/gan.html

github.com

StevenJokess/d2l-en-read/blob/moreme/chapter-generative-adversarial-networks/gan-tf-ok/gan_tf6 (3).ipynb


telegnosis · June 1, 2021, 3:34am

loss = nn.BCEWithLogitsLoss(reduction='sum')

With reduction='sum', I think the loss (and therefore the gradient) is not normalized by the batch size, so the size of each gradient step depends on the batch size.

Wouldn't it be better to use
loss = nn.BCEWithLogitsLoss(reduction='mean')
together with
metric.add(update_D(X, Z, net_D, net_G, loss, trainer_D) * batch_size, update_G(Z, net_D, net_G, loss, trainer_G) * batch_size, batch_size)

If this is right, could I submit this change?

Thanks
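
A minimal sketch of the scaling in question, using only the Python standard library. The helpers bce_with_logits and bce_grad are hypothetical stand-ins that follow the usual binary cross-entropy-with-logits formula; they are not the PyTorch implementation, and the logits and labels are made-up numbers. Under those assumptions, the 'sum' loss and its gradient are exactly batch_size times the 'mean' versions:

```python
import math

def bce_with_logits(logit, target):
    """Binary cross-entropy for one example, given a raw logit (hypothetical helper)."""
    # Numerically stable form: max(o, 0) - o * y + log(1 + exp(-|o|))
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def bce_grad(logit, target):
    """Derivative of the per-example loss w.r.t. the logit: sigmoid(o) - y."""
    return 1.0 / (1.0 + math.exp(-logit)) - target

logits = [2.0, -1.0, 0.5, 3.0]   # made-up discriminator outputs
targets = [1.0, 0.0, 1.0, 0.0]   # made-up labels
batch_size = len(logits)

# 'sum' reduction adds the per-example losses; 'mean' divides that by the batch size.
loss_sum = sum(bce_with_logits(o, y) for o, y in zip(logits, targets))
loss_mean = loss_sum / batch_size

# Per-logit gradients under each reduction differ by the same factor.
grad_sum = [bce_grad(o, y) for o, y in zip(logits, targets)]
grad_mean = [g / batch_size for g in grad_sum]

print(loss_sum, loss_mean * batch_size)
```

With plain SGD, switching between the two reductions is therefore equivalent to rescaling the learning rate by the batch size; whether that matters in practice depends on the optimizer and on how the metric is accumulated.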

telegnosis · June 2, 2021, 3:56am

@goldpiggy @astonzhang Thanks in advance

Liam · July 11, 2021, 1:08pm

If the generator does a perfect job, then $D(\mathbf{x}') \approx 1$ so the above loss near 0, which results the gradients are too small to make a good progress for the discriminator. So commonly we minimize the following loss:
…
which is just feed $\mathbf{x}' = G(\mathbf{z})$ into the discriminator but giving label $y = 1$.

The sentences above look quite confusing to me… There might be some grammatical errors in there…
Could you please rewrite these so they look coherent and natural?

Todd_Northward · July 18, 2022, 3:34am

This sentence also confuses me a lot. Authors, please correct this, or at least cite the source of this sentence.

zincyxsnow · October 3, 2022, 7:39am

Same here. I don't understand why the above loss is near 0 and why the gradients become small.

omarfarooq47 · March 28, 2023, 9:38pm

The equation states that we need the parameters of the generator that maximize the loss, and the parameters of the discriminator that minimize the loss. However, in the loss plot we see that the discriminator loss increases, while the generator loss decreases. Can someone please clarify?

As the model is trained, the discriminator loss increases because the discriminator is increasingly fooled by the generator. The generator loss decreases because the generator's outputs are increasingly predicted as 1 (loss_G = loss(D(G(z)), ones)), so loss_G gets smaller and smaller.
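
A rough numeric sketch of how the two losses move, assuming D outputs the probability that its input is real. The function names and the D(G(z)) values are made up for illustration, and only the fake-sample term of the discriminator loss is shown (the real-sample term is left out):

```python
import math

def loss_D_fake(d_fake):
    # Discriminator's loss on a fake sample (label 0): -log(1 - D(G(z)))
    return -math.log(1.0 - d_fake)

def loss_G(d_fake):
    # Generator's loss when the fake sample is given label 1: -log(D(G(z)))
    return -math.log(d_fake)

# Hypothetical values of D(G(z)) as the generator gets better at fooling D.
d_values = [0.1, 0.3, 0.5, 0.7, 0.9]
d_losses = [loss_D_fake(d) for d in d_values]
g_losses = [loss_G(d) for d in d_values]

for d, ld, lg in zip(d_values, d_losses, g_losses):
    print(f"D(G(z))={d:.1f}  loss_D_fake={ld:.3f}  loss_G={lg:.3f}")
```

As D(G(z)) rises toward 1, the fake-sample part of the discriminator loss grows while the generator loss shrinks, which is the direction described for the plot.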

Denis_Kazakov · February 4, 2025, 11:32am

There is an error in the text. For the loss function (20.1.1) to work, D should be the probability that the data is real, and this is what the original article by Goodfellow et al. says. So if the discriminator does a good job, D(G(z)) should be close to 0, not 1. Then log(1 − D(G(z))) will be close to 0 and the gradients will be small.
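
The small-gradient argument can be written out as a short derivation. This is a sketch under one assumption not stated in the thread: that $D = \sigma(o)$, where $o$ is the discriminator's raw output (logit) on the fake sample $G(\mathbf{z})$ and $\sigma$ is the sigmoid.

```latex
% Assumption: D(G(z)) = \sigma(o), with o the discriminator's logit on the fake sample.
% Saturating generator objective, minimized by the generator: \log(1 - D(G(z)))
\frac{\partial}{\partial o} \log\bigl(1 - \sigma(o)\bigr)
  = \frac{-\sigma(o)\bigl(1 - \sigma(o)\bigr)}{1 - \sigma(o)}
  = -\sigma(o) = -D(G(\mathbf{z}))

% Alternative loss with label y = 1: -\log D(G(z))
\frac{\partial}{\partial o} \bigl[-\log \sigma(o)\bigr]
  = -\frac{\sigma(o)\bigl(1 - \sigma(o)\bigr)}{\sigma(o)}
  = -\bigl(1 - D(G(\mathbf{z}))\bigr)
```

When the discriminator does a good job, $D(G(\mathbf{z})) \approx 0$, so the first gradient is close to 0 while the second has magnitude close to 1. That matches the reading that the small gradients appear when the discriminator, not the generator, is doing well.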