
About the acc and loss of training #1

Open
SherrySky97 opened this issue Nov 25, 2019 · 4 comments

Comments

@SherrySky97

Hi,
I am reproducing your network and have run into a problem during training.
To save time, I reduced the number of rotation-augmented images to 4 and the number of filters to 32.
After training for 200 epochs, my results are as follows:
D_loss   | acc | G_loss   | elapsed time
0.004051 | 100 | 6.248324 | 2 days, 12:04:16.416641
0.003934 | 100 | 5.108233 | 2 days, 12:04:35.168347
0.010535 | 100 | 4.973368 | 2 days, 12:04:53.246595
0.004152 | 100 | 5.723831 | 2 days, 12:05:10.234157
0.006562 | 100 | 5.827019 | 2 days, 12:05:28.685273
0.003435 | 100 | 5.879334 | 2 days, 12:05:46.087877
I want to know whether your results are similar to mine.
Since a GAN is expected to converge with D's accuracy around 50% while G's loss stays as small as possible, this result does not meet expectations.
Do you have a way to optimize the network?
Your work has helped me a lot.
Thanks,
Best wishes.

@ymirsky
Owner

ymirsky commented Nov 26, 2019

You may find that the acc will go to 100%, and that is fine for a short while. Just keep an eye on the outputs to be sure there is no collapse (strange pixels or artifacts). 200 epochs may be too much unless you increase your dataset size with many more distinct samples.
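For context, here is a minimal NumPy sketch (my own illustration, not the repository's code) of why the reported accuracy saturates at 100%: Keras-style binary accuracy thresholds each prediction at 0.5 and compares it to the label, so a discriminator that is even mildly confident on every sample reports 100% long before any visible collapse.

```python
import numpy as np

# Hypothetical per-sample discriminator outputs and their ground truths.
preds = np.array([0.04, 0.11, 0.02, 0.08, 0.15, 0.03, 0.07, 0.09])
labels = np.zeros(8)  # e.g. all samples labeled 0.0

# Keras-style binary accuracy: threshold at 0.5, then compare to labels.
acc = np.mean((preds > 0.5).astype(float) == labels)
print(acc * 100)  # 100.0
```

Every prediction is on the correct side of 0.5, so accuracy is already 100% even though none of the outputs are exactly 0 — which is why the loss values above still move while the accuracy column stays pinned.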

If you require the pre-trained models I can share them with you for research only. Please email me and tell me which university + lab you are affiliated with.

@SherrySky97
Author

Hi,
Do you train the model on a GPU?
During training I found that Volatile GPU-Util stayed at 0%, even though each GPU showed 161 MiB of memory occupied.
My training is also very slow: one epoch takes 8 hours.
Do you have a way to speed up training?
Thanks,
Best wishes.
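As an aside, a quick diagnostic for this symptom (assuming an NVIDIA GPU; these are standard tools, not part of this repository): a small fixed memory allocation with 0% utilization usually means the framework silently fell back to the CPU.

```shell
# Check whether a python process actually appears in the GPU process list.
nvidia-smi

# With TF 1.x (current when this thread was written), the GPU build is a
# separate package; a plain `tensorflow` install runs on the CPU only.
pip list | grep -i tensorflow
# If only `tensorflow` is listed, the GPU build would be:
# pip install tensorflow-gpu
```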

@ymirsky
Copy link
Owner

ymirsky commented Dec 11, 2019 via email

@SherrySky97
Copy link
Author

Thank you, now I can train the net on a GPU.
I have a few questions:

1. I find the output of the discriminator is 2x2x2x1:
   conv3d_5 (Conv3D) (None, 2, 2, 2, 1) 16385 batch_normalization_3[0][0]
   but I think it should be 1x1x1x1 (a single number).
2. The adversarial-loss ground truths in your net are
   valid = np.zeros((batch_size,) + self.disc_patch)
   fake = np.ones((batch_size,) + self.disc_patch)
   but in other nets they are
   valid = np.ones((batch_size, 1))
   fake = np.zeros((batch_size, 1))
   Why are the labels reversed?
3. Why is the learning rate of D (0.000001) so small?
4. Have you used binary_crossentropy loss? I found that most cGANs on GitHub use it.
5. Why do you use MSE loss on D?

Can you help me with these questions?
Thanks,
Best wishes.
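For what it's worth, a small NumPy sketch (my own illustration, assuming `disc_patch == (2, 2, 2, 1)`) of how questions 1 and 2 are connected: a PatchGAN-style discriminator classifies every patch of the volume rather than the whole input, so its output is a grid of decisions and the ground truths must be patch-shaped grids, not single scalars.

```python
import numpy as np

# Assumption: disc_patch matches the Conv3D output (None, 2, 2, 2, 1).
batch_size = 4
disc_patch = (2, 2, 2, 1)

# Patch-wise ground truths, one label per output patch
# (zeros for real and ones for fake, mirroring the snippet in question 2).
valid = np.zeros((batch_size,) + disc_patch)
fake = np.ones((batch_size,) + disc_patch)

print(valid.shape)  # (4, 2, 2, 2, 1) -- matches the 2x2x2x1 output

# The common scalar convention only fits a D that outputs one number:
valid_scalar = np.ones((batch_size, 1))
print(valid_scalar.shape)  # (4, 1)
```

Which value marks "real" is just a convention the loss is trained against, as long as it is used consistently; MSE against soft targets like these is the LSGAN formulation, which is one common reason to prefer `mse` over `binary_crossentropy` for D.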
