About training time #13

Open
EnghishYang opened this issue Aug 6, 2022 · 2 comments

@EnghishYang

Hi,

Thank you for the good work.

As you stated in "Implementation Details": "We use 4 NVIDIA RTX 3090 GPUs for the model training, and the average running time for one epoch is around 20 hours." What is the actual total training time for the full model? The config file sets epochs to 100 (e.g., for the XSum dataset), so training for all of them would take a very long time (20 h × 100?).

Thank you very much!

@yixinL7
Owner

yixinL7 commented Aug 6, 2022

Hi, the 100 epochs setting is just a default value. On CNN/DM, the model actually reaches its best performance within one epoch. On XSum, it reaches its best performance within 5 epochs.
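
For a rough sense of scale, the numbers in this thread can be turned into a back-of-the-envelope estimate. The sketch below assumes the roughly 20 hours per epoch quoted from the paper holds for both datasets on the same 4-GPU setup, which the thread does not confirm. The function name `estimate_training_hours` is hypothetical and not part of the repository, and the epoch counts are treated as upper bounds ("within" N epochs).

```python
# Back-of-the-envelope training-time estimate.
# ASSUMPTION: ~20 h per epoch (the figure quoted from the paper) applies to
# every dataset; actual per-dataset epoch times may differ.
HOURS_PER_EPOCH = 20.0


def estimate_training_hours(epochs: int, hours_per_epoch: float = HOURS_PER_EPOCH) -> float:
    """Return an upper-bound wall-clock estimate in hours for `epochs` epochs."""
    if epochs < 0:
        raise ValueError("epochs must be non-negative")
    return epochs * hours_per_epoch


# Epoch counts taken from the thread: best checkpoint within 1 epoch (CNN/DM),
# within 5 epochs (XSum), and the config default of 100.
scenarios = {
    "CNN/DM (best within 1 epoch)": 1,
    "XSum (best within 5 epochs)": 5,
    "config default (100 epochs)": 100,
}

for name, epochs in scenarios.items():
    hours = estimate_training_hours(epochs)
    print(f"{name}: up to {hours:.0f} h (about {hours / 24:.1f} days)")
```

Under these assumptions, stopping once the best checkpoint is reached costs at most a few days of training rather than the full default schedule.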

@EnghishYang
Author

Thank you very much for your reply!
