Hi,
Thank you for the good work.
As you stated in "Implementation Details": "We use 4 NVIDIA RTX 3090 GPUs for the model training, and the average running time for one epoch is around 20 hours." What is the actual total training time for the full model? Since the number of epochs is set to 100 in the config file (e.g., for the XSum dataset), training for all of them would take a very long time (20 h × 100?).
Thank you very much!
Hi, the 100 epochs in the config is just a default value, not the number of epochs actually needed. On CNN/DM the model reaches its best performance within one epoch. On XSum, it reaches its best performance within 5 epochs.
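For a rough sense of the wall-clock cost, here is a minimal sketch of the arithmetic. It assumes the quoted average of about 20 hours per epoch applies to both datasets, which this thread does not confirm, and it treats "within N epochs" as an upper bound. The function name `estimated_hours` is hypothetical and not part of the repository.

```python
# Rough training-time estimate. ASSUMPTION: the ~20 h/epoch average quoted
# from "Implementation Details" holds for every dataset on 4x RTX 3090.
HOURS_PER_EPOCH = 20


def estimated_hours(epochs, hours_per_epoch=HOURS_PER_EPOCH):
    """Return the estimated wall-clock hours for a given number of epochs."""
    return epochs * hours_per_epoch


# Upper bounds on the epochs needed, taken from the reply above.
epochs_to_best = {"CNN/DM": 1, "XSum": 5}

for dataset, epochs in epochs_to_best.items():
    print(f"{dataset}: at most about {estimated_hours(epochs)} hours")

# The config default of 100 epochs, for comparison.
print(f"Default (100 epochs): about {estimated_hours(100)} hours")
```

Under these assumptions, the runs needed to reach the best checkpoint are far shorter than the 100-epoch default suggests, so stopping early once validation performance plateaus is reasonable.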