Apply BRIO to other generation tasks #9
Comments
Hi, thank you for your interest in our work. I'd recommend several things:
Please let me know if you have more questions. Good luck!
Hi @yixinL7, could you please share some insights about the
By the way, how did you come up with this eval function for the different datasets? Is there any criterion? Thank you so much for your amazing work! (Lines 411 to 416 in 135f0e5)
Thank you for your advice and kind words!
Hi @HillZhang1999, I think this may help:
@Hannibal046 Thanks a lot!
Thanks @Hannibal046 for the comment. Hi @HillZhang1999, I'm not very familiar with GEC, but I think your observation makes sense. It is critical to have diverse candidates so that the model can learn something meaningful. I'd recommend trying some other decoding algorithms. There are actually several new papers about this. To give some examples:
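Before switching decoders, it can help to measure how diverse the current candidates actually are. The following is only a minimal sketch, not part of the BRIO code: the function names `ngrams` and `mean_pairwise_jaccard` are hypothetical, whitespace tokenization is an assumption, and n-gram Jaccard overlap is just one possible proxy for diversity.

```python
from itertools import combinations


def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def mean_pairwise_jaccard(candidates, n=2):
    """Average Jaccard similarity of n-gram sets over all candidate pairs.

    Values close to 1.0 mean the candidates are nearly identical, so a
    ranking loss has little signal to learn from. Whitespace tokenization
    is an assumption made for this sketch.
    """
    sets = [ngrams(c.split(), n) for c in candidates]
    sims = []
    for a, b in combinations(sets, 2):
        union = a | b
        # Two empty n-gram sets are treated as identical.
        sims.append(len(a & b) / len(union) if union else 1.0)
    # With fewer than two candidates there is no diversity to measure.
    return sum(sims) / len(sims) if sims else 1.0
```

If this value stays close to 1.0 for most source sentences, as can happen in GEC where candidates differ by only a few tokens, the candidate set is probably too homogeneous for the ranking objective to be useful.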
Dear Yixin, thank you for your help. I will check it out.
Hi, thanks for this fantastic work.
Here is my question: I tried to use BRIO in another generation task and re-implemented it in Fairseq. However, I found that performance is relatively poor after incorporating BRIO.
I looked further into the generation results and found that many of them are just a single period. Moreover, the distribution of candidate scores seems to be isotropic after training with the contrastive loss (I set the hyper-parameters following the CNN setting in your paper), such as the example shown below:
Can you give me any advice?
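The two symptoms described above can be quantified with a small diagnostic. This is a rough sketch under stated assumptions: the helper names `degenerate_fraction` and `mean_score_spread` are hypothetical, "isotropic" is taken to mean that candidate scores for one source are nearly identical, and the inputs are assumed to be plain Python lists of decoded strings and of per-candidate model scores.

```python
import statistics
import string


def degenerate_fraction(outputs):
    """Fraction of outputs that are empty or only punctuation/whitespace."""
    if not outputs:
        return 0.0
    junk = set(string.punctuation + string.whitespace)
    # all(...) is True for an empty string, so empty outputs count too.
    bad = sum(1 for text in outputs if all(ch in junk for ch in text))
    return bad / len(outputs)


def mean_score_spread(score_lists):
    """Mean per-example population standard deviation of candidate scores.

    A value near zero means the model assigns almost the same score to
    every candidate of a given source.
    """
    spreads = [statistics.pstdev(s) for s in score_lists if len(s) > 1]
    return statistics.fmean(spreads) if spreads else 0.0
```

Tracking both numbers on a validation set before and after adding the contrastive loss would show whether the collapse appears during ranking training or is already present in the baseline.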