Fix comment clarity in maybe_evaluate function in grpo_fast.py
#1159
base: main
Conversation
Summary of Changes

Hello @MohdElgaar, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request focuses on enhancing code readability by clarifying a crucial comment in the maybe_evaluate function in grpo_fast.py.
Code Review
This pull request correctly clarifies a comment regarding the timeout logic in the maybe_evaluate function. The change is accurate and improves readability. I have one suggestion to refactor the conditional logic itself to make it more direct and self-documenting, which would further enhance the code's clarity.
| """Optionally evaluate the model.""" | ||
| try: | ||
| # timeout 0.01 if this is the last training step or we're not evaluating | ||
| # timeout 0.01 if this is not the last training step or we're not evaluating |
Thanks for fixing the comment; it's much clearer now. To further improve readability, consider refactoring the conditional logic on line 2565. The current logic is a bit inverted from how one might naturally read the condition. By flipping the if/else, the code can more directly express the primary case for the long timeout.
Suggestion: change line 2565 to:

timeout = 100 if training_step >= args.num_training_steps and args.local_eval_every >= 0 else 0.01

This makes it explicit that the long timeout is used only on the last step when evaluation is enabled, which aligns nicely with the second line of the comment block.
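To make the inversion concrete, here is a small hedged sketch contrasting the two forms. The exact shape of the current line 2565 is an assumption reconstructed from the comment being fixed, not copied from grpo_fast.py, and the stub values are hypothetical:

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the real training-loop state in grpo_fast.py.
args = SimpleNamespace(num_training_steps=100, local_eval_every=10)
training_step = 100  # pretend this is the last training step

# Current form (assumed): short timeout unless it's the last step with eval enabled.
timeout_current = 0.01 if training_step < args.num_training_steps or args.local_eval_every < 0 else 100

# Suggested form: state the long-timeout case directly.
timeout_suggested = 100 if training_step >= args.num_training_steps and args.local_eval_every >= 0 else 0.01

assert timeout_current == timeout_suggested == 100  # both pick the long timeout on the last step
```

Both expressions are equivalent; the refactored one simply reads in the same direction as the corrected comment.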
Head branch was pushed to by a user without write access
| "{{ '<think>\\n\\n</think>\\n\\n' }}" | ||
| "{% endif %}" | ||
| "{% endif %}" | ||
| ), |
Bug: Qwen3 Template: Undefined Variable Causes Runtime Error
The qwen3 chat template references ns.last_query_index at line 707 without initializing the ns namespace variable first. In Jinja2 templates, namespace variables must be declared using {% set ns = namespace(last_query_index=...) %} before they can be accessed. This will cause a runtime error when the template is used since ns is undefined.
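For context, a minimal, hypothetical Jinja2 sketch (not the repo's actual qwen3 template) of the pattern Bugbot is describing: the namespace has to be created with {% set ns = namespace(...) %} before ns.last_query_index can be assigned or read.

```python
from jinja2 import Template

template = Template(
    # Declare the namespace first; without this line, the `set ns.last_query_index`
    # assignment below raises a TemplateRuntimeError at render time.
    "{% set ns = namespace(last_query_index=-1) %}"
    "{% for i in range(messages | length) %}"
    "{% if messages[i]['role'] == 'user' %}"
    "{% set ns.last_query_index = i %}"
    "{% endif %}"
    "{% endfor %}"
    "last user turn: {{ ns.last_query_index }}"
)

print(template.render(messages=[
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello"},
    {"role": "user", "content": "thanks"},
]))  # -> last user turn: 2
```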
… accurately reflect timeout logic.
The comment describing the timeout logic in maybe_evaluate was confusing because it didn't clearly state when the short timeout (0.01) is used. It could be read as saying we use the short timeout when it IS the last step, and otherwise should use a long timeout.

Fixed by adding "not" to clarify: we use the short timeout when this is NOT the last training step or we're not evaluating. The long timeout (100) is only used at the last step.
Note
Clarifies the timeout comment in maybe_evaluate to correctly state when the short (0.01) vs long (100) timeout is used.

Written by Cursor Bugbot for commit b063ac9. This will update automatically on new commits.