Add script to fetch PR review comments #1722

jonsimantov · 2025-06-07T02:03:23Z

This commit introduces a new script scripts/gha/get_pr_review_comments.py that allows you to fetch review comments from a specified GitHub Pull Request. The comments are formatted to include the commenter, file path, line number, diff hunk, and the comment body, making it easy to copy and paste for review.

The script utilizes a new function get_pull_request_review_comments added to the existing scripts/gha/firebase_github.py library. This new function handles fetching line-specific comments from the GitHub API, including pagination.

The script takes a PR number as a required argument and can optionally take repository owner, repository name, and GitHub token as arguments, with the token also being configurable via the GITHUB_TOKEN environment variable.

Description

Provide details of the change, and generalize the change in the PR title above.

Testing

Describe how you've tested these changes. Link any manually triggered Integration tests or CPP binary SDK Packaging Github Action workflows, if applicable.

Type of Change

Place an x the applicable box:

Bug fix. Add the issue # below if applicable.
New feature. A non-breaking change which adds functionality.
Other, such as a build process or documentation change.

Notes

Bug fixes and feature changes require an update to the Release Notes section of release_build_files/readme.md.
Read the contribution guidelines CONTRIBUTING.md.
Changes to the public API require an internal API review. If you'd like to help us make Firebase APIs better, please propose your change in a feature request so that we can discuss it together.

This commit introduces a new script `scripts/gha/get_pr_review_comments.py` that allows you to fetch review comments from a specified GitHub Pull Request. The comments are formatted to include the commenter, file path, line number, diff hunk, and the comment body, making it easy to paste into me for review. The script utilizes a new function `get_pull_request_review_comments` added to the existing `scripts/gha/firebase_github.py` library. This new function handles fetching line-specific comments from the GitHub API, including pagination. The script takes a PR number as a required argument and can optionally take repository owner, repository name, and GitHub token as arguments, with the token also being configurable via the GITHUB_TOKEN environment variable.

scripts/gha/get_pr_review_comments.py

This commit significantly enhances the `scripts/gha/get_pr_review_comments.py` script and its underlying library function in `scripts/gha/firebase_github.py`. Key improvements include: - Copyright year updated to 2025. - Output now includes comment ID, in_reply_to_id (if applicable), and the comment creation timestamp. - Comments are marked as [OUTDATED] if their diff position is no longer current (i.e., API 'position' field is null). - Added a `--context-lines <N>` argument (default 10) to control the amount of diff hunk context displayed. N=0 shows the full hunk. - Introduced a `--since <ISO_8601_timestamp>` argument to filter comments, showing only those created at or after the specified time. The `get_pull_request_review_comments` function in the library was updated to support this `since` parameter in the API call. These changes provide more comprehensive comment information and allow for better control over the data fetched, making it more useful for reviewing and addressing PR feedback, especially in complex PRs with multiple review rounds.

This commit fixes an IndentationError in the `scripts/gha/get_pr_review_comments.py` script. The error was caused by a malformed comment on the final print statement within the main loop. The stray comment has been removed and the print statement's newline character has been ensured. This resolves the syntax error and allows the script to be parsed and executed correctly.

This commit fixes an issue in `scripts/gha/get_pr_review_comments.py` where the `--context-lines` argument did not correctly suppress the full diff hunk for comments not associated with a specific line (i.e., where the API 'position' field is null). The `print_contextual_diff_hunk` function has been updated to: - Print an explanatory message instead of the full diff hunk when `--context-lines > 0` and the comment's position is null or invalid. - Retain the behavior of printing the full diff hunk if `--context-lines 0` is specified. - A redundant line in the context calculation logic was also removed. This ensures that setting a context limit via `--context-lines` will not unexpectedly display full diff hunks for file-level or other non-line-specific comments.

scripts/gha/get_pr_review_comments.py

This commit refactors the `scripts/gha/get_pr_review_comments.py` script to simplify its output and add new filtering capabilities, based on your feedback. Key changes: - Diff Hunk Display: The complex contextual diff hunk display has been removed. The script now either shows the full diff hunk (if `--context-lines 0`) or the last N lines of the diff hunk (if `--context-lines N > 0`). The `print_contextual_diff_hunk` function was removed, and this logic is now inline. - Skip Outdated Comments: A new `--skip-outdated` flag allows you to exclude comments marked as [OUTDATED] from the output. - Line Number Display: For [OUTDATED] comments, the script now prefers `original_line` for the "Line in File Diff" field, falling back to `line`, then "N/A". - Metadata: Continues to display comment ID, reply ID, timestamp, status, user, file, URL, and body. These changes aim to make the script easier to maintain and its output more predictable, while still providing essential information and filtering options for reviewing PR comments.

scripts/gha/get_pr_review_comments.py

This commit applies two minor updates to the `scripts/gha/get_pr_review_comments.py` script: 1. The script's description in the command-line help (argparse) has been made more generic, changing from "format for use with me" to "format into a simple text output". 2. The diff hunk context displayed for each comment is now enclosed in triple backticks (```) to ensure it's rendered correctly as a preformatted code block in Markdown environments. These changes improve the script's general usability and the presentation of its output.

This commit modifies the "suggest next command" feature in `scripts/gha/get_pr_review_comments.py`. The time added to the last processed comment's timestamp (for the `--since` parameter in the suggested command) has been changed from 1 second to 2 seconds. This adjustment provides a slightly larger buffer to more reliably exclude already seen comments when fetching subsequent comments, addressing potential timestamp granularity or query resolution behavior observed with the GitHub API. The `since` parameter for the relevant API endpoint filters by `created_at`, and this change is a heuristic improvement for that existing logic.

scripts/gha/get_pr_review_comments.py

This commit applies minor textual updates to the `scripts/gha/get_pr_review_comments.py` script: - Removed an explanatory comment from the `import firebase_github` line for a cleaner import block. - Refined the script's description in the command-line help text for slightly improved conciseness (removed an article "a").

This commit updates the `scripts/gha/get_pr_review_comments.py` script to format its entire output using Markdown. This significantly improves the readability and structure of the comment data when pasted into Markdown-aware systems. Changes include: - Comment attribution (user, ID, reply ID) is now an H3 heading with bolding and code formatting. - Metadata (Timestamp, Status, File, Line, URL) is presented as a Markdown bulleted list with bold labels and appropriate formatting for values (code ticks, links). - "Diff Hunk Context" and "Comment Body" are now H4 headings. - The diff hunk itself remains wrapped in triple backticks for code block rendering. - A Markdown horizontal rule (---) is used to separate individual comments. These changes make the script's output more organized and easier to parse visually.

This commit refines the Markdown heading structure in the output of `scripts/gha/get_pr_review_comments.py` for improved readability and document hierarchy. Changes include: - The main output title "Review Comments" is now an H1 heading. - Each comment's attribution line (user, ID) is now an H2 heading. - Section headings within each comment, "Context" (formerly "Diff Hunk Context") and "Comment" (formerly "Comment Body"), are now H3 headings. These changes make the script's output more organized and easier to navigate when rendered as Markdown.

scripts/gha/get_pr_review_comments.py

This commit applies final readability adjustments to the output of `scripts/gha/get_pr_review_comments.py`: - The default value for the `--context-lines` argument has been changed from 10 to 0. This means the full diff hunk will be displayed by default, aligning with your feedback preferring more context initially unless otherwise specified. The help text for this argument has also been updated. - Markdown Spacing: - An additional newline is added after the main H1 title ("# Review Comments") for better separation. - A newline is added before the "### Context:" H3 subheading to separate it from the metadata list. - A newline is added before the "### Comment:" H3 subheading to separate it from the diff hunk block. These changes further refine the script's output for clarity and your experience.

This commit introduces a more granular system for classifying and filtering pull request review comments in the `scripts/gha/get_pr_review_comments.py` script. New Comment Statuses: - `[IRRELEVANT]`: Comment's original diff position is lost (`position` is null). Displays `original_line`. - `[OLD]`: Comment is anchored to the diff, but its line number has changed (`line` != `original_line`). Displays current `line`. - `[CURRENT]`: Comment is anchored and its line number is unchanged. Displays current `line`. New Command-Line Flags: - `--exclude-old` (default False): If set, hides `[OLD]` comments. - `--include-irrelevant` (default False): If set, shows `[IRRELEVANT]` comments (which are hidden by default). - The old `--skip-outdated` flag has been removed. Default Behavior: - Shows `[CURRENT]` and `[OLD]` comments. - Hides `[IRRELEVANT]` comments. This provides you with more precise control over which comments are displayed, improving the script's utility for various review workflows. The "suggest next command" feature correctly interacts with these new filters, only considering non-skipped comments for its timestamp calculation.

scripts/gha/get_pr_review_comments.py

This commit enhances `scripts/gha/get_pr_review_comments.py` in two ways: 1. Suggested Command: The "suggest next command" feature now prepends `sys.executable` to the command. This ensures that the suggested command uses the same Python interpreter that the script was originally run with, making it more robust across different environments or if a specific interpreter was used. 2. Diff Hunk Context Display: - The default for `--context-lines` is now 10 (reverted from 0). - When `--context-lines > 0`, the script will first print the diff hunk header line (if it starts with "@@ "). - It will then print the last N (`args.context_lines`) lines from the *remainder* of the hunk. This ensures the header is shown for context, and then the trailing lines of that hunk are displayed, avoiding double-printing of the header if it would have naturally fallen into the "last N lines" of the full hunk. - If `--context-lines == 0`, the full hunk is displayed.

scripts/gha/get_pr_review_comments.py

This commit makes a minor stylistic refactoring in the `scripts/gha/get_pr_review_comments.py` script. When displaying the trailing lines of a diff hunk (for `--context-lines > 0`), the script now uses `print("\n".join(lines))` instead of a `for` loop with `print()` for each line. This change achieves the same visual output but is more concise and Pythonic for joining and printing a list of strings as multiple lines.

… (updated_at) This commit modifies `scripts/gha/get_pr_review_comments.py` to correctly use `updated_at` timestamps for its `--since` filtering and the "suggest next command" feature. This aligns with the observed behavior of the GitHub API endpoint for listing pull request review comments, where the `since` parameter filters by update time rather than creation time (contrary to some initial documentation interpretations for this specific endpoint). Changes include: - The "suggest next command" feature now tracks the maximum `updated_at` timestamp from processed comments to calculate the `--since` value for the next suggested command. - The help text for the `--since` argument has been updated to clarify it filters by "updated at or after". - The informational message printed to stderr when the `--since` filter is active now also states "updated since". - The `created_at` timestamp continues to be displayed for each comment for informational purposes.

scripts/gha/get_pr_review_comments.py

This commit makes a minor stylistic refactoring in the `scripts/gha/get_pr_review_comments.py` script. When displaying the trailing lines of a diff hunk (for `--context-lines > 0`, after the header line is potentially printed and removed from the `hunk_lines` list), the script now uses `print("\n".join(hunk_lines[-args.context_lines:]))` instead of explicitly creating a sub-list and then looping through it with `print()` for each line. This change achieves the same visual output (printing a newline if `hunk_lines` becomes empty after header removal) but is more concise.