diff --git a/.github/workflows/basic-tests.yml b/.github/workflows/basic-tests.yml index d197df3e74612..14f8b04225df7 100644 --- a/.github/workflows/basic-tests.yml +++ b/.github/workflows/basic-tests.yml @@ -316,7 +316,7 @@ jobs: run: > prek --all-files --show-diff-on-failure --color always --verbose - --hook-stage manual + --stage manual update-chart-dependencies if: always() # For UV we are not failing the upgrade installers check if it is updated because @@ -326,7 +326,7 @@ jobs: run: > prek --all-files --show-diff-on-failure --color always --verbose - --hook-stage manual upgrade-important-versions || true + --stage manual upgrade-important-versions || true if: always() env: UPGRADE_FLIT: "false" @@ -346,7 +346,7 @@ jobs: run: | if ! prek \ --all-files --show-diff-on-failure --color always --verbose \ - --hook-stage manual upgrade-important-versions; then + --stage manual upgrade-important-versions; then echo -e "\n\033[0;31mThere are changes in prek hooks after upgrade check.\033[0m" echo -e "\n\033[0;33mHow to fix:\033[0m Run \`breeze ci upgrade\` locally to fix it!.\n" exit 1 diff --git a/.github/workflows/ci-amd-arm.yml b/.github/workflows/ci-amd-arm.yml index 3643392e011b6..6ddff9f5aed97 100644 --- a/.github/workflows/ci-amd-arm.yml +++ b/.github/workflows/ci-amd-arm.yml @@ -210,7 +210,7 @@ jobs: save-cache: true - name: "Run pin-versions" run: | - if ! prek --all-files --verbose --hook-stage manual pin-versions; then + if ! prek --all-files --verbose --stage manual pin-versions; then echo -e "\n\033[0;31mThere are changes in actions hooks after upgrade check.\033[0m" echo -e "\n\033[0;33mHow to fix:\033[0m Run \`breeze ci upgrade\` locally to fix it!.\n" exit 1 diff --git a/.github/workflows/ci-image-checks.yml b/.github/workflows/ci-image-checks.yml index f4e302ab17036..fd8ea826aea47 100644 --- a/.github/workflows/ci-image-checks.yml +++ b/.github/workflows/ci-image-checks.yml @@ -206,7 +206,7 @@ jobs: platform: ${{ inputs.platform }} save-cache: false - name: "MyPy checks for ${{ matrix.mypy-check }}" - run: prek --color always --verbose --hook-stage manual "$MYPY_CHECK" --all-files + run: prek --color always --verbose --stage manual "$MYPY_CHECK" --all-files env: VERBOSE: "false" COLUMNS: "202" diff --git a/AGENTS.md b/AGENTS.md index 8427eacf54cff..50fc270d56b93 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -19,8 +19,8 @@ - **Type-check:** `breeze run mypy path/to/code` - **Lint with ruff only:** `prek run ruff --from-ref ` - **Format with ruff only:** `prek run ruff-format --from-ref ` -- **Run regular (fast) static checks:** `prek run --from-ref --hook-stage pre-commit` -- **Run manual (slower) checks:** `prek run --from-ref --hook-stage manual` +- **Run regular (fast) static checks:** `prek run --from-ref --stage pre-commit` +- **Run manual (slower) checks:** `prek run --from-ref --stage manual` `` is the branch the PR will be merged into — usually `main`, but could be `v3-1-test` when creating a PR for the 3.1 branch. @@ -114,9 +114,9 @@ code review checklist in [`.github/instructions/code-review.instructions.md`](.g API correctness, and AI-generated code signals. Fix any violations before pushing. 3. Confirm the code follows the project's coding standards and architecture boundaries described in this file. -4. Run regular (fast) static checks (`prek run --from-ref --hook-stage pre-commit`) +4. Run regular (fast) static checks (`prek run --from-ref --stage pre-commit`) and fix any failures. -5. Run manual (slower) checks (`prek run --from-ref --hook-stage manual`) and fix any failures. +5. Run manual (slower) checks (`prek run --from-ref --stage manual`) and fix any failures. 6. Run relevant tests (`breeze run pytest -xvs`) and confirm they pass. 7. Check for security issues — no secrets, no injection vulnerabilities, no unsafe patterns. diff --git a/INSTALL b/INSTALL index 0c264b9897c1e..cb3e045debdf3 100644 --- a/INSTALL +++ b/INSTALL @@ -131,7 +131,7 @@ In order to see UI in Airflow, you need to compile front-end assets first. In case you already installed `breeze` and `prek`, you can build the assets with the following commands: - prek --hook-stage manual compile-ui-assets --all-files + prek --stage manual compile-ui-assets --all-files or simply: diff --git a/airflow-core/hatch_build.py b/airflow-core/hatch_build.py index 018067a87a9ae..e199d4b70d343 100644 --- a/airflow-core/hatch_build.py +++ b/airflow-core/hatch_build.py @@ -65,7 +65,7 @@ def build_standard(self, directory: str, artifacts: Any, **build_data: Any) -> s self.write_git_version() # run this in the parent directory of the airflow-core (i.e. airflow repo root) work_dir = Path(self.root).parent.resolve() - cmd = ["prek", "run", "--hook-stage", "manual", "compile-ui-assets", "--all-files"] + cmd = ["prek", "run", "--stage", "manual", "compile-ui-assets", "--all-files"] log.warning("Running command: %s", " ".join(cmd)) run(cmd, cwd=work_dir.as_posix(), check=True) dist_path = Path(self.root) / "src" / "airflow" / "ui" / "dist" diff --git a/contributing-docs/05_pull_requests.rst b/contributing-docs/05_pull_requests.rst index f936210a8e2fb..639bb3659c94f 100644 --- a/contributing-docs/05_pull_requests.rst +++ b/contributing-docs/05_pull_requests.rst @@ -1,4 +1,4 @@ - +contributing-docs/05_pull_requests.rst .. Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information @@ -75,6 +75,15 @@ Pull Request guidelines Before you submit a Pull Request (PR) from your forked repo, check that it meets these guidelines: +- Start with **Draft**: Until you are sure that your PR passes all the quality checks and tests, keep it + in **Draft** status. This will signal to maintainers that the PR is not yet ready + for review and it will prevent maintainers from accidentally merging it before + it's ready. Once you are sure that your PR is ready for review, you can mark it as + "Ready for review" in the GitHub UI. Our regular check will convert all PRs from + non-collaborators that do not pass our quality gates to Draft status, so if you see + that your PR is in Draft status and you haven't set it to Draft. Check the + comments to see what needs to be fixed. + - Include tests, either as doctests, unit tests, or both, to your pull request. The Airflow repo uses `GitHub Actions `_ to @@ -147,6 +156,57 @@ these guidelines: - Adhere to guidelines for commit messages described in this `article `_. This makes the lives of those who come after you (and your future self) a lot easier. +.. _pull-request-quality-criteria: + +Pull Request quality criteria +----------------------------- + +Every open PR must meet the following minimum quality criteria before maintainers will review it. +PRs that do not meet these criteria may be automatically converted to **draft** status by project +tooling, with a comment explaining what needs to be fixed. + +1. **Descriptive title** — The PR title must clearly describe the change. + Generic titles such as "Fix bug", "Update code", "Changes", single-word titles, or titles + that only reference an issue number (e.g. "Fixes #12345") do not meet this bar. + +2. **Meaningful description** — The PR body must contain a meaningful description of *what* the + PR does and *why*. An empty body, a body consisting only of the PR template + checkboxes/headers with no added text, or a body that merely repeats the title is not + sufficient. + +3. **Passing static checks** — The PR's static checks (pre-commit / ruff / mypy) must pass. + You can run them locally with ``prek run --from-ref main`` before pushing. + +4. **Gen-AI disclosure** — If the PR was created with the assistance of generative AI tools, + the description must include a disclosure (see `Gen-AI Assisted contributions`_ below). + +5. **Coherent changes** — The PR should contain related changes only. Completely unrelated + changes bundled together will be flagged. + +**What happens when a PR is converted to draft?** + +- The comment informs you what you need to do. +- Fix each issue, then mark the PR as "Ready for review" in the GitHub UI - but only after making sure that all the issues are fixed. +- Maintainers will then proceed with a normal review. + +Converting a PR to draft is **not** a rejection — it is an invitation to bring the PR up to +the project's standards so that maintainer review time is spent productively. + +**What happens when a PR is closed for quality violations?** + +If a contributor has more than 3 open PRs that are flagged for quality issues, maintainers may +choose to **close** the PR instead of converting it to draft. Closed PRs receive the +``closed because of multiple quality violations`` label and a comment listing the violations. +Contributors are welcome to open a new PR that addresses the issues listed in the comment. + +**What happens when suspicious changes are detected?** + +When maintainers review a PR's diff before approving CI workflow runs and determine that it +contains suspicious changes (e.g. attempts to exfiltrate secrets, modify CI pipelines +maliciously, or inject harmful code), **all open PRs by the same author** will be closed +and labeled ``suspicious changes detected``. A comment is posted on each PR explaining that +the closure was triggered by suspicious changes found in the flagged PR. + Gen-AI Assisted contributions ----------------------------- diff --git a/contributing-docs/08_static_code_checks.rst b/contributing-docs/08_static_code_checks.rst index 1ddb14c4f9383..69e12196c50e0 100644 --- a/contributing-docs/08_static_code_checks.rst +++ b/contributing-docs/08_static_code_checks.rst @@ -175,13 +175,13 @@ But you can run prek hooks manually as needed. .. code-block:: bash - prek mypy-airflow-core mypy-dev --hook-stage pre-push + prek mypy-airflow-core mypy-dev --stage pre-push - Run only mypy airflow checks on all "airflow-core" files by using: .. code-block:: bash - prek mypy-airflow-core --all-files --hook-stage pre-push + prek mypy-airflow-core --all-files --stage pre-push - Run all pre-commit stage hooks on all files by using: @@ -272,7 +272,7 @@ Manual prek hooks Most of the checks we run are configured to run automatically when you commit the code or push PR. However, there are some checks that are not run automatically and you need to run them manually. You can run -them manually by running ``prek --hook-stage manual ``. +them manually by running ``prek --stage manual ``. Special pin-versions prek ------------------------- @@ -287,7 +287,7 @@ However, you can run it manually by running: .. code-block:: bash export GITHUB_TOKEN=YOUR_GITHUB_TOKEN - prek --all-files --hook-stage manual --verbose pin-versions + prek --all-files --stage manual --verbose pin-versions Mypy checks @@ -309,19 +309,19 @@ command (example for ``airflow`` files): .. code-block:: bash - prek --hook-stage manual mypy- --all-files + prek --stage manual mypy- --all-files For example: .. code-block:: bash - prek --hook-stage manual mypy-airflow --all-files + prek --stage manual mypy-airflow --all-files To show unused mypy ignores for any providers/airflow etc, eg: run below command: .. code-block:: bash export SHOW_UNUSED_MYPY_WARNINGS=true - prek --hook-stage manual mypy-airflow --all-files + prek --stage manual mypy-airflow --all-files MyPy uses a separate docker-volume (called ``mypy-cache-volume``) that keeps the cache of last MyPy execution in order to speed MyPy checks up (sometimes by order of magnitude). While in most cases MyPy diff --git a/dev/breeze/doc/11_issues_tasks.rst b/dev/breeze/doc/11_issues_tasks.rst index de4ab617d4894..e064486ce6aac 100644 --- a/dev/breeze/doc/11_issues_tasks.rst +++ b/dev/breeze/doc/11_issues_tasks.rst @@ -54,5 +54,5 @@ Example usage: ----- -Next step: Follow the `Advanced Breeze topics <12_advanced_breeze_topics.rst>`__ instructions to learn more -about advanced Breeze topics and internals. +Next step: Follow the `Pull request tasks <13_pr_tasks.rst>`__ instructions to learn how to +manage GitHub pull requests with Breeze. diff --git a/dev/breeze/doc/13_pr_tasks.rst b/dev/breeze/doc/13_pr_tasks.rst new file mode 100644 index 0000000000000..5226b9eb557da --- /dev/null +++ b/dev/breeze/doc/13_pr_tasks.rst @@ -0,0 +1,112 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +Pull request tasks +------------------ + +There are Breeze commands that help maintainers manage GitHub pull requests for the Apache Airflow project. + +Those are all of the available PR commands: + +.. image:: ./images/output_pr.svg + :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/doc/images/output_pr.svg + :width: 100% + :alt: Breeze PR commands + +Auto-triaging PRs +""""""""""""""""" + +The ``breeze pr auto-triage`` command finds open PRs from non-collaborators that don't meet +minimum quality criteria and lets maintainers take action on them interactively. + +.. image:: ./images/output_pr_auto-triage.svg + :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/doc/images/output_pr_auto-triage.svg + :width: 100% + :alt: Breeze PR auto-triage + +The command works in several phases: + +1. **Fetch** — Fetches open, non-draft PRs via the GitHub GraphQL API. +2. **Filter** — Skips collaborators, bot accounts (dependabot, renovate, github-actions), + and PRs already labeled ``ready for maintainer review``. +3. **Assess** — Runs deterministic checks (CI failures, merge conflicts, missing test + workflows) and optionally LLM-based quality assessment (via ``claude`` or ``codex`` CLI). +4. **Triage** — Presents flagged PRs grouped by author. For each PR the maintainer chooses + an action: + + * **[D]raft** — Convert to draft and post a comment listing the violations (default for + most PRs). + * **[C]lose** — Close the PR, add the ``closed because of multiple quality violations`` + label, and post a comment. This is the suggested default when the author has more than + 3 flagged PRs. + * **[R]eady** — Add the ``ready for maintainer review`` label so the PR is skipped in + future runs. + * **[S]kip** — Take no action on this PR. + * **[Q]uit** — Stop processing. + + The command computes a smart default action based on CI failures, merge conflicts, LLM + assessment, and the number of flagged PRs by the same author. In ``--dry-run`` mode the + default action is displayed without prompting. + +5. **Workflow approval** — PRs with no test workflows run are presented for workflow + approval. Before approving, the maintainer reviews the full PR diff to check for + suspicious changes (e.g. attempts to exfiltrate secrets or modify CI pipelines). If + suspicious changes are confirmed, **all open PRs by that author** are closed, labeled + ``suspicious changes detected``, and commented. + +Labels used by auto-triage +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The command uses the following GitHub labels to track triage state: + +``ready for maintainer review`` + Applied when a maintainer chooses the **[R]eady** action on a flagged PR. PRs with this + label are automatically skipped in future triage runs, indicating the maintainer has + reviewed the flags and considers the PR acceptable for review. + +``closed because of multiple quality violations`` + Applied when a maintainer chooses the **[C]lose** action. This label marks PRs that were + closed because they did not meet quality criteria and the author had more than 3 flagged + PRs open at the time. A comment listing the violations is posted on the PR. + +``suspicious changes detected`` + Applied when a maintainer identifies suspicious changes (e.g. secret exfiltration attempts, + malicious CI modifications) while reviewing a PR diff during the workflow approval phase. + When this label is applied, **all open PRs by the same author** are closed and labeled, + with a comment explaining the reason. + +These labels must exist in the GitHub repository before using the command. If a label is +missing, the command will print a warning and skip the labeling step. + +Example usage: + +.. code-block:: bash + + # Dry run to see which PRs would be flagged and what action would be taken + breeze pr auto-triage --dry-run + + # Run with CI checks only (no LLM) + breeze pr auto-triage --check-mode ci + + # Filter by label and author + breeze pr auto-triage --label area:core --author some-user + + # Limit to 10 PRs + breeze pr auto-triage --max-num 10 + + # Verbose mode — show individual skip reasons during filtering + breeze pr auto-triage --verbose diff --git a/dev/breeze/doc/README.rst b/dev/breeze/doc/README.rst index 03df38bae1c4a..25399684e5bce 100644 --- a/dev/breeze/doc/README.rst +++ b/dev/breeze/doc/README.rst @@ -50,6 +50,7 @@ The following documents describe how to use the Breeze environment: * `Release management tasks <09_release_management_tasks.rst>`_ - describes how to use Breeze for release management tasks. * `UI tasks <10_ui_tasks.rst>`_ - describes how Breeze commands are used to support Apache Airflow project UI. * `Issues tasks <11_issues_tasks.rst>`_ - describes how Breeze commands are used to manage GitHub issues. +* `Pull request tasks <13_pr_tasks.rst>`_ - describes how Breeze commands are used to manage GitHub pull requests. * `Advanced Breeze topics <12_advanced_breeze_topics.rst>`_ - describes advanced Breeze topics/internals of Breeze. You can also learn more context and Architecture Decisions taken when developing Breeze in the diff --git a/dev/breeze/doc/images/output-commands.svg b/dev/breeze/doc/images/output-commands.svg index f970ad85743ea..6cf90a8bcef98 100644 --- a/dev/breeze/doc/images/output-commands.svg +++ b/dev/breeze/doc/images/output-commands.svg @@ -1,4 +1,4 @@ - +