
chore(security): add harden-runner to critical release and AI agent workflows #8017

Open
jordanconway wants to merge 1 commit into pytorch:main from jordanconway:hardening/harden-runner-critical

Conversation

@jordanconway
Contributor

Summary

Adds step-security/harden-runner (in audit mode) as the first step in the three highest-risk workflows in this repo. No existing behaviour is changed — audit mode only observes and logs; it does not block anything.
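As a rough sketch of what the added step looks like, assuming the action's published `egress-policy` and `disable-sudo` inputs (the job name `build`, the `@v2` tag, and the checkout step are illustrative placeholders, not copied from this PR's diff):

```yaml
jobs:
  build:  # hypothetical job name, for illustration only
    runs-on: ubuntu-latest
    steps:
      # harden-runner is placed first so it can observe every later step
      - name: Harden runner
        uses: step-security/harden-runner@v2  # assumption: the real change may pin a commit SHA instead
        with:
          egress-policy: audit  # observe and log outbound traffic; nothing is blocked
          disable-sudo: true

      - uses: actions/checkout@v4  # placeholder for the workflow's existing steps
```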


Workflows covered

release-pypi.yml

Publishes Python packages to PyPI for torchvision, torchaudio, torchao, executorch, torchcodec, and torchtune. This is the highest-value target for a supply-chain compromise: any secret exfiltration here could affect millions of downstream PyTorch users. Allowed-endpoints baseline: GitHub, AWS STS/S3 (for staging bucket), PyPI upload.
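A baseline like this could be expressed roughly as follows. The hostnames are my assumptions for "GitHub, AWS STS/S3, PyPI upload" (the actual S3 and STS hosts depend on the bucket's region) and should be checked against the audit logs. To my understanding the list is only enforced once `egress-policy` is `block`; in audit mode it serves as a written baseline.

```yaml
- name: Harden runner
  uses: step-security/harden-runner@v2  # assumption: tag shown for brevity
  with:
    egress-policy: audit
    # Assumed hostnames, one host:port per line; confirm against observed traffic
    allowed-endpoints: >
      github.com:443
      api.github.com:443
      sts.amazonaws.com:443
      s3.amazonaws.com:443
      upload.pypi.org:443
```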

release-docker.yml

Pulls CUDA Docker images from GHCR and re-tags/pushes them to Docker Hub under the pytorch/ namespace. Same risk tier as PyPI publishing. Allowed-endpoints baseline: GHCR, Docker Hub auth + registry.

_claude-code.yml ⚠️ Highest priority

The centralised reusable Claude AI agent workflow used across the pytorch and meta-pytorch orgs. It carries:

  • id-token: write (OIDC → AWS Bedrock)
  • pull-requests: write + issues: write
  • contents: read

An AI agent workflow with open egress and write permissions is an especially sensitive surface. Audit mode will immediately show whether the anthropics/claude-code-action or any of its dependencies attempt to reach unexpected endpoints. Allowed-endpoints baseline: GitHub API, AWS STS, Bedrock runtime.
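Put together, the permissions above and the audit step might look roughly like this in a reusable workflow. This is a sketch under stated assumptions: the job name `claude`, the `us-east-1` Bedrock region, and the exact hostnames are placeholders, not taken from the actual file.

```yaml
on:
  workflow_call:  # reusable workflow, invoked from other repositories

permissions:
  id-token: write       # OIDC token exchanged for AWS credentials (Bedrock)
  pull-requests: write
  issues: write
  contents: read

jobs:
  claude:  # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Harden runner
        uses: step-security/harden-runner@v2
        with:
          egress-policy: audit
          # Assumed hosts for "GitHub API, AWS STS, Bedrock runtime"; region is a guess
          allowed-endpoints: >
            github.com:443
            api.github.com:443
            sts.amazonaws.com:443
            bedrock-runtime.us-east-1.amazonaws.com:443

      # ... existing steps, including anthropics/claude-code-action, follow unchanged
```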


What's not covered here (and why)

| Workflow | Reason skipped |
| --- | --- |
| release-stage-pypi.yml | Uses container: pytorch/almalinux-builder:cpu; harden-runner does not support container jobs |
| tflint.yml | Uses container: node:20; same limitation |
| All other 100 workflows | Covered in a follow-up PR after the high-risk set is confirmed |

Next steps after merge

  1. Let the workflows run a few times in audit mode
  2. Review the egress logs at app.stepsecurity.io
  3. Refine the allowed-endpoints lists based on observed traffic
  4. Switch egress-policy from audit to block once the allowlist is confirmed
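The final switch in the steps above is a one-line change per workflow. A minimal sketch for release-docker.yml, assuming these hostnames for GHCR and Docker Hub auth + registry (any blob or CDN hosts the registries redirect to would need to be added from the audit logs first, otherwise pulls and pushes would fail under block):

```yaml
- name: Harden runner
  uses: step-security/harden-runner@v2
  with:
    egress-policy: block  # was: audit
    # Assumed hostnames; anything not listed here is denied in block mode
    allowed-endpoints: >
      ghcr.io:443
      auth.docker.io:443
      registry-1.docker.io:443
```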

This change is part of a broader supply-chain hardening effort following a repository audit. See https://github.com/jordanconway/package-manager-hardening for the full methodology.

…orkflows

Add step-security/harden-runner (audit mode) as the first step in the
three highest-risk workflows:

- release-pypi.yml: pushes packages to PyPI — highest-value target for
  supply-chain compromise; any secret exfiltration here affects millions
  of PyTorch users downstream.

- release-docker.yml: pulls from GHCR and pushes to Docker Hub under
  the pytorch/ namespace — same risk tier as PyPI publishing.

- _claude-code.yml: AI agent workflow with id-token:write, contents:read,
  pull-requests:write, issues:write, and AWS OIDC access to Bedrock. Open
  egress from an AI agent with write permissions is an especially sensitive
  surface; audit mode will immediately show if the Claude action or its
  deps try to reach unexpected endpoints.

All three are started in egress-policy: audit (not block) per best
practice — switch to block after reviewing the audit logs and confirming
the allowlist is complete. disable-sudo: true is set on all three.

Note: release-stage-pypi.yml and tflint.yml use container: jobs; harden-
runner does not support container jobs and must be added separately once
those jobs migrate off containers or via a host-level approach.

See https://github.com/jordanconway/package-manager-hardening for the
full hardening methodology.

Signed-off-by: Jordan Conway <jconway@linuxfoundation.org>
@vercel

vercel Bot commented Apr 28, 2026

@jordanconway is attempting to deploy a commit to the Meta Open Source Team on Vercel.

A member of the Team first needs to authorize it.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 28, 2026