Decouple Docker from PyTorch build pipeline (pytorch/test-infra/calculate-docker action) #319

@zxiiro

The title of this issue needs to be improved, but the thought here is that the current PyTorch CI build and test pipeline assumes it is NOT running inside a container, i.e. that it runs on either a VM or a dedicated host. When we tried in 2024H1 to migrate to ARC (Actions Runner Controller), a container-based runner autoscaler backed by Kubernetes, this caused us some issues: scripts in PyTorch assume they can just run docker build and docker run as part of the build pipeline, so we also needed to support a multi-level nested container pipeline.

We had to use things like DIND (Docker-in-Docker) at multiple nested container levels, which forced us to write many workaround scripts to support this effort.

The goal of this issue is to discuss how we can drop the assumption that a job can run docker build|run, and move to a more GHA-native way to build and run PyTorch containers for the build and test pipelines, so that we can more easily adopt container-based self-hosted runner autoscalers.
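
To make the discussion concrete, here is a minimal sketch of one possible direction, not the actual PyTorch CI code: a step first checks whether it is already inside a container and only wraps the build script in docker run when it is not. The names in_container, build_command, the CI_IN_CONTAINER variable, and the image and script names are all hypothetical. The /.dockerenv check is a best-effort heuristic that works for Docker but may not hold for other runtimes used under Kubernetes, which is why the sketch also accepts an explicit override.

```python
import os
from pathlib import Path
from typing import Mapping, Sequence


def in_container(
    env: Mapping[str, str] = os.environ,
    dockerenv: Path = Path("/.dockerenv"),
) -> bool:
    """Best-effort guess at whether this process already runs in a container."""
    # CI_IN_CONTAINER is a hypothetical variable the runner image could set,
    # since not every container runtime creates /.dockerenv.
    if env.get("CI_IN_CONTAINER") == "1":
        return True
    # Docker creates this marker file at the root of its containers.
    return dockerenv.exists()


def build_command(script: Sequence[str], image: str, inside: bool) -> list[str]:
    """Return the command to execute for a build or test step."""
    if inside:
        # Already in the job container: run the script directly, no nesting.
        return list(script)
    # On a VM or dedicated host: wrap the script in `docker run` as before.
    return ["docker", "run", "--rm", image, *script]


if __name__ == "__main__":
    # Hypothetical image and script names, for illustration only.
    cmd = build_command(["bash", "build.sh"], "example/image:tag", in_container())
    print(" ".join(cmd))
```

With this split, the same script could run unchanged on a VM-based runner and in a job that the runner has already started inside the target image, which would avoid the nested DIND layers described above.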
