Decouple Docker from PyTorch build pipeline (pytorch/test-infra/calculate-docker action)

The title of this issue needs to be improved, but the idea is this: the current PyTorch CI build and test pipeline assumes that it is NOT running inside a container, and that it is on either a VM or a dedicated host. When we tried in 2024H1 to migrate to ARC (Actions Runner Controller), a container-based runner autoscaler backed by Kubernetes, this caused problems. Scripts in PyTorch assume they can simply run `docker build` and `docker run` as part of the build pipeline, so we also had to support a multi-level nested container pipeline.

We had to use Docker-in-Docker (DinD) at multiple levels of container nesting, which forced us to write many workaround scripts to support this effort.

The goal of this issue is to discuss how we can remove the assumption that a job can run `docker build` or `docker run` itself, and move to a more GitHub Actions (GHA) native way of building and running PyTorch containers for the build and test pipelines. That would let us adopt container-based self-hosted runner autoscalers more easily.
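To make the discussion concrete, here is a minimal sketch of one possible direction, not an existing PyTorch or test-infra script. The entry point checks whether the job is already inside a container and only wraps the build in `docker run` when it is on a plain VM or host. The function names, the `build.sh` script, and the image name are hypothetical, and the detection relies on two common heuristics (the `/.dockerenv` file and the `KUBERNETES_SERVICE_HOST` variable) that may not cover every runtime.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def running_in_container(
    env: Optional[Mapping[str, str]] = None,
    dockerenv: Path = Path("/.dockerenv"),
) -> bool:
    """Heuristic check, assumed good enough for illustration only.

    Docker creates /.dockerenv inside its containers, and Kubernetes
    injects KUBERNETES_SERVICE_HOST into pods. Other runtimes may set neither.
    """
    env = os.environ if env is None else env
    return dockerenv.exists() or "KUBERNETES_SERVICE_HOST" in env


def build_command(
    script: str,
    image: str,
    in_container: bool,
    workdir: str = ".",
) -> List[str]:
    """Return the argv used to run a (hypothetical) build script.

    Inside a container the script runs directly, so no nested Docker is
    needed. On a VM or host it is wrapped in `docker run` as it is today.
    """
    if in_container:
        return ["bash", script]
    return [
        "docker", "run", "--rm",
        "-v", f"{os.path.abspath(workdir)}:/workspace",  # mount the checkout
        "-w", "/workspace",                               # run from the checkout
        image,
        "bash", script,
    ]


if __name__ == "__main__":
    # Hypothetical script and image names, for illustration only.
    cmd = build_command("build.sh", "example/pytorch-build:latest", running_in_container())
    print(" ".join(cmd))
```

With something like this, the runner (VM, ARC pod, or a GHA job that declares its own container) decides where the script runs, and the script itself never has to call Docker from inside a container.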

Decouple Docker from PyTorch build pipeline (pytorch/test-infra/calculate-docker action) #319
