Standard Dockerfiles for TensorFlow builds.

Maintainer: @angerson (TensorFlow OSS DevInfra; SIG Build)

These Docker containers are for building and testing TensorFlow in CI
environments (and for users replicating those CI builds). They are openly
developed in TF SIG Build, verified by Google developers, and published to
`tensorflow/build` on Docker Hub. The TensorFlow OSS DevInfra team is
evaluating these containers for building `tf-nightly`.
These Dockerfiles are built and deployed to Docker Hub via GitHub Actions.
The tags are defined as follows:

- The `latest` tags are kept up-to-date to build TensorFlow's `master`
  branch.
- The version-number tags target the corresponding TensorFlow version. We
  continuously build the `current-tensorflow-version + 1` tag, so when a new
  TensorFlow branch is cut, that Dockerfile is frozen to support that branch.
- We support the same Python versions that TensorFlow does.
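For example, to follow `master` with Python 3.9, or to pin a frozen release
tag (the `2.9` below is illustrative; check Docker Hub for the tags that
actually exist):

```bash
# Tracks TensorFlow's master branch:
docker pull tensorflow/build:latest-python3.9
# A frozen, version-specific tag (2.9 is illustrative):
docker pull tensorflow/build:2.9-python3.9
```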
For simple changes, you can adjust the source files and then make a PR. Send it to @angerson for review. We have presubmits that will make sure your change still builds a container. After approval and submission, our GitHub Actions workflow deploys the containers to Docker Hub.
- To update Python packages, look at `devel.requirements.txt`.
- To update system packages, look at `devel.packages.txt`.
- To update the way `bazel build` works, look at `devel.usertools/*.bazelrc`.
To rebuild the containers locally after making changes, use this command from
this directory:

```bash
DOCKER_BUILDKIT=1 docker build \
  --build-arg PYTHON_VERSION=python3.9 --target=devel -t my-tf-devel .
```
It will take a long time to build devtoolset and install CUDA packages. After
it's done, you can use the commands below to test your changes. Just replace
`tensorflow/build:latest-python3.9` with `my-tf-devel` to use your image
instead.
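As a quick smoke test of a freshly built image (a minimal sketch; it assumes
`python3` and `bazel` are on the image's PATH, which should hold for these
devel images):

```bash
# Confirm the toolchain inside your locally built image responds:
docker run --rm my-tf-devel bash -c "python3 --version && bazel --version"
```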
TensorFlow team members (i.e. Google employees) can apply a "Build and deploy
to gcr.io for staging" tag to their PRs to the Dockerfiles, as long as the PR
is being developed on a branch of this repository, not a fork. Unfortunately
this is not available to non-Googler contributors, for security reasons.
The TensorFlow DevInfra team runs a daily test suite that builds `tf-nightly`
and runs a `bazel test` suite on both the pip package (the "pip" tests) and
on the source code itself (the "nonpip" tests). These test scripts are often
referred to as "The Nightly Tests" and can be a common reason for a TF PR to
be reverted. The build scripts themselves aren't visible to external users,
but they use the configuration files that are included in these containers.
Our test suites, which include the build of `tf-nightly`, are easy to
replicate with these containers, and here is how you can do it.

Presubmits are not using these containers... yet.
Here are some important notes to keep in mind:

- The Ubuntu CI jobs that build the `tf-nightly` package build at the GitHub
  `nightly` tag. You can see the specific commit of a `tf-nightly` package on
  pypi.org in `tf.version.GIT_VERSION`, which will look something like
  `v1.12.1-67282-g251085598b7`. The final section, `g251085598b7`, is a short
  git hash (see the sketch after this list for how to check it).
- If you interrupt a `docker exec` command with `ctrl-c`, you will get your
  shell back, but the command will continue to run. You cannot reattach to
  it, but you can kill it with `docker kill tf` (or
  `docker kill the-container-name`). This will destroy your container but
  will not harm your work, since it's mounted. If you have any suggestions
  for handling this better, let us know.
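To see which commit a given `tf-nightly` package was built from, you can
check `tf.version.GIT_VERSION` after installing it (the version shown below
is the example from above, not a real current build):

```bash
# Install a nightly wheel and print the commit it was built from.
pip install tf-nightly
python3 -c "import tensorflow as tf; print(tf.version.GIT_VERSION)"
# Prints something like v1.12.1-67282-g251085598b7; the trailing short hash
# can be checked out in the TF repo: git checkout 251085598b7
```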
Now let's build `tf-nightly`.
- Set up your directories:

  - A directory with the TensorFlow source code, e.g. `/tmp/tensorflow`
  - A directory for TensorFlow packages built in the container, e.g.
    `/tmp/packages`
  - A directory for your local bazel cache (can be empty), e.g.
    `/tmp/bazelcache`
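  For example (cloning from the main TensorFlow repository is an assumption;
  use your own fork if you prefer):

  ```bash
  # Create the host directories and fetch the TF source.
  mkdir -p /tmp/packages /tmp/bazelcache
  git clone https://github.com/tensorflow/tensorflow.git /tmp/tensorflow
  ```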
- Choose the Docker container to use from Docker Hub. The options for the
  `master` branch are:

  - `tensorflow/build:latest-python3.10`
  - `tensorflow/build:latest-python3.9`
  - `tensorflow/build:latest-python3.8`
  - `tensorflow/build:latest-python3.7`

  For this example we'll use `tensorflow/build:latest-python3.9`.

- Pull the container you decided to use.

  ```bash
  docker pull tensorflow/build:latest-python3.9
  ```
- Start a backgrounded Docker container with the three folders mounted.

  - Mount the TensorFlow source code to `/tf/tensorflow`.
  - Mount the directory for built packages to `/tf/pkg`.
  - Mount the bazel cache to `/tf/cache`. You don't need `/tf/cache` if
    you're going to use the remote cache.

  Here are the arguments we're using:

  - `--name tf`: Names the container `tf` so we can refer to it later.
  - `-w /tf/tensorflow`: All commands run in the `/tf/tensorflow` directory,
    where the TF source code is.
  - `-it`: Makes the container interactive for running commands.
  - `-d`: Makes the container start in the background, so we can send
    commands to it instead of running commands from inside.

  And `-v` is for mounting directories into the container.

  ```bash
  docker run --name tf -w /tf/tensorflow -it -d \
    -v "/tmp/packages:/tf/pkg" \
    -v "/tmp/tensorflow:/tf/tensorflow" \
    -v "/tmp/bazelcache:/tf/cache" \
    tensorflow/build:latest-python3.9 \
    bash
  ```

  Note: if you wish to use your own Google Cloud Platform credentials for
  e.g. RBE, you may also wish to set
  `-v $HOME/.config/gcloud:/root/.config/gcloud` to make your credentials
  available to bazel. You don't need to do this unless you know what you're
  doing.
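  To confirm the container came up and the mounts are in place (plain
  `docker ps` and `ls`, nothing TF-specific):

  ```bash
  # The container should be listed as running...
  docker ps --filter name=tf
  # ...and the TF source tree should be visible at the mount point.
  docker exec tf ls /tf/tensorflow
  ```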
Now you can continue on to any of:

- Build `tf-nightly` and then (optionally) run a test suite on the pip
  package (the "pip" suite)
- Run a test suite on the TF code directly (the "nonpip" suite)
- Build the libtensorflow packages (the "libtensorflow" suite)
- Run a code-correctness check (the "code_check" suite)
- Apply the `update_version.py` script that changes the TensorFlow version to
  `X.Y.Z.devYYYYMMDD`. This is used for `tf-nightly` on PyPI and is
  technically optional.

  ```bash
  docker exec tf python3 tensorflow/tools/ci_build/update_version.py --nightly
  ```
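  If you want to confirm the version rewrite took effect, one place the
  version string appears is the pip package's `setup.py` (the exact file and
  variable may shift between releases, so treat this as a sketch):

  ```bash
  # Expect a dev version like X.Y.Z.devYYYYMMDD after running the script.
  docker exec tf grep "_VERSION = " tensorflow/tools/pip_package/setup.py
  ```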
- Build TensorFlow by following the instructions under one of the collapsed
  sections below. You can build both CPU and GPU packages without a GPU. TF
  DevInfra's remote cache is better for building TF only once, but if you
  build over and over, it will probably be better in the long run to use a
  local cache. We're not sure which is best for most users, so let us know
  on Gitter.
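  If you're curious what those cache configs actually do, they are defined
  in the `.bazelrc` files baked into the image, so you can inspect them
  directly rather than taking our word for it:

  ```bash
  # Show how the cache configs are defined in the image's bazelrc:
  docker exec tf grep "sigbuild_.*_cache" /usertools/cpu.bazelrc
  ```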
  This step will take a long time, since you're building TensorFlow. GPU
  takes much longer to build. Choose one and click on the arrow to expand
  the commands:
  <details><summary>TF Nightly CPU - Remote Cache</summary>

  Build the sources with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    build --config=sigbuild_remote_cache \
    tensorflow/tools/pip_package:build_pip_package
  ```

  And then construct the pip package:

  ```bash
  docker exec tf \
    ./bazel-bin/tensorflow/tools/pip_package/build_pip_package \
    /tf/pkg \
    --cpu \
    --nightly_flag
  ```

  </details>

  <details><summary>TF Nightly GPU - Remote Cache</summary>

  Build the sources with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/gpu.bazelrc \
    build --config=sigbuild_remote_cache \
    tensorflow/tools/pip_package:build_pip_package
  ```

  And then construct the pip package:

  ```bash
  docker exec tf \
    ./bazel-bin/tensorflow/tools/pip_package/build_pip_package \
    /tf/pkg \
    --nightly_flag
  ```

  </details>

  <details><summary>TF Nightly CPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Build the sources with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    build --config=sigbuild_local_cache \
    tensorflow/tools/pip_package:build_pip_package
  ```

  And then construct the pip package:

  ```bash
  docker exec tf \
    ./bazel-bin/tensorflow/tools/pip_package/build_pip_package \
    /tf/pkg \
    --cpu \
    --nightly_flag
  ```

  </details>

  <details><summary>TF Nightly GPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Build the sources with Bazel:

  ```bash
  docker exec tf \
    bazel --bazelrc=/usertools/gpu.bazelrc \
    build --config=sigbuild_local_cache \
    tensorflow/tools/pip_package:build_pip_package
  ```

  And then construct the pip package:

  ```bash
  docker exec tf \
    ./bazel-bin/tensorflow/tools/pip_package/build_pip_package \
    /tf/pkg \
    --nightly_flag
  ```

  </details>
- Run the helper script that checks for manylinux compliance, renames the
  wheels, and then checks the size of the packages.

  ```bash
  docker exec tf /usertools/rename_and_verify_wheels.sh
  ```
- Take a look at the new wheel packages you built! They may be owned by
  `root` because of how Docker volume permissions work.

  ```bash
  ls -al /tmp/packages
  ```
- To continue on to running the pip tests, create a venv and install the
  testing packages:

  ```bash
  docker exec tf /usertools/setup_venv_test.sh bazel_pip "/tf/pkg/tf_nightly*.whl"
  ```
- And now run the tests depending on your target platform: `--config=pip`
  includes the same test suite that is run by the DevInfra team every night.
  If you want to run a specific test instead of the whole suite, pass
  `--config=pip_venv` instead, and then set the target on the command line
  as normal (see the sketch after these options).

  <details><summary>TF Nightly CPU - Remote Cache</summary>

  Run the tests with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    test --config=sigbuild_remote_cache \
    --config=pip
  ```

  </details>

  <details><summary>TF Nightly GPU - Remote Cache</summary>

  Run the tests with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/gpu.bazelrc \
    test --config=sigbuild_remote_cache \
    --config=pip
  ```

  </details>

  <details><summary>TF Nightly CPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Run the tests with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    test --config=sigbuild_local_cache \
    --config=pip
  ```

  </details>

  <details><summary>TF Nightly GPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Run the tests with Bazel:

  ```bash
  docker exec tf \
    bazel --bazelrc=/usertools/gpu.bazelrc \
    test --config=sigbuild_local_cache \
    --config=pip
  ```

  </details>
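  For example, to run a single test through the venv-installed package, pass
  `--config=pip_venv` plus a target. The target below is purely illustrative
  (it may also need the `bazel_pip` path prefix created by the venv setup
  script; adjust to the test you actually care about):

  ```bash
  # Run one specific test against the installed pip package (example target):
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    test --config=sigbuild_local_cache \
    --config=pip_venv \
    //tensorflow/python/ops:math_ops_test
  ```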
- Run the tests depending on your target platform. `--config=nonpip`
  includes the same test suite that is run by the DevInfra team every night.
  If you want to run a specific test instead of the whole suite, you do not
  need `--config=nonpip` at all; just set the target on the command line
  like usual (see the sketch after these options).

  <details><summary>TF Nightly CPU - Remote Cache</summary>

  Run the tests with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    test --config=sigbuild_remote_cache \
    --config=nonpip
  ```

  </details>

  <details><summary>TF Nightly GPU - Remote Cache</summary>

  Run the tests with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/gpu.bazelrc \
    test --config=sigbuild_remote_cache \
    --config=nonpip
  ```

  </details>

  <details><summary>TF Nightly CPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Run the tests with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    test --config=sigbuild_local_cache \
    --config=nonpip
  ```

  </details>

  <details><summary>TF Nightly GPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Run the tests with Bazel:

  ```bash
  docker exec tf \
    bazel --bazelrc=/usertools/gpu.bazelrc \
    test --config=sigbuild_local_cache \
    --config=nonpip
  ```

  </details>
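  For instance, a single "nonpip" test is just an ordinary bazel test
  invocation against the source tree (the target name is an example; use
  whichever test you need):

  ```bash
  # No suite config needed for a single target:
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    test --config=sigbuild_local_cache \
    //tensorflow/core/kernels:concat_op_test
  ```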
- Build the libtensorflow packages.

  <details><summary>TF Nightly CPU - Remote Cache</summary>

  Build the sources with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    build --config=sigbuild_remote_cache \
    --config=libtensorflow_build
  ```

  </details>

  <details><summary>TF Nightly GPU - Remote Cache</summary>

  Build the sources with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/gpu.bazelrc \
    build --config=sigbuild_remote_cache \
    --config=libtensorflow_build
  ```

  </details>

  <details><summary>TF Nightly CPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Build the sources with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    build --config=sigbuild_local_cache \
    --config=libtensorflow_build
  ```

  </details>

  <details><summary>TF Nightly GPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Build the sources with Bazel:

  ```bash
  docker exec tf \
    bazel --bazelrc=/usertools/gpu.bazelrc \
    build --config=sigbuild_local_cache \
    --config=libtensorflow_build
  ```

  </details>
- Run the tests depending on your target platform.
  `--config=libtensorflow_test` includes the same test suite that is run by
  the DevInfra team every night. If you want to run a specific test instead
  of the whole suite, just set the target on the command line like usual.

  <details><summary>TF Nightly CPU - Remote Cache</summary>

  Run the tests with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    test --config=sigbuild_remote_cache \
    --config=libtensorflow_test
  ```

  </details>

  <details><summary>TF Nightly GPU - Remote Cache</summary>

  Run the tests with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/gpu.bazelrc \
    test --config=sigbuild_remote_cache \
    --config=libtensorflow_test
  ```

  </details>

  <details><summary>TF Nightly CPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Run the tests with Bazel:

  ```bash
  docker exec tf bazel --bazelrc=/usertools/cpu.bazelrc \
    test --config=sigbuild_local_cache \
    --config=libtensorflow_test
  ```

  </details>

  <details><summary>TF Nightly GPU - Local Cache</summary>

  Make sure you have a directory mounted to the container in `/tf/cache`!

  Run the tests with Bazel:

  ```bash
  docker exec tf \
    bazel --bazelrc=/usertools/gpu.bazelrc \
    test --config=sigbuild_local_cache \
    --config=libtensorflow_test
  ```

  </details>
- Run the `repack_libtensorflow.sh` utility to repack and rename the
  archives.

  CPU:

  ```bash
  docker exec tf /usertools/repack_libtensorflow.sh /tf/pkg "-cpu-linux-x86_64"
  ```

  GPU:

  ```bash
  docker exec tf /usertools/repack_libtensorflow.sh /tf/pkg "-gpu-linux-x86_64"
  ```
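  To double-check the results on the host side (the exact archive names
  follow from the suffix argument above, so treat them as illustrative):

  ```bash
  # The repacked archives land in the mounted package directory:
  ls -al /tmp/packages/libtensorflow*
  # Peek inside one without extracting it:
  tar -tzf /tmp/packages/libtensorflow-cpu-linux-x86_64.tar.gz | head
  ```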
- Every night the TensorFlow team runs `code_check_full`, which contains a
  suite of checks that were gradually introduced over TensorFlow's lifetime
  to prevent certain unstable code states. This check has supplanted the old
  "sanity" or "ci_sanity" checks.

  ```bash
  docker exec tf bats /usertools/code_check_full.bats --timing --formatter junit
  ```
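  The junit formatter writes its XML report to stdout (bats-core's default
  behavior, as far as we know), so redirect it on the host side if you want
  to keep the report, e.g. for a CI system:

  ```bash
  # Save the JUnit XML report produced on stdout:
  docker exec tf bats /usertools/code_check_full.bats --timing --formatter junit \
    > /tmp/packages/code_check_full.xml
  ```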
- Shut down and remove the container when you are finished.

  ```bash
  docker stop tf
  docker rm tf
  ```
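  Equivalently, you can stop and remove in one step; your mounted host
  directories (and anything built into them) are untouched either way:

  ```bash
  # One-step teardown; mounted directories stay on the host.
  docker rm -f tf
  ```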