Setting up your environment
- Python >=3.8,<3.10. We currently use 3.8. For more info, see the Cloud Composer version list.
- Poetry for installing and managing dependencies.
- gcloud command-line tool with Google Cloud Platform credentials configured. Instructions can be found here.
- Familiarity with Apache Airflow (>=v2.1.4)
- Terraform >=v0.15.1
- Google Cloud Composer environment running Apache Airflow >=2.1.0 and Cloud Composer >=2.0. To create a new Cloud Composer environment, see this guide.
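As a quick sanity check of the local prerequisites listed above, a minimal sketch along these lines may help. It only covers the Python version range and whether the command-line tools are on PATH. The helper names (`python_version_supported`, `missing_tools`) and the executable names `poetry`, `gcloud`, and `terraform` are assumptions made for illustration, not part of this repository.

```python
import shutil
import sys

# Supported interpreter range, taken from the requirement above: >=3.8,<3.10.
MIN_PYTHON = (3, 8)
MAX_PYTHON_EXCLUSIVE = (3, 10)

# Command-line tools named in the list above (executable names are assumed).
REQUIRED_TOOLS = ("poetry", "gcloud", "terraform")


def python_version_supported(version=None):
    """Return True if (major, minor) falls within >=3.8,<3.10."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    if not python_version_supported():
        print("Unsupported Python:", sys.version.split()[0])
    for tool in missing_tools():
        print("Not found on PATH:", tool)
```

The script prints nothing when the interpreter is in range and all three tools are found. It does not check tool versions, Google Cloud credentials, or the Cloud Composer environment.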
Download and install the gcloud CLI tool here. Once installed, go through the initial setup by running:
gcloud init
We use Poetry to make environment dependencies more deterministic and uniform across different machines.
If you haven't done so, install Poetry using these instructions. We recommend using Poetry's official installer. The commands below are a quick way to install it:
# Install Poetry
#
# Note: We're currently using their preview branch for features not yet found in their official release.
curl -sSL https://install.python-poetry.org | python3 - --preview
Dependencies are specified in pyproject.toml and can be installed based on the dependency group you intend to use: pipelines or docs.
If you intend to develop data pipelines, install the dependencies as follows:
poetry install --only pipelines
If you intend to contribute content for the documentation set (a.k.a. doc set), install the dependencies as follows:
poetry install --only docs
After the dependencies are installed, initialize the Airflow database:
poetry run airflow db init
To ensure you have a proper setup for pipeline development, run the tests:
poetry run python -m pytest -v tests
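If you prefer to run the pipeline setup steps in one go, they could be scripted roughly as follows. This is only a sketch: the commands are the ones shown above, while `PIPELINE_SETUP_STEPS` and `run_steps` are hypothetical names introduced here. It assumes Poetry is already installed and that the script is run from the repository root.

```python
import subprocess

# Commands copied from the steps above, in the order they are meant to run.
PIPELINE_SETUP_STEPS = [
    ["poetry", "install", "--only", "pipelines"],
    ["poetry", "run", "airflow", "db", "init"],
    ["poetry", "run", "python", "-m", "pytest", "-v", "tests"],
]


def run_steps(steps=PIPELINE_SETUP_STEPS, runner=subprocess.run):
    """Run each step in order and return the commands that completed.

    check=True makes subprocess.run raise CalledProcessError on a non-zero
    exit code, so later steps do not run after a failure.
    """
    completed = []
    for argv in steps:
        runner(argv, check=True)
        completed.append(" ".join(argv))
    return completed
```

Calling `run_steps()` executes the three commands in sequence. Passing a different `runner` allows a dry run that only records the commands.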
The default development environment for the documentation set is Colab. If you choose to develop doc set content using your local machine, you'll need to authenticate yourself using gcloud:
gcloud auth login