Setting up your environment

Adler Santos edited this page Apr 13, 2022 · 10 revisions

Requirements

General

  • Python >=3.8,<3.10. We currently use 3.8. For more info, see the Cloud Composer version list.
  • Poetry for installing and managing dependencies.
  • gcloud command-line tool with Google Cloud Platform credentials configured. Setup instructions are in the gcloud section below.
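The requirements above can be checked with a short script. This is a minimal sketch, not part of the project: the function name `python_version_supported` is hypothetical, and it assumes the interpreter is available as `python3` on your PATH. It only reports what it finds and does not install anything.

```shell
# Succeeds (exit status 0) if the given version, e.g. "3.9" or "3.9.7",
# satisfies the documented constraint >=3.8,<3.10.
# (Hypothetical helper; not part of the repository.)
python_version_supported() {
  _pv_major="${1%%.*}"      # text before the first dot
  _pv_rest="${1#*.}"        # text after the first dot
  _pv_minor="${_pv_rest%%.*}"
  [ "$_pv_major" -eq 3 ] && [ "$_pv_minor" -ge 8 ] && [ "$_pv_minor" -lt 10 ]
}

# Report on the Python interpreter found on PATH (assumed to be `python3`).
if command -v python3 >/dev/null 2>&1; then
  _pv_found="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  if python_version_supported "$_pv_found"; then
    echo "Python $_pv_found: supported"
  else
    echo "Python $_pv_found: outside the supported range >=3.8,<3.10"
  fi
else
  echo "python3: not found on PATH"
fi

# Report whether the other required tools are on PATH.
for _pv_tool in poetry gcloud; do
  if command -v "$_pv_tool" >/dev/null 2>&1; then
    echo "$_pv_tool: found"
  else
    echo "$_pv_tool: not found on PATH"
  fi
done
```

If a tool is reported as missing, follow the matching subsection under Environment setup.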

For building data pipelines

Environment setup

gcloud

Download and install the gcloud CLI tool by following Google Cloud's installation instructions. Once installed, run the initial setup:

gcloud init

Poetry

We use Poetry to make environment dependencies more deterministic and uniform across different machines.

If you haven't done so, install Poetry by following its installation instructions. We recommend using Poetry's official installer. A quick set of installation commands:

# Install Poetry
#
# Note: We're currently using their preview branch for features not yet found in their official release.
curl -sSL https://install.python-poetry.org | python3 - --preview
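After the installer finishes, it may help to confirm that the `poetry` command is reachable. The sketch below assumes a Linux or macOS machine, where the official installer typically places the executable in `$HOME/.local/bin`; if your installer output names a different directory, use that instead. The variable name `POETRY_BIN_DIR` is only illustrative.

```shell
# Assumed install location on Linux/macOS; adjust if the installer reported another path.
POETRY_BIN_DIR="$HOME/.local/bin"

# Add the directory to PATH for this shell session only if it is not already there.
case ":$PATH:" in
  *":$POETRY_BIN_DIR:"*) ;;
  *) export PATH="$POETRY_BIN_DIR:$PATH" ;;
esac

# Confirm that the command resolves and print its version.
if command -v poetry >/dev/null 2>&1; then
  poetry --version
else
  echo "poetry not found on PATH; check the installer output for its install location"
fi
```

To make the PATH change permanent, add the `export` line to your shell's startup file.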

Installing dependencies

Dependencies are specified in pyproject.toml and can be installed based on the dependency group you intend to use: pipelines or docs.

Installing dependencies for pipeline development

If you intend to develop data pipelines, install the dependencies as follows:

poetry install --only pipelines

Installing dependencies for the documentation set

If you intend to contribute content for the documentation set (a.k.a. doc set), install the dependencies as follows:

poetry install --only docs
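The two install commands above differ only in the group name, so they can be wrapped in a small helper that rejects anything other than the two documented groups. This is a sketch under assumptions: the function names `is_known_group` and `install_group` are hypothetical, and it assumes a Poetry version that supports dependency groups and the `--only` option.

```shell
# Succeeds only for the dependency groups named in this guide.
# (Hypothetical helper; not part of the repository.)
is_known_group() {
  case "$1" in
    pipelines|docs) return 0 ;;
    *) return 1 ;;
  esac
}

# Installs a single dependency group after validating its name.
install_group() {
  if ! is_known_group "$1"; then
    echo "Unknown dependency group: '$1' (expected 'pipelines' or 'docs')" >&2
    return 1
  fi
  poetry install --only "$1"
}

# Example usage, run from the repository root:
#   install_group pipelines
#   install_group docs
```

Validating the name first avoids passing a mistyped group to Poetry.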

Additional setup

Set up Airflow (pipeline development only)

After the dependencies are installed, initialize the Airflow database:

poetry run airflow db init
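By default, Airflow keeps its configuration and local database under `~/airflow`. If you would rather keep those files next to the project, one option is to set the `AIRFLOW_HOME` environment variable before initializing the database. The `.airflow` directory name below is a hypothetical choice, not a project convention.

```shell
# Hypothetical location: keep Airflow's files inside the current project directory.
export AIRFLOW_HOME="$PWD/.airflow"
echo "AIRFLOW_HOME=$AIRFLOW_HOME"

# Then initialize the database as described above, in the same shell session:
#   poetry run airflow db init
```

The variable must be set in every shell session that runs Airflow commands, otherwise Airflow falls back to its default location.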

Run unit tests

To verify that your setup works for pipeline development, run the unit tests:

poetry run python -m pytest -v tests

Set up Colab (doc set only)

The default development environment for the documentation set is Colab. If you choose to develop doc set content on your local machine instead, you'll need to authenticate with gcloud:

gcloud auth login