AETHER-xAI

Description

This project develops an Earth observation (EO) embedding/language model that can be used for explainable predictions from EO data.

Getting Started

Virtual environment

First, install the dependencies in a virtual environment (venv) using uv:

# clone project
git clone https://github.com/WUR-AI/aether
cd aether

# Create venv
python3 -m venv .venv
source .venv/bin/activate

# install uv manager
pip install uv

# install all Python dependencies
uv sync # reads pyproject.toml + uv.lock

# install project locally (editable)
uv pip install -e .

Note: running uv sync inside the venv updates the installed packages to the versions specified by the repo's pyproject.toml and uv.lock files.
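
Before running any of the later commands, it can help to confirm that the venv is actually active. The snippet below is a minimal sketch of such a check; the function name in_virtualenv is hypothetical and not part of this repo.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    status = "active" if in_virtualenv() else "NOT active"
    print(f"virtual environment {status}: {sys.prefix}")
```

If the check reports that no environment is active, re-run source .venv/bin/activate from the repo root.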

Set paths

Next, create a file called .env in the root folder of your local repo (aether/). Copy the contents of aether/.env.example into it and adjust the paths to your local system.

Important: DATA_DIR should either point to aether/data/ or, if it points to another folder (e.g., my/local/data/), the contents of aether/data/ should be copied to that folder so that the butterfly use case runs with the provided example data. Other data will be downloaded and organised automatically by pooch where possible; otherwise it should be copied manually.
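
To illustrate how a path such as DATA_DIR could be picked up from the .env file, here is a minimal sketch using only the standard library. It assumes the file holds plain KEY=VALUE lines; the helper names read_env_file and resolve_data_dir are hypothetical, and the project itself may load the file differently.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_data_dir(env_path=".env"):
    """Return DATA_DIR as an absolute path, preferring the process environment."""
    file_values = read_env_file(env_path) if Path(env_path).is_file() else {}
    data_dir = os.environ.get("DATA_DIR") or file_values.get("DATA_DIR")
    if not data_dir:
        raise KeyError("DATA_DIR is not set in the environment or in .env")
    return Path(data_dir).expanduser().resolve()
```

Calling resolve_data_dir() from the repo root should then return the folder that holds registry.txt and the dataset folders described below.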

Data folders should have the following directory structure:

├── registry.txt                         <- Pooch config file, don't change.
├── s2bms/                               <- Dataset folder.
│   ├── model_ready_s2bms.csv            <- CSV file with "name_loc" id, locations, aux data and target data.
│   ├── aux_classes.csv                  <- CSV file with explanations for aux data class names.
│   ├── caption_templates/               <- Caption templates.
│   │   └── v1.json                      <- JSON file with list of caption templates (referencing aux column names).
│   ├── splits/                          <- Torch data splits.
│   ├── source/                          <- Optional: source data used to create model_ready csv.
│   └── eo/                              <- EO data modalities.
│       ├── s2/                          <- Modality 1 (e.g. Sentinel-2).
│       │   ├── s2_<NAME_LOC_1>.tif      <- EO modality data for a single location (indexed by unique <NAME_LOC>).
│       │   └── s2_<NAME_LOC_2>.tif
│       ├── aef/                         <- Modality 2 (e.g. AEF).
│       └── other_modality/
└── other_dataset/
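
The layout above can be turned into paths programmatically. The sketch below is illustrative only: the helper names are hypothetical, the file pattern <modality>_<NAME_LOC>.tif and the name model_ready_<dataset>.csv are generalised from the s2bms example, and the list of entries treated as required is an assumption.

```python
from pathlib import Path


def eo_file(data_dir, dataset, modality, name_loc):
    """Build the expected GeoTIFF path for one location and one modality."""
    # Assumed pattern: <data_dir>/<dataset>/eo/<modality>/<modality>_<name_loc>.tif
    return Path(data_dir) / dataset / "eo" / modality / f"{modality}_{name_loc}.tif"


def missing_entries(data_dir, dataset):
    """List expected files and folders that are absent from a dataset folder."""
    root = Path(data_dir) / dataset
    # Assumption: these entries are needed; source/ is optional and left out.
    expected = [
        root / f"model_ready_{dataset}.csv",
        root / "aux_classes.csv",
        root / "caption_templates",
        root / "splits",
        root / "eo",
    ]
    return [str(p.relative_to(root)) for p in expected if not p.exists()]
```

For example, eo_file("data", "s2bms", "s2", "ABC") yields data/s2bms/eo/s2/s2_ABC.tif, and missing_entries("data", "s2bms") returns an empty list when the dataset folder is complete.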

Training

Experiment configurations (choice of data, encoders, hyperparameters, etc.) are managed through Hydra. Define your experiment in configs/experiment/experiment_name.yaml. For example, to train a predictive model with the GeoCLIP coordinate encoder on the butterfly data, use configs/experiment/prediction.yaml (copied below):

# @package _global_
# all parameters below will be merged with parameters from default configurations set above
# this allows you to overwrite only specified parameters

defaults:
  - override /model: predictive_geoclip
  - override /data: butterfly_coords


tags: ["prediction", "geoclip_coords"]

seed: 12345

trainer:
  min_epochs: 1
  max_epochs: 100

data:
  batch_size: 64

logger:
  wandb:
    tags: ${tags}
    group: "predictive"
  aim:
    experiment: "predictive"

To run this experiment (inside your venv):

python src/train.py experiment=prediction
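
Hydra also accepts key=value overrides on the command line, so single settings can be changed without editing the YAML file. The sketch below assembles such a command from Python; the helper name hydra_command and the override values are illustrative, and the script path is passed in explicitly because it depends on where train.py lives in your checkout.

```python
import shlex


def hydra_command(script, experiment, overrides=None):
    """Build an argv list: python <script> experiment=<name> key=value ..."""
    argv = ["python", script, f"experiment={experiment}"]
    for key, value in (overrides or {}).items():
        argv.append(f"{key}={value}")
    return argv


script = "train.py"  # adjust to where train.py lives in your checkout
cmd = hydra_command(script, "prediction", {"trainer.max_epochs": 10, "seed": 1})
print(shlex.join(cmd))
# python train.py experiment=prediction trainer.max_epochs=10 seed=1
```

The resulting list can be passed to subprocess.run(cmd, check=True) to launch a run, for instance when looping over several seeds.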

Directory structure

We follow the directory structure from the Hydra-Lightning template, which looks like:

├── .github                   <- GitHub Actions workflows
│
├── configs                   <- Hydra configs
│   ├── callbacks                <- Callbacks configs
│   ├── data                     <- Data configs
│   ├── debug                    <- Debugging configs
│   ├── experiment               <- Experiment configs
│   ├── extras                   <- Extra utilities configs
│   ├── hparams_search           <- Hyperparameter search configs
│   ├── hydra                    <- Hydra configs
│   ├── local                    <- Local configs
│   ├── logger                   <- Logger configs
│   ├── model                    <- Model configs
│   ├── paths                    <- Project paths configs
│   ├── trainer                  <- Trainer configs
│   │
│   ├── eval.yaml             <- Main config for evaluation
│   └── train.yaml            <- Main config for training
│
├── data                   <- Project data
│
├── logs                   <- Logs generated by hydra and lightning loggers
│
├── notebooks              <- Jupyter notebooks. Naming convention is a number (for ordering),
│                             the creator's initials, and a short `-` delimited description,
│                             e.g. `1.0-jqp-initial-data-exploration.ipynb`.
│
├── scripts                <- Shell scripts
│
├── src                    <- Source code
│   ├── data                     <- Data scripts
│   ├── data_prepocessing        <- Data preprocessing scripts
│   ├── models                   <- Model scripts
│   ├── utils                    <- Utility scripts
│   │
│   ├── eval.py                  <- Run evaluation
│   └── train.py                 <- Run training
│
├── tests                  <- Tests of any kind
│
├── .env.example              <- Example of file for storing private environment variables
├── .gitignore                <- List of files ignored by git
├── .pre-commit-config.yaml   <- Configuration of pre-commit hooks for code formatting
├── .project-root             <- File for inferring the position of project root directory
├── environment.yaml          <- File for installing conda environment
├── Makefile                  <- Makefile with commands like `make train` or `make test`
├── pyproject.toml            <- Environment requirements and configuration options for testing and linting
├── setup.py                  <- File for installing project as a package
├── uv.lock                   <- A frozen snapshot of exact dependencies for the uv package manager.
└── README.md

Attribution

This repo is based on the Hydra-Lightning template. Some code was adapted from github.com/vdplasthijs/PECL/.
