Read about this project in the Xarray blog.
🏔️⚡ A collaborative benchmarking and optimization effort from NSF-NCAR, Development Seed, and NVIDIA to accelerate data-intensive geoscience AI/ML workflows using GPU-native technologies like Zarr v3, CuPy, KvikIO, and NVIDIA DALI.
This repository contains code, benchmarks, and examples from Xarray on GPUs hackathon project during the NREL/NCAR/NOAA Open Hackathon in Golden, Colorado from 18-27 February 2025. The goal of this project is to provide a proof-of-concept example of optimizing the performance of geospatial machine learning workflows on GPUs by using Zarr-python v3 and NVIDIA DALI.
In this project, we demonstrate how to:
- Optimize chunking strategies for Zarr datasets
- Read ERA5 Zarr v3 data directly into GPU memory using CuPy and KvikIO
- Apply GPU-based decompression using NVIDIA's nvCOMP
- Build end-to-end GPU-native DALI pipelines
- Improve training throughput for U-Net-based ML models
In this repository, you will find the following:
benchmarks/
: Scripts to evaluate read and write performance for Zarr v3 datasets on both CPU and GPU.zarr_dali_example/
: Contains a minimal example of using DALI to read Zarr data and train a model.zarr_ML_optimization
: Contains an example benchmark for training a U-Net model using DALI with Zarr data format.rechunk
: Contains a notebook that demonstrates how to optimize chunking strategies for Zarr datasets.
See zarr_ML_optimization/README.md for more details on running the U-Net training example.
Start by cloning the repo & setting up the conda
environment:
git clone https://github.com/pangeo-data/ncar-hackathon-xarray-on-gpus.git
cd ncar-hackathon-xarray-on-gpus
conda env create --file environment.yml
conda activate gpuhackathon
This is for those who want full reproducibility of the virtual environment. Create a virtual environment with just Python and conda-lock installed first.
conda create --name gpuhackathon python=3.11 conda-lock=2.5.7
conda activate gpuhackathon
Generate a unified conda-lock.yml
file
based on the dependency specification in environment.yml
. Use only when
creating a new conda-lock.yml
file or refreshing an existing one.
conda-lock lock --mamba --file environment.yml --platform linux-64 --with-cuda=12.8
Installing/Updating a virtual environment from a lockile. Use this to sync your
dependencies to the exact versions in the conda-lock.yml
file.
conda-lock install --mamba --name gpuhackathon conda-lock.yml
See also https://conda.github.io/conda-lock/output/#unified-lockfile for more usage details.