License: MIT

TrackOverlayML Pipeline

Note: Refactored and documented with GitHub Copilot assistance. Based on ATLAS Collaboration code for ML-driven track overlay routing.

Overview

Train a neural network to route ATLAS simulation events between two overlay workflows:

  • MC-overlay: full simulation (accurate but slow)
  • Track-overlay: fast simulation (an approximation)
  • Goal: use Track-overlay when it agrees with MC-overlay (MatchProb > 0.5), and fall back to MC-overlay otherwise (see the sketch below)
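
In code, the routing rule is just a threshold on the match probability. A minimal Python sketch; MatchProb and the 0.5 cut come from this README, while the function and variable names are illustrative, not the repository's API:

# Minimal sketch of the routing rule. route_event is illustrative,
# not repository code.
def route_event(match_prob: float) -> str:
    """Pick the overlay workflow for a single event."""
    return "Track-overlay" if match_prob > 0.5 else "MC-overlay"

print(route_event(0.9))  # -> Track-overlay
print(route_event(0.2))  # -> MC-overlay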

Data Access

The framework requires ATLAS simulation data in specific formats:

Option 1: Pre-processed HDF5 files (ready for training)

/eos/user/f/fatsai/TrackOverlayDATA/matched_JZ7W_data.h5
/eos/user/f/fatsai/TrackOverlayDATA/unmatched_JZ7W_data.h5
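
The project depends on pandas and tables, so these files are presumably pandas/PyTables stores. A hedged sketch for inspecting one; the store key is not documented here, so list the keys first:

# Hedged sketch: assumes the .h5 files are pandas/PyTables stores
# (the pandas and tables dependencies suggest this).
import pandas as pd

with pd.HDFStore("matched_JZ7W_data.h5", mode="r") as store:
    keys = store.keys()
    print(keys)           # discover the stored key(s), e.g. ['/df']
    df = store[keys[0]]   # load the first stored table
print(df.shape)
print(df.head())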

Option 2: Raw CSV files (for the full preprocessing pipeline)

/eos/user/f/fatsai/TrackOverlayDATA/MCOverlay_JZ7W/*.csv
/eos/user/f/fatsai/TrackOverlayDATA/TrackOverlay_JZ7W/*.csv

Access: These datasets are stored on CERN EOS and require ATLAS collaboration access rights; to request access, contact the ATLAS collaboration.

Expected directory structure:

data/
├── MC-overlay_JZ7W/
│   ├── file1.csv
│   ├── file2.csv
│   └── ...
└── Track-overlay_JZ7W/
    ├── file1.csv
    ├── file2.csv
    └── ...

For other samples, use the sample name in the directory names:

data/
├── MC-overlay_ttbar/
│   └── *.csv
├── Track-overlay_ttbar/
│   └── *.csv
├── MC-overlay_JZ7W/
│   └── *.csv
└── Track-overlay_JZ7W/
    └── *.csv

Setting up your data:

# Create directories for your sample
mkdir -p data/MC-overlay_ttbar
mkdir -p data/Track-overlay_ttbar

Then copy or link your CSV files into these directories.

Quick Start

Recommended: Using Singularity

The easiest way to run this framework is using the pre-built Singularity container, which includes all dependencies:

# Pull the container (only needed once)
singularity pull docker://fyingtsai/dsnnr_4gpu:v5
# or on Perlmutter
podman-hpc pull docker://fyingtsai/dsnnr_4gpu:v5

Alternative: Local Installation

If you cannot use Singularity, install dependencies locally:

Option A: Using uv

# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create a virtual environment and install the project
uv venv
source .venv/bin/activate
uv pip install -e .

# uv run then executes scripts inside the environment, e.g.:
uv run python scripts/prepare_data.py --sample JZ7W --path data

Option B: Using Conda

# Create environment from file
conda env create -f environment.yml --prefix /path/to/your/scratch/trackoverlay-ml
# Activate environment
conda activate /path/to/your/scratch/trackoverlay-ml

Option C: Using pip

pip install "tensorflow>=2.8.0" "numpy>=1.21.0" "pandas>=1.3.0" "scikit-learn>=1.0.0" "matplotlib>=3.5.0,<3.9.0" "seaborn>=0.11.0" "tables>=3.7.0" "statsmodels>=0.13.0" "mplhep>=0.3.28,<0.4.0" "xarray>=0.20.0"

Note: All examples in this README assume Singularity usage. For local installation, remove the singularity exec dsnnr_4gpu_v5.sif prefix.

Example:

# Full pipeline
singularity exec dsnnr_4gpu_v5.sif python scripts/run_pipeline.py --sample JZ7W --epochs 5

# Or run the steps individually (recommended: easier to debug and finer control)
singularity exec dsnnr_4gpu_v5.sif python scripts/prepare_data.py --sample JZ7W --path data
singularity exec dsnnr_4gpu_v5.sif python scripts/train_model.py --sample JZ7W --path data --epochs 5
singularity exec dsnnr_4gpu_v5.sif python scripts/evaluate_model.py --sample JZ7W

More training examples

# Train on balanced 10k + 10k
singularity exec dsnnr_4gpu_v5.sif python scripts/train_model.py --sample ttbar --matched_size 10000 --unmatched_size 10000

# Train on realistic imbalanced ratio (1:10)
singularity exec dsnnr_4gpu_v5.sif python scripts/train_model.py --sample ttbar --matched_size 5000 --unmatched_size 50000

# Use all matched, but limit unmatched
singularity exec dsnnr_4gpu_v5.sif python scripts/train_model.py --sample ttbar --unmatched_size 20000

# Full pipeline with balanced training
singularity exec dsnnr_4gpu_v5.sif python scripts/run_pipeline.py --stage all --sample ttbar --path data --matched_size 5000 --unmatched_size 5000 --epochs 20

# Full pipeline with cross-sample evaluation
singularity exec dsnnr_4gpu_v5.sif python scripts/run_pipeline.py --stage all --sample ttbar --eval_sample JZ7W

# Just train on subset
singularity exec dsnnr_4gpu_v5.sif python scripts/run_pipeline.py --stage train --sample ttbar --matched_size 10000 --unmatched_size 10000

Project Structure

TrackOverlayML/
├── data/                       # Data directory (--path to customize)
│   ├── MC-overlay_{sample}/    # MC workflow CSVs (required if the HDF5 files do not exist yet)
│   ├── Track-overlay_{sample}/ # Track workflow CSVs (required if the HDF5 files do not exist yet)
│   ├── matched_{sample}_data.h5    # Good matches (pre-created)
│   └── unmatched_{sample}_data.h5  # Poor matches (pre-created)
├── scripts/                    # Main entry points
│   ├── prepare_data.py         # Merge MC/Track, compute features
│   ├── train_model.py          # Train classifier
│   ├── evaluate_model.py       # Evaluate performance
│   └── run_pipeline.py         # Run all steps
├── network/classifier.py       # Model architecture
├── utils/                      # Evaluation & plotting
└── results/                    # Outputs (models, plots, logs)

Usage

Individual Steps (best for those just getting started)

# Step 1: Prepare data (merge MC/Track workflows)
singularity exec dsnnr_4gpu_v5.sif python scripts/prepare_data.py --sample ttbar --trainsplit 0.8

# Step 2: Train model
singularity exec dsnnr_4gpu_v5.sif python scripts/train_model.py --sample ttbar --epochs 200

# Step 3: Evaluate (same sample)
singularity exec dsnnr_4gpu_v5.sif python scripts/evaluate_model.py --sample ttbar

# Step 3b: Evaluate on different sample
singularity exec dsnnr_4gpu_v5.sif python scripts/evaluate_model.py --sample ttbar --eval_sample JZ7W

Common Workflows

Train multiple models on same data:

singularity exec dsnnr_4gpu_v5.sif python scripts/prepare_data.py --sample ttbar
singularity exec dsnnr_4gpu_v5.sif python scripts/train_model.py --sample ttbar --layers 32 16 8
singularity exec dsnnr_4gpu_v5.sif python scripts/train_model.py --sample ttbar --layers 64 32 16

Cross-sample evaluation:

# Train on ttbar, test on JZ7W
singularity exec dsnnr_4gpu_v5.sif python scripts/train_model.py --sample ttbar
singularity exec dsnnr_4gpu_v5.sif python scripts/prepare_data.py --sample JZ7W
singularity exec dsnnr_4gpu_v5.sif python scripts/evaluate_model.py --sample ttbar --eval_sample JZ7W

Quick evaluation on subset:

singularity exec dsnnr_4gpu_v5.sif python scripts/evaluate_model.py --sample ttbar --matched_size 5000 --unmatched_size 50000

Key Arguments

Argument          Default    Description
--path            data       Data directory path
--sample          JZ7W       Sample name (ttbar, JZ7W, etc.)
--eval_sample     None       Different sample for evaluation
--trainsplit      0.8        Train/test split ratio
--epochs          100        Number of training epochs
--batchsize       80         Batch size
--lr              0.001      Learning rate
--layers          45 35 30   Hidden layer sizes
--patience        20         Early-stopping patience
--rouletter       smart      Roulette type (smart/hard)
--matched_size    None       Limit on matched samples for training/eval
--unmatched_size  None       Limit on unmatched samples for training/eval

Run python scripts/run_pipeline.py --help for the full list.
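
For intuition, the --layers default (45 35 30) suggests a small fully connected classifier. Below is a hedged Keras sketch of what such an architecture could look like; the real model lives in network/classifier.py and may differ, and the feature count is hypothetical:

# Illustrative sketch only: the actual architecture is defined in
# network/classifier.py. This shows how hidden sizes like the
# --layers default (45 35 30) could map onto a simple Keras MLP.
import tensorflow as tf

def build_classifier(n_features, layers=(45, 35, 30), lr=1e-3):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(n_features,)))
    for units in layers:
        model.add(tf.keras.layers.Dense(units, activation="relu"))
    # Single sigmoid output: P(TargetLabel = 1), i.e. Track-overlay is safe.
    model.add(tf.keras.layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="binary_crossentropy",
                  metrics=["AUC"])
    return model

model = build_classifier(n_features=10)  # feature count is hypothetical
model.summary()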

Data Flow

MC-overlay_{sample}/        Track-overlay_{sample}/
└── *.csv                   └── *.csv
         ↓                           ↓
         └──── Merge on EventNumber ─┘
                      ↓
          Create labels (MatchProb > 0.5)
                      ↓
    matched_*.h5 (good) & unmatched_*.h5 (poor)
                      ↓
              Train/Test split
                      ↓
            Train classifier
                      ↓
          Evaluate performance
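
In pandas terms, the merge-and-label stages of this flow might look roughly like the sketch below. EventNumber, MatchProb, and TargetLabel come from this README; the suffixes, store key, and file handling are illustrative, not the actual prepare_data.py logic:

# Rough sketch of the merge-and-label stages above.
import glob
import pandas as pd

mc = pd.concat(pd.read_csv(f) for f in glob.glob("data/MC-overlay_JZ7W/*.csv"))
track = pd.concat(pd.read_csv(f) for f in glob.glob("data/Track-overlay_JZ7W/*.csv"))

# Merge the two workflows event by event.
merged = mc.merge(track, on="EventNumber", suffixes=("_mc", "_track"))

# Label: MatchProb > 0.5 means Track-overlay reproduces MC-overlay well.
# (MatchProb is assumed to come from one input or the matching step.)
merged["TargetLabel"] = (merged["MatchProb"] > 0.5).astype(int)

matched = merged[merged["TargetLabel"] == 1]
unmatched = merged[merged["TargetLabel"] == 0]
matched.to_hdf("data/matched_JZ7W_data.h5", key="df", mode="w")
unmatched.to_hdf("data/unmatched_JZ7W_data.h5", key="df", mode="w")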

Output

results/{sample}/
├── classifier/
│   ├── classifier.h5           # Trained model
│   └── history.pkl             # Training history
├── logs/                       # Logs for each step
└── {xscore}/{rouletter}/
    └── plots/                  # ROC, efficiency, fraction plots
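
history.pkl is presumably the pickled Keras training history (per-epoch loss and metric curves); assuming that, a quick look could be:

# Hedged sketch: assumes history.pkl is a pickled dict of per-epoch
# Keras metrics (e.g. History.history); the exact contents may differ.
import pickle
import matplotlib.pyplot as plt

with open("results/JZ7W/classifier/history.pkl", "rb") as f:
    history = pickle.load(f)

print(history.keys())  # e.g. dict_keys(['loss', 'val_loss', ...])
plt.plot(history["loss"], label="train loss")
if "val_loss" in history:
    plt.plot(history["val_loss"], label="val loss")
plt.xlabel("epoch")
plt.legend()
plt.savefig("loss_curve.png")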

Notes

  • Matched (TargetLabel=1): MatchProb > 0.5 (Track-overlay accurate)
  • Unmatched (TargetLabel=0): MatchProb ≤ 0.5 (needs MC-overlay)
  • Preprocessed HDF5 files are cached for faster reruns

Contributing

When making changes:

  • Keep function docstrings updated
  • Add inline comments for complex physics calculations
  • Update this README if workflow changes
