The Elastic Analysis Facility (EAF) is a Fermilab GPU cluster (NVIDIA A100). If you have a Fermilab Services account, you also have EAF access.
Notes
- Each user gets a single GPU with a memory quota. Use EAF for in-house sample production and light workloads. For heavy jobs, use Polaris or hand off to the production team.
You need either a VPN or a browser proxy.
Option: SOCKS proxy for your browser (Firefox)
- Start a dynamic SOCKS proxy to a FNAL host:
ssh -D 3128 USER@FNAL_MACHINE
- In Firefox, set the Automatic proxy configuration URL to the provided PAC file: https://www.nevis.columbia.edu/~sc5303/fnal-proxy.pac
Tip: The PAC file assumes port 3128, matching the command above.
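The proxy command above can be wrapped in a small helper so the port stays in sync with the PAC file. This is a minimal sketch: the function name start_fnal_proxy and the DRY_RUN switch are assumptions made for illustration, and -N (do not run a remote command) is a standard ssh option added here so the session only carries the tunnel.

```shell
# Hypothetical helper (not part of any Fermilab tooling).
# Usage: start_fnal_proxy USER@FNAL_MACHINE [PORT]
# With DRY_RUN set, the command is printed instead of executed.
start_fnal_proxy() {
    proxy_target="${1:?usage: start_fnal_proxy USER@FNAL_MACHINE [PORT]}"
    # 3128 is the port the PAC file expects; change it only if you also
    # change the PAC file.
    proxy_port="${2:-3128}"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "ssh -N -D $proxy_port $proxy_target"
    else
        # -D opens the dynamic SOCKS proxy, -N keeps the session tunnel-only.
        ssh -N -D "$proxy_port" "$proxy_target"
    fi
}
```

Leave the command running in its own terminal while you use the browser; closing it closes the proxy.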
- EAF mounts nearly the same disks as the GPVMs, except that /pnfs is not mounted.
- For SPINE software, use: /exp/sbnd/app/users/sc5303/SPINE
EAF does not include Apptainer by default. The easiest route is to install it inside a Conda environment.
Quota tip: EAF has a small /home quota. Point Conda/Pip/Apptainer caches to /exp/sbnd/data or /exp/sbnd/app to avoid filling /home.
Recommended cache settings (add to your shell startup):
export APPTAINER_CACHEDIR=/exp/sbnd/data/users/USERNAME/apptainer_cache
export PIP_CACHE_DIR=/exp/sbnd/data/users/USERNAME/pip_cache
export XDG_CACHE_HOME=/exp/sbnd/data/users/USERNAME/.cache
Example .condarc:
pkgs_dirs:
- /exp/sbnd/data/users/sc5303/apptainer_cache/pkgs
envs_dirs:
- /exp/sbnd/data/users/sc5303/apptainer_cache/envs
channels:
- conda-forge
solver: libmamba
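The cache variables above only help if the target directories exist. A minimal sketch that creates them and exports the variables in one step follows. The function name setup_eaf_caches is an assumption for illustration, and the subdirectory names simply mirror the recommended settings.

```shell
# Hypothetical helper: create the cache directories under BASE_DIR and
# export the matching environment variables.
# Usage: setup_eaf_caches BASE_DIR
setup_eaf_caches() {
    cache_base="${1:?usage: setup_eaf_caches BASE_DIR}"
    mkdir -p "$cache_base/apptainer_cache" \
             "$cache_base/pip_cache" \
             "$cache_base/.cache" || return 1
    export APPTAINER_CACHEDIR="$cache_base/apptainer_cache"
    export PIP_CACHE_DIR="$cache_base/pip_cache"
    export XDG_CACHE_HOME="$cache_base/.cache"
}

# Example call for a shell startup file (commented out; assumes the
# per-user directory layout used in this document):
# setup_eaf_caches "/exp/sbnd/data/users/$USER"
```

Calling the function from a startup file keeps the directories and the variables in agreement if you later move the caches.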
Build OpT0Finder (from the repository root):
source configure.sh
make -j
Inference uses a trained model to produce predictions on unseen data. In this workflow, it creates HDF5 files from LArCV inputs.
LArCV files are created after the reco1 stage in SBND.
Script used: /exp/sbnd/app/users/sc5303/SPINE/inference/inference.sh
Key variables inside the script
- CFG: Configuration file path
- LOG_DIR: Directory for logs
- FNAME: Input LArCV file(s); can be a list
- workdir: Output directory (use /exp/sbnd/data due to size)
- container: Apptainer image path
- CUDA_VISIBLE_DEVICES: Set to 0 (EAF provides one GPU)
Ensure the output directory exists and has sufficient space.
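Before launching a long inference job, it can help to verify the paths that the script variables point to. The following is a sketch of such a pre-flight check. The function check_inference_inputs is hypothetical and is not part of inference.sh; it only tests that files and directories exist and reports free space, leaving the judgement of "sufficient" to you.

```shell
# Hypothetical pre-flight check; not part of inference.sh.
# Usage: check_inference_inputs CFG WORKDIR CONTAINER FILE [FILE...]
# Returns 0 if everything is in place, 1 if something is missing,
# 2 on wrong usage.
check_inference_inputs() {
    if [ "$#" -lt 4 ]; then
        echo "usage: check_inference_inputs CFG WORKDIR CONTAINER FILE [FILE...]" >&2
        return 2
    fi
    chk_cfg=$1
    chk_workdir=$2
    chk_container=$3
    shift 3
    chk_status=0
    if [ ! -f "$chk_cfg" ]; then
        echo "missing config file: $chk_cfg" >&2
        chk_status=1
    fi
    if [ ! -f "$chk_container" ]; then
        echo "missing container image: $chk_container" >&2
        chk_status=1
    fi
    if [ ! -d "$chk_workdir" ] || [ ! -w "$chk_workdir" ]; then
        echo "output directory missing or not writable: $chk_workdir" >&2
        chk_status=1
    fi
    # Remaining arguments are the input LArCV files.
    for chk_file in "$@"; do
        if [ ! -f "$chk_file" ]; then
            echo "missing input file: $chk_file" >&2
            chk_status=1
        fi
    done
    if [ "$chk_status" -eq 0 ]; then
        # Print the free space so you can judge whether it is enough.
        df -Pk "$chk_workdir" | awk 'NR == 2 { print "free KiB in workdir: " $4 }'
    fi
    return "$chk_status"
}
```

A typical call passes the same values that CFG, workdir, container, and FNAME hold in the script.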
CAFs and flatCAFs are created from cafmakerjob_sbnd.fcl in SBND.
HDF5 files are created from the inference step above.
Merge HDF5 and CAF using sbn-ml-cafmaker (see link below).
The merged output is a CAF. To convert it to flatCAF format, run
flatten_caf normal_caf_file.caf.root flat_caf_file_name.flat.caf.root
after setting up sbnana.
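When several merged CAFs need flattening, the output names can be derived from the input names. The sketch below assumes the .caf.root and .flat.caf.root naming shown above. The helpers flat_caf_name and flatten_all are hypothetical, and flatten_caf itself must already be available from the sbnana setup.

```shell
# Derive the flatCAF name from a CAF name: x.caf.root -> x.flat.caf.root
flat_caf_name() {
    printf '%s\n' "${1%.caf.root}.flat.caf.root"
}

# Hypothetical batch wrapper around flatten_caf.
# Usage: flatten_all FILE.caf.root [FILE.caf.root...]
flatten_all() {
    for caf_in in "$@"; do
        case "$caf_in" in
            *.flat.caf.root) continue ;;   # already flat, skip it
            *.caf.root) ;;                 # regular CAF, process below
            *) echo "not a CAF file: $caf_in" >&2; return 1 ;;
        esac
        flatten_caf "$caf_in" "$(flat_caf_name "$caf_in")" || return 1
    done
}
```

The wrapper stops at the first failure so that a broken input does not go unnoticed in a long list.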
Select a tagged sbnana version available from ups list -aK+ sbnana.
For example:
setup sbnana v10_01_04 -q e26:prof
Then configure and build sbn-ml-cafmaker:
cmake /exp/sbnd/app/users/sc5303/sbn_ml_cafmaker -DHDF5_INSTALL="/exp/icarus/app/users/mueller/hdf5/hdf5_install"
make
Where to get components:
- Inference bash script: https://github.com/bear-is-asleep/sbnd_spine_train/blob/master/deghost/train_uresnet.sh (modify for EAF)
- Configuration files (CFG): https://github.com/DeepLearnPhysics/spine-prod/tree/main/config/sbnd
Repositories to clone from GitHub:
- OpT0Finder: https://github.com/bear-is-asleep/OpT0Finder
- spine: https://github.com/DeepLearnPhysics/spine/tree/v0.7.6
- spine-prod: https://github.com/DeepLearnPhysics/spine-prod
- sbn-ml-cafmaker: https://github.com/justinjmueller/sbn_ml_cafmaker
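The repositories listed above can be cloned in one pass. This is a minimal sketch: the helper names spine_repos and clone_spine_repos are assumptions for illustration, and v0.7.6 is taken from the spine link above and passed to git clone --branch, which accepts either a tag or a branch name.

```shell
# Repository list taken from the links above; the optional second
# column is the ref to check out.
spine_repos() {
    cat <<'EOF'
https://github.com/bear-is-asleep/OpT0Finder
https://github.com/DeepLearnPhysics/spine v0.7.6
https://github.com/DeepLearnPhysics/spine-prod
https://github.com/justinjmueller/sbn_ml_cafmaker
EOF
}

# Hypothetical helper: clone each repository into DEST_DIR, skipping
# directories that already exist.
# Usage: clone_spine_repos DEST_DIR
clone_spine_repos() {
    clone_dest="${1:?usage: clone_spine_repos DEST_DIR}"
    mkdir -p "$clone_dest" || return 1
    spine_repos | while read -r clone_url clone_ref; do
        clone_target="$clone_dest/$(basename "$clone_url")"
        if [ -d "$clone_target" ]; then
            echo "skipping existing $clone_target"
        elif [ -n "$clone_ref" ]; then
            # </dev/null keeps git from consuming the repository list.
            git clone --branch "$clone_ref" "$clone_url" "$clone_target" </dev/null
        else
            git clone "$clone_url" "$clone_target" </dev/null
        fi
    done
}
```

Given the small /home quota, a destination under /exp/sbnd/app is probably the safer choice.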