pyECT-experiments

The code for reproducing the experiments using the pyECT package.

Python version 3.10 should be installed on your system (or in your virtual environment). Base requirements can be installed by running pip install -r requirements.txt from the root of this repository.

Acknowledgements

Computational efforts were performed on the Tempest High Performance Computing System, operated and supported by University Information Technology Research Cyberinfrastructure (RRID:SCR_026229) at Montana State University.

Data Setup

Fashion-MNIST: The Fashion-MNIST dataset may be downloaded here. The training set CSV file should then be moved into the data/fashionmnist/ directory and preprocessed by running data/fashionmnist/preprocess-fashionmnist.py.
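The experiment driver expects image data as a compressed NumPy .npz file, so the preprocessing plausibly converts the CSV (one label plus 784 pixel values per row, the standard Fashion-MNIST CSV layout) into arrays. A hedged sketch of that conversion; the real preprocess-fashionmnist.py may organize its output differently:

```python
import numpy as np

# Hypothetical sketch of the CSV -> .npz conversion; the actual
# preprocess-fashionmnist.py script may use different array names.
def csv_to_npz(csv_path, out_path):
    # Skip the header row; each data row is label, pixel0, ..., pixel783.
    raw = np.loadtxt(csv_path, delimiter=",", skiprows=1, ndmin=2)
    labels = raw[:, 0].astype(np.int64)
    images = raw[:, 1:].reshape(-1, 28, 28).astype(np.float32)
    np.savez_compressed(out_path, images=images, labels=labels)
```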

ImageNet: The ImageNet sample dataset can be downloaded here. The images should be moved to a data/imagenet/images directory within this repo and then preprocessed by running the data/imagenet/preprocess-imagenet.py script.

3DCCs: The 3DCCs dataset can be generated by running the data/generate_3d_cubical_complexes.py script.

Stanford Mesh Datasets: The Stanford mesh datasets can be downloaded as PLY files from here. They should be moved to the data/stanford directory and preprocessed using the data/stanford/preprocess-stanford.py file.
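Taken together, the steps above assume a particular layout under data/. A minimal sketch that creates the expected directories (names taken directly from the paths mentioned above):

```shell
# Create the data directories expected by the preprocessing scripts above.
mkdir -p data/fashionmnist data/imagenet/images data/stanford
# 3DCCs are generated directly by data/generate_3d_cubical_complexes.py,
# so no download directory is needed for them.
```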

Dependency/Package Setup

Base requirements for running the package can be installed using pip install -r requirements.txt. Then, additional packages need to be installed using the following steps.

PyECT: PyECT should be installed from the anonymous repository linked in the paper. From the root of that repository, run pip install .

Eucalc: Eucalc should be installed following the instructions here.

FastTopology: FastTopology may be installed and compiled from here.
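After installing, a quick way to confirm that the extra packages are importable before launching a long experiment. The module names below are guesses; substitute whatever each package actually installs under:

```python
import importlib.util

# Hypothetical module names -- adjust to the names the packages install as.
OPTIONAL_DEPS = ("pyect", "eucalc", "fasttopology")

def check_optional_deps(names=OPTIONAL_DEPS):
    """Return a {module_name: available} map without importing anything."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

print(check_optional_deps())
```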

Reproducing Experiments

To run the experiments, run the experiments/main.py script, specifying values for the following command-line arguments:

Arguments

  • data_path (str, required)
    Path to the dataset file. This may be:

    • A compressed NumPy .npz file containing image data, or
    • An .obj file containing 3D mesh data.

    The file will be loaded and processed depending on the selected data_type.
  • data_type (str, required)
    Specifies the format and structure of the input data. Must be one of the predefined DATA_TYPES.
    Examples include:

    • 'image': 2D image data stored in .npz format
    • '3d_mesh': 3D triangular mesh in .obj format
    • '3d_cubical_complex': Voxel-based or grid-based 3D data
  • invariant (str, required)
    The topological invariant to compute. Must be one of INVARIANT_TYPES. Examples:

    • 'wect': Weighted Euler Characteristic Transform
    • 'ecf': Euler Characteristic Field
  • implementation_name (str, required)
    A descriptive name for the experiment or implementation variant being executed.
    Used for logging, result organization, and reproducibility.

  • output_path (str, required)
    Destination path for the output .csv file where computed invariants or experiment results will be saved.

  • num_directions (int, required)
    Number of directions for directional transforms (e.g., for WECT).
    Higher values produce more detailed signatures but increase computation time.

  • num_timesteps (int, required)
    Number of filtration time steps used when computing invariants.
    Controls the resolution of the transform.

  • device ("cpu" | "cuda" | "mps", optional; default: "cpu")
    Which hardware backend to use:

    • "cpu" for standard CPU execution
    • "cuda" for NVIDIA GPU acceleration
    • "mps" for Apple Silicon GPU acceleration
  • batch_size (int, optional; default: 1)
    Number of images to process simultaneously.
    Useful for GPU acceleration when processing many images.

  • cores (int, optional; default: 1)
    Number of CPU cores to utilize for parallelized implementations.
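
The argument list above maps naturally onto an argparse interface. A sketch of what that CLI definition might look like; the real experiments/main.py, and the full contents of DATA_TYPES and INVARIANT_TYPES, may differ (only the values named above are listed here):

```python
import argparse

# Values taken from the examples above; the real lists may be longer.
DATA_TYPES = ["image", "3d_mesh", "3d_cubical_complex"]
INVARIANT_TYPES = ["wect", "ecf"]

def build_parser():
    p = argparse.ArgumentParser(description="Hypothetical sketch of the experiments/main.py CLI")
    p.add_argument("--data_path", required=True)
    p.add_argument("--data_type", required=True, choices=DATA_TYPES)
    p.add_argument("--invariant", required=True, choices=INVARIANT_TYPES)
    p.add_argument("--implementation_name", required=True)
    p.add_argument("--output_path", required=True)
    p.add_argument("--num_directions", type=int, required=True)
    p.add_argument("--num_timesteps", type=int, required=True)
    p.add_argument("--device", choices=["cpu", "cuda", "mps"], default="cpu")
    p.add_argument("--batch_size", type=int, default=1)
    p.add_argument("--cores", type=int, default=1)
    return p

# A single example invocation, parsed programmatically:
args = build_parser().parse_args([
    "--data_path=data/armadillo.obj",
    "--data_type=3d_mesh",
    "--invariant=wect",
    "--implementation_name=dect",
    "--output_path=results/out.csv",
    "--num_directions=64",
    "--num_timesteps=100",
])
print(args.device, args.num_directions)  # → cpu 64
```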

As an example, the following bash script sweeps the mesh experiments over several direction and timestep settings:

INVARIANT="wect"
IMPLEMENTATION_NAME="dect"
DEVICE="cpu"
DATASET="armadillo"

DIRECTIONS=(1 2)
TIMESTEPS=(10 100)

for NUM_DIRECTIONS in "${DIRECTIONS[@]}"; do
  for NUM_TIMESTEPS in "${TIMESTEPS[@]}"; do

    OUTPUT_PATH="results/${DATASET}/${INVARIANT}/${IMPLEMENTATION_NAME}/${NUM_DIRECTIONS}_dirs_${NUM_TIMESTEPS}_timesteps_${DEVICE}.csv"

    # ensure directory exists
    DIR_NAME=$(dirname "$OUTPUT_PATH")
    mkdir -p "$DIR_NAME"

    echo "Running: directions=$NUM_DIRECTIONS, timesteps=$NUM_TIMESTEPS"

    python experiments/main.py \
      --data_path="data/${DATASET}.obj" \
      --data_type="3d_mesh" \
      --invariant="${INVARIANT}" \
      --implementation_name="${IMPLEMENTATION_NAME}" \
      --output_path="${OUTPUT_PATH}" \
      --num_directions="${NUM_DIRECTIONS}" \
      --num_timesteps="${NUM_TIMESTEPS}" \
      --device="${DEVICE}" \
      --batch_size=1 \
      --cores=1

  done
done
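
The sweep above encodes its parameters in each output file name ("<dirs>_dirs_<timesteps>_timesteps_<device>.csv"), so when collecting results you can recover them by parsing the names. A small helper sketch (the function name is ours, not part of the repo):

```python
import re
from pathlib import Path

# Matches the naming scheme used by the sweep script above.
PATTERN = re.compile(r"(\d+)_dirs_(\d+)_timesteps_(\w+)\.csv$")

def parse_result_name(path):
    """Recover sweep parameters from a result file's name."""
    m = PATTERN.search(Path(path).name)
    if m is None:
        raise ValueError(f"unrecognized result file: {path}")
    dirs, steps, device = m.groups()
    return {"num_directions": int(dirs), "num_timesteps": int(steps), "device": device}
```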

Creating Figures

Figures that plot the results can be created using the notebooks in the figures/ directory. Note that you may need to change path names based on the name of your results directory.
