pyECT-experiments

Code for reproducing the experiments that use the pyECT package.

Python 3.10 should be installed on your system (or in your virtual environment). Install the base requirements by running pip install -r requirements.txt from the root of this repository.

Acknowledgements

Computational efforts were performed on the Tempest High Performance Computing System, operated and supported by University Information Technology Research Cyberinfrastructure (RRID:SCR_026229) at Montana State University.

Data Setup

Fashion-MNIST: The Fashion-MNIST dataset may be downloaded here. Then, the training set CSV file should be moved into the data/fashionmnist/ directory, and can be preprocessed by running data/fashionmnist/preprocess-fashionmnist.py.
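
As a rough illustration of what a preprocessing step of this kind might look like, the sketch below converts a label-plus-pixels CSV into a compressed .npz archive. It is not the repository's preprocess-fashionmnist.py, which may work differently. It assumes the CSV has one header row, the label in the first column, and side*side pixel values after it; the function name csv_to_npz and the array keys images and labels are hypothetical.

```python
import numpy as np


def csv_to_npz(csv_path, npz_path, side=28):
    """Convert a label-plus-pixels CSV into a compressed .npz archive.

    Assumption: one header row, then rows of `label, pixel_1, ..., pixel_N`
    with N == side * side. The keys `images` and `labels` are hypothetical.
    """
    # ndmin=2 keeps a single-row file two-dimensional.
    table = np.loadtxt(csv_path, delimiter=",", skiprows=1,
                       dtype=np.int64, ndmin=2)
    labels = table[:, 0]
    # Reshape each flat pixel row into a side x side grayscale image.
    images = table[:, 1:].reshape(-1, side, side).astype(np.uint8)
    np.savez_compressed(npz_path, images=images, labels=labels)
    return images.shape
```

A call such as csv_to_npz("train.csv", "train.npz") (both file names are placeholders) would return the shape of the stored image array.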

ImageNet: The ImageNet sample dataset can be downloaded here. The images should be moved to a data/imagenet/images directory within this repo, and then preprocessed by running the data/imagenet/preprocess-imagenet.py script.

3DCCs: The 3DCCs dataset can be generated by running the data/generate_3d_cubical_complexes.py script.

Stanford Mesh Datasets: The Stanford mesh datasets can be downloaded as PLY files from here. They should be moved to the data/stanford directory and preprocessed by running the data/stanford/preprocess-stanford.py script.

Dependency/Package Setup

Base requirements for running the package can be installed using pip install -r requirements.txt. Then, additional packages need to be installed using the following steps.

PyECT: PyECT should be installed from the anonymous repository linked in the paper. From the root of that repository, run pip install .

Eucalc: Eucalc should be installed following the instructions here.

FastTopology: FastTopology may be installed and compiled from here.

Reproducing Experiments

To run the experiments, run the experiments/main.py script, specifying values for the following command line arguments:

Arguments

data_path (str, required)
Path to the dataset file. This may be:
- A compressed NumPy .npz file containing image data, or
- An .obj file containing 3D mesh data.
The file will be loaded and processed depending on the selected data_type.
data_type (str, required)
Specifies the format and structure of the input data. Must be one of the predefined DATA_TYPES.
Examples include:
- 'image': 2D image data stored in .npz format
- '3d_mesh': 3D triangular mesh in .obj format
- '3d_cubical_complex': Voxel-based or grid-based 3D data
invariant (str, required)
The topological invariant to compute. Must be one of INVARIANT_TYPES. Examples:
- 'wect': Weighted Euler Characteristic Transform
- 'ecf': Euler Characteristic Field
implementation_name (str, required)
A descriptive name for the experiment or implementation variant being executed.
Used for logging, result organization, and reproducibility.
output_path (str, required)
Destination path for the output .csv file where computed invariants or experiment results will be saved.
num_directions (int, required)
Number of directions for directional transforms (e.g., for WECT).
Higher values produce more detailed signatures but increase computation time.
num_timesteps (int, required)
Number of filtration time steps used when computing invariants.
Controls the resolution of the transform.
device ("cpu" | "cuda" | "mps", optional; default: "cpu")
Which hardware backend to use:
- "cpu" for standard CPU execution
- "cuda" for NVIDIA GPU acceleration
- "mps" for Apple Silicon GPU acceleration
batch_size (int, optional; default: 1)
Number of images to process simultaneously.
Useful for GPU acceleration when processing many images.
cores (int, optional; default: 1)
Number of CPU cores to utilize for parallelized implementations.
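
For orientation, the argument list above could be expressed with argparse roughly as follows. This is a sketch, not the parser in experiments/main.py: the function name build_parser is hypothetical, and DATA_TYPES and INVARIANT_TYPES here contain only the example values listed above, so the real constants may include more.

```python
import argparse

# Assumption: only the example values from the argument list; the
# repository's DATA_TYPES and INVARIANT_TYPES may contain further entries.
DATA_TYPES = ("image", "3d_mesh", "3d_cubical_complex")
INVARIANT_TYPES = ("wect", "ecf")
DEVICES = ("cpu", "cuda", "mps")


def build_parser():
    """Build a parser mirroring the documented arguments (hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the experiment command line interface.")
    # Required arguments.
    parser.add_argument("--data_path", type=str, required=True)
    parser.add_argument("--data_type", type=str, required=True,
                        choices=DATA_TYPES)
    parser.add_argument("--invariant", type=str, required=True,
                        choices=INVARIANT_TYPES)
    parser.add_argument("--implementation_name", type=str, required=True)
    parser.add_argument("--output_path", type=str, required=True)
    parser.add_argument("--num_directions", type=int, required=True)
    parser.add_argument("--num_timesteps", type=int, required=True)
    # Optional arguments with the documented defaults.
    parser.add_argument("--device", type=str, choices=DEVICES, default="cpu")
    parser.add_argument("--batch_size", type=int, default=1)
    parser.add_argument("--cores", type=int, default=1)
    return parser
```

Restricting data_type, invariant, and device with choices makes a mistyped value fail at parse time rather than partway through a long run.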

As an example, you may consider this bash script:

INVARIANT="wect"
IMPLEMENTATION_NAME="dect"
DEVICE="cpu"
DATASET="armadillo"

DIRECTIONS=(1 2)
TIMESTEPS=(10 100)

for NUM_DIRECTIONS in "${DIRECTIONS[@]}"; do
  for NUM_TIMESTEPS in "${TIMESTEPS[@]}"; do

    OUTPUT_PATH="results/${DATASET}/${INVARIANT}/${IMPLEMENTATION_NAME}/${NUM_DIRECTIONS}_dirs_${NUM_TIMESTEPS}_timesteps_${DEVICE}.csv"

    # ensure directory exists
    DIR_NAME=$(dirname "$OUTPUT_PATH")
    mkdir -p "$DIR_NAME"

    echo "Running: directions=$NUM_DIRECTIONS, timesteps=$NUM_TIMESTEPS"

    python experiments/main.py \
      --data_path="data/${DATASET}.obj" \
      --data_type="3d_mesh" \
      --invariant="${INVARIANT}" \
      --implementation_name="${IMPLEMENTATION_NAME}" \
      --output_path="${OUTPUT_PATH}" \
      --num_directions="${NUM_DIRECTIONS}" \
      --num_timesteps="${NUM_TIMESTEPS}" \
      --device="${DEVICE}" \
      --batch_size=1 \
      --cores=1

  done
done
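
The same sweep can also be driven from Python, which may be convenient when the parameter grid grows. The sketch below only builds the output paths and command lines, following the naming scheme of the bash script above; the function name build_runs is hypothetical, and the data_type is fixed to 3d_mesh as in that script.

```python
import itertools
import sys
from pathlib import Path


def build_runs(dataset, invariant, implementation, device,
               directions, timesteps):
    """Return (output_path, command) pairs for every parameter combination."""
    runs = []
    for n_dirs, n_steps in itertools.product(directions, timesteps):
        # Same layout as OUTPUT_PATH in the bash script.
        out = (Path("results") / dataset / invariant / implementation
               / f"{n_dirs}_dirs_{n_steps}_timesteps_{device}.csv")
        cmd = [
            sys.executable, "experiments/main.py",
            f"--data_path=data/{dataset}.obj",
            "--data_type=3d_mesh",
            f"--invariant={invariant}",
            f"--implementation_name={implementation}",
            f"--output_path={out.as_posix()}",
            f"--num_directions={n_dirs}",
            f"--num_timesteps={n_steps}",
            f"--device={device}",
            "--batch_size=1",
            "--cores=1",
        ]
        runs.append((out, cmd))
    return runs


# Possible usage, run from the repository root:
#   import subprocess
#   for out, cmd in build_runs("armadillo", "wect", "dect", "cpu",
#                              [1, 2], [10, 100]):
#       out.parent.mkdir(parents=True, exist_ok=True)
#       subprocess.run(cmd, check=True)
```

Separating command construction from execution makes it possible to print or inspect the planned runs before launching any of them.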

Creating Figures

Figures that plot the results can be created using the notebooks in the figures/ directory. Note that you may need to change path names based on the name of your results directory.
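
When adapting path names, it can help to recover the run parameters from each result file's location. The sketch below assumes results are laid out as in the example bash script (results/DATASET/INVARIANT/IMPLEMENTATION/N_dirs_M_timesteps_DEVICE.csv); the function name describe_result is hypothetical, and nothing is assumed about the contents of the CSV files themselves.

```python
import re
from pathlib import Path

# File name scheme taken from OUTPUT_PATH in the example bash script.
FILENAME_PATTERN = re.compile(r"(\d+)_dirs_(\d+)_timesteps_(\w+)\.csv")


def describe_result(path):
    """Parse run parameters out of a result path, or return None."""
    parts = Path(path).parts
    if len(parts) < 4:
        return None
    match = FILENAME_PATTERN.fullmatch(parts[-1])
    if match is None:
        return None
    # The three directories above the file name the dataset, invariant,
    # and implementation, in that order.
    dataset, invariant, implementation = parts[-4:-1]
    return {
        "dataset": dataset,
        "invariant": invariant,
        "implementation": implementation,
        "num_directions": int(match.group(1)),
        "num_timesteps": int(match.group(2)),
        "device": match.group(3),
    }
```

Applied to every path returned by Path("results").rglob("*.csv"), this yields one record per run that a notebook can group by implementation or device.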
