When performing multiple SDGym runs on the same day, save the artifacts with consistent naming

### Problem Description
An upcoming version of SDGym will save the artifacts (synthesizers, synthetic data) that are created when benchmarking. These artifacts will be saved based on the date of the benchmarking run. For example:

```
output_destination/
|--- SDGym_results_06_24_2025/
     |--- census_06_24_2025/
          |--- CTGANSynthesizer/
               |-- CTGANSynthesizer.pkl
               |-- CTGANSynthesizer_synthetic_data.csv
          |--- GaussianCopulaSynthesizer/
               |-- GaussianCopulaSynthesizer.pkl
               |-- GaussianCopulaSynthesizer_synthetic_data.csv
     |--- <dataset_name>_06_24_2025/
          |--- <artifacts>
     |--- results.csv
     |--- metainfo.yaml
```

The problem is that the SDGym benchmark can be run multiple times on a single day. We need consistent, well-defined output naming when that happens.

### Expected behavior
If only one run happens on a given day, then the naming scheme for all the artifacts should be exactly as shown above.
- The final results should be named `results.csv`
- The meta info should be named `metainfo.yaml`
- The synthesizer folder should be named `<synthesizer_name>/`

If another run happens on the same day, then we should do the following:
- Create a new final results file called `results(1).csv`. (If that's taken, keep incrementing the suffix: `results(2).csv`, `results(3).csv`, etc.)
    - The number should correspond to the `run_id` that is saved in the `metainfo` file. The first one will be `run_<date>_0`, the next will be `run_<date>_1`, then `run_<date>_2`, etc.
- Do the same with the metainfo file. Name it `metainfo(1).yaml`. (If that's taken, keep incrementing the suffix: `metainfo(2).yaml`, `metainfo(3).yaml`, etc.)
    - The number should follow the same `run_id` correspondence, and it should match the number on the corresponding results file.
- Generally speaking, it would be really rare to run the same (synthesizer, dataset) combo a second time on the same day. However, if it happens, we should apply the same naming scheme to the new synthesizer folder. **Use the same number for the run that is used for the corresponding results and metainfo file.**
    - For example, the folder would then be called `CTGANSynthesizer(1)/`.
    - Inside the folder, the artifacts should be renamed to: `CTGANSynthesizer(1).pkl`, `CTGANSynthesizer(1)_synthetic_data.csv`, etc.
    - In `results(1).csv`, we should refer to the synthesizer as `CTGANSynthesizer(1)`.
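
The naming rules above could be sketched roughly as follows. This is only an illustration, not SDGym's implementation: the helper names (`with_run_suffix`, `next_run_index`, `synthesizer_label`, `artifact_paths`) are hypothetical, and the sketch assumes that a synthesizer folder only gets a suffix when a folder for that (synthesizer, dataset) combo already exists.

```python
from pathlib import Path


def with_run_suffix(name: str, run_index: int) -> str:
    """Return `name` for run 0 and `name(N)` for run N >= 1."""
    return name if run_index == 0 else f"{name}({run_index})"


def next_run_index(results_dir: Path) -> int:
    """Return the first run index whose results file is not on disk yet."""
    index = 0
    while (results_dir / f"{with_run_suffix('results', index)}.csv").exists():
        index += 1
    return index


def synthesizer_label(dataset_dir: Path, synthesizer: str, run_index: int) -> str:
    """Keep the plain name unless this combo already has a folder (assumed reading)."""
    if not (dataset_dir / synthesizer).exists():
        return synthesizer
    return with_run_suffix(synthesizer, run_index)


def artifact_paths(
    results_dir: Path, dataset_dir: Path, synthesizer: str, run_index: int
) -> dict:
    """Build the file paths for one run, all sharing the same run number."""
    label = synthesizer_label(dataset_dir, synthesizer, run_index)
    folder = dataset_dir / label
    return {
        "results": results_dir / f"{with_run_suffix('results', run_index)}.csv",
        "metainfo": results_dir / f"{with_run_suffix('metainfo', run_index)}.yaml",
        "synthesizer": folder / f"{label}.pkl",
        "synthetic_data": folder / f"{label}_synthetic_data.csv",
    }
```

Note that `next_run_index` checks for a file and leaves its creation to the caller, so two runs starting at the same moment could pick the same number. The sketch does not attempt to handle concurrent runs.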

The resulting structure would look like this:

```
output_destination/
|--- SDGym_results_06_24_2025/
     |--- census_06_24_2025/
          |--- CTGANSynthesizer/
               |-- CTGANSynthesizer.pkl
               |-- CTGANSynthesizer_synthetic_data.csv
          |--- GaussianCopulaSynthesizer/
               |-- GaussianCopulaSynthesizer.pkl
               |-- GaussianCopulaSynthesizer_synthetic_data.csv
          |--- CTGANSynthesizer(1)/
               |-- CTGANSynthesizer(1).pkl
               |-- CTGANSynthesizer(1)_synthetic_data.csv
     |--- <dataset_name>_06_24_2025/
          |--- <artifacts>
     |--- results.csv
     |--- results(1).csv
     |--- metainfo.yaml
     |--- metainfo(1).yaml
```
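
Because the suffix is meant to match the `run_id`, anyone reading these files back could recover the run number from the filename alone. A minimal sketch, assuming the `<date>` in `run_<date>_N` uses the same `MM_DD_YYYY` form as the folder names (the issue does not state this), with hypothetical helper names:

```python
import re

# Matches "results.csv", "results(1).csv", "metainfo(12).yaml", etc.
_SUFFIXED_NAME = re.compile(r"^(?P<stem>.+?)(?:\((?P<index>\d+)\))?(?P<ext>\.[^.]+)$")


def run_index_from_filename(filename: str) -> int:
    """Return the run number encoded in a filename; no suffix means run 0."""
    match = _SUFFIXED_NAME.match(filename)
    if match is None:
        raise ValueError(f"Unrecognized artifact filename: {filename!r}")
    index = match.group("index")
    return int(index) if index is not None else 0


def run_id_for(date: str, filename: str) -> str:
    """Build the run_id expected in the matching metainfo file.

    `date` is assumed to be formatted like the folder names, e.g. "06_24_2025".
    """
    return f"run_{date}_{run_index_from_filename(filename)}"
```

For example, `run_id_for("06_24_2025", "results(1).csv")` gives `run_06_24_2025_1`, and the unsuffixed `results.csv` maps to `run_06_24_2025_0`.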

### Additional context
If a user is doing multiple runs on the same day, it is most likely because they are splitting up the (synthesizer, dataset) combinations that they would like to test. Perhaps a second run is done on slower synthesizers or larger datasets.

In an ideal case, we'd just append the results from the subsequent run(s) to the existing `results.csv` and `metainfo.yaml` files. However, this would require us to implement a file locking system in case multiple concurrent runs try to access the same file at the same time. For now, this is out of scope, so we're writing a new file instead.

When performing multiple SDGym runs on the same day, save the artifacts with consistent naming #448
