Energy Benchmarks

Dominik Schwabe edited this page May 21, 2025 · 3 revisions

Energy is currently measured in-program with zeus. Because of how zeus and our energy measurement implementation work, CPU measurements require root privileges. GPU measurements need no additional privileges.

A measurement window is opened before and closed after the generation process for a given model. The model receives a dataset and generates output for each data point in turn. Because the window covers only the generation phase, dataset loading is excluded from the measurements.
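
The windowing can be sketched as follows. This is a minimal illustration, not the qtransform implementation: `WallClockMonitor` is a hypothetical stand-in that only records elapsed time where the real code would query zeus for energy, and `load_dataset` and `generate` are placeholder functions. Opening one window per data point is also an assumption, suggested by the per-iteration rows in energy_verbose.csv.

```python
import time


class WallClockMonitor:
    """Hypothetical stand-in for an energy monitor; records elapsed time only."""

    def __init__(self):
        self._starts = {}

    def begin_window(self, name):
        self._starts[name] = time.perf_counter()

    def end_window(self, name):
        return time.perf_counter() - self._starts.pop(name)


def load_dataset():
    # Placeholder: runs outside any window, so it is never measured.
    return ["first prompt", "second prompt", "third prompt"]


def generate(data_point):
    # Placeholder for model generation on one data point.
    return data_point.upper()


def measure_generation(monitor, dataset):
    """Open one window per data point, covering only the generation call."""
    durations = []
    for index, data_point in enumerate(dataset):
        monitor.begin_window(f"generation_{index}")
        generate(data_point)
        durations.append(monitor.end_window(f"generation_{index}"))
    return durations


dataset = load_dataset()  # excluded from the measurements
durations = measure_generation(WallClockMonitor(), dataset)
print(len(durations), "windows measured")
```

Keeping the dataset load outside `measure_generation` is what keeps it out of the reported numbers.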

Energy Command

Run Examples

Example without any special configurations:

qtransform run=energy model=EB_gpt_small_l1_h2 dataset=tsV2 tokenizer=tsV2 run.max_iters=1 wandb.enabled=False

Example using run configurations for idle time and the maximum number of iterations:

qtransform run=energy model=EB_gpt_small_l1_h2 dataset=tsV2 tokenizer=tsV2 run.max_iters=10 run.idle_time=60 wandb.enabled=False

Run Options

idle_time: 0 # higher idle time means slightly more accurate measurements

# iterations and tokens to generate with given dataset to "preheat" the running device
# other generation options for preheating can't be changed via run configurations for now
preheat:
  max_iters: 0
  max_new_tokens: 512

# generation configuration
max_new_tokens: 512
temperature: 0.7
top_k: 200

max_iters: ???

out:
  path: null # path where the results should be saved to

Most run options are already explained on the qtransform page. Check there for more info.

idle_time

Defaults to 0. When idle_time is greater than 0, the energy consumed while the program is idle is measured. This idle measurement takes place before any generation starts.

The energy measured during idle time is scaled to match the generation duration and subtracted from the final results. This serves to approximate the energy consumed solely by the generation process, excluding background activities such as the operating system.
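
The correction can be illustrated with a short calculation. This is a sketch that assumes the scaling is linear, meaning the average idle power is multiplied by the generation duration. The function and parameter names are hypothetical and not taken from qtransform.

```python
def subtract_idle_energy(gen_energy_j, gen_time_s, idle_energy_j, idle_time_s):
    """Return generation energy with the scaled idle baseline removed."""
    if idle_time_s <= 0:
        # No idle measurement was taken, so there is nothing to subtract.
        return gen_energy_j
    idle_power_w = idle_energy_j / idle_time_s  # average idle power in watts
    return gen_energy_j - idle_power_w * gen_time_s


# Illustrative numbers only: 120 J over 60 s of idling is 2 W of background draw.
corrected = subtract_idle_energy(
    gen_energy_j=100.0, gen_time_s=10.0, idle_energy_j=120.0, idle_time_s=60.0
)
print(corrected)  # 80.0
```

With these numbers, 20 J of the 100 J measured during generation are attributed to background activity.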

preheat

Before the actual energy measurement, generation is run with the given dataset and model. This warms up caches. No energy is measured during preheating.

max_iters

Defaults to 0. Preheating only occurs if this value is greater than 0.

max_new_tokens

Defaults to 512. The number of tokens to generate per iteration during preheating.

out

path

The path where the results are saved. If no path is specified, the results are printed to the console instead.

Structure of results

The first time the energy command is executed with a path specified, it creates a runs.txt file that keeps track of the current run (a run being one execution of the energy command). For each run, a new folder is created, numbered incrementally starting at 1.

Example of how the results are stored with run.out.path="/home/user/qtransform/energy_measurements":

qtransform/
├── energy_measurements
│   ├── 1
│   │   ├── energy_verbose.csv
│   │   └── run_cfg.txt
│   ├── 2
│   │   ├── energy_verbose.csv
│   │   └── run_cfg.txt
│   ├── n
│   │   ├── ...
│   │   └── ...
│   ├── energy_averages.csv
│   └── runs.txt
│
...
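
The incrementing folder scheme can be sketched as below. This is only an illustration: `next_run_dir` is a hypothetical helper that derives the next number by scanning existing folders. It does not read runs.txt, whose format is not described here, so qtransform may work differently.

```python
import tempfile
from pathlib import Path


def next_run_dir(out_path):
    """Create and return the next numbered run folder under out_path."""
    root = Path(out_path)
    root.mkdir(parents=True, exist_ok=True)
    # Collect the numbers of existing run folders, ignoring anything else.
    existing = [
        int(p.name) for p in root.iterdir() if p.is_dir() and p.name.isdecimal()
    ]
    run_dir = root / str(max(existing, default=0) + 1)
    run_dir.mkdir()
    return run_dir


# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    created = [next_run_dir(tmp).name for _ in range(3)]

print(created)  # ['1', '2', '3']
```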

energy_verbose

For each iteration over the given dataset, stores the measured energy in joules and the elapsed time in seconds for that data point. The .csv file has the following header:

time(s) cpu_energy(J) gpu_energy(J)

run_cfg

Stores the run configuration.

energy_averages

For each run, stores the per-iteration averages and the totals of the measured energy in joules and the time in seconds, along with the number of tokens. The .csv file has the following header:

run total_time(s) avg_time(s) total_cpu_energy(J) avg_cpu_energy(J) total_gpu_energy(J) avg_gpu_energy(J) max_new_tokens
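
The relationship between the two files can be sketched as follows. This is an assumption-laden illustration rather than the qtransform code: `summarize_run` is a hypothetical helper, the rows are made-up values, and the data is kept in memory to avoid guessing the delimiter of the .csv files.

```python
def summarize_run(run, rows, max_new_tokens):
    """Aggregate per-iteration rows into one energy_averages-style record.

    rows: list of (time_s, cpu_energy_j, gpu_energy_j), one per iteration.
    """
    if not rows:
        raise ValueError("at least one iteration is required")
    n = len(rows)
    total_time = sum(r[0] for r in rows)
    total_cpu = sum(r[1] for r in rows)
    total_gpu = sum(r[2] for r in rows)
    return {
        "run": run,
        "total_time(s)": total_time,
        "avg_time(s)": total_time / n,
        "total_cpu_energy(J)": total_cpu,
        "avg_cpu_energy(J)": total_cpu / n,
        "total_gpu_energy(J)": total_gpu,
        "avg_gpu_energy(J)": total_gpu / n,
        "max_new_tokens": max_new_tokens,
    }


# Made-up values for two iterations of run 1.
rows = [(2.0, 10.0, 30.0), (4.0, 14.0, 50.0)]
summary = summarize_run(run=1, rows=rows, max_new_tokens=512)
print(summary["avg_time(s)"], summary["avg_cpu_energy(J)"], summary["avg_gpu_energy(J)"])
# 3.0 12.0 40.0
```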