
dsrhaslab/prismo


prismo

Prismo is a configurable block-based I/O benchmark tool designed to stress-test storage systems. Each workload is specified in a JSON file and drives a producer–consumer pipeline of I/O packets against a target file. Operations, access patterns, block content, and backend engine are all independently configurable, enabling reproducible experiments across synthetic and trace-driven workloads.

Toolchain

| Tool | Description |
| --- | --- |
| Astroide | Converts .blkparse block traces into the compact binary .prismo format used by trace-driven generators |
| Deltoide | Analyses a dataset and emits compression/deduplication distribution profiles, ready to paste into a workload config |
| Cardoide | Campaign runner; simplifies executing multiple workloads through repeated runs and configurable filters |

Prerequisites

  1. Install dependencies
sudo apt update
sudo apt install -y meson ninja-build
sudo apt install -y liburing-dev libspdlog-dev libeigen3-dev nlohmann-json3-dev libboost-all-dev libzstd-dev
  2. Download and install SPDK
# Clone the repository
git clone https://github.com/spdk/spdk
cd spdk
git submodule update --init

# Install dependencies
./scripts/pkgdep.sh

# Build
./configure --enable-asan
make

# Run tests
./test/unit/unittest.sh
  3. Download and install argparse
# Clone the repository
git clone https://github.com/p-ranav/argparse
cd argparse

# Build the tests
mkdir build
cd build
cmake -DARGPARSE_BUILD_SAMPLES=on -DARGPARSE_BUILD_TESTS=on ..
make

# Run tests
./test/tests

# Install the library
sudo make install

Building

Important

To build this project, the Meson Build System must be able to locate a compatible C++ compiler on your system. When using GCC, the required version is 13.4 or newer.

Before compiling, Meson needs to know where to find the SPDK libraries. Update the spdk_root variable in meson.build so it points to the SPDK repository path you installed earlier.

meson setup builddir --buildtype=release -Dpkg_config_path=/path/to/spdk/build/lib/pkgconfig/
meson compile -C builddir

The binary is placed at builddir/prismo. The tools (astroide, deltoide) are built alongside it. For convenience, the program can also be installed system-wide, allowing you to run it from any location without specifying the full path.

meson install -C builddir

Usage

# Run workload, print report to stdout
prismo -c workload.json

# Write report to file
prismo -c workload.json -o report.json

# Enable debug logging
prismo -c workload.json -l

Options

| Flag | Description | Default |
| --- | --- | --- |
| -c, --config | Path to the workload JSON file | (required) |
| -o, --output | Write the JSON report to this file | stdout (-) |
| -l, --logging | Enable debug logging | off |

Configuration

Workloads are defined in a JSON file divided into six independent sections. The sections can be freely combined, enabling purely synthetic workloads, trace-driven workloads, or hybrids of the two.

Job

"job": {
  "name": "my_workload",
  "numjobs": 1,
  "filename": "testfile",
  "block_size": 4096,
  "limit": 268435456,
  "metric": "full",
  "termination": {
    "type": "iterations",
    "value": 200000
  }
}
| Field | Description | Default |
| --- | --- | --- |
| name | Workload name | (required) |
| numjobs | Number of parallel producer-consumer pairs | (required) |
| filename | Path of the target file | (required) |
| block_size | I/O block size in bytes | (required) |
| limit | Maximum file size in bytes | (required) |
| metric | Granularity of metric collection | (required) |
| termination | Termination condition: stop after N operations or after M milliseconds | (required) |
| ramp | Linear increase or decrease of throughput | (optional) |

The metric parameter accepts the values none | base | standard | full; each level collects progressively more metrics at a correspondingly higher performance cost. For maximum performance, disable metric collection by selecting none.

The termination condition can be expressed in two ways: one limits the number of operations to N, while the other limits execution time to M milliseconds.

"termination": {
  "type": "iterations",
  "value": 2e6
}
"termination": {
  "type": "runtime",
  "value": 30000
}

The ramp parameter linearly increases or decreases throughput based on start_ratio and end_ratio. Throughput begins at start_ratio and reaches end_ratio after duration milliseconds. When start_ratio < end_ratio, a speed-up is simulated, otherwise a slow-down occurs.

"ramp": {
  "start_ratio": 0.1,
  "end_ratio": 1.0,
  "duration": 5000
}

Note

This parameter is optional. When not specified, the benchmark runs at maximum throughput for the entire execution.
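Assuming simple linear interpolation as described, the effective throughput ratio at a given elapsed time can be sketched as follows (illustrative Python, not the tool's code):

```python
def ramp_ratio(elapsed_ms, start_ratio, end_ratio, duration_ms):
    """Throughput ratio at elapsed_ms under a linear ramp."""
    if elapsed_ms >= duration_ms:
        return end_ratio  # hold the final ratio once the ramp is over
    t = elapsed_ms / duration_ms  # progress through the ramp window
    return start_ratio + (end_ratio - start_ratio) * t

# With the example config above: 10% throughput at t=0,
# 55% halfway through, and 100% from t=5000 ms onward.
```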


Operation

Controls which I/O operations are issued by the benchmark. There are currently four operation generators, and they accept read | write | fsync | fdatasync | nop in their configurations. In the example below, the benchmark continuously issues write, but it could issue any of the listed operations.

"operation": {
  "type": "constant",
  "operation": "write"
}
| Type | Description | Example |
| --- | --- | --- |
| constant | Repeatedly issues the same operation | 01_nop_seq_const_posix.json |
| percentage | Operations sampled from a discrete distribution | 03_rw_random_random_posix.json |
| sequence | Repeats a fixed operation pattern | 06_zipf_dedup_posix.json |
| trace | Replays operations from a .prismo trace | 07_trace_all_posix.json |

Note

Each generator has its own specific configuration. Reviewing the examples is recommended to better understand how they work.

In some workloads, it is useful to force buffered data to be flushed. Barriers provide this behavior by issuing one operation each time another operation has been issued N times. In the example below, an fsync is issued every 1024 write operations, and an fdatasync every 512.

"barrier": [
  {
    "operation": "fsync",
    "trigger": "write",
    "threshold": 1024
  },
  {
    "operation": "fdatasync",
    "trigger": "write",
    "threshold": 512
  }
]
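The barrier semantics described above can be modelled as a per-trigger counter (an illustrative Python sketch; make_barrier_checker is a hypothetical name, not part of prismo):

```python
from collections import defaultdict

def make_barrier_checker(barriers):
    """Return a callable that reports which barrier operations are due
    after each issued operation (illustrative model, not prismo code)."""
    counts = defaultdict(int)

    def on_operation(op):
        counts[op] += 1
        return [b["operation"] for b in barriers
                if b["trigger"] == op and counts[op] % b["threshold"] == 0]

    return on_operation

check = make_barrier_checker([
    {"operation": "fsync", "trigger": "write", "threshold": 1024},
    {"operation": "fdatasync", "trigger": "write", "threshold": 512},
])
# The 512th write triggers an fdatasync; the 1024th triggers both.
```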

Access

Controls which file offset each operation targets. Offsets are bounded by the limit parameter defined in job. The available access generators are intentionally simple, but they still model useful behaviors such as hot spots, cache-friendly locality, and production-style trace replay.

"access": {
  "type": "zipfian",
  "skew": 0.8
}
| Type | Description | Example |
| --- | --- | --- |
| sequential | Monotonically increasing offsets | 01_nop_seq_const_posix.json |
| random | Uniformly random offsets | 03_rw_random_random_posix.json |
| zipfian | Zipf-distributed offsets (hot-spot skew) | 04_rw_zipf_random_posix.json |
| trace | Replays offsets from a .prismo trace | 07_trace_all_posix.json |

Content

Defines the contents of the buffers used by write operations. If a workload does not issue writes, the content generator is never invoked, which can improve the benchmark's operation rate.

"content": {
  "type": "random",
  "refill": true
}
| Type | Description | Example |
| --- | --- | --- |
| constant | Zero-filled buffer | 01_nop_seq_const_posix.json |
| random | Random bytes | 04_rw_zipf_random_posix.json |
| dedup | Deduplication and compression profile | 16_dedup_heavy_barrier_posix.json |
| trace | Replays block content from a .prismo trace | 07_trace_all_posix.json |

With the exception of constant, all content generators use the refill field, which controls whether buffers are regenerated from scratch or whether the same base buffer is reused. For random, when refill is true the entire buffer is rewritten with random bytes; otherwise only the buffer header changes, which allows higher throughput.

The properties of generated content are important when evaluating systems, especially those that implement compression and deduplication optimizations. For this reason, generators can optionally include a compression profile that applies different reduction ratios according to a distribution. In the example below, half of the content remains uncompressed, while the other half is compressed by 50%.

"compression": [
  { "percentage": 50, "reduction": 0  },
  { "percentage": 50, "reduction": 50 }
]
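Under this profile, the expected average reduction is a simple weighted sum (illustrative arithmetic):

```python
# The profile from the example: 50% of blocks shrink by 0%, 50% by 50%.
profile = [
    {"percentage": 50, "reduction": 0},
    {"percentage": 50, "reduction": 50},
]
avg_reduction = sum(p["percentage"] / 100 * p["reduction"] for p in profile)
# avg_reduction == 25.0, i.e. the dataset compresses by 25% on average
```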

For the dedup generator, you must define a discrete distribution of duplicate groups that determines how many times blocks repeat. In the example below, half of the written blocks are unique (repeats = 0), while the remaining half have three duplicates each. In addition, each repeats group can define its own compression profile.

Tip

Use Deltoide to derive these distributions from a real dataset.

"content": {
  "type": "dedup",
  "refill": false,
  "distribution": [
    {
      "percentage": 50,
      "repeats": 0,
      "compression": [
        { "percentage": 50, "reduction": 0  },
        { "percentage": 50, "reduction": 50 }
      ]
    },
    {
      "percentage": 50,
      "repeats": 3,
      "compression": [
        { "percentage": 100, "reduction": 80 }
      ]
    }
  ]
}

Caution

Do not combine a top-level compression profile with the dedup generator; doing so overwrites the compression settings defined for each repeats group.
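In all the examples, the percentages in these distributions sum to 100 at each level; a small sanity check under that assumption (which is inferred from the examples, not a documented rule) might look like:

```python
def distribution_percentages_ok(dist):
    """Sanity-check that percentages sum to 100 at both levels.
    (Assumption inferred from the examples, not a documented rule.)"""
    if sum(g["percentage"] for g in dist) != 100:
        return False
    return all(sum(c["percentage"] for c in g["compression"]) == 100
               for g in dist if "compression" in g)

# The dedup distribution from the example above:
dedup_distribution = [
    {"percentage": 50, "repeats": 0,
     "compression": [{"percentage": 50, "reduction": 0},
                     {"percentage": 50, "reduction": 50}]},
    {"percentage": 50, "repeats": 3,
     "compression": [{"percentage": 100, "reduction": 80}]},
]
# distribution_percentages_ok(dedup_distribution) -> True
```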


Trace Extension

Controls how a trace is extended after it reaches the end. The goal is to continue generating operations indefinitely while preserving the statistical properties observed in the original data, allowing production-like conditions to be reproduced synthetically.

"extension": "sample",
"memory": 16384

While reading the .prismo binary file, records are buffered to improve deserialization performance. The memory field defines how many bytes are reserved for this buffering step.

| Extension | Description |
| --- | --- |
| repeat | Restarts the trace from the beginning |
| sample | Samples records according to their observed distribution |
| regression | Extrapolates via multivariate regression |

Note

Trace-based generation is available in operation, access, and content generators. This makes hybrid workloads possible, where one generator can replay traces while the others produce synthetic data.


Engine

Defines the backend engine responsible for executing I/O requests. Both synchronous and asynchronous implementations are available. In general, asynchronous engines with polling are recommended for higher throughput because they avoid the context-switch overhead of blocking system calls.

"engine": {
  "type": "posix",
  "open_flags": ["O_CREAT", "O_RDWR"]
}
| Type | Description | Example |
| --- | --- | --- |
| posix | Synchronous I/O using the standard POSIX API | 01_nop_seq_const_posix.json |
| uring | Asynchronous I/O using the Linux io_uring API | 19_nop_seq_const_uring.json |
| aio | POSIX asynchronous I/O using the AIO interface | 37_nop_seq_const_aio.json |
| spdk | High-performance user-space storage I/O via SPDK | 55_nop_seq_const_spdk.json |

The posix, uring, and aio engines open the files on which I/O operations are performed, while the open_flags field specifies the flags passed to open(2). The following flags are supported:

# Core access modes (choose exactly one)
O_RDONLY | O_WRONLY | O_RDWR

# Common write behavior flags
O_APPEND | O_TRUNC | O_CREAT

# Strong consistency / performance flags
O_SYNC | O_DSYNC | O_RSYNC | O_DIRECT

Warning

The aio interface requires O_DIRECT in open_flags, because asynchronous behavior is fully effective only with direct I/O.
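Under this constraint, an aio engine configuration might look like the following (a hypothetical sketch modelled on the engine example above):

```json
"engine": {
  "type": "aio",
  "open_flags": ["O_CREAT", "O_RDWR", "O_DIRECT"]
}
```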

The uring interface supports several configuration flags defined by io_uring_setup(2). The following subset is currently available:

# Polling modes
IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL | IORING_SETUP_HYBRID_IOPOLL

# Thread and CPU control
IORING_SETUP_SQ_AFF | IORING_SETUP_SINGLE_ISSUER

# Queue tuning
IORING_SETUP_CQSIZE | IORING_SETUP_CLAMP

# Safety / semantic guarantees
IORING_FEAT_NODROP

Warning

The IORING_SETUP_SINGLE_ISSUER flag is available only from Linux kernel 6.0 onward. If this flag is enabled (not commented out in the code), builds targeting older kernels may fail. Upgrading the kernel is recommended.

The spdk engine uses the bdev interface, so the target must be a block device. Its configuration is provided through the json_config_file parameter, which points to a JSON file containing the bdev configuration. Examples are available in spdk.

Warning

reactor_mask should select at least two CPU cores. Otherwise, request execution may stall, as a single worker thread could remain busy polling and monopolize the only available core.


Logger

The logger captures benchmark activity and writes detailed execution records. These logs are stored in a structured format, which can then be analyzed with the scripts inside the tools directory to generate plots and run statistical analysis.

Logging detail follows the metric level selected in job. As you move from none to full, records include progressively richer information.

"logger": {
  "type": "spdlog",
  "name": "prismo",
  "queue_size": 8192,
  "thread_count": 1,
  "truncate": true,
  "to_stdout": true,
  "files": [
    "./log1.log",
    "./log2.log",
    "./log3.log"
  ]
}

Note

This component is optional. If your primary goal is maximum throughput, disable logging because it adds measurable overhead.

Report

The JSON report provides a detailed benchmark summary, with one entry per job and an all aggregate when multiple jobs run (numjobs > 1). When metric collection is enabled, each job entry includes overall statistics such as total operations, total bytes transferred, runtime, IOPS, and bandwidth, plus per-operation metrics.

{
  "jobs": [
    {
      "job_id": 0,
      "operations": [
        {
          "bandwidth_bytes_per_sec": 1826388175.56,
          "count": 1000000,
          "iops": 445895.55,
          "latency_ns": {
            "avg": 2165,
            "max": 95973,
            "min": 1899
          },
          "operation": "write",
          "percentiles_ns": {
            "p50": 2048,
            "p90": 2048,
            "p95": 2048,
            "p99": 2048,
            "p99_9": 4096,
            "p99_99": 8192
          },
          "total_bytes": 4096000000
        }
      ],
      "overall_bandwidth_bytes_per_sec": 1826388175.56,
      "overall_iops": 445895.55,
      "runtime_sec": 2.24268,
      "total_bytes": 4096000000,
      "total_operations": 1000000
    }
  ]
}
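Reports like this are plain JSON and easy to post-process, for example (an illustrative script using the field names from the example above):

```python
import json

# Illustrative post-processing of a prismo report; the field names are
# taken from the example report above.
report = json.loads("""
{"jobs": [{"job_id": 0, "overall_iops": 445895.55,
           "runtime_sec": 2.24268, "total_bytes": 4096000000}]}
""")

for job in report["jobs"]:
    mb_per_s = job["total_bytes"] / job["runtime_sec"] / 1e6
    print(f"job {job['job_id']}: {job['overall_iops']:.0f} IOPS, {mb_per_s:.0f} MB/s")
```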

Contributing

Currently, only a few implementations of the top-level configuration components exist, which limits the range of workload properties that can be expressed. Adding more operation, access, and content generators, engines, and trace extensions would be a valuable contribution.

  1. Implement the abstract base class for the desired component.

  2. Define a JSON configuration and accept it in the class constructor.

  3. Register the constructor in the component's parsing function, found in factory.cpp.

Important

Logger implementations must be thread-safe, because engines may share the same logger instance.

Any other contributions are also welcome 🥰.
