Remove list creation from benchmark #1
Description
Many implementations remove elements from the list by mutating it (others create a new one), so each invocation has to run on a fresh list. Creating a large list takes time of its own, though, and considerably influences the overall runtime. How can the cost of creating the list be separated from the actual removal?
Include list creation - subtract baseline
The current benchmarks include list creation. To account for that, one benchmark measures only the creation time (the baseline), so the actual removal time can be computed by subtracting it:
@Benchmark
public void baseline(Blackhole bh) {
    // measures list creation only
    List<Integer> list = createArrayList();
    bh.consume(list);
}

@Benchmark
public void iterativeAt(Blackhole bh) {
    // measures list creation plus removal
    List<Integer> list = createArrayList();
    List<Integer> removed = IterativeAtRemover.remove(list, removeAts);
    bh.consume(removed);
}
This makes it harder to compare results at a glance and also feels a little messy. Is there no cleaner approach?
Lifecycle per invocation
With @Setup(Level.Invocation) it would be possible to set the list up in a lifecycle method (see the sketch below), but the Javadoc warns against using it and I do not fully understand the implications.
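A minimal sketch of how that could look, assuming createArrayList is a static helper reachable from the nested state class (the names FreshList and createList are made up for illustration):

@State(Scope.Thread)
public static class FreshList {

    List<Integer> list;

    @Setup(Level.Invocation)
    public void createList() {
        // runs anew before every single benchmark method invocation
        list = createArrayList();
    }
}

@Benchmark
public void iterativeAt(FreshList state, Blackhole bh) {
    // creation runs in setup and is excluded from the timing
    bh.consume(IterativeAtRemover.remove(state.list, removeAts));
}

This is the Javadoc of Level.Invocation: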
/**
* Invocation level: to be executed for each benchmark method execution.
*
* <p><b>WARNING: HERE BE DRAGONS! THIS IS A SHARP TOOL.
* MAKE SURE YOU UNDERSTAND THE REASONING AND THE IMPLICATIONS
* OF THE WARNINGS BELOW BEFORE EVEN CONSIDERING USING THIS LEVEL.</b></p>
*
* <p>This level is only usable for benchmarks taking more than a millisecond
* per single {@link Benchmark} method invocation. It is a good idea to validate
* the impact for your case on ad-hoc basis as well.</p>
*
* <p>WARNING #1: Since we have to subtract the setup/teardown costs from
* the benchmark time, on this level, we have to timestamp *each* benchmark
* invocation. If the benchmarked method is small, then we saturate the
* system with timestamp requests, which introduce artificial latency,
* throughput, and scalability bottlenecks.</p>
*
* <p>WARNING #2: Since we measure individual invocation timings with this
* level, we probably set ourselves up for (coordinated) omission. That means
* the hiccups in measurement can be hidden from timing measurement, and
* can introduce surprising results. For example, when we use timings to
* understand the benchmark throughput, the omitted timing measurement will
* result in lower aggregate time, and fictionally *larger* throughput.</p>
*
* <p>WARNING #3: In order to maintain the same sharing behavior as other
* Levels, we sometimes have to synchronize (arbitrage) the access to
* {@link State} objects. Other levels do this outside the measurement,
* but at this level, we have to synchronize on *critical path*, further
* offsetting the measurement.</p>
*
* <p>WARNING #4: Current implementation allows the helper method execution
* at this Level to overlap with the benchmark invocation itself in order
* to simplify arbitrage. That matters in multi-threaded benchmarks, when
* one worker thread executing {@link Benchmark} method may observe other
* worker thread already calling {@link TearDown} for the same object.</p>
*/
Regarding the first line:
"This level is only usable for benchmarks taking more than a millisecond per single Benchmark method invocation."
I could get above that threshold by creating a separate benchmark for larger lists, which might be a good idea anyway (see the sketch below). But does that mean I can ignore the other warnings?
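One way to get there could be to parameterize the list size, so that a large-list variant clears the millisecond threshold (assuming a createArrayList overload that takes the desired length; the sizes are arbitrary examples):

@State(Scope.Thread)
public static class LargeList {

    // sizes chosen so a single removal run takes well over a millisecond
    @Param({"1000000", "10000000"})
    int size;

    List<Integer> list;

    @Setup(Level.Invocation)
    public void createList() {
        list = createArrayList(size);
    }
}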
What else?
Are there other solutions to this problem?