Merged
Changes from 4 commits
21 changes: 21 additions & 0 deletions _benchmark/glossary.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
---
layout: default
title: Glossary
nav_order: 10
---

# OpenSearch Benchmark glossary

The following terms are commonly used in OpenSearch Benchmark:

- **Corpora**: A collection of documents.
- **Latency**: If `target-throughput` is disabled (has no value or a value of `0`), latency is equivalent to service time. If `target-throughput` is enabled (has a value of 1 or greater), latency is the service time plus the time the request waits in the queue before being sent.
- **Metric keys**: The metrics that OpenSearch Benchmark stores, based on the configuration in the [metrics record]({{site.url}}{{site.baseurl}}/benchmark/metrics/metric-records/).
- **Operations**: In workloads, a list of API requests performed by a workload.
Collaborator

Should we say "API operations" for consistency?

- **Pipeline**: A series of steps occurring before and after a workload is run that determines benchmark results.
- **Schedule**: In workloads, a list of operations in a specific order.
Collaborator

So, the difference between this and operations is that operations is not necessarily in order but schedule is? Also, is the order time-based, and if so, is it execution time or the time the request was put in the queue or sent?

Contributor Author

@Naarcha-AWS Naarcha-AWS Sep 20, 2024

Not exactly. A schedule is simply a list of two or more operations performed in the order they appear at the time the workload is run. The order of the operations in a schedule isn't time-based.

I'll adjust the definition accordingly.

- **Service time**: The amount of time that it takes for `opensearch-py`, the primary client for OpenSearch Benchmark, to send a request and receive a response from the OpenSearch cluster. It includes the amount of time that it takes for the server to process a request and also _includes_ network latency, load balancer overhead, and deserialization/serialization.
- **Summary report**: A report output at the end of a test based on the metric keys defined in the workload.
- **Test**: A single invocation of the OpenSearch Benchmark binary.
- **Throughput**: The number of operations completed in a given period of time.
- **Workload**: A collection of one or more benchmarking scenarios that use a specific document corpus to perform a benchmark against your cluster. The document corpus contains any indexes, data files, and operations invoked when the workload runs.
Collaborator

Is a scenario the same as a test?

Contributor Author

Yes. Though we'll switch this one to "test" for now. In Benchmark 2.0, which has yet to be released, we are renaming "tests" to "scenarios".
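
For readers of the glossary, the relationship between the **Latency**, **Service time**, and **Throughput** entries can be sketched in a few lines of Python. This is only an illustration of the definitions as worded in the diff, not OpenSearch Benchmark's actual implementation. The function names and parameters (`latency_ms`, `queue_wait_ms`, `throughput_ops_per_s`) are hypothetical and chosen for this example.

```python
from typing import Optional


def latency_ms(service_time_ms: float,
               queue_wait_ms: float,
               target_throughput: Optional[float]) -> float:
    """Return latency as described in the glossary entry.

    Assumption: a target_throughput of None or 0 means the setting
    is disabled, so latency equals service time. Otherwise the time
    the request waited in the queue is added to the service time.
    """
    if not target_throughput:
        return service_time_ms
    return service_time_ms + queue_wait_ms


def throughput_ops_per_s(completed_ops: int, elapsed_s: float) -> float:
    """Return the number of operations completed per second."""
    return completed_ops / elapsed_s


if __name__ == "__main__":
    # target-throughput disabled: latency is the service time only.
    print(latency_ms(20.0, 5.0, None))
    # target-throughput enabled: queue wait time is added.
    print(latency_ms(20.0, 5.0, 100))
    # 500 operations completed in 10 seconds.
    print(throughput_ops_per_s(500, 10.0))
```

The sketch treats queue wait time as a separate input to make the distinction explicit: service time stays the same in both cases, and only the latency figure changes when `target-throughput` is set.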