
Commit 74c1ab1

Merge branch 'main' into top-n
2 parents 9d3d683 + ef8abd7 commit 74c1ab1

10 files changed

+201
-17
lines changed

_benchmark/user-guide/understanding-workloads/choosing-a-workload.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@ Consider the following criteria when deciding which workload would work best for
1818

1919
- The cluster's use case.
2020
- The data types that your cluster uses compared to the data structure of the documents contained in the workload. Each workload contains an example document so that you can compare data types, or you can view the index mappings and data types in the `index.json` file.
21-
- The query types most commonly used inside your cluster. The `operations/default.json` file contains information about the query types and workload operations.
21+
- The query types most commonly used inside your cluster. The `operations/default.json` file contains information about the query types and workload operations. For a list of common operations, see [Common operations]({{site.url}}{{site.baseurl}}/benchmark/user-guide/understanding-workloads/common-operations/).
2222

2323
## General search clusters
2424

Lines changed: 181 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,181 @@
1+
---
2+
layout: default
3+
title: Common operations
4+
nav_order: 16
5+
grand_parent: User guide
6+
parent: Understanding workloads
7+
---
8+
9+
# Common operations
10+
11+
[Test procedures]({{site.url}}{{site.baseurl}}/benchmark/user-guide/understanding-workloads/anatomy-of-a-workload#_operations-and-_test-procedures) use a variety of operations, found inside the `operations` directory of a workload. This page details the most common operations found inside OpenSearch Benchmark workloads.
12+
13+
- [Common operations](#common-operations)
14+
- [bulk](#bulk)
15+
- [create-index](#create-index)
16+
- [delete-index](#delete-index)
17+
- [cluster-health](#cluster-health)
18+
- [refresh](#refresh)
19+
- [search](#search)
20+
21+
<!-- vale off -->
22+
## bulk
23+
<!-- vale on -->
24+
25+
The `bulk` operation type allows you to run [bulk](/api-reference/document-apis/bulk/) requests as a task.
26+
27+
The following example shows a `bulk` operation type with a `bulk-size` of `5000` documents:
28+
29+
```yml
30+
{
31+
"name": "index-append",
32+
"operation-type": "bulk",
33+
"bulk-size": 5000
34+
}
35+
```
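To get a feel for what a `bulk-size` of `5000` means in practice, the following Python sketch estimates how many bulk requests a corpus would be split into. It only illustrates the arithmetic and is not how OpenSearch Benchmark is implemented; the function name `bulk_request_count` and the corpus size are hypothetical, and the sketch ignores how documents are divided among multiple clients.

```python
def bulk_request_count(total_docs: int, bulk_size: int) -> int:
    """Return how many bulk requests are needed to send total_docs
    documents in batches of at most bulk_size documents each."""
    if bulk_size <= 0:
        raise ValueError("bulk_size must be a positive integer")
    if total_docs < 0:
        raise ValueError("total_docs must not be negative")
    # Integer ceiling division: the last request may be only partially full.
    return (total_docs + bulk_size - 1) // bulk_size


# Hypothetical corpus of 1,000,000 documents with the bulk-size used above.
print(bulk_request_count(1_000_000, 5000))  # 200
```

A larger `bulk-size` means fewer, larger requests, so this number is a rough indicator of how many round trips the indexing task makes.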
36+
37+
38+
<!-- vale off -->
39+
## create-index
40+
<!-- vale on -->
41+
42+
The `create-index` operation runs the [Create Index API](/api-reference/index-apis/create-index/). It supports the following two modes of index creation:
43+
44+
- Creating all indexes specified in the workload's `indices` section
45+
- Creating one specific index defined within the operation itself
46+
47+
The following example creates all indexes defined in the `indices` section of the workload. It uses all of the index settings defined in the workload but overrides the number of shards:
48+
49+
```yml
50+
{
51+
"name": "create-all-indices",
52+
"operation-type": "create-index",
53+
"settings": {
54+
"index.number_of_shards": 1
55+
},
56+
"request-params": {
57+
"wait_for_active_shards": "true"
58+
}
59+
}
60+
```
61+
62+
The following example creates a new index with all index settings specified in the operation body:
63+
64+
```yml
65+
{
66+
"name": "create-an-index",
67+
"operation-type": "create-index",
68+
"index": "people",
69+
"body": {
70+
"settings": {
71+
"index.number_of_shards": 1
72+
},
73+
"mappings": {
74+
"docs": {
75+
"properties": {
76+
"name": {
77+
"type": "text"
78+
}
79+
}
80+
}
81+
}
82+
}
83+
}
84+
```
85+
86+
87+
88+
<!-- vale off -->
89+
## delete-index
90+
<!-- vale on -->
91+
92+
The `delete-index` operation runs the [Delete Index API](/api-reference/index-apis/delete-index/). As with the [`create-index`](#create-index) operation, you can delete all indexes found in the `indices` section of the workload or delete one or more indexes based on the string passed in the `index` setting.
93+
94+
The following example deletes all indexes found in the `indices` section of the workload:
95+
96+
```yml
97+
{
98+
"name": "delete-all-indices",
99+
"operation-type": "delete-index"
100+
}
101+
```
102+
103+
The following example deletes all `logs-*` indexes:
104+
105+
```yml
106+
{
107+
"name": "delete-logs",
108+
"operation-type": "delete-index",
109+
"index": "logs-*",
110+
"only-if-exists": false,
111+
"request-params": {
112+
"expand_wildcards": "all",
113+
"allow_no_indices": "true",
114+
"ignore_unavailable": "true"
115+
}
116+
}
117+
```
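To preview which index names a pattern such as `logs-*` would select before running a `delete-index` operation, a small Python sketch like the following can help. It is an approximation only: the cluster resolves wildcards itself, and Python's `fnmatchcase` also treats `?` and `[...]` as special characters. The index names and the helper `matching_indexes` are hypothetical.

```python
from fnmatch import fnmatchcase


def matching_indexes(pattern: str, index_names: list[str]) -> list[str]:
    """Return the names that match a simple `*` wildcard pattern.

    This is a local approximation for illustration; the cluster performs
    the real wildcard resolution when the request is executed.
    """
    return [name for name in index_names if fnmatchcase(name, pattern)]


# Hypothetical index names; note that `logs_archive` uses an underscore.
names = ["logs-2024.01", "logs-2024.02", "logs_archive", "people"]
print(matching_indexes("logs-*", names))  # ['logs-2024.01', 'logs-2024.02']
```

The hyphen matters here: `logs-*` does not select `logs_archive`.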
118+
119+
<!-- vale off -->
120+
## cluster-health
121+
<!-- vale on -->
122+
123+
The `cluster-health` operation runs the [Cluster Health API](/api-reference/cluster-api/cluster-health/), which checks the cluster health status and returns the expected status according to the parameters set for `request-params`. If an unexpected cluster health status is returned, the operation reports a failure. You can use the `--on-error` option in the OpenSearch Benchmark `execute-test` command to control how OpenSearch Benchmark behaves when the health check fails.
124+
125+
The following example creates a `cluster-health` operation that checks for a `green` health status on any `logs-*` indexes:
126+
127+
```yml
128+
{
129+
"name": "check-cluster-green",
130+
"operation-type": "cluster-health",
131+
"index": "logs-*",
132+
"request-params": {
133+
"wait_for_status": "green",
134+
"wait_for_no_relocating_shards": "true"
135+
},
136+
"retry-until-success": true
137+
}
138+
139+
```
140+
141+
<!-- vale off -->
142+
## refresh
143+
<!-- vale on -->
144+
145+
The `refresh` operation runs the Refresh API. The operation returns no metadata.
146+
147+
148+
The following example refreshes all `logs-*` indexes:
149+
150+
```yml
151+
{
152+
"name": "refresh",
153+
"operation-type": "refresh",
154+
"index": "logs-*"
155+
}
156+
```
157+
158+
159+
<!-- vale off -->
160+
## search
161+
<!-- vale on -->
162+
163+
The `search` operation runs the [Search API](/api-reference/search/), which you can use to run queries in OpenSearch Benchmark indexes.
164+
165+
The following example runs a `match_all` query inside the `search` operation:
166+
167+
```yml
168+
{
169+
"name": "default",
170+
"operation-type": "search",
171+
"body": {
172+
"query": {
173+
"match_all": {}
174+
}
175+
},
176+
"request-params": {
177+
"_source_include": "some_field",
178+
"analyze_wildcard": "false"
179+
}
180+
}
181+
```

_data-prepper/pipelines/configuration/sinks/s3.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -173,14 +173,14 @@ When you provide your own Avro schema, that schema defines the final structure o
173173

174174
In cases where your data is uniform, you may be able to automatically generate a schema. Automatically generated schemas are based on the first event that the codec receives. The schema will only contain keys from this event, and all keys must be present in all events in order to automatically generate a working schema. Automatically generated schemas make all fields nullable. Use the `include_keys` and `exclude_keys` sink configurations to control which data is included in the automatically generated schema.
175175

176-
Avro fields should use a null [union](https://avro.apache.org/docs/1.10.2/spec.html#Unions) because this will allow missing values. Otherwise, all required fields must be present for each event. Use non-nullable fields only when you are certain they exist.
176+
Avro fields should use a null [union](https://avro.apache.org/docs/1.12.0/specification/#unions) because this will allow missing values. Otherwise, all required fields must be present for each event. Use non-nullable fields only when you are certain they exist.
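As an illustration of the null union described above, the following Python sketch builds a small Avro schema in which one field is required and one is nullable, then serializes it to a JSON string. The record name `Event`, the field names, and the `nullable` helper are hypothetical; only the `["null", <type>]` union form and the `null` default come from the Avro specification.

```python
import json


def nullable(avro_type: str) -> list[str]:
    """Wrap an Avro type in a union with null so the value may be missing."""
    # "null" is listed first so that a default of null is valid for the union.
    return ["null", avro_type]


schema = {
    "type": "record",
    "name": "Event",  # hypothetical record name
    "fields": [
        # Non-nullable: every event must contain this key.
        {"name": "message", "type": "string"},
        # Nullable: events without this key are written as null.
        {"name": "status", "type": nullable("int"), "default": None},
    ],
}

# The serialized schema is a plain JSON string.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

Because the `schema` option is typed as a string, a serialized schema of this shape is the kind of value it would presumably accept.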
177177

178178
Use the following options to configure the codec.
179179

180180
Option | Required | Type | Description
181181
:--- | :--- | :--- | :---
182-
`schema` | Yes | String | The Avro [schema declaration](https://avro.apache.org/docs/1.2.0/spec.html#schemas). Not required if `auto_schema` is set to true.
183-
`auto_schema` | No | Boolean | When set to `true`, automatically generates the Avro [schema declaration](https://avro.apache.org/docs/1.2.0/spec.html#schemas) from the first event.
182+
`schema` | Yes | String | The Avro [schema declaration](https://avro.apache.org/docs/1.12.0/specification/#schema-declaration). Not required if `auto_schema` is set to true.
183+
`auto_schema` | No | Boolean | When set to `true`, automatically generates the Avro [schema declaration](https://avro.apache.org/docs/1.12.0/specification/#schema-declaration) from the first event.
184184

185185
### `ndjson` codec
186186

@@ -208,8 +208,8 @@ Use the following options to configure the codec.
208208

209209
Option | Required | Type | Description
210210
:--- | :--- | :--- | :---
211-
`schema` | Yes | String | The Avro [schema declaration](https://avro.apache.org/docs/current/specification/#schema-declaration). Not required if `auto_schema` is set to true.
212-
`auto_schema` | No | Boolean | When set to `true`, automatically generates the Avro [schema declaration](https://avro.apache.org/docs/current/specification/#schema-declaration) from the first event.
211+
`schema` | Yes | String | The Avro [schema declaration](https://avro.apache.org/docs/1.12.0/specification/#schema-declaration). Not required if `auto_schema` is set to true.
212+
`auto_schema` | No | Boolean | When set to `true`, automatically generates the Avro [schema declaration](https://avro.apache.org/docs/1.12.0/specification/#schema-declaration) from the first event.
213213

214214
### Setting a schema with Parquet
215215

_field-types/supported-field-types/binary.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -50,5 +50,5 @@ The following table lists the parameters accepted by binary field types. All par
5050

5151
Parameter | Description
5252
:--- | :---
53-
`doc_values` | A Boolean value that specifies whether the field should be stored on disk so that it can be used for aggregations, sorting, or scripting. Optional. Default is `true`.
54-
`store` | A Boolean value that specifies whether the field value should be stored and can be retrieved separately from the _source field. Optional. Default is `false`.
53+
`doc_values` | A Boolean value that specifies whether the field should be stored on disk so that it can be used for aggregations, sorting, or scripting. Optional. Default is `false`.
54+
`store` | A Boolean value that specifies whether the field value should be stored and can be retrieved separately from the _source field. Optional. Default is `false`.

_install-and-configure/configuring-opensearch/network-settings.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -51,7 +51,7 @@ OpenSearch supports the following advanced network settings for transport commun
5151

5252
## Selecting the transport
5353

54-
The default OpenSearch transport is provided by the `transport-netty4` module and uses the [Netty 4](https://netty.io/) engine for both internal TCP-based communication between nodes in the cluster and external HTTP-based communication with clients. This communication is fully asynchronous and non-blocking. However, there are other transport plugins available that can be used interchangeably:
54+
The default OpenSearch transport is provided by the `transport-netty4` module and uses the [Netty 4](https://netty.io/) engine for both internal TCP-based communication between nodes in the cluster and external HTTP-based communication with clients. This communication is fully asynchronous and non-blocking. The following table lists other available transport plugins that can be used interchangeably.
5555

5656
Plugin | Description
5757
:---------- | :--------

_install-and-configure/configuring-opensearch/security-settings.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ nav_order: 40
99

1010
The Security plugin provides a number of YAML configuration files that are used to store the necessary settings that define the way the Security plugin manages users, roles, and activity within the cluster. For a full list of the Security plugin configuration files, see [Modifying the YAML files]({{site.url}}{{site.baseurl}}/security/configuration/yaml/).
1111

12-
The following sections describe security-related settings in `opensearch.yml`. To learn more about static and dynamic settings, see [Configuring OpenSearch]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/).
12+
The following sections describe security-related settings in `opensearch.yml`. You can find `opensearch.yml` at `<OPENSEARCH_HOME>/config/opensearch.yml`. To learn more about static and dynamic settings, see [Configuring OpenSearch]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/).
1313

1414
## Common settings
1515

_security/configuration/index.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,4 +28,4 @@ The Security plugin has several default users, roles, action groups, permissions
2828
{: .note }
2929

3030
For a full list of `opensearch.yml` Security plugin settings, see [Security settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/security-settings/).
31-
{: .note}
31+
{: .note}

_security/configuration/security-admin.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -23,13 +23,13 @@ The `securityadmin.sh` script requires SSL/TLS HTTP to be enabled for your OpenS
2323

2424
## A word of caution
2525

26-
If you make changes to the configuration files in `config/opensearch-security`, OpenSearch does _not_ automatically apply these changes. Instead, you must run `securityadmin.sh` to load the updated files into the index.
26+
If you make changes to the configuration files in `config/opensearch-security`, OpenSearch does _not_ automatically apply these changes. Instead, you must run `securityadmin.sh` to load the updated files into the index. The `securityadmin.sh` file can be found in `<OPENSEARCH_HOME>/plugins/opensearch-security/tools/securityadmin.[sh|bat]`.
2727

2828
Running `securityadmin.sh` **overwrites** one or more portions of the `.opendistro_security` index. Run it with extreme care to avoid losing your existing resources. Consider the following example:
2929

3030
1. You initialize the `.opendistro_security` index.
3131
1. You create ten users using the REST API.
32-
1. You decide to create a new [reserved user]({{site.url}}{{site.baseurl}}/security/access-control/api/#reserved-and-hidden-resources) using `internal_users.yml`.
32+
1. You decide to create a new [reserved user]({{site.url}}{{site.baseurl}}/security/access-control/api/#reserved-and-hidden-resources) using `internal_users.yml`, found in the `<OPENSEARCH_HOME>/config/opensearch-security/` directory.
3333
1. You run `securityadmin.sh` again to load the new reserved user into the index.
3434
1. You lose all ten users that you created using the REST API.
3535

_security/configuration/yaml.md

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ The approach we recommend for using the YAML files is to first configure [reserv
1717

1818
## action_groups.yml
1919

20-
This file contains any initial action groups that you want to add to the Security plugin.
20+
This file contains any initial action groups that you want to add to the Security plugin. You can find the `action_groups.yml` file in `<OPENSEARCH_HOME>/config/opensearch-security/action_groups.yml`.
2121

2222
Aside from some metadata, the default file is empty, because the Security plugin has a number of static action groups that it adds automatically. These static action groups cover a wide variety of use cases and are a great way to get started with the plugin.
2323

@@ -43,6 +43,8 @@ _meta:
4343
4444
You can use `allowlist.yml` to add any endpoints and HTTP requests to a list of allowed endpoints and requests. If enabled, all users except the super admin are allowed access to only the specified endpoints and HTTP requests, and all other HTTP requests associated with the endpoint are denied. For example, if GET `_cluster/settings` is added to the allow list, users cannot submit PUT requests to `_cluster/settings` to update cluster settings.
4545

46+
You can find the `allowlist.yml` file in `<OPENSEARCH_HOME>/config/opensearch-security/allowlist.yml`.
47+
4648
Note that while you can configure access to endpoints this way, for most cases, it is still best to configure permissions using the Security plugin's users and roles, which have more granular settings.
4749

4850
```yml
@@ -92,7 +94,7 @@ requests: # Only allow GET requests to /sample-index1/_doc/1 and /sample-index2/
9294

9395
## internal_users.yml
9496

95-
This file contains any initial users that you want to add to the Security plugin's internal user database.
97+
This file contains any initial users that you want to add to the Security plugin's internal user database. You can find this file in `<OPENSEARCH_HOME>/config/opensearch-security/internal_users.yml`.
9698

9799
The file format requires a hashed password. To generate one, run `plugins/opensearch-security/tools/hash.sh -p <new-password>`. If you decide to keep any of the demo users, *change their passwords* and re-run [securityadmin.sh]({{site.url}}{{site.baseurl}}/security/configuration/security-admin/) to apply the new passwords.
98100

@@ -313,7 +315,7 @@ admin_tenant:
313315

314316
## opensearch.yml
315317

316-
In addition to many OpenSearch settings, this file contains paths to TLS certificates and their attributes, such as distinguished names and trusted certificate authorities.
318+
In addition to many OpenSearch settings, the `opensearch.yml` file contains paths to TLS certificates and their attributes, such as distinguished names and trusted certificate authorities. You can find this file in `<OPENSEARCH_HOME>/config/`.
317319

318320
```yml
319321
plugins.security.ssl.transport.pemcert_filepath: esnode.pem

_tuning-your-cluster/availability-and-recovery/snapshots/searchable_snapshot.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -108,4 +108,5 @@ The following are known limitations of the searchable snapshots feature:
108108
- Many remote object stores charge on a per-request basis for retrieval, so users should closely monitor any costs incurred.
109109
- Searching remote data can impact the performance of other queries running on the same node. We recommend that users provision dedicated nodes with the `search` role for performance-critical applications.
110110
- For better search performance, consider [force merging]({{site.url}}{{site.baseurl}}/api-reference/index-apis/force-merge/) indexes into a smaller number of segments before taking a snapshot. For the best performance, at the cost of using compute resources prior to snapshotting, force merge your index into one segment.
111-
- We recommend configuring a maximum ratio of remote data to local disk cache size using the `cluster.filecache.remote_data_ratio` setting. A ratio of 5 is a good starting point for most workloads to ensure good query performance. If the ratio is too large, then there may not be sufficient disk space to handle the search workload. For more details on the maximum ratio of remote data, see issue [#11676](https://github.com/opensearch-project/OpenSearch/issues/11676).
111+
- We recommend configuring a maximum ratio of remote data to local disk cache size using the `cluster.filecache.remote_data_ratio` setting. A ratio of 5 is a good starting point for most workloads to ensure good query performance. If the ratio is too large, then there may not be sufficient disk space to handle the search workload. For more details on the maximum ratio of remote data, see issue [#11676](https://github.com/opensearch-project/OpenSearch/issues/11676).
112+
- k-NN native-engine-based indexes using the `faiss` and `nmslib` engines are incompatible with searchable snapshots.
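
To make the `cluster.filecache.remote_data_ratio` recommendation in the list above concrete, the following Python sketch computes the amount of remote data that a given ratio would permit for a given local file cache size. It assumes the ratio is a simple multiplier of the cache size, as the description suggests; the function name and the 100 GiB cache size are hypothetical.

```python
GIB = 1024 ** 3


def max_remote_data_bytes(file_cache_bytes: int, remote_data_ratio: int) -> int:
    """Upper bound on remote data, assuming the ratio is a plain multiplier
    of the local file cache size. This sketch only handles positive ratios."""
    if file_cache_bytes < 0:
        raise ValueError("file_cache_bytes must not be negative")
    if remote_data_ratio <= 0:
        raise ValueError("remote_data_ratio must be positive in this sketch")
    return file_cache_bytes * remote_data_ratio


# Hypothetical node with a 100 GiB file cache and the suggested ratio of 5.
limit = max_remote_data_bytes(100 * GIB, 5)
print(limit // GIB)  # 500
```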
