Add length token filter docs #8152 #8156

---
layout: default
title: Length
parent: Token filters
nav_order: 240
---

# Length token filter

The `length` token filter removes tokens from the token stream that do not meet the specified character length criteria, defined by minimum and maximum values.

## Parameters

The `length` token filter can be configured with the following parameters.

Parameter | Required/Optional | Data type | Description
:--- | :--- | :--- | :---
`min` | Optional | Integer | The minimum token length. Tokens shorter than this value are removed. Default is `0`.
`max` | Optional | Integer | The maximum token length. Tokens longer than this value are removed. Default is `Integer.MAX_VALUE` (`2147483647`).
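
You can also test the `min` and `max` parameters without creating an index by defining the filter inline in an `_analyze` request. The following request is a minimal sketch that assumes the same `whitespace` tokenizer and the 4 to 10 character bounds used in the example below:

```json
GET /_analyze
{
  "tokenizer": "whitespace",
  "filter": [
    {
      "type": "length",
      "min": 4,
      "max": 10
    }
  ],
  "text": "OpenSearch is a great tool!"
}
```
{% include copy-curl.html %}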

## Example

The following example request creates a new index named `my_index` and configures an analyzer with a `length` filter:

```json
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "only_keep_4_to_10_characters": {
          "tokenizer": "whitespace",
          "filter": [ "length_4_to_10" ]
        }
      },
      "filter": {
        "length_4_to_10": {
          "type": "length",
          "min": 4,
          "max": 10
        }
      }
    }
  }
}
```
{% include copy-curl.html %}
|
||
## Generated tokens | ||
|
||
Use the following request to examine the tokens generated using the analyzer: | ||
|
||

```json
GET /my_index/_analyze
{
  "analyzer": "only_keep_4_to_10_characters",
  "text": "OpenSearch is a great tool!"
}
```
{% include copy-curl.html %}

The response contains the generated tokens. Note that `is` and `a` are removed because they are shorter than 4 characters:

```json
{
  "tokens": [
    {
      "token": "OpenSearch",
      "start_offset": 0,
      "end_offset": 10,
      "type": "word",
      "position": 0
    },
    {
      "token": "great",
      "start_offset": 16,
      "end_offset": 21,
      "type": "word",
      "position": 3
    },
    {
      "token": "tool!",
      "start_offset": 22,
      "end_offset": 27,
      "type": "word",
      "position": 4
    }
  ]
}
```
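
To use the analyzer when indexing documents, reference it in a field mapping. The following sketch assumes a hypothetical `title` field; any text field name works:

```json
PUT /my_index/_mapping
{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "only_keep_4_to_10_characters"
    }
  }
}
```
{% include copy-curl.html %}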