docs(ad): add Managing anomalies guide, expand Operational settings #11180
…add diagrams

- Add new page _observing-your-data/ad/managing-anomalies.md covering how to alert on anomalies with Alerting monitors, including a JSON example, rationale table, and sample alert.
- Expand _observing-your-data/ad/index.md:
  - Separate timestamp selection from operational settings.
  - Add guidance on detector interval, frequency, window delay, and history, with trade-off explanations.
  - Cross-link Step 6 to the Managing anomalies page.
  - Include a frequency vs. window delay timeline diagram.
- Add assets:
  - images/anomaly-detection/window-delay-vs-frequency.png
  - images/anomaly-detection/alerting_editor.png

Signed-off-by: kaituo <[email protected]>
Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged. Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer. When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.
@kolchfa-aws The PR is ready for doc review.

## Alert on anomalies

You can create an [Alerting monitor]({{site.url}}{{site.baseurl}}/monitoring-plugins/alerting/) using either the Anomaly detector editor or the Extraction query editor. To monitor a single anomaly detector's results and set trigger thresholds on anomaly grade and confidence, use the Anomaly detector editor. To monitor multiple detectors' results or to write complex queries or trigger conditions, use the Extraction query editor.
@kaituo Can you use any of the 5 Monitor types to create an alerting monitor for anomalies or does it have to be a per query monitor?
They can use any of the five options. The two I mentioned are the most common use cases. However, it’s ultimately up to the users how they choose to use them.
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: kaituo <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>

<img src="{{site.url}}{{site.baseurl}}/images/anomaly-detection/alerting_editor.png" alt="Alerting editor" width="800" height="800">

For anomaly alerting, in **Monitor type**, select **Per query monitor** (this is the only type that supports anomaly detection). Then, in **Monitor defining method**, choose one of these methods to define your monitor:
@kaituo I'm assuming that the per query monitor is required because it's the only one that supports anomaly detection. Please confirm that this is accurate. Thanks!
It is not. Anomaly detection writes its results to an index, which can be monitored just like any other index. The per query monitor has a dedicated AD UI that simplifies configuration.
Thanks - removed this text. How about this statement "Anomaly detection is available only if you are defining a per query monitor." on the per query monitor page: is this not accurate?
I think it means that the anomaly detection UI is available only in the per query monitor. We don't have an anomaly detection UI in the other monitor types, but users can still write a query to fetch anomaly results and define a trigger for it in those monitor types.
Signed-off-by: kolchfa-aws <[email protected]>
Editorial review

- **`"size": 1`** in the search input: Retrieves a single document so you can reference `ctx.results.0.hits.hits.0` in the notification to identify which entity (such as `host` or `service`) triggered the alert.

- **`execution_end_time` range `"{{period_end}}||-2m"` → `"{{period_end}}"`**: Filters results based on detector `execution_end_time`---the time the detector finishes running and indexes the result. Because OpenSearch operates in near-real-time (results are not immediate), indexing and refresh operations introduce a delay before a document becomes searchable. To account for this write-to-search latency, this example includes a small overlap (`-2m`). Specify the overlap based on your system's worst-case delay. Avoid using `data_end_time` (the bucket’s logical end), which can miss results that arrive later.

Suggested change:
- **`execution_end_time` range `"{{period_end}}||-2m"` → `"{{period_end}}"`**: Filters results based on detector `execution_end_time`---the time the detector finishes running and indexes the result. Because OpenSearch operates in near real time (results are not immediate), indexing and refresh operations introduce a delay before a document becomes searchable. To account for this write-to-search latency, this example includes a small overlap (`-2m`). Specify the overlap based on your system's worst-case delay. Avoid using `data_end_time` (the bucket's logical end), which can miss results that arrive later.

- **`"max_anomaly_grade"` aggregation**: Detects the most severe anomaly in the time window. You can use any field in the anomaly result index for aggregation. For additional fields, see the [Anomaly result mapping]({{site.url}}{{site.baseurl}}/monitoring-plugins/ad/result-mapping/).

- **Monitor schedule every 2 minutes**: Evaluates results every two minutes to detect anomalies quickly. Combined with a 2-minute alert throttle, this avoids duplicate notifications for the same event.

Suggested change:
- **Monitor schedule every 2 minutes**: Evaluates results every 2 minutes to detect anomalies quickly. Combined with a 2-minute alert throttle, this avoids duplicate notifications for the same event.
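
For reviewers who want to see how the settings described in these bullets fit together, the following is a minimal Python sketch that assembles the corresponding monitor fragments as plain dictionaries. It is not the JSON example from the PR. The function names, the `detector_id` filter, the descending sort on `anomaly_grade`, and the `.opendistro-anomaly-results*` index pattern (assumed here to be the default result index; a custom result index would differ) are illustrative assumptions.

```python
import json

# Assumption: default anomaly result index pattern; a custom result index would differ.
RESULT_INDEX_PATTERN = ".opendistro-anomaly-results*"


def build_search_input(detector_id, overlap="2m"):
    """Build a search input for one detector (hypothetical helper, illustrative only)."""
    query = {
        # One hit is enough to name the triggering entity in the notification.
        "size": 1,
        "query": {
            "bool": {
                "filter": [
                    # Assumption: restrict results to a single detector.
                    {"term": {"detector_id": detector_id}},
                    {
                        "range": {
                            # Filter on when the result was produced, with a small
                            # overlap to cover indexing and refresh latency.
                            "execution_end_time": {
                                "gte": "{{period_end}}||-" + overlap,
                                "lte": "{{period_end}}",
                            }
                        }
                    },
                ]
            }
        },
        # Assumption: sort so that the single returned hit is the most severe one.
        "sort": [{"anomaly_grade": {"order": "desc"}}],
        # Most severe anomaly grade in the time window.
        "aggs": {"max_anomaly_grade": {"max": {"field": "anomaly_grade"}}},
    }
    return {"search": {"indices": [RESULT_INDEX_PATTERN], "query": query}}


def build_schedule_and_throttle(minutes=2):
    """Return a monitor schedule and an action throttle that use the same period."""
    schedule = {"period": {"interval": minutes, "unit": "MINUTES"}}
    throttle = {"value": minutes, "unit": "MINUTES"}
    return schedule, throttle


if __name__ == "__main__":
    print(json.dumps(build_search_input("my-detector-id"), indent=2))
```

Keeping the throttle equal to the schedule interval mirrors the reasoning in the last bullet: with a 2-minute overlap, two consecutive runs can see the same result document, and the throttle suppresses the second notification.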
Signed-off-by: Nathan Bower <[email protected]>
LGTM
Description
This PR:
Issues Resolved
Closes #11145
Version
3.3+
Frontend features
If you're submitting documentation for an OpenSearch Dashboards feature, add a video that shows how a user will interact with the UI step by step. A voiceover is optional.
frequency.mov
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.