[Feature branch] Add Neural Stats API #1208

Merged



@q-andy q-andy commented Mar 4, 2025

Description

Implements the Neural Stats API framework design proposed in #1196. This initial PR sets up the foundation for the framework to track event and state stats throughout the neural search plugin and expose their values via an API.


  • Event-based stats
    • Event stats are recorded in code at a node level (processor executions, documents ingested, etc)
    • When an API call is made, all node-level maps are fetched via transport action and returned in the response.
  • State stats
    • State stats are defined by helper functions that populate state stat values
    • When an API call is made, the functions are invoked and the information is added to the response on demand

See RFC for more details.
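
A minimal sketch of the two stat kinds described above, assuming hypothetical names (`NodeEventStats`, `build_response`) and a simplified response layout. It is not the plugin's implementation: the transport action is replaced by a plain dictionary of nodes, and summing per-node counters into `all_nodes` is an assumption about how the aggregate is derived.

```python
from collections import Counter
from typing import Callable, Dict


class NodeEventStats:
    """Event stats: counters incremented in code on a single node (hypothetical class)."""

    def __init__(self) -> None:
        self.counters: Counter = Counter()

    def increment(self, path: str) -> None:
        self.counters[path] += 1


def build_response(
    nodes: Dict[str, NodeEventStats],
    state_stats: Dict[str, Callable[[], object]],
) -> dict:
    # State stats: each helper function is invoked on demand, at request time.
    state = {path: supplier() for path, supplier in state_stats.items()}
    # Event stats: collect every node-level map (stand-in for the transport action).
    per_node = {node_id: dict(stats.counters) for node_id, stats in nodes.items()}
    # Assumption: the cluster-wide view is the sum of the per-node counters.
    totals: Counter = Counter()
    for counters in per_node.values():
        totals.update(counters)
    return {"state": state, "all_nodes": dict(totals), "nodes": per_node}
```

The point of the split is that event counters cost almost nothing to record and are only gathered when someone asks, while state stats are never stored at all and are computed per request.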

Initial implementation includes 3 stats:

  • Text embedding processor executions
  • Text embedding processors in pipelines
  • Cluster version

Related Issues

Resolves #1196 #1104
Related: #1146

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

Example requests

GET /_plugins/_neural/stats
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"cluster_version": "3.0.0",
	"processors": {
		"ingest": {
			"text_embedding_processors_in_pipelines": 0
		}
	},
	"all_nodes": {
		"processors": {
			"ingest": {
				"text_embedding_executions": 0
			}
		}
	},
	"nodes": {
		"r-hVPa7-Ra6FBbdpfVrHjg": {
			"processors": {
				"ingest": {
					"text_embedding_executions": 0
				}
			}
		}
	}
}
GET /_plugins/_neural/stats?include_metadata=true
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"cluster_version": {
		"value": "3.0.0",
		"stat_type": "settable"
	},
	"processors": {
		"ingest": {
			"text_embedding_processors_in_pipelines": {
				"value": 0,
				"stat_type": "countable"
			}
		}
	},
	"all_nodes": {
		"processors": {
			"ingest": {
				"text_embedding_executions": {
					"value": 0,
					"stat_type": "timestamped_counter",
					"trailing_interval_value": 0,
					"minutes_since_last_event": 29018783
				}
			}
		}
	},
	"nodes": {
		"r-hVPa7-Ra6FBbdpfVrHjg": {
			"processors": {
				"ingest": {
					"text_embedding_executions": {
						"value": 0,
						"stat_type": "timestamped_counter",
						"trailing_interval_value": 0,
						"minutes_since_last_event": 29018783
					}
				}
			}
		}
	}
}
GET /_plugins/_neural/stats/text_embedding_executions?include_metadata=true&flat_keys=true

{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"all_nodes.processors.ingest.text_embedding_executions": {
		"value": 0,
		"stat_type": "timestamped_counter",
		"trailing_interval_value": 0,
		"minutes_since_last_event": 29018784
	},
	"nodes": {
		"r-hVPa7-Ra6FBbdpfVrHjg": {
			"processors.ingest.text_embedding_executions": {
				"value": 0,
				"stat_type": "timestamped_counter",
				"trailing_interval_value": 0,
				"minutes_since_last_event": 29018784
			}
		}
	}
}
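
For readers comparing the nested and flat responses above, here is a rough client-side sketch of the key flattening. It assumes that any object carrying a `stat_type` field is a stat's metadata object and should stay intact as a leaf; `flatten_stats` is a hypothetical helper, not code from this PR.

```python
from typing import Any, Dict

# Assumption: a dict that contains this key is a metadata object, not a category.
METADATA_MARKER = "stat_type"


def flatten_stats(tree: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Collapse nested categories into dotted paths, keeping metadata objects whole."""
    flat: Dict[str, Any] = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and METADATA_MARKER not in value:
            # Still a category level: keep descending.
            flat.update(flatten_stats(value, path))
        else:
            # Raw value or metadata object: this is the stat itself.
            flat[path] = value
    return flat
```

Applied to the contents of a single node entry, this yields keys such as `processors.ingest.text_embedding_executions`, whether the leaf is a raw value or a metadata object.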

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Andy Qin <[email protected]>
@heemin32
Collaborator

heemin32 commented Mar 7, 2025

Then, shouldn't it be like this? (all_nodes unflattened)

Yes, that makes more sense, good catch

Also wondering if this could be better. (metadata flattened)

I considered that as well. What I'm thinking is that the "stat metadata" object replaces the raw value, so the flat key name is the same whether or not metadata is included. The format will always be "path.to.stat: <data>": sometimes the data is the raw value and sometimes it's the metadata object, but the stat key name stays the same.

It also keeps things organized when you parse recursively. For example, if a cluster manager periodically calls the stats API and parses the response without hardcoding stat names, keeping metadata non-flat means you can foreach over the categories with flat_keys and parse each stat and all its metadata one by one in a single pass. If the metadata were completely flat, a foreach would need additional logic to split, sort, and group the metadata keys by stat name again. Also, you would probably never be interested in the value of path.to.stat.stat_type independently of the other stat metadata.
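
To make the single-pass argument concrete, here is a small sketch under the assumption that a metadata object always carries a `value` field next to its other attributes, as in the example responses. `iter_stats` is a hypothetical client-side helper.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_stats(category: Dict[str, Any]) -> Iterator[Tuple[str, Any, Dict[str, Any]]]:
    """Yield (stat_path, value, metadata) for every entry of a flat-keyed category."""
    for path, data in category.items():
        if isinstance(data, dict) and "value" in data:
            # Metadata included: the object stands in for the raw value.
            metadata = {k: v for k, v in data.items() if k != "value"}
            yield path, data["value"], metadata
        else:
            # Metadata not included: the entry is the raw value itself.
            yield path, data, {}
```

Because the stat key is identical in both modes, the same loop handles responses with and without `include_metadata`, and no regrouping of keys by stat name is needed.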

Just some thoughts. I don't see the use case right now, but I could still include an option to flatten the stat metadata as well; it's a pretty small change.

Yes. One use case I can think of: when an operator wants to grep for stats containing "processor", a fully flat option can be handy.

_plugin/neural/stats?flat=true | grep processor
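
As a rough illustration of that use case, the sketch below emulates a fully flat view on the client side and filters it the way the grep would. Both helpers are hypothetical, and flattening metadata fields into their own keys is exactly the option under discussion, not something the API is confirmed to offer.

```python
from typing import Any, Dict, List


def fully_flatten(tree: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten every level, including metadata fields, into dotted keys."""
    flat: Dict[str, Any] = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(fully_flatten(value, path))
        else:
            flat[path] = value
    return flat


def grep_stats(response: Dict[str, Any], needle: str) -> List[str]:
    """Return 'path: value' lines whose path contains the needle, sorted by path."""
    flat = fully_flatten(response)
    return [f"{path}: {value}" for path, value in sorted(flat.items()) if needle in path]
```

With one key per line, every matching line is self-describing, which is what makes a plain text filter useful here.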

Signed-off-by: Andy Qin <[email protected]>
Author

@q-andy q-andy left a comment


Addressed review comments. What's left is changing the response format according to the discussion above (keeping high-level categories such as all_nodes unflattened), adding examples, and adding BWC tests.

.map(String::toLowerCase)
.collect(Collectors.toSet());

private NeuralSearchSettingsAccessor settingsAccessor;
Author


Don't static fields typically go before instance fields?

Signed-off-by: Andy Qin <[email protected]>
@q-andy
Author

q-andy commented Mar 12, 2025

Added BWC tests. They don't run currently since there are no earlier versions to test against, but I included them as a basis for future 3.x versions.

Updated response formatting:

Default

GET {{ _.base_url }}/_plugins/_neural/stats
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"info": {
		"cluster_version": "3.0.0",
		"processors": {
			"ingest": {
				"text_embedding_processors_in_pipelines": 0
			}
		}
	},
	"all_nodes": {
		"processors": {
			"ingest": {
				"text_embedding_executions": 0
			}
		}
	},
	"nodes": {
		"Qs3osnyfTz6AokNiPB2uRQ": {
			"processors": {
				"ingest": {
					"text_embedding_executions": 0
				}
			}
		}
	}
}

Flatten

GET {{ _.base_url }}/_plugins/_neural/stats?flat_stat_paths=true
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"info": {
		"cluster_version": "3.0.0",
		"processors.ingest.text_embedding_processors_in_pipelines": 0
	},
	"all_nodes": {
		"processors.ingest.text_embedding_executions": 0
	},
	"nodes": {
		"Qs3osnyfTz6AokNiPB2uRQ": {
			"processors.ingest.text_embedding_executions": 0
		}
	}
}

Flatten w/ metadata

GET {{ _.base_url }}/_plugins/_neural/stats?flat_stat_paths=true&include_metadata=true
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"info": {
		"cluster_version": {
			"value": "3.0.0",
			"stat_type": "info_string"
		},
		"processors.ingest.text_embedding_processors_in_pipelines": {
			"value": 0,
			"stat_type": "info_counter"
		}
	},
	"all_nodes": {
		"processors.ingest.text_embedding_executions": {
			"value": 0,
			"stat_type": "timestamped_event_counter",
			"trailing_interval_value": 0,
			"minutes_since_last_event": 29028938
		}
	},
	"nodes": {
		"Qs3osnyfTz6AokNiPB2uRQ": {
			"processors.ingest.text_embedding_executions": {
				"value": 0,
				"stat_type": "timestamped_event_counter",
				"trailing_interval_value": 0,
				"minutes_since_last_event": 29028938
			}
		}
	}
}
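
A note on `minutes_since_last_event` in the response above: 29028938 minutes is roughly 55 years. Read as minutes since the Unix epoch, it lands in March 2025, around the date of this comment. One plausible reading, which is an inference and not something stated in this PR, is that no event has been recorded yet and the last-event timestamp is still zero. The helper names below are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def minutes_as_epoch_time(minutes: int) -> datetime:
    """Interpret a minute count as a point in time measured from the Unix epoch (UTC)."""
    return datetime.fromtimestamp(minutes * 60, tz=timezone.utc)


def implied_last_event(minutes_since_last_event: int, observed_at: datetime) -> datetime:
    """Work backwards from the observation time to the implied last-event time."""
    return observed_at - timedelta(minutes=minutes_since_last_event)


observed = minutes_as_epoch_time(29028938)
# Under the assumption above, the implied last event falls exactly on the epoch.
last_event = implied_last_event(29028938, observed)
```

A client that wants to display this field may want to treat such epoch-sized values as "never", assuming this reading holds.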

@q-andy q-andy force-pushed the neural-stats branch 3 times, most recently from 720ea81 to 647d730 Compare March 13, 2025 18:15
Signed-off-by: Andy Qin <[email protected]>
Member

@martin-gaievski martin-gaievski left a comment


Looks good to me, thanks Andy. Please address my comment about rebasing on the latest main.

qa/build.gradle Outdated
}
testRuntimeOnly group: 'net.minidev', name:'json-smart', version: "${versions.json_smart}"
Member


You need to rebase on the latest main and remove this.

Author

@q-andy q-andy Mar 14, 2025


My fork branch is up to date with the latest commits on main. I think this diff comes from the feature branch on this repo being out of date. I don't have write permissions to sync it; could you try syncing it? The diff should then disappear. Thanks @junqiu-lei!


codecov bot commented Mar 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.90%. Comparing base (57124dd) to head (82504ce).
Report is 1 commits behind head on feature/neural-stats-api.

Additional details and impacted files
@@                      Coverage Diff                       @@
##             feature/neural-stats-api    #1208      +/-   ##
==============================================================
- Coverage                       81.83%   80.90%   -0.93%     
+ Complexity                       2607     1423    -1184     
==============================================================
  Files                             190      115      -75     
  Lines                            8922     5001    -3921     
  Branches                         1520      803     -717     
==============================================================
- Hits                             7301     4046    -3255     
+ Misses                           1028      643     -385     
+ Partials                          593      312     -281     


@q-andy q-andy force-pushed the neural-stats branch 2 times, most recently from 8648e13 to 82504ce Compare March 14, 2025 16:26
Member

@vibrantvarun vibrantvarun left a comment


Looks good, nice job Andy.

@vibrantvarun
Member

@heemin32 if your all comments are resolved then can we merge this?

@heemin32
Collaborator

@heemin32 if your all comments are resolved then can we merge this?

I think my comments are all resolved. G2G.

@vibrantvarun vibrantvarun merged commit 89e6932 into opensearch-project:feature/neural-stats-api Mar 14, 2025
90 of 96 checks passed
Labels: v3.0.0
Linked issue (may be closed by merging this pull request): [RFC] Neural Plugin Stats API
4 participants