
Reduce metrics collection overhead #1024

Open
mbutrovich opened this issue Oct 18, 2024 · 0 comments
Labels
enhancement New feature or request

What is the problem the feature request solves?

Running TPC-H locally, I see more than 3% of on-CPU time spent in comet::execution::metrics::utils::update_comet_metric. This function appears to be called whenever native execution wakes up in the polling loop, typically to produce a batch. Starting from the root of the plan, the preorder traversal behaves as follows (a sketch of the call pattern follows the list):

  • For every metric in the node:
    • JNI call to allocate a string for the metric's name
    • JNI call to update the metric using the string as the key
  • For every child in the node:
    • JNI call to fetch child metric node
    • Recursively call update_comet_metric
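
For reference, a minimal sketch of this call pattern is below. MetricNode and JvmBridge are hypothetical stand-ins for Comet's metric tree and JNI plumbing (not the real types); the point is only that each poll costs two JNI crossings per metric plus one per child, repeated over the whole plan.

```rust
// Illustrative sketch only: `MetricNode` and `JvmBridge` are hypothetical stand-ins
// for Comet's metric tree and JNI layer, not the real types.
struct MetricNode {
    metrics: Vec<(String, i64)>, // (metric name, current value)
    children: Vec<MetricNode>,
}

/// Stand-in for the JNI boundary; each method represents one JNI call.
trait JvmBridge {
    fn new_jstring(&mut self, s: &str) -> usize;                    // allocate a Java string
    fn set_metric(&mut self, node: usize, name: usize, value: i64); // update one metric by name
    fn get_child_node(&mut self, node: usize, idx: usize) -> usize; // fetch a child metric node
}

fn update_comet_metric(env: &mut dyn JvmBridge, jvm_node: usize, node: &MetricNode) {
    for (name, value) in &node.metrics {
        // Two JNI calls per metric: allocate the name string, then set the value.
        let jname = env.new_jstring(name);
        env.set_metric(jvm_node, jname, *value);
    }
    for (i, child) in node.children.iter().enumerate() {
        // One more JNI call per child, then recurse (preorder traversal).
        let jvm_child = env.get_child_node(jvm_node, i);
        update_comet_metric(env, jvm_child, child);
    }
}
```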

Describe the potential solution

There are a few things to explore:

  1. Does reducing the granularity of metrics updates affect the correctness of these metrics? If not, we could update metrics less frequently.
  2. Can we eliminate the overhead of repeatedly allocating strings via JNI for every metric? Addressed in perf: Cache jstrings during metrics collection #1029.
  3. Can we update an entire node's metrics with a single JNI call, rather than a JNI call for each metric? (One possible shape is sketched after this list.)
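
One hypothetical shape for idea 3, which would also remove the repeated string allocation from idea 2, is sketched below: register the node's metric names once when the plan is created, then push only a value array across the boundary on each poll. JvmBridge and both of its methods are illustrative stand-ins, not Comet's actual JNI API, and this is not necessarily the approach taken in #1029.

```rust
// Hypothetical sketch: `JvmBridge` stands in for Comet's JNI layer; neither method
// name is the real Comet/JNI API.
trait JvmBridge {
    /// One-time call: tell the JVM-side metric node which metric names to expect, in order.
    fn register_metric_names(&mut self, node: usize, names: &[&str]);
    /// Per-poll call: push all of the node's values in a single JNI crossing.
    fn set_all_metrics(&mut self, node: usize, values: &[i64]);
}

struct NodeMetrics {
    names: Vec<String>, // fixed ordering, registered once when the plan is built
    values: Vec<i64>,   // refreshed from the native operators on every poll
}

/// Called once per node at plan creation; no further name strings cross the JNI boundary.
fn register_node(env: &mut dyn JvmBridge, jvm_node: usize, m: &NodeMetrics) {
    let names: Vec<&str> = m.names.iter().map(String::as_str).collect();
    env.register_metric_names(jvm_node, &names);
}

/// Called on each poll: one JNI call per node instead of two per metric.
fn update_node(env: &mut dyn JvmBridge, jvm_node: usize, m: &NodeMetrics) {
    env.set_all_metrics(jvm_node, &m.values);
}
```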

Additional context

No response
