JMX Scraper: YAML file and integration test hadoop #1675

Merged
merged 18 commits into from
Feb 3, 2025

Conversation

robsunday
Contributor

@robsunday robsunday commented Jan 27, 2025

Scope

  1. YAML file for Hadoop added
  2. Metrics definitions updated to be aligned with semconv in JMX Metrics Gatherer and in JMX Scraper
  3. Integration test for Hadoop added for JMX Scraper

Part of #1362

Comment on lines +45 to +48
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
export HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS -Dcom.sun.management.jmxremote.ssl=false"
export HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS -Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.rmi.port=9999"
Contributor

[minor] Would it be possible to avoid having this script file by just setting HADOOP_NAMENODE_OPTS with the appropriate JMX settings directly?

Contributor

What about all those other env vars too tho?

Contributor

As discussed with @robsunday yesterday, this looks like a modified copy of a shell script from the Hadoop distribution, which might have been modified to set up the JMX configuration.

The current startup script might make it easy to provide this configuration, but quite often it can't easily be overridden through a single env variable, and sed surgery isn't the easiest thing to maintain.

Contributor Author

So I tried a few ideas to get rid of this custom script, but without luck. All the tips/guides I've found on the internet recommend modifying hadoop-env.sh (as is done in our test).
At first glance it looks like passing your own env var HADOOP_NAMENODE_OPTS to docker should work.
Unfortunately, Hadoop launches multiple processes during startup and executes this file many times to set up env variables, and the issue is that the initial content of HADOOP_NAMENODE_OPTS passed to docker is somehow lost.
Only hardcoding the JMX settings in hadoop-env.sh worked for me, but if you know of a reasonable alternative solution I'll be happy to use it.
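
For illustration only, here is a minimal sketch of how the hardcoded JMX flags could be written so that sourcing hadoop-env.sh several times does not append them repeatedly. The function name `add_jmx_opts` and its port parameter are hypothetical and not part of the PR or the Hadoop distribution; the flags and port 9999 are the ones from the diff above. This is not claimed to fix the lost-variable problem described in this comment.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the PR): adds the JMX flags
# to HADOOP_NAMENODE_OPTS once, even if this file is sourced many times.
add_jmx_opts() {
  jmx_port="${1:-9999}"  # 9999 is the port used in the diff; the parameter is illustrative

  # Skip if the base JMX flag is already present as a whole word.
  case " ${HADOOP_NAMENODE_OPTS:-} " in
    *" -Dcom.sun.management.jmxremote "*) return 0 ;;
  esac

  jmx_flags="-Dcom.sun.management.jmxremote"
  jmx_flags="$jmx_flags -Dcom.sun.management.jmxremote.authenticate=false"
  jmx_flags="$jmx_flags -Dcom.sun.management.jmxremote.ssl=false"
  jmx_flags="$jmx_flags -Dcom.sun.management.jmxremote.port=$jmx_port"
  jmx_flags="$jmx_flags -Dcom.sun.management.jmxremote.rmi.port=$jmx_port"

  # Keep any existing options and append the JMX flags after them.
  HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS:+$HADOOP_NAMENODE_OPTS }$jmx_flags"
  export HADOOP_NAMENODE_OPTS
}

add_jmx_opts 9999
```

Unlike the original lines, this sketch appends all flags after any existing options instead of prepending the first one. Since the flags are independent JVM system properties, this ordering difference is not expected to matter.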

Contributor

Thanks for trying to remove it, we should be fine to leave it as-is since we don't have a better option.

@github-actions github-actions bot requested a review from SylvainJuge January 31, 2025 10:04
@breedx-splk breedx-splk merged commit d2a97f4 into open-telemetry:main Feb 3, 2025
14 checks passed
breedx-splk pushed a commit to breedx-splk/opentelemetry-java-contrib that referenced this pull request Feb 6, 2025
4 participants