Implementation for the latest Messaging Semantic conventions. #13192

Open
wants to merge 30 commits into
base: main

Conversation

@cbos cbos commented Feb 4, 2025

Update Messaging span with the latest semantic conventions.
https://github.com/open-telemetry/semantic-conventions/blob/v1.30.0/docs/messaging/messaging-spans.md

By default nothing changes; the implementation adheres to this guidance:

  • SHOULD introduce an environment variable OTEL_SEMCONV_STABILITY_OPT_IN
    in the existing major version which is a comma-separated list of values.
    The list of values includes:
    • messaging - emit the new, stable messaging conventions,
      and stop emitting the old experimental messaging conventions
      that the instrumentation emitted previously.
    • messaging/dup - emit both the old and the stable messaging conventions,
      allowing for a seamless transition.
    • The default behavior (in the absence of one of these values) is to continue
      emitting whatever version of the old experimental messaging conventions
      the instrumentation was emitting previously.
    • Note: messaging/dup has higher precedence than messaging in case both values are present
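A minimal sketch of how such a comma-separated value could be interpreted, including the precedence rule from the note. The class and field names are hypothetical and are not taken from the PR or from the real SemconvStability class; only the environment variable name and the two values come from the text above.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not the actual SemconvStability class.
final class MessagingOptInSketch {
  final boolean emitOld;    // keep emitting the old experimental conventions
  final boolean emitStable; // emit the new stable conventions

  private MessagingOptInSketch(boolean emitOld, boolean emitStable) {
    this.emitOld = emitOld;
    this.emitStable = emitStable;
  }

  static MessagingOptInSketch parse(String optIn) {
    String raw = optIn == null ? "" : optIn;
    Set<String> values =
        Arrays.stream(raw.split(","))
            .map(String::trim)
            .filter(v -> !v.isEmpty())
            .collect(Collectors.toSet());
    // messaging/dup wins when both values are present.
    if (values.contains("messaging/dup")) {
      return new MessagingOptInSketch(true, true);
    }
    if (values.contains("messaging")) {
      return new MessagingOptInSketch(false, true);
    }
    // Default: only the old experimental conventions.
    return new MessagingOptInSketch(true, false);
  }

  public static void main(String[] args) {
    MessagingOptInSketch mode = parse(System.getenv("OTEL_SEMCONV_STABILITY_OPT_IN"));
    System.out.println("old=" + mode.emitOld + ", stable=" + mode.emitStable);
  }
}
```

Checking messaging/dup first is what gives it precedence over messaging, regardless of the order in which the values appear in the list.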

@cbos cbos requested a review from a team as a code owner February 4, 2025 15:23

linux-foundation-easycla bot commented Feb 4, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@laurit laurit mentioned this pull request Mar 24, 2025
Contributor

laurit commented Mar 24, 2025

@cbos are you doing this for fun or do you intend to get this merged?

Author

cbos commented Mar 31, 2025

@laurit
Thanks for reaching out.
My intent is to get this merged, but I had difficulty finding out how the CLA should be handled. It took an awful lot of time to get that sorted out in the organisation where I am currently active, but that is arranged now. I signed the CLA.

Contributor

@laurit laurit left a comment

Eventually it would be nice if we also had tests running with the stable semconvs. For db tests we use

tasks {
  val testStableSemconv by registering(Test::class) {
    jvmArgs("-Dotel.semconv-stability.opt-in=database")
  }

  check {
    dependsOn(testStableSemconv)
  }
}

I guess the tests can be added a bit later once the other parts of the PR are finished.

*/
@Deprecated
String operationName() {
Contributor

Since this is a package-private method, it is not part of the public API and can be renamed without preserving the original method. If you wish to keep the original method, it would be better to make the old method call the new one.
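The second option could look roughly like the sketch below. The class name and the new method name (`operationType`) are hypothetical placeholders, since the PR's actual names are not shown in this thread; only `operationName()` and its `@Deprecated` marker come from the diff above.

```java
// Hypothetical class; illustrates a deprecated method delegating to its replacement.
final class OperationNamingSketch {
  private final String operation;

  OperationNamingSketch(String operation) {
    this.operation = operation;
  }

  /** New accessor (assumed name), used for the new naming convention. */
  String operationType() {
    return operation;
  }

  /**
   * Old accessor, kept only for callers that still use the old naming convention.
   *
   * @deprecated use {@link #operationType()} instead.
   */
  @Deprecated
  String operationName() {
    // Delegate so the two methods cannot drift apart.
    return operationType();
  }
}
```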

Author

I marked it as deprecated because it is still actively used to support the old naming convention but should not be used for the new one. Marking it as deprecated helps make sure that its usage does not grow.
Do you agree, or should I remove it?

Comment on lines -26 to +27
String getDestination(REQUEST request);
default Long getBatchMessageCount(REQUEST request, @Nullable RESPONSE response) {
return null;
}
Contributor

Rearranging methods like that complicates the review. It is hard to tell what methods have been added and what was just moved.

Author

I did this to have the list of methods align with the list of attributes on https://github.com/open-telemetry/semantic-conventions/blob/main/docs/messaging/messaging-spans.md#messaging-attributes

Then you can easily see if all attributes are covered.
The old list of methods felt 'randomly' placed in the interface, with no logical grouping. But I can be wrong about that.

Do you want me to move the methods back?

* This class is internal and is hence not for public use. Its APIs are unstable and can change at
* any time.
*/
public class PropagatorBasedBaggageExtractor<REQUEST> implements ContextCustomizer<REQUEST> {
Contributor

Is this required by the specification? If I understand correctly what this is supposed to do it seems to have a limited usefulness. Perhaps it would be better to introduce it in a separate PR?

enum SqsAttributesGetter
implements
MessagingAttributesGetter<Request<?>, Response<?>>,
MessagingNetworkAttributesGetter<Request<?>, Response<?>> {
Contributor

not much point in adding the network attributes interface when no methods are implemented

SpanKindExtractor.alwaysConsumer(),
toSqsRequestExtractors(attributesExtractors()),
singletonList(messagingAttributeExtractor),
- messagingReceiveInstrumentationEnabled);
+ messagingReceiveInstrumentationEnabled || SemconvStability.emitStableMessagingSemconv());
Contributor

My advice would be to do the attribute changes in one PR and leave any potential changes to trace structure for followup PRs. Changes to trace structure will most likely require new tests and will make your PR much larger. Also as far as I can tell https://opentelemetry.io/docs/specs/semconv/messaging/messaging-spans/#message-creation-context-as-parent-of-process-span hints that having the process span as child of publish span is still allowed. Having these in one trace seems to be preferred by users.

- public static Timer onEnter() {
+ public static Timer onEnter(@Advice.This MessageConsumer consumer) {

if (JmsConfig.EXPERIMENTAL_CONSUMER_PROCESS_TELEMETRY_ENABLED) {
Contributor

If this change is not related to the semantic conventions consider submitting it as a separate PR.

Author

I have reverted this change and created a new ticket for it:
#14054
But I think I created the ticket the wrong way; it does not have any labels attached to it yet.

Author

cbos commented Apr 4, 2025

@laurit
Thanks for the review. Next week I will go through all the details and update the PR.

@zeitlinger
Member

@cbos do you need help getting this over the finish line?

Author

cbos commented Jun 2, 2025

@zeitlinger
Thanks for reaching out.
Due to priority incidents on the project I am working on, I have not been able to catch up with the review comments.
But I will work on this in the coming days.
I think I will have to split the review into 2 parts.

Author

cbos commented Jun 4, 2025

While I was working on the Messaging Semantic conventions, I found out that there are 2 additional problems.
They are related to using span links rather than parent-child relations.

Problem 1 - context propagation:
image

Span links only link one span to another.
But the context is not propagated. Baggage headers in particular are meant for context propagation.
But that is missing right now.

That is what I implemented in this PR as well
image
I created a class for that, instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/internal/PropagatorBasedBaggageExtractor.java, and used it from the JMS implementation.
@laurit, you commented that this util class should go into a separate PR.
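To illustrate what "propagating baggage" means on the consumer side, here is a minimal, dependency-free sketch that reads a W3C `baggage` header from message properties. It is not the PropagatorBasedBaggageExtractor from this PR, which works with the configured propagators; the class name, the method name and the use of a plain `Map` for message properties are assumptions, and percent-decoding of values is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; the PR's class delegates to OpenTelemetry propagators instead.
final class BaggageHeaderSketch {

  /** Parses "k1=v1,k2=v2;prop" style baggage from the "baggage" message property. */
  static Map<String, String> parseBaggage(Map<String, String> messageProperties) {
    Map<String, String> baggage = new LinkedHashMap<>();
    String header = messageProperties.get("baggage");
    if (header == null) {
      return baggage;
    }
    for (String member : header.split(",")) {
      // Drop optional member properties that follow the first ';'.
      String keyValue = member.split(";", 2)[0];
      int eq = keyValue.indexOf('=');
      if (eq <= 0) {
        continue; // skip empty or malformed members
      }
      String key = keyValue.substring(0, eq).trim();
      String value = keyValue.substring(eq + 1).trim();
      baggage.put(key, value);
    }
    return baggage;
  }
}
```

With span links alone, entries like these stay attached to the producer's trace; making them available in the consumer's context requires an explicit extraction step of this kind.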

Problem 2 - no child spans:
image

JMS has 2 ways to receive messages.
One of these options (the most common one, I think, especially in older software) results in no child spans. The would-be 'child spans' do not have a parent context and become standalone spans.

That is solved in this PR as well:
image

I created a setting for that as well:
otel.instrumentation.jms.experimental.consumer-process-telemetry.enabled, which is disabled by default.
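As a rough sketch of how a flag like this is typically resolved: the Java agent accepts settings as system properties or as environment variables whose name is the property name upper-cased with dots and dashes replaced by underscores. The helper below is hypothetical and only mimics that lookup with plain maps; it is not the agent's configuration code.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup helper; the agent's real configuration classes are not used here.
final class JmsProcessTelemetryFlagSketch {
  static final String PROPERTY =
      "otel.instrumentation.jms.experimental.consumer-process-telemetry.enabled";

  /** Converts a property name to the matching environment variable name. */
  static String toEnvVarName(String property) {
    return property.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
  }

  /** Returns false unless the flag is explicitly set to "true" (disabled by default). */
  static boolean isEnabled(Map<String, String> systemProperties, Map<String, String> env) {
    String value = systemProperties.get(PROPERTY);
    if (value == null) {
      value = env.get(toEnvVarName(PROPERTY));
    }
    return Boolean.parseBoolean(value); // parseBoolean(null) is false
  }
}
```

Under these assumptions, the setting could be switched on with `-Dotel.instrumentation.jms.experimental.consumer-process-telemetry.enabled=true` or with the corresponding environment variable.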


@laurit @zeitlinger
What is your advice on how to continue with this?
As far as I can see, the new Messaging Semantic conventions alone are only half a solution.
The other 2 parts are also needed for a fully functional solution; otherwise you miss essential information or end up with a crippled trace.
Taking out the changes for these 2 parts will take time as well.

Locally I have an OpenTelemetry demo setup with some changes in which I included JMS messaging as well to test/demo all the changes of the new messaging conventions. Based on that I found the 2 problems as described above.

I can share that setup as well if you like.

Contributor

laurit commented Jun 5, 2025

Span links only link one span to another.
But the context is not propagated. Baggage headers in particular are meant for context propagation.
But that is missing right now.

@trask has this come up before in the spec? IMO you should definitely push this in a different PR. If this is not specified behavior, you'll either need to work to get it into the spec or add it as an experimental feature that is disabled by default.

JMS has 2 ways to receive messages.
One of these options (the most common one, I think, especially in older software) results in no child spans. The would-be 'child spans' do not have a parent context and become standalone spans.

To me this is a new feature not related to semconv update and thus should be put in a separate PR.

As far as I can see, the new Messaging Semantic conventions alone are only half a solution.

The thing with the new semantic conventions is that they can only be enabled with a new major release which means that they could remain behind an opt-in flag for quite a while. The other 2 features aren't really connected with the semconv update and could be implemented separately. Having separate features in separate PRs could also make the review easier.

Author

cbos commented Jun 5, 2025

@laurit
Thanks for your feedback. I will work on separating out the two other parts.

Member

trask commented Jun 11, 2025

Span links only link one span to another.
But the context is not propagated. Baggage headers in particular are meant for context propagation.
But that is missing right now.

I suspect this is expected, since baggage is typically associated with a single trace, and in the messaging modeling there are multiple traces involved.

I'd probably recommend opening an issue in semantic-conventions repo seeking any clarifications, since the messaging semconv is not simple (and it's not completely final).

Author

cbos commented Jun 12, 2025

Span links only link one span to another.
But the context is not propagated. Baggage headers in particular are meant for context propagation.
But that is missing right now.

I suspect this is expected, since baggage is typically associated with a single trace, and in the messaging modeling there are multiple traces involved.

I'd probably recommend opening an issue in semantic-conventions repo seeking any clarifications, since the messaging semconv is not simple (and it's not completely final).

@trask
Thanks for your reply.

Based on what I read at https://github.com/open-telemetry/semantic-conventions/blob/main/docs/messaging/messaging-spans.md#context-propagation, and with the old/current semantic conventions, baggage propagation just works normally and the baggage ends up in the consumer span.

I will take out the change from this PR and I will create a ticket on this repo. Then we can take it forward from that point.
And if clarification is needed on semantics level, we can create a ticket for that as well.


Update: I created #14024 for this

@github-actions github-actions bot added the test native label Jun 16, 2025
Author

cbos commented Jun 19, 2025

@trask / @laurit / @zeitlinger
As I mentioned on #14024, I missed an important part of the semantic conventions.
My impression was that with the new conventions it should 'always' create span links INSTEAD of using the context as parent.

These conventions use spans links as the default mechanism to correlate producers and consumer(s) because:

  • It is the only consistent trace structure that can be guaranteed, given the many different messaging systems models available.

But somehow, I missed this part:

Message creation context as parent of "Process" span
Exclusively for single messages scenarios, the "Process" span MAY use the message's creation context as its parent, thus achieving a direct parent-child relationship between producer and consumer(s). Instrumentations SHOULD document whether they use the message creation context as a parent for "Process" spans and MAY provide configuration options allowing users to control this behavior.

That has consequences for this implementation; I will rework the implementation based on that.

  • So if the new semantic conventions are enabled, it should always create the span links, but it can still use the message creation context as parent as well.
  • Context as parent is still possible (but not the default with the new conventions, I think).
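The two bullets can be sketched as a small decision function. This is a simplified model under stated assumptions: span contexts are represented as plain strings, and the class, method and parameter names are hypothetical rather than taken from the PR or the OpenTelemetry API.

```java
import java.util.List;

// Hypothetical model of how a "Process" span could relate to message creation contexts.
final class ProcessSpanPlanSketch {
  final String parent;      // null means the creation context is not used as parent
  final List<String> links; // creation contexts that are linked from the "Process" span

  private ProcessSpanPlanSketch(String parent, List<String> links) {
    this.parent = parent;
    this.links = links;
  }

  static ProcessSpanPlanSketch plan(List<String> creationContexts, boolean creationContextAsParent) {
    // With the stable conventions, links are always created.
    List<String> links = List.copyOf(creationContexts);
    // The creation context MAY be the parent, but only in single-message scenarios.
    String parent =
        creationContextAsParent && creationContexts.size() == 1
            ? creationContexts.get(0)
            : null;
    return new ProcessSpanPlanSketch(parent, links);
  }
}
```

For a batch of messages the parent stays unset even when the option is on, because the spec text quoted above limits the parent-child relationship to single-message scenarios.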

Some questions for the implementation:
Currently otel.instrumentation.messaging.experimental.receive-telemetry.enabled is a setting. If that is enabled, span links are created. (see https://opentelemetry.io/docs/zero-code/java/agent/instrumentation/#capturing-consumer-message-receive-telemetry-in-messaging-instrumentations)
But according to the new specs, span links are not about 'receive' but about 'process'. That would argue for a new setting with 'process' in the name.
Should the 'single message processing' behavior be configurable per instrumentation (so separately for JMS, RabbitMQ and Kafka), or should there be a generic setting where it gets enabled/disabled for all types? Most likely an application does not use all these messaging technologies at the same time.

The receive-telemetry setting is false by default. Should the default be 'true' with the new semantic conventions?

Member

trask commented Jun 19, 2025

I think we may not need a new configuration property if we can put all the new behavior behind otel.semconv-stability.opt-in=messaging.

Exclusively for single messages scenarios, the "Process" span MAY use the message's creation context as its parent, thus achieving a direct parent-child relationship between producer and consumer(s)

I support Java instrumentation choosing to implement this MAY. We can add opt-out configuration property in the future if someone requests it.

# Conflicts:
#	instrumentation-api-incubator/build.gradle.kts
#	instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/internal/SemconvStability.java
Author

cbos commented Jun 30, 2025

I think we may not need a new configuration property if we can put all the new behavior behind otel.semconv-stability.opt-in=messaging.

Exclusively for single messages scenarios, the "Process" span MAY use the message's creation context as its parent, thus achieving a direct parent-child relationship between producer and consumer(s)

I support Java instrumentation choosing to implement this MAY. We can add opt-out configuration property in the future if someone requests it.

@trask
Thanks for your clarification.
I checked the PR again: as soon as otel.semconv-stability.opt-in=messaging or otel.semconv-stability.opt-in=messaging/dup is set, the new names are used (and with messaging/dup some old attributes are still emitted as well).

In both cases emitStableMessagingSemconv() is true, and based on that span links are created.
But that now creates 'only' span links, which means that with otel.semconv-stability.opt-in=messaging/dup no parent-child spans are created anymore.

What is your expectation for the 'messaging/dup' situation?

I think that is the only remaining point to address; all other parts are addressed as far as I can see.

@laurit, I replied to your earlier comments, but some questions from those replies are still open. Can you have a look at them so we can finish these review comments as well?

Labels
test native
4 participants