Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(instrumentation-aws-sdk): add gen ai conventions for converse stream span #2769

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

anuraaga
Copy link
Contributor

Which problem is this PR solving?

  • Currently only non-streaming Converse populates gen ai conventions

Short description of the changes

  • Allows a service extension to override the response, needed to wrap streams
  • Don't end the span in middleware if an extension indicates the response is a stream as the extension needs to end it after it populates attributes during stream consumption. Unrelated to semconv, this should also make the timings more accurate instead of only being for the initial request before response consumption
  • Handles ConverseStream in bedrock extension using above

/cc @trentm @codefromthecrypt

if (override) {
response.output = override;
normalizedResponse.data = override;
}
self._callUserResponseHook(span, normalizedResponse);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered whether this should be done for the user hook too but didn't think there's enough use case for it. Currently the change is only internal since AFAIK, users can't define service extensions

Copy link

codecov bot commented Mar 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.47%. Comparing base (122b203) to head (22ea86d).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2769      +/-   ##
==========================================
+ Coverage   89.45%   89.47%   +0.01%     
==========================================
  Files         174      174              
  Lines        8318     8340      +22     
  Branches     1591     1595       +4     
==========================================
+ Hits         7441     7462      +21     
- Misses        877      878       +1     
Files with missing lines Coverage Δ
...entelemetry-instrumentation-aws-sdk/src/aws-sdk.ts 92.72% <100.00%> (+0.18%) ⬆️
...ntation-aws-sdk/src/services/ServicesExtensions.ts 96.55% <100.00%> (ø)
...umentation-aws-sdk/src/services/bedrock-runtime.ts 98.36% <100.00%> (+0.68%) ⬆️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Streaming is tricky but you handled it concisely. Thanks

Copy link
Contributor

@trentm trentm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial review. I haven't looked carefully at the implementation yet.

// isStream - if true, then the response is a stream so the span should not be ended by the middleware.
// the ServiceExtension must end the span itself, generally by wrapping the stream and ending after it is
// consumed.
isStream?: boolean;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if the GenAI SIG discussed/documented wanting this behaviour of ending the span after the full stream is consumed? I've seen opinions vary when discussing HTTP streaming. See the guidance for HTTP client span duration here: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#http-client-span-duration

Is there any equivalent in the Python GenAI-related instrumentations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the link, that's interesting indeed in terms of response streaming. There isn't any guideline on gen ai spans, however I feel it's sort of implied by the conventions for token usage - there really isn't a way to populate them without keeping the span until the end of the stream. While the duration could keep streaming out of it, that would then need to override the end time of that span with the earlier timestamp, which I don't think the JS SDK supports.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the duration could keep streaming out of it, that would then need to override the end time of that span with the earlier timestamp, which I don't think the JS SDK supports.

Correct, it does not support that. The (documented) behaviour then would be that the Span duration would be just up until the first response from the server, effectively TTFB.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah do you mean that we should use TTFB here? That would mean we couldn't record usage information though.

BTW, I realized that this might be closer to RPC than HTTP which has some specification for streaming

https://opentelemetry.io/docs/specs/semconv/rpc/rpc-spans/#message-event

an event for each message sent/received on client and server spans SHOULD be created

I think this also implies the overall span is for the whole stream. FWIW python keeps the span for the entire stream too.

import {
ConverseStreamOutput,
TokenUsage,
} from '@aws-sdk/client-bedrock-runtime';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a runtime dependency on the library being instrumented. In general, I don't think we can have that, because it adds coupling between the instrumentation (which is meant to work for a range of aws-sdk versions) and a particular version of the aws-sdk dependency.

These are just types. My understanding is the typical handling is to add a (possibly simplified) copy of the types from the module being instrumented, and add those to a local types ts file in the instrumentation. See, for example, the existing "src/aws-sdk.types.ts" in this instrumentation package, which says:

/*
  These are slightly modified and simplified versions of the actual SQS types included
  in the official distribution:
  https://github.com/aws/aws-sdk-js/blob/master/clients/sqs.d.ts
  These are brought here to avoid having users install the `aws-sdk` whenever they
  require this instrumentation.
*/

Copy link
Contributor Author

@anuraaga anuraaga Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, nice catch. I just changed to import type which should remove the runtime dependency. I feel it could also be used for those existing aws-sdk.types.ts though I didn't change them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like instrumentation-graphql and some others are using import type ... for this kind of thing.
Sounds good as long as it passes the test-all-versions tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants