Document generic approach for span status (code + description) and exception event when instrumented code throws #1536

Open
lmolkova opened this issue Oct 31, 2024 · 3 comments

Comments

@lmolkova
Contributor

lmolkova commented Oct 31, 2024

The common approach seems to be:

  • do nothing if the exception is handled by the instrumented library (retries, etc.)
  • for unhandled (by client lib) exceptions:
    • set span status to error
    • set span status description to exception message
    • set error.type attribute on spans/metrics based on the exception type or more specific low-cardinality error code
    • DO NOT record an exception event (by default) unless recording a local root span or when the instrumentation knows that user code didn't handle the exception. Reason: exceptions are huge and expensive. Users decide if/how to record them when they catch them. If an exception stays unhandled, the server instrumentation will record it.

We should document it and link from DB/messaging/gen-ai and other conventions.
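The steps in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchSpan`, `on_unhandled_exception`, and `call_with_span` are hypothetical names made up for this sketch and are not the OpenTelemetry API; a real instrumentation would use the span type of its SDK.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchSpan:
    """Hypothetical stand-in for a tracing span; not the OpenTelemetry API."""
    is_local_root: bool = False
    status_code: str = "UNSET"
    status_description: Optional[str] = None
    attributes: dict = field(default_factory=dict)
    events: list = field(default_factory=list)


def on_unhandled_exception(span, exc, user_code_did_not_handle=False):
    """Apply the listed steps to an exception that escapes the client library."""
    span.status_code = "ERROR"
    span.status_description = str(exc)
    # Exception type name; a more specific low-cardinality error code
    # could be used here instead when one is available.
    exc_type = type(exc)
    name = exc_type.__qualname__
    if exc_type.__module__ != "builtins":
        name = f"{exc_type.__module__}.{name}"
    span.attributes["error.type"] = name
    # Exception events are large, so record one only in the two cases named above.
    if span.is_local_root or user_code_did_not_handle:
        span.events.append({"name": "exception", "exception.message": str(exc)})


def call_with_span(span, operation):
    """Run `operation`; exceptions it handles internally (retries) never reach here."""
    try:
        return operation()
    except Exception as exc:
        on_unhandled_exception(span, exc)
        raise  # the caller decides if/how to record the exception
```

In this sketch a client span that is not a local root gets status, description, and `error.type`, but no exception event; the exception is re-raised so that user code or the server instrumentation can record it.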

@alanwest
Member

Another aspect of this to consider is whether information should be captured from the outermost exception or innermost in the case of nested exceptions.

In capturing error.type and span status description, my intuition is that the innermost exception type/message is the most useful because it most closely describes the root of the problem.

On the other hand, when a user opts in to recording exception events, my intuition is that the outermost exception may be the most useful because the stack trace captured usually contains information from all nested exceptions.

Interestingly, the DB semantic conventions seem to suggest capturing details from the innermost exception:

[9]: The error.type SHOULD match the db.response.status_code returned by the database or the client library, or the canonical name of exception that occurred. When using canonical exception type name, instrumentation SHOULD do the best effort to report the most relevant type. For example, if the original exception is wrapped into a generic one, the original exception SHOULD be preferred. Instrumentations SHOULD document how error.type is populated.

The HTTP semantic conventions seem less opinionated and do not have a similar statement.
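To make the innermost/outermost distinction concrete, here is a small sketch in Python using only the standard library. The helper names `innermost_exception`, `describe_for_span`, and `format_for_event` are hypothetical, and how "nested" exceptions are represented differs by language; this assumes Python's `__cause__`/`__context__` chaining.

```python
import traceback


def innermost_exception(exc):
    """Follow explicit (`raise ... from`) and implicit chaining to the original exception."""
    seen = {id(exc)}
    current = exc
    while True:
        nxt = current.__cause__
        if nxt is None and not current.__suppress_context__:
            nxt = current.__context__
        # Stop at the end of the chain, or if the chain loops back on itself.
        if nxt is None or id(nxt) in seen:
            return current
        seen.add(id(nxt))
        current = nxt


def describe_for_span(exc):
    """error.type and status description taken from the innermost exception."""
    inner = innermost_exception(exc)
    return {"error.type": type(inner).__qualname__, "status_description": str(inner)}


def format_for_event(exc):
    """Stack trace taken from the outermost exception; it includes the chained causes."""
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
```

With a `TimeoutError` wrapped in a generic `RuntimeError`, `describe_for_span` reports the `TimeoutError`, while `format_for_event` on the outer exception yields a trace that mentions both.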

@trask
Member

trask commented Nov 5, 2024

  • DO NOT record exception event (by default) unless recording SERVER/CONSUMER span when the instrumentation knows that user code didn't handle the exception.

a slight variation on this is to replace SERVER/CONSUMER span above with "local root" span
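The two variants can be compared as simple predicates. This is a sketch only: `SpanKind` here is a local enum defined for the example (not imported from any SDK), and both function names are hypothetical.

```python
from enum import Enum


class SpanKind(Enum):
    """Local stand-in for span kinds; defined here only for this sketch."""
    INTERNAL = "internal"
    CLIENT = "client"
    SERVER = "server"
    PRODUCER = "producer"
    CONSUMER = "consumer"


def record_event_by_kind(kind, user_code_handled):
    """Variant 1: record only on SERVER/CONSUMER spans when user code did not handle it."""
    return kind in (SpanKind.SERVER, SpanKind.CONSUMER) and not user_code_handled


def record_event_by_local_root(is_local_root, user_code_handled):
    """Variant 2: same rule, with SERVER/CONSUMER replaced by "local root" span."""
    return is_local_root and not user_code_handled
```

One case where the two would differ, assuming a background job whose top-level span is INTERNAL and has no parent: variant 1 skips the exception event, while variant 2 records it.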

@trask
Member

trask commented Nov 6, 2024

since it appears that github discussions don't get backreferenced, linking here: open-telemetry/opentelemetry-java-instrumentation#12125

Projects
Status: V1 - Stable Semantics