Conversation
Adds outcome tracking to trace items with data category, quantity, and key_id fields. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
The latest Buf updates on your PR. Results from workflow ci / buf-checks (pull_request).
|
Changes message name from plural to singular since it represents a single outcome. The field name item_outcomes remains plural as it's a repeated list. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Dav1dde
left a comment
There was a problem hiding this comment.
Some context:
- As a timestamp outcomes can use the
receivedtimestamp of theTraceItem. - The key id is needed for outcomes, we already do a decently good job to preserve the key id (e.g. for spans), so we want to keep this going.
Outcomes allow for more metadata to be attached, like event_id, remote_addr. These are currently not as important, but we should already decide what we want to do if we need to add more metadata to outcomes:
- We cannot add any more metadata
- We add more top-level fields to the trace item
- We change the schema and introduce a new version which introduces some name-spacing
- We change the schema now.
Since we do already have quite a few top level attributes, which are quite similar metadata to e.g. a remote_addr (the received timestamp), I think option 2) is an okay solution.
|
|
||
| message ItemOutcome { | ||
| // DataCategory that defined in Relay | ||
| uint32 data_category = 1; |
There was a problem hiding this comment.
This should be an enum taken from the list of categories in the Relay package.
@Dav1dde should we extract this from Relay?
There was a problem hiding this comment.
@phacops I was making the assumption we would validate the data category in the consumer itself
There was a problem hiding this comment.
It's fine. I guess we also don't have a list of DataCategory in protobuf so what I'd want to do is not really possible yet. We'd have to make the list in protobuf first then replace it everywhere it's used instead of using the Python class to generate it or something.
There was a problem hiding this comment.
@Dav1dde can we get a rust crate for the relay consts? so we can import the consts like we do in other places in snuba
There was a problem hiding this comment.
In Relay, for outcomes, the data category is actually typed as u8 which leaves us with about 220 more data categories we can invent.
Currently the source of truth for data categories lives in Sentry, but also all of Sentry (and Relay for that matter) is built in a way that it needs to be able to deal with unknown datacategories (arbitrary numbers), so we could follow this approach also here and just use a number.
I think setting up code generation in Relay, to import into sentry-protos creates a weird dependency graph where sentry-protos depends on some Relay artifcat but Relay also depends on sentry-protos to produce data.
Is changing from u8/uint32 to an enum in protobuf a breaking change? If not we can start out as a number, then think about moving the source of truth for a data category into sentry-protos or some other schema repo.
There was a problem hiding this comment.
Maybe I misunderstood the question, @MeredithAnya would you like to use the rust artifact/list of datacateories just in the consumer to have some typing or also here in the protos?
If it's just about the consumer, the consumer needs to be able to deal with unknown data categories, as we don't want to have to update the consumer before introducing a new data category.
There was a problem hiding this comment.
Maybe I misunderstood the question, @MeredithAnya would you like to use the rust artifact/list of datacateories just in the consumer to have some typing or also here in the protos?
@Dav1dde I was talking about the consumers. Right now I want to enforce that the data categories we are creating accepted outcomes for are in the right subset, in getsentry outcomes consumer they are using it like
from sentry.constants import DataCategory
MIN_CATEGORY = min(DataCategory).value
MAX_CATEGORY = max(DataCategory).value
IGNORED_CATEGORIES = {DataCategory.SESSION}where in sentry's constants.py the DataCategory is
DataCategory = sentry_relay.consts.DataCategoryObviously I'd be doing this differently but right now snuba only has the ability to do this in the python consumers/ api logic with https://pypi.org/project/sentry-relay/
I could see us add |
What's the use of the |
2a5dd7d to
1c457dc
Compare
@Dav1dde @phacops decided to change the schema now to make |
This looks good, especially with the |
Adds
Outcomesto theTraceItemwhich will have metadata likekey_idand then aCategoryCountwith the following fields:data_category(uint32)quantity(uint64)A
TraceItemcan have multiple outcomes because there can be different quantity types. For example,Logitems have outcomes in both thelog_itemandlog_bytecategory, hence the need for multipleCategoryCounts.🤖 Generated with Claude Code