-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(events): Allow attaching context to event dispatching #205
base: main
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: 1e17dea The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a bug in implementation (haven't verified but very likely)
But overall, I would prefer moving event ingestion context to initialization parameters. I believe this makes our API less error-prone and easier to understand. It also simplifies our implementation. And I don't think this excludes any real use cases (and given that context is applied retroactively, any dynamic use case is likely to be buggy)
# Unstable | ||
# Sets an arbitrary context key/value pair to be sent with all events. | ||
def unstable_set_context(key, value) | ||
@core.set_context(key, value) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API seems confusing because the name is not related to event ingestion in any way.
I have also have questions about use case and usage pattern:
- I imagine that context might often be set in bulk, so maybe passing a map is more ergonomic than setting keys one by one
- How to unset a key?
- Does it really need to be dynamic? Given there's some lag between submitting an event and it going out to event ingestion service, setting context may apply retroactively and "reset" context for previously tracked events. I don't think this behavior is easily explainable and users are likely not expecting this, so that just seems like a trap. What do you think about making context an initialization parameter instead of new methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I think the plan is to allow providing a context in bulk during SDK initialization as well, that will come later
- Set it to
null
- Yeah I think we're OK with that, as long as it's documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Makes sense. I think, JS implementation doesn't remove the field though but sends
null
- Do you mean we're good with users shooting themselves in the foot as long as it's documented behavior? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- point taken. I don't expect that to be a common pattern - I expect most customers will set the context once during initialization/page load, etc. or rather infrequently as a result of user actions. while I agree that there is risk of a race condition here, I'd expect it to be rare enough to be negligible. @petzel let us know what your thoughts are here, I think @rasendubi has a valid concern, which is also present in the js sdk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- yup, that sounds about right. I'll make a note to clean that up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on the environment but environments that go offline (e.g., mobile, web on mobile networks) or trigger retries may find the context to be off quite often. Context may also change between retries, so the server might receive duplicate events but with different context. This becomes even worse with persistence
Imo, any changeable context should be part of events themselves
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could associate a timestamp with a context
and use that for ensuring they don't get incorrectly associated with events tracked before that timestamp. should be fairly simple to do with simple sorting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could associate a timestamp with a
context
and use that for ensuring they don't get incorrectly associated with events tracked before that timestamp. should be fairly simple to do with simple sorting.
That kinda works except when it doesn't. Timestamps are not monotonic and travel back in time all the time because of skews and time corrections. Timestamps are not reliable for establishing ordering/correlation of events — especially on random PCs where we don't control system time
We can establish order by using a single channel for both context and events. There's still a lot of edge cases though, and making it work with persistence is going to be painful. We can make it work reliably but I really don't think that we should — that just seems like a time sink with no real benefit.
I still don't see a reason to make context dynamic, so I would rather make it static/init-only until we have a customer with a real use case.
If we need dynamic context, the easiest solution is to attach context to each event individually. This is simple, reliable, and unlocks more use cases in the future (e.g., scoping context to a particular request in server env or integration with server overrides). It may seem wasteful duplicating the same context but HTTP body compression could really shine here and significantly reduce the overhead (and given we're sending tons of jsons and they compress really well, we should use some compression to get that 5x reduction in size)
b833266
to
a50e240
Compare
@@ -60,17 +61,30 @@ impl EventIngestionConfig { | |||
/// A handle to Event Ingestion subsystem. | |||
pub struct EventIngestion { | |||
tx: mpsc::Sender<BatchedMessage<Event>>, | |||
context_sender: mpsc::Sender<(String, Value)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd propose we rename tx
to event_sender
for symmetry and readability
String(Str), | ||
Number(f64), | ||
Boolean(bool), | ||
Null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: I think we want to model a potentially absent value as Option<ContextValue>
. This will make sure that we don't send null
fields and instead erase fields.
This also works well with deserialization because Option<>
parses null
, so we can get rid of try_from_json()
and ContextError
let event_delivery_clone = Arc::clone(&event_delivery); | ||
let (context_sender, mut context_rx) = mpsc::channel::<(String, ContextValue)>(100); | ||
runtime.spawn_untracked(async move { | ||
while let Some((key, value)) = context_rx.recv().await { | ||
let mut event_delivery = event_delivery_clone.lock().await; | ||
event_delivery.attach_context(key, value) | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor (not blocking): this mostly works but seems a bit wasteful and may suffer from congestions on event delivery. Given that context is only read/written very briefly, it seems wasteful to lock the whole event delivery with an async mutex (where a sync mutex/rwlock on context would suffice)
))); | ||
|
||
let event_delivery_clone = Arc::clone(&event_delivery); | ||
let (context_sender, mut context_rx) = mpsc::channel::<(String, ContextValue)>(100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: Using a separate channel for context does not guarantee ordering and may suffer from partial writes (which is probably unexpected by users)
Given the following piece of user code:
client.set_context("field1", "hello")
client.set_context("field2", "world")
client.track(event)
The event might get sent out with only field1
in the context (but not field2
), or with no context at all.
#[cfg(feature = "event_ingestion")] | ||
pub use event_ingestion::ContextValue; | ||
pub use sdk_key::SdkKey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably unintended change of API: SdkKey
is now available without event_ingestion
feature flag
@@ -80,5 +80,6 @@ pub use attributes::{ | |||
pub use configuration::Configuration; | |||
pub use error::{Error, EvaluationError, Result}; | |||
#[cfg(feature = "event_ingestion")] | |||
pub use event_ingestion::ContextValue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason you want to expose it at the top level instead of keeping it scoped to event_ingestion
mod?
Co-authored-by: Oleksii Shmalko <[email protected]>
Description
eppo-multiplatform port of JS sdk implementation Eppo-exp/js-sdk-common#213
Testing
Wrote tests