Add diagnostic logging to MemoryJournal to track lock contention
Added comprehensive telemetry to investigate intermittent test failures in
InMemoryEventsByTagSpec.ReadJournal_live_query_EventsByTag_should_find_events_from_offset_exclusive.
Changes:
- Added Stopwatch-based timing instrumentation to track lock acquisition wait times in WriteMessagesAsync and ReplayTaggedMessagesAsync
- Added DEBUG logging for all journal message types (ReplayTaggedMessages, ReplayAllEvents, SelectCurrentPersistenceIds)
- Logs thread IDs, lock wait times, lock hold times, and event counts
- Enabled DEBUG log level in InMemoryEventsByTagSpec test configuration
Diagnostic output includes:
- "[DIAG] WriteMessagesAsync called on thread X, attempting to acquire lock"
- "[DIAG] Lock acquired after Xms on thread Y"
- "[DIAG] Lock released after holding for Xms"
- "[DIAG] Wrote event for {persistenceId}, seq {seqNr}, total events: {count}"
- "[DIAG] ReplayTaggedMessages lock acquired after Xms"
- "[DIAG] Found X events matching tag 'Y'"
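For illustration only, the wait-time and hold-time measurements described above could be sketched as follows. This is a minimal sketch, not the actual MemoryJournal change: `LockDiagnostics`, `RunTimed`, `gate`, and the `log` delegate are hypothetical names, and the real code presumably writes through the actor's logging adapter rather than a plain delegate.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper for illustration; not part of MemoryJournal.
public static class LockDiagnostics
{
    // Runs `body` while holding `gate`, and reports how long the caller
    // waited for the lock and how long the lock was held.
    public static (TimeSpan Wait, TimeSpan Hold) RunTimed(
        object gate, string operation, Action body, Action<string> log)
    {
        var threadId = Environment.CurrentManagedThreadId;
        log($"[DIAG] {operation} called on thread {threadId}, attempting to acquire lock");

        var waitTimer = Stopwatch.StartNew(); // time spent blocked on Monitor.Enter
        var holdTimer = new Stopwatch();      // time spent inside the critical section
        var lockTaken = false;
        try
        {
            Monitor.Enter(gate, ref lockTaken);
            waitTimer.Stop();
            holdTimer.Start();
            log($"[DIAG] Lock acquired after {waitTimer.ElapsedMilliseconds}ms on thread {threadId}");

            body();
        }
        finally
        {
            if (lockTaken)
            {
                Monitor.Exit(gate);
                holdTimer.Stop();
                // Logged after the lock is released so the message itself
                // does not extend the critical section.
                log($"[DIAG] Lock released after holding for {holdTimer.ElapsedMilliseconds}ms");
            }
        }

        return (waitTimer.Elapsed, holdTimer.Elapsed);
    }
}
```

A call such as `LockDiagnostics.RunTimed(gate, "WriteMessagesAsync", () => events.Add(evt), Console.WriteLine)` would emit the first three message shapes listed above. Keeping the two stopwatches separate is what lets the output distinguish time spent waiting for the lock from time spent holding it.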
This instrumentation will capture timing data on CI/CD build servers to determine
if the test failures are caused by:
1. Lock contention/fairness issues with Monitor
2. Thread pool starvation
3. Async execution context delays
4. RecoveryPermitter bottlenecks
All diagnostic logging uses the [DIAG] prefix for easy filtering. These changes are
temporary for diagnosis and will be reverted once the root cause is identified.