You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
OS: Ubuntu
Python version: Python 3.10
SDK version: 1.31.1
API version: 0.52b1 (same code in 0.53b0 though)
What happened?
The tornado instrumenter and how it tracks a request's elapsed time is not async-safe and does not appear to be able to handle multiple concurrent requests without overwriting and using an incorrect value. This makes the HTTP_SERVER_DURATION metric inaccurate.
Steps to Reproduce
Start one long-running request to a tornado server.
Wait a moment, and start a second long-running request to a tornado server.
Let the first request finish
Let the second request finish.
Review the calculated elapsed time of that request.
This effect is much more visible with streaming requests/handlers like an SSE handler or WebSocketHandler (if I ignore #2761) that may stay connected for hours/days.
The _START_TIME is stored in server_histograms which is a global dict created by the instrumentor. It stores this in the _prepare hook, then reads this back in _record_on_finish_metrics. You can get interleaved start times.
The start time really needs tracked per handler instance (connection).
Expected Result
Request1 Prepare -> StartTime = 10:00:00am
Request2 Prepare -> StartTime = 10:00:05am
Request1 OnFinish -> Reads original StartTime of 10:00:00am, so at 10:00:10am, elapsed time should be 10 seconds.
Request1 OnFinish -> Reads correct original StartTime as 10:00:05am, so at 10:00:10am, elapsed time should be 5 seconds.
Actual Result
Request1 Prepare -> StartTime = 10:00:00am
Request2 Prepare -> StartTime = 10:00:05am
Request1 OnFinish -> Reads incorrect StartTime of 10:00:05am, so at 10:00:10am, elapsed time reported incorrectly as 5 seconds.
Request1 OnFinish -> Reads correct original StartTime as 10:00:05am, so at 10:00:10am, elapsed reported as 5 seconds.
Additional context
No response
Would you like to implement a fix?
Yes
The text was updated successfully, but these errors were encountered:
Describe your environment
OS: Ubuntu
Python version: Python 3.10
SDK version: 1.31.1
API version: 0.52b1 (same code in 0.53b0 though)
What happened?
The tornado instrumenter and how it tracks a request's elapsed time is not async-safe and does not appear to be able to handle multiple concurrent requests without overwriting and using an incorrect value. This makes the HTTP_SERVER_DURATION metric inaccurate.
Steps to Reproduce
This effect is much more visible with streaming requests/handlers like an SSE handler or WebSocketHandler (if I ignore #2761) that may stay connected for hours/days.
The
_START_TIME
is stored in server_histograms which is a global dict created by the instrumentor. It stores this in the_prepare
hook, then reads this back in _record_on_finish_metrics. You can get interleaved start times.The start time really needs tracked per handler instance (connection).
Expected Result
Actual Result
Additional context
No response
Would you like to implement a fix?
Yes
The text was updated successfully, but these errors were encountered: