-
Notifications
You must be signed in to change notification settings - Fork 710
opentelemetry-instrumentation-starlette: fix memory leak and double middleware #3529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
...ntelemetry-instrumentation-starlette/src/opentelemetry/instrumentation/starlette/__init__.py
Show resolved
Hide resolved
@@ -331,9 +332,6 @@ def __init__(self, *args: Any, **kwargs: Any): | |||
# adding apps to set for uninstrumenting | |||
_InstrumentedStarlette._instrumented_starlette_apps.add(self) | |||
|
|||
def __del__(self): | |||
_InstrumentedStarlette._instrumented_starlette_apps.remove(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would fail if the app isn't in the set, which would happen in the .clear()
call in StarletteInstrumentation.uninstrument()
. And actually, __del__
could only ever be called after such a call because the set held a strong reference, preventing __del__
from being called. I am guessing older versions of starlette had a similar memory leak that was fixed, causing the newer ones to trigger the crash here.
I changed to WeakSet
which should fix it, and we don't need __del__
since it's automatically cleaned when the object goes away
def setUp(self) -> None: | ||
patcher = patch.dict( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously patch was decorating setUp
- it was active only for the duration of setUp
, not the entire test case. Recent starlette seems to initialize middleware lazily so the patch needs to be active during the test case for it. I switched to this pattern that activates it both for setUp
(for older eager starlette) and the test cases (for newer lazy starlette). Note, moving the decorator to the class level would loose the former.
@anuraaga this is ready to review since you fixed the testcase right? |
@@ -230,7 +230,8 @@ def test_basic_post_request_metric_success(self): | |||
dict(point.attributes), expected_duration_attributes | |||
) | |||
if metric.name == "http.server.duration": | |||
self.assertAlmostEqual(duration, point.sum, delta=30) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
latest had trouble with this assertion because it initializes middleware lazily, so the duration
ends up being way longer than the actual request's duration. I think in principal, it's only safe to check the request duration is not larger than the test duration rather than comparing the actual values so I changed the assertions.
I wanted to use self.assertGreater(point.sum, 0)
but the old ms metric is round
ed and easily 0
on a fastish machine - not sure if the rounding is actually required for it but that would be a change in the ASGI middleware
Yup should be ok now |
Oh no, I didn't check existing PRs before making this one and now see @MattiasDC's #3456 is quite similar. I found the tests seem to pass fine for the current baseline version so didn't raise it here which is the main difference. Feel free to reject this PR as a duplicate and get that one in instead though, then I can send another one with just the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny that running the same command, I got different result for this file 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah to get it to update, I removed the file first and ran the command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dee09c0
to
3aebb34
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @emdneto have rebased this PR
def test_manual_instrument_is_noop(self): | ||
app = self._create_starlette_app() | ||
self._instrumentor.instrument_app(app) | ||
self.assertEqual(len(app.user_middleware), 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assertion is fixed by this PR. I thought there would also be double spans but some layer must be deduping them so luckily there wouldn't have been double telemetry it seems. But still good to fix the double middleware
Co-authored-by: Joe McGinley <[email protected]>
Description
Fixes memory leak from using
set
, which prevents__del__
, which means starlette apps can never be garbage collected.Fixes adding two middlewares if
instrument_app
is called with auto instrumentation active/cc @xrmx
Type of change
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.