fix(metrics): gracefully handle DogStatsd client creation failures#7756
fix(metrics): gracefully handle DogStatsd client creation failures#7756sentry[bot] wants to merge 1 commit intomasterfrom
Conversation
| # During thread cleanup (e.g., ThreadPoolExecutor shutdown), the client | ||
| # factory may fail or set an exception while returning a value, causing | ||
| # a SystemError. We catch this and log it, returning None to allow | ||
| # metrics calls to fail silently rather than crashing the application. | ||
| logger.warning( | ||
| "Failed to create DogStatsd client in thread %s: %s", | ||
| threading.current_thread().name, | ||
| e, | ||
| exc_info=True, | ||
| ) | ||
| return None |
There was a problem hiding this comment.
Bug: If __client_factory fails, self.__thread_state.client is not set. Subsequent metric calls in the same thread will repeatedly and unsuccessfully attempt to create the client, causing log spam.
Severity: CRITICAL
Suggested Fix
In the except Exception block where the client factory failure is handled, explicitly set self.__thread_state.client = None. This ensures the attribute is set, preventing subsequent calls from re-triggering the client creation logic and the associated AttributeError.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: snuba/utils/metrics/backends/datadog.py#L46-L56
Potential issue: In the `__client` property, if the `self.__client_factory()` call fails
with an exception, the assignment to `self.__thread_state.client` does not complete.
Because the attribute remains unset, any subsequent call to a metric method within the
same thread will trigger another `AttributeError`. This leads to a cycle where the code
repeatedly attempts to create the client, fails, and logs a warning with a full stack
trace for every single metric invocation. This will cause significant log spam and
performance degradation in high-throughput paths.
Did we get this right? 👍 / 👎 to inform future reviews.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
| try: | ||
| client = self.__thread_state.client | ||
| except AttributeError: | ||
| client = self.__thread_state.client = self.__client_factory() |
There was a problem hiding this comment.
Outer except only catches AttributeError, not SystemError
High Severity
The PR description states the issue is that threading.local() client access during ThreadPoolExecutor shutdown causes a C-level SystemError. However, the outer try/except on line 40-42 only catches AttributeError. If the SystemError occurs when reading self.__thread_state.client (line 41) — which is exactly the scenario described — it won't be caught, and the crash the PR intends to fix will still happen. The inner except Exception only protects the factory call path, not the attribute access path. The outer except needs to also catch Exception (or at least SystemError) to fully address the described issue.


Fixes SNUBA-9Y5. The issue was that: DatadogMetricsBackend's
threading.local()client access during concurrent ThreadPoolExecutor shutdown causes C-level SystemError.loggingmodule and initialize a logger instance.__clientproperty to catch exceptions duringDogStatsdclient factory calls.Noneif the client factory fails, preventing application crashes during thread cleanup.increment,gauge,timing,distribution,event) to check if theDogStatsdclient isNoneand fail silently if it is, avoiding errors when the client could not be initialized.This fix was generated by Seer in Sentry, triggered by pierre.massat@sentry.io. 👁️ Run ID: 10536044
Not quite right? Click here to continue debugging with Seer.
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.