feat(debugger): add snapshot time budget #6897
base: master
Conversation
This stack of pull requests is managed by Graphite.
Overall package size

Self size: 13.24 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/libdatadog | 0.7.0 | 35.02 MB | 35.02 MB |
| @datadog/native-appsec | 10.3.0 | 20.73 MB | 20.74 MB |
| @datadog/native-iast-taint-tracking | 4.0.0 | 11.72 MB | 11.73 MB |
| @datadog/pprof | 5.12.0 | 11.19 MB | 11.57 MB |
| @opentelemetry/resources | 1.30.1 | 557.67 kB | 7.71 MB |
| @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB |
| protobufjs | 7.5.4 | 2.95 MB | 5.82 MB |
| @datadog/wasm-js-rewriter | 5.0.1 | 2.82 MB | 3.55 MB |
| @datadog/native-metrics | 3.1.1 | 1.02 MB | 1.43 MB |
| @opentelemetry/api-logs | 0.208.0 | 199.48 kB | 1.42 MB |
| @opentelemetry/api | 1.9.0 | 1.22 MB | 1.22 MB |
| jsonpath-plus | 10.3.0 | 617.18 kB | 1.08 MB |
| import-in-the-middle | 1.15.0 | 127.66 kB | 856.24 kB |
| lru-cache | 10.4.3 | 804.3 kB | 804.3 kB |
| @datadog/openfeature-node-server | 0.2.0 | 118.51 kB | 437.19 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| source-map | 0.7.6 | 185.63 kB | 185.63 kB |
| pprof-format | 2.2.1 | 163.06 kB | 163.06 kB |
| @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB |
| @isaacs/ttlcache | 2.1.1 | 90.58 kB | 90.58 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 7.0.5 | 63.38 kB | 63.38 kB |
| istanbul-lib-coverage | 3.2.2 | 34.37 kB | 34.37 kB |
| rfdc | 1.4.1 | 27.15 kB | 27.15 kB |
| dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| shell-quote | 1.8.3 | 23.74 kB | 23.74 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| semifies | 1.0.0 | 15.84 kB | 15.84 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| ttl-set | 1.0.0 | 4.61 kB | 9.69 kB |
| mutexify | 1.4.0 | 5.71 kB | 8.74 kB |
| path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB |
| module-details-from-path | 1.0.4 | 3.96 kB | 3.96 kB |
| escape-string-regexp | 5.0.0 | 3.66 kB | 3.66 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
##           master    #6897      +/-   ##
==========================================
+ Coverage   83.83%   84.59%    +0.75%
==========================================
  Files         506      505        -1
  Lines       21359    21171      -188
==========================================
+ Hits        17907    17910        +3
+ Misses       3452     3261      -191

☔ View full report in Codecov by Sentry.
Benchmarks
Benchmark execution time: 2025-11-15 05:56:51
Comparing candidate commit 495678c in PR branch.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 1608 metrics, 62 unstable metrics.
Force-pushed 3f58922 to 1992630
Force-pushed 1992630 to 03362e2
Force-pushed 4f232ba to c5280f9
Force-pushed c5280f9 to b9b7fa5
Force-pushed ddd5ec0 to bfd99a1
Force-pushed 65c4e4b to 4a970c8
// This can be skipped and we can go directly to its children, of which
// there will always be exactly two, the first containing the key, and the
// second containing the value of this entry of the Map.
if (Object.hasOwn(value, timeBudgetSym)) {
I think we can reduce the snapshot size by simply filtering those items out and putting notCapturedReason: 'timeout' on the map/collection object itself, the same as the size capture limit.
I agree. I didn't like this bit either. I took a second look and found a better implementation that addresses your concern.
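For illustration, here is a hedged sketch of how a collection that hit the time budget could be represented once the timeout is reported on the collection itself; apart from notCapturedReason: 'timeout', the field names and shape are illustrative, not taken from this PR.

```js
// Illustrative only: a Map whose remaining entries were skipped because the
// time budget was exhausted. The timeout is reported once on the collection
// itself instead of on each unresolved entry.
const capturedMap = {
  type: 'Map',
  entries: [
    // ...entries that were resolved before the deadline...
    [{ type: 'string', value: 'key1' }, { type: 'number', value: '42' }]
  ],
  notCapturedReason: 'timeout'
}
console.log(capturedMap.notCapturedReason) // 'timeout'
```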
Force-pushed 4a970c8 to 6f67548
Force-pushed 6f67548 to 495678c
BridgeAR left a comment
I reviewed the code, not the tests.
  // Trigger the breakpoint once probe is successfully installed
- function triggerBreakpoint (url) {
+ async function triggerBreakpoint (url) {
This actually adds a single tick without any other benefit (besides marking the method as an async one).
Could you add JSDoc as well? :)
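To illustrate the point about the extra tick, a generic Node.js example (not code from this PR): marking a function async defers its result to a later microtask even if the body never awaits anything.

```js
async function noop () {}

noop().then(() => console.log('2: resolved on a later microtask'))
console.log('1: synchronous code after the call runs first')
```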
This is test code, so I optimize for readability rather than performance. But I can add the JSDoc.
I ended up adding types to the entirety of the debugger test utils, and the changes have become quite extensive, so I'll add those in a follow-up PR to keep this one clean.
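For reference, a minimal sketch of what the requested JSDoc could look like; the body shown here is hypothetical, since the real helper lives in the debugger test utils and is not part of this excerpt.

```js
/**
 * Trigger the breakpoint once the probe is successfully installed.
 *
 * @param {string} url - URL of the endpoint that hits the instrumented code path.
 * @returns {Promise<void>} Resolves once the request has completed.
 */
async function triggerBreakpoint (url) {
  // Hypothetical body: issue a request so the breakpoint fires.
  await fetch(url)
}
```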
  maxCollectionSize,
  maxFieldCount,
  maxLength,
  deadlineNs: start + BigInt(config.dynamicInstrumentation.captureTimeoutMs ?? 10) * 1_000_000n
I believe the default is already defined in the config. We could also create the BigInt directly in the config (and define the default there as well). That seems best in my opinion.
- deadlineNs: start + BigInt(config.dynamicInstrumentation.captureTimeoutMs ?? 10) * 1_000_000n
+ deadlineNs: start + config.dynamicInstrumentation.captureTimeoutMs * 1_000_000n
Do you mean converting captureTimeoutMs to nanoseconds in the config? So we would still take captureTimeoutMs as the init input, but compute a separate captureTimeoutNs at config time, which we reference here?
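A minimal sketch of that idea, assuming hypothetical stand-ins for the init options and the resolved config; captureTimeoutNs is the name proposed above, not an existing config field.

```js
// Hypothetical stand-ins for the tracer's init options and its resolved config.
const options = { dynamicInstrumentation: { captureTimeoutMs: 20 } }
const config = { dynamicInstrumentation: {} }

// Keep captureTimeoutMs as the user-facing input; derive the BigInt once at config time.
config.dynamicInstrumentation.captureTimeoutNs =
  BigInt(options.dynamicInstrumentation.captureTimeoutMs ?? 10) * 1_000_000n

// At capture time the deadline is then a plain BigInt addition.
const start = process.hrtime.bigint()
const deadlineNs = start + config.dynamicInstrumentation.captureTimeoutNs
console.log(deadlineNs > start) // true
```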
packages/dd-trace/src/debugger/devtools_client/snapshot/index.js (outdated, resolved)
packages/dd-trace/src/debugger/devtools_client/snapshot/collector.js (outdated, resolved)
 * @property {number} maxCollectionSize - The maximum size of a collection to include in the snapshot
 * @property {number} maxFieldCount - The maximum number of properties on an object to include in the snapshot
 * @property {bigint} deadlineNs - The deadline in nanoseconds compared to `process.hrtime.bigint()`
 * @property {boolean} [deadlineReached=false] - Whether the deadline has been reached. Should not be set by the
If it should not be set by a user, let us not document it here? And if I understand it correctly, it is just a way to transport the information to other code parts. Is there any other way to transport that information?
If we don't document it here, TypeScript will complain when we reference it, right?
But yes, it's just a way to transport the information out to the caller, so that when they pass the same opts object to a subsequent call to getRuntimeObject, this information is retained and we don't have to re-compute it. I agree this is not a good programming pattern, but I felt it was the one with the best performance for what we wanted. But I'd be happy to consider alternatives 😊
Another approach I considered was a context object that could be passed around. But since we already pass the opts object around all over the code, I felt it was easier to just re-use that. But maybe we could make it more obvious by nesting it under a ctx property. So instead of deadlineReached being directly on the root of the opts object, it could be ctx.deadlineReached.
I fixed this in the latest commit: I updated the JSDoc and added the ctx property.
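A hedged sketch of the ctx pattern described above, using hypothetical function and variable names; the real collector passes a richer opts object around.

```js
// opts carries configuration; opts.ctx carries mutable state shared across
// recursive calls, so the deadline check is not repeated once it has tripped.
function collectSketch (value, opts) {
  opts.ctx ??= { deadlineReached: false }

  if (!opts.ctx.deadlineReached && process.hrtime.bigint() >= opts.deadlineNs) {
    opts.ctx.deadlineReached = true
  }

  if (opts.ctx.deadlineReached) {
    // Let the processor mark this subtree with notCapturedReason: 'timeout'.
    return { notCapturedReason: 'timeout' }
  }

  // ...otherwise traverse `value`, passing the same opts (and ctx) to recursive calls...
  return { value: String(value) }
}

// Usage: a 10 ms budget relative to "now".
const opts = { deadlineNs: process.hrtime.bigint() + 10_000_000n }
collectSketch(new Map(), opts)
```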
packages/dd-trace/src/debugger/devtools_client/snapshot/processor.js (two outdated, resolved threads)
Enforce a per-snapshot time budget. By default this budget is 10 ms, but
it can be modified via the experimental config, either as an environment
variable:
DD_DYNAMIC_INSTRUMENTATION_CAPTURE_TIMEOUT_MS=20
Or programmatically:
tracer.init({
dynamicInstrumentation: {
captureTimeoutMs: 20
}
})
When the budget is exceeded, remaining values that are not already
resolved are marked with `notCapturedReason: 'timeout'`.
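As a consumer-side illustration (not part of this PR), a snapshot truncated by the budget could be detected by scanning for the timeout marker; countTimeouts is a hypothetical helper.

```js
// Hypothetical helper: count how many captured values were cut off by the
// time budget by looking for notCapturedReason: 'timeout' in a snapshot object.
function countTimeouts (captured) {
  if (captured === null || typeof captured !== 'object') return 0
  let count = captured.notCapturedReason === 'timeout' ? 1 : 0
  for (const child of Object.values(captured)) {
    count += countTimeouts(child)
  }
  return count
}
```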
Force-pushed 8b643eb to c5336bf
Force-pushed c5336bf to af0b05b

What does this PR do?
Enforce a per-snapshot time budget, 10 ms by default, configurable via the DD_DYNAMIC_INSTRUMENTATION_CAPTURE_TIMEOUT_MS environment variable or programmatically via tracer.init({ dynamicInstrumentation: { captureTimeoutMs }}). When the budget is exceeded, remaining values that are not already resolved (e.g. a still-pending session.post promise) are marked with notCapturedReason: 'timeout'.
Motivation
Prevent snapshot collection from exceeding acceptable latency and resource usage in complex/large object graphs.
Additional Notes
Depends on DataDog/system-tests#5705
Not covered by this PR:
Runtime.getProperties on an array with a million elements will take much, much longer than 10 ms. I'll address this in follow-up PRs (feat(debugger): add special handling for very large collections/objects #6912).