libstore: Use boost::regex for GC root discovery #13142
As it turns out using `std::regex` is actually the bottleneck for root discovery. Just substituting `std::` -> `boost::` makes root discovery twice as fast (3x if counting only userspace time).

Some rather ad-hoc measurements to motivate the switch:

(On master)

```
nix build github:nixos/nix/1e822bd4149a8bce1da81ee2ad9404986b07914c#nix-cli --out-link result-1e822bd4149a8bce1da81ee2ad9404986b07914c
taskset -c 2,3 hyperfine "result-1e822bd4149a8bce1da81ee2ad9404986b07914c/bin/nix store gc --dry-run --max 0"
Benchmark 1: result-1e822bd4149a8bce1da81ee2ad9404986b07914c/bin/nix store gc --dry-run --max 0
  Time (mean ± σ):     481.6 ms ±   3.9 ms    [User: 336.2 ms, System: 142.0 ms]
  Range (min … max):   474.6 ms … 487.7 ms    10 runs
```

(After this patch)

```
taskset -c 2,3 hyperfine "result/bin/nix store gc --dry-run --max 0"
Benchmark 1: result/bin/nix store gc --dry-run --max 0
  Time (mean ± σ):     254.7 ms ±   9.7 ms    [User: 111.1 ms, System: 141.3 ms]
  Range (min … max):   246.5 ms … 281.3 ms    10 runs
```

`boost::regex` is a drop-in replacement for `std::regex`, but much faster. Doing a simple before/after comparison doesn't surface any change in behavior:

```
result/bin/nix store gc --dry-run -vvvvv --max 0 |& grep "got additional" | wc -l
result-1e822bd4149a8bce1da81ee2ad9404986b07914c/bin/nix store gc --dry-run -vvvvv --max 0 |& grep "got additional" | wc -l
```
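To make the "drop-in" claim concrete, here is a minimal sketch of scanning a blob of text for store-path-like strings with a regex engine chosen at compile time. This is not the code from the patch: the `findStorePaths` helper, the `USE_BOOST_REGEX` macro, and the simplified store-path pattern are all assumptions made for illustration. It only shows that the same iterator-based code can compile against either namespace.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical build switch (not part of Nix): pick the regex engine.
#ifdef USE_BOOST_REGEX
#  include <boost/regex.hpp>
namespace rx = boost;
#else
#  include <regex>
namespace rx = std;
#endif

// Hypothetical helper, not Nix's actual implementation: collect every
// substring of `blob` that looks like a path under /nix/store.
std::vector<std::string> findStorePaths(const std::string & blob)
{
    // Simplified pattern assumed for illustration: 32 lowercase
    // alphanumeric hash characters, a dash, then a package name.
    static const rx::regex storePathRegex(
        R"(/nix/store/[0-9a-z]{32}-[0-9a-zA-Z+._?=-]+)");

    std::vector<std::string> found;
    // regex, sregex_iterator and match_results::str() exist under the
    // same names in both std and boost, so this loop is engine-agnostic.
    for (rx::sregex_iterator it(blob.begin(), blob.end(), storePathRegex), end;
         it != end; ++it)
        found.push_back(it->str());
    return found;
}
```

Compiling with `-DUSE_BOOST_REGEX` would switch the engine without touching the function body; depending on the Boost version, linking against `boost_regex` may also be required.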
Please check that this doesn't cause a dependency on |
Hmm,
I guess this is boost's Should I also modify |
This reduces the closure size relative to master by about 38 MiB (124.4 MiB to 86.6 MiB).

```
$ nix build github:nixos/nix/1e822bd4149a8bce1da81ee2ad9404986b07914c#nix-store --out-link closure-on-master
$ nix build .#nix-store -L --out-link closure-without-icu
$ nix path-info --closure-size -h ./closure-on-master
/nix/store/8gwr38m5h6p7245ji9jv28a2a11w1isx-nix-store-2.29.0pre  124.4 MiB
$ nix path-info --closure-size -h ./closure-without-icu
/nix/store/k0gwfykjqpnmaqbwh23nk55lhanc9g24-nix-store-2.29.0pre   86.6 MiB
```
@edolstra @Mic92 Seems like master already links to |
Changes look good to me.
@@ -63,6 +63,7 @@ scope: {
"--with-coroutine"
"--with-iostreams"
];
enableIcu = false;
In a separate PR, we could probably use disallowedReferences to make sure Nix never references icu.
@xokdvium I wonder if this is also a candidate for builtins.match if it is that much faster. I remember the gcc version of std::regex would also overflow the stack on slightly larger input.
@Mic92 that's the plan. There's been several attempts at this (see #7762) and also Lix did the migration as well (lix-project/lix@447212f). Before we do that though I think we need a good test suite. One possible way to do that is to extract |
…3142 libstore: Use `boost::regex` for GC root discovery (backport #13142)
Motivation
As it turns out using `std::regex` is actually the bottleneck for root discovery. Just substituting `std::` -> `boost::` makes root discovery twice as fast (3x if counting only userspace time).
Some rather ad-hoc measurements to motivate the switch:
(On master)
(After this patch)
`boost::regex` is a drop-in replacement for `std::regex`, but much faster. Doing a simple before/after comparison doesn't surface any change in behavior:
Note
This really doesn't make a dent in the actual runtime of GC itself.
Context
Add 👍 to pull requests you find important.
The Nix maintainer team uses a GitHub project board to schedule and track reviews.