
perf(EntryMode::extract_from_bytes): add happy path check #2461

Merged
Sebastian Thiel (Byron) merged 1 commit into GitoxideLabs:main from datdenkikniet:mini-optimize
Mar 22, 2026

Conversation

@datdenkikniet
Contributor

@datdenkikniet datdenkikniet commented Mar 7, 2026

Since the position of the space in the entry mode is often 6, we can add an explicit check for this case and skip some of the operations performed in the loop, making the benchmark a little faster.
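To illustrate the idea (this is a hypothetical sketch with made-up names, not the actual `extract_from_bytes` code): a git tree entry starts with an ASCII-octal mode followed by a space, e.g. `100644 name\0<oid>`. Blob modes are six digits long, so the space usually sits at index 6, and checking that index first can skip the general scan:

```rust
/// Hypothetical sketch: return the index of the space that ends the mode.
/// Happy path first: six-digit modes ("100644", "100755", "120000", "160000")
/// put the space at index 6. Trees use the five-digit "40000", so we fall
/// back to a plain scan when the fast check fails.
fn mode_end(bytes: &[u8]) -> Option<usize> {
    if bytes.len() > 6 && bytes[6] == b' ' && bytes[..6].iter().all(u8::is_ascii_digit) {
        return Some(6);
    }
    // Slow path: scan for the space byte.
    bytes.iter().position(|&b| b == b' ')
}
```

The fast path does one bounds check, one byte comparison, and a short digit validation, rather than running the digit-accumulating loop byte by byte.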

Benches (cargo bench --bench decode-objects -- TreeRef) when compared to main:

Current benchmark tree
TreeRef()               time:   [91.447 ns 91.539 ns 91.631 ns]
                        change: [−4.4580% −4.3152% −4.1764%] (p = 0.00 < 0.05)
                        Performance has improved.

TreeRefIter()           time:   [34.566 ns 34.611 ns 34.661 ns]
                        change: [−15.910% −15.735% −15.567%] (p = 0.00 < 0.05)
                        Performance has improved.

Improvement is more marginal (but still present) with a less artificial tree:

Current HEAD^{tree}
TreeRef()               time:   [1.0033 µs 1.0041 µs 1.0050 µs]
                        change: [−9.9484% −9.7732% −9.5428%] (p = 0.00 < 0.05)
                        Performance has improved.

TreeRefIter()           time:   [899.42 ns 899.95 ns 900.56 ns]
                        change: [−4.7541% −4.5380% −4.3125%] (p = 0.00 < 0.05)
                        Performance has improved.

Obviously, the usefulness of this change hinges on two things: whether index 6 being the space really is the happy path (from what I can find on the internet, that does seem to be the default case), and whether this micro-optimization is worth the increased code complexity.

Additionally, we can skip some subtraction and bitwise operations if the octal value is computed immediately as the bytes are scanned, which saves a few cycles.
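A minimal sketch of that second idea (again with illustrative names, not the gitoxide source): instead of first locating the space and then converting the digit slice, accumulate the octal value in the same pass over the bytes.

```rust
/// Hypothetical sketch: parse the octal mode and return (mode, index of the
/// terminating space) in a single pass, folding the digit-to-value
/// conversion into the scan instead of doing it as a separate step.
fn parse_mode(bytes: &[u8]) -> Option<(u16, usize)> {
    let mut mode = 0u16;
    for (i, &b) in bytes.iter().enumerate() {
        match b {
            // Accumulate: shift by one octal digit and add the new one.
            b'0'..=b'7' => mode = (mode << 3) + (b - b'0') as u16,
            // A space after at least one digit ends the mode.
            b' ' if i > 0 => return Some((mode, i)),
            _ => return None,
        }
    }
    None
}
```

All git tree modes fit in a `u16` (the largest, `0o160000`, is below `u16::MAX`), so no wider accumulator is needed.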

Note: the following describes a no-longer-relevant commit (it only improved performance on the benchmark, but likely not on usual workloads).

The 2nd improvement (which is independent of the first) is the use of iter().position() instead of ByteSlice::find_byte in decode::fast_entry. It yielded the following improvements for me (compared to only the happy-path fix):

TreeRef()               time:   [84.198 ns 84.299 ns 84.405 ns]
                        change: [−13.030% −12.865% −12.676%] (p = 0.00 < 0.05)
                        Performance has improved.

TreeRefIter()           time:   [26.710 ns 26.887 ns 27.067 ns]
                        change: [−35.780% −35.469% −35.121%] (p = 0.00 < 0.05)
                        Performance has improved.

This large a speedup was a little unexpected. As indicated in the commit message, my initial guess was that we had been blocking the compiler from optimizing/vectorizing for us, but looking at the output in Compiler Explorer does not actually support this theory. I'm not entirely sure what TREE looks like, but perhaps this is just a false positive: the names used in the benchmark are too small to benefit from the memchr implementation that find_byte uses, so a basic loop (which is what iter().position() compiles to) is faster.
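The trade-off described above can be shown in isolation (a sketch, not the decode::fast_entry code; bstr's ByteSlice::find_byte delegates to the memchr crate, which has per-call setup cost that only pays off on longer haystacks):

```rust
/// Hypothetical stand-in for the iter().position() variant: a plain linear
/// scan with no SIMD setup, which the compiler lowers to a simple byte loop.
/// For the short name fields in a tree entry this can beat memchr-backed
/// searches, even though memchr wins on long haystacks.
fn simple_find(haystack: &[u8], needle: u8) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}
```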


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b50a7e923


@datdenkikniet force-pushed the mini-optimize branch 3 times, most recently from c889fcd to 4931a1e on March 7, 2026 at 10:40
Since the position of the space in the entrymode is often
(always?) 6, we can add an explicit check for this case
and skip some of the operations performed in the loop,
making the benchmark a little faster.
Member

@Byron Sebastian Thiel (Byron) left a comment


Thanks so much, this is a massive 'relative' improvement visible particularly in the iterator case. And that will absolutely benefit the tree-lookup.

Gnuplot not found, using plotters backend
TreeRef()               time:   [60.083 ns 60.294 ns 60.504 ns]
                        change: [−3.3324% −2.9038% −2.4809%] (p = 0.00 < 0.05)
                        Performance has improved.

TreeRefIter()           time:   [19.623 ns 19.917 ns 20.246 ns]
                        change: [−42.220% −41.582% −40.877%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  10 (10.00%) high severe

The above was tested against main with pre-allocation improvements already merged.

In any case, I think it's well worth the added complexity - this code is performance critical.

@Byron Sebastian Thiel (Byron) merged commit 6abbe82 into GitoxideLabs:main Mar 22, 2026
55 of 58 checks passed
