Add a statically dispatched version of `make_comparator` #8814

adriangb · 2025-11-10T03:50:59Z

While working on apache/datafusion#18449, it came up that the dynamic dispatch here can hurt performance for simple types. This version seems to perform better in the benchmarks there (admittedly run alongside other changes).
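For readers unfamiliar with the distinction, here is a minimal sketch of the two dispatch styles using plain slices instead of Arrow arrays. It is not the code in this PR: `DynCmp`, `make_dyn_i32`, and `TypedCmp` are hypothetical names chosen for illustration, and the enum-plus-`match` shape is only one assumed way to avoid the indirect call.

```rust
use std::cmp::Ordering;

/// Dynamic dispatch: the comparator is a boxed closure, so every
/// comparison is an indirect call through a vtable.
/// (Hypothetical alias, not an arrow-rs type.)
pub type DynCmp<'a> = Box<dyn Fn(usize, usize) -> Ordering + 'a>;

/// Hypothetical helper: build a type-erased comparator over two i32 slices.
pub fn make_dyn_i32<'a>(left: &'a [i32], right: &'a [i32]) -> DynCmp<'a> {
    Box::new(move |i, j| left[i].cmp(&right[j]))
}

/// Static dispatch: one enum variant per supported type. The call target
/// is known at compile time; the cost moves into the `match` branch.
/// (Hypothetical type, covering only two types for brevity.)
pub enum TypedCmp<'a> {
    Int32(&'a [i32], &'a [i32]),
    Float64(&'a [f64], &'a [f64]),
}

impl TypedCmp<'_> {
    /// Compare `left[i]` with `right[j]` for whichever type is held.
    #[inline]
    pub fn compare(&self, i: usize, j: usize) -> Ordering {
        match self {
            TypedCmp::Int32(l, r) => l[i].cmp(&r[j]),
            // `total_cmp` gives floats a total order, NaN included.
            TypedCmp::Float64(l, r) => l[i].total_cmp(&r[j]),
        }
    }
}
```

Both forms return the same ordering for the same inputs; the only difference is whether each comparison costs an indirect call or a branch on the enum tag.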


Dandandan · 2025-11-12T10:35:48Z

Would be nice to see the sorting benchmarks on this change! My feeling is it might significantly reduce the overhead of the lexical sorting kernels (compared to row format).

adriangb · 2025-11-12T13:05:36Z

Or it might make it worse… it's a trade-off between indirection and branch prediction. As we found out in apache/datafusion#18449, the same change can be 70% faster on ARM MacBooks and 70% slower on x86.

So, needless to say, this should not be merged without benchmarks.
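As a rough illustration of how such an A/B measurement could be set up without any external crates, the sketch below times a comparator over adjacent index pairs. `time_compares` is a hypothetical helper, not part of arrow-rs or of this PR's benchmarks, and a wall-clock loop like this is far noisier than criterion, so treat it as a sanity check at most.

```rust
use std::cmp::Ordering;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: run `iters` passes over the adjacent pairs
/// (0,1), (1,2), ..., (n-2,n-1) and return the elapsed time together
/// with how many comparisons returned `Less`. The count doubles as a
/// checksum so two comparator implementations can be checked for
/// agreement.
pub fn time_compares<F>(n: usize, iters: usize, cmp: F) -> (Duration, usize)
where
    F: Fn(usize, usize) -> Ordering,
{
    let start = Instant::now();
    let mut less = 0usize;
    for _ in 0..iters {
        for i in 1..n {
            // `black_box` discourages the optimizer from hoisting or
            // eliminating the comparison.
            if black_box(cmp(black_box(i - 1), black_box(i))) == Ordering::Less {
                less += 1;
            }
        }
    }
    (start.elapsed(), less)
}
```

Passing a `Box<dyn Fn(usize, usize) -> Ordering>` exercises the indirect-call path, while passing a plain closure lets the compiler monomorphize and inline it. Running both on each target architecture is what exposes the kind of ARM versus x86 difference described above.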

Update all benchmark names from "comparator ..." to "make_comparator ..." to ensure compatibility with baseline comparisons across branches. This allows criterion to properly compare results between the add-comparator-benchmark and typed-comparator branches using the --save-baseline and --baseline flags.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude

adriangb · 2025-11-15T21:49:04Z

I added benchmarks, and compared to main this seems slower overall:

make_comparator i32 100 rows
                        time:   [3.8138 µs 3.8190 µs 3.8246 µs]
                        change: [+1.2144% +1.3230% +1.4364%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

make_comparator i32 1M rows
                        time:   [3.8120 µs 3.8133 µs 3.8147 µs]
                        change: [+0.8398% +0.9598% +1.0570%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild

make_comparator i32 nulls 50% 100 rows
                        time:   [4.7215 µs 4.7240 µs 4.7272 µs]
                        change: [+15.169% +15.252% +15.353%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  4 (4.00%) high mild
  9 (9.00%) high severe

make_comparator i32 nulls 100% 100 rows
                        time:   [4.7141 µs 4.7168 µs 4.7204 µs]
                        change: [+18.554% +18.690% +18.819%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

make_comparator f64 100 rows
                        time:   [4.2620 µs 4.2640 µs 4.2664 µs]
                        change: [+4.1768% +4.3310% +4.4556%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  9 (9.00%) high severe

make_comparator f64 1M rows
                        time:   [4.2628 µs 4.2637 µs 4.2647 µs]
                        change: [+5.1286% +5.3869% +5.6529%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  8 (8.00%) high severe

make_comparator utf8 100 rows
                        time:   [7.0815 µs 7.0846 µs 7.0880 µs]
                        change: [+13.332% +13.648% +13.927%] (p = 0.00 < 0.05)
                        Performance has regressed.

make_comparator utf8 1M rows
                        time:   [7.6914 µs 7.6936 µs 7.6959 µs]
                        change: [+8.4761% +8.8173% +9.1074%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  9 (9.00%) high mild
  1 (1.00%) high severe

make_comparator utf8view 100 rows
                        time:   [8.0346 µs 8.0509 µs 8.0688 µs]
                        change: [+59.938% +60.188% +60.453%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) high mild
  10 (10.00%) high severe

make_comparator utf8view 1M rows
                        time:   [10.060 µs 10.066 µs 10.073 µs]
                        change: [+60.032% +60.127% +60.223%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

make_comparator dict(u32,utf8) 100 rows
                        time:   [11.618 µs 11.623 µs 11.628 µs]
                        change: [+43.742% +43.911% +44.091%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
  2 (2.00%) high mild
  12 (12.00%) high severe

make_comparator dict(u32,utf8) 1M rows
                        time:   [11.899 µs 11.907 µs 11.918 µs]
                        change: [+51.329% +51.453% +51.603%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  10 (10.00%) high severe

make_comparator list(i32) 100 rows
                        time:   [11.389 µs 11.400 µs 11.415 µs]
                        change: [+69.504% +70.082% +70.679%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  4 (4.00%) high mild
  12 (12.00%) high severe

make_comparator list(i32) 1M rows
                        time:   [11.955 µs 11.969 µs 11.981 µs]
                        change: [+68.186% +68.422% +68.643%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 21 outliers among 100 measurements (21.00%)
  21 (21.00%) high mild

So I'm going to close this for now as a failed experiment.

adriangb added 2 commits November 9, 2025 21:42

Add a statically dispatched comparator for better performance on simple scalar values

f80758c

expose a slim struct, move to constructor

07511cc

github-actions bot added the arrow Changes to the arrow crate label Nov 10, 2025

adriangb mentioned this pull request Nov 10, 2025

Refactor InListExpr to support structs by re-using existing hashing infrastructure apache/datafusion#18449

Open

adriangb added 4 commits November 9, 2025 21:54

fix docs

888669e

fix example

e7c0ea6

replace internal uses, export

b253256

use more internally

dfc8cd3

adriangb force-pushed the typed-comparator branch from dfc8cd3 to 6edfedf Compare November 10, 2025 16:24

adriangb added 2 commits November 15, 2025 15:26

add benchmarks

f15423c

add benchmarks

01a89d7

adriangb force-pushed the typed-comparator branch from 6edfedf to 01a89d7 Compare November 15, 2025 07:35

adriangb and others added 2 commits November 15, 2025 07:39

fmt

f8157bb

adriangb closed this Nov 15, 2025
