@hhhizzz hhhizzz commented Oct 14, 2025

Which issue does this PR close?

Related to:

Rationale for this change

Improve the performance of ParquetRecordBatchReader, especially when the row selectors are short.

  • By changing a hash map to an enum array
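
To illustrate the idea of swapping a hash map for an enum-indexed array, here is a minimal sketch. It is not the code from this PR: the `Encoding` enum below is a hypothetical three-variant stand-in, and `DecoderTable`, `ENCODING_SLOTS`, and the string "decoders" are names assumed purely for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a small, closed set of encodings.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Encoding {
    Plain = 0,
    Rle = 1,
    DeltaBinaryPacked = 2,
}

/// Number of variants above; must be kept in sync with the enum.
const ENCODING_SLOTS: usize = 3;

/// Array-backed table: one slot per variant, indexed by discriminant.
struct DecoderTable {
    slots: [Option<&'static str>; ENCODING_SLOTS],
}

impl DecoderTable {
    fn new() -> Self {
        Self { slots: [None; ENCODING_SLOTS] }
    }

    fn insert(&mut self, encoding: Encoding, decoder: &'static str) {
        self.slots[encoding as usize] = Some(decoder);
    }

    fn get(&self, encoding: Encoding) -> Option<&'static str> {
        self.slots[encoding as usize]
    }
}

fn main() {
    // Baseline: a hash map hashes the key on every lookup.
    let mut by_map: HashMap<Encoding, &'static str> = HashMap::new();
    by_map.insert(Encoding::Rle, "rle decoder");

    // Alternative: a plain array index, with no hashing involved.
    let mut by_array = DecoderTable::new();
    by_array.insert(Encoding::Rle, "rle decoder");
    by_array.insert(Encoding::DeltaBinaryPacked, "delta decoder");

    // Both containers answer the same question for the same key.
    assert_eq!(by_map.get(&Encoding::Rle).copied(), by_array.get(Encoding::Rle));
    assert_eq!(by_array.get(Encoding::Plain), None);
    println!("{:?}", by_array.get(Encoding::DeltaBinaryPacked));
}
```

The array variant only works because the key set is small and known at compile time; when each lookup happens once per short selection run, avoiding the hash computation is plausibly where the savings come from.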

What changes are included in this PR?

In parquet/src/arrow/array_reader/cached_array_reader.rs, update the hash function.

Are these changes tested?

The hash maps are already covered by existing tests.
I also tested manually by reading Parquet files.

Are there any user-facing changes?

No

Performance results in arrow_reader_row_filter.rs

on my 3950X

Benchmark                                                     Change     Verdict
---------                                                     ------     -------
int64 == 9999 / all_columns / async                           🟢 -1.61%   Improved
int64 == 9999 / all_columns / sync                            🔴 +1.56%   Regressed
int64 == 9999 / exclude_filter_column / async                 🟢 -1.11%   Improved
int64 == 9999 / exclude_filter_column / sync                  ⚪ -0.97%   Within noise
float64 > 99.0 / all_columns / async                          🟢 -6.25%   Improved
float64 > 99.0 / all_columns / sync                           🟢 -11.24%  Improved
float64 > 99.0 / exclude_filter_column / async                🟢 -11.10%  Improved
float64 > 99.0 / exclude_filter_column / sync                 🟢 -3.31%   Improved
ts ≥ 9000 / all_columns / async                               🔴 +2.77%   Regressed
ts ≥ 9000 / all_columns / sync                                ⚪ -0.06%   Within noise
ts ≥ 9000 / exclude_filter_column / async                     🟢 -2.54%   Improved
ts ≥ 9000 / exclude_filter_column / sync                      ⚪ +0.28%   Within noise
int64 > 90 / all_columns / async                              🟢 -14.68%  Improved
int64 > 90 / all_columns / sync                               🟢 -21.00%  Improved
int64 > 90 / exclude_filter_column / async                    🟢 -17.66%  Improved
int64 > 90 / exclude_filter_column / sync                     🟢 -14.53%  Improved
float64 ≤ 99.0 / all_columns / async                          🟢 -9.20%   Improved
float64 ≤ 99.0 / all_columns / sync                           🟢 -11.07%  Improved
float64 ≤ 99.0 / exclude_filter_column / async                🟢 -10.01%  Improved
float64 ≤ 99.0 / exclude_filter_column / sync                 🟢 -11.80%  Improved
ts < 9000 / all_columns / async                               🟢 -3.43%   Improved
ts < 9000 / all_columns / sync                                🟢 -6.23%   Improved
ts < 9000 / exclude_filter_column / async                     🟢 -4.00%   Improved
ts < 9000 / exclude_filter_column / sync                      🟢 -3.91%   Improved
utf8View <> '' / all_columns / async                          🟢 -16.56%  Improved
utf8View <> '' / all_columns / sync                           🟢 -12.10%  Improved
utf8View <> '' / exclude_filter_column / async                🟢 -13.00%  Improved
utf8View <> '' / exclude_filter_column / sync                 🟢 -17.29%  Improved
float64 > 99.0 AND ts ≥ 9000 / all_columns / async            🔴 +3.51%   Regressed
float64 > 99.0 AND ts ≥ 9000 / all_columns / sync             🟢 -2.19%   Improved
float64 > 99.0 AND ts ≥ 9000 / exclude_filter_column / async  🟢 -2.63%   Improved
float64 > 99.0 AND ts ≥ 9000 / exclude_filter_column / sync   🟢 -2.72%   Improved

@github-actions github-actions bot added the parquet Changes to the parquet crate label Oct 14, 2025
@hhhizzz hhhizzz marked this pull request as ready for review October 14, 2025 15:44
Contributor

alamb commented Oct 15, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1016-gcp #17~24.04.1-Ubuntu SMP Wed Sep 3 01:55:36 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing row-selector-optimize (9e7cb15) to 597c903 diff
BENCH_NAME=arrow_reader_row_filter
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_row_filter
BENCH_FILTER=
BENCH_BRANCH_NAME=row-selector-optimize
Results will be posted here when complete

alamb commented Oct 15, 2025

Thank you for this contribution @hhhizzz . I have kicked off the benchmarks for this PR as well

alamb commented Oct 15, 2025

🤖: Benchmark completed

Details

group                                                                                main                                   row-selector-optimize
-----                                                                                ----                                   ---------------------
arrow_reader_row_filter/float64 <= 99.0/all_columns/async                            1.05  1775.8±12.53µs        ? ?/sec    1.00   1687.2±6.43µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/all_columns/sync                             1.05      2.1±0.01ms        ? ?/sec    1.00   1984.4±3.98µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/async                  1.06   1622.7±7.10µs        ? ?/sec    1.00  1525.4±12.77µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/sync                   1.05  1698.7±11.14µs        ? ?/sec    1.00   1620.1±8.04µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/async              1.00   1520.2±7.01µs        ? ?/sec    1.00  1513.2±17.67µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/sync               1.01  1887.8±12.15µs        ? ?/sec    1.00  1868.3±16.78µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/async    1.01   1349.8±5.71µs        ? ?/sec    1.00   1342.5±9.46µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/sync     1.00  1470.5±11.83µs        ? ?/sec    1.01  1484.5±14.65µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/async                             1.05   1766.1±4.18µs        ? ?/sec    1.00   1682.0±7.40µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/sync                              1.05      2.1±0.02ms        ? ?/sec    1.00  1990.1±15.61µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/async                   1.05   1610.9±9.95µs        ? ?/sec    1.00  1532.6±12.62µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/sync                    1.05  1702.8±17.27µs        ? ?/sec    1.00  1616.5±12.88µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/async                              1.00    925.1±4.74µs        ? ?/sec    1.00    923.3±7.12µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/sync                               1.00  1000.1±20.11µs        ? ?/sec    1.01   1005.4±3.70µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/async                    1.01    875.2±4.34µs        ? ?/sec    1.00    865.6±9.20µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/sync                     1.00    982.5±6.17µs        ? ?/sec    1.00    979.9±7.92µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/async                                 1.19      4.7±0.02ms        ? ?/sec    1.00      3.9±0.02ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/sync                                  1.24      4.8±0.02ms        ? ?/sec    1.00      3.9±0.01ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/async                       1.19      4.1±0.01ms        ? ?/sec    1.00      3.4±0.01ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync                        1.19      4.0±0.01ms        ? ?/sec    1.00      3.3±0.01ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/async                                  1.00  1937.9±11.30µs        ? ?/sec    1.02   1980.9±6.84µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/sync                                   1.00      2.2±0.01ms        ? ?/sec    1.01      2.2±0.01ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/async                        1.00   1761.5±5.54µs        ? ?/sec    1.03   1814.6±6.73µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/sync                         1.00  1886.4±15.21µs        ? ?/sec    1.02   1926.0±6.85µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/async                                 1.00  1264.1±14.24µs        ? ?/sec    1.00   1257.9±5.79µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/sync                                  1.02  1429.6±10.56µs        ? ?/sec    1.00  1406.8±12.70µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/async                       1.00   1136.9±5.76µs        ? ?/sec    1.00   1133.8±9.86µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/sync                        1.01   1277.9±9.51µs        ? ?/sec    1.00   1265.9±7.01µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/async                             1.16      4.8±0.02ms        ? ?/sec    1.00      4.2±0.01ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/sync                              1.14      5.5±0.02ms        ? ?/sec    1.00      4.8±0.02ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/async                   1.18      4.2±0.02ms        ? ?/sec    1.00      3.5±0.02ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/sync                    1.18      4.1±0.02ms        ? ?/sec    1.00      3.5±0.01ms        ? ?/sec
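
As a reading aid for the table above: the number printed before each timing appears to be that timing divided by the fastest timing in the row, so the faster side shows 1.00. The sketch below reproduces that arithmetic for the first row under this assumption; `ratio_to_fastest` and `percent_change` are hypothetical helper names, not part of the benchmark tooling.

```rust
/// Divide every timing by the smallest one, so the fastest entry maps to 1.0.
/// (Assumed interpretation of the ratio column in the output above.)
fn ratio_to_fastest(times_us: &[f64]) -> Vec<f64> {
    let fastest = times_us.iter().copied().fold(f64::INFINITY, f64::min);
    times_us.iter().map(|t| t / fastest).collect()
}

/// Relative change of `new` against `base`, in percent (negative means faster).
fn percent_change(base: f64, new: f64) -> f64 {
    (new - base) / base * 100.0
}

fn main() {
    // First row of the table: main vs. row-selector-optimize, in microseconds.
    let main_us = 1775.8;
    let branch_us = 1687.2;

    let ratios = ratio_to_fastest(&[main_us, branch_us]);
    println!("main: {:.2}, branch: {:.2}", ratios[0], ratios[1]);
    println!("change: {:.2}%", percent_change(main_us, branch_us));
}
```

Rows reported in milliseconds are rounded to one decimal place in the output, so recomputing their ratios from the displayed values will not match exactly.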

@alamb alamb left a comment

Thank you so much @hhhizzz -- this is a really nice contribution and very nice performance results.

I had a few comments and I would like to get another pair of eyes on the packed decoder changes, but otherwise 👍 from me

Thanks again

alamb commented Oct 19, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing row-selector-optimize (6899355) to d49f017 diff
BENCH_NAME=arrow_reader_row_filter
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_row_filter
BENCH_FILTER=
BENCH_BRANCH_NAME=row-selector-optimize
Results will be posted here when complete

@alamb alamb left a comment

Thank you @hhhizzz -- assuming the benchmarks are still good this one looks good to go from my perspective

This is a nice find

Self(mask)
}

/// Mark the given [`Encoding`] as present in this mask.

👍
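
For readers without the diff open, the reviewed fragment (`Self(mask)` and a method that marks an `Encoding` as present) suggests a bitmask with one bit per encoding. The following is a hedged sketch of that pattern, not the PR's actual type: `EncodingBits` and the three-variant `Encoding` enum are hypothetical names chosen for illustration.

```rust
/// Hypothetical stand-in for a small set of encodings.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Encoding {
    Plain = 0,
    Rle = 1,
    DeltaBinaryPacked = 2,
}

/// One bit per encoding variant, packed into a single integer.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct EncodingBits(u32);

impl EncodingBits {
    /// Build a mask from a list of encodings.
    fn from_encodings(encodings: &[Encoding]) -> Self {
        let mut mask = 0u32;
        for &encoding in encodings {
            mask |= 1u32 << (encoding as u32);
        }
        Self(mask)
    }

    /// Mark the given encoding as present in this mask.
    fn insert(&mut self, encoding: Encoding) {
        self.0 |= 1u32 << (encoding as u32);
    }

    /// Check whether the given encoding is present.
    fn contains(self, encoding: Encoding) -> bool {
        self.0 & (1u32 << (encoding as u32)) != 0
    }
}

fn main() {
    let mut bits = EncodingBits::from_encodings(&[Encoding::Plain, Encoding::Rle]);
    assert!(bits.contains(Encoding::Rle));
    assert!(!bits.contains(Encoding::DeltaBinaryPacked));

    bits.insert(Encoding::DeltaBinaryPacked);
    assert!(bits.contains(Encoding::DeltaBinaryPacked));
    println!("{:#05b}", bits.0);
}
```

A membership test on such a mask is a shift and an AND, which is cheaper than hashing a key into a set.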

alamb commented Oct 19, 2025

🤖: Benchmark completed

Details

group                                                                                main                                   row-selector-optimize
-----                                                                                ----                                   ---------------------
arrow_reader_row_filter/float64 <= 99.0/all_columns/async                            1.04  1773.4±14.44µs        ? ?/sec    1.00  1706.6±10.25µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/all_columns/sync                             1.07      2.1±0.02ms        ? ?/sec    1.00  1974.0±17.06µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/async                  1.03   1615.7±7.39µs        ? ?/sec    1.00   1562.1±8.56µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/sync                   1.04   1720.1±8.68µs        ? ?/sec    1.00  1650.0±17.51µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/async              1.01  1534.0±11.80µs        ? ?/sec    1.00   1515.2±8.12µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/sync               1.01  1876.8±18.23µs        ? ?/sec    1.00  1851.2±14.98µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/async    1.01   1361.2±9.96µs        ? ?/sec    1.00   1345.9±6.01µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/sync     1.02  1477.1±11.55µs        ? ?/sec    1.00   1450.5±9.94µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/async                             1.05  1775.8±18.45µs        ? ?/sec    1.00   1693.6±7.72µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/sync                              1.06      2.1±0.01ms        ? ?/sec    1.00   1980.2±7.16µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/async                   1.05   1621.2±9.68µs        ? ?/sec    1.00  1544.6±10.21µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/sync                    1.04  1708.0±12.13µs        ? ?/sec    1.00  1639.4±20.48µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/async                              1.02    938.5±8.04µs        ? ?/sec    1.00    920.6±6.65µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/sync                               1.00    978.0±6.92µs        ? ?/sec    1.00    975.4±6.67µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/async                    1.00    829.7±6.88µs        ? ?/sec    1.01    841.5±5.16µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/sync                     1.00    967.6±9.84µs        ? ?/sec    1.00    969.4±9.29µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/async                                 1.12      4.6±0.03ms        ? ?/sec    1.00      4.1±0.02ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/sync                                  1.19      4.8±0.02ms        ? ?/sec    1.00      4.1±0.01ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/async                       1.13      4.1±0.02ms        ? ?/sec    1.00      3.6±0.01ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync                        1.16      4.0±0.02ms        ? ?/sec    1.00      3.4±0.01ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/async                                  1.01  1969.7±11.55µs        ? ?/sec    1.00  1943.5±10.70µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/sync                                   1.01      2.2±0.01ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/async                        1.02  1826.2±13.23µs        ? ?/sec    1.00   1789.3±8.26µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/sync                         1.01  1930.8±13.56µs        ? ?/sec    1.00  1916.9±10.24µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/async                                 1.00   1257.9±9.08µs        ? ?/sec    1.00  1255.0±12.80µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/sync                                  1.01   1399.8±8.60µs        ? ?/sec    1.00  1379.5±13.03µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/async                       1.01  1148.4±20.98µs        ? ?/sec    1.00   1133.4±5.81µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/sync                        1.03  1288.5±10.09µs        ? ?/sec    1.00   1254.7±9.26µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/async                             1.13      4.9±0.02ms        ? ?/sec    1.00      4.3±0.03ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/sync                              1.13      5.6±0.02ms        ? ?/sec    1.00      5.0±0.02ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/async                   1.15      4.2±0.01ms        ? ?/sec    1.00      3.7±0.01ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/sync                    1.17      4.1±0.02ms        ? ?/sec    1.00      3.5±0.01ms        ? ?/sec
