Introduce a new setting, fuse_parquet_read_batch_size, which controls the batch size used when deserializing Parquet data blocks of fuse tables.
The default value is 8192. In preliminary TPC-H tests, this value performed well.
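To illustrate what a read batch size means in practice, here is a minimal sketch in Python. It is not Databend's implementation: the helper names `iter_batches` and `batch_lengths` are hypothetical, and only the default of 8192 comes from this PR. It shows a block of rows being handed out in fixed-size batches rather than all at once.

```python
from typing import Iterator, List, Sequence

# Default taken from this PR's description; everything else is illustrative.
DEFAULT_BATCH_SIZE = 8192


def iter_batches(rows: Sequence, batch_size: int = DEFAULT_BATCH_SIZE) -> Iterator[Sequence]:
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def batch_lengths(num_rows: int, batch_size: int = DEFAULT_BATCH_SIZE) -> List[int]:
    """Return the size of each batch produced for a block of `num_rows` rows."""
    return [len(batch) for batch in iter_batches(range(num_rows), batch_size)]


if __name__ == "__main__":
    # A block of 20000 rows is split into two full batches and one partial batch.
    print(batch_lengths(20000))
```

Smaller batches keep each deserialized chunk small, while larger batches reduce per-batch overhead. The sketch only shows the splitting and does not model either effect.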
TPCH SF 300 q1:
select l_returnflag, l_linestatus,
sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc, count(*) as count_order
from lineitem
group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus;
Single Query Node
Disk cache enabled
Table lineitem fully cached (hot)
| Version | round 1 | round 2 | round 3 |
|---|---|---|---|
| v1.2.711 | 37.965 s | 38.161 s | 37.933 s |
| this PR | 23.938 s | 23.656 s | 23.845 s |
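As a quick check on the timings reported above, the following sketch averages the three rounds for each version and derives the relative improvement. The variable names are illustrative; the numbers are copied from the results as reported.

```python
from statistics import mean

# Query times in seconds, copied from the reported results.
baseline = [37.965, 38.161, 37.933]  # v1.2.711
patched = [23.938, 23.656, 23.845]   # this PR

baseline_avg = mean(baseline)
patched_avg = mean(patched)

# Speedup factor and percentage reduction in wall-clock time.
speedup = baseline_avg / patched_avg
reduction_pct = (1 - patched_avg / baseline_avg) * 100

if __name__ == "__main__":
    print(f"baseline avg: {baseline_avg:.3f} s")
    print(f"patched avg:  {patched_avg:.3f} s")
    print(f"speedup:      {speedup:.2f}x")         # roughly 1.6x
    print(f"reduction:    {reduction_pct:.1f} %")  # about 37 %
```

These figures cover a single query (q1) on a hot cache with one query node, so they should not be read as a general speedup.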
Tests
Unit Test
Logic Test
Benchmark Test
No Test - use existing tests
Type of change
Bug Fix (non-breaking change which fixes an issue)
New Feature (non-breaking change which adds functionality)
Breaking Change (fix or feature that could cause existing functionality not to work as expected)
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/