Merged
nandahkrishna (Contributor) approved these changes on Mar 24, 2025
🤖 An automated task has requested bottles to be published to this PR.
Created by `brew bump`

Created with `brew bump-formula-pr`.

release notes
`stats` got another round of improvements:

- Boolean inferencing is now configurable. Before, it was limited to a simple, English-centric heuristic: when a column's two distinct values matched `0/1`, `t/f` or `y/n`, case-insensitive, the data type of the column was inferred as boolean. With the `--boolean-patterns <arg>` option, we can now specify arbitrary `true_pattern:false_pattern` pairs. Each pattern can be a string of length > 1 and is case-insensitive. If a pattern ends with "*", it is treated as a prefix. For example, `t*:f*` matches "true", "Truthy" and "T" as boolean true, so long as the corresponding false pattern is also matched (e.g. "Fake", "False", "f") and the cardinality is 2. For backwards compatibility, the default true/false pairs are `1:0,t*:f*,y*:n*`.
- By enabling the `--percentiles` flag, `stats` will now return the 5th, 10th, 40th, 60th, 90th and 95th percentiles by default, using the nearest-rank method, for all numeric and date/datetime columns. A different set of percentiles can be requested with the `--percentile-list <arg>` option. Note that the method for computing quartiles (Method 3) is basically a specialized implementation of the nearest-rank method for q1 (25th), q2 (50th or median) and q3 (75th percentile), hence the choice of non-overlapping defaults for `--percentile-list`.

`frequency` got a performance boost now that we're using `qsv-stats` 0.32.0, which uses the faster `foldhash` crate.

By replacing `ahash` with `foldhash` suite-wide, qsv got a tad faster when doing hash lookups.

`sample`: "streaming" Bernoulli sampling now works for any remotely hosted CSV on a server that supports chunked downloads, without requiring range request support.

Added
- `stats`: add configurable boolean inferencing (dathere/qsv#2595)
- `stats`: add `--percentiles` option (dathere/qsv#2617)

Changed
- replace std `assert_eq!` macro with `similar_asserts::assert_eq!` macro for easier debugging (dathere/qsv#2605)

Fixed
- `luau`: fix flaky register_lookup_table CI test that only intermittently fails in Windows by using buffered writer in lookup `write_cache_file` helper (dathere/qsv@f494b46)
- `sample`: refactor "streaming" Bernoulli sampling, so it actually works without requiring range requests support (dathere/qsv#2600)

Full Changelog: dathere/qsv@3.2.0...3.3.0
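
The `--boolean-patterns` matching rules described in the release notes (case-insensitive patterns, a trailing "*" meaning prefix match, and a cardinality of exactly 2) can be illustrated with a minimal Rust sketch. This is not qsv's code: `Pattern`, `parse_pairs` and `is_boolean_column` are hypothetical names, and details such as whitespace trimming are assumptions made for the example.

```rust
/// One side of a `true_pattern:false_pattern` pair (hypothetical type, not qsv's).
struct Pattern {
    text: String, // lowercased pattern text without the trailing '*'
    prefix: bool, // true when the raw pattern ended with '*'
}

impl Pattern {
    fn parse(raw: &str) -> Pattern {
        let lower = raw.trim().to_lowercase();
        let prefix = lower.ends_with('*');
        let text = lower.trim_end_matches('*').to_string();
        Pattern { text, prefix }
    }

    /// Case-insensitive match: prefix match for `t*`, exact match for `1`.
    fn matches(&self, value: &str) -> bool {
        let v = value.trim().to_lowercase();
        if self.prefix {
            v.starts_with(self.text.as_str())
        } else {
            v == self.text
        }
    }
}

/// Parse a spec such as "1:0,t*:f*,y*:n*" into (true, false) pattern pairs.
/// Entries without a ':' separator are skipped (an assumption of this sketch).
fn parse_pairs(spec: &str) -> Vec<(Pattern, Pattern)> {
    spec.split(',')
        .filter_map(|pair| {
            let (t, f) = pair.split_once(':')?;
            Some((Pattern::parse(t), Pattern::parse(f)))
        })
        .collect()
}

/// A column qualifies only if it has exactly two distinct values and one
/// value matches the true pattern while the other matches the false pattern
/// of the same pair.
fn is_boolean_column(distinct: &[&str], pairs: &[(Pattern, Pattern)]) -> bool {
    if distinct.len() != 2 {
        return false;
    }
    let (a, b) = (distinct[0], distinct[1]);
    pairs.iter().any(|(t, f)| {
        (t.matches(a) && f.matches(b)) || (t.matches(b) && f.matches(a))
    })
}
```

With the default spec `1:0,t*:f*,y*:n*`, a column whose distinct values are "Truthy" and "Fake" qualifies under this sketch, while "true" paired with "no" does not, because the two values must satisfy the same pair.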
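
For readers unfamiliar with the nearest-rank method mentioned for `--percentiles`, here is a minimal Rust sketch of the textbook definition: the p-th percentile of n sorted values is the value at ordinal rank ceil(p/100 × n). The function name `nearest_rank` is hypothetical, and qsv's actual implementation may differ in details such as edge-case handling.

```rust
/// Nearest-rank percentile of an already sorted slice.
/// `p` is an integer percentile in 0..=100.
/// Returns None for empty input or an out-of-range percentile.
fn nearest_rank<T: Copy>(sorted: &[T], p: u32) -> Option<T> {
    if sorted.is_empty() || p > 100 {
        return None;
    }
    let n = sorted.len();
    // Ordinal rank = ceil(p * n / 100), computed in integer arithmetic
    // to avoid floating-point rounding, then clamped to 1..=n so that
    // p = 0 maps to the smallest value.
    let rank = ((p as usize * n + 99) / 100).clamp(1, n);
    Some(sorted[rank - 1])
}
```

For the sorted values 15, 20, 35, 40, 50, this sketch gives 15 for the 5th percentile, 20 for the 40th, 35 for the 60th and 50 for the 95th. Because every result is an actual data value, the same rule also applies naturally to dates.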