Allow Users to Provide Custom ArrayFormatters when Pretty-Printing Record Batches
#8829
Open: tobixdev wants to merge 11 commits into apache:main from tobixdev:custom-formatters
+490
−18
Changes from 4 commits
Commits
4ff1f3f Draft for implementing custom ArrayFormatters (tobixdev)
47f00a7 Improve custom pretty printing (tobixdev)
37bac6f Add sanity check for number of columns (tobixdev)
1e8f101 Formatting (tobixdev)
fcc6478 Minor fixes (tobixdev)
0599d17 Use accessors in FormatOptions (tobixdev)
6882394 Move ArrayFormatterFactory into FormatOptions (tobixdev)
d186741 Move ArrayFormatterFactory to display module to avoid issues with fea… (tobixdev)
a6aa5ff Fix error in equals/hash contract in FormatOptions (tobixdev)
a8934ef Add number of fields and columns to error message (tobixdev)
017d8dc Merge branch 'main' into custom-formatters (tobixdev)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -57,23 +57,23 @@ pub enum DurationFormat { | |
| pub struct FormatOptions<'a> { | ||
| /// If set to `true` any formatting errors will be written to the output | ||
| /// instead of being converted into a [`std::fmt::Error`] | ||
| safe: bool, | ||
| pub safe: bool, | ||
| /// Format string for nulls | ||
| null: &'a str, | ||
| pub null: &'a str, | ||
| /// Date format for date arrays | ||
| date_format: TimeFormat<'a>, | ||
| pub date_format: TimeFormat<'a>, | ||
| /// Format for DateTime arrays | ||
| datetime_format: TimeFormat<'a>, | ||
| pub datetime_format: TimeFormat<'a>, | ||
| /// Timestamp format for timestamp arrays | ||
| timestamp_format: TimeFormat<'a>, | ||
| pub timestamp_format: TimeFormat<'a>, | ||
| /// Timestamp format for timestamp with timezone arrays | ||
| timestamp_tz_format: TimeFormat<'a>, | ||
| pub timestamp_tz_format: TimeFormat<'a>, | ||
| /// Time format for time arrays | ||
| time_format: TimeFormat<'a>, | ||
| pub time_format: TimeFormat<'a>, | ||
| /// Duration format | ||
| duration_format: DurationFormat, | ||
| pub duration_format: DurationFormat, | ||
| /// Show types in visual representation batches | ||
| types_info: bool, | ||
| pub types_info: bool, | ||
| } | ||
|
|
||
| impl Default for FormatOptions<'_> { | ||
|
|
@@ -170,6 +170,10 @@ impl<'a> FormatOptions<'a> { | |
| } | ||
|
|
||
| /// Returns true if type info should be included in visual representation of batches | ||
| #[deprecated( | ||
| since = "58.0.0", | ||
| note = "Directly access the `types_info` field instead." | ||
| )] | ||
| pub const fn types_info(&self) -> bool { | ||
| self.types_info | ||
| } | ||
|
|
@@ -272,14 +276,16 @@ pub struct ArrayFormatter<'a> { | |
| } | ||
|
|
||
| impl<'a> ArrayFormatter<'a> { | ||
| /// Returns an [`ArrayFormatter`] using the provided formatter. | ||
| pub fn new(format: Box<dyn DisplayIndex + 'a>, safe: bool) -> Self { | ||
| Self { format, safe } | ||
| } | ||
|
|
||
| /// Returns an [`ArrayFormatter`] that can be used to format `array` | ||
| /// | ||
| /// This returns an error if an array of the given data type cannot be formatted | ||
| pub fn try_new(array: &'a dyn Array, options: &FormatOptions<'a>) -> Result<Self, ArrowError> { | ||
| Ok(Self { | ||
| format: make_formatter(array, options)?, | ||
| safe: options.safe, | ||
| }) | ||
| Ok(Self::new(make_formatter(array, options)?, options.safe)) | ||
| } | ||
|
|
||
| /// Returns a [`ValueFormatter`] that implements [`Display`] for | ||
|
|
@@ -332,12 +338,15 @@ fn make_formatter<'a>( | |
| } | ||
|
|
||
| /// Either an [`ArrowError`] or [`std::fmt::Error`] | ||
| enum FormatError { | ||
| pub enum FormatError { | ||
| /// An error occurred while formatting the array | ||
| Format(std::fmt::Error), | ||
| /// An Arrow error occurred while formatting the array. | ||
| Arrow(ArrowError), | ||
| } | ||
|
|
||
| type FormatResult = Result<(), FormatError>; | ||
| /// The result of formatting an array element via [`DisplayIndex::write`]. | ||
| pub type FormatResult = Result<(), FormatError>; | ||
|
|
||
| impl From<std::fmt::Error> for FormatError { | ||
| fn from(value: std::fmt::Error) -> Self { | ||
|
|
@@ -352,7 +361,8 @@ impl From<ArrowError> for FormatError { | |
| } | ||
|
|
||
| /// [`Display`] but accepting an index | ||
| trait DisplayIndex { | ||
| pub trait DisplayIndex { | ||
| /// Write the value of the underlying array at `idx` to `f`. | ||
| fn write(&self, idx: usize, f: &mut dyn Write) -> FormatResult; | ||
|
Contributor
Author
Maybe we should add a |
||
| } | ||
|
|
||
|
|
||
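To illustrate what making `DisplayIndex`, `FormatError`, and `FormatResult` public allows, here is a minimal sketch of a user-defined formatter. It is self-contained: the trait, error enum, and result alias below are local stand-ins that mirror the shapes shown in the diff, not the arrow-rs items themselves. `MoneyFormatter` and `render` are hypothetical names invented for this example.

```rust
use std::fmt::Write;

// Local stand-ins mirroring the shapes in the diff above.
// Assumption: these are NOT the real arrow-rs types, only look-alikes.
#[derive(Debug)]
pub enum FormatError {
    Format(std::fmt::Error),
}

impl From<std::fmt::Error> for FormatError {
    fn from(value: std::fmt::Error) -> Self {
        FormatError::Format(value)
    }
}

pub type FormatResult = Result<(), FormatError>;

/// `Display`, but accepting an index.
pub trait DisplayIndex {
    /// Write the value at `idx` to `f`.
    fn write(&self, idx: usize, f: &mut dyn Write) -> FormatResult;
}

/// Hypothetical custom formatter: renders a column of cents as dollars.
pub struct MoneyFormatter<'a> {
    cents: &'a [Option<u64>],
    null: &'a str,
}

impl<'a> MoneyFormatter<'a> {
    pub fn new(cents: &'a [Option<u64>], null: &'a str) -> Self {
        Self { cents, null }
    }
}

impl DisplayIndex for MoneyFormatter<'_> {
    fn write(&self, idx: usize, f: &mut dyn Write) -> FormatResult {
        match self.cents[idx] {
            // `?` converts `std::fmt::Error` into `FormatError` via `From`.
            Some(c) => write!(f, "${}.{:02}", c / 100, c % 100)?,
            None => f.write_str(self.null)?,
        }
        Ok(())
    }
}

/// Hypothetical helper: formats the first `len` values into strings.
pub fn render(formatter: &dyn DisplayIndex, len: usize) -> Result<Vec<String>, FormatError> {
    let mut out = Vec::with_capacity(len);
    for idx in 0..len {
        let mut cell = String::new();
        formatter.write(idx, &mut cell)?;
        out.push(cell);
    }
    Ok(out)
}
```

In the PR itself, a boxed implementation of the real trait would presumably be handed to `ArrayFormatter::new(format, safe)`, whose signature appears in the diff; the sketch stops short of that so it does not depend on the arrow crates.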
Rather than making these public, what about adding accessors for them? I think that would make it easier to change the underlying implementation in the future without causing breaking API changes.
I think making all these fields `pub` means people can construct format options explicitly with a struct literal, so adding any new field to the struct will be a breaking API change.
If we keep them as private fields, then we can add new fields without breaking existing users.
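The compatibility concern in this comment can be sketched with two hypothetical `Options` structs. Neither is the real `FormatOptions`; the module names, the `with_*` builder methods, and the two `build_*` functions are assumptions made up for illustration.

```rust
mod public_fields {
    /// Hypothetical options struct whose fields are all `pub`.
    #[derive(Debug, Default)]
    pub struct Options<'a> {
        pub safe: bool,
        pub null: &'a str,
    }
}

mod private_fields {
    /// Hypothetical options struct with private fields and accessors.
    #[derive(Debug, Default)]
    pub struct Options<'a> {
        safe: bool,
        null: &'a str,
    }

    impl<'a> Options<'a> {
        pub fn new() -> Self {
            Self::default()
        }

        pub fn with_safe(mut self, safe: bool) -> Self {
            self.safe = safe;
            self
        }

        pub fn with_null(mut self, null: &'a str) -> Self {
            self.null = null;
            self
        }

        pub fn safe(&self) -> bool {
            self.safe
        }

        pub fn null(&self) -> &'a str {
            self.null
        }
    }
}

/// Downstream code using an exhaustive struct literal.
/// If a third field is added to `public_fields::Options`, this function
/// stops compiling because the literal no longer names every field.
pub fn build_with_literal() -> public_fields::Options<'static> {
    public_fields::Options {
        safe: true,
        null: "NULL",
    }
}

/// Downstream code using the builder and accessors.
/// Adding a new private field with a default value does not affect it.
pub fn build_with_builder() -> private_fields::Options<'static> {
    private_fields::Options::new().with_safe(true).with_null("NULL")
}
```

With public fields, callers can soften the problem by writing `..Default::default()` in their literals, but the library cannot force them to, which is why private fields with accessors keep the struct free to grow.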