GH-43683: [Python] Use pandas StringDtype when enabled (pandas 3+) #44195
I am curious how categories were interpreted before the new string type was inferred. Was this just not taken into account on the Arrow side?
If `field.name in categories` is true, the user asked to convert this column to a categorical dtype on the pandas side. The C++ side handles this by dictionary-encoding the column, so in this case we don't have to specify any custom pandas extension dtype here: our conversion layer will then convert that dictionary-encoded column to a pandas categorical.
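To make the branching described above concrete, here is a minimal pure-Python sketch. It is not the code from this PR: the function `pick_extension_dtype`, its parameters, and the `"str"` label are hypothetical names chosen for illustration. It only models the decision "requested as categorical means no extension dtype; otherwise a string column may get the pandas string dtype when that option is enabled".

```python
from typing import Iterable, Optional


def pick_extension_dtype(
    field_name: str,
    is_string: bool,
    categories: Iterable[str],
    string_dtype_enabled: bool,
) -> Optional[str]:
    """Hypothetical helper: return a label for the pandas extension dtype
    to use for a column, or None when no extension dtype should be set."""
    if field_name in set(categories):
        # The user requested a categorical for this column. Per the comment
        # above, the column is dictionary-encoded elsewhere and later turned
        # into a pandas categorical, so no extension dtype is chosen here.
        return None
    if is_string and string_dtype_enabled:
        # Placeholder label standing in for the pandas string dtype.
        return "str"
    return None


# A string column requested as categorical gets no extension dtype.
print(pick_extension_dtype("city", True, ["city"], True))
# A string column not in `categories` gets the string dtype label.
print(pick_extension_dtype("name", True, ["city"], True))
```

From the user's side, the `categories` list corresponds to the `categories` argument of `pyarrow.Table.to_pandas`, for example `table.to_pandas(categories=["city"])`.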