GH-43135: [R] Change the binary type mapping to blob::blob #45595

Merged: 1 commit, Mar 12, 2025

Conversation

eitsupi
Contributor

@eitsupi eitsupi commented Feb 20, 2025

Rationale for this change

Many packages, including nanoarrow, use the blob class from the blob package to represent a vector of binary data.
However, the arrow package did not use blob.
As a result, when arrow converted the Arrow Binary type to R, other packages had difficulty recognizing the result as binary data.

What changes are included in this PR?

The special handling that currently exists for the arrow_binary class is also applied to the blob class.
In addition, arrow_binary, arrow_large_binary, and arrow_fixed_size_binary become subclasses of the blob class.
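
To illustrate why subclassing keeps existing code working, here is a minimal sketch that uses only the blob package and base R. It does not reproduce the code in this PR: the helper `as_legacy_binary()` is hypothetical and simply prepends a legacy class name on top of the classes of a blob vector, which is one way such a subclass relationship can be expressed in S3.

```r
library(blob)

# A blob vector holding two binary values (each element is a raw vector).
x <- blob(as.raw(c(0x01, 0x02)), as.raw(0x03))

# Hypothetical helper (not part of arrow): put a legacy class name in
# front of the classes that blob already sets, making it a subclass of blob.
as_legacy_binary <- function(b, legacy_class = "arrow_binary") {
  stopifnot(inherits(b, "blob"))
  structure(b, class = c(legacy_class, class(b)))
}

y <- as_legacy_binary(x)

# Code that checks for the legacy class still matches ...
inherits(y, "arrow_binary")

# ... and code that only knows about blob matches as well.
inherits(y, "blob")
```

Under this assumption, both `inherits()` checks succeed, which is the property the PR description relies on when it says little user-facing impact is expected.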

Are these changes tested?

Rewrote some existing tests to use blob instead of arrow_binary.

Are there any user-facing changes?

Yes.

However, the existing classes are now subclasses of blob, so little impact is expected.

⚠️ GitHub issue #43135 has been automatically assigned in GitHub to PR creator.

@eitsupi eitsupi force-pushed the r-blob-support branch 2 times, most recently from 8e8dac3 to 1811175 Compare February 20, 2025 16:49
@eitsupi eitsupi marked this pull request as ready for review February 20, 2025 16:49
@pitrou
Member

pitrou commented Mar 6, 2025

@paleolimbot @assignUser Does one of you want to take a look?

@assignUser
Member

Sounds like a good idea to improve compatibility with the larger ecosystem. Especially if nanoarrow already does it.

Not sure if there were reasons to not implement this historically @jonkeane @thisisnic ?

@paleolimbot
Member

> Sounds like a good idea to improve compatibility with the larger ecosystem. Especially if nanoarrow already does it.

+1! I am not aware of a practical reason to differentiate between fixed size/large/regular binary at the R level (which seems to be the reason noted in the issue thread) but perhaps others will add that context.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Mar 7, 2025
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Mar 7, 2025
Member

@jonkeane jonkeane left a comment

Thanks for this

@eitsupi
Contributor Author

eitsupi commented Mar 8, 2025

Thank you. I opened #45709 as a follow-up issue.

Once a release includes this change, we can work on #45709 in the release after that.

@eitsupi
Contributor Author

eitsupi commented Mar 12, 2025

Can anyone merge this? Thanks!

@assignUser assignUser merged commit 713d09b into apache:main Mar 12, 2025
11 checks passed
@assignUser assignUser removed the awaiting merge Awaiting merge label Mar 12, 2025
@eitsupi eitsupi deleted the r-blob-support branch March 13, 2025 00:13
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 713d09b.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them.
