Skip to content

API, Core: Use Supplier for FileIO on Scan#15646

Open
nastra wants to merge 2 commits intoapache:mainfrom
nastra:scan-fileio-supplier
Open

API, Core: Use Supplier for FileIO on Scan#15646
nastra wants to merge 2 commits intoapache:mainfrom
nastra:scan-fileio-supplier

Conversation

@nastra
Copy link
Contributor

@nastra nastra commented Mar 16, 2026

This PR refactors the Scan API to return Supplier<FileIO> instead of a direct FileIO instance, enabling lazy/deferred initialization of the FileIO. The motivation is to accommodate cases (e.g., REST scans) where FileIO may not be immediately available at scan construction time.

@Override
public FileIO io() {
return table.io();
public Supplier<FileIO> io() {
Copy link
Contributor Author

@nastra nastra Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdblue looks like this is an API-breaking change unfortunately. This used to be protected FileIO in 1.10.x.
A viable workaround would be to use fileIO() (or some other naming) at the API level and deprecate and not touch io() here. This can be seen in commit a7e24f2

@nastra nastra force-pushed the scan-fileio-supplier branch from e87053c to a7e24f2 Compare March 16, 2026 08:52
@nastra nastra requested a review from rdblue March 16, 2026 09:44
return scanFileIO;
public Supplier<FileIO> fileIO() {
return () -> {
Preconditions.checkState(
Copy link
Contributor Author

@nastra nastra Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdblue we could consider removing this check, wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do feel like there's an argument that we should still keep the check. The Supplier return type indicates to a caller that they need to think about when to invoke get but if the caller does invoke get prematurely, I think it's best to throw rather than return null.

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, thank you @nastra

return scanFileIO;
public Supplier<FileIO> fileIO() {
return () -> {
Preconditions.checkState(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do feel like there's an argument that we should still keep the check. The Supplier return type indicates to a caller that they need to think about when to invoke get but if the caller does invoke get prematurely, I think it's best to throw rather than return null.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants