Directly reading parquet files in a s3 bucket from the load_dataset method #5566

Open
shamanez opened this issue Feb 22, 2023 · 1 comment
Labels
duplicate This issue or pull request already exists enhancement New feature or request

Comments

@shamanez

Feature request

Right now, we have to download the Parquet file to local storage before loading it. The ability to read it directly from a given bucket address would be beneficial.
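For illustration, a minimal sketch of a possible workaround, not the `load_dataset` feature being requested. It assumes `pandas`, `pyarrow`, `s3fs`, and `datasets` are installed; the function names `split_s3_url` and `load_parquet_from_s3` and the bucket path are hypothetical. It reads the whole file into memory, so it only suits files that fit in RAM.

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split 's3://bucket/path/file.parquet' into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def load_parquet_from_s3(url, storage_options=None):
    """Read one Parquet file from S3 into a datasets.Dataset.

    Hypothetical helper: third-party imports are local so that this
    module can be imported even when they are not installed.
    """
    import pandas as pd
    from datasets import Dataset

    split_s3_url(url)  # fail early on anything that is not an s3:// URL

    # pandas hands s3:// URLs to fsspec/s3fs; credentials come from
    # storage_options or from the usual AWS environment/config files.
    df = pd.read_parquet(url, storage_options=storage_options)
    return Dataset.from_pandas(df)


if __name__ == "__main__":
    # Placeholder bucket and key; replace with a real location.
    ds = load_parquet_from_s3("s3://my-bucket/train/data.parquet")
    print(ds)
```

This avoids a manual copy to local disk, but the data still passes through a pandas DataFrame rather than being streamed or memory-mapped.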

Motivation

In a production setup, this feature would help us a lot, since we would not need to move training data files between storage locations.

Your contribution

I am willing to help if there's any way I can.

@shamanez shamanez added the enhancement New feature or request label Feb 22, 2023
@lhoestq
Member

lhoestq commented Feb 23, 2023

Hi! I think this is in the scope of this other issue: #5281

@lhoestq lhoestq added the duplicate This issue or pull request already exists label Feb 23, 2023

2 participants