Blob search using the glob(Unix style pathname pattern) syntax #40269
Comments
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jalauzon-msft @vincenttran-msft.
Hello @martinResearch. I'm an AI assistant for the azure-sdk-for-python repository. I have some suggestions that you can try out while the team gets back to you.
• Currently, blob listing via the SDK supports filtering by prefix (using the
• A glob feature request would imply adding a new method (e.g.
• Consideration should be given to whether the pattern matching can be delegated to the service. At this time, the service only supports prefix filters, so implementing a true glob syntax may require local processing.
• As a workaround, using existing methods like
• Further discussion on this feature might benefit from clarifying use cases and performance expectations to guide a possible server-side or optimized client-side enhancement.
The team will get back to you shortly, hopefully this helps in the meantime.
Hi @martinResearch, thanks for reaching out. As you mention, for this type of feature to have a real impact on efficiency, this is something that would need to be implemented server-side. You would need to reach out to the service team for that. What did you have in mind for client-side changes that could improve this scenario? Keep in mind, the SDK does not load all Blobs at once and so filtering after the fact is not something we would be likely to do and would fall to the user application.
My current implementation:
I would like to be able to efficiently list all the blobs in a container that match a Unix style pathname pattern, as implemented in the Python glob module.
Describe the solution you'd like
I would like to be able to list blobs using for example
or
We could list all the blobs in dataset and do the filtering locally, but that is very slow when the number of blobs is large and we are interested in selecting only a small subset. Ideally this feature would be implemented on the server side to avoid having to retrieve large lists of files locally, but I guess this would be out of scope for the Python SDK. I implemented my own solution based on the Python glob package but would like this feature to be part of the SDK instead.
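For illustration, the prefix-plus-local-filter workaround discussed in this thread could look roughly like the sketch below. It is not the implementation referred to in this issue and not an SDK feature. The helper names (literal_prefix, glob_to_regex, filter_blob_names, glob_blobs) are hypothetical. The only SDK call assumed is ContainerClient.list_blobs(name_starts_with=...), whose results expose a .name attribute. The sketch handles only *, ? and ** (with ** treated as recursive, like glob with recursive=True); character classes such as [abc] are matched literally.

```python
import re
from typing import Iterable, Iterator

_WILDCARDS = "*?["


def literal_prefix(pattern: str) -> str:
    """Return the part of the pattern before the first wildcard character."""
    for i, ch in enumerate(pattern):
        if ch in _WILDCARDS:
            return pattern[:i]
    return pattern


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simplified glob pattern into a compiled regular expression."""
    i, n, out = 0, len(pattern), []
    while i < n:
        if pattern.startswith("**/", i):
            out.append("(?:[^/]+/)*")  # zero or more virtual directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including "/"
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")  # exactly one character, not "/"
            i += 1
        else:
            out.append(re.escape(pattern[i]))  # literal character
            i += 1
    return re.compile("".join(out))


def filter_blob_names(names: Iterable[str], pattern: str) -> Iterator[str]:
    """Lazily yield only the names that fully match the glob pattern."""
    regex = glob_to_regex(pattern)
    return (name for name in names if regex.fullmatch(name))


def glob_blobs(container_client, pattern: str) -> Iterator[str]:
    """Narrow the listing server-side by prefix, then match the rest locally.

    Assumes container_client is an azure.storage.blob.ContainerClient.
    """
    prefix = literal_prefix(pattern)
    blobs = container_client.list_blobs(name_starts_with=prefix)
    return filter_blob_names((blob.name for blob in blobs), pattern)
```

Only the literal prefix is pushed to the service, so a pattern that starts with a wildcard still lists the whole container. The filtering stays lazy, which fits the point above that the SDK does not load all blobs at once.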