Support for blobs synchronization #40271
Comments
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jalauzon-msft @vincenttran-msft.
Hello @martinResearch. I'm an AI assistant for the azure-sdk-for-python repository. I have some suggestions that you can try out while the team gets back to you.
• This feature request targets adding a built-in method to synchronize local files and Azure blobs, similar to the functionality provided by azcopy sync (azcopy sync documentation).
• Currently, the Python SDK does not include a native synchronization method that compares last modification times or MD5 hashes out of the box. We recommend using custom implementations derived from ContainerClient as an interim solution.
• As a future enhancement, consolidating such a sync method into the SDK could improve integration into Python workflows. Please consider providing detailed use cases and scenarios in your feature request to help prioritize its development.
The team will get back to you shortly; hopefully this helps in the meantime.
Hi @martinResearch, thanks for reaching out. Unfortunately, while we do see the value in this type of feature, to be perfectly honest, this is not something we are likely to add to the SDK anytime soon. A sync function such as this is a very complicated feature that even the AzCopy team still encounters issues with, especially given it needs to be highly generalized to support many different scenarios. We simply don't have the engineering resources to implement/maintain such a feature at this time. Additionally, if we do ever introduce something like this, it would likely be in a separate package. If you have an implementation that works for you, I recommend you continue to use that.
My current implementation:
"""ContainerClientExtended - a helper class for accessing Azure Blob Storage.
ContainerClientExtended is a wrapper around azure.storage.blob.ContainerClient which
from concurrent.futures import ThreadPoolExecutor
from azure.storage.blob import BlobProperties, ContainerClient
class ContainerClientExtended(ContainerClient):
Is your feature request related to a problem? Please describe.
I have many blobs in a blob storage folder and would like to synchronize them with local copies, i.e. download blobs that are missing locally or that have been modified in Azure storage since the last synchronization, based on the last modification date and/or the MD5 hash.
azcopy supports that feature through azcopy sync:
https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-blobs-synchronize
It would be great to have the same feature in the Python SDK for better integration with the Python ecosystem.
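For illustration, the "missing locally or modified since the last synchronization" check described above could be sketched as follows. This is a minimal sketch using only the standard library, not the SDK's or azcopy's actual logic. The names `file_md5` and `needs_download` are hypothetical, and it assumes the caller supplies the blob's timezone-aware last-modified time and, optionally, its raw MD5 digest (in the SDK these are exposed on `BlobProperties` as `last_modified` and `content_settings.content_md5`, the latter of which may be unset).

```python
import hashlib
import os
from datetime import datetime, timezone
from typing import Optional


def file_md5(path: str, chunk_size: int = 1 << 20) -> bytes:
    """Return the raw MD5 digest of a local file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.digest()


def needs_download(
    local_path: str,
    remote_last_modified: datetime,
    remote_md5: Optional[bytes] = None,
) -> bool:
    """Hypothetical helper: decide whether a blob should be (re)downloaded."""
    if not os.path.exists(local_path):
        return True  # the blob is missing locally
    if remote_md5 is not None:
        # When a hash is available, compare content rather than timestamps.
        return file_md5(local_path) != bytes(remote_md5)
    # Otherwise fall back to comparing timezone-aware modification times.
    local_mtime = datetime.fromtimestamp(os.path.getmtime(local_path), tz=timezone.utc)
    return remote_last_modified > local_mtime
```

Preferring the hash over the timestamp is a design choice made here for the sketch: a timestamp comparison is cheaper, but it reports a change whenever the blob was rewritten with identical content.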
Describe the solution you'd like
I would like a method that can take either a prefix or an explicit list of blobs and synchronize them with files in a local folder.
I have implemented my own class derived from azure.storage.blob.ContainerClient to support that feature, but I would prefer it to be supported by the SDK out of the box.
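A rough sketch of what such a method could look like is given below. It is not the implementation referred to above and not an SDK API: `sync_blobs_to_folder` and its parameters are hypothetical names. It assumes `container` behaves like `azure.storage.blob.ContainerClient`, using only `list_blobs(name_starts_with=...)`, `get_blob_client(name).get_blob_properties()` and `download_blob(name).readinto(stream)`, and for brevity it compares last-modified times only.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone
from typing import Iterable, List, Optional


def sync_blobs_to_folder(
    container,  # assumed to behave like azure.storage.blob.ContainerClient
    local_dir: str,
    prefix: Optional[str] = None,
    blob_names: Optional[Iterable[str]] = None,
    max_workers: int = 8,
) -> List[str]:
    """Download blobs that are missing locally or newer than the local copy.

    Hypothetical helper; returns the names of the blobs that were downloaded.
    """
    if (prefix is None) == (blob_names is None):
        raise ValueError("pass exactly one of prefix or blob_names")

    # Collect (name, last_modified) pairs for the requested blobs.
    if blob_names is not None:
        remote = [
            (name, container.get_blob_client(name).get_blob_properties().last_modified)
            for name in blob_names
        ]
    else:
        remote = [
            (props.name, props.last_modified)
            for props in container.list_blobs(name_starts_with=prefix)
        ]

    def local_path(name: str) -> str:
        # Map "a/b/c.txt" onto nested folders below local_dir.
        return os.path.join(local_dir, *name.split("/"))

    def is_stale(name: str, last_modified: datetime) -> bool:
        path = local_path(name)
        if not os.path.exists(path):
            return True
        mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
        return last_modified > mtime

    def download(name: str) -> str:
        path = local_path(name)
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "wb") as fh:
            container.download_blob(name).readinto(fh)
        return name

    stale = [name for name, modified in remote if is_stale(name, modified)]
    # Downloads are I/O bound, so a thread pool runs them concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(download, stale))
```

With a real client this might be called as `sync_blobs_to_folder(container_client, "data", prefix="images/")`. The sketch does not delete local files that no longer exist remotely, and it does not sanitize blob names before turning them into paths.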