-
Notifications
You must be signed in to change notification settings - Fork 202
Implement Azure Blob Store #1554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
88ea9e1 to
024eccb
Compare
024eccb to
042699f
Compare
042699f to
ce79966
Compare
ce79966 to
d5bf750
Compare
d5bf750 to
7e2a535
Compare
|
I have updated the PR with the latest changes and have verified that it works locally. |
|
Any updates here? |
aaronmondal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good already. One thing I'm wondering is what happened to the buffer logic from the S3 implementation. Was that not usable here or did it have other issues?
Regarding the code, there seems to be quite a lot of duplication going on though. WDYT about something roughly like this?
+---------------Current Architecture-----+ +------Refactored Architecture----+
| | | |
| +----------------------------+ | | +----------------------------+ |
| | StoreDriver Trait | | | | StoreDriver Trait | |
| | Common interface for all | | | | Common interface for all | |
| | storage | | | | storage | |
| +----------------------------+ | | +----------------------------+ |
| | | | | |
| implemented by | | implemented by |
| | | | | |
| v | | v |
| +----------+ +----------+ +----------+ | | +-----------------------------+ |
| | | | | | | | | | GenericCloudStore<P,NowFn> | |
| | S3Store | |AzureStore| | GcsStore | | | | Common implementation | |
| | | | | | | | | |with optimized critical paths| |
| +----------+ +----------+ +----------+ | | +-----------------------------+ |
| | | | |
+----------------------------------------+ | uses |
| | |
| v |
| +----------------------------+ |
| | CloudStorageProvider Trait| |
| | Minimal provider- | |
| | specific operations | |
| +----------------------------+ |
| | |
| implemented by |
| | |
| +---------+---------+ |
| | | | |
| v v v |
| +------+ +------+ +------+ |
| | S3 | |Azure | | GCS | |
| +------+ +------+ +------+ |
| |
+---------------------------------+
There are other parts that look very duplicated among the now 3 store implementations like the object path logic and the request transformations across the stores.
Reviewable status: 0 of 2 LGTMs obtained, and 0 of 10 files reviewed (waiting on @jhpratt)
amankrx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly! This is something that I had planned as well. But it needs some restructuring, that was out of scope for this PR atleast. We can maybe create a separate PR that handles the GenericCloudStore. This would minimize a lot of duplication.
Reviewable status: 0 of 2 LGTMs obtained, and 0 of 10 files reviewed (waiting on @jhpratt)
this would be a rework, ideally doable by a strict refactoring (zero functional change), but this needs to be done before(!) adding new stores, to avoid higher/duplicate effort. Would you be willing to add a new bounty for this, or do you expect this to happen in context of providing the azure blob store? Personally, I would even split this into 2 bounties: one for the theoretical work (provide the inheritance diagram), another one for the implementation. 3rd and 4ths ones etc. would be for the stores (lower effort). One thing that is problematic and that you need to know: @aaronmondal , @MarcusSorealheis, your response-times within the repo and especially your bounties is far too low - be aware that this can alienate contributors (normal ones and bounty-workers). |
|
The rework for the generalized cloud store is already a WIP. I will raise that PR next week that should separate it out the functionality. |
It's just that Issues like this one shouldn't take 6 months. Splitting issues down to chunks and collaborating does the job much faster and with higher quality. That's all to it. (and response-times... well...) |
|
related: @MarcusSorealheis , is this a hickup of algora? the bounty #659 is still listed as 'unpaid' https://algora.io/TraceMachina/bounties |
Description
This PR implements the Azure Blob Store implementation and closely aligns with the AWS S3 implementation.
This utilizes the azure_sdk libraries (azure_core, azure_storage, and azure_storage_blobs). The libraries are still unofficial and being developed, so we should be mindful of any breaking changes in the future.
Apart from that, I started my development by first creating a POC: https://gist.github.com/amankrx/45e7d2a6ed935aa13dda0318681af2ad
This POC tests the get and upload blobs with all the default features disabled for Azure SDK. This creates a custom HttpClient with manually signing the requests to perform the transactions.
Fixes #1542
/claim #1542
Type of change
Please delete options that aren't relevant.
How Has This Been Tested?
I tested it locally using the
bazel testcommand with theazure_blob_backend.json5file, configured to use the actual blob storage. To test this locally, you should set the environment variableAZURE_STORAGE_KEYas the access key. Additionally, I created a test suite to cover both individual happy paths and error scenarios.Checklist
bazel test //...passes locallygit amendsee some docsThis change is