@amankrx
Collaborator

@amankrx amankrx commented Dec 26, 2024

Description

This PR adds the Azure Blob Store, closely following the AWS S3 implementation.
It uses the azure_sdk libraries (azure_core, azure_storage, and azure_storage_blobs). These libraries are still unofficial and under active development, so we should watch for breaking changes in the future.

I started development by first creating a POC: https://gist.github.com/amankrx/45e7d2a6ed935aa13dda0318681af2ad
The POC tests getting and uploading blobs with all default features of the Azure SDK disabled. It creates a custom HttpClient that signs the requests manually to perform the transactions.

Fixes #1542
/claim #1542

Type of change

Please delete options that aren't relevant.

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

I tested it locally using the bazel test command with the azure_blob_backend.json5 file, configured to use actual blob storage. To test this locally, set the environment variable AZURE_STORAGE_KEY to the storage account access key. Additionally, I created a test suite that covers both individual happy paths and error scenarios.
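
A minimal sketch of failing early when the key is missing, assuming the only requirement is that AZURE_STORAGE_KEY is set and non-empty. The helper `storage_key` and its error message are hypothetical and not part of this PR:

```rust
use std::env;

// Hypothetical helper (not from the PR): returns the access key, or an
// error that names the missing variable. The lookup is passed in so the
// logic can be exercised without touching the real environment.
fn storage_key(lookup: impl Fn(&str) -> Option<String>) -> Result<String, String> {
    match lookup("AZURE_STORAGE_KEY") {
        // Treat an empty or whitespace-only value the same as a missing one.
        Some(key) if !key.trim().is_empty() => Ok(key),
        _ => Err("AZURE_STORAGE_KEY is not set; export it before running the tests".to_string()),
    }
}

fn main() {
    // Read from the real process environment.
    match storage_key(|name| env::var(name).ok()) {
        Ok(_) => println!("AZURE_STORAGE_KEY found"),
        Err(msg) => eprintln!("{msg}"),
    }
}
```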

Checklist

  • Updated documentation if needed
  • Tests added/amended
  • Local testing completed
  • bazel test //... passes locally
  • PR is contained in a single commit, using git amend (see some docs)

@CLAassistant

CLAassistant commented Dec 26, 2024

CLA assistant check
All committers have signed the CLA.

@amankrx amankrx marked this pull request as draft December 26, 2024 20:05
@amankrx amankrx marked this pull request as ready for review December 29, 2024 22:10
@amankrx
Collaborator Author

amankrx commented May 6, 2025

I have updated the PR with the latest changes and have verified that it works locally.
Awaiting review from @MarcusSorealheis @aaronmondal

@amankrx
Collaborator Author

amankrx commented May 11, 2025

Any updates here?

Contributor

@aaronmondal aaronmondal left a comment

Looks pretty good already. One thing I'm wondering is what happened to the buffer logic from the S3 implementation. Was that not usable here or did it have other issues?

Regarding the code, there seems to be quite a lot of duplication going on though. WDYT about something roughly like this?

+---------------Current Architecture-----+  +------Refactored Architecture----+
|                                        |  |                                 |
|       +----------------------------+   |  | +----------------------------+  |
|       |      StoreDriver Trait     |   |  | |      StoreDriver Trait     |  |
|       |  Common interface for all  |   |  | |  Common interface for all  |  |
|       |         storage            |   |  | |          storage           |  |
|       +----------------------------+   |  | +----------------------------+  |
|                    |                   |  |                |                |
|              implemented by            |  |          implemented by         |
|                    |                   |  |                |                |
|                    v                   |  |                v                |
| +----------+ +----------+ +----------+ |  | +-----------------------------+ |
| |          | |          | |          | |  | |  GenericCloudStore<P,NowFn> | |
| |  S3Store | |AzureStore| | GcsStore | |  | |   Common implementation     | |
| |          | |          | |          | |  | |with optimized critical paths| |
| +----------+ +----------+ +----------+ |  | +-----------------------------+ |
|                                        |  |                |                |
+----------------------------------------+  |              uses               |
                                            |                |                |
                                            |                v                |
                                            | +----------------------------+  |
                                            | |  CloudStorageProvider Trait|  |
                                            | |    Minimal provider-       |  |
                                            | |    specific operations     |  |
                                            | +----------------------------+  |
                                            |                |                |
                                            |          implemented by         |
                                            |                |                |
                                            |      +---------+---------+      |
                                            |      |         |         |      |
                                            |      v         v         v      |
                                            | +------+   +------+   +------+  |
                                            | |  S3  |   |Azure |   | GCS  |  |
                                            | +------+   +------+   +------+  |
                                            |                                 |
                                            +---------------------------------+

Other parts also look heavily duplicated across what are now three store implementations, such as the object path logic and the request transformations.
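
A rough sketch of how the refactored layout in the diagram could look in Rust. This is an illustration under stated assumptions, not the project's actual API: the trait methods, the `InMemoryProvider` stand-in, and the synchronous signatures are all hypothetical, and the `NowFn` parameter from the diagram is omitted to keep the example short. The point is that shared logic such as object path construction lives once in the generic store, while each provider implements only a few operations.

```rust
use std::collections::HashMap;

// Hypothetical trait: the minimal provider-specific operations.
// A real version would likely be async and use the project's error type.
trait CloudStorageProvider {
    fn put_object(&mut self, path: &str, data: &[u8]) -> Result<(), String>;
    fn get_object(&self, path: &str) -> Result<Option<Vec<u8>>, String>;
    fn object_size(&self, path: &str) -> Result<Option<u64>, String>;
}

// Common implementation shared by all providers.
struct GenericCloudStore<P: CloudStorageProvider> {
    provider: P,
    key_prefix: String,
}

impl<P: CloudStorageProvider> GenericCloudStore<P> {
    fn new(provider: P, key_prefix: &str) -> Self {
        Self { provider, key_prefix: key_prefix.to_string() }
    }

    // Object path logic lives here once instead of once per store.
    fn object_path(&self, key: &str) -> String {
        format!("{}{}", self.key_prefix, key)
    }

    fn update(&mut self, key: &str, data: &[u8]) -> Result<(), String> {
        let path = self.object_path(key);
        self.provider.put_object(&path, data)
    }

    fn get(&self, key: &str) -> Result<Option<Vec<u8>>, String> {
        self.provider.get_object(&self.object_path(key))
    }

    // Returns the object size if it exists, None otherwise.
    fn has(&self, key: &str) -> Result<Option<u64>, String> {
        self.provider.object_size(&self.object_path(key))
    }
}

// Stand-in provider for illustration; S3, Azure, and GCS providers would
// implement the same trait against their respective SDKs.
#[derive(Default)]
struct InMemoryProvider {
    objects: HashMap<String, Vec<u8>>,
}

impl CloudStorageProvider for InMemoryProvider {
    fn put_object(&mut self, path: &str, data: &[u8]) -> Result<(), String> {
        self.objects.insert(path.to_string(), data.to_vec());
        Ok(())
    }

    fn get_object(&self, path: &str) -> Result<Option<Vec<u8>>, String> {
        Ok(self.objects.get(path).cloned())
    }

    fn object_size(&self, path: &str) -> Result<Option<u64>, String> {
        Ok(self.objects.get(path).map(|d| d.len() as u64))
    }
}

fn main() {
    let mut store = GenericCloudStore::new(InMemoryProvider::default(), "cas/");
    store.update("abc", b"hello").unwrap();
    assert_eq!(store.has("abc").unwrap(), Some(5));
    assert_eq!(store.get("abc").unwrap(), Some(b"hello".to_vec()));
    assert_eq!(store.has("missing").unwrap(), None);
}
```

With this shape, adding a new backend would mean writing one `CloudStorageProvider` implementation rather than a full store.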

+@jhpratt

Reviewable status: 0 of 2 LGTMs obtained, and 0 of 10 files reviewed (waiting on @jhpratt)

Collaborator Author

@amankrx amankrx left a comment

Exactly! This is something that I had planned as well. But it needs some restructuring that was out of scope for this PR, at least. We could create a separate PR that handles the GenericCloudStore, which would remove a lot of duplication.

Reviewable status: 0 of 2 LGTMs obtained, and 0 of 10 files reviewed (waiting on @jhpratt)

@laz-001

laz-001 commented Jun 14, 2025

Regarding the code, there seems to be quite a lot of duplication going on though. WDYT about something roughly like this?

This would be a rework, ideally doable as a strict refactoring (zero functional change), but it needs to be done before(!) adding new stores, to avoid extra or duplicated effort.

Would you be willing to add a new bounty for this, or do you expect this to happen in the context of providing the Azure blob store? Personally, I would even split this into two bounties: one for the theoretical work (providing the inheritance diagram) and another for the implementation. Third, fourth, and further bounties would be for the individual stores (lower effort).

One thing that is problematic and that you need to know:

@aaronmondal, @MarcusSorealheis, your response times within the repo, and especially on your bounties, are far too slow. Be aware that this can alienate contributors (both regular ones and bounty workers).

@amankrx
Collaborator Author

amankrx commented Jun 14, 2025

The rework for the generalized cloud store is already a WIP. I will raise that PR next week; it should separate out the functionality.

@jhpratt jhpratt removed their assignment Jun 14, 2025
@laz-001

laz-001 commented Jun 15, 2025

The rework for the generalized cloud store is already a WIP. I will raise that PR next week; it should separate out the functionality.

It's just that Issues like this one shouldn't take 6 months.

Splitting issues down to chunks and collaborating does the job much faster and with higher quality.

That's all there is to it.

(and response-times... well...)

@laz-001

laz-001 commented Jun 17, 2025

related:

@MarcusSorealheis, is this a hiccup of Algora? The bounty #659 is still listed as 'unpaid': https://algora.io/TraceMachina/bounties

Development

Successfully merging this pull request may close these issues.

💎 Implement an Azure store

5 participants