Opt-in decentralized backup for s3

Currently, we rely heavily on the S3 file system to manage chunks. Replacing it with IPFS is hard, but we can make IPFS sourcing and verification opt-in:

- publish a metadata file each time the S3 storage is updated with new blocks
- the metadata file may optionally list the S3 buckets that hold each CID, for fast access
- the metadata file will contain the whole tree structure of the S3 storage
- at the leaves, store the IPFS CID of each parquet file stored in S3
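As a rough illustration of the metadata file described above, here is a minimal Python sketch that folds a flat list of S3 keys and their CIDs into a nested tree. The layout (`bucket`, `tree`, `dirs`, `files`) and the function name `build_metadata` are assumptions made for this example; the issue does not fix a format.

```python
import json


def build_metadata(entries, bucket=None):
    """Build a nested metadata document from (s3_key, cid) pairs.

    Hypothetical layout: every directory node is {"dirs": {...}, "files": {...}},
    and every leaf maps a file name to the IPFS CID of that file.
    """
    root = {"dirs": {}, "files": {}}
    for key, cid in entries:
        *dirs, name = key.split("/")
        node = root
        for d in dirs:
            # Create intermediate directory nodes on demand.
            node = node["dirs"].setdefault(d, {"dirs": {}, "files": {}})
        node["files"][name] = cid

    doc = {"tree": root}
    if bucket is not None:
        # Optional hint so clients can fetch the bytes from S3 directly.
        doc["bucket"] = bucket
    return doc


if __name__ == "__main__":
    # Keys, bucket name and CIDs below are placeholders, not real values.
    doc = build_metadata(
        [
            ("chunk-a/blocks.parquet", "<CID of chunk-a/blocks.parquet>"),
            ("chunk-a/logs.parquet", "<CID of chunk-a/logs.parquet>"),
        ],
        bucket="example-bucket",
    )
    print(json.dumps(doc, indent=2))
```

Keeping directories and files in separate maps avoids ambiguity between a directory and a file that share a name, at the cost of a slightly more verbose document.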

When downloading, replace the `ls` method with fetching the metadata file from an IPFS gateway (which will be fast) and traversing the tree in it. The files themselves can still be downloaded from S3 (which is fast), with an additional step to verify the CID. The verification can be done [locally](https://stackoverflow.com/questions/60046604/node-less-way-to-generate-a-cid-that-matches-ipfs-desktop-cid) without running an IPFS node.
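The download path could be sketched roughly as follows in Python. The `ls` helper assumes a hypothetical metadata layout in which each directory node is `{"dirs": {...}, "files": {...}}`. The CID check covers only the simplest case: a file that fits in a single block and was added as CIDv1 with raw leaves and sha2-256. Larger files are split into a UnixFS DAG, so reproducing their root CID requires applying the same chunking and DAG layout that was used when the file was added, which this sketch does not attempt.

```python
import base64
import hashlib


def ls(doc, path=""):
    """List a directory of the metadata tree, replacing an S3 listing call."""
    node = doc["tree"]
    for part in filter(None, path.split("/")):
        node = node["dirs"][part]  # raises KeyError for an unknown directory
    return sorted(name + "/" for name in node["dirs"]) + sorted(node["files"])


def raw_cid_v1(data: bytes) -> str:
    """CIDv1 (raw codec, sha2-256, base32) of a single block of bytes."""
    digest = hashlib.sha256(data).digest()
    # 0x01 = CID version 1, 0x55 = raw codec,
    # 0x12 = sha2-256 multihash code, 0x20 = digest length (32 bytes)
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded  # "b" is the multibase prefix for lowercase base32


def verify(data: bytes, expected_cid: str) -> bool:
    """Check bytes downloaded from S3 against the CID from the metadata file."""
    return raw_cid_v1(data) == expected_cid
```

A client would call `ls` on the metadata document fetched from the gateway, download the listed file from S3, and reject it if `verify` returns `False`.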

For data ingestion, we need a separate process that periodically monitors the S3 buckets and:

- seeds the new files on IPFS and records their CIDs
- places a storage order for them on the Crust Network
- updates the metadata file on-chain
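One pass of such an ingestion process might look like the Python sketch below. Everything external is injected as a callable, and all of these hooks (`list_keys`, `seed`, `place_order`, `publish`) are hypothetical placeholders: the issue does not specify how S3 is listed, how files are seeded, or what the Crust Network and on-chain calls look like. The metadata is kept as a flat key-to-CID map here for brevity.

```python
import time


def ingest_once(list_keys, metadata, seed, place_order, publish):
    """Run one polling pass and return the keys that were newly ingested.

    list_keys()      -> iterable of S3 keys currently in the bucket (hypothetical)
    seed(key)        -> seeds the file on IPFS, returns its CID (hypothetical)
    place_order(cid) -> places a storage order for the CID (hypothetical)
    publish(meta)    -> publishes the updated metadata file (hypothetical)
    """
    new_keys = [key for key in list_keys() if key not in metadata]
    for key in new_keys:
        cid = seed(key)
        place_order(cid)
        metadata[key] = cid
    if new_keys:
        # Publish only when something changed, to avoid redundant updates.
        publish(metadata)
    return new_keys


def run_forever(interval_seconds, **hooks):
    """Poll indefinitely; interval_seconds is an arbitrary choice."""
    while True:
        ingest_once(**hooks)
        time.sleep(interval_seconds)
```

Because a key is recorded in `metadata` only after its storage order has been placed, a failure partway through a pass leaves that key to be retried on the next pass.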


Opt-in decentralized backup for s3 #69
