Opt-in decentralized backup for s3 #69

@dzhelezov

Currently, we rely heavily on the s3 file system to manage chunks. Replacing it with IPFS is hard, but what we can do is make IPFS sourcing and verification opt-in:

  • publish a metadata file each time the s3 storage is updated with new blocks
  • the metadata file may optionally list, for each CID, the s3 bucket holding the same file, for fast access
  • the metadata file will contain the whole tree structure of s3
  • at the leaves, store for each file the IPFS CID of the parquet file stored in s3
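The layout described above could look roughly like the following sketch. The field names (`children`, `cid`, `bucket`), the directory names, and the CID strings are all hypothetical placeholders; the issue does not specify a concrete format.

```python
from typing import Iterator, List, Tuple

# Hypothetical metadata layout: directories are nodes with "children",
# leaves carry the IPFS CID and, optionally, the s3 bucket for fast access.
metadata = {
    "children": {
        "0000000000": {
            "children": {
                "0000000000-0000001000": {
                    "children": {
                        "blocks.parquet": {"cid": "example-cid-1", "bucket": "example-bucket"},
                        "transactions.parquet": {"cid": "example-cid-2"},
                    }
                }
            }
        }
    }
}


def ls(root: dict, prefix: str) -> List[str]:
    """List the direct entries under `prefix`, mimicking an s3 `ls`."""
    node = root
    for part in filter(None, prefix.split("/")):
        node = node["children"][part]
    return sorted(node.get("children", {}))


def iter_files(node: dict, path: str = "") -> Iterator[Tuple[str, dict]]:
    """Yield (path, leaf) pairs for every file below `node`, in sorted order."""
    for name, child in sorted(node.get("children", {}).items()):
        child_path = f"{path}/{name}" if path else name
        if "children" in child:
            yield from iter_files(child, child_path)
        else:
            yield child_path, child
```

With a structure like this, a client only needs one fetch of the metadata file to answer every listing query locally.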

When downloading, replace the ls method with fetching the metadata file from an IPFS gateway (which will be fast) and traversing the tree in it. The data itself can still be downloaded from s3 (which is fast), with an additional step to verify the CID (this can be done locally without running an IPFS node).
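As a minimal sketch of the local verification step, the code below computes a CIDv1 for content stored as a single raw block hashed with sha2-256. That encoding is an assumption: files larger than one IPFS chunk are split into a DAG, so verifying their root CID would require reproducing the same chunking and DAG layout used when the file was added. The function names are illustrative only.

```python
import base64
import hashlib


def raw_cid_v1(data: bytes) -> str:
    """CIDv1 (raw codec, sha2-256) of `data`, treated as a single block."""
    digest = hashlib.sha256(data).digest()
    # <version 0x01><codec raw 0x55><multihash: sha2-256 0x12, length 0x20, digest>
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    # Multibase base32: "b" prefix, lowercase, padding stripped
    return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")


def verify(data: bytes, expected_cid: str) -> bool:
    """Check bytes downloaded from s3 against the CID from the metadata file."""
    return raw_cid_v1(data) == expected_cid
```

Only hashing is needed here, so the check runs without an IPFS node or any network access.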

For data ingestion, we need a separate process that periodically monitors the s3 buckets and:

  • seeds the new files on IPFS and records their CIDs
  • places a storage order on the Crust Network
  • updates the metadata file on-chain
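One pass of such a process could be sketched as below. The callbacks `seed`, `place_order`, and `publish` are hypothetical stand-ins for the IPFS, Crust Network, and on-chain steps; none of them corresponds to a real client API, and the flat key-to-CID mapping is a simplification of the metadata tree.

```python
from typing import Callable, Dict, Iterable, List


def sync_once(
    s3_keys: Iterable[str],
    known: Dict[str, str],
    seed: Callable[[str], str],
    place_order: Callable[[str], None],
    publish: Callable[[Dict[str, str]], None],
) -> List[str]:
    """Process s3 keys missing from `known` (key -> CID) and return them."""
    new_keys = sorted(set(s3_keys) - set(known))
    for key in new_keys:
        cid = seed(key)        # hypothetical: add the file to IPFS, get its CID
        place_order(cid)       # hypothetical: place a storage order for the CID
        known[key] = cid
    if new_keys:
        publish(known)         # hypothetical: publish the updated metadata
    return new_keys
```

Running this on a timer gives the periodic monitoring behaviour; a pass that finds no new keys publishes nothing.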
