Feature: cp with --part-size and --parallel flags enabling copy support for objects >= 5TB
#5257
Community Contribution License
All community contributions in this pull request are licensed to the project maintainers
under the terms of the Apache 2 license.
By creating this pull request I represent that I have the right to license the
contributions to the project maintainers under the Apache 2 license.
Description
This PR enables mc cp to handle files greater than or equal to 5TB by:
mc cp failed
--part-size and --parallel for configurable copy
See below
- OR file size >= 5 TiB
- OR --zip flag used
- OR --checksum flag used
(through client)
(via standard multipart upload API)
--part-size and --parallel
- File size < 5 TiB
- No --zip flag
- No --checksum flag
(server-to-server)
(via ComposeObject API with X-Amz-Copy-Source-Range)
API currently doesn't support parallel
- OR --disable-multipart flag

Previously I attempted to make mc cat fast but realised the fundamental bottleneck is io.Read, which is sequential: #5255

Motivation and Context
I want to efficiently copy a 12TB file across buckets on the same host. Currently mc cp doesn't allow configuring parallelisation or part size.

UPDATE: when embarking on this change, I found undocumented environment variables, but they won't support files >= 5TB:
e.g. can be called:
How to test this PR?
Run the unit tests.
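As an illustration of the arithmetic such tests could exercise, here is a minimal sketch of splitting an object into part ranges for a given part size. The names planParts and byteRange are hypothetical and not taken from this PR. The 10,000-part cap is the standard S3 multipart limit, and the inclusive start/end pairs follow the bytes=start-end form used by range headers such as X-Amz-Copy-Source-Range.

```go
package main

import (
	"errors"
	"fmt"
)

// maxParts is the S3 multipart limit on the number of parts per upload.
const maxParts = 10000

// byteRange is an inclusive byte range, i.e. "bytes=Start-End".
// Hypothetical type for illustration only.
type byteRange struct {
	Start, End int64
}

// planParts splits an object of `size` bytes into consecutive ranges of at
// most `partSize` bytes each. Hypothetical helper, not the PR's actual code.
func planParts(size, partSize int64) ([]byteRange, error) {
	if size <= 0 || partSize <= 0 {
		return nil, errors.New("size and part size must be positive")
	}
	// Ceiling division: the last part may be shorter than partSize.
	count := (size + partSize - 1) / partSize
	if count > maxParts {
		return nil, fmt.Errorf("part size %d yields %d parts, limit is %d",
			partSize, count, maxParts)
	}
	ranges := make([]byteRange, 0, count)
	for start := int64(0); start < size; start += partSize {
		end := start + partSize - 1
		if end > size-1 {
			end = size - 1 // clamp the final part to the object's last byte
		}
		ranges = append(ranges, byteRange{Start: start, End: end})
	}
	return ranges, nil
}

func main() {
	// A 12TB object (decimal terabytes assumed) with 2 GiB parts.
	size := int64(12_000_000_000_000)
	partSize := int64(2) << 30
	ranges, err := planParts(size, partSize)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	last := ranges[len(ranges)-1]
	fmt.Printf("parts=%d last=bytes=%d-%d\n", len(ranges), last.Start, last.End)
}
```

A part size that is too small for the object is rejected up front, which is the failure a larger part size is meant to avoid. Each range is independent of the others, so a worker pool bounded by a parallelism setting could process them concurrently.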
Types of changes
Checklist:
commit-id or PR # here)