feat(dcp): dcp optimized s3reader for faster and partial DCP loading #378
Draft
jet-tong wants to merge 24 commits into awslabs:main from jet-tong:feat/dcp-list-of-ranges-s3reader
Commits
a78932d feat: new ListOfRangesS3Reader for DCP partial reading (jet-tong)
5bb2c7c feat(dcp): use constructor pattern for ListOfRanges optimization (jet-tong)
af01848 fix: resolve mypy errors and minor logic and name changes (jet-tong)
5fc014c perf(s3reader): optimize for sequential DCP workloads (jet-tong)
da73d3e refactor: rename reader as dcp optimized and add max_gap_size support (jet-tong)
39fa629 refactor: rename list_of_ranges.py as dcp_optimized.py (jet-tong)
a515b4a refactor: use one default max gap size across classes (jet-tong)
68165e6 test(dcp): update dcp e2e tests with DCPOptimizedS3Reader (jet-tong)
531b22b feat: make DCPOptimizedS3Reader seekable for PyTorch backwards compat… (jet-tong)
5e1322b refactor: simplify DCPOptimizedS3Reader to use single active stream (jet-tong)
cc5bfc9 refactor(dcp): refactor core functions and add docstrings and error m… (jet-tong)
463a1f1 fix(dcp): use filename only for file range key (jet-tong)
c985b26 refactor: rename RangeRequest as ItemRange (jet-tong)
5720907 fix(dcp): use os to extract basename instead of split (jet-tong)
ede4d8d fix(dcp): improve error handling and validation (jet-tong)
b83284b refactor: address github comments (jet-tong)
5ff5fdc refactor(dcp): improve naming and typing for DCP optimized reader con… (jet-tong)
a9ed023 fix(s3reader): use Python 3.9 compatible types and remove redundant c… (jet-tong)
bd6f274 refactor: cleanup imports, styling and comments (jet-tong)
f9a90b8 fix: move DCP imports under TYPE_CHECKING (jet-tong)
f5e810a refactor(s3reader): switch from index to iterator based state management (jet-tong)
3c25552 per(dcp): minor group lookup optimization with dict (jet-tong)
7226079 refactor: improve item buffer method and support closing (jet-tong)
3a05cb5 fix(s3reader): resolve stream pos desync when skipping coalescing bytes (jet-tong)
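Several of the commits above mention a max gap size and "coalescing bytes". For readers unfamiliar with the idea, here is a rough sketch of gap-based range coalescing. It is not code from this PR: the function name `coalesce_ranges`, the half-open `(start, end)` tuple representation, and the `max_gap_size` semantics shown here are all assumptions made for illustration.

```python
from typing import List, Tuple


def coalesce_ranges(
    ranges: List[Tuple[int, int]], max_gap_size: int
) -> List[Tuple[int, int]]:
    """Merge half-open byte ranges (start, end) separated by at most max_gap_size bytes.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap_size:
            # Gap is small enough (or ranges overlap): extend the previous range.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # Gap is too large: start a new range, i.e. a separate request.
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    print(coalesce_ranges([(0, 10), (12, 20), (100, 110)], max_gap_size=4))
```

Under these assumptions, a reader that fetches a merged range has to discard the gap bytes between the original ranges while keeping its stream position consistent, which may be the kind of bookkeeping the last commit refers to.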
Same here - this feels pretty janky to me. What's this used for? Just debugging or to actually do something based on it?
User agent - agree this still feels janky.