feat: introduce hybrid storage #33
base: main
Conversation
- add vitest
- add stream buffer for exact byte count reading from ReadableStream
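The stream buffer mentioned above reads an exact byte count from a `ReadableStream`, which normally only delivers arbitrarily sized chunks. A minimal sketch of what such a buffer might look like — the names `StreamBuffer` and `readExact` are illustrative, not the PR's actual API:

```typescript
// Hypothetical sketch: buffer chunks from a ReadableStream and hand back
// exactly `n` bytes at a time. Not the PR's implementation.
class StreamBuffer {
  private chunks: Uint8Array[] = [];
  private length = 0;
  private reader: ReadableStreamDefaultReader<Uint8Array>;

  constructor(stream: ReadableStream<Uint8Array>) {
    this.reader = stream.getReader();
  }

  // Pull from the stream until `n` bytes are buffered, then return them.
  async readExact(n: number): Promise<Uint8Array> {
    while (this.length < n) {
      const { done, value } = await this.reader.read();
      if (done || value === undefined) {
        throw new Error(`stream ended with ${this.length}/${n} bytes`);
      }
      this.chunks.push(value);
      this.length += value.length;
    }
    // Compact buffered chunks into one contiguous array of exactly n bytes.
    const out = new Uint8Array(n);
    let offset = 0;
    while (offset < n) {
      const head = this.chunks[0];
      const take = Math.min(head.length, n - offset);
      out.set(head.subarray(0, take), offset);
      if (take === head.length) this.chunks.shift();
      else this.chunks[0] = head.subarray(take); // keep the unread tail
      offset += take;
    }
    this.length -= n;
    return out;
  }
}
```

Reads can span chunk boundaries in either direction: a single `readExact` may consume several chunks, or several calls may slice one chunk.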
Greptile Summary

This PR introduces a custom streaming Git packfile parser as the foundation for a hybrid R2 + D1 storage architecture. The parser handles packfile header parsing, object decompression, delta objects (both OFS and REF types), and SHA-1 checksum verification.

Key Changes:
Issues Found:
Confidence Score: 3/5
Important Files Changed
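The summary's "read type/size" step refers to the variable-length object-entry header defined by the Git pack format: the first byte carries a continuation bit, a 3-bit object type, and the low 4 bits of the inflated size, with further size bits following in 7-bit groups. A sketch of that decoding, mirroring what a `parseObject()` step must do rather than the PR's exact code:

```typescript
// Decode a Git packfile object-entry header (type + size varint),
// per the documented pack format. Illustrative, not the PR's implementation.
function parseObjectHeader(
  bytes: Uint8Array,
  offset: number
): { type: number; size: number; next: number } {
  let byte = bytes[offset++];
  const type = (byte >> 4) & 0b111; // bits 6..4: object type (1=commit, 2=tree, 3=blob, ...)
  let size = byte & 0b1111;         // bits 3..0: low 4 bits of the inflated size
  let shift = 4;
  while (byte & 0b1000_0000) {      // MSB set: another size byte follows
    byte = bytes[offset++];
    size += (byte & 0b111_1111) * 2 ** shift; // multiply, not shift, to avoid 32-bit overflow
    shift += 7;
  }
  return { type, size, next: offset };
}
```

For example, a blob of size 5 fits in one byte (`0x35`), while a commit of size 300 needs a continuation byte.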
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Git Client
    participant Worker as Cloudflare Worker
    participant Cache as Cache API
    participant DO as Durable Object (SQLite)
    participant R2 as R2 Storage
    participant Parser as PackfileParser

    Client->>Worker: git clone/fetch request
    Worker->>Cache: Check for parsed commit/tree
    alt Cache Hit
        Cache-->>Worker: Return parsed object
        Worker-->>Client: Send object
    else Cache Miss
        Worker->>DO: Query object index (oid→pack_id, offset)
        DO-->>Worker: Return pack location
        Worker->>R2: Range request for object at offset
        R2-->>Worker: Stream compressed packfile chunk
        Worker->>Parser: Create PackfileParser(stream)
        Parser->>Parser: parseHeader() - validate signature
        Parser->>Parser: parseObject() - read type/size
        Parser->>Parser: decompressObject() - inflate zlib
        Parser->>Parser: Verify SHA-1 checksum
        Parser-->>Worker: Return parsed object
        Worker->>Cache: Store parsed object (1 year TTL)
        Worker-->>Client: Send object
    end
    Note over Parser: StreamBuffer manages chunks<br/>Memory-efficient compaction<br/>Delayed hash computation
```
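The read path in the diagram — cache check, Durable Object index lookup, then an R2 range read for just the object's bytes — can be sketched with simplified stand-in interfaces. These are illustrative substitutes for the Cache API, the Durable Object index, and R2, not the PR's actual Cloudflare bindings:

```typescript
// Stand-in for the Durable Object (SQLite) index: oid → pack location.
interface ObjectIndex {
  locate(oid: string): Promise<{ packId: string; offset: number; length: number } | null>;
}
// Stand-in for R2: fetch a byte range from a stored packfile.
interface PackStore {
  range(packId: string, offset: number, length: number): Promise<Uint8Array>;
}

async function readObject(
  oid: string,
  cache: Map<string, Uint8Array>, // stand-in for the Cache API
  index: ObjectIndex,
  packs: PackStore
): Promise<Uint8Array> {
  const hit = cache.get(oid);
  if (hit) return hit; // cache hit: skip the DO and R2 entirely

  const loc = await index.locate(oid);
  if (!loc) throw new Error(`unknown object ${oid}`);

  // Range request: only the bytes for this object, not the whole pack.
  const raw = await packs.range(loc.packId, loc.offset, loc.length);
  const parsed = raw; // real code would run PackfileParser + checksum here

  cache.set(oid, parsed); // populate the cache (diagram: 1-year TTL)
  return parsed;
}
```

The design point is that the DO holds only small index rows while R2 holds the bulk pack data, so a cache miss costs one index query plus one range read.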
20 files reviewed, 1 comment
apps/web/src/git/pack/index.ts
Outdated
```ts
if (
  this.buffer.bytesRead + bufferSize >=
  this._progress.bytesRead + bufferSize * 2
) {
  return Result.err(
    err instanceof Error ? err : new Error(String(err))
  );
}
```
**logic:** the condition is incorrect: `this.buffer.bytesRead` should equal `this._progress.bytesRead` since they are kept synchronized (lines 92, 104, 150), so the check reduces to `bufferSize >= bufferSize * 2`, which only holds when `bufferSize <= 0`.

This means the error branch will almost never execute, and decompression will always retry up to `maxAttempts` even when more data cannot help.
```diff
- if (
-   this.buffer.bytesRead + bufferSize >=
-   this._progress.bytesRead + bufferSize * 2
- ) {
-   return Result.err(
-     err instanceof Error ? err : new Error(String(err))
-   );
- }
+ if (this.done && available.length < bufferSize) {
+   return Result.err(
+     err instanceof Error ? err : new Error(String(err))
+   );
+ }
```
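The comment's point — retrying decompression only makes sense while more input can still arrive — can be illustrated with a simplified retry loop. Names like `done` and the `inflate`/`readMore` callbacks follow the suggestion's vocabulary; the loop itself is a sketch, not the PR's code:

```typescript
// Sketch: retry inflation as more input arrives, but fail fast once the
// stream is exhausted, since retrying can no longer succeed.
function decompressWithRetry(
  inflate: (input: Uint8Array) => Uint8Array, // throws if input is truncated
  readMore: () => Uint8Array | null,          // null once the stream is done
  maxAttempts = 3
): Uint8Array | Error {
  let available = new Uint8Array(0);
  let done = false;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const chunk = readMore();
    if (chunk === null) {
      done = true;
    } else {
      const grown = new Uint8Array(available.length + chunk.length);
      grown.set(available);
      grown.set(chunk, available.length);
      available = grown;
    }
    try {
      return inflate(available);
    } catch (err) {
      // Corrected termination: once the stream is done, more data will
      // never arrive, so surface the error instead of retrying.
      if (done) return err instanceof Error ? err : new Error(String(err));
    }
  }
  return new Error("max attempts exceeded");
}
```

With the original (always-false) condition, the `done` case would spin through all `maxAttempts` iterations before failing; with the fix it returns on the first attempt after exhaustion.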
This PR rewrites the whole git backend to support the R2 + D1 hybrid architecture.