Simplify DecompressingHashReader #8314

FabHof · 2025-10-21T15:27:29Z

The DecompressingHashReader now actually uses the reader.buffer and lets the interface handle the rest.

bhansconnect

A number of comments about the interface as a whole. They may be worth addressing, but I think they also apply to the original code, so I don't think they should necessarily block this PR. I would be curious to hear answers about the design, though.

bhansconnect · 2025-10-22T04:25:34Z

src/bundle/streaming_reader.zig

+                var actual_hash: [32]u8 = undefined;
+                self.hasher.final(&actual_hash);
+                if (std.mem.eql(u8, &actual_hash, &self.expected_hash)) {
+                    self.hash_verified = true;


I'm kinda surprised this is just a bool on the struct and not an error state if unverified at the end.

Actually we could return a read error here.
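
For illustration, a minimal sketch of the "error instead of a bool" idea, not the code from this PR. The helper name `verifyHash` and the error `error.HashMismatch` are hypothetical, and SHA-256 is assumed only because the digest in the diff is 32 bytes; the PR itself reports a mismatch as error.ReadFailed through the reader interface.

```zig
const std = @import("std");
const Sha256 = std.crypto.hash.sha2.Sha256;

/// Hypothetical helper: hashes `data` and returns an error when the digest
/// differs from `expected`, so an unverified stream cannot be silently ignored.
fn verifyHash(data: []const u8, expected: [32]u8) error{HashMismatch}!void {
    var hasher = Sha256.init(.{});
    hasher.update(data);
    var actual: [32]u8 = undefined;
    hasher.final(&actual);
    if (!std.mem.eql(u8, &actual, &expected)) return error.HashMismatch;
}

test "matching digest passes" {
    var expected: [32]u8 = undefined;
    Sha256.hash("hello", &expected, .{});
    try verifyHash("hello", expected);
}
```

With this shape the caller has to handle the failure with `try` or `catch`, rather than remembering to check a flag after the last read.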

bhansconnect · 2025-10-22T04:28:36Z

src/bundle/streaming_reader.zig

    pub fn verifyComplete(self: *Self) !void {
        // Read any remaining data to ensure we process the entire stream
-        var discard_buffer: [1024]u8 = undefined;
        while (true) {


Does this still need a while(true)?

Yes, because discard does not have to discard everything up to the limit. But I found there is a discardRemaining that already does the loop, so I will use that one.
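
A small sketch of what that looks like, assuming the std.Io.Reader API as of Zig 0.15 (`fixed` and `discardRemaining`). The fixed in-memory reader and the helper name `drain` are stand-ins for illustration, not the PR's DecompressingHashReader.

```zig
const std = @import("std");

/// Hypothetical helper: consumes everything left in `reader` and returns the
/// number of bytes dropped. discardRemaining loops internally until the end
/// of the stream, so no explicit `while (true)` is needed at the call site.
fn drain(reader: *std.Io.Reader) !usize {
    return reader.discardRemaining();
}

test "drain consumes all remaining bytes" {
    // A fixed reader over an in-memory slice stands in for the real stream.
    var reader: std.Io.Reader = .fixed("leftover bytes");
    const discarded = try drain(&reader);
    try std.testing.expectEqual(@as(usize, 14), discarded);
}
```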

bhansconnect · 2025-10-22T04:29:47Z

src/bundle/streaming_reader.zig

        while (true) {
-            const n = try self.read(&discard_buffer);
-            if (n == 0) break;
+            _ = self.interface.discard(std.Io.Limit.unlimited) catch |err| {


Would it be more robust to error if any data is left and ensure the caller reads all data before calling this function? This feels potentially error prone. Better to make it explicit for the caller?

Good idea, I will change it.

I decided to keep it this way for now. The current bundler depends on it. I think the tar iterator does not read until the end of the stream, so it's simplest to do that in verifyComplete.

bhansconnect · 2025-10-22T04:32:07Z

src/bundle/streaming_reader.zig

+        self.allocator_ptr.free(self.interface.buffer);
    }

    fn stream(r: *std.Io.Reader, w: *std.Io.Writer, limit: std.Io.Limit) std.Io.Reader.StreamError!usize {


Just to make sure I understand this correctly: this means it never returns any data, so it only ever calculates the hash? It can never return decompressed data to a caller. So this really is like a form of final consumer that's a reader and creates a hash. If so, why does it use this API? Why not just a function that takes a reader and then computes the hash? This feels like a misuse of this API, but maybe I don't understand some part of how it is used.

Yes, the point is that this stream function is only called by the reader interface and not directly. The reader interface will first look at the buffer and then call this stream function. So if this function only fills the internal buffer and not the writer, it will still work, because the interface functions will use the data in the buffer.

More details: https://github.com/ziglang/zig/blob/eef8deb918210a9d9d0732ef523b20e2cb963f01/lib/std/Io/Reader.zig#L42

FabHof · 2025-10-22T14:03:26Z

I think I have a bug somewhere, I need to investigate again.

FabHof · 2025-10-22T14:36:10Z

@bhansconnect I found a bug in the CompressingHashWriter that I also fixed now.

And we are now actually writing to the writer. It's not actually more complex than writing to the internal buffer and probably aligns more with how it is supposed to be used.

Sorry if I spammed you with my changes.

Simplify DecompressingHashReader

6159ea3

FabHof force-pushed the reader-cleanup branch from 66bfeca to 6159ea3 Compare October 21, 2025 15:33

FabHof marked this pull request as ready for review October 21, 2025 22:19

bhansconnect previously approved these changes Oct 22, 2025

DecompressingHashReader returns error.ReadFailed when hashes missmatch.

51139ce

FabHof dismissed bhansconnect’s stale review via 51139ce October 22, 2025 09:52

FabHof requested a review from bhansconnect October 22, 2025 09:59

DecompressingHashReader stream wirtes to writer instead of internal b…

0f5c778

…uffer.

FabHof marked this pull request as draft October 22, 2025 14:03

fix bug in streaming_writer and add test for it

c3de27f

FabHof marked this pull request as ready for review October 22, 2025 14:32

FabHof added 2 commits October 22, 2025 14:51

add verifyComplete to test

604e73d

Merge branch 'main' into reader-cleanup

90ab2ea
