Read tag data in its entirety from AsyncFileReader. #81
#76 was auto-closed because, when I wrote the description of #79 (#79 (comment)), I included the words "I think this probably also closes #76".
ah ok! This should be good for review now...
This is a horribly leaky abstraction, because you're trying to infer how much more data you'll need to read in total from the individual requests made to the underlying network layer. On the contrary, the entire point is that these layers are isolated: it's intentional that the network layer should not be able to infer where header parsing is, based solely on the requests it sees. The internals of how the metadata is read should be isolated from reading the bytes.

Your estimation might work in this specific case, but it relies on this specific case; if the internals of how metadata is parsed change again in the future, your code would break.

Instead of hacking around this via byte ranges, I think it would be better to expose a little more of the internals of metadata parsing, so that you can swap out your own implementation of reading the metadata if you wish. See #82
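To make the separation argued for above concrete, here is a minimal synchronous sketch. The names `FetchBytes`, `CountingSource`, `PrefixCache` and `read_u16_le` are hypothetical and are not taken from this crate or from #82; the real reader is async, which is left out here to keep the example short. The point is only that the parsing function depends on a trait, so a caching implementation can be swapped in without the byte layer knowing anything about tags.

```rust
use std::ops::Range;

/// Hypothetical byte-level interface: the only thing the parser sees.
trait FetchBytes {
    fn fetch(&mut self, range: Range<u64>) -> Vec<u8>;
}

/// In-memory stand-in for a remote file that counts range requests.
struct CountingSource {
    data: Vec<u8>,
    requests: usize,
}

impl FetchBytes for CountingSource {
    fn fetch(&mut self, range: Range<u64>) -> Vec<u8> {
        self.requests += 1;
        // Clamp the range to the available data instead of panicking.
        let end = (range.end as usize).min(self.data.len());
        let start = (range.start as usize).min(end);
        self.data[start..end].to_vec()
    }
}

/// One swappable strategy (an assumption for illustration): fetch a fixed
/// prefix once, serve reads inside it from memory, pass everything else on.
struct PrefixCache<F: FetchBytes> {
    inner: F,
    prefix: Option<Vec<u8>>,
    prefix_len: u64,
}

impl<F: FetchBytes> FetchBytes for PrefixCache<F> {
    fn fetch(&mut self, range: Range<u64>) -> Vec<u8> {
        if range.end <= self.prefix_len {
            if self.prefix.is_none() {
                self.prefix = Some(self.inner.fetch(0..self.prefix_len));
            }
            let buf = self.prefix.as_ref().unwrap();
            let end = (range.end as usize).min(buf.len());
            let start = (range.start as usize).min(end);
            return buf[start..end].to_vec();
        }
        self.inner.fetch(range)
    }
}

/// Metadata-side code depends only on the trait, not on how bytes arrive.
fn read_u16_le<F: FetchBytes>(reader: &mut F, offset: u64) -> u16 {
    let bytes = reader.fetch(offset..offset + 2);
    u16::from_le_bytes([bytes[0], bytes[1]])
}
```

With this shape, replacing `PrefixCache` by any other strategy does not touch `read_u16_le`, and the source never needs to guess what the parser will ask for next.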
Could we voice chat or something?
Layers need to be isolated: Yes! Clearly, the exponential prefetch better separates layers. Still, I think #83 works very well together with this PR. (This PR would reduce additional requests from ~9 to 2.)
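For reference, an exponential prefetch of the kind mentioned above could look roughly like the sketch below. `Remote` and `DoublingPrefetch` are hypothetical names for illustration, not the code in #83, and the sketch is synchronous and in-memory. On a cache miss it doubles the cached prefix until it covers the requested range, then issues a single request for the missing bytes.

```rust
use std::ops::Range;

/// Stand-in for a remote object; counts how many range requests were made.
struct Remote {
    data: Vec<u8>,
    requests: usize,
}

impl Remote {
    fn get_range(&mut self, range: Range<usize>) -> &[u8] {
        self.requests += 1;
        // Clamp to the end of the object, as a ranged GET past EOF would.
        let end = range.end.min(self.data.len());
        let start = range.start.min(end);
        &self.data[start..end]
    }
}

/// Caches a growing prefix of the file; the prefix at least doubles per miss.
struct DoublingPrefetch {
    remote: Remote,
    buf: Vec<u8>,
    initial: usize,
}

impl DoublingPrefetch {
    fn new(remote: Remote, initial: usize) -> Self {
        Self { remote, buf: Vec::new(), initial: initial.max(1) }
    }

    fn read(&mut self, range: Range<usize>) -> &[u8] {
        if range.end > self.buf.len() {
            // Double the target size until it covers the requested range.
            let mut target = (self.buf.len() * 2).max(self.initial);
            while target < range.end {
                target *= 2;
            }
            let have = self.buf.len();
            let chunk = self.remote.get_range(have..target).to_vec();
            self.buf.extend_from_slice(&chunk);
        }
        // Serve from the cached prefix; short reads happen only at EOF.
        let end = range.end.min(self.buf.len());
        let start = range.start.min(end);
        &self.buf[start..end]
    }
}
```

How many requests this saves in practice depends on the initial size and on the file layout, so the sketch makes no claim about the numbers quoted above.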
I got this completely wrong, and I'm sad I couldn't appreciate all the effort you put into #82.
I've already made unit tests for tag parsing in tiff2. Would it help if:
Sorry, I've already spent more time on this than I wanted to. I have a bunch of other projects to attend to and I don't want to spend more time arguing about this. If anything, I'd much rather focus on data parsing and decompression (e.g. decoding predictors).
Why was #76 closed 👀?
Closes #70
This PR allows a caching middleware to directly estimate the remaining `TileOffsets` and `TileByteCounts` arrays from the byte size of `TileOffsets0`, which is in #74 as a proof-of-concept.

(Image: `TileOffsets0` array)

A middleware could do exponential cache growing, just like a `Vec`, but if a good estimate (`2*range_len+4*range_len.isqrt()`) of the size can be made, `Vec::with_capacity(good_estimate)` is better (see derivation.pdf).

EDIT: Actually, an exponential prefetch (exponent 2) will work quite well with this, since the byte size of all IFDs is approximately ≥ `4*TileOffsets0.isqrt()`.[^1]

Also cleans up tag loading logic and cursor handling.
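As a rough illustration of the estimate in the description, the sketch below evaluates `2*range_len + 4*range_len.isqrt()` and uses the result to pre-size a buffer. The function names are hypothetical. A small integer square root helper is included so the example does not depend on the toolchain version; recent Rust releases also provide `u64::isqrt`.

```rust
/// Floor of the square root of `n`, via Newton's method on integers.
fn isqrt(n: u64) -> u64 {
    if n < 2 {
        return n;
    }
    // Start from a guess that is never below the true root.
    let mut x = n / 2 + 1;
    let mut y = (x + n / x) / 2;
    while y < x {
        x = y;
        y = (x + n / x) / 2;
    }
    x
}

/// Estimate from the PR description: 2*range_len + 4*isqrt(range_len),
/// where `range_len` is the byte size of the `TileOffsets0` array.
fn estimate_remaining_bytes(range_len: u64) -> u64 {
    2 * range_len + 4 * isqrt(range_len)
}

/// Pre-size a buffer for the remaining tag data instead of growing it
/// exponentially (hypothetical helper, for illustration only).
fn buffer_for_remaining(range_len: u64) -> Vec<u8> {
    Vec::with_capacity(estimate_remaining_bytes(range_len) as usize)
}
```

For example, a `TileOffsets0` array of 10,000 bytes gives an estimate of 2 * 10,000 + 4 * 100 = 20,400 bytes.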
[^1]: The estimate of $2b_{offset0}+4\sqrt{b_{offset0}}$ is based only on the assumption that we are dealing with a square, unmasked bigtiff COG. smalltiff adds a factor of $\frac{4}{3}$, masks add a factor of $2$, and squareness only affects the $\sqrt{b_{offset0}}$ term, which is small in comparison.