Change proof generation logic #236

Open

kcalvinalvin opened this issue Dec 21, 2020 · 1 comment

@kcalvinalvin
Member

The current logic is to:

  1. Save the proofs to a flat file.
  2. Rewrite the proofs once the TTL becomes known at a future block.

This hurts genproofs a lot because of all the disk and leveldb access involved.

I think we can just have a default lookahead value (probably 1,000 blocks) and mark only the txos that are spent within that range, as discussed in one of the calls. This removes the need for leveldb (since everything can now be held in memory) and avoids having to touch the disk twice.
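Roughly, the in-memory version could look like this. This is only a sketch; all names here are made up, not from the codebase:

```go
// Hypothetical sketch of the lookahead cache; all names are invented.
package main

import "fmt"

const lookahead = 1000 // proposed default lookahead, in blocks

// ttlCache remembers which block created each recent txo. Anything
// spent within the lookahead window gets its TTL computed immediately;
// older entries are evicted, so the whole thing stays bounded in
// memory and leveldb is never touched.
type ttlCache struct {
	createHeight map[string]int32 // outpoint -> creation height
}

// add records a txo created at the given block height.
func (c *ttlCache) add(outpoint string, height int32) {
	c.createHeight[outpoint] = height
}

// spend returns the TTL if the txo was created inside the window,
// or -1 to mean "unknown / long-lived, leave the slot unmarked".
func (c *ttlCache) spend(outpoint string, spendHeight int32) int32 {
	born, ok := c.createHeight[outpoint]
	if !ok {
		return -1
	}
	delete(c.createHeight, outpoint)
	return spendHeight - born
}

// evict drops entries that have outlived the lookahead window.
func (c *ttlCache) evict(tip int32) {
	for op, born := range c.createHeight {
		if tip-born > lookahead {
			delete(c.createHeight, op)
		}
	}
}

func main() {
	c := &ttlCache{createHeight: make(map[string]int32)}
	c.add("txid:0", 100)
	fmt.Println(c.spend("txid:0", 150)) // 50: spent inside the window
	fmt.Println(c.spend("txid:1", 150)) // -1: never cached
}
```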

@adiabat
Contributor

adiabat commented Dec 22, 2020

I don't think this is a change in logic.

For 1 - well, proofs are a separate thing from TTLs; we just happen to put them in the same file now.

For 2 - we allocate space for the TTL values (it starts out as all 0s, but could just as easily be 0xff) and write the correct value in once it becomes known. This optimizes for read speed: the TTLs, proofs, and everything else sit right next to each other on disk, with a size prefix right before, so we can grab all the data from disk and throw it to the network without even looking at it. Building this data does take a bunch of little writes, but it's better to have a slow genproofs and a fast server / IBD.
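As a rough illustration of that write-then-patch pattern (the layout below is invented for the sketch; it is not the actual proof file format):

```go
// Illustrative write-then-patch sketch. The layout here,
// [4-byte size][4-byte TTL slots][proof bytes], is made up.
package proofsketch

import (
	"encoding/binary"
	"io"
	"os"
)

// appendBlockProof appends one block's record with zeroed TTL slots
// and returns the offset of the first slot so it can be patched later.
func appendBlockProof(f *os.File, numTTLs int, proof []byte) (int64, error) {
	end, err := f.Seek(0, io.SeekEnd)
	if err != nil {
		return 0, err
	}
	// Size prefix first: the server reads it, then grabs the whole
	// record in one contiguous read and ships it without parsing.
	size := uint32(4*numTTLs + len(proof))
	if err := binary.Write(f, binary.BigEndian, size); err != nil {
		return 0, err
	}
	// Preallocate the TTL slots as zeros (could just as easily be 0xff).
	if _, err := f.Write(make([]byte, 4*numTTLs)); err != nil {
		return 0, err
	}
	if _, err := f.Write(proof); err != nil {
		return 0, err
	}
	return end + 4, nil // first TTL slot sits right after the size prefix
}

// patchTTL is one of the "bunch of little writes": it fills in a
// single slot in place once the txo's death height becomes known.
func patchTTL(f *os.File, firstSlot int64, index int, ttl uint32) error {
	var buf [4]byte
	binary.BigEndian.PutUint32(buf[:], ttl)
	_, err := f.WriteAt(buf[:], firstSlot+int64(4*index))
	return err
}
```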

I don't think the 1K cutoff will help -- to expire entries you'd want the data sorted by TTL, but it also needs to be addressable by outpoint, so you either duplicate the data or run a search every block to find entries whose time is up and remove them.
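For illustration, the duplicate-the-data version might look like this (again, names are made up, not from the codebase):

```go
// Hypothetical dual index: the outpoint map for lookups, plus
// per-height buckets so expiry only touches the bucket whose time
// is up. The keys get stored twice -- that's the duplication cost.
package ttlidx

const lookahead = 1000

type dualIndex struct {
	byOutpoint map[string]int32   // outpoint -> creation height
	byHeight   map[int32][]string // creation height -> outpoints
}

func (d *dualIndex) add(outpoint string, height int32) {
	d.byOutpoint[outpoint] = height
	d.byHeight[height] = append(d.byHeight[height], outpoint)
}

// evictAt drops everything created exactly lookahead blocks before
// the tip. Deleting an outpoint that was already spent and removed
// is a harmless no-op on a Go map.
func (d *dualIndex) evictAt(tip int32) {
	expired := tip - lookahead
	for _, op := range d.byHeight[expired] {
		delete(d.byOutpoint, op)
	}
	delete(d.byHeight, expired)
}
```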

Also, replacing levelDB with an in-RAM map means we need a way to flush it to disk for stop / resume. And we need enough RAM to hold a big chunk of the UTXO set -- granted, not as big a chunk if we have a 1K TTL cutoff.

If we're maintaining a UTXO set anyway (in the case of a full node), the data needed for TTL determination is an extra 2 bytes per UTXO (though actually you could construct a block with more than 65K outputs if they all have empty scripts...).

Right now cmd/utreexoserver keeps a levelDB for just these 2 bytes! Which does seem dumb. But if it lives alongside the UTXO set then it won't be a big deal.
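Sketched out (hypothetical field names; this struct isn't in the codebase), the per-UTXO data would look something like:

```go
// Hypothetical sketch of the per-UTXO record a full node would keep.
// Only indexInBlock is new for TTL purposes; the rest is data a UTXO
// set carries anyway. None of these names are from the codebase.
package utxosketch

type utxoEntry struct {
	amount       int64  // kept by the UTXO set regardless
	pkScript     []byte // kept by the UTXO set regardless
	createHeight int32  // already needed; reused to compute the TTL
	indexInBlock uint16 // the extra 2 bytes: which TTL slot to patch
	// on spend. uint16 tops out at 65,535, hence the caveat above
	// about a block stuffed with empty-script outputs.
}
```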

I don't think the 1K max TTL will help though; it keeps the set size down, but it's expensive to maintain. For this data set, my guess is that the extra reads needed to maintain the cutoff will slow things down more than the reduction in the number of elements will speed things up.
