forked from ethereum/go-ethereum
feat: add toolkit for exporting and transforming missing block header fields #903
Status: Open

jonastheis wants to merge 14 commits into `develop` from `jt/export-headers-toolkit`
Changes from all commits — 14 commits:
- `9c51e74` feat: add toolkit for exporting and transforming missing block header…
- `5222fda` feat: add missing header writer to write the missing header file with…
- `2b5c450` feat: add sha256 checksum generation
- `9eaa720` add bitmask to encapsulate logic around header information in it. als…
- `dcdd6f0` feat: add optional verification of input and output file with separat…
- `c8ce9d9` add feature to continue already existing header file
- `b41eb8e` change byte layout of deduplicated file due to too many vanities (>64…
- `042377e` fix some stuff
- `bf8ec7d` Merge remote-tracking branch 'origin/develop' into jt/export-headers-…
- `12fdb4a` add state root to header
- `882f5fd` goimports
- `0f94e9c` only create 1 task at a time
- `fc20472` address review comments
- `0463b3e` Merge branch 'develop' into jt/export-headers-toolkit

All commits are by jonastheis.
```diff
@@ -0,0 +1 @@
+data/
```
`rollup/missing_header_fields/export-headers-toolkit/Dockerfile` — 13 additions, 0 deletions:

```dockerfile
FROM golang:1.22

WORKDIR /app

COPY go.mod go.sum ./

RUN go mod download

COPY . .

RUN go build -o main .

ENTRYPOINT ["./main"]
```
`rollup/missing_header_fields/export-headers-toolkit/README.md` — 67 additions, 0 deletions:
# Export missing block header fields toolkit

A toolkit for exporting and transforming the missing block header fields of Scroll before the EuclidV2 upgrade.

## Context
We use [Clique consensus](https://eips.ethereum.org/EIPS/eip-225) on Scroll L2. Among others, it requires the following header fields:
- `extraData`
- `difficulty`

However, before EuclidV2 these fields were not stored on L1/DA.
For nodes to reconstruct the correct block hashes when reading data only from L1,
we need to provide the historical values of these fields to those nodes through a separate file.
Additionally, the `stateRoot` field is included in the file so that block headers can be reconstructed correctly
regardless of the state trie type used in the node (before EuclidV1 the state trie was a ZK trie; after EuclidV1 it is a regular Merkle Patricia Trie).

This toolkit provides commands to export the missing fields, deduplicate the data, and create a file
with the missing fields that can be used to reconstruct the correct block hashes when reading data only from L1.

The toolkit provides the following commands:
- `fetch` - Fetch missing block header fields from a running Scroll L2 node and store them in a file
- `dedup` - Deduplicate the headers file, print unique values, and create a new file with the deduplicated headers

## Binary layout of the deduplicated missing header fields file
The deduplicated header file's binary layout is as follows:

```plaintext
<unique_vanity_count:uint8><unique_vanity_1:[32]byte>...<unique_vanity_n:[32]byte><header_1:header>...<header_n:header>

Where:
- unique_vanity_count: number of unique vanities n
- unique_vanity_i: unique vanity i
- header_i: block header i
- header:
  <flags:uint8><vanity_index:uint8><state_root:[32]byte><seal:[65|85]byte>
  - flags: bitmask, lsb first
    - bit 6: 0 if difficulty is 2, 1 if difficulty is 1
    - bit 7: 0 if seal length is 65, 1 if seal length is 85
  - vanity_index: index of the vanity in the sorted vanities list (0-255)
  - state_root: 32 bytes of state root data
  - seal: 65 or 85 bytes of seal data
```
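The flags byte of each record can be decoded with two bit tests. A minimal sketch under the layout above; the `decodePrefix` helper and its names are illustrative, not part of the toolkit:

```go
package main

import "fmt"

// decodePrefix interprets the per-header flags byte as documented:
// bit 6 selects the difficulty (0 => 2, 1 => 1), bit 7 selects the
// seal length (0 => 65 bytes, 1 => 85 bytes). The vanity index is a
// separate byte indexing into the table of unique 32-byte vanities.
func decodePrefix(flags, vanityIndex byte) (difficulty uint64, sealLen int) {
	if flags&(1<<6) == 0 {
		difficulty = 2
	} else {
		difficulty = 1
	}
	if flags&(1<<7) == 0 {
		sealLen = 65
	} else {
		sealLen = 85
	}
	_ = vanityIndex // would be used to look up the vanity bytes
	return difficulty, sealLen
}

func main() {
	// flags with only bit 7 set: difficulty 2, 85-byte seal
	d, s := decodePrefix(1<<7, 0)
	fmt.Println(d, s) // 2 85
}
```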
## How to run
Each command has its own set of flags and options. To display the help message, run with the `--help` flag.

1. Fetch the missing block header fields from a running Scroll L2 node via RPC and store them in a file (approx. 40 min for 5.5M blocks).
2. Deduplicate the headers file, print unique values, and create a new file with the deduplicated headers.

```bash
go run main.go fetch --rpc=http://localhost:8545 --start=0 --end=100 --batch=10 --parallelism=10 --output=headers.bin --humanOutput=true
go run main.go dedup --input=headers.bin --output=headers-dedup.bin
```

### With Docker
To run the toolkit with Docker, build the Docker image and run the commands inside the container.

```bash
docker build -t export-headers-toolkit .

# Depending on the Docker config, finding the RPC container's IP with docker inspect may be necessary. The host IP may also work: http://172.17.0.1:8545
docker run --rm -v "$(pwd)":/app/result export-headers-toolkit fetch --rpc=<address> --start=0 --end=5422047 --batch=10000 --parallelism=10 --output=/app/result/headers.bin --humanOutput=/app/result/headers.csv
docker run --rm -v "$(pwd)":/app/result export-headers-toolkit dedup --input=/app/result/headers.bin --output=/app/result/headers-dedup.bin
```
`rollup/missing_header_fields/export-headers-toolkit/cmd/dedup.go` — 306 additions, 0 deletions:
```go
package cmd

import (
	"bufio"
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"io"
	"log"
	"os"
	"strconv"
	"strings"

	"github.com/spf13/cobra"

	"github.com/scroll-tech/go-ethereum/common"

	"github.com/scroll-tech/go-ethereum/export-headers-toolkit/types"
)

// dedupCmd represents the dedup command
var dedupCmd = &cobra.Command{
	Use:   "dedup",
	Short: "Deduplicate the headers file, print unique values and create a new file with the deduplicated headers",
	Long: `Deduplicate the headers file, print unique values and create a new file with the deduplicated headers.

The binary layout of the deduplicated file is as follows:
- 1 byte for the count of unique vanities
- 32 bytes for each unique vanity
- for each header:
  - 1 byte (bitmask, lsb first):
    - bit 6: 0 if difficulty is 2, 1 if difficulty is 1
    - bit 7: 0 if seal length is 65, 1 if seal length is 85
  - 1 byte for the index of the vanity in the sorted vanities list
  - 32 bytes for the state root
  - 65 or 85 bytes for the seal`,
	Run: func(cmd *cobra.Command, args []string) {
		inputFile, err := cmd.Flags().GetString("input")
		if err != nil {
			log.Fatalf("Error reading input flag: %v", err)
		}
		outputFile, err := cmd.Flags().GetString("output")
		if err != nil {
			log.Fatalf("Error reading output flag: %v", err)
		}
		verifyFile, err := cmd.Flags().GetString("verify")
		if err != nil {
			log.Fatalf("Error reading verify flag: %v", err)
		}

		if verifyFile != "" {
			verifyInputFile(verifyFile, inputFile)
		}

		_, seenVanity, _ := runAnalysis(inputFile)
		runDedup(inputFile, outputFile, seenVanity)

		if verifyFile != "" {
			verifyOutputFile(verifyFile, outputFile)
		}

		runSHA256(outputFile)
	},
}

func init() {
	rootCmd.AddCommand(dedupCmd)

	dedupCmd.Flags().String("input", "headers.bin", "headers file")
	dedupCmd.Flags().String("output", "headers-dedup.bin", "deduplicated, binary formatted file")
	dedupCmd.Flags().String("verify", "", "verify the input and output files with the given .csv file")
}

func runAnalysis(inputFile string) (seenDifficulty map[uint64]int, seenVanity map[[32]byte]bool, seenSealLen map[int]int) {
	reader := newHeaderReader(inputFile)
	defer reader.close()

	// track header fields we've seen
	seenDifficulty = make(map[uint64]int)
	seenVanity = make(map[[32]byte]bool)
	seenSealLen = make(map[int]int)

	reader.read(func(header *types.Header) {
		seenDifficulty[header.Difficulty]++
		seenVanity[header.Vanity()] = true
		seenSealLen[header.SealLen()]++
	})

	// Print distinct values and report
	fmt.Println("--------------------------------------------------")
	for diff, count := range seenDifficulty {
		fmt.Printf("Difficulty %d: %d\n", diff, count)
	}

	for vanity := range seenVanity {
		fmt.Printf("Vanity: %x\n", vanity)
	}

	for sealLen, count := range seenSealLen {
		fmt.Printf("SealLen %d bytes: %d\n", sealLen, count)
	}

	fmt.Println("--------------------------------------------------")
	fmt.Printf("Unique values seen in the headers file (last seen block: %d):\n", reader.lastHeader.Number)
	fmt.Printf("Distinct count: Difficulty:%d, Vanity:%d, SealLen:%d\n", len(seenDifficulty), len(seenVanity), len(seenSealLen))
	fmt.Printf("--------------------------------------------------\n\n")

	return seenDifficulty, seenVanity, seenSealLen
}

func runDedup(inputFile, outputFile string, seenVanity map[[32]byte]bool) {
	reader := newHeaderReader(inputFile)
	defer reader.close()

	writer := newMissingHeaderFileWriter(outputFile, seenVanity)
	defer writer.close()

	writer.missingHeaderWriter.writeVanities()

	reader.read(func(header *types.Header) {
		writer.missingHeaderWriter.write(header)
	})
}

func runSHA256(outputFile string) {
	f, err := os.Open(outputFile)
	if err != nil {
		log.Fatalf("Error opening file: %v", err)
	}
	defer f.Close()

	h := sha256.New()
	if _, err = io.Copy(h, f); err != nil {
		log.Fatalf("Error hashing file: %v", err)
	}

	fmt.Printf("Deduplicated headers written to %s with sha256 checksum: %x\n", outputFile, h.Sum(nil))
}

type headerReader struct {
	file       *os.File
	reader     *bufio.Reader
	lastHeader *types.Header
}

func newHeaderReader(inputFile string) *headerReader {
	f, err := os.Open(inputFile)
	if err != nil {
		log.Fatalf("Error opening input file: %v", err)
	}

	h := &headerReader{
		file:   f,
		reader: bufio.NewReader(f),
	}

	return h
}

func (h *headerReader) read(callback func(header *types.Header)) {
	headerSizeBytes := make([]byte, types.HeaderSizeSerialized)

	for {
		_, err := io.ReadFull(h.reader, headerSizeBytes)
		if err != nil {
			if err == io.EOF {
				break
			}
			log.Fatalf("Error reading headerSizeBytes: %v", err)
		}
		headerSize := binary.BigEndian.Uint16(headerSizeBytes)

		headerBytes := make([]byte, headerSize)
		if _, err = io.ReadFull(h.reader, headerBytes); err != nil {
			log.Fatalf("Error reading headerBytes: %v", err)
		}
		header := new(types.Header).FromBytes(headerBytes)

		// sanity check: make sure headers are in order
		if h.lastHeader != nil && header.Number != h.lastHeader.Number+1 {
			fmt.Println("lastHeader:", h.lastHeader.String())
			log.Fatalf("Missing block: %d, got %d instead", h.lastHeader.Number+1, header.Number)
		}
		h.lastHeader = header

		callback(header)
	}
}

func (h *headerReader) close() {
	h.file.Close()
}

type csvHeaderReader struct {
	file   *os.File
	reader *bufio.Reader
}

func newCSVHeaderReader(verifyFile string) *csvHeaderReader {
	f, err := os.Open(verifyFile)
	if err != nil {
		log.Fatalf("Error opening verify file: %v", err)
	}

	h := &csvHeaderReader{
		file:   f,
		reader: bufio.NewReader(f),
	}

	return h
}

func (h *csvHeaderReader) readNext() *types.Header {
	line, err := h.reader.ReadString('\n')
	if err != nil {
		if err == io.EOF {
			return nil
		}
		log.Fatalf("Error reading line: %v", err)
	}

	s := strings.Split(strings.TrimSpace(line), ",")
	if len(s) != 4 {
		log.Fatalf("Malformed CSV line: %q", line)
	}

	num, err := strconv.ParseUint(s[0], 10, 64)
	if err != nil {
		log.Fatalf("Error parsing block number: %v", err)
	}
	difficulty, err := strconv.ParseUint(s[1], 10, 64)
	if err != nil {
		log.Fatalf("Error parsing difficulty: %v", err)
	}

	stateRoot := common.HexToHash(s[2])
	extra := common.FromHex(s[3])

	header := types.NewHeader(num, difficulty, stateRoot, extra)
	return header
}

func (h *csvHeaderReader) close() {
	h.file.Close()
}

func verifyInputFile(verifyFile, inputFile string) {
	csvReader := newCSVHeaderReader(verifyFile)
	defer csvReader.close()

	binaryReader := newHeaderReader(inputFile)
	defer binaryReader.close()

	binaryReader.read(func(header *types.Header) {
		csvHeader := csvReader.readNext()

		if !csvHeader.Equal(header) {
			log.Fatalf("Header mismatch: %v != %v", csvHeader, header)
		}
	})

	log.Printf("All headers match in %s and %s\n", verifyFile, inputFile)
}

func verifyOutputFile(verifyFile, outputFile string) {
	csvReader := newCSVHeaderReader(verifyFile)
	defer csvReader.close()

	dedupReader, err := NewReader(outputFile)
	if err != nil {
		log.Fatalf("Error opening dedup file: %v", err)
	}
	defer dedupReader.Close()

	for {
		header := csvReader.readNext()
		if header == nil {
			if _, _, err = dedupReader.ReadNext(); err == nil {
				log.Fatalf("Expected EOF, got more headers")
			}
			break
		}

		difficulty, stateRoot, extraData, err := dedupReader.Read(header.Number)
		if err != nil {
			log.Fatalf("Error reading header: %v", err)
		}

		if header.Difficulty != difficulty {
			log.Fatalf("Difficulty mismatch: headerNum %d: %d != %d", header.Number, header.Difficulty, difficulty)
		}
		if header.StateRoot != stateRoot {
			log.Fatalf("StateRoot mismatch: headerNum %d: %s != %s", header.Number, header.StateRoot, stateRoot)
		}
		if !bytes.Equal(header.ExtraData, extraData) {
			log.Fatalf("ExtraData mismatch: headerNum %d: %x != %x", header.Number, header.ExtraData, extraData)
		}
	}

	log.Printf("All headers match in %s and %s\n", verifyFile, outputFile)
}
```