Releases: flashbots/mempool-dumpster
v1.2.0
Cleanups and improvements on top of v1.1.0, which shipped Clickhouse support for the collector.
More details here: https://github.com/flashbots/mempool-dumpster/releases/tag/v1.1.0
What's Changed since v1.1.0
- prefix metrics with mempool_dumpster_ in #75
- Minor cleanup of collector and processor in #76
- docker readme in #74
Full Changelog: v1.1.0...v1.2.0
Docker image: docker pull flashbots/mempool-dumpster:1.2.0
v1.1.0
Clickhouse support
The collector can now write to Clickhouse directly!
PRs:
- Clickhouse support for collector data storage by @metachris in #68
- Clickhouse TLS cleanup by @metachris in #71
- Clickhouse: store raw_tx as bytes instead of hex by @metachris in #72
- Make output directory optional for collector by @ilyaluk in #70
You can enable it with the --clickhouse-dsn CLI flag / CLICKHOUSE_DSN environment variable. See also the docker-compose setup here.
Batches are written to two Clickhouse tables:
- transactions - deduplicated via ReplacingMergeTree, keeping the earliest received transaction: the materialized ver column negates the received_at timestamp, so the engine's keep-the-largest-version rule retains the earliest row. raw_tx is saved as raw bytes to minimize the amount of stored data.
- sourcelogs - a receipt log for every transaction the collector has seen.
Schemas:
CREATE TABLE IF NOT EXISTS transactions (
received_at DateTime64(3, 'UTC'),
hash String,
chain_id String,
tx_type Int64,
from String,
to String,
value String,
nonce String,
gas String,
gas_price String,
gas_tip_cap String,
gas_fee_cap String,
data_size Int64,
data_4bytes String,
raw_tx String,
ver Int64 MATERIALIZED -toUnixTimestamp(received_at)
)
ENGINE = ReplacingMergeTree(ver)
PRIMARY KEY (hash)
ORDER BY (hash)
PARTITION BY toDate(received_at)
COMMENT 'Transaction details, deduplicated by hash, will keep the transaction with earliest received_at.';
CREATE TABLE IF NOT EXISTS sourcelogs (
received_at DateTime64(3),
hash String,
source String,
location String
)
ENGINE = MergeTree
PRIMARY KEY (received_at, hash)
ORDER BY (received_at, hash)
PARTITION BY toDate(received_at)
COMMENT 'Receipt log for every transaction the collector has seen';
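Note that ReplacingMergeTree deduplicates asynchronously, at merge time, so reads that need exactly one row per hash should use the FINAL modifier. A minimal read sketch using the clickhouse-go v2 driver (connection details and query are illustrative, not taken from the repo):
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/ClickHouse/clickhouse-go/v2"
)

func main() {
	ctx := context.Background()

	// Illustrative connection settings; substitute your own host and credentials.
	conn, err := clickhouse.Open(&clickhouse.Options{
		Addr: []string{"localhost:9000"},
	})
	if err != nil {
		log.Fatal(err)
	}

	// FINAL forces deduplication at read time; without it, rows that have
	// not been merged yet can show up more than once per hash.
	rows, err := conn.Query(ctx, `
		SELECT hash, received_at
		FROM transactions FINAL
		WHERE toDate(received_at) = today()
		LIMIT 10`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var (
			hash       string
			receivedAt time.Time
		)
		if err := rows.Scan(&hash, &receivedAt); err != nil {
			log.Fatal(err)
		}
		fmt.Println(hash, receivedAt)
	}
}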
Batched writes
By default, batches of 1k entries are written to each of these, as recommended by the Clickhouse docs [1]:
We recommend inserting data in batches of at least 1,000 rows, and ideally between 10,000–100,000 rows. Fewer, larger inserts reduce the number of parts written, minimize merge load, and lower overall system resource usage.
Collecting 1k transactions takes about 30 seconds, so that is roughly the delay before they show up in Clickhouse.
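For illustration, a batched insert with retries can be written with the clickhouse-go v2 driver roughly as follows. This is a simplified sketch, not the collector's actual code; SourcelogEntry and saveBatch are hypothetical names, and the column order follows the sourcelogs schema above:
package example

import (
	"context"
	"time"

	"github.com/ClickHouse/clickhouse-go/v2/lib/driver"
)

// SourcelogEntry is a hypothetical struct mirroring the sourcelogs columns.
type SourcelogEntry struct {
	ReceivedAt time.Time
	Hash       string
	Source     string
	Location   string
}

// saveBatch writes one batch, retrying up to maxRetries times before
// giving up (cf. clickhouse_batch_save_retries_total / _giveup_total).
func saveBatch(ctx context.Context, conn driver.Conn, entries []SourcelogEntry, maxRetries int) error {
	var lastErr error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		batch, err := conn.PrepareBatch(ctx, "INSERT INTO sourcelogs")
		if err != nil {
			lastErr = err
			continue
		}
		appendFailed := false
		for _, e := range entries {
			if err := batch.Append(e.ReceivedAt, e.Hash, e.Source, e.Location); err != nil {
				lastErr = err
				appendFailed = true
				break
			}
		}
		if appendFailed {
			continue
		}
		if err := batch.Send(); err != nil {
			lastErr = err
			continue // retry the whole batch
		}
		return nil
	}
	return lastErr
}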
CLI args: --clickhouse-dsn / CLICKHOUSE_DSN
Additional environment variables
https://github.com/flashbots/mempool-dumpster/blob/clickhouse/collector/consts.go#L35-L38
clickhouseBatchSize   = common.GetEnvInt("CLICKHOUSE_BATCH_SIZE", 1_000) // entries per batch write
clickhouseSaveRetries = common.GetEnvInt("CLICKHOUSE_SAVE_RETRIES", 5)   // retries before giving up on a batch
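common.GetEnvInt is the repo's helper for these integer overrides; a minimal version of such a helper looks roughly like this (a sketch, not necessarily the repo's exact implementation):
package common

import (
	"os"
	"strconv"
)

// GetEnvInt returns the integer value of the named environment variable,
// falling back to defaultVal if it is unset or not a valid integer.
func GetEnvInt(key string, defaultVal int) int {
	if v, ok := os.LookupEnv(key); ok {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return defaultVal
}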
Metrics
Several new metrics were added:
# curl -s localhost:9060/metrics | grep -E "tx|click" | grep -v "bucket"
clickhouse_batch_save_duration_milliseconds_sum{type="sourcelogs"} 39288506
clickhouse_batch_save_duration_milliseconds_count{type="sourcelogs"} 69580
clickhouse_batch_save_duration_milliseconds_sum{type="transactions"} 21450954
clickhouse_batch_save_duration_milliseconds_count{type="transactions"} 19176
clickhouse_batch_save_giveup_total 0
clickhouse_batch_save_retries_total 1
clickhouse_batch_save_success_total 88756
clickhouse_entries_saved_total{type="sourcelogs"} 69580000
clickhouse_entries_saved_total{type="transactions"} 19176000
clickhouse_errors_batch_save_total 1
clickhouse_errors_total 5
tx_received_first 18362351
tx_received_first{source="bloxroute"} 4737183
tx_received_first{source="chainbound"} 7871444
tx_received_first{source="local"} 4403021
tx_received_first{source="mempoolguru"} 1350703
tx_received_total 69583733
tx_received_total{source="bloxroute"} 8045519
tx_received_total{source="chainbound"} 12747958
tx_received_total{source="local"} 24598814
tx_received_total{source="mempoolguru"} 24191442
tx_received_trash 23837556
tx_received_trash{source="bloxroute"} 813989
tx_received_trash{source="chainbound"} 258733
tx_received_trash{source="local"} 11503433
tx_received_trash{source="mempoolguru"} 11261401
Note: #75 (after this release) added a mempool_dumpster_ prefix to the metrics!
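For context on that prefix: with the widely used prometheus/client_golang library, such a prefix is applied via the Namespace field, roughly as sketched below (illustrative only; the project may wire up its metrics with a different library):
package example

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// Exposed as mempool_dumpster_tx_received_total{source="..."}.
var txReceivedTotal = promauto.NewCounterVec(prometheus.CounterOpts{
	Namespace: "mempool_dumpster",
	Name:      "tx_received_total",
	Help:      "Transactions received, by source",
}, []string{"source"})

func recordTx(source string) {
	txReceivedTotal.WithLabelValues(source).Inc()
}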
Other notable changes
- add docker compose file for testing by @metachris in #67
- pprof on metrics server by @metachris in #66
- refactor: use slices.Contains to simplify code by @yingshanghuangqiao in #69
- Fix goreleaser error by @metachris in #73
New Contributors
- @ilyaluk made their first contribution in #70
- @yingshanghuangqiao made their first contribution in #69
Full Changelog: v1.0.0...v1.1.0
Docker image: docker pull flashbots/mempool-dumpster:1.1.0
v1.0.0
What's Changed
- Single cmd entrypoint by @metachris in #59
- Prometheus Metrics by @metachris in #60
- bucket time config via env var by @metachris in #61
- fix healthcheck error handling by @metachris in #62
Full Changelog: v0.9.0...v1.0.0
Docker image: https://hub.docker.com/r/flashbots/mempool-dumpster/tags
v0.9.1-rc6
Changelog
Full Changelog: v0.9.0...v0.9.1-rc6
v0.9.0
What's Changed
- correct a typo by @facuzeta in #41
- update commands for running Merger by @fahimahmedx in #42
- chore(deps): update fiber-go to v1.9.2 by @mempirate in #44
- add item in overview to link to other ways of accessing data by @curcio in #45
- fix(fiber): don't panic by @mempirate in #46
- feat: bump fiber-go to v1.9.3, configure health checks by @mempirate in #47
- fiber v1.9.4 by @metachris in #52
- chore: increase fiber connection timeout by @estensen in #51
- chore: bump chainbound to pectra ready by @estensen in #55
- Dependency update and chainID checks by @metachris in #57
- fix: make chainbound reconnecting more robust by @estensen in #54
- start goreleaser by @metachris in #58
New Contributors
- @facuzeta made their first contribution in #41
- @fahimahmedx made their first contribution in #42
- @mempirate made their first contribution in #44
- @curcio made their first contribution in #45
- @estensen made their first contribution in #51
Full Changelog: v0.8.0...v0.9.0