Releases: flashbots/mempool-dumpster
v1.2.0
Cleanups and improvements on top of v1.1.0, which shipped Clickhouse support for the collector.
More details here: https://github.com/flashbots/mempool-dumpster/releases/tag/v1.1.0
What's Changed since v1.1.0
- prefix metrics with mempool_dumpster_ in #75
- Minor cleanup of collector and processor in #76
- docker readme in #74
Full Changelog: v1.1.0...v1.2.0
Docker image: docker pull flashbots/mempool-dumpster:1.2.0
v1.1.0
Clickhouse support
The collector can now write to Clickhouse directly!
PRs:
- Clickhouse support for collector data storage by @metachris in #68
- Clickhouse TLS cleanup by @metachris in #71
- Clickhouse: store raw_tx as bytes instead of hex by @metachris in #72
- Make output directory optional for collector by @ilyaluk in #70
You can enable it with the --clickhouse-dsn CLI flag / CLICKHOUSE_DSN environment variable. See also the docker-compose setup here.
Batches are written to two Clickhouse tables:
- transactions - deduplicated via ReplacingMergeTree, keeping the earliest received transaction: the materialized ver column negates the received_at timestamp, so the engine's keep-the-largest-version rule retains the earliest row. raw_tx is saved as raw bytes to minimize the amount of stored data.
- sourcelogs - a receipt log for every transaction the collector has seen.
Schemas:
CREATE TABLE IF NOT EXISTS transactions (
received_at DateTime64(3, 'UTC'),
hash String,
chain_id String,
tx_type Int64,
from String,
to String,
value String,
nonce String,
gas String,
gas_price String,
gas_tip_cap String,
gas_fee_cap String,
data_size Int64,
data_4bytes String,
raw_tx String,
ver Int64 MATERIALIZED -toUnixTimestamp(received_at)
)
ENGINE = ReplacingMergeTree(ver)
PRIMARY KEY (hash)
ORDER BY (hash)
PARTITION BY toDate(received_at)
COMMENT 'Transaction details, deduplicated by hash, will keep the transaction with earliest received_at.';
CREATE TABLE IF NOT EXISTS sourcelogs (
received_at DateTime64(3),
hash String,
source String,
location String
)
ENGINE = MergeTree
PRIMARY KEY (received_at, hash)
ORDER BY (received_at, hash)
PARTITION BY toDate(received_at)
COMMENT 'Receipt log for every transaction the collector has seen';
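Note that ReplacingMergeTree deduplicates asynchronously, at merge time, so reads that need exactly one row per hash should use the FINAL modifier. A minimal read sketch using the clickhouse-go v2 driver (connection details and query are illustrative, not taken from the repo):
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/ClickHouse/clickhouse-go/v2"
)

func main() {
	ctx := context.Background()

	// Illustrative connection settings; substitute your own host and credentials.
	conn, err := clickhouse.Open(&clickhouse.Options{
		Addr: []string{"localhost:9000"},
	})
	if err != nil {
		log.Fatal(err)
	}

	// FINAL forces deduplication at read time; without it, rows that have
	// not been merged yet can show up more than once per hash.
	rows, err := conn.Query(ctx, `
		SELECT hash, received_at
		FROM transactions FINAL
		WHERE toDate(received_at) = today()
		LIMIT 10`)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var (
			hash       string
			receivedAt time.Time
		)
		if err := rows.Scan(&hash, &receivedAt); err != nil {
			log.Fatal(err)
		}
		fmt.Println(hash, receivedAt)
	}
}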
Batched writes
By default, batches of 1k entries are written to each of these, as recommended by the Clickhouse docs [1]:
We recommend inserting data in batches of at least 1,000 rows, and ideally between 10,000–100,000 rows. Fewer, larger inserts reduce the number of parts written, minimize merge load, and lower overall system resource usage.
Collecting 1k transactions takes about 30 seconds, so that is roughly the delay before they show up in Clickhouse.
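For illustration, a batched insert with retries can be written with the clickhouse-go v2 driver roughly as follows. This is a simplified sketch, not the collector's actual code; SourcelogEntry and saveBatch are hypothetical names, and the column order follows the sourcelogs schema above:
package example

import (
	"context"
	"time"

	"github.com/ClickHouse/clickhouse-go/v2/lib/driver"
)

// SourcelogEntry is a hypothetical struct mirroring the sourcelogs columns.
type SourcelogEntry struct {
	ReceivedAt time.Time
	Hash       string
	Source     string
	Location   string
}

// saveBatch writes one batch, retrying up to maxRetries times before
// giving up (cf. clickhouse_batch_save_retries_total / _giveup_total).
func saveBatch(ctx context.Context, conn driver.Conn, entries []SourcelogEntry, maxRetries int) error {
	var lastErr error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		batch, err := conn.PrepareBatch(ctx, "INSERT INTO sourcelogs")
		if err != nil {
			lastErr = err
			continue
		}
		appendFailed := false
		for _, e := range entries {
			if err := batch.Append(e.ReceivedAt, e.Hash, e.Source, e.Location); err != nil {
				lastErr = err
				appendFailed = true
				break
			}
		}
		if appendFailed {
			continue
		}
		if err := batch.Send(); err != nil {
			lastErr = err
			continue // retry the whole batch
		}
		return nil
	}
	return lastErr
}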
CLI args: --clickhouse-dsn / CLICKHOUSE_DSN
Additional environment variables
https://github.com/flashbots/mempool-dumpster/blob/clickhouse/collector/consts.go#L35-L38
clickhouseBatchSize   = common.GetEnvInt("CLICKHOUSE_BATCH_SIZE", 1_000) // entries per batch write
clickhouseSaveRetries = common.GetEnvInt("CLICKHOUSE_SAVE_RETRIES", 5)   // retries before giving up on a batch
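common.GetEnvInt is the repo's helper for these integer overrides; a minimal version of such a helper looks roughly like this (a sketch, not necessarily the repo's exact implementation):
package common

import (
	"os"
	"strconv"
)

// GetEnvInt returns the integer value of the named environment variable,
// falling back to defaultVal if it is unset or not a valid integer.
func GetEnvInt(key string, defaultVal int) int {
	if v, ok := os.LookupEnv(key); ok {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return defaultVal
}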
Metrics
Several new metrics were added:
# curl -s localhost:9060/metrics | grep -E "tx|click" | grep -v "bucket"
clickhouse_batch_save_duration_milliseconds_sum{type="sourcelogs"} 39288506
clickhouse_batch_save_duration_milliseconds_count{type="sourcelogs"} 69580
clickhouse_batch_save_duration_milliseconds_sum{type="transactions"} 21450954
clickhouse_batch_save_duration_milliseconds_count{type="transactions"} 19176
clickhouse_batch_save_giveup_total 0
clickhouse_batch_save_retries_total 1
clickhouse_batch_save_success_total 88756
clickhouse_entries_saved_total{type="sourcelogs"} 69580000
clickhouse_entries_saved_total{type="transactions"} 19176000
clickhouse_errors_batch_save_total 1
clickhouse_errors_total 5
tx_received_first 18362351
tx_received_first{source="bloxroute"} 4737183
tx_received_first{source="chainbound"} 7871444
tx_received_first{source="local"} 4403021
tx_received_first{source="mempoolguru"} 1350703
tx_received_total 69583733
tx_received_total{source="bloxroute"} 8045519
tx_received_total{source="chainbound"} 12747958
tx_received_total{source="local"} 24598814
tx_received_total{source="mempoolguru"} 24191442
tx_received_trash 23837556
tx_received_trash{source="bloxroute"} 813989
tx_received_trash{source="chainbound"} 258733
tx_received_trash{source="local"} 11503433
tx_received_trash{source="mempoolguru"} 11261401
Note: #75 (after this release) added a mempool_dumpster_ prefix to the metrics!
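For context on that prefix: with the widely used prometheus/client_golang library, such a prefix is applied via the Namespace field, roughly as sketched below (illustrative only; the project may wire up its metrics with a different library):
package example

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// Exposed as mempool_dumpster_tx_received_total{source="..."}.
var txReceivedTotal = promauto.NewCounterVec(prometheus.CounterOpts{
	Namespace: "mempool_dumpster",
	Name:      "tx_received_total",
	Help:      "Transactions received, by source",
}, []string{"source"})

func recordTx(source string) {
	txReceivedTotal.WithLabelValues(source).Inc()
}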
Other notable changes
- add docker compose file for testing by @metachris in #67
- pprof on metrics server by @metachris in #66
- refactor: use slices.Contains to simplify code by @yingshanghuangqiao in #69
- Fix goreleaser error by @metachris in #73
New Contributors
- @ilyaluk made their first contribution in #70
- @yingshanghuangqiao made their first contribution in #69
Full Changelog: v1.0.0...v1.1.0
Docker image: docker pull flashbots/mempool-dumpster:1.1.0
v1.0.0
What's Changed
- Single cmd entrypoint by @metachris in #59
- Prometheus Metrics by @metachris in #60
- bucket time config via env var by @metachris in #61
- fix healthcheck error handling by @metachris in #62
Full Changelog: v0.9.0...v1.0.0
Docker image: https://hub.docker.com/r/flashbots/mempool-dumpster/tags
v0.9.1-rc6
Changelog
Full Changelog: v0.9.0...v0.9.1-rc6
v0.9.0
What's Changed
- correct a typo by @facuzeta in #41
- update commands for running Merger by @fahimahmedx in #42
- chore(deps): update fiber-go to v1.9.2 by @mempirate in #44
- add item in overview to link to other ways of accessing data by @curcio in #45
- fix(fiber): don't panic by @mempirate in #46
- feat: bump fiber-go to v1.9.3, configure health checks by @mempirate in #47
- fiber v1.9.4 by @metachris in #52
- chore: increase fiber connection timeout by @estensen in #51
- chore: bump chainbound to pectra ready by @estensen in #55
- Dependency update and chainID checks by @metachris in #57
- fix: make chainbound reconnecting more robust by @estensen in #54
- start goreleaser by @metachris in #58
New Contributors
- @facuzeta made their first contribution in #41
- @fahimahmedx made their first contribution in #42
- @mempirate made their first contribution in #44
- @curcio made their first contribution in #45
- @estensen made their first contribution in #51
Full Changelog: v0.8.0...v0.9.0