Nimbus Bug Report: SSZ Deserialization and Database Corruption Issues in v25.5.0 #7189

@andib007

Summary
Trusted node sync with Nimbus v25.5.0 on Ethereum mainnet fails with the error "SSZ BeaconState: object dynamic portion starts at invalid offset". In addition, after a sync that appears to succeed, starting the beacon node via systemctl ends in a fatal error: "Could not load head state, database corrupt?". Downgrading to v25.4.1 resolves both issues, which points to a regression in v25.5.0.
Environment

Nimbus Version: v25.5.0 (failing), v25.4.1 (working)
Operating System: Linux (specific distribution anonymized)
Hardware: ARM-based single-board computer, 4GB RAM, SSD storage
Network: Ethereum mainnet
Checkpoint Providers Tested:
    https://mainnet.checkpointz.sigp-dev.net
    https://checkpointz.pietjepuk.net

Execution Client: Geth (version anonymized, confirmed synced in working case)

Steps to Reproduce

1. Run trusted node sync with v25.5.0 as user XXXX (prefix the command with sudo -u XXXX):
    /usr/bin/nimbus_beacon_node trustedNodeSync --network=mainnet --trusted-node-url=https://mainnet.checkpointz.sigp-dev.net --data-dir=/home/XXXX/YYYY/ethereum/consensus/nimbus_database --log-level=DEBUG --backfill=false --reindex=false --state-id=finalized

2. Observe that the sync fails with:
    ERR 2025-05-22 11:22:39.909+02:00 Unable to download checkpoint state error="SSZ BeaconState: object dynamic portion starts at invalid offset" syncTarget=finalized restUrl=https://mainnet.checkpointz.sigp-dev.net stateId=finalized

3. Run the same command without sudo -u XXXX:
    nimbus_beacon_node trustedNodeSync --network=mainnet --trusted-node-url=https://mainnet.checkpointz.sigp-dev.net --data-dir=/home/XXXX/YYYY/ethereum/consensus/nimbus_database --log-level=DEBUG --backfill=false --reindex=false --state-id=finalized

4. The sync succeeds, logging:
    NTC 2025-05-22 11:27:01.920+02:00 Database initialized from genesis blockRoot=4d611d5b93fdab69013a7f0a2f961caca0c853f87cfe9595fe50038163079360 stateRoot=7e76880eb67bbdc86250aa578958e9d0675e64e714337855204fb5abaaf82c2b stateSlot=0
    NTC 2025-05-22 11:31:00.763+02:00 Checkpoint written to database blockRoot=87bda670a0290cadadfeea3fa852b82fcfb6811c363ceddcf354a6d13be27ccc stateRoot=49a293c4ccf22e1124cbda5dbce09b207c09fef9213daa0ff28b60dc9ddd82da stateSlot=11756736

    Note: The stateSlot=0 log is incorrect; the correct slot (11756736) is logged later.

5. Start the Nimbus service via systemctl:
    sudo systemctl start nimbus_beacon_node

6. Observe that the service fails with:
    FAT 2025-05-22 11:44:38.725+02:00 Could not load head state, database corrupt? head=87bda670:11756736 tail=87bda670:11756736

7. Downgrade to v25.4.1 and repeat steps 1-5. The sync and service startup succeed without errors.

Expected Behavior

Trusted node sync with v25.5.0 should successfully download and deserialize the finalized state from the checkpoint provider.
The database should initialize with the correct slot (e.g., ~11756736) and load without corruption errors when starting the beacon node via systemctl.

Actual Behavior

In v25.5.0, trusted node sync fails with "SSZ BeaconState: object dynamic portion starts at invalid offset" when run with sudo -u XXXX.
Even when the sync succeeds without sudo, the log first reports stateSlot=0 (likely a logging bug), and the service started via systemctl then fails with "Could not load head state, database corrupt?".
Downgrading to v25.4.1 resolves both issues: sync and service startup succeed.

Additional Observations

The Geth execution client logged a warning:
    WARN [05-22|11:44:43.018] Post-merge network, but no beacon client seen. Please launch one to follow the chain!

This was resolved by ensuring Geth and Nimbus use the same JWT secret and Engine API port (8551).
The stateSlot=0 log during successful sync in v25.5.0 is misleading, as the correct slot (11756736) is logged later, suggesting a logging or initialization bug.
The issue persists across multiple checkpoint providers, indicating a client-side problem in v25.5.0.
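
To compare the slot values reported at different points of a run, the key=value pairs can be pulled out of the log lines. The following is a minimal sketch, assuming the log format shown in this report (level, timestamp, message, then key=value pairs, with quoted values for strings containing spaces); the helper names `log_fields` and `state_slots` are made up for illustration.

```python
import re
from typing import Dict, Iterable, List

# Matches key=value, where value is either a double-quoted string or a run of non-space characters.
PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')

def log_fields(line: str) -> Dict[str, str]:
    """Return the key=value pairs found in one log line, with surrounding quotes removed."""
    return {key: value.strip('"') for key, value in PAIR.findall(line)}

def state_slots(lines: Iterable[str]) -> List[int]:
    """Collect every stateSlot value, in the order the lines were logged."""
    slots = []
    for line in lines:
        fields = log_fields(line)
        if "stateSlot" in fields:
            slots.append(int(fields["stateSlot"]))
    return slots
```

Feeding it the two NTC lines from the successful sync yields the slot of the genesis entry followed by the slot of the checkpoint entry, which makes the discrepancy described above easy to spot in longer logs.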

Logs
Relevant logs from v25.5.0:
NTC 2025-05-22 11:25:36.116+02:00 Starting trusted node sync databaseDir=/home/XXXX/YYYY/ethereum/consensus/nimbus_database/db backfill=false reindex=false syncTarget=finalized restUrl=https://mainnet.checkpointz.sigp-dev.net
NTC 2025-05-22 11:25:36.144+02:00 Downloading checkpoint state syncTarget=finalized restUrl=https://mainnet.checkpointz.sigp-dev.net stateId=finalized
DBG 2025-05-22 11:25:36.148+02:00 Sending REST request to remote server remote=mainnet.checkpointz.sigp-dev.net:443 request=/eth/v2/debug/beacon/states/finalized http_method=GET
DBG 2025-05-22 11:25:37.584+02:00 Got REST response headers from remote server remote=mainnet.checkpointz.sigp-dev.net:443 request=/eth/v2/debug/beacon/states/finalized status=200 http_method=GET
DBG 2025-05-22 11:26:38.720+02:00 Received REST response body from remote server remote=mainnet.checkpointz.sigp-dev.net:443 request=/eth/v2/debug/beacon/states/finalized contentType=application/octet-stream size=271079848
NTC 2025-05-22 11:27:01.920+02:00 Database initialized from genesis blockRoot=4d611d5b93fdab69013a7f0a2f961caca0c853f87cfe9595fe50038163079360 stateRoot=7e76880eb67bbdc86250aa578958e9d0675e64e714337855204fb5abaaf82c2b stateSlot=0
NTC 2025-05-22 11:31:00.763+02:00 Checkpoint written to database blockRoot=87bda670a0290cadadfeea3fa852b82fcfb6811c363ceddcf354a6d13be27ccc stateRoot=49a293c4ccf22e1124cbda5dbce09b207c09fef9213daa0ff28b60dc9ddd82da stateSlot=11756736
INF 2025-05-22 11:31:09.463+02:00 Loading finalized blocks finHigh=none() finalizedHead=87bda670:11756736
INF 2025-05-22 11:31:10.267+02:00 Block DAG initialized head=87bda670:11756736 finalizedHead=87bda670:11756736 tail=87bda670:11756736
FAT 2025-05-22 11:44:38.725+02:00 Could not load head state, database corrupt? head=87bda670:11756736 tail=87bda670:11756736
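
One way to sanity-check a downloaded state independently of Nimbus is to read the slot straight from the first bytes of the SSZ payload. The sketch below assumes the BeaconState container begins with genesis_time (uint64), genesis_validators_root (32 bytes) and slot (uint64), as in the consensus specs, and uses the mainnet value of 32 slots per epoch. The function names are hypothetical, and how the payload is saved to disk is left to the reader.

```python
import struct

SLOTS_PER_EPOCH = 32  # mainnet preset

# Assumed layout of the start of a serialized BeaconState:
#   genesis_time (8 bytes) | genesis_validators_root (32 bytes) | slot (8 bytes) | ...
SLOT_OFFSET = 8 + 32

def peek_state_slot(ssz_prefix: bytes) -> int:
    """Read the slot field from the first 48 bytes of a serialized BeaconState."""
    if len(ssz_prefix) < SLOT_OFFSET + 8:
        raise ValueError("need at least 48 bytes of the serialized state")
    (slot,) = struct.unpack_from("<Q", ssz_prefix, SLOT_OFFSET)  # little-endian uint64
    return slot

def epoch_of(slot: int) -> int:
    """Epoch that contains the given slot."""
    return slot // SLOTS_PER_EPOCH
```

If the value read this way from the payload returned by /eth/v2/debug/beacon/states/finalized matches the stateSlot in the "Checkpoint written to database" line, the download itself is probably intact and the problem lies in decoding or storage.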

Suggested Fix

Investigate SSZ deserialization in v25.5.0 for compatibility with mainnet checkpoint providers.
Fix the database initialization to prevent corruption when loading via systemctl.
Correct the logging of stateSlot=0 during trusted node sync to reflect the actual finalized slot.
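
For context on the first item: in SSZ, a container's variable-size fields are represented in the fixed portion by 4-byte little-endian offsets, and the first offset must equal the length of the fixed portion. The reported message likely comes from a check of this kind, although that is an assumption about the Nimbus decoder and not something confirmed here. A minimal sketch of the check for a generic container follows; `layout` and `read_offsets` are illustrative names, not Nimbus APIs.

```python
import struct
from typing import List, Optional

BYTES_PER_LENGTH_OFFSET = 4  # size of an SSZ offset

def read_offsets(data: bytes, layout: List[Optional[int]]) -> List[int]:
    """Validate and return the offsets of a serialized SSZ container.

    layout lists the fields in order: an int is the byte size of a
    fixed-size field, None marks a variable-size field (stored as an offset).
    """
    fixed_len = sum(BYTES_PER_LENGTH_OFFSET if size is None else size for size in layout)
    if len(data) < fixed_len:
        raise ValueError("input shorter than the fixed portion")

    offsets = []
    pos = 0
    for size in layout:
        if size is None:
            (offset,) = struct.unpack_from("<I", data, pos)  # little-endian uint32
            offsets.append(offset)
            pos += BYTES_PER_LENGTH_OFFSET
        else:
            pos += size

    # The dynamic portion must begin exactly where the fixed portion ends.
    if offsets and offsets[0] != fixed_len:
        raise ValueError(f"first offset {offsets[0]} != fixed portion length {fixed_len}")
    # Offsets must not decrease and must stay inside the buffer.
    for prev, cur in zip(offsets, offsets[1:]):
        if cur < prev:
            raise ValueError("offsets are not monotonically non-decreasing")
    if offsets and offsets[-1] > len(data):
        raise ValueError("offset points past the end of the input")
    return offsets
```

A consequence worth keeping in mind while investigating: if the bytes are decoded against a layout whose fixed portion has a different size (for example, a different fork's BeaconState), the first-offset check fails even though the payload itself is valid.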

Workaround

Downgrade to Nimbus v25.4.1:
    sudo apt install nimbus=25.4.1

Ensure the systemctl service runs as the same user (XXXX) as the sync command:
    [Service]
    User=XXXX
    Group=XXXX
    ExecStart=/usr/bin/nimbus_beacon_node --network=mainnet --data-dir=/home/XXXX/YYYY/ethereum/consensus/nimbus_database --log-level=INFO --tcp-port=9000 --udp-port=9000 --jwt-secret=/path/to/jwtsecret

Verify Geth sync and Engine API configuration.
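
Because the sync and the service may run under different users, it can help to confirm that everything under the data directory is owned by the service user before starting the node. The sketch below walks a directory and lists entries owned by someone else; the function name `foreign_owned_paths` is hypothetical, and whether ownership actually contributed to the failure reported here is not established.

```python
import os
from typing import List

def foreign_owned_paths(data_dir: str, expected_uid: int) -> List[str]:
    """Return directories and files under data_dir not owned by expected_uid."""
    found = []
    for root, _dirs, files in os.walk(data_dir):
        # Check the directory itself, then every file directly inside it.
        for path in [root] + [os.path.join(root, name) for name in files]:
            if os.lstat(path).st_uid != expected_uid:
                found.append(path)
    return found
```

On Linux the numeric id for the service account can be looked up with `pwd.getpwnam("XXXX").pw_uid` from the standard library; an empty result for the Nimbus data directory means no ownership mismatch was found.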

Additional Notes

The issue was reproduced on an ARM-based single-board computer with SSD storage; no other hardware was tested, so an architecture-specific bug in v25.5.0 cannot be ruled out.
Community assistance was sought via the EthStaker Discord, but no fix was found other than downgrading.

References

Nimbus Guide: Trusted Node Sync
Nimbus GitHub Releases
EthStaker Discord
