Change timing of AH/RC pre and post snapshots #247
Pre-migration:
- RC snapshot: taken at block with state transition to AccountsMigrationInit
- AH snapshot: taken at block with state transition to DataMigrationOngoing + 1

Post-migration:
- RC snapshot: taken at block with state transition to CoolOff
- AH snapshot: taken at block with state transition to CoolOff

Driven-by: update runtime to include CoolOff as state transition for AH.
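As a rough illustration of the timing rules above, the snapshot blocks could be derived from a list of observed (block, stage) pairs as sketched below. The `Observation` type and the helper names are hypothetical and not taken from this repository. The sketch also assumes that "+ 1" means one block after the block in which the transition was observed.

```typescript
// Hypothetical shape of one observation: the stage seen at a given block.
type Observation = { block: number; stage: string };

// Returns the first block at which the stage becomes `target`.
// If the very first observation is already `target`, that block is returned.
function transitionBlock(history: Observation[], target: string): number | undefined {
  let previous: string | undefined;
  for (const { block, stage } of history) {
    if (stage === target && previous !== target) return block;
    previous = stage;
  }
  return undefined;
}

// Applies the pre/post rules from the description to RC and AH histories.
function snapshotBlocks(rc: Observation[], ah: Observation[]) {
  const ahOngoing = transitionBlock(ah, "DataMigrationOngoing");
  return {
    rcPre: transitionBlock(rc, "AccountsMigrationInit"),
    // Assumption: "+ 1" is read as "one block after the transition block".
    ahPre: ahOngoing === undefined ? undefined : ahOngoing + 1,
    rcPost: transitionBlock(rc, "CoolOff"),
    ahPost: transitionBlock(ah, "CoolOff"),
  };
}
```

Any field comes back as `undefined` when the corresponding transition was never observed, so a caller can tell "not yet reached" apart from a real block number.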
2025-10-10 22:42:37 [debug]: RC Finalized Block #30471701: 0xea0c8876856d9782c96fd208f6077949cf483844f1b2f0a7561026c8a1d6ca56 {
"service": "ahm"
}
2025-10-10 22:42:37 [info]: RC migration finished {
"service": "ahm",
"blockNumber": 30471701
}
2025-10-10 22:42:37 [debug]: AH Finalized Block #11193966: 0x877687cda9411ed3149375effe8879b34aa9b00e81924a32b1f360f240d788e1 {
"service": "ahm"
}
2025-10-10 22:42:37 RPC-CORE: getStorage(key: StorageKey, at?: BlockHash): StorageData:: Unable to decode storage ahMigrator.ahMigrationStage:: createType(PalletAhMigratorMigrationStage):: {"_enum":{"Pending":"Null","DataMigrationOngoing":"Null","CoolOff":"{\"endAt\":\"u32\"}","MigrationDone":"Null"}}:: Decoded input doesn't match input, received 0x02 (1 bytes), created 0x0200000000 (5 bytes)
file:///__w/ahm-dryrun/ahm-dryrun/node_modules/@polkadot/rpc-core/bundle.js:407
throw new Error(`Unable to decode storage ${key.section || 'unknown'}.${key.method || 'unknown'}:${entryNum}: ${error.message}`);
^
Error: Unable to decode storage ahMigrator.ahMigrationStage:: createType(PalletAhMigratorMigrationStage):: {"_enum":{"Pending":"Null","DataMigrationOngoing":"Null","CoolOff":"{\"endAt\":\"u32\"}","MigrationDone":"Null"}}:: Decoded input doesn't match input, received 0x02 (1 bytes), created 0x0200000000 (5 bytes)
at RpcCore._newType (file:///__w/ahm-dryrun/ahm-dryrun/node_modules/@polkadot/rpc-core/bundle.js:407:19)
at RpcCore._formatStorageData (file:///__w/ahm-dryrun/ahm-dryrun/node_modules/@polkadot/rpc-core/bundle.js:341:21)
at RpcCore._formatOutput (file:///__w/ahm-dryrun/ahm-dryrun/node_modules/@polkadot/rpc-core/bundle.js:306:25)
at RpcCore._formatResult (file:///__w/ahm-dryrun/ahm-dryrun/node_modules/@polkadot/rpc-core/bundle.js:168:20)
at callWithRegistry (file:///__w/ahm-dryrun/ahm-dryrun/node_modules/@polkadot/rpc-core/bundle.js:189:25)
at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
I believe the error in decoding the AH migration stage is expected while testing against Kusama and won't happen when we test against Polkadot.
this command
Since Kusama currently has
Now, since we no longer care about testing the migration on Kusama, there is no point in having a migration for the AH migration stage there. I expect the issue not to happen on Polkadot, since there won't be any AhMigration stage in the old format on Asset Hub when we bite the network. @muharem @pepoviola @ggwpez @kianenigma please confirm that my assumption above makes sense. That said, on CI I am still unable to test the AHM flow on Polkadot; see e.g. https://github.com/paritytech/ahm-dryrun/actions/runs/18478441568/job/52648138711, where my job got stuck.
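For illustration, the byte counts in the error above are consistent with a layout mismatch: the stored value is the single byte 0x02, while the type definition quoted in the error gives variant index 2 (CoolOff) a u32 payload, so a well-formed value would take 5 bytes. The sketch below is a hand-rolled model of that length check. It is not the @polkadot/api implementation, and the `Variant` type and `checkDecode` helper are hypothetical names.

```typescript
// Hypothetical model of a SCALE enum variant: one index byte plus a
// fixed-size payload. Real SCALE payloads can be variable-length.
type Variant = { name: string; payloadBytes: number };

// Layout as quoted in the error message.
const layoutFromError: Variant[] = [
  { name: "Pending", payloadBytes: 0 },
  { name: "DataMigrationOngoing", payloadBytes: 0 },
  { name: "CoolOff", payloadBytes: 4 }, // endAt: u32
  { name: "MigrationDone", payloadBytes: 0 },
];

// Compares the length the layout expects with the length actually stored.
function checkDecode(layout: Variant[], raw: Uint8Array) {
  const index = raw[0];
  if (index === undefined) throw new Error("empty input");
  const variant = layout[index];
  if (variant === undefined) throw new Error(`unknown variant index ${index}`);
  const expected = 1 + variant.payloadBytes;
  return { variant: variant.name, expected, received: raw.length, ok: expected === raw.length };
}

// The stored value from the error: a lone 0x02 byte.
const stored = checkDecode(layoutFromError, new Uint8Array([0x02]));
console.log(stored);
```

Here `stored.ok` is false with `expected` 5 and `received` 1, which matches the "received 0x02 (1 bytes), created 0x0200000000 (5 bytes)" part of the message.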
Hi @sigurpol, I think you are correct about the decoding error. I will check why the migration got stuck on Polkadot and ping you to verify.
@sigurpol yes, your understanding is right
Once polkadot-fellows/runtimes#952 is merged (hopefully very soon), we can update this PR to point to the latest main runtime and hopefully merge it
Co-authored-by: Javier Viola <[email protected]>
@muharem maybe one thing worth mentioning.
2025-10-16 01:56:45 [debug]: RC migration in progress {
"service": "ahm",
"stage": {
"CoolOff": {
"endAt": "28,207,185"
}
}
}
2025-10-16 01:56:45 [debug]: AH migration in progress {
"service": "ahm",
"stage": "DataMigrationOngoing"
}
2025-10-16 01:56:53 [debug]: RC Finalized Block #28207184: 0x407194cc70112de50d7696f029f7786518c4927f69fe1eb2dc3a7d9c43f274c7 {
"service": "ahm"
}
2025-10-16 01:56:53 [debug]: AH Finalized Block #9981488: 0xd69d2f110de2707044d5ef010de11c66944701e0762934b146e109d1e0f7d266 {
"service": "ahm"
}
2025-10-16 01:56:53 [debug]: RC migration in progress {
"service": "ahm",
"stage": {
"CoolOff": {
"endAt": "28,207,185"
}
}
}
2025-10-16 01:56:53 [debug]: AH migration in progress {
"service": "ahm",
"stage": "DataMigrationOngoing"
}
2025-10-16 01:56:57 [debug]: RC Finalized Block #28207185: 0xe29f141971649c3dbbc6749224b49636725a579bcc1b5fbdfe2189d377feed12 {
"service": "ahm"
}
2025-10-16 01:56:57 [debug]: RC migration in progress {
"service": "ahm",
"stage": "SignalMigrationFinish"
}
2025-10-16 01:56:57 [debug]: AH Finalized Block #9981489: 0x29c5cf9e2a8d550340ed150d70c93883bd1e36d969fd46fa261b137d6ec8acd0 {
"service": "ahm"
}
2025-10-16 01:56:57 [debug]: AH migration in progress {
"service": "ahm",
"stage": "DataMigrationOngoing"
}
2025-10-16 01:57:05 [debug]: RC Finalized Block #28207186: 0xa7eccd4e86417acfc042f364650c272904eb108d4fe70f84939f849bdaa8d0a2 {
"service": "ahm"
}
2025-10-16 01:57:05 [info]: RC migration finished {
"service": "ahm",
"blockNumber": 28207186
}
2025-10-16 01:57:05 [debug]: AH Finalized Block #9981490: 0xf05c2f9dce81c85e67259445696da98b5c55142ff603dd8d2bd7228a89272141 {
"service": "ahm"
}
2025-10-16 01:57:05 [debug]: AH migration in progress {
"service": "ahm",
"stage": "DataMigrationOngoing"
}
2025-10-16 01:57:09 [debug]: AH Finalized Block #9981491: 0x573a54ecd60b404d5cdd3b0e85fdaa3bda23e274279bd5e4614d44eb723334a3 {
"service": "ahm"
}
2025-10-16 01:57:09 [debug]: AH migration in progress {
"service": "ahm",
"stage": "DataMigrationOngoing"
}
2025-10-16 01:57:17 [debug]: AH Finalized Block #9981492: 0x81b54198c97abc8f3c839fe202d8f0389a2f354742ad1b49935685888d7bdc04 {
"service": "ahm"
}
2025-10-16 01:57:17 [info]: AH migration finished {
"service": "ahm",
"blockNumber": 9981492
}
[DISCARDED - see later on] since when I performed the migration locally
Increasing the cool-off period would solve the issue, I believe (and it's how we will run the migration for real...)
@sigurpol which branch of the runtimes?
As I was mentioning above: muharem-ahm-move-finish-stage branch (a3d3563c2c907d5ab4d4495d67bc44ec562d923d)
ok as discussed with @muharem (@pepoviola please confirm 🙏 ): it is normal that AH didn't go to the CoolOff phase since the
Now running the AHM flow in CI with the new runtime (aligned to the latest from polkadot-fellows/runtimes#956, current commit: 3039b0c88ace9471e75a6da4efd378817ee56e34). I expect CI to fail because of the issue with the collator getting stuck in pre-migration, which @pepoviola is investigating, but I hope to at least get a valid pre-db state so that I can then manually spawn, migrate, and take snapshots
Pre-migration:
Post-migration:
The monitor process just monitors the RC and AH migration state and saves block hashes. Snapshots are taken after both AH and RC are in the CoolOff migration state. Starting from a34356e, instead of constant monitoring over WebSocket, we poll via RPC periodically (every 5 min) until the migration is done, and then we take the 4 snapshots.
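The polling approach described above could look roughly like the sketch below. This is not the code from a34356e. The `readStages` and `takeSnapshot` callbacks, the snapshot labels, and the set of stage names treated as "CoolOff or later" are assumptions made for illustration.

```typescript
// Stage values as they appear in the logs: a plain string for unit variants,
// or an object such as { CoolOff: { endAt: "28,207,185" } }.
type Stage = string | Record<string, unknown>;

// Name of the stage variant, regardless of representation.
function stageName(stage: Stage): string {
  if (typeof stage === "string") return stage;
  return Object.keys(stage)[0] ?? "Unknown";
}

// Assumption: with a 5-minute interval a chain may already be past CoolOff
// when polled, so later stages also count. The list is illustrative.
const AT_OR_PAST_COOL_OFF = new Set(["CoolOff", "SignalMigrationFinish", "MigrationDone"]);

function migrationFinished(rc: Stage, ah: Stage): boolean {
  return AT_OR_PAST_COOL_OFF.has(stageName(rc)) && AT_OR_PAST_COOL_OFF.has(stageName(ah));
}

// Hypothetical labels for the 4 snapshots.
const SNAPSHOTS = ["rc-pre", "ah-pre", "rc-post", "ah-post"] as const;

// Polls until both chains are done, then takes the snapshots one at a time.
async function monitor(
  readStages: () => Promise<{ rc: Stage; ah: Stage }>,
  takeSnapshot: (label: string) => Promise<void>,
  intervalMs: number = 5 * 60 * 1000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<void> {
  for (;;) {
    const { rc, ah } = await readStages();
    if (migrationFinished(rc, ah)) break;
    await sleep(intervalMs);
  }
  for (const label of SNAPSHOTS) {
    await takeSnapshot(label); // sequential: wait for each before the next
  }
}
```

Passing `sleep` as a parameter is only a convenience for testing, so the loop can be exercised without waiting for real 5-minute intervals.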
Driven-by: update runtime to include CoolOff as state transition for AH coming from runtime polkadot-fellows/runtimes#952 (still to be merged!) -> UPDATE: now runtime points to polkadot-fellows/runtimes#956 (current commit: 3039b0c88ace9471e75a6da4efd378817ee56e34)
TODO: update to latest and greatest runtime once 956 is merged
How to test
CI
Test on CI via the usual AHM flow (once we solve the issue that prevents us from running the AHM flow on CI!).
How to test manually
1. just zb spawn polkadot-bite
   and wait for the network to be up
2. just zb monitor-snapshots polkadot-bite polkadot
3. just zb start-migration polkadot-bite

(or ofc, do 1., 2. and 3. from a single terminal with & if you prefer)

Wait for the migration to end; at the end, the monitor script should take the 4 snapshots sequentially.