Description
It is possible that, during disrupt_abort_repair, the call to storage_service/force_terminate_repair arrives just as one repair has finished and before the next one starts. In that case the call effectively does nothing: the nodetool repair does not fail and runs until finished, which may trigger the 120s timeout on the thread (see the sketch after the log excerpt below).
Jun 09 23:49:17.110407 longevity-50gb-12h-2024-2-db-node-d169477f-2 scylla[6905]: [shard 0:strm] repair - repair[dbc91c86-2a77-4c7a-91f9-b93721bb823e]: starting user-requested repair for keyspace system_distributed_everywhere, repair id 56, options {"trace": "false", "primaryRange": "false", "jobThreads": "1", "incremental": "false", "parallelism": "parallel"}
...
Jun 09 23:49:18.147909 longevity-50gb-12h-2024-2-db-node-d169477f-2 scylla[6905]: [shard 0:strm] repair - repair[dbc91c86-2a77-4c7a-91f9-b93721bb823e]: completed successfully
Jun 09 23:49:18.282972 longevity-50gb-12h-2024-2-db-node-d169477f-2 scylla[6905]: [shard 0:strm] repair - Started to abort repair jobs={}, nr_jobs=0
Jun 09 23:49:18.315701 longevity-50gb-12h-2024-2-db-node-d169477f-2 scylla[6905]: [shard 0:strm] repair - repair[585f8d94-1250-4719-89ea-1f53a5bed91a]: starting user-requested repair for keyspace drop_table_during_repair_ks_6, repair id 57, options {"trace": "false", "primaryRange": "false", "jobThreads": "1", "incremental": "false", "parallelism": "parallel"}
Jun 09 23:49:18.315717 longevity-50gb-12h-2024-2-db-node-d169477f-2 scylla[6905]: [shard 0:strm] repair - repair[585f8d94-1250-4719-89ea-1f53a5bed91a]: completed successfully: no tables to repair
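One way to close this window would be for the nemesis to wait until a repair is actually running before firing the abort, instead of issuing force_terminate_repair once and hoping it lands mid-repair. Below is a minimal sketch of that idea, not SCT's current implementation; it assumes the node's REST API is reachable on port 10000 and that `GET /storage_service/active_repair/` reports the ids of currently running repairs.

```python
import time
import requests

NODE_API = "http://10.12.9.121:10000"  # hypothetical node REST address, adjust per node


def abort_repair_when_active(timeout=120, poll=2):
    """Issue force_terminate_repair only while a repair is actually running.

    A single call can land in the gap between two repair rounds (as in the
    log above) and terminate nothing; waiting for an active repair before
    aborting avoids that race.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        # Assumption: this endpoint returns the list of active repair ids
        active = requests.get(f"{NODE_API}/storage_service/active_repair/").json()
        if not active:
            # Nothing running right now -- wait for the next repair round to start
            time.sleep(poll)
            continue
        # Abort whatever is running at this moment
        requests.post(f"{NODE_API}/storage_service/force_terminate_repair")
        return active
    raise TimeoutError("no repair became active within the timeout")
```

With such a check in place, the nemesis would either abort a running repair (and nodetool repair fails as expected) or report explicitly that no repair was active, instead of silently letting the repair run into the 120s thread timeout.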
Packages
Scylla version: 2024.2.11-20250609.98e7e1fec707 with build-id 0137055552a86ec74fe7808066cd25cae9b712a1
Kernel Version: 5.15.0-1085-aws
Installation details
Cluster size: 4 nodes (i4i.4xlarge)
Scylla Nodes used in this run:
- longevity-50gb-12h-2024-2-db-node-d169477f-5 (34.201.146.63 | 10.12.10.226) (shards: 12)
- longevity-50gb-12h-2024-2-db-node-d169477f-4 (13.218.33.140 | 10.12.10.191) (shards: 11)
- longevity-50gb-12h-2024-2-db-node-d169477f-3 (18.209.14.117 | 10.12.9.2) (shards: -1)
- longevity-50gb-12h-2024-2-db-node-d169477f-2 (98.84.130.64 | 10.12.9.121) (shards: 11)
- longevity-50gb-12h-2024-2-db-node-d169477f-1 (13.218.246.21 | 10.12.10.20) (shards: 11)
OS / Image: ami-0abd2efc39812f7d0 (aws: undefined_region)
Test: longevity-150gb-asymmetric-cluster-12h-test
Test id: d169477f-0422-4d78-b8ce-864c35c693db
Test name: enterprise-2024.2/tier1/longevity-150gb-asymmetric-cluster-12h-test
Test method: longevity_test.LongevityTest.test_custom_time
Test config file(s):
Logs and commands
- Restore Monitor Stack command:
$ hydra investigate show-monitor d169477f-0422-4d78-b8ce-864c35c693db
- Restore monitor on AWS instance using Jenkins job
- Show all stored logs command:
$ hydra investigate show-logs d169477f-0422-4d78-b8ce-864c35c693db
Logs:
- longevity-50gb-12h-2024-2-db-node-d169477f-5 - https://cloudius-jenkins-test.s3.amazonaws.com/d169477f-0422-4d78-b8ce-864c35c693db/20250609_120041/longevity-50gb-12h-2024-2-db-node-d169477f-5-d169477f.tar.gz
- db-cluster-d169477f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/d169477f-0422-4d78-b8ce-864c35c693db/20250610_002243/db-cluster-d169477f.tar.gz
- sct-runner-events-d169477f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/d169477f-0422-4d78-b8ce-864c35c693db/20250610_002243/sct-runner-events-d169477f.tar.gz
- sct-d169477f.log.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/d169477f-0422-4d78-b8ce-864c35c693db/20250610_002243/sct-d169477f.log.tar.gz
- loader-set-d169477f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/d169477f-0422-4d78-b8ce-864c35c693db/20250610_002243/loader-set-d169477f.tar.gz
- monitor-set-d169477f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/d169477f-0422-4d78-b8ce-864c35c693db/20250610_002243/monitor-set-d169477f.tar.gz
- ssl-conf-d169477f.tar.gz - https://cloudius-jenkins-test.s3.amazonaws.com/d169477f-0422-4d78-b8ce-864c35c693db/20250610_002243/ssl-conf-d169477f.tar.gz