System deadlock on ZFS 2.2.7 and kernel 6.12.28 #17322

Open
IvanVolosyuk opened this issue May 10, 2025 · 2 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@IvanVolosyuk
Contributor

I am running ZFS 2.2.7 and started getting system freezes, some kind of whole-system deadlock, on kernel 6.12.28(?).
All IO seems to stop, processes become unkillable, and the pool gets suspended due to IO errors. After the next reboot there are no traces of failures, so it looks like something was badly deadlocked.

System information

Type Version/Name
Distribution Name Gentoo
Distribution Version Live
Kernel Version 6.12.28
Architecture x86_64
OpenZFS Version 2.2.7

Describe the problem you're observing

The system becomes unresponsive: I cannot start new programs or kill existing ones. I can still use the terminal and see new output from programs that are still alive. On the serial console I can see kernel messages, included below.

Describe how to reproduce the problem

I don't know yet what triggers this. I'm running a few qemu/kvm VMs, one of them with heavy GPU use; pool IO is minimal.
I recently upgraded the kernel from 6.12.25 and 6.12.26 to 6.12.28 and started getting the system freezes.
This is the second freeze in a row I have experienced. I rolled back to 6.12.25 for now to see if the problem persists.

Include any warning/errors/backtraces from the system logs

[67094.137926] INFO: task systemd-journal:4828 blocked for more than 122 seconds.
[67094.137946]       Tainted: G                T  6.12.26-gentoo-x86_64 #2
[67094.137948] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[67094.137973]  ? prepare_to_wait_event+0x180/0x180
[67094.137978]  txg_wait_synced_impl+0xf8/0x140
[67094.137983]  txg_wait_synced+0x10/0x40
[67094.137986]  dmu_tx_wait+0x28e/0x410
[67094.137989]  ? spl_kmem_zalloc+0x106/0x120
[67094.137993]  dmu_tx_assign+0x15f/0x500
[67094.137995]  zfs_dirty_inode+0xd4/0x300
[67094.138000]  ? dmu_read_impl+0x156/0x190
[67094.138003]  ? dnode_rele_and_unlock+0x58/0xf0
[67094.138006]  ? __mod_memcg_lruvec_state+0xe6/0x1b0
[67094.138010]  ? __lruvec_stat_mod_folio+0x97/0x130
[67094.138012]  ? folio_add_file_rmap_ptes+0x5d/0xf0
[67094.138015]  zpl_dirty_inode+0x2b/0x40
[67094.138017]  __mark_inode_dirty+0x56/0x370
[67094.138021]  generic_update_time+0x51/0x60
[67094.138023]  file_update_time+0x66/0x70
[67094.138025]  filemap_page_mkwrite+0x78/0x180
[67094.138028]  do_page_mkwrite+0x48/0xc0
[67094.138031]  do_wp_page+0xd5/0x1050
[67094.138033]  ? __pte_offset_map+0x1b/0x180
[67094.138035]  __handle_mm_fault+0x863/0x8d0
[67094.138038]  handle_mm_fault+0x187/0x2d0
[67094.138040]  do_user_addr_fault+0x172/0x690
[67094.138043]  exc_page_fault+0x78/0x170
[67094.138045]  asm_exc_page_fault+0x26/0x30
[67094.138049] RIP: 0033:0x7fa2a5d38d70
[67094.138051] RSP: 002b:00007fff64ab2d80 EFLAGS: 00010246
[67094.138053] RAX: 00007fa292456990 RBX: 00005571185cbdf0 RCX: 00007fa2a5deca00
[67094.138054] RDX: 0000000000000000 RSI: 00007fa292456990 RDI: 00005571185cbdf0
[67094.138055] RBP: 00007fff64ab2e10 R08: 00000000000000a4 R09: 00005571185cbe28
[67094.138056] R10: 0000000000000001 R11: 0000f8ac5e6038ae R12: 000000000586bed0
[67094.138057] R13: 00007fff64ab39e0 R14: 0000000000463dc0 R15: 00007fff64ab2f70
[67094.138060]  </TASK>
[67094.138094] INFO: task rasdaemon:6060 blocked for more than 122 seconds.
[67094.138095]       Tainted: G                T  6.12.26-gentoo-x86_64 #2
[67094.138096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[67094.138097] task:rasdaemon       state:D stack:0     pid:6060  tgid:6060  ppid:1      flags:0x00004002
[67094.138099] Call Trace:
[67094.138100]  <TASK>
[67094.138102]  __schedule+0x4af/0xb50
[67094.138104]  schedule+0x27/0xd0
[67094.138106]  schedule_preempt_disabled+0x15/0x20
[67094.138108]  __mutex_lock.constprop.0+0x2e1/0x5d0
[67094.138110]  txg_wait_synced_impl+0x27/0x140
[67094.138112]  txg_wait_synced+0x10/0x40
[67094.138114]  dmu_tx_wait+0x28e/0x410
[67094.138116]  zfs_create+0x233/0x980
[67094.138118]  zpl_create+0xb6/0x1a0
[67094.138120]  path_openat+0x1070/0x1310
[67094.138123]  do_filp_open+0x9a/0x120
[67094.138126]  do_sys_openat2+0x74/0xa0
[67094.138128]  __x64_sys_openat+0x49/0x80
[67094.138130]  do_syscall_64+0x52/0x120
[67094.138133]  entry_SYSCALL_64_after_hwframe+0x50/0x58
[67094.138136] RIP: 0033:0x7f0cfad27407
[67094.138137] RSP: 002b:00007ffec58995c0 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
[67094.138139] RAX: ffffffffffffffda RBX: 00000000000a0042 RCX: 00007f0cfad27407
[67094.138142] RDX: 00000000000a0042 RSI: 0000564120411bb0 RDI: 00000000ffffff9c
[67094.138145] RBP: 0000564120411bb0 R08: 0000000000000001 R09: 0000564120411bb0
[67094.138147] R10: 00000000000001a4 R11: 0000000000000202 R12: 00000000000001a4
[67094.138149] R13: 00000000000001a4 R14: 00000000000a0042 R15: 0000000000000042
[67094.138153]  </TASK>
[67094.138173] INFO: task systemd-journal:6668 blocked for more than 122 seconds.
[67094.138176]       Tainted: G                T  6.12.26-gentoo-x86_64 #2
[67094.138177] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[67094.138178] task:systemd-jo
[67162.263773] systemd[1]: systemd-timesyncd.service: Watchdog timeout (limit 3min)!

[67168.513988] systemd[1]: [email protected]: State 'stop-watchdog' timed out. Killing.

[67172.513678] systemd[1]: systemd-journald.service: State 'stop-watchdog' timed out. Killing.

[67252.514031] systemd[1]: systemd-timesyncd.service: State 'stop-watchdog' timed out. Killing.

[67258.764078] systemd[1]: [email protected]: Processes still around after SIGKILL. Ignoring.

[67262.764091] systemd[1]: systemd-journald.service: Processes still around after SIGKILL. Ignoring.

[67342.764461] systemd[1]: systemd-timesyncd.service: Processes still around after SIGKILL. Ignoring.

[67349.014499] systemd[1]: [email protected]: State 'final-sigterm' timed out. Killing.

[67353.014510] systemd[1]: systemd-journald.service: State 'final-sigterm' timed out. Killing.

[67371.944951] brkvm: port 3(tap1) entered disabled state
[67371.945064] tap1 (unregistering): left allmulticast mode
[67371.945069] tap1 (unregistering): left promiscuous mode
[67371.945072] brkvm: port 3(tap1) entered disabled state
[67371.961298] WARNING: Pool 'ssd' has encountered an uncorrectable I/O failure and has been suspended.
[67371.961298]

[67433.014899] systemd[1]: systemd-timesyncd.service: State 'final-sigterm' timed out. Killing.

[67439.264905] systemd[1]: [email protected]: Processes still around after final SIGKILL. Entering failed mode.
[67439.264925] systemd[1]: [email protected]: Failed with result 'watchdog'.
[67439.265483] systemd[1]: [email protected]: Found left-over process 6668 (systemd-journal) in control group while starting unit. Ignoring.
[67439.265493] systemd[1]: [email protected]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

[67443.014953] systemd[1]: systemd-journald.service: Processes still around after final SIGKILL. Entering failed mode.
[67443.014971] systemd[1]: systemd-journald.service: Failed with result 'watchdog'.
[67443.016567] systemd[1]: systemd-journald.service: Found left-over process 4828 (systemd-journal) in control group while starting unit. Ignoring.
[67443.016573] systemd[1]: systemd-journald.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

@IvanVolosyuk IvanVolosyuk added the Type: Defect Incorrect behavior (e.g. crash, hang) label May 10, 2025
@robn
Member

robn commented May 10, 2025

Not much to go on. The hung tasks are waiting for the transaction to complete, which of course won't if the pool suspends.

Since the pool suspends, I'm curious what zpool events reports. And zpool status, to check the error counts (sounds like not much).
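
If you can catch it while it's wedged, something like this should show both (pool name 'ssd' taken from the log above):

zpool status -v ssd
zpool events -v ssd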

The gap between those warnings and the suspend suggests anywhere between 5 and 7 minutes between the IO being issued and the pool suspending. That points to a timeout somewhere, either the kernel returning the IO, or a ZFS timeout, maybe the deadman? Anything in dbgmsg?
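
Both of those are readable directly from the module interface on a typical Linux build (paths below are the usual locations, adjust if yours differ):

cat /proc/spl/kstat/zfs/dbgmsg
grep . /sys/module/zfs/parameters/zfs_deadman_*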

The logs look like you're doing a shutdown, are you? Or is this happening in "normal" operation as well?

What's your setting for zfs_vdev_disk_classic?
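
If you haven't set it explicitly, it can be read back from the module parameters, e.g.:

cat /sys/module/zfs/parameters/zfs_vdev_disk_classic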

Kernel versions raise a slight eyebrow. If this were reproducible on 6.12.25, but not on 6.12.24, I'd wonder if it's related to #17277. If it were, a variation of #17298 might help.

Any more info you get would be helpful, thanks!

@IvanVolosyuk
Contributor Author

zfs_vdev_disk_classic is 1.
It is also not a shutdown; it happens during normal operation! The pool consists of two SSDs and a scrub came back clean.

Thanks for the pointers, I'll keep an eye on 'zpool events' and 'dbgmsg'.
I believe the problematic kernel is 6.12.28. If anyone experiences something like this, let us know :)
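
To avoid losing that output when the pool suspends again, I'll try leaving an event follower running, writing to a filesystem that isn't on the pool (the path below is just an example):

zpool events -vf ssd >> /mnt/scratch/zpool-events.log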
