Kernel update breaks loopbacks on ZFS volumes #17277

Open
1JorgeB opened this issue Apr 27, 2025 · 7 comments · May be fixed by #17298
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@1JorgeB

1JorgeB commented Apr 27, 2025

System information

Distribution Name: Unraid
Distribution Version: 7.1.0-rc.2
Kernel Version: 6.12.25
Architecture: x86_64
OpenZFS Version: 2.3.1

Describe the problem you're observing

This looks more like a kernel issue to me, but if I report it there I expect them to say that ZFS is not supported, so I thought I would report it here first to see what you think.

After updating the kernel from 6.12.24 to 6.12.25, loopback devices backed by an image on a ZFS volume hang.

Describe how to reproduce the problem

Create a new loopback image on a ZFS volume (a single-device ZFS volume is enough), format the loopback device as XFS, and attempt to mount it. The mount hangs:

root@test2:~# dd if=/dev/zero of=/mnt/tank/image bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.219488 s, 4.9 GB/s
root@test2:~# losetup /dev/loop2 /mnt/tank/image 
root@test2:~# mkfs.xfs /dev/loop2
meta-data=/dev/loop2             isize=512    agcount=4, agsize=65536 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=0   metadir=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
         =                       rgcount=0    rgsize=0 extents
Discarding blocks...Done.
root@test2:~# mkdir /x
root@test2:~# mount /dev/loop2 /x


No panics or crashes are logged; the log stops here:

Apr 27 11:25:24 test2 kernel: loop2: detected capacity change from 0 to 2097152
Apr 27 11:26:04 test2 kernel: XFS (loop2): Mounting V5 Filesystem a6d3b0f6-ef7f-4e97-a665-e19a7b1074d2

top shows the loop kworker using 100% CPU. It cannot be killed, and the server needs a hard reboot to recover:

top - 11:28:35 up 14 min,  0 users,  load average: 1.98, 0.96, 0.43
Tasks: 252 total, 2 running, 249 sleep, 1 d-sleep, 0 stopped, 0 zombie
%Cpu(s):  1.5 us, 26.3 sy,  0.0 ni, 72.1 id,  0.1 wa,  0.0 hi,  0.1 si,  0.0 st 
MiB Mem :  31932.4 total,  30148.2 free,    952.7 used,   1351.2 buff/cache     
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  30979.7 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                    
    449 root      20   0       0      0      0 R 100.0   0.0   2:30.51 kworker/u16:5+loop2          
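
To spot this state without reading the full top output, the busy loop worker can be filtered out of top's batch-mode rows. This is a minimal sketch: the function name busy_loop_workers is hypothetical, and it assumes the default top column layout shown above (PID in column 1, %CPU in column 9, COMMAND in column 12).

```shell
# Read `top -b` process rows on stdin and print "PID %CPU COMMAND" for
# kworker threads bound to a loop device that are at or above a CPU
# threshold (first argument, default 90).
# Assumes top's default columns:
# PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
busy_loop_workers() {
    awk -v min="${1:-90}" '
        $12 ~ /^kworker\/.*loop[0-9]+/ && $9 + 0 >= min + 0 { print $1, $9, $12 }
    '
}
```

It could be used as top -b -n 1 | busy_loop_workers. Header lines and unrelated processes do not match the kworker pattern and are skipped.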

P.S. If I format the loopback device as btrfs, it doesn't hang on mount, but it does hang immediately after some I/O. Reverting the kernel to 6.12.24 resolves the issue. If any other info is needed, please let me know.

Thanks.

Include any warning/errors/backtraces from the system logs

@1JorgeB 1JorgeB added the Type: Defect Incorrect behavior (e.g. crash, hang) label Apr 27, 2025
@sparksh

sparksh commented Apr 27, 2025

Using kernel-6.13.12 with Fedora, it fails but doesn't hang:

zfs create cool/myVolume -V 5G
mkfs.xfs /dev/zd0
mkdir here
mount -t xfs /dev/zd0 here

Error message in shell session:

mount: /root/temp/here: wrong fs type, bad option, bad superblock
on /dev/zd0, missing codepage or helper program, or other error.
dmesg(1) may have more information after failed mount system call.

Error message from journald:

kernel: XFS (zd0): Mounting V5 Filesystem 2f563fe4-140f-4e1e-a562-74e561b30d95
kernel: XFS (zd0): Ending clean mount
kernel: XFS (zd0): Unmounting Filesystem 2f563fe4-140f-4e1e-a562-74e561b30d95

Checking the filesystem:

xfs_repair -n /dev/zd0

Shows:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 3
        - agno = 1
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

Similar results using ext4:

Error reported in shell session:

mount: /root/temp/here: wrong fs type, bad option, bad superblock
on /dev/zd0, missing codepage or helper program, or other error.
dmesg(1) may have more information after failed mount system call.

journald:

kernel: EXT4-fs (zd0): mounted filesystem c25d755d-cd78-4dd9-b07c-61d74693b01b r/w with ordered data mode. Quota mode: none.
kernel: EXT4-fs (zd0): unmounting filesystem c25d755d-cd78-4dd9-b07c-61d74693b01b.

I ran a check on the ext4 volume:

fsck.ext4 /dev/zd0

Which shows:

e2fsck 1.47.0 (5-Feb-2023)
/dev/zd0: clean, 12/327680 files, 42398/1310720 blocks

So filesystem checks succeed for both XFS and ext4, but neither filesystem will mount.

I had an encrypted zvol created on a previous version of Linux long ago.
It mounts and works correctly.

cryptsetup create ... test /dev/zd0
mount -t ext3 /dev/mapper/test here

@1JorgeB
Author

1JorgeB commented Apr 27, 2025

Thank you for the reply, but unless I'm missing something, you are using a zvol, not a loopback image. It does look like there's some issue with zvols and that kernel, but zvols still appear to work normally for me with kernel 6.12.25.

The problem I've seen is only with loopback images stored on a ZFS dataset. With Unraid they are mostly used to store the Docker image; regular folders or a dataset can also be used, and those still work fine with kernel 6.12.25, as do loopback images stored on btrfs or XFS. The problem is only with ZFS.

@qubitnano

I can reproduce this following the steps for xfs using:

NixOS
Kernel 6.12.25
ZFS 2.3.1
zfs on root on LUKS

dataset properties:

NAME        PROPERTY              VALUE                            SOURCE
zroot/root  type                  filesystem                       -
zroot/root  creation              Sat Mar  1 19:15 2025            -
zroot/root  used                  57.0M                            -
zroot/root  available             203G                             -
zroot/root  referenced            57.0M                            -
zroot/root  compressratio         1.04x                            -
zroot/root  mounted               yes                              -
zroot/root  quota                 none                             default
zroot/root  reservation           none                             default
zroot/root  recordsize            128K                             default
zroot/root  mountpoint            legacy                           local
zroot/root  sharenfs              off                              default
zroot/root  checksum              on                               default
zroot/root  compression           zstd                             inherited from zroot
zroot/root  atime                 off                              inherited from zroot
zroot/root  devices               on                               default
zroot/root  exec                  on                               default
zroot/root  setuid                on                               default
zroot/root  readonly              off                              default
zroot/root  zoned                 off                              default
zroot/root  snapdir               hidden                           default
zroot/root  aclmode               discard                          default
zroot/root  aclinherit            restricted                       default
zroot/root  createtxg             7                                -
zroot/root  canmount              on                               default
zroot/root  xattr                 on                               inherited from zroot
zroot/root  copies                1                                default
zroot/root  version               5                                -
zroot/root  utf8only              on                               -
zroot/root  normalization         formD                            -
zroot/root  casesensitivity       sensitive                        -
zroot/root  vscan                 off                              default
zroot/root  nbmand                off                              default
zroot/root  sharesmb              off                              default
zroot/root  refquota              none                             default
zroot/root  refreservation        none                             default
zroot/root  guid                  288749441994842549               -
zroot/root  primarycache          all                              default
zroot/root  secondarycache        all                              default
zroot/root  usedbysnapshots       0B                               -
zroot/root  usedbydataset         57.0M                            -
zroot/root  usedbychildren        0B                               -
zroot/root  usedbyrefreservation  0B                               -
zroot/root  logbias               latency                          default
zroot/root  objsetid              516                              -
zroot/root  dedup                 off                              default
zroot/root  mlslabel              none                             default
zroot/root  sync                  standard                         default
zroot/root  dnodesize             auto                             inherited from zroot
zroot/root  refcompressratio      1.04x                            -
zroot/root  written               57.0M                            -
zroot/root  logicalused           59.5M                            -
zroot/root  logicalreferenced     59.5M                            -
zroot/root  volmode               default                          default
zroot/root  filesystem_limit      none                             default
zroot/root  snapshot_limit        none                             default
zroot/root  filesystem_count      none                             default
zroot/root  snapshot_count        none                             default
zroot/root  snapdev               hidden                           default
zroot/root  acltype               posix                            inherited from zroot
zroot/root  context               none                             default
zroot/root  fscontext             none                             default
zroot/root  defcontext            none                             default
zroot/root  rootcontext           none                             default
zroot/root  relatime              on                               default
zroot/root  redundant_metadata    all                              default
zroot/root  overlay               on                               default
zroot/root  encryption            off                              default
zroot/root  keylocation           none                             default
zroot/root  keyformat             none                             default
zroot/root  pbkdf2iters           0                                default
zroot/root  special_small_blocks  0                                default
zroot/root  prefetch              all                              default
zroot/root  direct                standard                         default
zroot/root  longname              off                              default

git bisect 6.12.24..6.12.25 reveals:

78253d44e9d343258d7163ab70f4ecff4430a9b4 is the first bad commit
commit 78253d44e9d343258d7163ab70f4ecff4430a9b4 (HEAD)
Author: Christoph Hellwig <[email protected]>
Date:   Wed Apr 9 15:09:40 2025 +0200

    loop: stop using vfs_iter_{read,write} for buffered I/O
    
    [ Upstream commit f2fed441c69b9237760840a45a004730ff324faf ]
    
    vfs_iter_{read,write} always perform direct I/O when the file has the
    O_DIRECT flag set, which breaks disabling direct I/O using the
    LOOP_SET_STATUS / LOOP_SET_STATUS64 ioctls.
    
    This was recenly reported as a regression, but as far as I can tell
    was only uncovered by better checking for block sizes and has been
    around since the direct I/O support was added.
    
    Fix this by using the existing aio code that calls the raw read/write
    iter methods instead.  Note that despite the comments there is no need
    for block drivers to ever call flush_dcache_page themselves, and the
    call is a left-over from prehistoric times.
    
    Fixes: ab1cb278bc70 ("block: loop: introduce ioctl command of LOOP_SET_DIRECT_IO")
    Reported-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Reviewed-by: Ming Lei <[email protected]>
    Tested-by: Darrick J. Wong <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

 drivers/block/loop.c | 112 +++++++++++++++++-----------------------------------------------------------------------------------------------
 1 file changed, 17 insertions(+), 95 deletions(-)

Building 6.12.25 with 78253d44e9d343258d7163ab70f4ecff4430a9b4 reverted fixes this.

Other notes:

  • zfs set direct=disabled zroot/root and redoing steps has no effect on 6.12.25 (bisect suggests something with directio ?)

  • Building 6.12.25 with 2.3.2 from the zfs-2.3.2 patchset #17214 makes no difference

  • Creating the image with 6.12.24, writing data, booting 6.12.25, losetup and mounting loop causes same outcome as just reproducing the steps with 6.12.25 (100% kworker, need to hard reset). I can access data created with 6.12.24 on 6.12.25 with the commit reverted.

@ixhamza
Member

ixhamza commented Apr 28, 2025

zfs set direct=disabled zroot/root and redoing steps has no effect on 6.12.25 (bisect suggests something with directio ?)

Just to confirm, is your backing disk (/mnt/tank/image) part of the zroot/root dataset? If not, you might want to try #17218 to see if it resolves the issue.

@qubitnano

zfs set direct=disabled zroot/root and redoing steps has no effect on 6.12.25 (bisect suggests something with directio ?)

Just to confirm, is your backing disk (/mnt/tank/image) part of the zroot/root dataset? If not, you might want to try #17218 to see if it resolves the issue.

Unfortunately it does not

@PeterWang-dev

PeterWang-dev commented May 3, 2025

Possibly the same issue for me. I have a rootfs image located on an encrypted ZFS filesystem in a zpool. I used losetup -f -P rootfs.img to create a loopback device and mounted it successfully. However, writing data to it with dd if=/dev/random of=testfile bs=4K count=512K conv=fdatasync,notrunc status=progress caused a problem: the sync operation appeared to get stuck and the process went into D state. Using echo w > /proc/sysrq-trigger to dump the blocked tasks to dmesg, I saw:

[May 3 12:03] sysrq: Show Blocked State
[  +0.001393] task:txg_sync        state:D stack:0     pid:49224 tgid:49224 ppid:2      task_flags:0x288040 flags:0x00004000
[  +0.000009] Call Trace:
[  +0.000002]  <TASK>
[  +0.000004]  __schedule+0x460/0x1ff0
[  +0.000022]  ? ttwu_queue_wakelist+0xf9/0x110
[  +0.000008]  ? try_to_wake_up+0x325/0x730
[  +0.000005]  schedule+0x27/0xf0
[  +0.000003]  schedule_timeout+0x84/0x100
[  +0.000003]  ? __pfx_process_timeout+0x10/0x10
[  +0.000004]  io_schedule_timeout+0x5b/0x90
[  +0.000006]  __cv_timedwait_io+0xbe/0x150 [spl af12c84ae427751769114f0474b67e3a2df37a77]
[  +0.000014]  ? __pfx_autoremove_wake_function+0x10/0x10
[  +0.000007]  zio_wait+0x13a/0x350 [zfs de4dde959c5ceb29641386757f258284b9ad1d65]
[  +0.000248]  dsl_pool_sync+0xe9/0x5c0 [zfs de4dde959c5ceb29641386757f258284b9ad1d65]
[  +0.000247]  ? add_timer+0x183/0x210
[  +0.000005]  spa_sync+0x597/0x1070 [zfs de4dde959c5ceb29641386757f258284b9ad1d65]
[  +0.000237]  ? spa_txg_history_init_io+0x19d/0x1c0 [zfs de4dde959c5ceb29641386757f258284b9ad1d65]
[  +0.000219]  txg_sync_thread+0x20b/0x3b0 [zfs de4dde959c5ceb29641386757f258284b9ad1d65]
[  +0.000216]  ? __pfx_txg_sync_thread+0x10/0x10 [zfs de4dde959c5ceb29641386757f258284b9ad1d65]
[  +0.000195]  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl af12c84ae427751769114f0474b67e3a2df37a77]
[  +0.000011]  thread_generic_wrapper+0x5a/0x70 [spl af12c84ae427751769114f0474b67e3a2df37a77]
[  +0.000008]  kthread+0xec/0x230
[  +0.000004]  ? __pfx_kthread+0x10/0x10
[  +0.000002]  ret_from_fork+0x31/0x50
[  +0.000004]  ? __pfx_kthread+0x10/0x10
[  +0.000002]  ret_from_fork_asm+0x1a/0x30
[  +0.000004]  </TASK>
[May 3 12:04] loop0: detected capacity change from 0 to 1000000000
[May 3 12:06] loop0: detected capacity change from 0 to 1000000000
[  +3.512477] EXT4-fs (loop0): mounted filesystem 82651462-72e2-47de-8fa1-c2a1be805445 r/w with ordered data mode. Quota mode: none.
[May 3 12:08] INFO: task kworker/u80:14:965 blocked for more than 122 seconds.
[  +0.000005]       Tainted: P           OE      6.14.4-zen1-2-zen #1
[  +0.000000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  +0.000001] task:kworker/u80:14  state:D stack:0     pid:965   tgid:965   ppid:2      task_flags:0x4248060 flags:0x00004000
[  +0.000003] Workqueue: writeback wb_workfn (flush-7:0)
[  +0.000005] Call Trace:
[  +0.000001]  <TASK>
[  +0.000001]  ? __pfx_wbt_cleanup_cb+0x10/0x10
[  +0.000003]  __schedule+0x460/0x1ff0
[  +0.000004]  ? __pfx_wbt_inflight_cb+0x10/0x10
[  +0.000001]  ? __pfx_wbt_cleanup_cb+0x10/0x10
[  +0.000001]  io_schedule+0x57/0x140
[  +0.000001]  rq_qos_wait+0xbc/0x130
[  +0.000002]  ? __pfx_rq_qos_wake_function+0x10/0x10
[  +0.000001]  ? __pfx_wbt_inflight_cb+0x10/0x10
[  +0.000001]  wbt_wait+0xc3/0x180
[  +0.000001]  __rq_qos_throttle+0x24/0x50
[  +0.000002]  blk_mq_submit_bio+0x21c/0x990
[  +0.000003]  __submit_bio+0xc3/0x270
[  +0.000002]  submit_bio_noacct_nocheck+0x31f/0x400
[  +0.000002]  ext4_io_submit+0x24/0x40
[  +0.000003]  ext4_do_writepages+0x494/0x1120
[  +0.000004]  ? ext4_writepages+0xab/0x170
[  +0.000001]  ext4_writepages+0xab/0x170
[  +0.000003]  do_writepages+0x87/0x280
[  +0.000003]  ? __blk_rq_map_sg+0xb0/0x440
[  +0.000002]  ? __sbitmap_get_word+0x2b/0x70
[  +0.000001]  ? sbitmap_get+0x14f/0x390
[  +0.000002]  __writeback_single_inode+0x41/0x350
[  +0.000001]  writeback_sb_inodes+0x256/0x5d0
[  +0.000003]  __writeback_inodes_wb+0x4c/0xf0
[  +0.000001]  wb_writeback+0x323/0x3b0
[  +0.000002]  wb_workfn+0x39a/0x5d0
[  +0.000001]  ? finish_task_switch.isra.0+0x99/0x2e0
[  +0.000002]  ? __schedule+0x468/0x1ff0
[  +0.000002]  process_one_work+0x190/0x360
[  +0.000003]  worker_thread+0x24f/0x380
[  +0.000001]  ? __pfx_worker_thread+0x10/0x10
[  +0.000002]  kthread+0xec/0x230
[  +0.000001]  ? __pfx_kthread+0x10/0x10
[  +0.000001]  ret_from_fork+0x31/0x50
[  +0.000003]  ? __pfx_kthread+0x10/0x10
[  +0.000001]  ret_from_fork_asm+0x1a/0x30
[  +0.000002]  </TASK>
[  +0.000251] INFO: task dd:71207 blocked for more than 122 seconds.
[  +0.000001]       Tainted: P           OE      6.14.4-zen1-2-zen #1
[  +0.000001] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  +0.000000] task:dd              state:D stack:0     pid:71207 tgid:71207 ppid:71206  task_flags:0x440100 flags:0x00004002
[  +0.000002] Call Trace:
[  +0.000001]  <TASK>
[  +0.000000]  ? __pfx_wbt_cleanup_cb+0x10/0x10
[  +0.000002]  __schedule+0x460/0x1ff0
[  +0.000001]  ? kmem_cache_alloc_noprof+0xe1/0x410
[  +0.000002]  ? __pfx_wbt_inflight_cb+0x10/0x10
[  +0.000001]  ? __pfx_wbt_cleanup_cb+0x10/0x10
[  +0.000001]  io_schedule+0x57/0x140
[  +0.000001]  rq_qos_wait+0xbc/0x130
[  +0.000001]  ? __pfx_rq_qos_wake_function+0x10/0x10
[  +0.000001]  ? __pfx_wbt_inflight_cb+0x10/0x10
[  +0.000001]  wbt_wait+0xc3/0x180
[  +0.000001]  __rq_qos_throttle+0x24/0x50
[  +0.000001]  blk_mq_submit_bio+0x21c/0x990
[  +0.000002]  __submit_bio+0xc3/0x270
[  +0.000002]  submit_bio_noacct_nocheck+0x31f/0x400
[  +0.000002]  ext4_io_submit+0x24/0x40
[  +0.000003]  ext4_do_writepages+0x494/0x1120
[  +0.000003]  ? ext4_writepages+0xab/0x170
[  +0.000001]  ext4_writepages+0xab/0x170
[  +0.000002]  do_writepages+0x87/0x280
[  +0.000002]  ? file_tty_write.isra.0+0x20c/0x350
[  +0.000003]  __filemap_fdatawrite_range+0xb0/0xd0
[  +0.000003]  file_write_and_wait_range+0xc9/0x160
[  +0.000002]  ext4_sync_file+0x86/0x3b0
[  +0.000003]  __x64_sys_fdatasync+0x4c/0x90
[  +0.000002]  do_syscall_64+0x7b/0x190
[  +0.000002]  ? do_syscall_64+0x87/0x190
[  +0.000001]  ? __x64_sys_write+0x71/0xf0
[  +0.000002]  ? syscall_exit_to_user_mode+0x10/0x210
[  +0.000002]  ? do_syscall_64+0x87/0x190
[  +0.000001]  ? irq_exit_rcu+0x55/0x100
[  +0.000002]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  +0.000003] RIP: 0033:0x73869e62a006
[  +0.000029] RSP: 002b:00007fff2fb11120 EFLAGS: 00000202 ORIG_RAX: 000000000000004b
[  +0.000002] RAX: ffffffffffffffda RBX: 0000000000004200 RCX: 000073869e62a006
[  +0.000000] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
[  +0.000001] RBP: 00007fff2fb11140 R08: 0000000000000000 R09: 0000000000000000
[  +0.000001] R10: 0000000000000000 R11: 0000000000000202 R12: 000073869e5956c8
[  +0.000000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000080000
[  +0.000001]  </TASK>
[May 3 12:10] INFO: task kworker/u80:14:965 blocked for more than 245 seconds.
[  +0.000005]       Tainted: P           OE      6.14.4-zen1-2-zen #1
[  +0.000001] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  +0.000001] task:kworker/u80:14  state:D stack:0     pid:965   tgid:965   ppid:2      task_flags:0x4248060 flags:0x00004000
[  +0.000003] Workqueue: writeback wb_workfn (flush-7:0)
[  +0.000006] Call Trace:
[  +0.000001]  <TASK>
[  +0.000001]  ? __pfx_wbt_cleanup_cb+0x10/0x10
[  +0.000004]  __schedule+0x460/0x1ff0
[  +0.000003]  ? __pfx_wbt_inflight_cb+0x10/0x10
[  +0.000001]  ? __pfx_wbt_cleanup_cb+0x10/0x10
[  +0.000001]  io_schedule+0x57/0x140
[  +0.000002]  rq_qos_wait+0xbc/0x130
[  +0.000002]  ? __pfx_rq_qos_wake_function+0x10/0x10
[  +0.000001]  ? __pfx_wbt_inflight_cb+0x10/0x10
[  +0.000001]  wbt_wait+0xc3/0x180
[  +0.000001]  __rq_qos_throttle+0x24/0x50
[  +0.000002]  blk_mq_submit_bio+0x21c/0x990
[  +0.000003]  __submit_bio+0xc3/0x270
[  +0.000002]  submit_bio_noacct_nocheck+0x31f/0x400
[  +0.000002]  ext4_io_submit+0x24/0x40
[  +0.000004]  ext4_do_writepages+0x494/0x1120
[  +0.000004]  ? ext4_writepages+0xab/0x170
[  +0.000001]  ext4_writepages+0xab/0x170
[  +0.000003]  do_writepages+0x87/0x280
[  +0.000003]  ? __blk_rq_map_sg+0xb0/0x440
[  +0.000002]  ? __sbitmap_get_word+0x2b/0x70
[  +0.000002]  ? sbitmap_get+0x14f/0x390
[  +0.000001]  __writeback_single_inode+0x41/0x350
[  +0.000002]  writeback_sb_inodes+0x256/0x5d0
[  +0.000002]  __writeback_inodes_wb+0x4c/0xf0
[  +0.000002]  wb_writeback+0x323/0x3b0
[  +0.000001]  wb_workfn+0x39a/0x5d0
[  +0.000001]  ? finish_task_switch.isra.0+0x99/0x2e0
[  +0.000003]  ? __schedule+0x468/0x1ff0
[  +0.000001]  process_one_work+0x190/0x360
[  +0.000003]  worker_thread+0x24f/0x380
[  +0.000002]  ? __pfx_worker_thread+0x10/0x10
[  +0.000002]  kthread+0xec/0x230
[  +0.000002]  ? __pfx_kthread+0x10/0x10
[  +0.000001]  ret_from_fork+0x31/0x50
[  +0.000002]  ? __pfx_kthread+0x10/0x10
[  +0.000001]  ret_from_fork_asm+0x1a/0x30
[  +0.000002]  </TASK>
[  +0.000269] INFO: task dd:71207 blocked for more than 245 seconds.
[  +0.000001]       Tainted: P           OE      6.14.4-zen1-2-zen #1
[  +0.000001] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  +0.000000] task:dd              state:D stack:0     pid:71207 tgid:71207 ppid:71206  task_flags:0x440100 flags:0x00004002
[  +0.000002] Call Trace:
[  +0.000001]  <TASK>
[  +0.000000]  ? __pfx_wbt_cleanup_cb+0x10/0x10
[  +0.000002]  __schedule+0x460/0x1ff0
[  +0.000002]  ? kmem_cache_alloc_noprof+0xe1/0x410
[  +0.000002]  ? __pfx_wbt_inflight_cb+0x10/0x10
[  +0.000001]  ? __pfx_wbt_cleanup_cb+0x10/0x10
[  +0.000001]  io_schedule+0x57/0x140
[  +0.000001]  rq_qos_wait+0xbc/0x130
[  +0.000001]  ? __pfx_rq_qos_wake_function+0x10/0x10
[  +0.000002]  ? __pfx_wbt_inflight_cb+0x10/0x10
[  +0.000001]  wbt_wait+0xc3/0x180
[  +0.000001]  __rq_qos_throttle+0x24/0x50
[  +0.000001]  blk_mq_submit_bio+0x21c/0x990
[  +0.000002]  __submit_bio+0xc3/0x270
[  +0.000002]  submit_bio_noacct_nocheck+0x31f/0x400
[  +0.000002]  ext4_io_submit+0x24/0x40
[  +0.000002]  ext4_do_writepages+0x494/0x1120
[  +0.000003]  ? ext4_writepages+0xab/0x170
[  +0.000002]  ext4_writepages+0xab/0x170
[  +0.000001]  do_writepages+0x87/0x280
[  +0.000002]  ? file_tty_write.isra.0+0x20c/0x350
[  +0.000003]  __filemap_fdatawrite_range+0xb0/0xd0
[  +0.000003]  file_write_and_wait_range+0xc9/0x160
[  +0.000002]  ext4_sync_file+0x86/0x3b0
[  +0.000002]  __x64_sys_fdatasync+0x4c/0x90
[  +0.000002]  do_syscall_64+0x7b/0x190
[  +0.000002]  ? do_syscall_64+0x87/0x190
[  +0.000001]  ? __x64_sys_write+0x71/0xf0
[  +0.000002]  ? syscall_exit_to_user_mode+0x10/0x210
[  +0.000002]  ? do_syscall_64+0x87/0x190
[  +0.000001]  ? irq_exit_rcu+0x55/0x100
[  +0.000002]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  +0.000003] RIP: 0033:0x73869e62a006
[  +0.000033] RSP: 002b:00007fff2fb11120 EFLAGS: 00000202 ORIG_RAX: 000000000000004b
[  +0.000001] RAX: ffffffffffffffda RBX: 0000000000004200 RCX: 000073869e62a006
[  +0.000001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
[  +0.000001] RBP: 00007fff2fb11140 R08: 0000000000000000 R09: 0000000000000000
[  +0.000001] R10: 0000000000000000 R11: 0000000000000202 R12: 000073869e5956c8
[  +0.000000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000080000
[  +0.000001]  </TASK>
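
A long trace like the one above can be condensed to one line per hung-task report. The following is a minimal sketch, assuming the kernel's "INFO: task NAME:PID blocked for more than N seconds." message format seen in this log; the function name hung_tasks is hypothetical.

```shell
# Read dmesg output on stdin and print "PID NAME SECONDS" for every
# hung-task report. The task name may itself contain colons
# (e.g. kworker/u80:14), so the PID is taken as the last number
# before " blocked".
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds\..*/\2 \1 \3/p'
}
```

For example, dmesg | hung_tasks applied to the trace above would list the kworker/u80:14 writeback worker (PID 965) and the dd process (PID 71207), once per report.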

Note that I am an Arch user; the kernel I am using is the latest one, and the ZFS version is:

$ zpool --version
zfs-2.3.1.r0.gf3e4043a36-1
zfs-kmod-2.3.1.r0.gf3e4043a36-1

This build cherry-picks the fixes needed to make ZFS work on the latest kernel. I am fairly sure that it has nothing to do with this issue, because I also reproduced the issue on the latest LTS kernel (6.12.25-2). I am currently at work and cannot reboot into the LTS kernel right away.

After doing more tests, I found that this sync issue also occurs with a btrfs image. I hope this information helps you locate the problem.

@robn robn linked a pull request May 4, 2025 that will close this issue
@robn
Member

robn commented May 4, 2025

Possible fix in #17298. If it's right, then it turns out to be an ancient bug that this kernel change just happened to start tickling. Thank you for bisecting it, very useful!

tonyhutter added a commit to tonyhutter/zfs that referenced this issue May 7, 2025
Add a test case to reproduce issue openzfs#17277:

1. Make a pool
2. Write a file to the pool
3. Mount the file as a loopback device
4. Make an XFS filesystem on the loopback device
5. Mount the XFS filesystem... <hangs>

Signed-off-by: Tony Hutter <[email protected]>
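
The five steps above can be sketched as shell commands. This is not the test case added in that commit, only an illustration under stated assumptions: the pool name, vdev file, loop device and mount point are hypothetical defaults, and the commands are only printed unless DRY_RUN=0 is set, because on an affected kernel the final mount is expected to hang the machine.

```shell
# Hypothetical names; override through the environment if needed.
POOL=${POOL:-repropool}
VDEV=${VDEV:-/var/tmp/repropool.vdev}
LOOPDEV=${LOOPDEV:-/dev/loop2}
MNT=${MNT:-/mnt/repro-xfs}

# Print each command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

repro_steps() {
    run truncate -s 2G "$VDEV"                              # file-backed vdev
    run zpool create "$POOL" "$VDEV"                        # 1. make a pool
    run dd if=/dev/zero of="/$POOL/image" bs=1M count=1024  # 2. write a file
    run losetup "$LOOPDEV" "/$POOL/image"                   # 3. attach it as a loop device
    run mkfs.xfs "$LOOPDEV"                                 # 4. make an XFS filesystem
    run mkdir -p "$MNT"
    run mount "$LOOPDEV" "$MNT"                             # 5. mount it (hangs on affected kernels)
}
```

Calling repro_steps prints the seven commands in order. Running them for real (DRY_RUN=0, as root) should only be done on a disposable test machine.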