Skip to content

Three-way deadlock between z_iput, sync(), and fsync() #7964

Open
@runderwo

Description

@runderwo

System information

Type Version/Name
Distribution Name Debian
Distribution Version Stretch
Linux Kernel 4.17.0-0.bpo.3-amd64
Architecture amd64
ZFS Version 0.7.9-3~bpo9+1
SPL Version 0.7.9-3~bpo9+1

Describe the problem you're observing

Three-way deadlock between z_iput, sync(), and fsync()

Describe how to reproduce the problem

Not sure - it just happens after a few weeks on an AFS fileserver with 4-spindle raidz

I previously had to disable swap due to #7734

There are no hardware errors logged and nothing apparently wrong with the disks. I can continue to use the pool but certain operations will hang in uninterruptible sleep until the system is rebooted.

Include any warning/errors/backtraces from the system logs

[1378261.919390]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378261.924563] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378261.929657] z_iput          D    0   348      2 0x80000000
[1378261.934686] Call Trace:
[1378261.939606]  ? __schedule+0x3dc/0x860
[1378261.944430]  ? __switch_to_asm+0x34/0x70
[1378261.949234]  ? __switch_to_asm+0x40/0x70
[1378261.953852]  ? bit_wait_io_timeout+0x90/0x90
[1378261.958381]  schedule+0x32/0x80
[1378261.962864]  bit_wait+0xd/0x50
[1378261.967328]  __wait_on_bit+0x55/0x80
[1378261.971814]  ? __switch_to+0x175/0x450
[1378261.976315]  ? bit_wait_io_timeout+0x90/0x90
[1378261.980768]  __inode_wait_for_writeback+0xa7/0xe0
[1378261.985272]  ? init_wait_var_entry+0x40/0x40
[1378261.989790]  inode_wait_for_writeback+0x21/0x30
[1378261.994288]  evict+0xab/0x1c0
[1378261.998756]  taskq_thread+0x2d8/0x500 [spl]
[1378262.003184]  ? wake_up_q+0x70/0x70
[1378262.007632]  kthread+0xf8/0x130
[1378262.012013]  ? taskq_thread_spawn+0x50/0x50 [spl]
[1378262.016393]  ? kthread_create_worker_on_cpu+0x70/0x70
[1378262.020772]  ret_from_fork+0x22/0x40
[1378262.025144] INFO: task HWRaidManager:29978 blocked for more than 120 seconds.
[1378262.029447]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378262.033705] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378262.037912] HWRaidManager   D    0 29978      1 0x20020080
[1378262.042067] Call Trace:
[1378262.046099]  ? __schedule+0x3dc/0x860
[1378262.050046]  ? default_file_splice_write+0x20/0x20
[1378262.053890]  schedule+0x32/0x80
[1378262.057654]  wb_wait_for_completion+0x5e/0x90
[1378262.061335]  ? remove_wait_queue+0x60/0x60
[1378262.064932]  sync_inodes_sb+0xa7/0x280
[1378262.068438]  ? __queue_work+0x15e/0x410
[1378262.071847]  ? default_file_splice_write+0x20/0x20
[1378262.075180]  iterate_supers+0xb5/0x100
[1378262.078417]  ksys_sync+0x40/0xb0
[1378262.081593]  __ia32_sys_sync+0xa/0x10
[1378262.084747]  do_fast_syscall_32+0x98/0x1e0
[1378262.087866]  entry_SYSCALL_compat_after_hwframe+0x45/0x4d
[1378262.090938] INFO: task vlserver:18469 blocked for more than 120 seconds.
[1378262.093961]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378262.097022] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378262.100142] vlserver        D    0 18469  18467 0x00000080
[1378262.103303] Call Trace:
[1378262.106405]  ? __schedule+0x3dc/0x860
[1378262.109524]  schedule+0x32/0x80
[1378262.112687]  cv_wait_common+0x115/0x130 [spl]
[1378262.115875]  ? remove_wait_queue+0x60/0x60
[1378262.119227]  zil_commit.part.13+0x86/0x8c0 [zfs]
[1378262.122464]  ? tsd_hash_search.isra.1+0x41/0x90 [spl]
[1378262.125708]  ? tsd_set+0x2db/0x480 [spl]
[1378262.128975]  ? _cond_resched+0x16/0x40
[1378262.132376]  zfs_fsync+0x70/0xe0 [zfs]
[1378262.135798]  zpl_fsync+0x63/0x90 [zfs]
[1378262.138987]  do_fsync+0x38/0x60
[1378262.142092]  __x64_sys_fsync+0x10/0x20
[1378262.145091]  do_syscall_64+0x55/0x110
[1378262.148037]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1378262.150909] RIP: 0033:0x7fdf15610560
[1378262.153769] RSP: 002b:00007ffddb903ad8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[1378262.156735] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fdf15610560
[1378262.159688] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
[1378262.162604] RBP: 0000559976687a50 R08: 000000000000000c R09: 00007ffddb903a20
[1378262.165473] R10: 000055997532e220 R11: 0000000000000246 R12: 0000559976687a40
[1378262.168331] R13: 00007ffddb903cf8 R14: 00000000ffffff9c R15: 0000559976687a40
[1378382.755489] INFO: task z_iput:348 blocked for more than 120 seconds.
[1378382.758346]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378382.761291] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378382.764291] z_iput          D    0   348      2 0x80000000
[1378382.767307] Call Trace:
[1378382.770310]  ? __schedule+0x3dc/0x860
[1378382.773326]  ? __switch_to_asm+0x34/0x70
[1378382.776340]  ? __switch_to_asm+0x40/0x70
[1378382.779334]  ? bit_wait_io_timeout+0x90/0x90
[1378382.782321]  schedule+0x32/0x80
[1378382.785295]  bit_wait+0xd/0x50
[1378382.788256]  __wait_on_bit+0x55/0x80
[1378382.791216]  ? __switch_to+0x175/0x450
[1378382.794160]  ? bit_wait_io_timeout+0x90/0x90
[1378382.797118]  __inode_wait_for_writeback+0xa7/0xe0
[1378382.800087]  ? init_wait_var_entry+0x40/0x40
[1378382.803068]  inode_wait_for_writeback+0x21/0x30
[1378382.806036]  evict+0xab/0x1c0
[1378382.809011]  taskq_thread+0x2d8/0x500 [spl]
[1378382.811978]  ? wake_up_q+0x70/0x70
[1378382.814904]  kthread+0xf8/0x130
[1378382.817779]  ? taskq_thread_spawn+0x50/0x50 [spl]
[1378382.820652]  ? kthread_create_worker_on_cpu+0x70/0x70
[1378382.823497]  ret_from_fork+0x22/0x40
[1378382.826314] INFO: task auditd:3285 blocked for more than 120 seconds.
[1378382.829189]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378382.832123] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378382.835086] auditd          D    0  3285      1 0x00000000
[1378382.838049] Call Trace:
[1378382.841005]  ? __schedule+0x3dc/0x860
[1378382.843965]  schedule+0x32/0x80
[1378382.846931]  cv_wait_common+0x115/0x130 [spl]
[1378382.849881]  ? remove_wait_queue+0x60/0x60
[1378382.852994]  zil_commit.part.13+0x86/0x8c0 [zfs]
[1378382.855987]  ? tsd_hash_search.isra.1+0x41/0x90 [spl]
[1378382.858989]  ? tsd_set+0x2db/0x480 [spl]
[1378382.861953]  ? _cond_resched+0x16/0x40
[1378382.865043]  zfs_fsync+0x70/0xe0 [zfs]
[1378382.868107]  zpl_fsync+0x63/0x90 [zfs]
[1378382.870989]  do_fsync+0x38/0x60
[1378382.873795]  __x64_sys_fsync+0x10/0x20
[1378382.876586]  do_syscall_64+0x55/0x110
[1378382.879372]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1378382.882154] RIP: 0033:0x7fb4709786ed
[1378382.884956] RSP: 002b:00007fb46e4f0e30 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
[1378382.887829] RAX: ffffffffffffffda RBX: 0000559611942280 RCX: 00007fb4709786ed
[1378382.890722] RDX: 0000559611942280 RSI: 0000000000000000 RDI: 0000000000000004
[1378382.893611] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000f2e
[1378382.896514] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[1378382.899411] R13: 00007ffd5156da3f R14: 0000000000000000 R15: 00007fb4713f3040
[1378382.902349] INFO: task HWRaidManager:29978 blocked for more than 120 seconds.
[1378382.905289]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378382.908295] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378382.911381] HWRaidManager   D    0 29978      1 0x20020080
[1378382.914432] Call Trace:
[1378382.917486]  ? __schedule+0x3dc/0x860
[1378382.920538]  ? default_file_splice_write+0x20/0x20
[1378382.923591]  schedule+0x32/0x80
[1378382.926608]  wb_wait_for_completion+0x5e/0x90
[1378382.929588]  ? remove_wait_queue+0x60/0x60
[1378382.932575]  sync_inodes_sb+0xa7/0x280
[1378382.935545]  ? __queue_work+0x15e/0x410
[1378382.938482]  ? default_file_splice_write+0x20/0x20
[1378382.941447]  iterate_supers+0xb5/0x100
[1378382.944384]  ksys_sync+0x40/0xb0
[1378382.947255]  __ia32_sys_sync+0xa/0x10
[1378382.950060]  do_fast_syscall_32+0x98/0x1e0
[1378382.952861]  entry_SYSCALL_compat_after_hwframe+0x45/0x4d
[1378382.955677] INFO: task vlserver:18469 blocked for more than 120 seconds.
[1378382.958524]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378382.961416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378382.964385] vlserver        D    0 18469  18467 0x00000080
[1378382.967376] Call Trace:
[1378382.970326]  ? __schedule+0x3dc/0x860
[1378382.973312]  schedule+0x32/0x80
[1378382.976310]  cv_wait_common+0x115/0x130 [spl]
[1378382.979316]  ? remove_wait_queue+0x60/0x60
[1378382.982421]  zil_commit.part.13+0x86/0x8c0 [zfs]
[1378382.985437]  ? tsd_hash_search.isra.1+0x41/0x90 [spl]
[1378382.988453]  ? tsd_set+0x2db/0x480 [spl]
[1378382.991448]  ? _cond_resched+0x16/0x40
[1378382.994535]  zfs_fsync+0x70/0xe0 [zfs]
[1378382.997638]  zpl_fsync+0x63/0x90 [zfs]
[1378383.000610]  do_fsync+0x38/0x60
[1378383.003581]  __x64_sys_fsync+0x10/0x20
[1378383.006548]  do_syscall_64+0x55/0x110
[1378383.009489]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1378383.012460] RIP: 0033:0x7fdf15610560
[1378383.015399] RSP: 002b:00007ffddb903ad8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[1378383.018339] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fdf15610560
[1378383.021280] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
[1378383.024203] RBP: 0000559976687a50 R08: 000000000000000c R09: 00007ffddb903a20
[1378383.027129] R10: 000055997532e220 R11: 0000000000000246 R12: 0000559976687a40
[1378383.030049] R13: 00007ffddb903cf8 R14: 00000000ffffff9c R15: 0000559976687a40
[1378503.587095] INFO: task z_iput:348 blocked for more than 120 seconds.
[1378503.590045]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378503.593155] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378503.596215] z_iput          D    0   348      2 0x80000000
[1378503.599285] Call Trace:
[1378503.602332]  ? __schedule+0x3dc/0x860
[1378503.605388]  ? __switch_to_asm+0x34/0x70
[1378503.608448]  ? __switch_to_asm+0x40/0x70
[1378503.611499]  ? bit_wait_io_timeout+0x90/0x90
[1378503.614549]  schedule+0x32/0x80
[1378503.617605]  bit_wait+0xd/0x50
[1378503.620655]  __wait_on_bit+0x55/0x80
[1378503.623694]  ? __switch_to+0x175/0x450
[1378503.626709]  ? bit_wait_io_timeout+0x90/0x90
[1378503.629708]  __inode_wait_for_writeback+0xa7/0xe0
[1378503.632690]  ? init_wait_var_entry+0x40/0x40
[1378503.635654]  inode_wait_for_writeback+0x21/0x30
[1378503.638607]  evict+0xab/0x1c0
[1378503.641578]  taskq_thread+0x2d8/0x500 [spl]
[1378503.644536]  ? wake_up_q+0x70/0x70
[1378503.647460]  kthread+0xf8/0x130
[1378503.650336]  ? taskq_thread_spawn+0x50/0x50 [spl]
[1378503.653185]  ? kthread_create_worker_on_cpu+0x70/0x70
[1378503.656003]  ret_from_fork+0x22/0x40
[1378503.658817] INFO: task auditd:3285 blocked for more than 120 seconds.
[1378503.661676]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378503.664595] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378503.667544] auditd          D    0  3285      1 0x00000000
[1378503.670506] Call Trace:
[1378503.673447]  ? __schedule+0x3dc/0x860
[1378503.676396]  schedule+0x32/0x80
[1378503.679353]  cv_wait_common+0x115/0x130 [spl]
[1378503.682306]  ? remove_wait_queue+0x60/0x60
[1378503.685414]  zil_commit.part.13+0x86/0x8c0 [zfs]
[1378503.688397]  ? tsd_hash_search.isra.1+0x41/0x90 [spl]
[1378503.691391]  ? tsd_set+0x2db/0x480 [spl]
[1378503.694360]  ? _cond_resched+0x16/0x40
[1378503.697439]  zfs_fsync+0x70/0xe0 [zfs]
[1378503.700490]  zpl_fsync+0x63/0x90 [zfs]
[1378503.703372]  do_fsync+0x38/0x60
[1378503.706182]  __x64_sys_fsync+0x10/0x20
[1378503.708964]  do_syscall_64+0x55/0x110
[1378503.711734]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1378503.714521] RIP: 0033:0x7fb4709786ed
[1378503.717325] RSP: 002b:00007fb46e4f0e30 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
[1378503.720204] RAX: ffffffffffffffda RBX: 0000559611942280 RCX: 00007fb4709786ed
[1378503.723099] RDX: 0000559611942280 RSI: 0000000000000000 RDI: 0000000000000004
[1378503.725991] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000f2e
[1378503.728895] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[1378503.731792] R13: 00007ffd5156da3f R14: 0000000000000000 R15: 00007fb4713f3040
[1378503.734737] INFO: task HWRaidManager:29978 blocked for more than 120 seconds.
[1378503.737681]       Tainted: P           O      4.17.0-0.bpo.3-amd64 #1 Debian 4.17.17-1~bpo9+1
[1378503.740688] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1378503.743728] HWRaidManager   D    0 29978      1 0x20020080
[1378503.746782] Call Trace:
[1378503.749832]  ? __schedule+0x3dc/0x860
[1378503.752880]  ? default_file_splice_write+0x20/0x20
[1378503.755929]  schedule+0x32/0x80
[1378503.758928]  wb_wait_for_completion+0x5e/0x90
[1378503.761924]  ? remove_wait_queue+0x60/0x60
[1378503.764908]  sync_inodes_sb+0xa7/0x280
[1378503.767873]  ? __queue_work+0x15e/0x410
[1378503.770812]  ? default_file_splice_write+0x20/0x20
[1378503.773774]  iterate_supers+0xb5/0x100
[1378503.776702]  ksys_sync+0x40/0xb0
[1378503.779576]  __ia32_sys_sync+0xa/0x10
[1378503.782427]  do_fast_syscall_32+0x98/0x1e0
[1378503.785225]  entry_SYSCALL_compat_after_hwframe+0x45/0x4d
rpool:
    version: 5000
    name: 'rpool'
    state: 0
    txg: 475211
    pool_guid: 17441379133630177917
    errata: 0
    hostid: 715009
    hostname: 'seraph'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 17441379133630177917
        children[0]:
            type: 'raidz'
            id: 0
            guid: 11960073590447584870
            nparity: 1
            metaslab_array: 35
            metaslab_shift: 37
            ashift: 12
            asize: 15998825529344
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 40
            children[0]:
                type: 'disk'
                id: 0
                guid: 13773523373170097779
                path: '/dev/disk/by-id/ata-HGST_HUS724040ALA640_PN2334PBJ3SV0T-part3'
                whole_disk: 0
                DTL: 65
                create_txg: 4
                com.delphix:vdev_zap_leaf: 41
            children[1]:
                type: 'disk'
                id: 1
                guid: 14927593852271164084
                path: '/dev/disk/by-id/ata-WDC_WD4003FZEX-00Z4SA0_WD-WMC1F0D1ZE3K-part3'
                whole_disk: 0
                DTL: 77
                create_txg: 4
                com.delphix:vdev_zap_leaf: 268
            children[2]:
                type: 'disk'
                id: 2
                guid: 4795217381968720306
                path: '/dev/disk/by-id/ata-HGST_HUS724040ALA640_PN2334PBJ525GT-part3'
                whole_disk: 0
                DTL: 76
                create_txg: 4
                com.delphix:vdev_zap_leaf: 75
            children[3]:
                type: 'disk'
                id: 3
                guid: 14529900694222241193
                path: '/dev/disk/by-id/ata-WDC_WD4003FZEX-00Z4SA0_WD-WMC5D0D2U951-part3'
                whole_disk: 0
                DTL: 62
                create_txg: 4
                com.delphix:vdev_zap_leaf: 44
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 4h38m with 0 errors on Sun Sep  9 05:02:45 2018
config:

        NAME                                                  STATE     READ WRITE CKSUM
        rpool                                                 ONLINE       0     0     0
          raidz1-0                                            ONLINE       0     0     0
            ata-HGST_HUS724040ALA640_PN2334PBJ3SV0T-part3     ONLINE       0     0     0
            ata-WDC_WD4003FZEX-00Z4SA0_WD-WMC1F0D1ZE3K-part3  ONLINE       0     0     0
            sda3                                              ONLINE       0     0     0
            ata-WDC_WD4003FZEX-00Z4SA0_WD-WMC5D0D2U951-part3  ONLINE       0     0     0
        spares
          sdf3                                                AVAIL   

errors: No known data errors

Metadata

Metadata

Assignees

No one assigned

    Labels

    Bot: Not StaleOverride for the stale botType: DefectIncorrect behavior (e.g. crash, hang)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions