[prev in list] [next in list] [prev in thread] [next in thread]
List: freebsd-fs
Subject: zfs: repeatable assertion failure in zap_deref_leaf on 14-CURRENT
From: Lawrence Stewart <lstewart () freebsd ! org>
Date: 2021-10-04 4:06:44
Message-ID: 91c2e8a5-3e43-e295-b388-a21e58105bf2 () freebsd ! org
[Download RAW message or body]
Greetings all,
I'm able to reliably trigger a panic with ZFS implicated in the backtrace. This is on \
a laptop with a very simple zpool built on top of a single GELI gpart partition on a \
single nvme disk. A scrub has not found anything of concern.
I encounter the panic when I start Chromium and it begins restoring lots of previous \
windows/tabs. I can also trigger by simply running an ls or find on \
~/.cache/chromium/Cache so presumably something about the contents of that directory \
have been corrupted in an interesting way which is upsetting ZFS. Moving the \
directory out of the way allows me to start Chromium without issue.
I'm currently running world + kernel built a few days ago from Git main 335c4f8edb3a \
kernconf=GENERIC amd64 with a fully up to date package set and drm-kmod-current built \
from ports against my kernel.
I captured a core earlier today and have it around for any spelunking that would be \
helpful to anyone who may be interested. Ignoring the drm-related parts of the \
backtrace which happens after the ZFS panic, frame 36 looks like it's the cause of \
the crash - namely this assertion in zap.c:
ASSERT(err ||
ZAP_HASH_IDX(h, zap_leaf_phys(*lp)->l_hdr.lh_prefix_len) ==
zap_leaf_phys(*lp)->l_hdr.lh_prefix);
Below I've included some output from a kgdb session against the core I captured which \
I hope may be a useful starting point. As noted earlier I can easily move the Cache \
dir out of the way so this is not a blocking issue, but I've kept the Cache dir in \
case anyone is interested in debugging.
Cheers,
Lawrence
Full backtrace:
(kgdb) bt
#0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1 doadump (textdump=175153152) at /usr/src/sys/kern/kern_shutdown.c:399
#2 0xffffffff804cfece in db_fncall_generic (addr=<optimized out>, rv=<optimized \
out>, nargs=0, args=<optimized out>) at /usr/src/sys/ddb/db_command.c:610
#3 db_fncall (dummy1=<optimized out>, dummy2=<optimized out>, dummy3=<optimized \
out>, dummy4=<optimized out>) at /usr/src/sys/ddb/db_command.c:658
#4 0xffffffff804cfad2 in db_command (last_cmdp=<optimized out>, cmd_table=<optimized \
out>, dopager=dopager@entry=1) at /usr/src/sys/ddb/db_command.c:482
#5 0xffffffff804cf74d in db_command_loop () at /usr/src/sys/ddb/db_command.c:535
#6 0xffffffff804d2ec6 in db_trap (type=<optimized out>, code=<optimized out>) at \
/usr/src/sys/ddb/db_main.c:270 #7 0xffffffff80c75487 in kdb_trap (type=type@entry=3, \
code=code@entry=0, tf=tf@entry=0xfffffe01531a6f80) at \
/usr/src/sys/kern/subr_kdb.c:733 #8 0xffffffff810eedd3 in trap \
(frame=0xfffffe01531a6f80) at /usr/src/sys/amd64/amd64/trap.c:609 #9 <signal handler \
called> #10 kdb_enter (why=0xffffffff812e9008 "panic", msg=<optimized out>) at \
/usr/src/sys/kern/subr_kdb.c:506 #11 0xffffffff80c277e8 in vpanic \
(fmt=0xffffffff811f49e6 "running but not TDS_RUNNING", ap=<optimized out>, \
ap@entry=0xfffffe01531a70f0) at /usr/src/sys/kern/kern_shutdown.c:908 #12 \
0xffffffff80c27573 in panic (fmt=0xffffffff81e9e1e0 <cnputs_mtx> \
"\270\\*\201\377\377\377\377") at /usr/src/sys/kern/kern_shutdown.c:844
#13 0xffffffff80c856f7 in sleepq_switch (wchan=wchan@entry=0xfffff800063b0888, \
pri=pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:609
#14 0xffffffff80c85b2b in sleepq_timedwait (wchan=wchan@entry=0xfffff800063b0888, \
pri=pri@entry=0) at /usr/src/sys/kern/subr_sleepqueue.c:690
#15 0xffffffff80e9ae5f in linux_add_to_sleepqueue (wchan=0xfffff800063b0888, \
task=task@entry=0xfffff800063b0888, wmesg=<optimized out>, \
timeout=timeout@entry=11, state=state@entry=2) at \
/usr/src/sys/compat/linuxkpi/common/src/linux_schedule.c:70 #16 0xffffffff80e9acb6 in \
linux_wait_event_common (wqh=<optimized out>, wq=<optimized out>, timeout=11, \
state=2, lock=0x0) at /usr/src/sys/compat/linuxkpi/common/src/linux_schedule.c:281
#17 0xffffffff839b4960 in intel_dp_aux_xfer () from /boot/modules/i915kms.ko
#18 0xffffffff839b4269 in intel_dp_aux_transfer () from /boot/modules/i915kms.ko
#19 0xffffffff83a74b79 in drm_dp_dpcd_access () from /boot/modules/drm.ko
#20 0xffffffff83a74c27 in drm_dp_dpcd_write () from /boot/modules/drm.ko
#21 0xffffffff839acb62 in intel_dp_sink_dpms () from /boot/modules/i915kms.ko
#22 0xffffffff839a8d76 in intel_ddi_post_disable () from /boot/modules/i915kms.ko
#23 0xffffffff8393fe7b in haswell_crtc_disable () from /boot/modules/i915kms.ko
#24 0xffffffff83955cdc in intel_old_crtc_state_disables () from \
/boot/modules/i915kms.ko #25 0xffffffff839543c5 in intel_atomic_commit_tail () from \
/boot/modules/i915kms.ko #26 0xffffffff839507da in intel_atomic_commit () from \
/boot/modules/i915kms.ko #27 0xffffffff83a6d378 in drm_client_modeset_commit_atomic \
() from /boot/modules/drm.ko #28 0xffffffff83a6d0d6 in \
drm_client_modeset_commit_force () from /boot/modules/drm.ko #29 0xffffffff83aae50a \
in drm_fb_helper_restore_fbdev_mode_unlocked () from /boot/modules/drm.ko #30 \
0xffffffff83aa8397 in vt_kms_postswitch () from /boot/modules/drm.ko #31 \
0xffffffff80a79f8f in vt_window_switch (vw=0xfffffe014eee7928, \
vw@entry=0xffffffff81aea0f8 <vt_conswindow>) at /usr/src/sys/dev/vt/vt_core.c:620
#32 0xffffffff80a773af in vtterm_cngrab (tm=<optimized out>) at \
/usr/src/sys/dev/vt/vt_core.c:1832 #33 0xffffffff80bbba56 in cngrab () at \
/usr/src/sys/kern/kern_cons.c:383 #34 0xffffffff80c27720 in vpanic \
(fmt=0xffffffff82835173 "%s", ap=ap@entry=0xfffffe01531a7930) at \
/usr/src/sys/kern/kern_shutdown.c:892 #35 0xffffffff8258361a in spl_panic \
(file=<optimized out>, func=<optimized out>, line=-2127879961, \
fmt=0xffffffff81e9e1e0 <cnputs_mtx> "\270\\*\201\377\377\377\377") at \
/usr/src/sys/contrib/openzfs/module/os/freebsd/spl/spl_misc.c:107 #36 \
0xffffffff82749312 in zap_deref_leaf (zap=zap@entry=0xfffff8035a34d600, h=<optimized \
out>, tx=tx@entry=0x0, lt=lt@entry=RW_READER, lp=lp@entry=0xfffffe01531a7ab0) at \
/usr/src/sys/contrib/openzfs/module/zfs/zap.c:623 #37 0xffffffff8274b9a7 in \
fzap_cursor_retrieve (zap=0xffffffff81e9e1e0 <cnputs_mtx>, \
zc=zc@entry=0xfffffe01531a7aa0, za=za@entry=0xfffffe01531a7b78) at \
/usr/src/sys/contrib/openzfs/module/zfs/zap.c:1249 #38 0xffffffff82753fde in \
zap_cursor_retrieve (zc=zc@entry=0xfffffe01531a7aa0, za=za@entry=0xfffffe01531a7b78) \
at /usr/src/sys/contrib/openzfs/module/zfs/zap_micro.c:1583 #39 0xffffffff825a8a43 in \
zfs_readdir (vp=<optimized out>, uio=0xfffffe01531a7b60, cr=<optimized out>, \
eofp=0xfffffe01531a7d04, ncookies=<optimized out>, cookies=0x0) at \
/usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:1814 #40 \
zfs_freebsd_readdir (ap=<optimized out>) at \
/usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:4729 #41 \
0xffffffff811b8004 in VOP_READDIR_APV (vop=0xffffffff828f3360 <zfs_vnodeops>, \
a=a@entry=0xfffffe01531a7d08) at vnode_if.c:1939
#42 0xffffffff80d2a919 in VOP_READDIR (vp=0xfffff8035a1d5540, uio=0xfffffe01531a7d48, \
cred=<optimized out>, eofflag=0xfffffe01531a7d04, ncookies=0x0, cookies=0x0) at \
./vnode_if.h:985 #43 kern_getdirentries (td=<optimized out>, fd=<optimized out>,
buf=0x814a05000 <error: Cannot access memory at address 0x814a05000>, count=4096, \
basep=basep@entry=0xfffffe01531a7df0, residp=residp@entry=0x0, \
bufseg=UIO_USERSPACE) at /usr/src/sys/kern/vfs_syscalls.c:4193 #44 0xffffffff80d2ab19 \
in sys_getdirentries (td=0xffffffff81e9e1e0 <cnputs_mtx>, \
uap=0xfffffe0153346b30)
--Type <RET> for more, q to quit, c to continue without paging--
at /usr/src/sys/kern/vfs_syscalls.c:4140
#45 0xffffffff810efbae in syscallenter (td=<optimized out>) at \
/usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189 #46 amd64_syscall \
(td=0xfffffe0153346740, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1191 #47 <signal \
handler called> #48 0x000000080f1d79fa in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfdfc418
(kgdb) frame 36
#36 0xffffffff82749312 in zap_deref_leaf (zap=zap@entry=0xfffff8035a34d600, \
h=<optimized out>, tx=tx@entry=0x0, lt=lt@entry=RW_READER, \
lp=lp@entry=0xfffffe01531a7ab0) at /usr/src/sys/contrib/openzfs/module/zfs/zap.c:623 \
623 ASSERT(err ||
(kgdb) p err
$10 = <optimized out>
(kgdb) p h
$11 = <optimized out>
(kgdb) p **lp
$13 = {l_dbu = {dbu_tqent = {tqent_task = {ta_link = {stqe_next = 0x0}, ta_pending = \
0, ta_priority = 0 '\000',
ta_flags = 0 '\000', ta_func = 0x0, ta_context = 0x0}, tqent_timeout_task = \
{q = 0x0, t = {ta_link = {
stqe_next = 0x0}, ta_pending = 0, ta_priority = 0 '\000', ta_flags = 0 \
'\000', ta_func = 0x0,
ta_context = 0x0}, c = {c_links = {le = {le_next = 0x0, le_prev = 0x0}, sle \
= {sle_next = 0x0}, tqe = {
tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_precision = 0, c_arg = \
0x0, c_func = 0x0, c_lock = 0x0,
c_flags = 0, c_iflags = 0, c_cpu = 0}, f = 0}, tqent_func = 0x0, tqent_arg \
= 0x0, tqent_id = 0, tqent_hash = {
cle_next = 0x0, cle_prev = 0x0}, tqent_type = 0 '\000', tqent_registered = 0 \
'\000', tqent_cancelled = 0 '\000',
tqent_rc = 0}, dbu_evict_func_sync = 0xffffffff8274c5d0 <zap_leaf_evict_sync>, \
dbu_evict_func_async = 0x0, dbu_clear_on_evict_dbufp = 0xfffff8035d29ca08}, \
l_rwlock = {lock_object = {
lo_name = 0xffffffff82865b9f <.L.str.25+1> "l->l_rwlock", lo_flags = 40960000, \
lo_data = 0, lo_witness = 0x0}, sx_lock = 33}, l_blkid = 182, l_bs = 14, l_dbuf = \
0xfffff8035d291b28}
(kgdb) info register
rax 0x12 18
rbx 0x0 0
rcx 0x1 1
rdx 0xffffffff812b20e7 -2127879961
rsi 0x80 128
rdi 0xffffffff81e9e1e0 -2115378720
rbp 0xfffffe01531a79a0 0xfffffe01531a79a0
rsp 0xfffffe01531a7960 0xfffffe01531a7960
r8 0x0 0
r9 0x1c200001 471859201
r10 0xffffffff81aea0f8 -2119261960
r11 0x20 32
r12 0x2 2
r13 0x0 0
r14 0xfffff8035a34d600 -8781694708224
r15 0x42 66
rip 0xffffffff82749312 0xffffffff82749312 <zap_deref_leaf+242>
eflags 0x86 [ PF SF ]
cs 0x20 32
ss 0x28 40
ds <unavailable>
es <unavailable>
fs <unavailable>
gs <unavailable>
fs_base <unavailable>
gs_base <unavailable>
(kgdb) disassemble zap_deref_leaf
Dump of assembler code for function zap_deref_leaf:
0xffffffff82749220 <+0>: push %rbp
0xffffffff82749221 <+1>: mov %rsp,%rbp
0xffffffff82749224 <+4>: push %r15
0xffffffff82749226 <+6>: push %r14
0xffffffff82749228 <+8>: push %r13
0xffffffff8274922a <+10>: push %r12
0xffffffff8274922c <+12>: push %rbx
0xffffffff8274922d <+13>: sub $0x18,%rsp
0xffffffff82749231 <+17>: mov %ecx,%r12d
0xffffffff82749234 <+20>: mov 0xe8(%rdi),%rax
0xffffffff8274923b <+27>: mov 0x18(%rax),%rax
0xffffffff8274923f <+31>: movabs $0x8000000000000001,%rcx
0xffffffff82749249 <+41>: cmp %rcx,(%rax)
0xffffffff8274924c <+44>: jg 0xffffffff82749314 <zap_deref_leaf+244>
0xffffffff82749252 <+50>: movabs $0x2f52ab2ab,%rcx
0xffffffff8274925c <+60>: cmp %rcx,0x8(%rax)
0xffffffff82749260 <+64>: jne 0xffffffff82749314 <zap_deref_leaf+244>
0xffffffff82749266 <+70>: mov %rdx,%r13
0xffffffff82749269 <+73>: mov %rsi,%r15
0xffffffff8274926c <+76>: mov %rdi,%r14
0xffffffff8274926f <+79>: mov %r8,-0x30(%rbp)
0xffffffff82749273 <+83>: mov 0x20(%rax),%rax
0xffffffff82749277 <+87>: mov %eax,%ecx
0xffffffff82749279 <+89>: neg %cl
0xffffffff8274927b <+91>: shr %cl,%rsi
0xffffffff8274927e <+94>: test %rax,%rax
0xffffffff82749281 <+97>: cmove %rax,%rsi
0xffffffff82749285 <+101>: lea -0x38(%rbp),%rdx
0xffffffff82749289 <+105>: call 0xffffffff8274aae0 <zap_idx_to_blk>
0xffffffff8274928e <+110>: mov %eax,%ebx
0xffffffff82749290 <+112>: test %eax,%eax
0xffffffff82749292 <+114>: jne 0xffffffff8274933a <zap_deref_leaf+282>
0xffffffff82749298 <+120>: mov -0x38(%rbp),%rsi
0xffffffff8274929c <+124>: mov %r14,%rdi
0xffffffff8274929f <+127>: mov %r13,%rdx
0xffffffff827492a2 <+130>: mov %r12d,%ecx
0xffffffff827492a5 <+133>: mov -0x30(%rbp),%r8
0xffffffff827492a9 <+137>: call 0xffffffff8274bf60 <zap_get_leaf_byblk>
0xffffffff827492ae <+142>: mov %eax,%ebx
0xffffffff827492b0 <+144>: test %eax,%eax
0xffffffff827492b2 <+146>: jne 0xffffffff8274933a <zap_deref_leaf+282>
0xffffffff827492b8 <+152>: mov -0x30(%rbp),%rax
0xffffffff827492bc <+156>: mov (%rax),%rax
0xffffffff827492bf <+159>: mov 0x108(%rax),%rax
0xffffffff827492c6 <+166>: mov 0x18(%rax),%rax
0xffffffff827492ca <+170>: mov 0x20(%rax),%edx
0xffffffff827492cd <+173>: mov %edx,%ecx
0xffffffff827492cf <+175>: neg %cl
0xffffffff827492d1 <+177>: shr %cl,%r15
0xffffffff827492d4 <+180>: xor %ebx,%ebx
0xffffffff827492d6 <+182>: test %dx,%dx
0xffffffff827492d9 <+185>: mov $0x0,%ecx
0xffffffff827492de <+190>: cmovne %r15,%rcx
0xffffffff827492e2 <+194>: cmp 0x10(%rax),%rcx
0xffffffff827492e6 <+198>: je 0xffffffff8274933a <zap_deref_leaf+282>
0xffffffff827492e8 <+200>: xor %ebx,%ebx
0xffffffff827492ea <+202>: mov $0xffffffff82869886,%rdi
0xffffffff827492f1 <+209>: mov $0xffffffff8282e438,%rsi
0xffffffff827492f8 <+216>: mov $0x271,%edx
0xffffffff827492fd <+221>: mov $0xffffffff82835173,%rcx
0xffffffff82749304 <+228>: mov $0xffffffff8283164f,%r8
0xffffffff8274930b <+235>: xor %eax,%eax
0xffffffff8274930d <+237>: call 0xffffffff825835e0 <spl_panic>
=> 0xffffffff82749312 <+242>: jmp 0xffffffff8274933a <zap_deref_leaf+282>
0xffffffff82749314 <+244>: mov 0x1a5f4d(%rip),%rdi # 0xffffffff828ef268 \
<sdt_sdt___set__error+72> 0xffffffff8274931b <+251>: mov $0x5,%ebx
0xffffffff82749320 <+256>: test %rdi,%rdi
0xffffffff82749323 <+259>: je 0xffffffff8274933a <zap_deref_leaf+282>
--Type <RET> for more, q to quit, c to continue without paging--
0xffffffff82749325 <+261>: mov $0x5,%esi
0xffffffff8274932a <+266>: xor %edx,%edx
0xffffffff8274932c <+268>: xor %ecx,%ecx
0xffffffff8274932e <+270>: xor %r8d,%r8d
0xffffffff82749331 <+273>: xor %r9d,%r9d
0xffffffff82749334 <+276>: call *-0xc364d2(%rip) # 0xffffffff81b12e68 \
<sdt_probe_func> 0xffffffff8274933a <+282>: mov %ebx,%eax
0xffffffff8274933c <+284>: add $0x18,%rsp
0xffffffff82749340 <+288>: pop %rbx
0xffffffff82749341 <+289>: pop %r12
0xffffffff82749343 <+291>: pop %r13
0xffffffff82749345 <+293>: pop %r14
0xffffffff82749347 <+295>: pop %r15
0xffffffff82749349 <+297>: pop %rbp
0xffffffff8274934a <+298>: ret
End of assembler dump.
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic