
List:       openbsd-bugs
Subject:    Re: VM panic'ed in _rb_remove
From:       Paul de Weerd <weerd () weirdnet ! nl>
Date:       2021-07-11 16:57:30
Message-ID: YOsi+vfIkiXt5Hp4 () despair ! weirdnet ! nl

About half an hour ago, my home gateway (not a VM) crashed in a
similar way:

uvm_fault(0xfffffd886b74ccc8, 0x18, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at      _rb_remove+0x294:       cmpl    $0x1,0x18(%rdi)
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*494318  96931    864           0          0    1K rsync
_rb_remove(ffffffff81f19df0,fffffd8100015000,fffffd8124af9c00) at _rb_remove+0x294
uvm_pmr_get1page(8,0,ffff800033548400,100000,0,0) at uvm_pmr_get1page+0x467
uvm_pmr_getpages(8,100000,0,1,0,8) at uvm_pmr_getpages+0x40a
uvm_pagerealloc_multi(fffffd85042ff990,0,8000,22,ffffffff822597c0) at uvm_pagerealloc_multi+0xce
buf_realloc_pages(fffffd85042ff8d0,ffffffff822597c0,2) at buf_realloc_pages+0xaf
buf_flip_high(fffffd85042ff8d0) at buf_flip_high+0x70
bufcache_recover_dmapages(0,8) at bufcache_recover_dmapages+0x10b
buf_get(fffffd82cd153430,75,8000) at buf_get+0xcf
getblk(fffffd82cd153430,75,8000,0,ffffffffffffffff) at getblk+0x71
ffs1_balloc(fffffd846ebfc790,3a8000,2da,fffffd876b8e10c8,1,ffff800033548a90) at ffs1_balloc+0xd19
ffs_write(ffff800033548b10) at ffs_write+0x229
VOP_WRITE(fffffd82cd153430,ffff800033548c78,1,fffffd876b8e10c8) at VOP_WRITE+0x4f
vn_write(fffffd87a7aec7f8,ffff800033548c78,0) at vn_write+0xcf
dofilewritev(ffff8000335b02a0,6,ffff800033548c78,0,ffff800033548d50) at dofilewritev+0x14d
end trace frame: 0xffff800033548ce0, count: 0
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.

The trace is obviously different, but the same _rb_remove function is at play.

In this instance, the machine was just rsync'ing the latest OpenBSD
snapshot from a local mirror (I have a cron job that keeps the amd64
snap in sync, for easy installs on a bunch of local VMs).

This machine does not have NFS mounts, but is the NFS server for the
local network (serving the amd64 snap to local machines).

Paul


On Wed, Jul 07, 2021 at 08:46:18AM +0200, Paul de Weerd wrote:
> Hi all,
> 
> This morning I found one of my vmm VMs at the ddb> prompt.  Mail, logs
> and the ps output from ddb all suggest this was during the daily(8)
> run of security(8): I did get the daily mail, but not the one from
> security (I was expecting one, as the machine was upgraded a few hours
> earlier).
> 
> This is what ddb said:
> 
> uvm_fault(0xfffffd80324d7660, 0x4, 0, 2) -> e
> kernel: page fault trap, code=0
> Stopped at      _rb_remove+0x1eb:       movq    %r13,0(%rsi)
> TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *146794   1685      0         0x2          0    0  perl
> ddb> show panic
> *cpu0: uvm_fault(0xfffffd80324d7660, 0x4, 0, 2) -> e
> ddb> trace
> _rb_remove(ffffffff81ec86b8,4,fffffd80356113e0) at _rb_remove+0x1eb
> nfs_reclaim(ffff800014e32d70) at nfs_reclaim+0x7d
> VOP_RECLAIM(fffffd8033259db0,ffff800014cfad20) at VOP_RECLAIM+0x50
> vclean(fffffd8033259db0,8,ffff800014cfad20) at vclean+0x156
> vgonel(fffffd8033259db0,ffff800014cfad20) at vgonel+0x5f
> getnewvnode(1,ffff8000000e5c00,ffffffff81f40378,ffff800014e32ef0) at getnewvnode+0x1eb
> ffs_vget(ffff8000000e5c00,32d5a,ffff800014e32fc0) at ffs_vget+0x8a
> ufs_lookup() at ufs_lookup+0xc6e
> VOP_LOOKUP(fffffd803c4bd2f8,ffff800014e33200,ffff800014e33250) at VOP_LOOKUP+0x46
> vfs_lookup(ffff800014e331d0) at vfs_lookup+0x363
> namei(ffff800014e331d0) at namei+0x275
> dofstatat(ffff800014cfad20,ffffff9c,92364a7ea00,92326352390,2) at dofstatat+0x8c
> syscall(ffff800014e33450) at syscall+0x359
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7ffffed300, count: -14
> 
> Since there's a reference to nfs (nfs_reclaim) in the trace: this
> machine does have an NFS mount to an NFS server in my local network.
> 
> I tried to follow the instructions of ddb.html:
> 
> - _rb_remove comes from kern/subr_tree.c
> - objdump -dlr /usr/share/relink/kernel/GENERIC/subr_tree.o > /tmp/ts
> - ==> 0000000000000000 <_rb_remove>:
> - instruction at 00000000000001eb: 1eb:   4c 89 2e   mov %r13,(%rsi)
> - instruction 'mov %r13,(%rsi)' matches ddb output
> - that's from /usr/src/sys/kern/subr_tree.c:381
> - that line has: RBH_ROOT(rbt) = child;
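
The lookup in the steps above can be sketched as a small script. This is
only an illustration, not part of ddb.html: the helper names
parse_stopped_at and locate are made up here, and SAMPLE is a heavily
abbreviated stand-in for the /tmp/ts file, assembled from the values
quoted in this mail (real objdump -dlr output is far longer).

```python
import re


def parse_stopped_at(line):
    """Extract (symbol, offset) from a ddb 'Stopped at' line."""
    m = re.search(r"Stopped at\s+(\w+)\+0x([0-9a-fA-F]+):", line)
    if m is None:
        raise ValueError("not a ddb 'Stopped at' line")
    return m.group(1), int(m.group(2), 16)


def locate(disasm, symbol, offset):
    """Find symbol+offset in 'objdump -dl' style text.

    Returns (source_location, instruction), or None if not found.
    """
    base = None      # start address of the wanted function, once seen
    source = None    # most recent 'file:line' annotation
    for raw in disasm.splitlines():
        line = raw.strip()
        head = re.match(r"([0-9a-fA-F]+) <([^>]+)>:$", line)
        if head:
            # A new function starts; track it only if it is the wanted one.
            base = int(head.group(1), 16) if head.group(2) == symbol else None
            source = None
            continue
        if base is None:
            continue
        if re.match(r"/\S+:\d+$", line):
            source = line
            continue
        insn = re.match(
            r"([0-9a-fA-F]+):\s+((?:[0-9a-fA-F]{2}\s)+)\s*(.*)$", line)
        if insn and int(insn.group(1), 16) == base + offset:
            return source, insn.group(3).strip()
    return None


# Abbreviated, illustrative sample built from the values in this report.
SAMPLE = """\
0000000000000000 <_rb_remove>:
/usr/src/sys/kern/subr_tree.c:381
 1eb:\t4c 89 2e \tmov    %r13,(%rsi)
"""

if __name__ == "__main__":
    sym, off = parse_stopped_at(
        "Stopped at      _rb_remove+0x1eb:       movq    %r13,0(%rsi)")
    print(sym, hex(off), locate(SAMPLE, sym, off))
```

The instruction found this way still has to be compared by eye with
what ddb printed, since ddb and objdump format operands slightly
differently (movq %r13,0(%rsi) versus mov %r13,(%rsi)).
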
> 
> This is way above my pay grade, but hopefully useful to someone else.
> 
> Cheers,
> 
> Paul 'WEiRD' de Weerd
> 
> --- dmesg ------------------------------------------------------------
> OpenBSD 6.9-current (GENERIC) #101: Mon Jul  5 10:31:56 MDT 2021
> deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> real mem = 1056952320 (1007MB)
> avail mem = 1009606656 (962MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf36e0 (10 entries)
> bios0: vendor SeaBIOS version "1.14.0-OpenBSD-vmm" date 01/01/2011
> bios0: OpenBSD VMM
> acpi at bios0 not configured
> cpu0 at mainbus0: (uniprocessor)
> cpu0: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3394.35 MHz, 06-3c-03
> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,CX8,SEP,PGE,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,MD_CLEAR,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> cpu0: using VERW MDS workaround
> pvbus0 at mainbus0: OpenBSD
> pvclock0 at pvbus0
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
> virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
> viornd0 at virtio0
> virtio0: irq 3
> virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Network" rev 0x00
> vio0 at virtio1: address fe:e1:bb:d1:c6:d9
> virtio1: irq 5
> virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Storage" rev 0x00
> vioblk0 at virtio2
> scsibus1 at vioblk0: 1 targets
> sd0 at scsibus1 targ 0 lun 0: <VirtIO, Block Device, >
> sd0: 10240MB, 512 bytes/sector, 20971520 sectors
> virtio2: irq 6
> virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00
> vmmci0 at virtio3
> virtio3: irq 7
> isa0 at mainbus0
> isadma0 at isa0
> com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo
> com0: console
> dt: 445 probes
> vscsi0 at root
> scsibus2 at vscsi0: 256 targets
> softraid0 at root
> scsibus3 at softraid0: 256 targets
> root on sd0a (f63229f8671d4a9a.a) swap on sd0b dump on sd0b
> ----------------------------------------------------------------------
> 
> -- 
> > ++++++++[<++++++++++>-]<+++++++.>+++[<------>-]<.>+++[<+
> +++++++++++>-]<.>++[<------------>-]<+.--------------.[-]
> http://www.weirdnet.nl/                 
> 

-- 
> ++++++++[<++++++++++>-]<+++++++.>+++[<------>-]<.>+++[<+
+++++++++++>-]<.>++[<------------>-]<+.--------------.[-]
                 http://www.weirdnet.nl/                 

