List:       openbsd-bugs
Subject:    Re: OpenBSD amd64 6.9 repeatable kernel panic starting X
From:       M Smith <msbsd () ipsec ! co ! nz>
Date:       2021-09-15 21:50:57
Message-ID: a782c0b3-f4d1-f953-5127-4eeda546e36f () ipsec ! co ! nz

On 16/09/21 2:29 am, Martin Pieuchot wrote:
> On 13/09/21(Mon) 08:25, M Smith wrote:
> > On 8/09/21 3:37 am, Martin Pieuchot wrote:
> > > Hello,
> > > 
> > > Thanks for your bug report.
> > > 
> > > On 07/09/21(Tue) 15:18, M Smith wrote:
> > > > > Synopsis:	OpenBSD amd64 6.9 repeatable kernel panic starting X
> > > > > Category:	kernel
> > > > > Environment:
> > > > 
> > > > 	System      : OpenBSD 6.9
> > > > 	Details     : OpenBSD 6.9 (GENERIC.MP) #4: Tue Aug 10 08:12:23 MDT 2021
> > > > 			root@syspatch-69-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > > >  
> > > > 	Architecture: OpenBSD.amd64
> > > > 	Machine     : amd64
> > > > 
> > > > > Description:
> > > > 
> > > > 		I have been investigating a largely repeatable OpenBSD 6.9 amd64 panic.  \
> > > > Essentially the OS drops into the kernel debugger about 90% of the time when \
> > > > starting X on specific hardware, and is doing so with what seems like a \
> > > > memory related issue - possibly errant modification by concurrent threads.
> > > 
> > > Indeed.  You're certainly hitting a VM/pmap bug.
> > > 
> > > > 	The event is reproducible across two independent machines (both new).  Each \
> > > > machine has identical underlying hardware.  A memory checker run overnight on \
> > > > one machine did not identify any underlying memory issues.
> > > 
> > > That points to something in your setup which exposes the bug.
> > > 
> > > > 	The hardware: Avalue EMS-TGL-S85-A1-1R, CPU an 11th Gen Intel(R) Core(TM) \
> > > > i7-1185G7E @ 2.80GHz with 2x 16GB memory boards (32GB in total). 
> > > > 	Regarding the possible errant memory modification mentioned above: the assertion underlying \
> > > > this panic (https://www.sirranet.co.nz/openbsd_542456/69_panic.html) suggests \
> > > > that kernel execution has failed to obtain a necessary exclusivity lock.  \
> > > > Various other panics differ in that many feature assertions based on \
> > > > "pool_do_get ... offset ???" with the offset identifying the trigger \
> > > > condition, hinting at a memory inconsistency. 
> > > > 	Testing on 7.0-current \
> > > > (https://www.sirranet.co.nz/openbsd_542456/70_panic.html) sometimes results \
> > > > in a panic on boot before invoking startX, other times the boot fails to \
> > > > complete cleanly at the kernel linking step with the error "reordering \
> > > > libraries ld in calloc(): chunk info corrupted" and similar errors.  Whether \
> > > > these two events are related to the 6.9 panic is anything but conclusive. 
> > > > 	I see others have posted what looks like the same issue.  I have posted the \
> > > > above detail however as the assert identifying the lack of kernel lock looks \
> > > > as though it may be of some value.  \
> > > > https://marc.info/?t=161769314800002&r=1&w=2  \
> > > > https://marc.info/?t=162390602600001&r=1&w=2
> > > 
> > > All those reports have in common an 11th Gen Intel CPU.
> > > 
> > > > 	Any ideas would be greatly appreciated.
> > > 
> > > You could start by booting bsd.sp to rule out any HW problem.
> > 
> > Sorry for the delay in replying.
> > 
> > Both 6.9 and 7.0 crash when booting bsd.sp
> > https://www.sirranet.co.nz/openbsd_542456/69_reply.html
> > https://www.sirranet.co.nz/openbsd_542456/70_reply.html
> 
> That rules out any concurrency issue.
> 
> > > Does the corruption happen with a vanilla install or does running
> > > a particular program make it easier to trigger?
> > 
> > These are both basic installs. After a fresh install I have run fw_update,
> > and on the 6.9 machine syspatch was run. Other than that we have enabled
> > xenodm. No other software or packages are installed or running. The machines
> > don't always crash on first boot, but after a handful of reboots they do.
> > 
> > > > 	I can easily test/re-test on both 6.9 and 7.0-current.
> > > 
> > > Does it also happen if you disable drm at boot?
> > > 
> > 
> > On both 6.9 and 7.0, if I disable drm, the machine panics on reboot. (Images
> > in the links above.)
> 
> Please make sure you also disable inteldrm(4).  That's why you're
> getting a panic on 6.9.  This is to see if the issue is related to
> the graphics driver.
> 
> 

With drm and inteldrm installed 6.9 still crashes often. I booted it 10
times. A few times it booted, the rest of the time it crashed. (Again
this is with a basic install with xenodm enabled.)
https://www.sirranet.co.nz/openbsd_542456/69_drm_inteldrm.html

Thanks
Megan

