List:       openbsd-bugs
Subject:    Re: Kernel panic involving drm
From:       Mark Kettenis <mark.kettenis () xs4all ! nl>
Date:       2021-07-14 8:09:08
Message-ID: 5613a3d1248de040 () bloch ! sibelius ! xs4all ! nl

> Date: Wed, 14 Jul 2021 15:50:46 +1000
> From: Jonathan Gray <jsg@jsg.id.au>
> 
> On Tue, Jul 13, 2021 at 10:11:29PM +0100, Tom Murphy wrote:
> > Hi Jonathan,
> > 
> > On Tue, Jul 13, 2021 at 01:13:03PM +1000, Jonathan Gray wrote:
> > > On Mon, Jul 12, 2021 at 06:22:36PM +0000, Tom Murphy wrote:
> > > > I had firefox open (various tabs/windows) and was playing a 3D game
> > > > (games/quakespasm) and after a random amount of time I got a hard lock up,
> > > > but the second time it happened I was able to get into a ddb prompt. I've
> > > > added the panic message and trace and dmesg.
> > > > 
> > > > I don't have a serial console on this laptop so had to transcribe this by
> > > > hand from a photo I took on my phone. (Is there an easier way to save
> > > > these?)
> > > 
> > > It is possible to get a trace out of a crash dump, see crash(8).
> > > But yes, serial or AMT SOL is easier.
> > 
> > Thanks! I'll have a closer look at crash(8).
> > 
> > > > panic: kernel diagnostic assertion "to_ticks >= 0" failed: file
> > > > "/usr/src/sys/kern/kern_timeout.c", line 299
> > > > Stopped at      db_enter+0x10:  popq   %rbp
> > > >     TID    PID    UID     PRFLAGS    PFLAGS   CPU     COMMAND
> > > > *395451  18070      0     0x14000     0x200     0K    drmtskl
> > > >   61931  46160      0     0x14000     0x200     2     drmwq
> > > >  185485  53694      0     0x14000     0x200     1     drmwq
> > > >  284820  77991      0     0x14000     0x200     3     drmwq
> > > > db_enter() at db_enter+0x10
> > > > panic(ffffffff81e5f243) at panic+0xbf
> > > > __assert(ffffffff81ec940a,ffffffff81eb2a3e,12b,ffffffff81ec1795) at
> > > > __assert+0x2b
> > > > timeout_add(ffff800000bf0410,ffffffff) at timeout_add+0x1cc
> > > > process_csb(ffff800000bf0000) at process_csb+0x38b
> > > > execlists_submission_tasklet(ffff800000bf0000) at
> > > > execlists_submission_tasklet+0x48
> > > > tasklet_run(ffff800000bf03c0) at tasklet_run+0x44
> > > > taskq_thread(ffff800000220f00) at taskq_thread+0x81
> > > > end trace frame: 0x0, count: 7
> > > 
> > > The timeout_add() call comes from i915_utils.c set_timer_ms()
> > > 	mod_timer(t, jiffies + timeout ?: 1);
> > > 
> > > can you try this patch?
> > > 
> > > Index: sys/dev/pci/drm/include/linux/timer.h
> > > ===================================================================
> > > RCS file: /cvs/src/sys/dev/pci/drm/include/linux/timer.h,v
> > > retrieving revision 1.6
> > > diff -u -p -r1.6 timer.h
> > > --- sys/dev/pci/drm/include/linux/timer.h	7 Jul 2021 02:38:36 -0000	1.6
> > > +++ sys/dev/pci/drm/include/linux/timer.h	13 Jul 2021 02:54:26 -0000
> > > @@ -24,10 +24,20 @@
> > >  #include <sys/kernel.h>
> > >  #include <linux/ktime.h>
> > >  
> > > -#define mod_timer(x, y)		timeout_add((x), ((y) - jiffies))
> > >  #define del_timer_sync(x)	timeout_del_barrier((x))
> > >  #define del_timer(x)		timeout_del((x))
> > >  #define timer_pending(x)	timeout_pending((x))
> > > +
> > > +static inline int
> > > +mod_timer(struct timeout *to, unsigned long j)
> > > +{
> > > +	int ticks = j - jiffies;
> > > +	if (ticks <= 0) {
> > > +		timeout_del(to);
> > > +		return timeout_add(to, 1);
> > > +	}
> > > +	return timeout_add(to, ticks);
> > > +}
> > >  
> > >  static inline unsigned long
> > >  round_jiffies_up(unsigned long j)
> > 
> > This patch seems to work for me. I did some pretty rigorous testing
> > with it and attempted to recreate the same conditions that made it
> > crash; however, I wasn't able to trigger the kernel panic, so that
> > is a good sign!
> 
> Thanks, I committed a slightly different version which does the test
> without a type conversion.
> 
> Index: sys/dev/pci/drm/include/linux/timer.h
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/include/linux/timer.h,v
> retrieving revision 1.6
> diff -u -p -r1.6 timer.h
> --- sys/dev/pci/drm/include/linux/timer.h	7 Jul 2021 02:38:36 -0000	1.6
> +++ sys/dev/pci/drm/include/linux/timer.h	14 Jul 2021 04:49:03 -0000
> @@ -24,10 +24,19 @@
>  #include <sys/kernel.h>
>  #include <linux/ktime.h>
>  
> -#define mod_timer(x, y)		timeout_add((x), ((y) - jiffies))
>  #define del_timer_sync(x)	timeout_del_barrier((x))
>  #define del_timer(x)		timeout_del((x))
>  #define timer_pending(x)	timeout_pending((x))
> +
> +static inline int
> +mod_timer(struct timeout *to, unsigned long j)
> +{
> +	if (j <= jiffies) {
> +		timeout_del(to);

Any reason why you do a timeout_del() here?

> +		return timeout_add(to, 1);
> +	}
> +	return timeout_add(to, j - jiffies);
> +}
>  
>  static inline unsigned long
>  round_jiffies_up(unsigned long j)
> 
> 
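
For readers following the thread, here is a minimal userland sketch of
the difference between the old macro and the committed mod_timer(). It
is not the kernel code: fake_jiffies, ticks_old() and ticks_new() are
hypothetical names made up for illustration, and the sketch assumes the
tick count ends up in an int (as the ffffffff argument in the trace
suggests) on a two's-complement platform. It only models the tick
computation, not the timeout_del() call.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's jiffies counter. */
static unsigned long fake_jiffies;

/*
 * Models the old macro: the difference is taken unconditionally.
 * If the deadline j is already in the past, the unsigned result
 * wraps and becomes negative once converted to int.
 */
static int
ticks_old(unsigned long j)
{
	return (int)(j - fake_jiffies);
}

/*
 * Models the committed version: compare the unsigned values first
 * and subtract only when j lies in the future, so the result passed
 * on is always at least 1.
 */
static int
ticks_new(unsigned long j)
{
	int ticks;

	if (j <= fake_jiffies)
		return 1;
	ticks = (int)(j - fake_jiffies);
	assert(ticks >= 0);	/* mirrors the "to_ticks >= 0" check */
	return ticks;
}
```

With fake_jiffies set to 1000, ticks_old(999) gives a negative count,
which is the kind of value the "to_ticks >= 0" assertion rejects, while
ticks_new(999) falls back to a single tick. Deadlines in the future are
computed the same way by both.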
