List:       freebsd-hackers
Subject:    Re: 9.3-RELEASE panic: spin lock held too long
From:       Konstantin Belousov <kostikbel () gmail ! com>
Date:       2016-08-11 8:18:02
Message-ID: 20160811081802.GF83214 () kib ! kiev ! ua

On Thu, Aug 11, 2016 at 03:43:22AM +0430, Hooman Fazaeli wrote:
> On 2016-08-10 22:10, Ryan Stone wrote:
> > On Wed, Aug 10, 2016 at 1:23 PM, Hooman Fazaeli
> > <hoomanfazaeli@gmail.com> wrote:
> > > No. I have panics involving 'turnstile lock' (see the original post)
> > > and 'sched lock 2' too.
> >
> > That doesn't necessarily mean that the root cause isn't due to sched
> > lock 0 being leaked.  You'd have to dig into the cores and look at the
> > chain of dependent locks to be sure.  Give the patch a try; it should
> > panic quite quickly if it's the issue I am thinking of.
> 
> Sure, I will.
> BTW, what exactly do you mean by lock leaking?
>
> Is there a list of the possible causes of 'spin lock held too long' panics?
> I mean, what sorts of coding bugs may cause a thread to hold a spin lock for
> a long time? Such a list would give me a starting point for diagnostics.
It is impossible to provide a complete list.

Possible causes are:
- the already mentioned lock leak;
- lock recursion (sometimes);
- something which delays execution of the protected region, when the
  spinlock is otherwise taken for legitimate reasons and for a legitimate
  period, e.g.:
	infinite or too aggressive looping, e.g. due to a deadlock
	with spinlocks;
	an NMI with a run-away handler;
	a core that failed and stopped executing;
	an SMI or a hypervisor taking control away from the OS on the given
	CPU, while allowing other threads on other CPUs to run and notice that;
and so on.
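
To illustrate the first item, a lock leak is typically a code path that
returns while still owning the lock. The following is only a user-space
sketch with C11 atomics, not kernel code; the names toy_spin, toy_try_lock,
toy_unlock and leaky_update are made up for the example.

```c
/* Hypothetical user-space sketch of a lock leak; not FreeBSD kernel code. */
#include <stdatomic.h>
#include <stdbool.h>

struct toy_spin {
	atomic_flag locked;
};

/* Try once to take the lock; returns true if we now own it. */
static bool
toy_try_lock(struct toy_spin *l)
{
	/* test_and_set returns the previous value: false means it was free. */
	return (!atomic_flag_test_and_set_explicit(&l->locked,
	    memory_order_acquire));
}

static void
toy_unlock(struct toy_spin *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Buggy on purpose: the error path returns with the lock still held. */
static int
leaky_update(struct toy_spin *l, int *counter, int delta)
{
	if (!toy_try_lock(l))
		return (-1);	/* lock is owned by someone else */
	if (delta < 0)
		return (-2);	/* BUG: missing toy_unlock(l), lock is leaked */
	*counter += delta;
	toy_unlock(l);
	return (0);
}
```

After a single call such as leaky_update(&l, &c, -1), no later caller can
take the lock again. In the kernel, the next thread spinning on such a
lock would eventually hit the 'held too long' panic.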

> 
> And, how long is 'too long'? What is the justification behind
> the few million for() loop iterations that _mtx_lock_spin waits
> to grab a spin lock?
This is purely based on real-life experience with the hardware. If faster
CPUs with slower inter-core communication facilities ever appear, the
constant might need an adjustment. It is fine for the fastest current
hardware, and by design is OK for anything slower.
_______________________________________________
freebsd-hackers@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"

