List:       dragonfly-bugs
Subject:    [DragonFlyBSD - Bug #2436] panic: assertion "lp->lwp_qcpu == dd->cpuid" failed in dfly_acquire_curpr
From:       Matthew Dillon via Redmine <bugtracker-admin () leaf ! dragonflybsd ! org>
Date:       2013-01-23 19:07:48
Message-ID: redmine.journal-11216.20130123110748 () leaf ! dragonflybsd ! org


Issue #2436 has been updated by dillon.


Well, the assertion protects against the scheduling metrics being applied
to the wrong cpu, which over time would cause cpus to be improperly
weighted and potentially locked out for no reason.  The assertion itself
is correct; the question is why it is being hit.

So far I haven't had any luck tracking down code paths where qcpu would be
wrong.  The only possibility I can think of is that the system call the
thread was making (I can't tell which one from the backtrace) moved the
thread to a different current cpu without telling the scheduler.
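
To make the suspected failure mode concrete, here is a minimal userland
sketch. It is not the code in usched_dfly.c. Only the field names lwp_qcpu
and cpuid come from the assertion text; every other name (the model_*
structures and functions, the uload and load counters, cur_cpu) is a
hypothetical stand-in, and the accounting is deliberately simplified.

```c
/*
 * Userland model only. Everything except the field names lwp_qcpu and
 * cpuid is hypothetical and does not mirror the real kernel structures.
 */
#include <assert.h>
#include <stdbool.h>

#define MODEL_NCPUS 8

/* Per-cpu scheduler state, standing in for the "dd" in the assertion. */
struct model_pcpu {
    int cpuid;      /* cpu this structure belongs to */
    int uload;      /* hypothetical load metric charged to this cpu */
};

/* Per-thread state, standing in for the "lp" in the assertion. */
struct model_lwp {
    int lwp_qcpu;   /* cpu the scheduler believes the thread belongs to */
    int cur_cpu;    /* cpu the thread is actually running on */
    int load;       /* hypothetical load contributed by this thread */
};

struct model_pcpu model_cpus[MODEL_NCPUS];

/* Give every modeled cpu its id and a zero load. */
void model_init(void)
{
    for (int i = 0; i < MODEL_NCPUS; i++) {
        model_cpus[i].cpuid = i;
        model_cpus[i].uload = 0;
    }
}

/* Well-behaved migration: the load moves and lwp_qcpu is updated. */
void model_migrate_with_scheduler(struct model_lwp *lp, int newcpu)
{
    assert(newcpu >= 0 && newcpu < MODEL_NCPUS);
    model_cpus[lp->lwp_qcpu].uload -= lp->load;
    model_cpus[newcpu].uload += lp->load;
    lp->lwp_qcpu = newcpu;
    lp->cur_cpu = newcpu;
}

/* Suspected bug: the thread changes cpu but the scheduler is not told. */
void model_migrate_silently(struct model_lwp *lp, int newcpu)
{
    assert(newcpu >= 0 && newcpu < MODEL_NCPUS);
    lp->cur_cpu = newcpu;   /* lwp_qcpu and the load counters stay stale */
}

/* The invariant: the thread's qcpu must match the current cpu's id. */
bool model_qcpu_matches(const struct model_lwp *lp)
{
    const struct model_pcpu *dd = &model_cpus[lp->cur_cpu];
    return lp->lwp_qcpu == dd->cpuid;
}

/* What an unchecked path would do: adjust the current cpu's metrics. */
void model_release_load(const struct model_lwp *lp)
{
    struct model_pcpu *dd = &model_cpus[lp->cur_cpu];
    dd->uload -= lp->load;
}
```

In this model, a thread with load 10 that is migrated properly to cpu 3 and
then silently moved to cpu 5 fails model_qcpu_matches(). If the load were
released anyway, cpu 5 would end up at -10 while cpu 3 kept a stale 10,
which is the kind of mis-weighting the assertion guards against.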

I need a kernel core to determine whether that is the case or not.

-Matt
----------------------------------------
Bug #2436: panic: assertion "lp->lwp_qcpu == dd->cpuid" failed in dfly_acquire_curproc
http://bugs.dragonflybsd.org/issues/2436

Author: thomas.nikolajsen
Status: New
Priority: Normal
Assignee: 
Category: 
Target version: 


On current master, changing the cpumask while using the dfly scheduler can result in a panic.
The problem occurs on both DragonFly i386 and x86_64.
The bsd4 scheduler doesn't have this problem.

E.g. on an 8-core system, running 'usched dfly:3 true' a few times during a
buildkernel triggers the panic. A core dump is available on request.

 -thomas
-
Unread portion of the kernel message buffer:
panic: assertion "lp->lwp_qcpu == dd->cpuid" failed in dfly_acquire_curproc at /usr/src/sys/kern/usched_dfly.c:382
cpuid = 0
Trace beginning at frame 0xe4347c54
panic(ffffffff,0,c0396874,e4347c88,d92e1b80) at panic+0x1a8 0xc01bf150
panic(c0396874,c03af2b4,c03af386,c03af174,17e) at panic+0x1a8 0xc01bf150
dfly_acquire_curproc(daee0e00,e4347d00,10,0,0) at dfly_acquire_curproc+0x1ca 0xc01ca47b
syscall2(e4347d40) at syscall2+0x420 0xc037b5af
Xint0x80_syscall() at Xint0x80_syscall+0x36 0xc034c246
Debugger("panic")

CPU0 stopping CPUs: 0x000000fe
 stopped
..
_get_mycpu () at ./machine/thread.h:79
79          __asm ("movl %%fs:globaldata,%0" : "=r" (gd) : "m"(__mycpu__dummy));
(kgdb) bt
#0  _get_mycpu () at ./machine/thread.h:79
#1  md_dumpsys (di=0xc079d820)
    at /usr/src/sys/platform/pc32/i386/dump_machdep.c:266
#2  0xc01be8fe in dumpsys () at /usr/src/sys/kern/kern_shutdown.c:925
#3  0xc015938a in db_fncall (dummy1=-1070290686, dummy2=0,
    dummy3=-1072326021, dummy4=0xe4347ae4 "4m4\300\037\361<\300")
    at /usr/src/sys/ddb/db_command.c:539
#4  0xc015986f in db_command (aux_cmd_tablep_end=0xc03ee69c,
    aux_cmd_tablep=0xc03ee698, cmd_table=<optimized out>,
    last_cmdp=<optimized out>) at /usr/src/sys/ddb/db_command.c:401
#5  db_command_loop () at /usr/src/sys/ddb/db_command.c:467
#6  0xc015c3ce in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:71
#7  0xc034ac75 in kdb_trap (type=3, code=0, regs=0xe4347c04)
    at /usr/src/sys/platform/pc32/i386/db_interface.c:151
#8  0xc037ae34 in trap (frame=0xe4347c04)
    at /usr/src/sys/platform/pc32/i386/trap.c:850
#9  0xc034c197 in calltrap ()
    at /usr/src/sys/platform/pc32/i386/exception.s:787
#10 0xc034a902 in breakpoint () at ./cpu/cpufunc.h:72
#11 Debugger (msg=0xc03ad70a "panic")
    at /usr/src/sys/platform/pc32/i386/db_interface.c:333
#12 0xc01bf165 in panic (
    fmt=0xc0396874 "assertion \"%s\" failed in %s at %s:%u")
    at /usr/src/sys/kern/kern_shutdown.c:822
#13 0xc01ca47b in dfly_acquire_curproc (lp=0xdaee0e00)
    at /usr/src/sys/kern/usched_dfly.c:382
#14 0xc037b5af in userexit (lp=<optimized out>)
    at /usr/src/sys/platform/pc32/i386/trap.c:362
#15 syscall2 (frame=0xe4347d40) at /usr/src/sys/platform/pc32/i386/trap.c:1419
#16 0xc034c246 in Xint0x80_syscall ()
    at /usr/src/sys/platform/pc32/i386/exception.s:878
#17 0x0000001f in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

