List:       kde-bugs-dist
Subject:    [valgrind] [Bug 339537] New: two threads hang in pthread_spin_lock and pthread_spin_unlock
From:       sage () newdream ! net <sage () newdream ! net>
Date:       2014-09-30 18:52:12
Message-ID: bug-339537-17878 () http ! bugs ! kde ! org/

https://bugs.kde.org/show_bug.cgi?id=339537

            Bug ID: 339537
           Summary: two threads hang in pthread_spin_lock and
                    pthread_spin_unlock
           Product: valgrind
           Version: unspecified
          Platform: Ubuntu Packages
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: NOR
         Component: memcheck
          Assignee: jseward@acm.org
          Reporter: sage@newdream.net

I have two threads running under valgrind, one stuck in pthread_spin_lock, and
one stuck in pthread_spin_unlock:

Thread 43 (Thread 21954):
#0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:33
#1  0x000000000068c11a in ceph_spin_lock (l=0x41264a8) at ./include/Spinlock.h:45
#2  lock (this=0x41264a8) at ./include/Spinlock.h:94
#3  Locker (s=..., this=<synthetic pointer>) at ./include/Spinlock.h:105
#4  is_active (this=0x4126000) at osd/OSD.h:1074
#5  OSD::dispatch_context (this=this@entry=0x4126000, ctx=..., pg=pg@entry=0x0, curmap=..., handle=handle@entry=0x21b38a70) at osd/OSD.cc:7129
#6  0x0000000000698977 in OSD::process_peering_events (this=0x4126000, pgs=..., handle=...) at osd/OSD.cc:8467
#7  0x00000000006edfb8 in OSD::PeeringWQ::_process (this=<optimized out>, pgs=..., handle=...) at osd/OSD.h:1595
#8  0x0000000000b70ee6 in ThreadPool::worker (this=0x41264b0, wt=0xc412090) at common/WorkQueue.cc:128
#9  0x0000000000b71f90 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:318
#10 0x0000000005d72182 in start_thread (arg=0x21b39700) at pthread_create.c:312
#11 0x000000000784438d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 27 (Thread 21938):
#0  pthread_spin_unlock () at ../nptl/sysdeps/x86_64/pthread_spin_unlock.S:23
#1  0x00000000006597de in ceph_spin_unlock (l=0x41264a8) at ./include/Spinlock.h:50
#2  unlock (this=0x41264a8) at ./include/Spinlock.h:98
#3  ~Locker (this=<synthetic pointer>, __in_chrg=<optimized out>) at ./include/Spinlock.h:108
#4  is_active (this=0x4126000) at osd/OSD.h:1075
#5  OSD::require_self_aliveness (this=this@entry=0x4126000, op=..., epoch=epoch@entry=2071) at osd/OSD.cc:6742
#6  0x00000000006776fe in OSD::require_same_or_newer_map (this=this@entry=0x4126000, op=..., epoch=2071, is_fast_dispatch=is_fast_dispatch@entry=false) at osd/OSD.cc:6802
#7  0x000000000069fd56 in OSD::handle_pg_notify (this=0x4126000, op=...) at osd/OSD.cc:7310
#8  0x00000000006a2c58 in OSD::dispatch_op (this=this@entry=0x4126000, op=...) at osd/OSD.cc:5690
#9  0x00000000006a81d8 in OSD::_dispatch (this=this@entry=0x4126000, m=m@entry=0x333592a0) at osd/OSD.cc:5843
#10 0x00000000006a88a7 in OSD::ms_dispatch (this=0x4126000, m=0x333592a0) at osd/OSD.cc:5386
#11 0x0000000000c203e9 in ms_deliver_dispatch (m=0x333592a0, this=0x4068700) at msg/Messenger.h:532
#12 DispatchQueue::entry (this=0x40688b8) at msg/DispatchQueue.cc:185
#13 0x0000000000b5cd8d in DispatchQueue::DispatchThread::entry (this=<optimized out>) at msg/DispatchQueue.h:104
#14 0x0000000005d72182 in start_thread (arg=0x19b29700) at pthread_create.c:312
#15 0x000000000784438d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This looks somewhat similar to #336435, but that fix should already be in my
version (1:3.10~20140411-0ubuntu1).


Reproducible: Sometimes

Steps to Reproduce:
This occurs very infrequently in our regression tests, but we have seen it
several times.  It is tracked here:  http://tracker.ceph.com/issues/8822.




Valgrind is run like so:

 valgrind --num-callers=50 \
   --suppressions=/home/ubuntu/cephtest/valgrind.supp \
   --xml=yes --xml-file=/var/log/ceph/valgrind/osd.1.log --time-stamp=yes \
   --tool=memcheck ceph-osd -f -i 1

gdb shows that both threads are hung on the same pthread_spinlock_t, which
looks like this:

(gdb) p _lock
$1 = {lock = -1}

Valgrind version is 1:3.10~20140411-0ubuntu1

OS is:
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.1 LTS
Release:        14.04
Codename:       trusty

-- 
You are receiving this mail because:
You are watching all bug changes.