[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openembedded-core
Subject:    [OE-core] [PATCH 1/1] glibc: Allow 64 bit atomics for x86
From:       juro.bystricky () intel ! com (Bystricky, Juro)
Date:       2015-10-30 16:02:22
Message-ID: 6E51916E4A1F32428260031F4C7CD2B6108DAF13 () ORSMSX109 ! amr ! corp ! intel ! com
[Download RAW message or body]

So I put my insomnia to a good use and I think I can explain everything.
The offending code is in nptl/sem_wait.c:

#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_1)
int
attribute_compat_text_section
__old_sem_wait (sem_t *sem)
{
  int *futex = (int *) sem;
  int err;

  do
    {
      if (atomic_decrement_if_positive (futex) > 0)
	return 0;

      /* Enable asynchronous cancellation.  Required by the standard.  */
      int oldtype = __pthread_enable_asynccancel ();

      /* Always assume the semaphore is shared.  */
      err = lll_futex_wait (futex, 0, LLL_SHARED);

      /* Disable asynchronous cancellation.  */
      __pthread_disable_asynccancel (oldtype);
    }
  while (err == 0 || err == -EWOULDBLOCK);

  __set_errno (-err);
  return -1;
}

This is where we were looping forever, since the value in futex was -2.
So "atomic_decrement_if_positive" was not executed, never reaching "return 0".
The reason why the value was -2 is because struct_new_sem has changed from glibc 2.20 \
to 2.221:

/* Semaphore variable structure.  */
struct new_sem
{
#if __HAVE_64B_ATOMICS
  /* The data field holds both value (in the least-significant 32 bytes) and
     nwaiters.  */
# if __BYTE_ORDER == __LITTLE_ENDIAN
#  define SEM_VALUE_OFFSET 0
# elif __BYTE_ORDER == __BIG_ENDIAN
#  define SEM_VALUE_OFFSET 1
# else
# error Unsupported byte order.
# endif
# define SEM_NWAITERS_SHIFT 32
# define SEM_VALUE_MASK (~(unsigned int)0)
  uint64_t data;
  int private;
  int pad;
#else
# define SEM_VALUE_SHIFT 1
# define SEM_NWAITERS_MASK ((unsigned int)1)
  unsigned int value;
  unsigned int nwaiters;
  int private;
  int pad;
#endif
};

In particular for __HAVE_64B_ATOMICS == 0 the field "value" has a different meaning \
now: top 31 bits correspond to the value (shifted left by 1) and the bit 0 indicates \
if there are any nwaiters. (I guess the reasoning was semaphore value cannot be \
negative and so it only needs 31 bits and we can use the extra bit we saved for \
something else)  So when we initialize the semaphore value with SEM_VALUE_MAX   \
(2147483647) we actually store  0x7FFFFFFF << 1 which is 0xFFFFFFFE which is  (-2). 
(bitbake/pyhon/multiprocessing initialize some semaphores to SEM_VALUE_MAX)    
So the reason forcing  __HAVE_64B_ATOMICS to 1 works for x32 has not much to do with \
the  64bit atomics as such, but because the "value" filed is defined differently, as \
a 32bit value. Long story short,  the reason this is broken is that the "old code" \
uses new structures.  (to fix this   "atomic_decrement_if_positive"  needs to be \
replaced by a call that understands the new layout, similar to the "new code")

I will report  this to glibc developers and wait for their reply.

Juro




> -----Original Message-----
> From: Richard Purdie [mailto:richard.purdie at linuxfoundation.org]
> Sent: Friday, October 30, 2015 12:13 AM
> To: Randy Witt
> Cc: Bystricky, Juro; jurobystricky at hotmail.com; openembedded-
> core at lists.openembedded.org
> Subject: Re: [OE-core] [PATCH 1/1] glibc: Allow 64 bit atomics for x86
> 
> On Thu, 2015-10-29 at 15:29 -0700, Randy Witt wrote:
> > 
> > 
> > On Thu, Oct 29, 2015 at 3:20 PM, Richard Purdie
> > <richard.purdie at linuxfoundation.org> wrote:
> > On Thu, 2015-10-29 at 13:43 -0700, Juro Bystricky wrote:
> > > This patch fixes [YOCTO#8140].
> > > 
> > > The fix consist of allowing 64bit atomic ops for x86.
> > > This should be safe for i586 and newer CPUs.
> > > It also makes the synchronization more efficient.
> > > 
> > > Signed-off-by: Juro Bystricky <juro.bystricky at intel.com>
> > > ---
> > > .../glibc/glibc/use_64bit_atomics.patch            | 24
> > ++++++++++++++++++++++
> > > meta/recipes-core/glibc/glibc_2.22.bb              |  1 +
> > > 2 files changed, 25 insertions(+)
> > > create mode 100644
> > meta/recipes-core/glibc/glibc/use_64bit_atomics.patch
> > 
> > Since the patch is only changing nativesdk 32 bit x86 and we
> > know that
> > the 32 bit SDK is pretty broken at the moment I've merged this
> > on the
> > basis that it can't really make it any worse. There is
> > pressure to move
> > to the next rc candidate for 2.0.
> > 
> > I would like to get to the bottom of the real issue here and I
> > still
> > suspect 32 bit x86 images are likely broken too :/.
> > 
> > 
> > With Juro's change I built core-image-minimal using a 32-bit
> > buildtools on a 32-bit host(vm) and that appears to boot fine using
> > runqemu.
> 
> Right, but can we run bitbake in a qemux86 image (on target) or does it show
> the same issue as bug 8140? If it shows the same issue, futexes are likely
> bust on 32 bit x86 targets too.
> 
> Cheers,
> 
> Richard
> 
> 
> 
> 


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic