[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-hotspot-runtime-dev
Subject:    Re: Thread stack size issue related to glibc TLS bug
From:       Jiangli Zhou <jianglizhou () google ! com>
Date:       2019-05-30 15:23:12
Message-ID: CALrW1jxO1yKLgJjO0QiuA8rinmJXOrdRLK4cdVPGe3-FNjfH6w () mail ! gmail ! com
[Download RAW message or body]

Hi David,

This is a link to __pthread_get_minstack that I find in the public
domain: https://code.woboq.org/userspace/glibc/nptl/nptl-init.c.html.
It has copyright of 2002-2019, so it's probably the latest version.

size_t
__pthread_get_minstack (const pthread_attr_t *attr)
{
 return GLRO(dl_pagesize) + __static_tls_size + PTHREAD_STACK_MIN;
}

The PTHREAD_STACK_MIN appears to be 4-pages for the glibc version
(2.24) that I'm linking with. The dl_pagesize is 1-page.

We could go with the suggestion that you brought up earlier by only
adding the min_stack_size to the current thread stack size if it is
certain percent of the stack size. With the added dl_pagesize and
PTHREAD_STACK_MIN, 25% probably is reasonable? I've made changes
(including the percent suggestion) on top the existing patch and also
added a jtreg test based on the test case attached in JDK-8130425. On
JDK 13, the test could fail with different symptoms (segfault or hang)
depending on the TLS sizes without the fix.

BTW, I noticed that Florian had already made the comment change in
glibc for __pthread_get_minstack. Thanks!

Best regards,
Jiangli

On Wed, May 29, 2019 at 10:42 PM David Holmes <david.holmes@oracle.com> wrote:
> 
> Hi Florian,
> 
> On 25/05/2019 6:50 am, Florian Weimer wrote:
> > * Jiangli Zhou:
> > 
> > > Hi Florian,
> > > 
> > > On Fri, May 24, 2019 at 2:46 AM Florian Weimer <fweimer@redhat.com> wrote:
> > > > 
> > > > * Jiangli Zhou:
> > > > 
> > > > > [3] change: http://cr.openjdk.java.net/~jiangli/tls_size/webrev/
> > > > > (contributed by Jeremy Manson)
> > > > 
> > > > _dl_get_tls_static_info is an internal symbol (it carries a
> > > > GLIBC_PRIVATE symbol version).  Its implementation can change at any
> > > > time.  Please do not do this.
> > > 
> > > Point taken. Thanks.
> > > 
> > > It appears that asan is already carrying the same type of fix:
> > > 
> > > https://github.com/gcc-mirror/gcc/blob/master/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cc
> > >  
> > > As the issue has not been able to be addressed in the glibc layer, all
> > > the others have to workaround it (and possibly by using the glibc
> > > private APIs, _dl_get_tls_static_info or __pthread_get_minstack). That
> > > would cause more dependencies on the private APIs.
> > 
> > _dl_get_tls_static_info changed ABI on i386 in glibc 2.27 already, so
> > you really don't want to use that (in case someone backported that
> > cleanup to an earlier version, although that might be bit unlikely).
> > 
> > The sanitizers are special and have a much shorter shelf life than the
> > OpenJDK binaries.
> > 
> > > One alternative (besides fixing glibc) may be making
> > > _dl_get_tls_static_info public. Would that be possible?
> > 
> > For __pthread_get_minstack, I can add a comment to the glibc sources
> > that if the ABI changes (or if TLS allocations are no longer considered
> > part of the stack), we should rename the function.  Then, as long as you
> > use a weak reference or dlsym (the latter is preferable for the sack of
> > RPM-based distributions which require special steps to reference
> > GLIBC_PRIVATE symbols directly), old binaries would keep working if we
> > make further changes.
> > 
> > Since _dl_get_tls_static_info already changed ABI once, I really can't
> > recommend its use.  Especially if you can work with
> > __pthread_get_minstack instead.
> 
> Can you explain for me what value this __pthread_get_minstack is defined
> to return? Will it accommodate any required TLS plus some other glibc
> specific overhead? How much memory are we talking about?
> 
> > The value of PTHREAD_STACK_MIN is rather problematic on x86-64 (for the
> > reasons I explained earlier), but it's not likely we are going to change
> > the value of the constant any time soon.  It's more likely that we are
> > going to work around the AVX-512 register file issue by providing
> > exactly four usable stack pages with PTHREAD_STACK_MIN, and not less
> > than two as we did until recently.  So you can assume that it's indeed
> > possible to use PTHREAD_STACK_MIN and sysconf (_SC_PAGESIZE) to compute
> > the static TLS area size.
> 
> Sorry can you elaborate on that calculation please?
> 
> Thanks,
> David
> -----
> 
> > Thanks,
> > Florian
> > 


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic