
List:       ltp-list
Subject:    Re: [LTP] Failing kernel/mem/mtest07/mallocstress (was: Re: [PATCH]
From:       Daniel Gollub <dgollub@suse.de>
Date:       2008-10-25 10:51:55
Message-ID: 200810251251.55882.dgollub@suse.de

On Saturday 25 October 2008 12:27:24 Daniel Gollub wrote:
> On Saturday 25 October 2008 03:13:57 Jin Bing Guo wrote:
> > Hi Daniel,
> >
> > On 10/24 15:32PM, Daniel Gollub wrote:
> > > > On Friday 24 October 2008 15:15:54 Jiří Paleček wrote:
> > > > > On Fri, 24 Oct 2008 14:16:29 +0200, Daniel Gollub <dgollub@suse.de> wrote:
> > > > > I'm also looking into the mallocstress testcase right now.
> > > > > With more recent kernels (e.g. 2.6.27) I see mallocstress failing
> > > > > on x86_64. (I haven't tested other architectures yet.)
> > > > >
> > > > > On which kernel did you test mallocstress?
> > > > > 2.6.27? Or something different?
> > > >
> > > > 2.6.27-rc8, i386. However, I didn't notice until you asked that when
> > > > I tested the patch, the test actually succeeded, which is very weird.
> > > > Before, I got a message "malloc: Cannot allocate memory". I have a
> > > > theory that it is caused by the swapping of the semop() and malloc()
> > > > calls (see the patch). That means that before, a thread first waited
> > > > on the semaphore, and by the time it got released, other threads might
> > > > already have been stressing the memory, so there wasn't any free
> > > > memory left and the malloc() of the return variable would fail. If
> > > > that is really the case, the patch doesn't fix it, it only lowers the
> > > > probability of such behaviour. However, making a proper patch should
> > > > be easy in that case.
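
To illustrate the ordering described above, here is a minimal, self-contained
sketch (built with gcc -pthread). It is not the actual LTP code; the names
(semid, alloc_mem*, allocate_free) only mirror the backtrace quoted further
down, and the bodies are illustrative stubs.

/* Sketch of the semop()/malloc() ordering; illustrative only. */
#include <pthread.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

static int semid;               /* semaphore all worker threads block on */

/* Stand-in for the real stress loop (mallocstress.c:233 in the backtrace). */
static int allocate_free(int repeat, int scheme)
{
    for (int i = 0; i < repeat; i++) {
        void *p = malloc((size_t)(scheme + 1) * 4096);
        if (p == NULL)
            return 1;
        free(p);
    }
    return 0;
}

/* Original order: wait on the semaphore first, allocate the return value
 * afterwards.  By the time semop() returns, the other workers may already
 * be stressing memory, so this malloc() can fail with "Cannot allocate
 * memory". */
static void *alloc_mem_original(void *threadnum)
{
    struct sembuf wait_op = { 0, -1, 0 };
    int *status;

    semop(semid, &wait_op, 1);          /* barrier: wait to be released */
    status = malloc(sizeof(int));       /* may fail under memory pressure */
    if (status == NULL)
        return NULL;
    *status = allocate_free(100, (int)(long)threadnum % 4);
    return status;
}

/* Swapped order (the patch): allocate the return value while memory is
 * still free, then wait.  As noted above, this only lowers the probability
 * of the failure; the race itself remains. */
static void *alloc_mem_patched(void *threadnum)
{
    struct sembuf wait_op = { 0, -1, 0 };
    int *status = malloc(sizeof(int));  /* done before the barrier */

    if (status == NULL)
        return NULL;
    semop(semid, &wait_op, 1);
    *status = allocate_free(100, (int)(long)threadnum % 4);
    return status;
}

int main(void)
{
    pthread_t t[2];
    void *ret;
    struct sembuf release_op = { 0, 2, 0 };   /* wake both waiters */

    /* Linux initializes the new semaphore to 0, so both workers block. */
    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    pthread_create(&t[0], NULL, alloc_mem_original, (void *)0L);
    pthread_create(&t[1], NULL, alloc_mem_patched, (void *)1L);
    semop(semid, &release_op, 1);

    for (int i = 0; i < 2; i++) {
        pthread_join(t[i], &ret);
        free(ret);
    }
    semctl(semid, 0, IPC_RMID);
    return 0;
}
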
> > >
> > > On x86_64 I get a slightly different problem:
> > >
> > > x86_64:~/:[1]# ulimit -c unlimited
> > > x86_64:~/:[0]# ./mallocstress
> > > Aborted (core dumped)
> > > x86_64:~/:[134]# uname -i
> > > x86_64
> > > x86_64:~/:[0]# uname -r
> > > 2.6.27.1-2-default
> > > x86_64:~/:[0]# gdb mallocstress core.5217
> > > [[[[[ .. snipped the default amount of threads .... ]]]]]]
> > > [New Thread 5221]
> > > Core was generated by `./mallocstress'.
> > > Program terminated with signal 6, Aborted.
> > > #0  0x00007f641d5f4725 in *__GI_raise (sig=<value optimized out>)
> > >    from /lib64/libc.so.6
> > > (gdb) bt
> > > #0  0x00007f641d5f4725 in *__GI_raise (sig=<value optimized out>)
> > >    from /lib64/libc.so.6
> > > #1  0x00007f641d5f5d13 in *__GI_abort () from /lib64/libc.so.6
> > > #2  0x00007f641d6380b0 in malloc_printerr (action=2,
> > >     str=0x7f641d6e501b "free(): invalid pointer", ptr=0x1461)
> > >    from /lib64/libc.so.6
> > > #3  0x0000000000400e48 in allocate_free (repeat=100, scheme=0)
> > >     at mallocstress.c:233
> > > #4  0x0000000000400f4e in alloc_mem (threadnum=0x7fff25d57fb4)
> > >     at mallocstress.c:281
> > > #5  0x00007f641d925070 in start_thread (arg=<value optimized out>)
> > >    from /lib64/libpthread.so.0
> > > #6  0x00007f641d697a7d in clone () from /lib64/libc.so.6
> > > #7  0x0000000000000000 in ?? ()
> > > (gdb)
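
For reference, this is a different failure mode from the "Cannot allocate
memory" case above: glibc prints "free(): invalid pointer" from
malloc_printerr() when free() is handed an address (here 0x1461) that does
not correspond to a live allocation, i.e. the allocator rejected a bogus
pointer rather than running out of memory. A minimal, purely illustrative
reproducer of that abort signature - not the actual mallocstress.c:233 code:

/* Illustrative only: the smallest way to trigger glibc's
 * "free(): invalid pointer" abort seen in the backtrace above. */
#include <stdlib.h>

int main(void)
{
    char *p = malloc(64);

    free(p + 1);    /* misaligned / not the address malloc() returned */
    return 0;
}
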
> >
> > I also encountered this problem on 2.6.27.1-2-ppc64 (SLES11 Beta3).
> > # uname -a
> > Linux venuslp12 2.6.27.1-2-ppc64 #1 SMP 2008-10-16 20:35:15 +0200 ppc64
> > ppc64 ppc64 GNU/Linux
> > # ./mallocstress
> > Aborted (core dumped)
> > # uname -i
> > ppc64
> > # uname -r
> > 2.6.27.1-2-ppc64
> > # gdb mallocstress ./core.6218
> > [[[[[ .. snipped the default amount of threads .... ]]]]]]
> > [New Thread 6274]
> > [New Thread 6265]
> > [New Thread 6273]
> > [New Thread 6218]
> > [New Thread 6261]
> > [New Thread 6220]
> > [New Thread 6267]
> > [New Thread 6277]
> > Core was generated by `./mallocstress '.
> > Program terminated with signal 6, Aborted.
> > #0  0x00000400001c62a0 in .raise () from /lib64/libc.so.6
> > (gdb)
>
> Thanks for this information!
>
> I tried to bisect this, unfortunately on a different platform from the one
> where I originally found the problem - and realized that the issue doesn't
> appear at all with the same kernel on that platform...
>
> How much main memory do you have on your ppc64 testhost?
>
> Not quite sure, but the size of main memory was the first major difference I
> found between the systems I tested - affected host (x86_64): 8GB main
> memory; non-affected: 4GB main memory (two different systems: x86_64 and
> i386).

I just booted the 8GB box, where I see mallocstress failing, with mem=4G and
with mem=2G - it's still failing.

It's also failing now on the 4GB machine I had tested successfully before.

I'm completely on the wrong path...
I'll try another bisect round on those machines.

Maybe it's not only a kernel thing - maybe glibc is involved here as well.
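
If glibc is a suspect, a cheap first check is to compare the exact glibc on
the failing and non-failing hosts - e.g. by executing /lib64/libc.so.6
directly, or with a trivial program like this sketch using glibc's
gnu_get_libc_version():

/* Print the glibc version/release of the host this is built and run on. */
#include <stdio.h>
#include <gnu/libc-version.h>

int main(void)
{
    printf("glibc %s (%s)\n", gnu_get_libc_version(), gnu_get_libc_release());
    return 0;
}
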

best regards,
Daniel

