List:       kde-core-devel
Subject:    Re: malloc performance
From:       Henrik Johnson <hpj () globecom ! se>
Date:       2002-02-12 13:23:33

My application TOra (not part of any KDE package, but it is a KDE 
application and already supports KDE3) uses threads extensively, so 
making KDE stop working with threads would basically force me to go back 
to Qt only. As long as this is optional (and distributions don't enable 
it), I have no problem with it.

/Mauritz
GlobeCom AB

Lubos Lunak wrote:

> Hello,
>
> see http://sources.redhat.com/ml/libc-alpha/2002-02/msg00107.html for 
>details. In short, we use malloc() very extensively, and a noticeable part 
>of the execution time is spent handling dynamic allocations. That means 
>malloc() has to be very, very fast, and if it isn't, overall KDE performance 
>suffers. The problem is that we link against -lpthread, which makes malloc() 
>take a mutex for locking (even though most KDE apps aren't actually threaded 
>at present), and that keeps malloc() from being that very, very fast.
>
> I tried some benchmarks. For example, I managed to reduce the time needed 
>to fully render $QTDIR/doc/html/functions.html from 60s to 39s (35%) by 
>LD_PRELOAD-ing a different malloc() implementation (Doug Lea's malloc), which 
>I also tweaked a bit. Real-world cases are a bit difficult to measure, but 
>the improvement should be at least 10% everywhere.
>
> This applies only to glibc < 2.3; I don't know about other systems. Also, 
>with the current glibc CVS (i.e. the yet-to-be-released glibc-2.3), malloc() 
>already uses a spinlock instead of a mutex, and it has almost the same 
>performance as my tuned malloc().
>
> I'm going to include this malloc() implementation in libkdecore, and I 
>already got an OK from Dirk, as long as it has to be explicitly enabled by a 
>configure switch. It was already discussed a bit on IRC too. In case you have 
>some thoughts on this, feel free to comment. I'll describe exactly what I 
>want to do.
>
> There will be a configure option for this, disabled by default (if nothing 
>else, there is not enough time to really test it). It will work only with 
>glibc, as I have no idea about the situation on non-glibc systems. It also 
>requires a spinlock implementation (i.e. some assembler), and right now I 
>have only an x86 one.
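
For reference, a minimal sketch of the kind of spinlock-guarded allocator
entry point described above. It is not the code from the proposal: it swaps
the hand-written x86 assembler for C11 atomic_flag, and the names spin_lock,
spin_trylock, spin_unlock, guarded_malloc, guarded_free and live_blocks are
all hypothetical. The libc malloc() merely stands in for the real allocator
work done under the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One global lock for the whole heap (hypothetical name). */
static atomic_flag heap_lock = ATOMIC_FLAG_INIT;
/* Bookkeeping protected by heap_lock. */
static size_t live_blocks = 0;

/* Try once; returns true if the lock was acquired. */
static bool spin_trylock(void)
{
    return !atomic_flag_test_and_set_explicit(&heap_lock,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is free; there is no sleeping and no syscall. */
static void spin_lock(void)
{
    while (!spin_trylock()) {
        /* spin */
    }
}

static void spin_unlock(void)
{
    atomic_flag_clear_explicit(&heap_lock, memory_order_release);
}

/* Stand-in for an allocator entry point: the libc malloc() call marks
 * the spot where the allocator's own work would happen under the lock. */
void *guarded_malloc(size_t size)
{
    spin_lock();
    void *p = malloc(size);
    if (p != NULL)
        ++live_blocks;
    spin_unlock();
    return p;
}

void guarded_free(void *p)
{
    if (p == NULL)
        return;
    spin_lock();
    assert(live_blocks > 0);
    --live_blocks;
    free(p);
    spin_unlock();
}
```

The uncontended path is a single atomic test-and-set plus a store, which is
the whole point of preferring it over a mutex here. Under heavy contention a
plain spin like this wastes CPU time, so it only suits very short critical
sections.
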
>
> However, I'd like to keep it even after glibc-2.3 is released (still only 
>optional). Even with malloc() from the current glibc CVS, I can get about 5% 
>improvement on the functions.html page with the tuned malloc(). Glibc 
>malloc() still has malloc hooks, and it is optimized for many threads (it's 
>ptmalloc, which is a threaded version of Doug Lea's malloc), which is 
>something we don't need. The only time I needed malloc hooks was for 
>kdesdk/kmtrace, which is LD_PRELOAD-ed anyway, so it can work around that. 
>As for code optimized for many threads, I'd first have to see a KDE 
>application where that's needed. Not to mention that I even tried the 
>malloc() implementations with several threads running, and the simple 
>spinlock-only variant didn't perform worse than the glibc one with 4 threads 
>doing nothing but calling malloc() and free() in loops (but I don't have 
>access to an SMP machine, so it might be different there).
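
A rough sketch of what such a test could look like, assuming POSIX threads.
This is not the benchmark from the quoted message: the names alloc_loop and
run_alloc_benchmark, the 64-byte block size and the iteration count are
assumptions made for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

#define NUM_THREADS 4
#define ITERATIONS  1000000L

/* Each thread does nothing but malloc()/free() pairs. */
static void *alloc_loop(void *arg)
{
    assert(arg != NULL);
    long done = 0;
    for (long i = 0; i < ITERATIONS; ++i) {
        /* volatile keeps the compiler from eliding the malloc/free pair */
        void *volatile p = malloc(64);
        if (p == NULL)
            break;
        free(p);
        ++done;
    }
    *(long *)arg = done;   /* report how many pairs completed */
    return NULL;
}

/* Runs the loop in NUM_THREADS threads at once. Returns elapsed
 * wall-clock seconds, or -1.0 if a thread could not be started. */
double run_alloc_benchmark(long completed[NUM_THREADS])
{
    pthread_t threads[NUM_THREADS];
    struct timespec start, end;
    int started = 0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (; started < NUM_THREADS; ++started) {
        completed[started] = 0;
        if (pthread_create(&threads[started], NULL, alloc_loop,
                           &completed[started]) != 0)
            break;
    }
    for (int i = 0; i < started; ++i)
        pthread_join(threads[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (started != NUM_THREADS)
        return -1.0;
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Build with -pthread and run it once against the stock allocator and once
with the alternative one LD_PRELOAD-ed, then compare the returned times. On
a single-CPU machine the threads mostly take turns, so the result says
little about lock contention on SMP.
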
>
> BTW, just to show how damn fast malloc() has to be: I also have a test 
>version of malloc() here that only allocates memory contiguously from a large 
>array, with free() as an empty function (practically unusable, but as close 
>to a no-op as possible). The functions.html example then needs 35s (vs. 39s 
>with the tuned malloc()). If I add 'for(int i=0;i<70;++i);' to both this 
>malloc() and free(), it becomes 39s (gcc doesn't optimise out empty loops). 
>I also tried to write my own malloc(), which did only a few bitfield 
>operations and a little pointer arithmetic: not fast enough, 10% slower than 
>glibc-2.3 malloc (even though it needs about 5-8% less memory, but I doubt 
>anyone is going to trade speed for that).
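
The "allocate from a large array, empty free()" test allocator is easy to
picture. A minimal sketch under stated assumptions: the names bump_malloc
and bump_free, the 1 MB arena and the 16-byte alignment are all made up
here, and this is not the actual test code. It also leaves out the locking
a threaded program would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes; the original test array size is not stated. */
#define ARENA_SIZE (1u << 20)
#define ALIGNMENT  16u

static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Hand out memory contiguously from the static array; never reuse it. */
void *bump_malloc(size_t size)
{
    /* Round up so every returned pointer stays 16-byte aligned. */
    size_t rounded = (size + (ALIGNMENT - 1)) & ~(size_t)(ALIGNMENT - 1);
    if (rounded < size || rounded > ARENA_SIZE - arena_used)
        return NULL;            /* size overflow or arena exhausted */
    void *p = arena + arena_used;
    arena_used += rounded;
    assert(((uintptr_t)p % ALIGNMENT) == 0);
    return p;
}

/* Deliberately empty: freed memory is simply lost. */
void bump_free(void *ptr)
{
    (void)ptr;
}
```

Since nothing is ever reused, this is only a lower bound on allocator cost
for timing purposes, which is how the quoted message uses it.
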
>
> Having the possibility to use a malloc() tuned for KDE's needs isn't, IMHO, 
>something that can break anything. I'm also going to make some improvements 
>to kdesdk/kmtrace, so it will hopefully be possible to find the places where 
>we do so many allocations (even though I doubt we can do much about that).
> Hmm ... any thoughts?
>

