List: kde-core-devel
Subject: malloc performance
From: Lubos Lunak <l.lunak () sh ! cvut ! cz>
Date: 2002-02-12 13:13:34
Hello,
see http://sources.redhat.com/ml/libc-alpha/2002-02/msg00107.html for
details. In short, we use malloc() very extensively, and a noticeable part
of the execution time is spent handling dynamic allocations. That means
malloc() needs to be very fast, and if it isn't, overall KDE performance
suffers. The problem is that we link against -lpthread, which makes malloc()
use a mutex for locking (even though most KDE apps aren't actually threaded
at present), and that slows malloc() down noticeably.
I ran some benchmarks and, for example, managed to reduce the time needed to
fully render $QTDIR/doc/html/functions.html from 60s to 39s (about 35%) by
LD_PRELOAD-ing a different malloc() implementation (Doug Lea's malloc), which
I also tweaked a bit. Real-world cases are a bit difficult to measure, but
I expect the improvement to be at least 10% everywhere.
This applies only to glibc < 2.3; I don't know about other systems. Also,
with the current glibc CVS (i.e. the yet-to-be-released glibc-2.3), malloc()
already uses a spinlock instead of a mutex, and it has almost the same
performance as my tuned malloc().
I'm going to include this malloc() implementation in libkdecore, and I
already got an ok from Dirk, as long as it has to be explicitly enabled by a
configure switch. It was already discussed a bit on IRC too. If you have any
thoughts on this, feel free to comment. I'll describe exactly what I want to
do.
There will be a configure option for this, disabled by default (there's not
enough time to really test it, if nothing else). It will work only with
glibc, as I have no idea about the situation on non-glibc systems. It also
requires a spinlock implementation (i.e. some assembler), and right now I
only have an x86 one.
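
For illustration only, here is a minimal sketch of what such a spinlock could look like. It assumes C11 atomics instead of the hand-written x86 assembler mentioned above, and all names (demo_spinlock, demo_spin_lock, ...) are hypothetical rather than taken from the actual patch:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical test-and-set spinlock built on C11 atomic_flag. */
typedef struct {
    atomic_flag flag;
} demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns 1 if the lock was acquired, 0 if it was already held. */
static inline int demo_spin_trylock(demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire);
}

/* Busy-wait until the flag could be set by this caller. */
static inline void demo_spin_lock(demo_spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire))
        ; /* spin */
}

/* Release the lock so another caller's test-and-set can succeed. */
static inline void demo_spin_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

/* Single-threaded sanity check: a held lock must refuse a second trylock. */
static inline int demo_spin_selfcheck(void)
{
    demo_spinlock l = DEMO_SPINLOCK_INIT;
    demo_spin_lock(&l);
    int refused = !demo_spin_trylock(&l);
    assert(refused);
    demo_spin_unlock(&l);
    return refused;
}
```

An allocator would take such a lock around its free-list manipulation. In the uncontended case that is a single atomic instruction on entry and a store on exit, which is presumably why it is cheaper than a general-purpose mutex.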
However, I'd like to keep it even after glibc-2.3 is released (still only
optional). Even with malloc() from the current glibc CVS, I can get about 5%
improvement on the functions.html page with the tuned malloc(). Glibc
malloc() still has malloc hooks and is optimized for many threads (it's
ptmalloc, which is a threaded version of Doug Lea's malloc), neither of which
we need. The only time I needed malloc hooks was for kdesdk/kmtrace, which is
LD_PRELOAD-ed anyway, so it can work around that. As for code optimized for
many threads, I'd first have to see a KDE application where that's needed.
Not to mention that I tried the malloc() implementations with several threads
running, and the simple spinlock-only variant didn't perform worse than the
glibc one with 4 threads doing nothing but calling malloc() and free() in
loops (but I don't have access to an SMP machine, so it might be different
there).
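
A test of the kind described above (several threads doing nothing but malloc() and free() in loops) could be sketched roughly as follows. This is not the original benchmark. The block size of 32 bytes, the thread limit, and all demo_* names are assumptions chosen for the example, and timing is left to the caller:

```c
#include <pthread.h>
#include <stdlib.h>

#define DEMO_MAX_THREADS 16

/* Per-thread work description and result. */
typedef struct {
    long iterations; /* how many malloc()/free() pairs to perform */
    long completed;  /* how many pairs actually succeeded */
} demo_worker_arg;

/* Thread body: allocate, touch, and free a small block in a loop. */
static void *demo_worker(void *p)
{
    demo_worker_arg *arg = p;
    for (long i = 0; i < arg->iterations; ++i) {
        /* volatile keeps the compiler from eliding the allocation */
        volatile char *block = malloc(32);
        if (block == NULL)
            break;
        block[0] = 1;
        free((void *)block);
        arg->completed++;
    }
    return NULL;
}

/* Run nthreads workers and return the total number of completed
 * malloc()/free() pairs, or -1 if nthreads is out of range. */
static long demo_run_threads(int nthreads, long iterations)
{
    pthread_t tid[DEMO_MAX_THREADS];
    demo_worker_arg args[DEMO_MAX_THREADS];
    int started = 0;
    long total = 0;

    if (nthreads < 1 || nthreads > DEMO_MAX_THREADS)
        return -1;
    for (int i = 0; i < nthreads; ++i) {
        args[i].iterations = iterations;
        args[i].completed = 0;
        if (pthread_create(&tid[i], NULL, demo_worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; ++i) {
        pthread_join(tid[i], NULL);
        total += args[i].completed;
    }
    return total;
}
```

Calling demo_run_threads(4, N) from a small main() and running the binary under time(1), once with the system malloc() and once with another implementation LD_PRELOAD-ed, would give a comparison of the kind quoted above.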
BTW, just to show how damn fast malloc() has to be: I also have a test
version of malloc() here, which only allocates memory continuously from a
large array and whose free() is an empty function (practically unusable, but
as close to a no-op as possible). The functions.html example then needs 35s
(vs 39s with the tuned malloc()). If I add 'for(int i=0;i<70;++i);' to both
this malloc() and free(), it becomes 39s (gcc doesn't optimise out empty
loops). I also tried to write my own malloc(), which did only a few bitfield
operations and a little pointer arithmetic - not fast enough, 10% slower than
glibc-2.3 malloc (even though it needs about 5-8% less memory, but I doubt
anyone would give up speed for that).
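
The "as close to no-op as possible" allocator described above can be sketched like this. It is an illustration, not the actual test code: the 1 MB arena, the 16-byte alignment, and the demo_* names are assumptions, and the functions are deliberately not named malloc()/free(), so the sketch does not replace the system allocator:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ARENA_SIZE ((size_t)1 << 20) /* assumed arena size: 1 MB */
#define DEMO_ALIGN      ((size_t)16)      /* assumed alignment */

static _Alignas(16) unsigned char demo_arena[DEMO_ARENA_SIZE];
static size_t demo_arena_used = 0;

/* Hand out the next chunk of the array; memory is never reused. */
static void *demo_bump_malloc(size_t size)
{
    size_t rounded;
    void *p;

    if (size == 0)
        size = 1; /* give zero-sized requests a distinct address */
    if (size > DEMO_ARENA_SIZE)
        return NULL;
    /* round up to a multiple of DEMO_ALIGN (a power of two) */
    rounded = (size + (DEMO_ALIGN - 1)) & ~(DEMO_ALIGN - 1);
    if (rounded > DEMO_ARENA_SIZE - demo_arena_used)
        return NULL; /* arena exhausted */
    p = demo_arena + demo_arena_used;
    demo_arena_used += rounded;
    assert(((uintptr_t)p % DEMO_ALIGN) == 0);
    return p;
}

/* free() does nothing at all, as described above. */
static void demo_bump_free(void *p)
{
    (void)p;
}
```

The sketch is not thread-safe and never gives memory back, which is what makes it unusable in practice. It only serves as a lower bound on how cheap an allocation can get.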
Having the option to use a malloc() tuned for KDE's needs IMHO can't break
anything. I'm also going to make some improvements to kdesdk/kmtrace, so it
will hopefully be possible to find the places where we do so many
allocations (even though I doubt we can do much about that).
Hmm ... any thoughts?
--
Lubos Lunak
llunak@suse.cz ; l.lunak@kde.org
http://dforce.sh.cvut.cz/~seli