'Re: threads'


List:       kde-core-devel
Subject:    Re: threads
From:       Lubos Lunak <l.lunak () sh ! cvut ! cz>
Date:       1999-12-03 15:17:36

On Fri, 03 Dec 1999, Stephan Kulow wrote :
> Jo Dillon wrote:
[long discussion skipped]

 Reading the discussion, it seems to me that some people don't understand
exactly the difference between preemptive multithreading and Pth, and also
don't know what threads can provide and what price is paid for it. Also, after
70+ posts a summary would probably come in handy, and since I have spent some
time thinking ( and coding ) about KDE & multithreading, I dare to sum up and
also explain some things. However, I don't consider myself a threads guru, so
if you disagree or if I miss something, feel free to comment :
 
 First of all, using threads within KDE doesn't mean everyone will have to use
them. If they are used only in the few places where it's worth using them, only
a few programmers actually need to know how to write MT code.

Warning, LONG !!! Summary is at the bottom.

 Basically, we have 3 possibilities :

No threads at all
==============

 Some of you definitely think this is the best solution, and maybe you're
right. The disadvantages show up if you have an operation that is likely to
take some time or that may block. Generally, the only visible result is that
the GUI blocks for some time.
 There are solutions for this : External processes, QTimer, or processEvents().

  - External processes : kioslaves are an example of this. The disadvantage
here is complicated and slow communication and data exchange. The advantage is
that if the external process blocks it doesn't matter, because the main process
doesn't block.

   - QTimer : The task simply has to be split into smaller parts, and the slot
connected to QTimer's signal calls one of them each time. This slot has to know
which part to call each time, and you also need to save _every_ piece of status
information ( almost all variables - is it called state machine in English ? ).
The problem here is that this doesn't help with blocking system calls, and it
can also be very complicated to decide _how_ to split the task. Imagine, for
example, HTML parsing and its table in table in table - normally, you'd
probably have a parse_table() function or something like that, which would call
parse_table(), which would call parse_table(). But with QTimer, you simply
can't do this, because this function would have to return quickly.
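
 To make the "state machine" point concrete, here is a minimal sketch in plain
C++ ( no Qt, so the repeated QTimer timeouts are only simulated by a loop ).
The class IncrementalTableParser, the function parseInSlices() and the toy
input format ( '[' opens a table, ']' closes it ) are all made up for
illustration. The recursion of parse_table() is replaced by a depth counter
that has to survive between calls :

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical incremental parser: '[' opens a table, ']' closes one.
// Everything a recursive parse_table() would keep on the call stack
// (position, nesting depth) has to be stored as member state instead.
class IncrementalTableParser {
public:
    explicit IncrementalTableParser(std::string input)
        : input_(std::move(input)) {}

    // Process at most `budget` characters, then return so that the event
    // loop could run. Returns true once the whole input has been consumed.
    bool step(std::size_t budget) {
        assert(budget > 0);  // a zero budget would never make progress
        for (std::size_t i = 0; i < budget && pos_ < input_.size(); ++i) {
            const char c = input_[pos_++];
            if (c == '[') {
                ++depth_;
                if (depth_ > maxDepth_) maxDepth_ = depth_;
            } else if (c == ']' && depth_ > 0) {
                --depth_;
            }
        }
        return pos_ == input_.size();
    }

    int maxDepth() const { return maxDepth_; }

private:
    std::string input_;
    std::size_t pos_ = 0;  // saved state: where the previous slice stopped
    int depth_ = 0;        // saved state: replaces the recursion depth
    int maxDepth_ = 0;
};

struct SliceResult {
    int slices;    // how many times the "timer slot" had to run
    int maxDepth;  // deepest table nesting that was seen
};

// Stand-in for a slot that a timer would call repeatedly.
SliceResult parseInSlices(const std::string& html, std::size_t budget) {
    IncrementalTableParser parser(html);
    int slices = 0;
    do {
        ++slices;
    } while (!parser.step(budget));
    return SliceResult{slices, parser.maxDepth()};
}
```

 Even for this toy format the state has to be moved out of the call stack by
hand ; a real HTML parser would have far more of it.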

 - processEvents : This also doesn't help with blocking system calls, but it
seems to solve the problem above with parse_table(). The problem here is that
it can cause reentrancy problems. Imagine this : You call processEvents() in
your code and give control to the Qt main loop for a moment, but what if, from
this main loop, QTimer emits a signal and the connected slot also calls
processEvents() ? Doing several things "simultaneously" this way would be
really difficult.
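
 The reentrancy scenario can be sketched without Qt. MiniLoop below is a
hypothetical miniature event loop standing in for the Qt main loop, and
nestedDepthDemo() is a made-up name ; the sketch only shows that a slot which
itself calls processEvents() ends up running a second loop while the first
one is still on the stack :

```cpp
#include <deque>
#include <functional>
#include <utility>

// Hypothetical miniature event loop, standing in for Qt's main loop.
struct MiniLoop {
    std::deque<std::function<void()>> pending;  // queued "slots"
    int depth = 0;     // number of processEvents() calls currently active
    int maxDepth = 0;  // deepest nesting observed so far

    void post(std::function<void()> slot) {
        pending.push_back(std::move(slot));
    }

    void processEvents() {
        ++depth;
        if (depth > maxDepth) maxDepth = depth;
        while (!pending.empty()) {
            std::function<void()> slot = std::move(pending.front());
            pending.pop_front();
            slot();  // may call processEvents() again
        }
        --depth;
    }
};

// Returns how deeply processEvents() ended up nested.
int nestedDepthDemo() {
    MiniLoop loop;
    // A "timer slot" that yields to the loop in the middle of its own work.
    loop.post([&loop] {
        loop.post([] {});      // some other event arrives meanwhile
        loop.processEvents();  // re-enters the loop from inside a slot
    });
    // The long-running operation yields to keep the GUI responsive.
    loop.processEvents();
    return loop.maxDepth;
}
```

 Any data the outer operation was in the middle of changing can be seen, or
changed again, by whatever runs in the nested loop.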

Preemptive threads
================

Preemptive means that a context switch may occur at any time, and therefore if
one thread blocks, all the others can still run. If you are experienced, this
can be the best solution. However, there are still problems with this :
  - preemptive threads are not available on all platforms, and they need
thread-safe libraries ( including libc ). Most platforms don't even have a
thread-safe libc, and many libraries are not thread-safe at all ( including
kdelibs ). For now, this means that we either restrict KDE to those platforms
which have preemptive threads ( not many ), or that we can't use preemptive
threads.
   - also, neither Qt nor kdelibs are thread-safe, but this is not that bad a
problem
   - because the context switch can occur at any time, every operation on
data shared between several threads needs to be exclusively locked ( mutexes ).
If you forget this, or make a mistake, you'll get very strange behaviour, which
can be difficult to debug. That's why some people find threads too complicated
( I don't :) ).

 It would probably also be possible to expect preemptive threads, but not
really require them. This way everything would work fine with preemptive
threads, and would work less well with cooperative threads. However, what's the
difference between a blocking single-threaded and a blocking multi-threaded
app ? But this of course wouldn't be usable where blocking system calls would
block for too long ( i.e. this cannot be used to replace kioslaves ).
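
 A minimal sketch of the exclusive locking described above. It uses
std::thread and std::mutex from the C++ standard library, which are newer than
this discussion ; with plain pthreads the equivalent calls would be
pthread_mutex_lock() and pthread_mutex_unlock(). The function name
countWithMutex() is made up for illustration :

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Several threads increment one shared counter. Because a context switch
// can happen between reading and writing the counter, every access is
// wrapped in a mutex; without the lock_guard the result could come out
// lower than expected.
long countWithMutex(int threads, int incrementsPerThread) {
    long counter = 0;
    std::mutex counterMutex;
    std::vector<std::thread> pool;

    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, &counterMutex, incrementsPerThread] {
            for (int i = 0; i < incrementsPerThread; ++i) {
                std::lock_guard<std::mutex> lock(counterMutex);
                ++counter;  // protected: only one thread at a time gets here
            }
        });
    }
    for (std::thread& worker : pool) {
        worker.join();
    }
    return counter;
}
```

 Forgetting the lock in just one place is enough to get the "very strange
behaviour" mentioned above, and it may show up only occasionally.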

Cooperative threads ( GNU Pth, to be exact )
====================================

 GNU Pth ( www.gnu.org/software/pth ) is very portable, libraries don't have to
be thread safe, and those that don't know about Pth don't even need to be
reentrant ( simply said, it will work with any libraries, Qt will be thread
safe with Pth, kdelibs will be thread safe with Pth ). Also, because the
context switch points are exactly defined, debugging a threaded app using Pth
is only as difficult as debugging a single-threaded app ( you can even rely on
such things as running the app twice producing exactly the same results ).
 However, the price paid for this is quite high : Pth is cooperative ( which is
not necessarily a problem ), and system calls block _every_ thread. Without
this problem, Pth would be the best solution for now : Easy to use and
portable. But with blocking system calls, Pth is almost equivalent to QTimer or
processEvents() based solutions. It is 'almost' because apps written with Pth
would be easier to code, with no splitting into smaller parts. It's actually
very similar to processEvents() based solutions, but you can also do several
operations "simultaneously". Things like blocking in the Qt main loop can be
solved quite easily.
 Pth provides some ways to avoid the blocking system calls problem : It can map
some low-level system calls like select() or read() to its own functions, which
don't block. There are two types of mapping : Hard and soft.
  - Hard syscall mapping : select(), read() etc. are included in libpth, and
are called instead of the ones in libc. It works even with libraries built
without any knowledge of Pth. This way even high-level calls like fopen()
should work without blocking the whole process. However, it relies on the
syscall() call, which according to Pth's docs not all platforms have - the Pth
docs list AIX and SCO5 as those that don't have it. I personally think that
dropping AIX and SCO5 wouldn't be a problem, but there's one more problem with
hard mapping. Every library which calls one of these calls would have to be
reentrant ( i.e. no local static variables, etc. ), or you'd have to ensure
that such a library function won't be called several times at the same time -
which would be very complicated I'm afraid ( can you imagine being able to call
fopen() in only one thread at once ? ).
  - Soft syscall mapping : Simply said, this is just a bunch of macros like
#define select pth_select . Every library would have to be recompiled in order
not to block when calling read() or select(). So if you'd like an fopen() which
doesn't block the whole process, you'd have to recompile libc, and face the
reentrancy problems described above ( hard syscall mapping ). I'm not sure if
it would be a good idea to ship KDE with its own glibc2 .
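
 The soft mapping idea can be sketched like this. To keep the example
compilable without Pth installed, fake_pth_read() is a hypothetical stand-in
for a Pth function such as pth_select() ; libraryCode() and softMappingDemo()
are made-up names as well. The point is only that code compiled _after_ the
#define ends up in the replacement function without its source being changed :

```cpp
#include <cstddef>

// Counts how often the replacement function was entered.
static int g_wrappedCalls = 0;

// Hypothetical stand-in for a Pth replacement function. A real one would
// let other threads run instead of blocking the whole process.
long fake_pth_read(int fd, void* buf, std::size_t count) {
    (void)fd;
    (void)buf;
    ++g_wrappedCalls;
    return static_cast<long>(count);  // pretend everything was read
}

// The "soft mapping": from here on, the name read means fake_pth_read.
#define read fake_pth_read

// Library code written against the ordinary read() name. It only picks up
// the mapping because it is compiled after the #define above.
long libraryCode(int fd, char* buf, std::size_t count) {
    return read(fd, buf, count);
}

#undef read  // keep the macro from leaking into later code

// Returns how many calls were redirected to the replacement.
int softMappingDemo() {
    char buf[8];
    g_wrappedCalls = 0;
    libraryCode(0, buf, sizeof buf);
    return g_wrappedCalls;
}
```

 A library that was already compiled never sees the macro, which is why every
library would have to be rebuilt.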

Some other things to consider 
========================

 There are also some other problems with any threads, the worst of them being
thread cancellation + stack cleanup. Just calling pthread_cancel() doesn't
guarantee that your stack allocated objects will be destroyed ( i.e. that
their destructors will be called ). AFAIK only a few compilers destroy stack
allocated objects when cancelling a thread ( Sun's C++ compiler ? ). I noticed
a discussion on the egcs mailing list about providing this, but right now egcs
doesn't support it. Not using stack allocated objects with threads wouldn't be
IMHO worth it. There are few libraries which have support for stack allocated
objects. There are also
problems if a system call has to be cancelled ( i.e. the thread is blocked in a
system call and pthread_cancel() is called for it ),  especially when using Pth
+ hard system mapping some system calls could break things down or at least
leak resources. But I think I know how to destroy stack allocated objects even
for this case.
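
 For reference, the mechanism POSIX itself offers for cleanup on cancellation
is pthread_cleanup_push() / pthread_cleanup_pop(). The sketch below is not
necessarily the approach meant above ; it only shows a cleanup handler being
run when a thread is cancelled at an explicit cancellation point. The names
releaseResource(), worker() and cancelRunsCleanupHandler() are made up :

```cpp
#include <pthread.h>
#include <sched.h>
#include <atomic>

static std::atomic<bool> g_cleaned(false);  // set by the cleanup handler
static std::atomic<bool> g_started(false);  // set once the worker is running

// Cleanup handler: this is where a resource held by the thread would be
// released by hand.
static void releaseResource(void*) {
    g_cleaned = true;
}

static void* worker(void*) {
    pthread_cleanup_push(releaseResource, nullptr);
    g_started = true;
    for (;;) {
        pthread_testcancel();  // explicit cancellation point
        sched_yield();
    }
    pthread_cleanup_pop(0);  // never reached, but must pair with the push
    return nullptr;
}

// Starts the worker, cancels it, and reports whether the handler ran.
bool cancelRunsCleanupHandler() {
    g_cleaned = false;
    g_started = false;
    pthread_t tid;
    if (pthread_create(&tid, nullptr, worker, nullptr) != 0) {
        return false;
    }
    while (!g_started) {
        sched_yield();  // wait until the handler has been registered
    }
    pthread_cancel(tid);
    pthread_join(tid, nullptr);
    return g_cleaned;
}
```

 Whether C++ destructors of stack allocated objects also run during
cancellation depends on the compiler and thread library, which is exactly the
problem described above.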
 The second problem, not as important as the first one, is that if threads get
more widely used, a support library which does more than just map pthread
calls to its methods would be needed - things like messages, queues, barriers,
thread pools, Qt signal emitting between threads and all those things would
probably come in handy. But compared to the other problems described here, this
is almost no problem.

================
Summary
================

External processes - slow communication and data exchange, all else is fine.
This is how we solve blocking system calls now.

QTimer, processEvents() - using it can be sometimes very complicated, and it
doesn't help with blocking system calls. That's how some long lasting
operations are solved now.

Preemptive threads - not widely available, for many people too complicated

Pth with hard system mapping - we would have to drop some platforms like AIX or
SCO5 ; according to the Pth docs, it works at least with *BSD, Linux, HP/UX,
Solaris, IRIX, UnixWare. Using non-reentrant libraries ( which can include libc
) would be a bit complicated, and one would also have to keep in mind that many
libc calls could cause a context switch.

Pth with soft system mapping - we would have to ship some more libraries (
or at least parts of them ) with KDE at least for platforms - all things that
we'd want to change from blocking to non-blocking ( which can include libc ).
This would actually make the situation the same as with Pth + hard system
mapping, without having to drop some platforms.

Any other ideas ?

Comments ? Preemptive threads seem to be unacceptable for now, so IMHO the
question is : Is using Pth worth the trouble described above ?

 Lubos Lunak
 l.lunak@email.cz http://dforce.sh.cvut.cz/~seli

