List:       linux-smp
Subject:    Re: Different schedulers available for Linux: Comparison
From:       "Robert G. Brown" <rgb () phy ! duke ! edu>
Date:       1997-09-18 15:00:55

On Thu, 18 Sep 1997, Mr M S Aitchison wrote:

> [I-don't-exactly-know-what-I'm-talking-about-but-it-sounds-good-to-me mode on]
> 
> By any chance would it be possible in Linux to have the scheduler some
> sort of module that could be exchanged _in a running system_?  I think
> Solaris can change the scheduler like this.

<Flame On>

Solaris cannot (afaik) change its scheduler; it changes a "dispatch
table" that determines the tuning and behavior of its scheduler.  The
Solaris scheduler (tuned or not) sucks incredibly, too.  Solaris
wasn't nicknamed "Slowaris" among the cognoscenti well through 2.4 for
nothing -- you could take a Sparc 5 Solaris 2.3 box with 64 MB of
memory, run a single, maximally "niced" (Solaris uses priocntl, not
nice, but same difference) background job on it, and reduce
interactive performance on the console to a totally unacceptable
crawl.  A SPARCstation 4 with half the memory, running SunOS 4.1.3,
could run three or four similar backgrounded processes and a console
user couldn't even tell they were there.  I know this to be true from
bitter experience, believe me.  Even now I think that Solaris has a
remarkably "poor" scheduler from the point of view of balancing
multitasking/multiuser/console performance on a workstation.  A real
shame -- Sun used up most of its good karma accumulated with SunOS
(which was a joy to use and manage) long before Solaris was even
moderately comparable in basic quality.

<Flame Off>
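
For anyone who wants to repeat the "one niced background job" experiment
on a Linux box, here is a minimal sketch in Python.  The names run_niced
and busy_loop are made up for illustration, and the increment of 19
assumes the usual Linux niceness range; it uses nice, not Solaris
priocntl.  Start it, then judge for yourself how the console feels while
the child is running.

```python
import os


def busy_loop(n=200_000):
    """CPU-bound stand-in for a numerical background job (hypothetical)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_niced(job, increment=19):
    """Run job() in a forked child at reduced priority.

    Returns the child's exit code: 0 if job() finished, 1 if it raised.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: raise our niceness (i.e. lower our priority),
        # then do the work.  Raising niceness needs no special privilege.
        code = 1
        try:
            os.nice(increment)
            job()
            code = 0
        finally:
            # Leave the child immediately, skipping the parent's cleanup.
            os._exit(code)
    # Parent process: wait for the background job to finish.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print("background job exit code:", run_niced(busy_loop))
```

Make the loop bound much larger (or start three or four children at
once) if you want the job to run long enough to notice at the console.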

On the good side, for the most part the Linux scheduler is damn good,
probably easily the match for the SunOS scheduler, which was perceptibly
the best in its day.  For MOST users, there is probably very little to
be gained by attempting to parametrically tune it or otherwise
intervene at a very low level -- schedulers are too important and easy
to break to leave to non-kernel-hacking-supergenius types.  I'd much
rather leave messing with it to Linus and Alan Cox and all the folks
that have a very definite clue about what they are doing and what will
break when they tweak it, although (the way this list works) there are
probably LOTS of folks out there who have such a clue and also LOTS of
folks who can and do have fun playing and learning even where they are
clueless.

To them I say: God Bless -- play and have a good time, and contribute
any real breakthroughs back to the Holy Keepers of the Purity of the
Kernel, but don't try to institutionalize a roll-your-own scheduler
hook in the kernel.  The cost in overhead of doing so would probably
eat most of the benefit -- a scheduler lives at the very heart of the
kernel and it is hard to see how to modularize it in a way that
doesn't preclude critical optimizations.  I can see little active harm
in putting tuning parameters in that a sufficiently knowledgeable user
can tweak, but then, those parameters are already there in the
scheduler code, and if you don't know where to look and what they do
as it is, you probably have no business tweaking them.

> I suspect that different scheduler tactics will be needed under
> different conditions (by the looks of it, having SMP is in itself a
> significant condition to make some schedulers not a great idea).
> Therefore, something that can be changed based on operating conditions,
> by a person or by the system, sounds a fun thing to do, especially
> since I suspect many SMP systems are bought to handle "seriously big"
> jobs where a bit of difference to the scheduler might make a big
> difference to performance.
> 
> I recall a Data General system that had a "tuning" option that could be
> turned on for a while (slight performance hit while it was on), then the
> system could be re-tuned based on the gathered statistics.  Of course,
> this could go further than just fiddling with scheduler parameters, but
> if the system could introspectively look at significant parameters
> every so often and slip in a different scheduler algorithm or tune
> parameters (perhaps based on accurate simulations, or simple
> heuristics) people could then distribute not just schedulers but
> strategies for scheduler self-optimization based on their experience.

I don't disagree with this -- if you are building a distributed
supercomputer, for example, you might well benefit from somehow
altering the way networking processes are scheduled or processed (e.g.
how pre-emptive, blocking or non-blocking I/O) in the kernel.
However, doing so might make the system non-optimal for a "standard
workstation" and almost certainly will destabilize the kernel and make
it break in new and exotic ways.  This might be acceptable in a
reduced function array of systems like a stack of networked
supercomputing elements with very few running processes but the basic
kernel and networking support and a couple of numerical jobs and no
interactive "users" at all, but might be totally unacceptable in a
user/production environment such as where I'm working from right now.

My only remark is that you will have to attempt the kernel hacking
associated with such a use-specific optimizing re-engineering
yourself.  The Linux kernel already "supports" such tinkering -- that
is part of the point of having the complete source available.  You
even have Linus et al. (probably) listening in and ready to help out
and/or incorporate anything you come up with that is a real
improvement into the distribution series.

I actually pursued the question of scheduler "improvements" for a very
brief while on this list eight or so months ago, and what I learned
was enough to convince me that the scheduler is already far better
than what I'd come up with on my own EXCEPT with respect to the
graining and distribution of interrupts on SMP machines, which is
still an active area of development.  Well, even here it's probably
better than what I could do, but at least here I might eventually be
able to make some small contribution.  I do think that coming up with
a distributed supercomputer-specific kernel variant would be a Good
Thing, though, and hope that somebody tackles it...

   rgb

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb@phy.duke.edu