List: taroon-list
Subject: Re: Taroon + numa question ?
From: "Paul Krizak" <paul.krizak () amd ! com>
Date: 2006-11-28 17:41:27
Message-ID: 456C74C7.2040401 () amd ! com
Unfortunately, RHEL3's NUMA implementation is extraordinarily broken.
We've been working with them for many months now trying to fix it, and
have finally given up and are waiting for RHEL5 to implement a correct
NUMA architecture for our CPUs.
As far as we can tell, here's what happens for various OS/NUMA combinations:
RHEL3, U<7 + NUMA:
* Processes that grow larger than one "NUMA node", i.e. the memory
attached to one node, will dip into swap instead of using memory from
other nodes.
* The kernel has no clue where the memory is mapped in relation to the
CPU cores, and so makes dumb decisions on where to put processes. Using
cpuset helps, but is impractical for a batch compute node.
RHEL3, U>=7 + NUMA:
* Processes that grow larger than one "NUMA node" will use memory from
other nodes, but at a SEVERE performance penalty (though not as great as
using swap).
* The kernel still has no clue where the memory is mapped to CPU cores.
RHEL4, U>=2 + NUMA:
* Memory allocation works fine. No significant performance penalty as a
process grows beyond one node.
* The kernel is still clueless about where to put processes.
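For quick reference, the matrix above can be restated as a small lookup. This is only a sketch in Python: the function name `numa_behavior` and the wording of the returned strings are made up for illustration, and it covers only the release/update ranges listed above. Anything else (for example RHEL4 before U2) returns None because it is not described here.

```python
from typing import Dict, Optional


def numa_behavior(release: int, update: int) -> Optional[Dict[str, str]]:
    """Restate the observations above for a RHEL release/update with NUMA on.

    'overflow' is what happens when a process outgrows one NUMA node;
    'placement' is whether the kernel knows which memory belongs to which
    CPU cores. Returns None for combinations the list does not cover.
    """
    if release == 3 and update < 7:
        return {"overflow": "swap", "placement": "unaware"}
    if release == 3 and update >= 7:
        return {"overflow": "remote memory, severe penalty",
                "placement": "unaware"}
    if release == 4 and update >= 2:
        return {"overflow": "remote memory, no significant penalty",
                "placement": "unaware"}
    return None  # not characterized in the list above
```

Calling `numa_behavior(3, 6)` gives the "dips into swap" case, and `numa_behavior(4, 4)` gives the case where allocation works but process placement is still poor.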
Given the poor luck we've had with NUMA, our compute nodes run with the
following configuration options, and perform (on average) better than
with NUMA enabled:
* ACPI SRAT table disabled
* Node Interleaving enabled
* Bank Interleaving enabled
* "numa=off" on the kernel command line
Various hardware and software vendors balk at the disabling of NUMA, but
our internal benchmarks don't lie -- NUMA is way slower (in general) in
RHEL3 and RHEL4 than non-NUMA in RHEL3 and RHEL4 on the Opteron platform.
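Of the settings listed above, only the "numa=off" kernel option is visible from a running system without looking at the BIOS. A minimal sketch of a check, assuming /proc/cmdline is readable and a reasonably modern Python is available (the stock interpreter on RHEL3 is much older, so treat this as illustrative); the helper names `numa_disabled` and `read_cmdline` are made up for this example:

```python
def numa_disabled(cmdline: str) -> bool:
    """Return True if 'numa=off' appears as its own option on the command line."""
    # Split on whitespace so that e.g. 'xnuma=off' or 'numa=offline' do not match.
    return "numa=off" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

On a compute node, `numa_disabled(read_cmdline())` should come back True if the boot loader entry carries the option. It says nothing about the SRAT or interleaving settings, which live in the BIOS.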
Paul Krizak 5900 E. Ben White Blvd. MS 625
Advanced Micro Devices Austin, TX 78741
Linux/Unix Systems Engineering Phone: (512) 602-8775
Silicon Design Division Cell: (512) 791-0686
Stanley, Jon wrote:
>> Hmm, somehow I think NUMA doesn't really come in to play with a single
>> socket whatevercore cpu's.
>>
>
> This is correct - a single physical socket is one NUMA node. However,
> in the event of a process needing 6GB of RAM, it will get the memory
> from another node prior to going to swap for it. There is a performance
> penalty with inter-node memory allocation, but it's not really something
> that can be avoided, I don't think.
>
> Again, I could be totally wrong.
>
> --
> Taroon-list mailing list
> Taroon-list@redhat.com
> https://www.redhat.com/mailman/listinfo/taroon-list
>
>