List:       tru64-unix-managers
Subject:    SUMMARY: paging/swapping improvements
From:       grabow@imk.fzk.de (Udo Grabowski)
Date:       1999-09-30 12:13:55

Hello Managers !

The problem with swap and paging (original post below) is entirely
solved; warm thanks to Jeremy Hibberd and Bryan Lavelle! Thanks also
to Donn Aiken and Frank Wortner, who suggested setting up an mfs file
system and simply buying more hardware, which would, of course, be the
most effective (though most costly) improvement...

The solution is to install patch kit #4, which includes a couple of
improvements to the scheduler, kernel, malloc, and sysconfig base, and
to rebuild the kernel. The tuning I did was in the right direction;
additionally, enabling vm-aggressive also helps. The memory-demanding
application now runs fast while constantly paging, vmstat 1 shows that
page-ins and page-outs are now balanced (compare with the original
posting), and no freeze of the system occurs even when approaching
vm-swap-free-reserved. Physical memory is also much better utilized
because the process no longer gets swapped out entirely. Sorry that I
forgot to mention that we are using lazy swap mode, so the process is
indeed not at the limit of the available space.
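
For anyone repeating this on a similar system, a rough sketch of the
steps with the standard tools (the spelling of the aggressive VM
attribute is the one used above and should be verified with
'sysconfig -q vm' once the kit is installed):

  # Install Patch Kit #4 (dupatch is menu-driven; run it from the
  # directory where the kit was unpacked), then rebuild the kernel
  # and reboot.
  ./dupatch
  doconfig -c MYSYSTEM    # MYSYSTEM = placeholder for your kernel
                          # configuration name (usually the hostname
                          # in capitals)
  shutdown -r now

  # Enable the aggressive VM behaviour at run time; the attribute
  # name here is an assumption -- check 'sysconfig -q vm' for the
  # exact spelling on your system.
  sysconfig -r vm vm-aggressive=1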
---------------------------------------------------------------
load averages:  0.10,  0.14,  0.12 
58 processes:  1 running, 1 waiting, 10 sleeping, 46 idle
CPU states:  9.2% user,  0.0% nice,  4.4% system, 86.3% idle
Memory: Real: 230M/620M act/tot  Virtual: 1303M/2364M use/tot  Free: 16M

  PID USERNAME PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
 1406 grabow    42    0 1756M  512M WAIT    4:01 11.70% <Kopra>
 1446 root      44    0  880K  344K sleep   0:06  0.90% <vmstat>
 1447 grabow    44    0 2712K  352K run     0:02  0.00% <top>
   21 root      44    0 1680K   49K sleep   0:02  0.00% <update>
  374 root      44    0 1848K   40K sleep   0:00  0.00% <snmpd>

Virtual Memory Statistics: (pagesize = 8192)
  procs    memory         pages                          intr        cpu      
  r  w  u  act  free wire fault cow zero react pin pout  in  sy  cs  us  sy  id
  2 70 25   72K 1746 4934 4434    0 4422  736   11   89 112  19 605   0   1  98
  2 70 25   72K 2053 4934   85    0   59    1   22   73  98  23 567   0   1  99
  2 71 24   72K 1978 4934  110    0   59    1   51   38 144  44 775   5   3  92
  2 71 24   72K 1887 4936  248    0  147    1   97   62 145 465 720  45   7  48
  2 71 24   72K 2016 4937  697    0  679    2   18   86 147  23 767  10   2  88
  2 71 24   72K 1991 4937  177    0   59    4  118   46 139  47 744  32   3  64
  2 71 24   72K 2024 4938  314    0  271    0   43   49 180  30 917  16   3  82
  3 71 23   73K 1051 4944  236    0   59    0  177   35 145 363 873  41  23  36
  2 71 24   72K 1802 4959 4699    0 4671  931   28   69 103 417  1K  20  18  62
  2 71 24   72K 2028 4959  199    0  190  125    9   86 123  56 683  33   2  65
  4 70 23   73K  767 4968  243    0  125    5  118    9 235  88  4K  50  28  22
  2 71 24   73K 1102 4977 3490    0 3469  354   21  231  97 416  1K   4  61  34
  2 71 24   73K 1134 4977  203    0  190   12   13   80  99  15 568   0   2  98
  2 71 24   73K 1115 4978  120    0  106    0   13   72 100  43 579  15   2  83
  2 71 24   73K 1152 4978   71    0   59    8   13   83 105  15 586   0   1  98
  2 71 24   73K 1188 4978   72    0   59    2   13   80 101  28 580   0   2  98
  2 69 26   73K 1212 4978   70    0   59    0   11   78  95 422 566   1   3  96
  2 71 24   73K 1249 4978  211    0  191    0   20   76 113  34 613   0   2  98
  2 71 24   73K 1301 4978   75    0   59    2   16   91 115  19 610   4   1  95
  2 71 24   73K 1357 4984 2427    0 2419   62    8  114 143  51  1K  26  22  52
  2 71 24   72K 1686 4984   73    0   59    0   14  105 125 295 635   5   2  93
-------------------------------------------------------------------
The answers in detail:
-------------------------
Well, the similar kernel changes we made have not fixed our problem
(you might remember my original posting about this). We have come
across a patch that might be relevant (vm_perform_v40dbl11), which
fixes a virtual memory problem in DU 4.0D. It is apparently included
in PK #4 (Compaq tech support says it is patch 640). We have a
TruCluster (v1.5) system comprising two AS8400 5/625 systems running
Tru64 V4.0D, each with 4 GB memory and 12 GB swap. They are running
SAP R/3 v4.0B with an Oracle database (v8.0.4). The current patch kit
is #3.

  The patch for the fix:

BLITZ TITLE: DIGITAL UNIX V4.0D/E VM PERFORMANCE PATCH - addresses the
manner in which the operating system manages VM resources (e.g., page
swapping) for systems operating near the lower limit of available
virtual memory.

  We will be applying patch kit 4 shortly and I will keep you posted.

Jeremy Hibberd
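
A quick way to check which patch subsets are already installed on a
system (a minimal sketch; Digital UNIX patch kits install subsets
whose names begin with OSFPAT):

  # Show the OS version, then list the installed patch subsets
  sizer -v
  setld -i | grep OSFPAT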
-------------------------------------------------------------------
There are some virtual memory performance patches that I would suggest
you get from Compaq and install.  They force the system to start
reclaiming memory pages earlier and more aggressively.  They are
available for 4.0D patch kit 3, not patch kit 2 (also available for
4.0E with no patch kit).  There used to be one for PK2, but I don't
know whether it's still available.  If you have a software contract
with Compaq, log a call and tell them you need the VM performance
patch for 4.0D and your patch level.  If you don't have a contract,
you can pay for a service call on a per-call basis.

Bryan
-------------------------------------------------------------------
My apologies if this sounds like a trite answer.  It is not meant to be.

Were I in your position,  I would seriously consider getting more memory.  I
would also strongly consider increasing the amount of available paging
space.  I think what you are seeing is a tremendous and sudden demand for
large amounts of memory scattered throughout the program.  Given that
your program requires a large amount of virtual memory,  its size of
1.7 GB is uncomfortably close (in my opinion) to your paging space
size.  If the paging space is fragmented,  the system will have a
difficult time servicing requests for large amounts of additional
paging space.  Perhaps that is what you are seeing here.

Sorry if this isn't the solution you are looking for,  but I just think you
are trying to stuff too big a program into too little VM space.  I wouldn't
mind being proved wrong,  though.  :-)

Frank
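
Checking how much paging space is configured and in use, and adding
another partition as swap, can be done with the standard tools; a
minimal sketch (/dev/rz3b is only a placeholder for whatever partition
is actually free on your system):

  # Show the configured swap devices and their current utilisation
  swapon -s

  # Add a free partition as additional paging space right away
  swapon /dev/rz3b

  # To make the addition permanent, add a corresponding 'sw' entry
  # for the partition to /etc/fstab so that it is activated again at
  # boot time.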
-------------------------------------------------------------------
I'm really lousy at this stuff.  Do you have any money to spend to
upgrade the hardware?  If so, I would go with the obvious: more RAM,
faster disks.  Would you be able to set up an mfs (Memory File System)
to change how your process allocates memory?  I have never done this,
so I'm not sure it would help.  That probably depends on how your
program is structured.

Donn Aiken
Regents College
====================================================================
My original post:
====================================================================
We have an application with a high memory demand (~1.7 GB).
Our DEC 500au running 4.0D (Patch Kit 2 applied) is equipped with
640 MB RAM and 2.2 GB swap space, with user/proc limits of 2.2 GB.
Because the system freezes (as also reported here on the list a few
days ago) when only vm-page-free-reserved pages are left, I have
modified the vm-section parameters with dxkerneltuner:

vm-page-free-target     2048   (approx. 16 MB)
vm-page-free-swap       1664   (approx. 13 MB)
vm-page-free-optimal    1536   (approx. 12 MB)
vm-page-free-min        1024   (approx.  8 MB)
vm-page-free-reserved    768   (approx.  6 MB)
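
The same values can also be made permanent without the GUI by putting
a vm stanza into /etc/sysconfigtab (a minimal sketch with the values
from the table above; check the exact attribute spellings against
'sysconfig -q vm' on your own system):

  vm:
          vm-page-free-target = 2048
          vm-page-free-swap = 1664
          vm-page-free-optimal = 1536
          vm-page-free-min = 1024
          vm-page-free-reserved = 768

  # Alternatively, merge a stanza file (here the placeholder name
  # vm_tuning.stanza, containing the stanza above) instead of editing
  # /etc/sysconfigtab by hand:
  sysconfigdb -a -f vm_tuning.stanza vm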

(some ubc tuning was also done, as recommended here on the list and in
the docs). What we observe now is that paging starts, as requested, at
free-target, but then the process very quickly demands the rest of the
pages, so we still get down to the reserved limit -> FREEZE.
The second effect is that once hard swapping starts, the most demanding
process is swapped out -- exactly the one we would rather keep running
on the basis of paging :-< ...

So I tried to push vm-page-free-target up to several thousand pages
to start paging long before we are at the limit. But then sys_check
complains that our limit is 2048 pages. I could not find the parameter
that raises this limit in the docs. Is it vm_max_wired? Will it help
to increase this value? What else do we have to push up to keep our
process running while paging, not swapping?
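
The current value and the built-in minimum/maximum of each vm
attribute can be inspected with sysconfig, which is probably the
quickest way to find out which attribute carries that 2048-page limit
(a minimal sketch):

  # Current values of all vm subsystem attributes
  sysconfig -q vm

  # Data type, allowed operations, and min/max limits of one attribute
  sysconfig -Q vm vm-page-free-target

  # Change an attribute at run time (only if it is marked as run-time
  # configurable in the -Q output)
  sysconfig -r vm vm-page-free-target=4096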

Here are some stats from top and vmstat 1:
------------------------------------------------
load averages:  0.26,  0.06,  0.03                                     16:10:09
57 processes:  1 running, 1 waiting, 16 sleeping, 39 idle
CPU states:  0.0% user,  0.0% nice,  0.0% system, 99.9% idle
Memory: Real: 228M/621M act/tot  Virtual: 995M/2364M use/tot  Free: 12M

  PID USERNAME PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
21751 grabow    42    0 1738M   14M WAIT    3:55  7.60% <Kopra>   shortly before freezing
22965 root      44    0  864K  335K sleep   0:10  0.90% <vmstat>
  548 root      42    0 2648K  212K sleep   0:01  0.30% <pim>
21635 grabow    44    0 2640K  344K run     0:05  0.00% <top>
   21 root      44    0 1624K   57K sleep   0:03  0.00% <update>
-------------------------------------------------
  procs    memory         pages                          intr        cpu
  r  w  u  act  free wire fault cow zero react pin pout  in  sy  cs  us  sy  id
  2 72 24   72K 1676 4797  540    0   60  256  480    0 483 417  2K  15   6  78
  2 72 24   73K 1186 4797  677    0  192   14  485    0 490  23  2K  13   6  82
  2 72 24   73K  922 4797  489    0   60   16  427    0 266  15  1K   6   3  90  <- before freezing
  5 69 24   73K  767 4797  265    0   60 4104  205    0 194  4K 42K   5   3  92  <- after freeze
  2 70 26   71K 3047 4801  376    0  193 4572  131   12 100 489 713   0   0 100
  2 73 25   71K 2890 4827  383   73   91  256  127    0 124 616 715   3   4  93
  2 72 24   71K 2793 4820  753  146  164  256  229    0 183  3K 922   5   6  88
  2 72 24   72K 2461 4820  448    0   60  256  314    0 347  15  1K   1   3  96
  3 71 24   72K 1982 4820  542    0   60  256  451    0 469 411  2K  10   7  83
  2 72 24   73K 1425 4820  757    0  196   13  560    0 573  25  2K  16   6  79
  2 72 24   73K  871 4820  626    0   60 1888  567    0 546 287  2K  16   8  77  <- before freezing
  8 65 24   73K  767 4820  201    0   60 4040  140    0 263 29K285K   2   3  96  <- after freeze
  4 69 24   73K  934 4796  406    1  252  13K   90   17 168 32K  1K   0   0 100    note the high
  2 71 24   73K 1053 4794  202    7   60 2184  102   24 135 293 833   0   4  96    context switch rate!
  2 70 24   71K 3088 4799  126    1   60   87   41    3  93 214 592   0   4  96
  2 70 24   71K 2786 4799  398    0   58  256  197    0 310  15  1K   1   4  95
  2 70 24   72K 2429 4799  583    0   58  256  435    0 353  18  1K   4   3  93
  2 70 24   72K 1965 4799  570    0  186  255  381    0 474 417  2K  12   7  80
  2 70 24   73K 1431 4805  578    0   58  255  520    0 526  25  2K  15   6  78
  2 70 24   73K  872 4805  616    0   58  143  558    1 576  15  2K  16   7  78  <- before freezing
  7 65 23   73K  767 4805  308    0   58 5562  249    0 283 28K287K   1   3  96  <- after freeze
  2 69 25   71K 3039 4804  407    8  249  13K   45    2 106 32K 796   0   0 100
  2 70 24   71K 2895 4810  279    0   58  256  136    0 173  59 862   1   3  96
  2 70 24   72K 2446 4816  632    0   58  256  436    0 441  15  2K   4   5  91
  2 70 24   72K 2063 4816  402    0   58  256  345    0 398  20  1K   8   4  88
  2 70 24   73K 1512 4816  605    0   58  256  545    0 545 427  2K  16   9  76
  2 70 24   73K  943 4816  746    0  190  208  556    0 576  15  2K  16   6  78  <- before freezing
-------------------------------------------------
Dr. Udo Grabowski                           email: udo.grabowski@imk.fzk.de
Institut f. Meteorologie und Klimaforschung II, Forschungszentrum Karlsruhe
Postfach 3640, D-76021 Karlsruhe, Germany           Tel: (+49) 7247 82-6026
http://www.fzk.de/imk/imk2/ame/grabowski/           Fax:         "    -6141
