
List:       beowulf
Subject:    Re: Diagnostic tools
From:       <alvin () Maggie ! Linux-Consulting ! com>
Date:       2002-10-22 2:50:31


hi don

thanx for your link ... will check it later
( i can't seem to get through to your server right now )

thanx
alvin


On Mon, 21 Oct 2002, Donald Becker wrote:

> On Mon, 21 Oct 2002 alvin@Maggie.Linux-Consulting.com wrote:
> 
> > On Mon, 21 Oct 2002, Manel Soria wrote:
> > 
> > > We are looking for a diagnostic tool that (ideally) would
> > > allow us to determine what component/s of a node fail. It should
> > > test the processor, RAM, disk and network cards under heavy load
> > > but in repeatable conditions.
> > 
> > testing those items individually is a lot of work ...
> > 
> > test process/procedure is more important than the actual test ??
> > 
> > - many different cpu/disk/memory/nic tests
> > 	http://www.Linux-1U.net/Diags/
> 
> The only Linux hardware tests you list are a CPU test (cpuburn) and many
> entries for memtest86.  You missed several Linux "SMART"-based disk
> diagnostics tools and the NIC diagnostics at
> http://www.scyld.com/diag/index.html
> 
> > > -Monitor the CPU temperature.
> > 
> > use i2c-2.6.5 and lm_sensors to read the health monitors on the
> > motherboard
> > 
> > also get a regular digital thermometer from your local hw store
> > for sanity checking
> 
> Good advice, since lm_sensors can only guess what type of thermal sensor
> is on the motherboard.  When the guessed calibration is off, it is
> usually way off, but you cannot count on that.

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf

