
List:       linux-poweredge
Subject:    Re: [Linux-PowerEdge] missing temps among other things? (PE-T610)
From:       James Leone <linuxcpa () gmail ! com>
Date:       2013-06-28 7:30:17
Message-ID: CADFdLRfr9sW-5uELMyVK+oQT7Mn=Yrsv6ndRBJKbDSd9Ki-AuA () mail ! gmail ! com

I'm telling you! Trying to get a simple BIOS manager working on one server
has been a real treat. I probably won't get a Dell again, and if I did I'd
never put Linux on it. This has taken way too long to learn. Nothing was
clear until I spent about a month figuring out that several packages do the
same thing. That it's not a good idea to use the drivers and software page
for the server; use a repository instead. That if I activate IPMI over LAN,
Samba dies. That you can't do half the things people say you can, like
ipmitool over LAN, unless you have an iDRAC. That none of the diagnostic
programs work. And there are so many overlapping settings in Linux's /etc
directory that you could break the system with the wrong RPM.
It's a lot of work that would be an easy download with Windows. Why do I
have a list of multiple RPMs? Can't I just download one thing for OMSA and
that's it? Well, I guess that's not the mindset -

J


On Fri, Jun 28, 2013 at 12:15 AM, James Leone <linuxcpa@gmail.com> wrote:

> I had a voltage surge the day the box arrived. The server has been
> like that ever since. Now it won't even let me resolve DNS.
>
>
> On Thu, Jun 27, 2013 at 7:09 PM, Linda A. Walsh <dell@tlinx.org> wrote:
>
>>
>>
>> I've been recording the system temps, fan speeds and volts/amps/watts for
>> my system since I got it ...
>>
>> I've had a total of 9 temps, one labeled ambient and one labeled planar
>> and 7 unclear, though 2 of them look like degrees below TjMAX (i.e.
>> negative).
>>
>> Different from the one from "sistemes oficina" below,
>> I have 2 more temps, but also have a 2nd
>> PS, so those two might be tied into that.
>>
>> (Does anyone know what the others are?)
>>
>> Anyway, recently, all the temps went dead except for ambient.
>>
>>
>> Sistemes Oficina wrote:
>> > Hi,
>> > Just some info for comparison
>> > Server T610
>> > [root@hostaname ~]#  ipmitool sdr list full
>> > Temp             | -63 degrees C     | ok
>> > Temp             | 50 degrees C      | ok
>> > Temp             | 34 degrees C      | ok
>> > Temp             | disabled          | ns
>> > Ambient Temp     | 20 degrees C      | ok
>> > Planar Temp      | 26 degrees C      | ok
>> > FAN 1 RPM        | 10080 RPM         | ok
>> > FAN 2 RPM        | 10080 RPM         | ok
>> > FAN 3 RPM        | 1560 RPM          | ok
>> > FAN 4 RPM        | 1560 RPM          | ok
>> > Temp             | 32 degrees C      | ok
>> > Temp             | disabled          | ns
>> > Temp             | 54 degrees C      | cr
>> > Current 1        | 0.48 Amps         | ok
>> > Current 2        | disabled          | ns
>> > Voltage 1        | 230 Volts         | ok
>> > Voltage 2        | disabled          | ns
>> > System Level     | 112 Watts         | ok
>> > [root@hostname ~]# uname -a
>> > Linux hostname 2.6.18-194.32.1.el5 #1 SMP Mon Dec 20 10:52:42 EST 2010
>> > x86_64 x86_64 x86_64 GNU/Linux
>> ====
>> The same output as above on my machine now shows:
>> Ishtar:law# ipmitool sdr list full
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Ambient Temp     | 30 degrees C      | ok
>> Planar Temp      | disabled          | ns
>> FAN 1 RPM        | 3000 RPM          | ok
>> FAN 2 RPM        | 3000 RPM          | ok
>> FAN 3 RPM        | 3000 RPM          | ok
>> FAN 4 RPM        | 3000 RPM          | ok
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Current 1        | 1 Amps            | ok
>> Current 2        | 1.28 Amps         | ok
>> Voltage 1        | 116 Volts         | ok
>> Voltage 2        | 116 Volts         | ok
>> System Level     | 280 Watts         | ok
>> Ishtar:law# uname -a
>> Linux Ishtar 3.9.7-Isht-Van #1 SMP PREEMPT Sat Jun 22 17:21:09 PDT 2013
>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> And a full sdr shows:
>> Ishtar:law# ipmitool sdr
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Ambient Temp     | 30 degrees C      | ok
>> Planar Temp      | disabled          | ns
>> CMOS Battery     | 0x00              | ok
>> VCORE            | 0x00              | ok
>> VCORE            | 0x00              | ok
>> 0.75 VTT PG      | 0x00              | ok
>> 0.75 VTT PG      | 0x00              | ok
>> IOH THERMTRIP    | Not Readable      | ns
>> 1.5V PG          | 0x00              | ok
>> 1.8V PG          | 0x00              | ok
>> 3.3V PG          | 0x00              | ok
>> 5V PG            | 0x00              | ok
>> MEM PG           | 0x00              | ok
>> MEM PG           | 0x00              | ok
>> VTT PG           | 0x00              | ok
>> VTT PG           | 0x00              | ok
>> 0.9V PG          | 0x00              | ok
>> 1.8 PLL PG       | 0x00              | ok
>> 1.8 PLL PG       | 0x00              | ok
>> 8.0V PG          | 0x00              | ok
>> 1.1V PG          | 0x00              | ok
>> 1.0V LOM PG      | 0x00              | ok
>> 1.0V AUX PG      | 0x00              | ok
>> 1.05V PG         | 0x00              | ok
>> PFault Fail Safe | Not Readable      | ns
>> Heatsink Pres    | 0x00              | ok
>> iDRAC6 Ent Pres  | 0x00              | ok
>> USB Cable Pres   | 0x00              | ok
>> Stor Adapt Pres  | 0x00              | ok
>> FAN 1 RPM        | 3000 RPM          | ok
>> FAN 2 RPM        | 3000 RPM          | ok
>> FAN 3 RPM        | 3000 RPM          | ok
>> FAN 4 RPM        | 3000 RPM          | ok
>> Presence         | 0x00              | ok
>> Presence         | 0x00              | ok
>> Presence         | 0x00              | ok
>> Presence         | 0x00              | ok
>> Presence         | 0x00              | ok
>> Status           | 0x00              | ok
>> Status           | 0x00              | ok
>> Status           | 0x00              | ok
>> Status           | 0x00              | ok
>> OS Watchdog      | 0x00              | ok
>> SEL              | Not Readable      | ns
>> Intrusion        | 0x00              | ok
>> PS Redundancy    | 0x00              | ok
>> Fan Redundancy   | 0x00              | ok
>> CPU Temp Interf  | Not Readable      | ns
>> Drive            | 0x00              | ok
>> Cable SAS A      | 0x00              | ok
>> Cable SAS B      | 0x00              | ok
>> DKM Status       | 0x00              | ok
>> ECC Corr Err     | Not Readable      | ns
>> ECC Uncorr Err   | Not Readable      | ns
>> I/O Channel Chk  | Not Readable      | ns
>> PCI Parity Err   | Not Readable      | ns
>> PCI System Err   | Not Readable      | ns
>> SBE Log Disabled | Not Readable      | ns
>> Logging Disabled | Not Readable      | ns
>> Unknown          | Not Readable      | ns
>> CPU Protocol Err | Not Readable      | ns
>> CPU Bus PERR     | Not Readable      | ns
>> CPU Init Err     | Not Readable      | ns
>> CPU Machine Chk  | Not Readable      | ns
>> Memory Spared    | Not Readable      | ns
>> Memory Mirrored  | Not Readable      | ns
>> Memory RAID      | Not Readable      | ns
>> Memory Added     | Not Readable      | ns
>> Memory Removed   | Not Readable      | ns
>> Memory Cfg Err   | Not Readable      | ns
>> Mem Redun Gain   | Not Readable      | ns
>> PCIE Fatal Err   | Not Readable      | ns
>> Chipset Err      | Not Readable      | ns
>> Err Reg Pointer  | Not Readable      | ns
>> Mem ECC Warning  | Not Readable      | ns
>> Mem CRC Err      | Not Readable      | ns
>> USB Over-current | Not Readable      | ns
>> POST Err         | Not Readable      | ns
>> Hdwr version err | Not Readable      | ns
>> Mem Overtemp     | Not Readable      | ns
>> Mem Fatal SB CRC | Not Readable      | ns
>> Mem Fatal NB CRC | Not Readable      | ns
>> OS Watchdog Time | Not Readable      | ns
>> Non Fatal PCI Er | Not Readable      | ns
>> Fatal IO Error   | Not Readable      | ns
>> MSR Info Log     | Not Readable      | ns
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Temp             | disabled          | ns
>> Current 1        | 1.08 Amps         | ok
>> Current 2        | 1.28 Amps         | ok
>> Voltage 1        | 116 Volts         | ok
>> Voltage 2        | 116 Volts         | ok
>> System Level     | 280 Watts         | ok
>> Power Optimized  | 0x00              | ok
>> ROMB Battery     | 0x00              | ok
>> SD vFLash Status | Not Readable      | ns
>>
>> ----
>> I don't remember all those saying not readable before...
>> They didn't say anything interesting, (had hex numbers)... but
>> not a big bunch of 'not readables'... and no temps disabled.
>>
>>
>> Has anyone seen anything like this before and what might be causing
>> the problem?  Is it a bios setting somewhere maybe?
>>
>> FWIW, I booted off an OMSA CD, and it was only able to read
>> the planar temp as well...so about 8 of my temp sensors have
>> gone AWOL!  ;-/   HW issue?  SW?  Fixable by me?
>>
>> Was asking a support engineer, but then got this back and I started
>> getting suspicious about getting a straight answer:
>>
>> -----
>>
>>                 "This is --. the 3rd shift resolution manager with
>> Dell. I am still working on this for you. We are running into a few
>> hurdles gathering the proper resources to help us with the temperature
>> sensors. From the Hardware supported level, the iDRAC (racadm) I
>> mentioned, I am being told that it only monitors system board ambient
>> and IO planar ambient (similar to OMSA). Even if the sensors testing
>> that you are doing is at the OS Level, I am looking for an OS Resource
>> that may know and give us guidance. We will keep you posted."
>>
>> -----
>>
>> The planar ambient and system board ambient??!?  how can a temp be ambient
>> AND for a physical object?
>>
>> ARG!...
>>
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Linux-PowerEdge mailing list
>> Linux-PowerEdge@dell.com
>> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
>>
>
>
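
For anyone comparing sdr dumps like the ones quoted above, here is a minimal
sketch (not something from the thread) that parses the pipe-separated rows
printed by `ipmitool sdr` and lists sensors whose status went from "ok" to
"ns" between two captures. The function names and the OLD/NEW samples are
made up for illustration. The sample rows are copied from the two listings
above, which come from two different machines, so the comparison only
demonstrates the mechanics; a real check would use two captures from the
same box.

```python
from collections import Counter


def parse_sdr(text):
    """Return (name, reading, status) tuples from `ipmitool sdr` style text."""
    rows = []
    for line in text.splitlines():
        # Drop mail-quote prefixes such as ">> " so pasted threads also parse.
        line = line.lstrip("> ").rstrip()
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # prompt, uname output, blank line, etc.
        rows.append(tuple(parts))
    return rows


def index_rows(rows):
    """Key rows by (name, nth occurrence), since names like 'Temp' repeat."""
    seen = Counter()
    indexed = {}
    for name, reading, status in rows:
        seen[name] += 1
        indexed[(name, seen[name])] = (reading, status)
    return indexed


def status_counts(rows):
    """Count how many sensors report each status code (ok, ns, cr, ...)."""
    return Counter(status for _, _, status in rows)


def went_dark(before, after):
    """List sensors that were 'ok' in `before` and are 'ns' in `after`."""
    b, a = index_rows(before), index_rows(after)
    return [key for key in a
            if key in b and b[key][1] == "ok" and a[key][1] == "ns"]


# Hypothetical samples: a few rows copied from the listings in this thread.
OLD = """
Temp             | 50 degrees C      | ok
Ambient Temp     | 20 degrees C      | ok
Planar Temp      | 26 degrees C      | ok
"""

NEW = """
>> Temp             | disabled          | ns
>> Ambient Temp     | 30 degrees C      | ok
>> Planar Temp      | disabled          | ns
"""

if __name__ == "__main__":
    old_rows, new_rows = parse_sdr(OLD), parse_sdr(NEW)
    print(dict(status_counts(new_rows)))
    for name, nth in went_dark(old_rows, new_rows):
        print(f"{name} (#{nth}) no longer reports a reading")
```

Matching repeated names by position assumes the BMC lists its sensors in the
same order on every run, which holds for the listings above but is an
assumption, not something ipmitool guarantees.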

_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge

