List:       illumos-discuss
Subject:    [discuss] Re: [smartos-discuss] Illumos failure on EPYC at Hetzner
From:       Sriram Narayanan <sriramnrn () gmail ! com>
Date:       2019-02-03 9:06:05
Message-ID: CANiY96aSB3kkMg6UDN=viXsoaPTFgnbfxKBHHOAtbU1ocdOG_w () mail ! gmail ! com

On Wed, Jan 9, 2019 at 12:21 PM Robert Mustacchi <rm@joyent.com> wrote:

> On 1/8/19 20:13 , Sriram Narayanan wrote:
> > Hi all:
> >
> > I got access to an EPYC at Hetzner. When I booted the latest SmartOS on
> > it, I got the error "MCA is not available on core 0 (cmi_hdl_create
> > returned NULL)".
> >
> > This is specifically here:
> > https://github.com/joyent/illumos-joyent/blob/master/usr/src/uts/i86pc/os/cmi_hw.c#L1242
> >
> > Has anyone got SmartOS or another Illumos based distro running on an
> > EPYC?
>
> Hi Sriram,
>
> A fix for this is in the works, along with a number of other EPYC
> improvements. This particular issue is a multi-pronged topology
> problem, and I just finished putting a fix together for it yesterday.
> I hope to wrap up a good chunk of this and other stuff over the next
> week or so and have it upstream in illumos shortly thereafter. This is
> https://smartos.org/bugview/OS-7486 and the fix I'm currently testing
> out is https://cr.joyent.us/#/c/5324/. There's some other stuff I have
> in the queue to improve the general AMD EPYC support.
>

Hi Robert:

I was able to install and boot SmartOS build joyent_20190131T012237Z on an
AMD EPYC at Hetzner. Thanks very much for your work in this space.

The "prtconf -v" output is here:
https://gist.github.com/sriramnrn/1b5326544fcecfea32e2d667369ab741

"fmadm faulty" reports one fault:

Fault class : fault.io.pciex.device-interr
Affects     : dev:////pci@1e,0/pci1022,1454@8,1/pci1458,1000@0,2
                  faulted and taken out of service
FRU         : "MB"
                  (hc://:product-id=MZ31-AR0-00:server-id=karna:chassis-id=01234567890123456789AB/motherboard=0)
                  faulty
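
For anyone who wants to compare reports like this across machines, here is a minimal sketch that splits the excerpt above into labeled fields. It is not part of any illumos tooling: the function name parse_fault is hypothetical, and the parser assumes only the "Label : value" layout with continuation lines seen in this one excerpt, so it may need adjusting for other fmadm output.

```python
import re

# Excerpt copied from the "fmadm faulty" report in this message.
REPORT = """\
Fault class : fault.io.pciex.device-interr
Affects     : dev:////pci@1e,0/pci1022,1454@8,1/pci1458,1000@0,2
                  faulted and taken out of service
FRU         : "MB"
(hc://:product-id=MZ31-AR0-00:server-id=karna:chassis-id=01234567890123456789AB/motherboard=0)
                  faulty
"""

# A field starts at column 0 and has the form "Label : value".
# The colon must be followed by whitespace, so URIs such as
# "dev:////..." or "hc://..." are not mistaken for labels.
FIELD = re.compile(r"^(\S[^:]*?)\s*:\s(.*)$")


def parse_fault(text):
    """Return {label: [value, continuation, ...]} for one fault entry.

    Hypothetical helper: lines that do not look like "Label : value"
    are treated as continuations of the most recent field.
    """
    fields = {}
    last = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = FIELD.match(line)
        if match:
            last = match.group(1)
            fields[last] = [match.group(2).strip()]
        elif last is not None:
            fields[last].append(line.strip())
    return fields


if __name__ == "__main__":
    for label, values in parse_fault(REPORT).items():
        print(label, "->", values)
```

The device path under "Affects" can then be split on "/" to walk the PCI hierarchy by hand against the prtconf -v output.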

Here is a screenshot of the server console: https://imgur.com/a/WHWfAOa

Is there a test suite that I could run to assess whether the CPU and the
various devices are ready for use with SmartOS?

-- Sriram

------------------------------------------
illumos: illumos-discuss
Permalink: https://illumos.topicbox.com/groups/discuss/Tf06a5f6adce2665e-M3785dbbe5b0bd80951eadf73
Delivery options: https://illumos.topicbox.com/groups/discuss/subscription
