'Re: Fedora 32 System-Wide Change proposal: x86-64 micro-architecture update'


List:       fedora-devel-list
Subject:    Re: Fedora 32 System-Wide Change proposal: x86-64 micro-architecture update
From:       Peter Robinson <pbrobinson () gmail ! com>
Date:       2019-07-31 23:32:47
Message-ID: CALeDE9N40hh0n5J7jush72HiiFUuc+f3xPbu5YpM+WwvKNZu7A () mail ! gmail ! com

> > I disagree with ANY raised vector instruction requirement, considering that:
> > * it would make Fedora incompatible with some hardware out there,
>
> That's already so for hardware which is at least of similar age to
> SSE2-only x86_64, i.e. POWER7; my build logs show -mcpu=power8.

For ppc64le, which is the only Power64 architecture Fedora now
supports, the first hardware on which Linux supported little endian
was POWER8, so that is exactly as expected. By contrast, ppc64, which
is big endian, still supported whatever generation the Power Mac G5
was, right up until it was retired.

> > * the performance increase to be had is marginal, given that we are mostly
> >   talking about code written in C or C++ without even compiler vectorization
> >   (-ftree-vectorize) turned on,
>
> I forget the details, but libxsmm is something that depends on an
> instruction introduced with SSE3, and is a good example of portable
> performance engineering over a wide range of (x86_64) processors.
>
> > * there are already mechanisms for runtime feature detection, which are
> >   already widely used in those few packages that can actually benefit from
> >   the vector instructions (because they are performance-sensitive and
> >   because they have handwritten assembly or vector intrinsics code),
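
To make the runtime feature detection mentioned in the quoted point
concrete, here is a minimal sketch in C. It assumes GCC or Clang on
x86_64, where the `__builtin_cpu_supports` builtin and the `target`
function attribute are available; the function names (`dot`,
`dot_baseline`, `dot_avx2`) are hypothetical and not taken from any
package discussed in this thread.

```c
#include <stddef.h>

/* Portable baseline: built with the distribution's default flags. */
static double dot_baseline(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
/* Same source, but the compiler is allowed to emit AVX2 instructions
 * for this one function only (assumption: GCC/Clang target attribute). */
__attribute__((target("avx2")))
static double dot_avx2(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
#endif

/* Dispatcher: checks the CPU at run time and falls back to the
 * baseline on hardware (or compilers) without AVX2 support. */
double dot(const double *a, const double *b, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    if (__builtin_cpu_supports("avx2"))
        return dot_avx2(a, b, n);
#endif
    return dot_baseline(a, b, n);
}
```

Callers only ever see `dot()`, so the binary still runs on an SSE2-only
machine. The cost is that every hot routine needs this treatment by
hand, which is the part the quoted reply below argues is not done
widely enough.
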
>
> I disagree that dynamic dispatch is sufficiently widely used in
> scientific code (probably can't be with Fortran).  Also recent GCC can
> provide decent performance for specific targets without target-specific
> programming.  BLIS' portable C version DGEMM got around 60%(?) the speed
> of the hand-tuned implementation built for haswell, as reported
> somewhere in the BLIS issues.  For people who don't know, DGEMM
> (double-precision general matrix-matrix multiplication) is as
> SIMD-intensive as it gets, with high enough floating point intensity
> relative to memory access for large enough dimensions; non-matrix-matrix
> linear algebra typically doesn't reach that intensity if the data
> doesn't fit in cache.
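
The point about floating point intensity can be put in rough numbers.
The sketch below uses an idealized model that is my assumption, not
something from the quoted mail: square n x n operands in double
precision, each operand moved to or from memory exactly once, and
hypothetical helper names. Under that model, DGEMM does 2n^3 flops
over 4n^2 doubles of traffic, while a matrix-vector product does 2n^2
flops over roughly n^2 + 3n doubles.

```c
#include <stddef.h>

/* Idealized flops-per-byte for C = A*B + C with n x n doubles.
 * Traffic: read A, B, C and write C, i.e. 4*n*n doubles of 8 bytes. */
double gemm_intensity(size_t n)
{
    double dn = (double)n;
    double flops = 2.0 * dn * dn * dn;
    double bytes = 8.0 * 4.0 * dn * dn;
    return flops / bytes;              /* simplifies to n / 16 */
}

/* Idealized flops-per-byte for y = A*x + y with an n x n matrix.
 * Traffic: read A (n*n), x and y (n each), write y (n). */
double gemv_intensity(size_t n)
{
    double dn = (double)n;
    double flops = 2.0 * dn * dn;
    double bytes = 8.0 * (dn * dn + 3.0 * dn);
    return flops / bytes;              /* stays below 0.25 for all n */
}
```

In this model the matrix-matrix case grows linearly with n, whereas
the matrix-vector case is capped below a quarter of a flop per byte
however large the problem gets, so wider vector units help it little
once the data no longer fits in cache.
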
> _______________________________________________
> devel mailing list -- devel@lists.fedoraproject.org
> To unsubscribe send an email to devel-leave@lists.fedoraproject.org
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
