List:       fedora-devel-list
Subject:    Re: F40 Change Proposal: Optimized Binaries for the AMD64 Architecture (System-Wide)
From:       John Reiser <jreiser () bitwagon ! com>
Date:       2023-12-31 22:52:53
Message-ID: eba66071-eecf-4805-8f2e-a5b0f7e18d31 () bitwagon ! com

> Additional paths will be inserted into the search path used for
> executables on systems which have a compatible CPU.

Searching $PATH is a slow operation. It is so slow that a shell script
which processes many files using utilities from the coreutils and/or
binutils packages often factors out the PATH search by using
shell variables:

	CP=$(which cp)
	  ...
	$CP $file.in $file.out

Do not add directories to $PATH.  Any executable which may benefit
significantly from micro-architectural enhancements should use
the IFUNC mechanism explicitly.  If the developer of the executable
cannot be bothered to use IFUNC, then the uses of the executable
should not slow down EVERY shell path search in the entire session.
Glibc already uses IFUNC for many mem*() and str*() functions (such as
memcpy, strlen, etc.), which covers the vast majority of "random"
cases which usually benefit from such microarchitecture enhancements.
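
For readers who have not used it, explicit IFUNC looks roughly like the
sketch below. It assumes GCC (or a compatible clang) on x86-64 Linux with
glibc; the names sum_ints, sum_baseline, sum_avx2 and resolve_sum are
made up for illustration and do not come from any package.

```c
#include <stddef.h>

/* Baseline implementation: runs on any x86-64 CPU. */
static long sum_baseline(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Same loop, but the compiler is allowed to emit AVX2 instructions here. */
__attribute__((target("avx2")))
static long sum_avx2(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Resolver (hypothetical name): the dynamic linker calls it once while
 * resolving the symbol, and binds sum_ints to whatever it returns. */
static long (*resolve_sum(void))(const int *, size_t)
{
    __builtin_cpu_init();   /* required before __builtin_cpu_supports in a resolver */
    return __builtin_cpu_supports("avx2") ? sum_avx2 : sum_baseline;
}

/* Callers see one ordinary function; no $PATH entry is involved. */
long sum_ints(const int *v, size_t n) __attribute__((ifunc("resolve_sum")));
```

The choice is made once per process at symbol resolution, so a call such
as sum_ints(data, 4) costs the same as any other call through the PLT.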

> Fedora binaries for the AMD64 architecture are compiled with
> code-generation flags that support almost all CPU variants. But newer
> generations of processors gained additional instructions that may be
> used to generate faster code. A vendor-independent x86-64 psABI
> supplement defines four "microarchitecture levels": `x86-64-v1` (the
> baseline, our code targets this), `x86-64-v2` (+`SSE3`, CentOS
> targets this), `x86-64-v3` (+`AVX`[edit: and + 'AVX2']), `x86-64-v4` (+`AVX512`) [1].
Please note the edit: -v3 includes both AVX and AVX2.

There are x86_64 CPUs which have AVX (256-bit ymm registers, but
floating-point operations only) and not AVX2 (which extends integer
operations to the full 256 bits): for instance, AMD A10-7890K (cpu family
21, model 56: 4 CPU cores, 8 graphics cores), which was current in 2019.
Why is there no corresponding microarchitecture level?  In many cases
AVX2 provides significant benefit over AVX, without the monstrosity
of AVX512 (512-bit zmm registers) which requires vastly more chip area
and power consumption.
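
To see which of these cases a given machine falls into, a small check
along the following lines should do. It is only a sketch, assuming GCC or
clang on x86-64; simd_tier is a made-up helper name. Note that it tests
individual feature bits only: the psABI level x86-64-v3 also requires
other extensions (BMI1, BMI2, FMA and more), so "avx2" here does not by
itself mean the CPU qualifies for -v3.

```c
/* Returns the widest of the feature sets discussed above that the
 * running CPU reports: "avx512f", "avx2", "avx", or "baseline". */
static const char *simd_tier(void)
{
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f"))
        return "avx512f";
    if (__builtin_cpu_supports("avx2"))
        return "avx2";
    if (__builtin_cpu_supports("avx"))
        return "avx";   /* the AVX-without-AVX2 case described above */
    return "baseline";
}
```

Printing the result with puts(simd_tier()) from main is enough to tell
the cases apart.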