[prev in list] [next in list] [prev in thread] [next in thread] 

List:       cfe-dev
Subject:    Re: [cfe-dev] [PROPOSAL] Reintroduce guards for Intel	intrinsic	headers
From:       "Robinson, Paul" <Paul_Robinson () playstation ! sony ! com>
Date:       2015-07-30 21:53:36
Message-ID: E3B07FDB86BFF041819DC057DEED8FEABDC979CA () USCULXMSG04 ! am ! sony ! com
[Download RAW message or body]

[Attachment #2 (text/plain)]

The problem that we reported in PR24125 is fundamentally that for intrinsics \
implemented as macros (rather than inline functions) the symptom for "you didn't set \
the right target" is a backend crash. For those intrinsics there's no function to \
attach the attribute to.  I was thinking about re-introducing the #ifdefs for those \
cases, so we'd be going back to the "undefined identifier" diagnostic from the \
frontend.  But I'd be happier with some other solution that worked more smoothly for \
                macros.
--paulr

From: cfe-dev-bounces@cs.uiuc.edu [mailto:cfe-dev-bounces@cs.uiuc.edu] On Behalf Of \
                Smith, Kevin B
Sent: Thursday, July 30, 2015 1:43 PM
To: Reid Kleckner; Vedant Kumar; Eric Christopher
Cc: Clang Developers List
Subject: Re: [cfe-dev] [PROPOSAL] Reintroduce guards for Intel intrinsic headers



From: cfe-dev-bounces@cs.uiuc.edu<mailto:cfe-dev-bounces@cs.uiuc.edu> \
                [mailto:cfe-dev-bounces@cs.uiuc.edu] On Behalf Of Reid Kleckner
Sent: Thursday, July 30, 2015 10:13 AM
To: Vedant Kumar; Eric Christopher
Cc: Clang Developers List
Subject: Re: [cfe-dev] [PROPOSAL] Reintroduce guards for Intel intrinsic headers

I'm opposed to this. Going forward, I would really like target intrinsics to be \
available regardless of the current feature set, so users don't need hacks like \
these.

I see two ways to do this with different tradeoffs:
1. Diagnose missing target attributes when calling the intel intrinsics. I was \
surprised to find that we don't already do this. 2. We could support some automatic \
transfer of the target attribute to the caller when calling these intrinsics, but I \
worry that this is too confusing.

KBS> Regarding automatic transfer of the target attribute.  It seems like something \
like this:

static __inline __m256
__attribute__((__always_inline__, __nodebug__))
_mm256_add_ps(__m256 __a, __m256 __b)
{
  __builtin_assume(__has_feature(FEATURE_AVX));
  return __a + __b;
}

might be a good way to represent this. It has the nice property that the \
__builtin_assume only applies at the exact point in execution where it occurs.  You \
could probably add something like this as a small addition to what Eric has already \
done with the target attribute, or you could try to make inlining of something with \
the target attribute automatically apply a target __builtin_assume property.

The difficulty is defining how the semantics of the different target properties have \
to be retained during optimization and code generation. I'm sure this difficulty is \
why the granularity that Eric is working on is at the routine level.

Kevin


Implicitly setting a target attribute may block inlining that the user expected to \
happen, for example. Alternatively, there may be a dynamic cpuid check in the same \
function between SSE2 and AVX variants of the same algorithm, and now the SSE2 loop \
will unexpectedly use AVX instructions.

So we should probably settle with telling the user to add -msseNN or \
__atribute__((target(("sseNN")))).

On Thu, Jul 30, 2015 at 8:53 AM, Vedant Kumar <vsk@apple.com<mailto:vsk@apple.com>> \
wrote: I've run into some code which no longer compiles because of two recent \
changes:

  41885d3 Update the intel intrinsic headers to use the target attribute support.
  695aff1 Use a define for per-file function attributes for the Intel intrinsic \
headers.

Specifically, one project defines its own SSE4.1 emulation routines when the real \
intrinsics aren't available. This is a problem because they've reused the names of \
the intrinsics. E.g;

> #ifndef __SSE4_1__
> #define _mm_extract_epi8(a_, ndx) ({ ... })
> static inline __m128i _mm_blendv_epi8(__m128i a, __m128i b, __m128i mask) { ... }
> ...
> #endif

SSE4.1 intrinsics now leak into the project when it's being compiled for targets \
without SSE4.1 support. Compilation fails with "error: redefinition ...".

When these changes were initially being discussed, I think our stance was that we \
shouldn't support code like this [1]. However, we should reconsider for the sake of \
avoiding breakage. AFAICT, we would need to revert just two types of changes:

In lib/Headers/__wmmintrin_aes.h:

> -#if defined (__SSE4_2__) || defined (__SSE4_1__)
> #include <smmintrin.h>
> -#endif


In lib/Headers/smmintrin.h:

> -#ifndef __SSE4_1__
> -#error "SSE4.1 instruction set not enabled"
> -#else

I don't see any downsides to reintroducing these guards. If everyone's OK with this, \
I can mail a patch in. The alternative is to have clients rewrite their emulation \
layers like this:

> #ifdef __SSE4_1__
> #define compat_mm_extract_epi8 _mm_extract_epi8
> static inline __m128i combat_mm_blendv_epi8(__m128i a, __m128i b, __m128i mask) \
> __attribute__((__target__(("sse4.1")))) { return _mm_blendv_epi8(a, b, mask);
> }
> ...
> #else /* OK, no native SSE 4.1. Define our own. */
> #define compat_mm_extract_epi8(a_, ndx) ({ ... })
> static inline __m128i compat_mm_blendv_epi8(__m128i a, __m128i b, __m128i mask) { \
>                 ... }
> ...
> #endif

... and then replace all calls to intrinsics with calls to the new compatibility \
routines. This seems like a lot of tedious work, and I'd love to help people avoid it \
:).

Let me know what you think!

vedant

[1] http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20150615/131192.html
_______________________________________________
cfe-dev mailing list
cfe-dev@cs.uiuc.edu<mailto:cfe-dev@cs.uiuc.edu>
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev


[Attachment #3 (text/html)]

<html xmlns:v="urn:schemas-microsoft-com:vml" \
xmlns:o="urn:schemas-microsoft-com:office:office" \
xmlns:w="urn:schemas-microsoft-com:office:word" \
xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" \
xmlns="http://www.w3.org/TR/REC-html40"> <head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
	{font-family:Calibri;
	panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
	{font-family:Tahoma;
	panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0in;
	margin-bottom:.0001pt;
	font-size:12.0pt;
	font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
	{mso-style-priority:99;
	color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{mso-style-priority:99;
	color:purple;
	text-decoration:underline;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
	{mso-style-priority:99;
	mso-style-link:"Balloon Text Char";
	margin:0in;
	margin-bottom:.0001pt;
	font-size:8.0pt;
	font-family:"Tahoma","sans-serif";}
span.EmailStyle17
	{mso-style-type:personal;
	font-family:"Arial","sans-serif";
	color:#1F497D;}
span.BalloonTextChar
	{mso-style-name:"Balloon Text Char";
	mso-style-priority:99;
	mso-style-link:"Balloon Text";
	font-family:"Tahoma","sans-serif";}
span.EmailStyle20
	{mso-style-type:personal-reply;
	font-family:"Calibri","sans-serif";
	color:#1F497D;}
.MsoChpDefault
	{mso-style-type:export-only;
	font-size:10.0pt;}
@page WordSection1
	{size:8.5in 11.0in;
	margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
	{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span \
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">The \
problem that we reported in PR24125 is fundamentally that for intrinsics implemented \
as macros (rather than inline functions) the symptom for &quot;you didn't  set the \
right target&quot; is a backend crash. For those intrinsics there's no function to \
attach the attribute to.&nbsp; I was thinking about re-introducing the #ifdefs for \
those cases, so we'd be going back to the &quot;undefined identifier&quot; diagnostic \
from the frontend.&nbsp;  But I'd be happier with some other solution that worked \
more smoothly for macros.<o:p></o:p></span></p> <p class="MsoNormal"><span \
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">--paulr<o:p></o:p></span></p>
 <p class="MsoNormal"><a name="_MailEndCompose"><span \
style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></a></p>
 <div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span \
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;">From:</span></b><span \
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;"> \
cfe-dev-bounces@cs.uiuc.edu [mailto:cfe-dev-bounces@cs.uiuc.edu] <b>On Behalf Of \
</b>Smith, Kevin B<br> <b>Sent:</b> Thursday, July 30, 2015 1:43 PM<br>
<b>To:</b> Reid Kleckner; Vedant Kumar; Eric Christopher<br>
<b>Cc:</b> Clang Developers List<br>
<b>Subject:</b> Re: [cfe-dev] [PROPOSAL] Reintroduce guards for Intel intrinsic \
headers<o:p></o:p></span></p> </div>
</div>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
 <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
 <p class="MsoNormal"><b><span \
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;">From:</span></b><span \
style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;"> <a \
href="mailto:cfe-dev-bounces@cs.uiuc.edu">cfe-dev-bounces@cs.uiuc.edu</a> [<a \
href="mailto:cfe-dev-bounces@cs.uiuc.edu">mailto:cfe-dev-bounces@cs.uiuc.edu</a>] \
<b>On Behalf Of </b>Reid Kleckner<br> <b>Sent:</b> Thursday, July 30, 2015 10:13 \
AM<br> <b>To:</b> Vedant Kumar; Eric Christopher<br>
<b>Cc:</b> Clang Developers List<br>
<b>Subject:</b> Re: [cfe-dev] [PROPOSAL] Reintroduce guards for Intel intrinsic \
headers<o:p></o:p></span></p> <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<div>
<p class="MsoNormal">I'm opposed to this. Going forward, I would really like target \
intrinsics to be available regardless of the current feature set, so users don't need \
hacks like these.<o:p></o:p></p> <div>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
</div>
<div>
<p class="MsoNormal">I see two ways to do this with different \
tradeoffs:<o:p></o:p></p> </div>
<div>
<p class="MsoNormal">1. Diagnose missing target attributes when calling the intel \
intrinsics. I was surprised to find that we don't already do this.<o:p></o:p></p> \
</div> <div>
<p class="MsoNormal">2. We could support some automatic transfer of the target \
attribute to the caller when calling these intrinsics, but I worry that this is too \
confusing.<o:p></o:p></p> <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
 <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">KBS&gt; \
Regarding automatic transfer of the target attribute.&nbsp; It seems like something \
like this:<o:p></o:p></span></p> <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
 <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">static \
__inline __m256 <br>
__attribute__((__always_inline__, __nodebug__))<o:p></o:p></span></p>
<p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">_mm256_add_ps(__m256 \
__a, __m256 __b)<o:p></o:p></span></p> <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">{<o:p></o:p></span></p>
 <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">&nbsp; \
__builtin_assume(__has_feature(FEATURE_AVX));<br> &nbsp; return __a &#43; \
__b;<o:p></o:p></span></p> <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">}<o:p></o:p></span></p>
 <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
 <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">might \
be a good way to represent this. It has the nice property that the __builtin_assume \
only applies at the exact point in execution where<o:p></o:p></span></p> <p \
class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">it \
occurs.&nbsp; You could probably add something like this as a small addition to what \
Eric has already done with the target attribute, or you could try to make inlining  \
of something with the target attribute automatically apply a target __builtin_assume \
property.<o:p></o:p></span></p> <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
 <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">The \
difficulty is defining how the semantics of the different target properties have to \
be retained during optimization and code generation.<o:p></o:p></span></p> <p \
class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">I'm \
sure this difficulty is why the granularity that Eric is working on is at the routine \
level.<o:p></o:p></span></p> <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
 <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D">Kevin<o:p></o:p></span></p>
 <p class="MsoNormal"><span \
style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
 </div>
<div>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
</div>
<div>
<p class="MsoNormal">Implicitly setting a target attribute may block inlining that \
the user expected to happen, for example. Alternatively, there may be a dynamic cpuid \
check in the same function between SSE2 and AVX variants of the same algorithm, and \
now  the SSE2 loop will unexpectedly use AVX instructions.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
</div>
<div>
<p class="MsoNormal">So we should probably settle with telling the user to add \
-msseNN or __atribute__((target((&quot;sseNN&quot;)))).<o:p></o:p></p> </div>
</div>
<div>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<div>
<p class="MsoNormal">On Thu, Jul 30, 2015 at 8:53 AM, Vedant Kumar &lt;<a \
href="mailto:vsk@apple.com" target="_blank">vsk@apple.com</a>&gt; \
wrote:<o:p></o:p></p> <p class="MsoNormal">I've run into some code which no longer \
compiles because of two recent changes:<br> <br>
&nbsp; 41885d3 Update the intel intrinsic headers to use the target attribute \
support.<br> &nbsp; 695aff1 Use a define for per-file function attributes for the \
Intel intrinsic headers.<br> <br>
Specifically, one project defines its own SSE4.1 emulation routines when the real \
intrinsics aren't available. This is a problem because they've reused the names of \
the intrinsics. E.g;<br> <br>
&gt; #ifndef __SSE4_1__<br>
&gt; #define _mm_extract_epi8(a_, ndx) ({ ... })<br>
&gt; static inline __m128i _mm_blendv_epi8(__m128i a, __m128i b, __m128i mask) { ... \
}<br> &gt; ...<br>
&gt; #endif<br>
<br>
SSE4.1 intrinsics now leak into the project when it's being compiled for targets \
without SSE4.1 support. Compilation fails with &quot;error: redefinition \
...&quot;.<br> <br>
When these changes were initially being discussed, I think our stance was that we \
shouldn't support code like this [1]. However, we should reconsider for the sake of \
avoiding breakage. AFAICT, we would need to revert just two types of changes:<br> \
<br> In lib/Headers/__wmmintrin_aes.h:<br>
<br>
&gt; -#if defined (__SSE4_2__) || defined (__SSE4_1__)<br>
&gt;&nbsp; #include &lt;smmintrin.h&gt;<br>
&gt; -#endif<br>
<br>
<br>
In lib/Headers/smmintrin.h:<br>
<br>
&gt; -#ifndef __SSE4_1__<br>
&gt; -#error &quot;SSE4.1 instruction set not enabled&quot;<br>
&gt; -#else<br>
<br>
I don't see any downsides to reintroducing these guards. If everyone's OK with this, \
I can mail a patch in. The alternative is to have clients rewrite their emulation \
layers like this:<br> <br>
&gt; #ifdef __SSE4_1__<br>
&gt; #define compat_mm_extract_epi8 _mm_extract_epi8<br>
&gt; static inline __m128i combat_mm_blendv_epi8(__m128i a, __m128i b, __m128i mask) \
__attribute__((__target__((&quot;sse4.1&quot;)))) {<br> &gt;&nbsp; &nbsp;return \
_mm_blendv_epi8(a, b, mask);<br> &gt; }<br>
&gt; ...<br>
&gt; #else /* OK, no native SSE 4.1. Define our own. */<br>
&gt; #define compat_mm_extract_epi8(a_, ndx) ({ ... })<br>
&gt; static inline __m128i compat_mm_blendv_epi8(__m128i a, __m128i b, __m128i mask) \
{ ... }<br> &gt; ...<br>
&gt; #endif<br>
<br>
... and then replace all calls to intrinsics with calls to the new compatibility \
routines. This seems like a lot of tedious work, and I'd love to help people avoid it \
:).<br> <br>
Let me know what you think!<br>
<br>
vedant<br>
<br>
[1] <a href="http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20150615/131192.html" \
target="_blank"> http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20150615/131192.html</a><br>
 _______________________________________________<br>
cfe-dev mailing list<br>
<a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" \
target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><o:p></o:p></p> \
</div> <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
</div>
</div>
</div>
</body>
</html>



_______________________________________________
cfe-dev mailing list
cfe-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

--===============6381810838451086987==--

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic