From cfe-dev  Mon Aug 03 20:37:45 2015
From: Hans Wennborg <hans () chromium ! org>
Date: Mon, 03 Aug 2015 20:37:45 +0000
To: cfe-dev
Subject: Re: [cfe-dev] [PROPOSAL] Reintroduce guards for Intel intrinsic headers
Message-Id: <CAB8jPhfJ_LZFpEo2QVHAyYqE=uGR=7y-RuTrmsTr+nwxsdH62g () mail ! gmail ! com>
X-MARC-Message: https://marc.info/?l=cfe-dev&m=143863482414375

On Sun, Aug 2, 2015 at 5:09 PM, Chandler Carruth <chandlerc@google.com> wrote:
> Would cherrypicking the diagnostics to the 3.7 branch be better or worse?
> (I'm of two minds, curious what others think...)

The alternative of reverting would have the downside of missing out on
some of the target attribute functionality in 3.7. I haven't been
following closely enough to determine how great a loss that would be,
but as far as I understand, this is still a work in progress, right?

The alternative of cherry-picking diagnostics has the problem that I
don't think any diagnostics have landed yet :-)

My inclination is to revert back to safety here. The revert would be
pretty hairy though, as there have been a number of changes to these
files since r239883 landed. I'm still trying to figure this out.


> On Sun, Aug 2, 2015 at 12:00 PM Justin Bogner <mail@justinbogner.com> wrote:
>>
>> Eric Christopher <echristo@gmail.com> writes:
>> > On Thu, Jul 30, 2015 at 10:12 AM Reid Kleckner <rnk@google.com> wrote:
>> >
>> >     I'm opposed to this. Going forward, I would really like target
>> > intrinsics
>> >     to be available regardless of the current feature set, so users
>> > don't need
>> >     hacks like these.
>> >
>> > Agreed.
>> >
>> >
>> >     I see two ways to do this with different tradeoffs:
>> >     1. Diagnose missing target attributes when calling the intel
>> > intrinsics. I
>> >     was surprised to find that we don't already do this.
>> >
>> > Sorry. This is on my list of things to do.
>>
>> +hans
>>
>> I agree with the direction of moving to use target attributes instead of
>> relying on flaky ifdefs, but without any errors or warnings here this is
>> a pretty serious diagnostic regression.
>>
>> I think we should revert this on the 3.7 branch. It can stay as is on
>> trunk assuming the diagnostics are coming soon.
>>
>> Right now we end up in spaces where we get crashes in the backend
>> instead of a sensible error in far too many situations. Notably:
>>
>>   https://llvm.org/bugs/show_bug.cgi?id=24125
>>   https://llvm.org/bugs/show_bug.cgi?id=24087
>>   https://llvm.org/bugs/show_bug.cgi?id=24335
>>
>> Additionally, I'm told this causes issues with configure scripts
>> misdetecting available features, as well as strange compatibility issues
>> like the one that led to this thread.
>>
>> This feature is woefully incomplete. We need the warnings/errors for it
>> to be acceptable quality.
>>
>> >
>> >     2. We could support some automatic transfer of the target attribute
>> > to the
>> >     caller when calling these intrinsics, but I worry that this is too
>> >     confusing.
>> >
>> > We could, but it's probably better to leave it as is.
>> >
>> > -eric
>> >
>> >
>> >     Implicitly setting a target attribute may block inlining that the
>> > user
>> >     expected to happen, for example. Alternatively, there may be a
>> > dynamic
>> >     cpuid check in the same function between SSE2 and AVX variants of
>> > the same
>> >     algorithm, and now the SSE2 loop will unexpectedly use AVX
>> > instructions.
>> >
>> >     So we should probably settle with telling the user to add -msseNN or
>> >     __atribute__((target(("sseNN")))).
>> >
>> >     On Thu, Jul 30, 2015 at 8:53 AM, Vedant Kumar <vsk@apple.com> wrote:
>> >
>> >         I've run into some code which no longer compiles because of two
>> > recent
>> >         changes:
>> >
>> >           41885d3 Update the intel intrinsic headers to use the target
>> >         attribute support.
>> >           695aff1 Use a define for per-file function attributes for the
>> > Intel
>> >         intrinsic headers.
>> >
>> >         Specifically, one project defines its own SSE4.1 emulation
>> > routines
>> >         when the real intrinsics aren't available. This is a problem
>> > because
>> >         they've reused the names of the intrinsics. E.g;
>> >
>> >         > #ifndef __SSE4_1__
>> >         > #define _mm_extract_epi8(a_, ndx) ({ ... })
>> >         > static inline __m128i _mm_blendv_epi8(__m128i a, __m128i b,
>> > __m128i
>> >         mask) { ... }
>> >         > ...
>> >         > #endif
>> >
>> >         SSE4.1 intrinsics now leak into the project when it's being
>> > compiled
>> >         for targets without SSE4.1 support. Compilation fails with
>> > "error:
>> >         redefinition ...".
>> >
>> >         When these changes were initially being discussed, I think our
>> > stance
>> >         was that we shouldn't support code like this [1]. However, we
>> > should
>> >         reconsider for the sake of avoiding breakage. AFAICT, we would
>> > need to
>> >         revert just two types of changes:
>> >
>> >         In lib/Headers/__wmmintrin_aes.h:
>> >
>> >         > -#if defined (__SSE4_2__) || defined (__SSE4_1__)
>> >         >  #include <smmintrin.h>
>> >         > -#endif
>> >
>> >         In lib/Headers/smmintrin.h:
>> >
>> >         > -#ifndef __SSE4_1__
>> >         > -#error "SSE4.1 instruction set not enabled"
>> >         > -#else
>> >
>> >         I don't see any downsides to reintroducing these guards. If
>> > everyone's
>> >         OK with this, I can mail a patch in. The alternative is to have
>> >         clients rewrite their emulation layers like this:
>> >
>> >         > #ifdef __SSE4_1__
>> >         > #define compat_mm_extract_epi8 _mm_extract_epi8
>> >         > static inline __m128i combat_mm_blendv_epi8(__m128i a, __m128i
>> > b,
>> >         __m128i mask) __attribute__((__target__(("sse4.1")))) {
>> >         >   return _mm_blendv_epi8(a, b, mask);
>> >         > }
>> >         > ...
>> >         > #else /* OK, no native SSE 4.1. Define our own. */
>> >         > #define compat_mm_extract_epi8(a_, ndx) ({ ... })
>> >         > static inline __m128i compat_mm_blendv_epi8(__m128i a, __m128i
>> > b,
>> >         __m128i mask) { ... }
>> >         > ...
>> >         > #endif
>> >
>> >         ... and then replace all calls to intrinsics with calls to the
>> > new
>> >         compatibility routines. This seems like a lot of tedious work,
>> > and I'd
>> >         love to help people avoid it :).
>> >
>> >         Let me know what you think!
>> >
>> >         vedant
>> >
>> >         [1] http://lists.cs.uiuc.edu/pipermail/cfe-commits/
>> >         Week-of-Mon-20150615/131192.html
_______________________________________________
cfe-dev mailing list
cfe-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev