From cfe-dev Mon Aug 03 20:37:45 2015 From: Hans Wennborg Date: Mon, 03 Aug 2015 20:37:45 +0000 To: cfe-dev Subject: Re: [cfe-dev] [PROPOSAL] Reintroduce guards for Intel intrinsic headers Message-Id: X-MARC-Message: https://marc.info/?l=cfe-dev&m=143863482414375 On Sun, Aug 2, 2015 at 5:09 PM, Chandler Carruth wrote: > Would cherrypicking the diagnostics to the 3.7 branch be better or worse? > (I'm of two minds, curious what others think...) The alternative of reverting would have the downside of missing out on some of the target attribute functionality in 3.7. I haven't been following closely enough to determine how great a loss that would be, but as far as I understand, this is still a work in progress, right? The alternative of cherry-picking diagnostics has the problem that I don't think any diagnostics have landed yet :-) My inclination is to revert back to safety here. The revert would be pretty hairy though, as there have been a number of changes to these files since r239883 landed. I'm still trying to figure this out. > On Sun, Aug 2, 2015 at 12:00 PM Justin Bogner wrote: >> >> Eric Christopher writes: >> > On Thu, Jul 30, 2015 at 10:12 AM Reid Kleckner wrote: >> > >> > I'm opposed to this. Going forward, I would really like target >> > intrinsics >> > to be available regardless of the current feature set, so users >> > don't need >> > hacks like these. >> > >> > Agreed. >> > >> > >> > I see two ways to do this with different tradeoffs: >> > 1. Diagnose missing target attributes when calling the intel >> > intrinsics. I >> > was surprised to find that we don't already do this. >> > >> > Sorry. This is on my list of things to do. >> >> +hans >> >> I agree with the direction of moving to use target attributes instead of >> relying on flaky ifdefs, but without any errors or warnings here this is >> a pretty serious diagnostic regression. >> >> I think we should revert this on the 3.7 branch. It can stay as is on >> trunk assuming the diagnostics are coming soon. >> >> Right now we end up in spaces where we get crashes in the backend >> instead of a sensible error in far too many situations. Notably: >> >> https://llvm.org/bugs/show_bug.cgi?id=24125 >> https://llvm.org/bugs/show_bug.cgi?id=24087 >> https://llvm.org/bugs/show_bug.cgi?id=24335 >> >> Additionally, I'm told this causes issues with configure scripts >> misdetecting available features, as well as strange compatibility issues >> like the one that led to this thread. >> >> This feature is woefully incomplete. We need the warnings/errors for it >> to be acceptable quality. >> >> > >> > 2. We could support some automatic transfer of the target attribute >> > to the >> > caller when calling these intrinsics, but I worry that this is too >> > confusing. >> > >> > We could, but it's probably better to leave it as is. >> > >> > -eric >> > >> > >> > Implicitly setting a target attribute may block inlining that the >> > user >> > expected to happen, for example. Alternatively, there may be a >> > dynamic >> > cpuid check in the same function between SSE2 and AVX variants of >> > the same >> > algorithm, and now the SSE2 loop will unexpectedly use AVX >> > instructions. >> > >> > So we should probably settle with telling the user to add -msseNN or >> > __atribute__((target(("sseNN")))). >> > >> > On Thu, Jul 30, 2015 at 8:53 AM, Vedant Kumar wrote: >> > >> > I've run into some code which no longer compiles because of two >> > recent >> > changes: >> > >> > 41885d3 Update the intel intrinsic headers to use the target >> > attribute support. >> > 695aff1 Use a define for per-file function attributes for the >> > Intel >> > intrinsic headers. >> > >> > Specifically, one project defines its own SSE4.1 emulation >> > routines >> > when the real intrinsics aren't available. This is a problem >> > because >> > they've reused the names of the intrinsics. E.g; >> > >> > > #ifndef __SSE4_1__ >> > > #define _mm_extract_epi8(a_, ndx) ({ ... }) >> > > static inline __m128i _mm_blendv_epi8(__m128i a, __m128i b, >> > __m128i >> > mask) { ... } >> > > ... >> > > #endif >> > >> > SSE4.1 intrinsics now leak into the project when it's being >> > compiled >> > for targets without SSE4.1 support. Compilation fails with >> > "error: >> > redefinition ...". >> > >> > When these changes were initially being discussed, I think our >> > stance >> > was that we shouldn't support code like this [1]. However, we >> > should >> > reconsider for the sake of avoiding breakage. AFAICT, we would >> > need to >> > revert just two types of changes: >> > >> > In lib/Headers/__wmmintrin_aes.h: >> > >> > > -#if defined (__SSE4_2__) || defined (__SSE4_1__) >> > > #include >> > > -#endif >> > >> > In lib/Headers/smmintrin.h: >> > >> > > -#ifndef __SSE4_1__ >> > > -#error "SSE4.1 instruction set not enabled" >> > > -#else >> > >> > I don't see any downsides to reintroducing these guards. If >> > everyone's >> > OK with this, I can mail a patch in. The alternative is to have >> > clients rewrite their emulation layers like this: >> > >> > > #ifdef __SSE4_1__ >> > > #define compat_mm_extract_epi8 _mm_extract_epi8 >> > > static inline __m128i combat_mm_blendv_epi8(__m128i a, __m128i >> > b, >> > __m128i mask) __attribute__((__target__(("sse4.1")))) { >> > > return _mm_blendv_epi8(a, b, mask); >> > > } >> > > ... >> > > #else /* OK, no native SSE 4.1. Define our own. */ >> > > #define compat_mm_extract_epi8(a_, ndx) ({ ... }) >> > > static inline __m128i compat_mm_blendv_epi8(__m128i a, __m128i >> > b, >> > __m128i mask) { ... } >> > > ... >> > > #endif >> > >> > ... and then replace all calls to intrinsics with calls to the >> > new >> > compatibility routines. This seems like a lot of tedious work, >> > and I'd >> > love to help people avoid it :). >> > >> > Let me know what you think! >> > >> > vedant >> > >> > [1] http://lists.cs.uiuc.edu/pipermail/cfe-commits/ >> > Week-of-Mon-20150615/131192.html _______________________________________________ cfe-dev mailing list cfe-dev@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev