List: llvm-dev
Subject: Re: [llvm-dev] [RFC] Enable "#pragma omp declare simd" in the LoopVectorizer
From: "Tian, Xinmin via llvm-dev" <llvm-dev () lists ! llvm ! org>
Date: 2016-11-30 17:16:12
Message-ID: E42C235343FF1744BA43DDCC4DF1F1BA896FD6E3 () ORSMSX115 ! amr ! corp ! intel ! com
Hi Francesco,
Good to know that you are working on support for this feature. I assume you know
about the RFC below. The vector ABI mangling we proposed was approved by the C++
Clang FE name mangling owner, David M from Google, and the Clang FE support was
committed to its main trunk by Alexey.
"Proposal for function vectorization and loop vectorization with function calls",
March 2, 2016. Intel Corp.
http://lists.llvm.org/pipermail/cfe-dev/2016-March/047732.html.
Matt submitted a patch to generate vector variants for function definitions, not
just function declarations. You may want to take a look. Ayal's RFC will also be
needed to support vectorization of function bodies in general.
I agree, we should have an option -fopenmp-simd to enable SIMD only; both GCC and
ICC have similar options.
I would suggest we sync up on this work so that we don't duplicate effort.
Thanks,
Xinmin
-----Original Message-----
From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of Francesco Petrogalli via llvm-dev
Sent: Wednesday, November 30, 2016 7:11 AM
To: llvm-dev@lists.llvm.org
Cc: nd <nd@arm.com>
Subject: [llvm-dev] [RFC] Enable "#pragma omp declare simd" in the LoopVectorizer
Dear all,
I have just created a couple of differential reviews to enable the vectorisation of
loops that have function calls to routines marked with "#pragma omp declare simd".
They can be (re)viewed here:
* https://reviews.llvm.org/D27249
* https://reviews.llvm.org/D27250
The current implementation allows the loop vectorizer to generate vector code for a
source file such as:
#pragma omp declare simd
double f(double x);

void aaa(double *x, double *y, int N) {
  for (int i = 0; i < N; ++i) {
    x[i] = f(y[i]);
  }
}
by invoking clang with arguments:
$> clang -fopenmp -c -O3 file.c […]
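To make the intended effect concrete, here is a rough C sketch of what the vectorized loop conceptually looks like for a vectorisation factor of 2. This is only an illustration under assumptions: the body of f, the name f_vec2, and the array-based emulation of a vector call are all hypothetical and not taken from the patches; a real vector variant would come from a library and operate on SIMD registers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in scalar routine (assumption: the real f lives in a library). */
static double f(double x) { return 2.0 * x + 1.0; }

/* Hypothetical 2-lane variant of f, emulated with plain arrays.
   A real variant would take and return SIMD registers. */
static void f_vec2(const double in[2], double out[2]) {
    double lane0 = f(in[0]);
    double lane1 = f(in[1]);
    out[0] = lane0;
    out[1] = lane1;
}

/* Conceptual shape of the vectorized aaa() for VF = 2. */
void aaa_vf2(double *x, const double *y, int N) {
    assert(x != NULL && y != NULL && N >= 0);
    int i = 0;
    for (; i + 2 <= N; i += 2)   /* vector body: two elements per call */
        f_vec2(&y[i], &x[i]);
    for (; i < N; ++i)           /* scalar remainder loop */
        x[i] = f(y[i]);
}
```

The point of the sketch is only the loop structure: one call to the vector variant replaces VF scalar calls, and a scalar tail handles the leftover iterations.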
Such functionality should provide a convenient interface for vector library
developers to inform the loop vectorizer that an external library provides vector
implementations of the scalar functions called in the loops. All that is needed is
to mark the function declaration in the library's header file with "#pragma omp
declare simd" and to generate the associated symbols in the library's object file
according to the naming scheme of the vector ABI (see notes below).
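As an illustration of the kind of symbol names a library would have to export, the sketch below assembles a name in the style of the x86 vector function ABI as used by GCC: "_ZGV", an ISA letter, 'N' or 'M' for unmasked or masked, the number of lanes, one letter per parameter ('v' for vector, 'u' for uniform), then "_" and the scalar name. The helper vector_abi_name is hypothetical, and the exact names expected by the patches under review (in particular the AArch64 prefix) may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "_ZGV<isa><mask><vlen><params>_<name>" to buf.
   Returns the length written (excluding the NUL), or -1 if buf is too small. */
int vector_abi_name(char *buf, size_t size, char isa, int masked,
                    unsigned vlen, const char *params,
                    const char *scalar_name) {
    assert(buf != NULL && params != NULL && scalar_name != NULL);
    assert(vlen > 0 && strlen(scalar_name) > 0);
    int n = snprintf(buf, size, "_ZGV%c%c%u%s_%s", isa,
                     masked ? 'M' : 'N', vlen, params, scalar_name);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

For example, calling vector_abi_name(buf, sizeof buf, 'b', 0, 2, "v", "f") yields "_ZGVbN2v_f", where 'b' denotes SSE in the x86 scheme: an unmasked, 2-lane variant of f with one vector parameter.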
I am interested in any feedback, suggestions, or reviews the community might have
regarding this behaviour.
Below you will find a description of the implementation and some notes.
Thanks,
Francesco
-----------
The functionality is implemented as follows:
1. Clang CodeGen generates a set of global external variables for each function
declaration marked with the OpenMP pragma. Each such global is named according to a
mangling generated by llvm::TargetLibraryInfoImpl (TLII) and holds the vector
signature of the associated vector function. (See the examples in the tests of the
clang patch. Each scalar function can generate multiple vector functions, depending
on the clauses of the declare simd directives.)
2. When clang creates the TLII, it processes the llvm::Module and finds which
globals of the module have the correct mangling and type, so that they can be added
to the TLII as a list of vector functions that can be associated with the original
scalar one.
3. The LoopVectorizer looks up the available vector functions through the TLII not
by scalar name and vectorisation factor but by scalar name and vector function
signature, which lets the vectorizer distinguish a "vector vpow1(vector x, vector
y)" from a "vector vpow2(vector x, scalar y)". (The second one corresponds to a
"declare simd uniform(y)" for a "scalar pow(scalar x, scalar y)" declaration.)
(Notice that the changes in the loop vectorizer are minimal.)
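The lookup described in step 3 can be pictured with a small, self-contained C sketch. It is not the TLII implementation: the table, the struct layout, and the function find_vector_variant are hypothetical, and the lane count is treated here as part of the signature. It only shows why matching on the parameter kinds lets vpow1 and vpow2 coexist for the same scalar pow.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum param_kind { PARAM_VECTOR, PARAM_UNIFORM };

/* Hypothetical record of one vector variant of a scalar function. */
struct vector_variant {
    const char *scalar_name;
    const char *vector_name;
    unsigned lanes;              /* lanes of each vector parameter */
    size_t num_params;
    enum param_kind params[4];
};

/* Two variants of pow: all-vector, and "declare simd uniform(y)". */
static const struct vector_variant variants[] = {
    { "pow", "vpow1", 2, 2, { PARAM_VECTOR, PARAM_VECTOR } },
    { "pow", "vpow2", 2, 2, { PARAM_VECTOR, PARAM_UNIFORM } },
};

/* Returns the vector function whose scalar name and signature match,
   or NULL when no variant is registered for that combination. */
const char *find_vector_variant(const char *scalar_name, unsigned lanes,
                                const enum param_kind *params,
                                size_t num_params) {
    assert(scalar_name != NULL && (params != NULL || num_params == 0));
    for (size_t i = 0; i < sizeof variants / sizeof variants[0]; ++i) {
        const struct vector_variant *v = &variants[i];
        if (strcmp(v->scalar_name, scalar_name) != 0 ||
            v->lanes != lanes || v->num_params != num_params)
            continue;
        size_t k = 0;
        while (k < num_params && v->params[k] == params[k])
            ++k;
        if (k == num_params)     /* every parameter kind matched */
            return v->vector_name;
    }
    return NULL;
}
```

A lookup keyed only on the scalar name and the vectorisation factor could not tell the two entries apart; adding the per-parameter kinds to the key is what selects vpow2 when y is loop-invariant.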
Notes:
1. To enable SIMD only for OpenMP, leaving all the multithread/target behaviour
behind, we should enable this also with a new option:
-fopenmp-simd
2. The AArch64 vector ABI in the code is essentially the same as the Intel one
(apart from the prefix and the masking argument), and it is based on the clauses
associated with "declare simd" in OpenMP 4.0. For OpenMP 4.5, the parameters
section of the mangled name should be updated. This update will not change the
vectorizer behaviour, as all the vectorizer needs to detect a vectorizable function
is the original scalar name and a compatible vector function signature. Of course,
any changes/updates in the ABI will have to be reflected in the symbols of the
library's binary file.
3. While this currently works only for function declarations, the same
functionality can be used when (if) clang implements the declare simd OpenMP pragma
for function definitions.
4. I have enabled this for any loop that calls the scalar function, not just for
those annotated with "#pragma omp for simd". I don't have a strong preference here,
but I don't see any reason why this shouldn't be enabled by default for
non-annotated loops. Let me know if you disagree; I'd happily change the
functionality if there are sound reasons for doing so.
_______________________________________________
LLVM Developers mailing list
llvm-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev