'Re: [llvm-dev] [RFC] Enable "#pragma omp declare simd" in the LoopVectorizer'

List:       llvm-dev
Subject:    Re: [llvm-dev] [RFC] Enable "#pragma omp declare simd" in the LoopVectorizer
From:       "Tian, Xinmin via llvm-dev" <llvm-dev () lists ! llvm ! org>
Date:       2016-11-30 17:16:12
Message-ID: E42C235343FF1744BA43DDCC4DF1F1BA896FD6E3 () ORSMSX115 ! amr ! corp ! intel ! com

Hi Francesco,

Good to know that you are working on support for this feature. I assume you are
aware of the RFC below. The Vector ABI mangling we proposed was approved by David M
from Google, the owner of C++ name mangling in the Clang FE, and the Clang FE support
was committed to trunk by Alexey.

"Proposal for function vectorization and loop vectorization with function calls",
March 2, 2016. Intel Corp.
http://lists.llvm.org/pipermail/cfe-dev/2016-March/047732.html

Matt submitted a patch to generate vector variants for function definitions, not just
function declarations. You may want to take a look. Ayal's RFC will also be needed
to support vectorization of function bodies in general.

I agree, we should have an option -fopenmp-simd to enable SIMD only; both GCC and
ICC have similar options.

I suggest we sync up on this work so that we don't duplicate effort.

Thanks,
Xinmin

-----Original Message-----
From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of Francesco Petrogalli via llvm-dev
Sent: Wednesday, November 30, 2016 7:11 AM
To: llvm-dev@lists.llvm.org
Cc: nd <nd@arm.com>
Subject: [llvm-dev] [RFC] Enable "#pragma omp declare simd" in the LoopVectorizer

Dear all,

I have just created a couple of differential reviews to enable the vectorisation of
loops that call routines marked with "#pragma omp declare simd".

They can be (re)viewed here:

* https://reviews.llvm.org/D27249

* https://reviews.llvm.org/D27250

The current implementation allows the loop vectorizer to generate vector code for a
source file such as:

  #pragma omp declare simd
  double f(double x);

  void aaa(double *x, double *y, int N) {
    for (int i = 0; i < N; ++i) {
      x[i] = f(y[i]);
    }
  }

by invoking clang with arguments:

  $> clang -fopenmp -c -O3 file.c […]

Such functionality should provide a convenient interface for vector library
developers to inform the loop vectorizer that an external library provides vector
implementations of the scalar functions called in the loops. All that is needed is
to mark the function declaration in the library's header file with
"#pragma omp declare simd" and to generate the associated symbols in the library's
object file according to the naming scheme of the vector ABI (see notes below).
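
As an illustration of the library side, here is a minimal sketch that is not taken
from the patches. The name vlib_square is hypothetical. The vector symbol assumes
the x86 vector ABI naming scheme _ZGV<isa><mask><vlen><parameters>_<scalar name>
('b' for SSE, 'N' for unmasked, 2 lanes, 'v' for a vector parameter); the AArch64
prefix differs. The vector type uses the GCC/Clang vector_size extension so that the
sketch does not depend on a particular intrinsics header.

```c
/* Header shipped with the library (hypothetical vlib.h):
 *
 *     #pragma omp declare simd
 *     double vlib_square(double x);
 */

/* Library source (hypothetical vlib.c). */
typedef double v2df __attribute__((vector_size(16))); /* 2 x double */

/* Scalar implementation, used when the loop is not vectorised. */
double vlib_square(double x) { return x * x; }

/* Hand-written vector variant. The name is an assumption based on the x86
 * vector ABI scheme: SSE ('b'), unmasked ('N'), 2 lanes, one vector
 * parameter ('v'), followed by the scalar name. */
v2df _ZGVbN2v_vlib_square(v2df x) { return x * x; }
```

A loop calling vlib_square(y[i]) and vectorised with a factor of 2 would then call
the second symbol instead of the scalar one.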

I am interested in any feedback/suggestion/review the community might have regarding
this behaviour.

Below you will find a description of the implementation and some notes.

Thanks,

Francesco 

-----------

The functionality is implemented as follows:

1. Clang CodeGen generates a set of global external variables for each of the
function declarations marked with the OpenMP pragma. Each of these globals is named
according to a mangling that is generated by llvm::TargetLibraryInfoImpl (TLII), and
holds the vector signature of the associated vector function. (See the examples in
the tests of the clang patch. Each scalar function can generate multiple vector
functions, depending on the clauses of the declare simd directives.)
2. When clang creates the TLII, it processes the llvm::Module and finds out which of
the globals of the module have the correct mangling and type, so that they can be
added to the TLII as a list of vector functions that can be associated with the
original scalar one.
3. The LoopVectorizer looks up the available vector functions through the TLII not
by scalar name and vectorisation factor but by scalar name and vector function
signature, which lets the vectorizer distinguish a
"vector vpow1(vector x, vector y)" from a "vector vpow2(vector x, scalar y)". (The
second one corresponds to a "declare simd uniform(y)" for a
"scalar pow(scalar x, scalar y)" declaration.) (Notice that the changes in the loop
vectorizer are minimal.)
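
To make step 3 concrete, here is a small model of a lookup keyed on scalar name plus
signature. It is only a sketch: the struct, the table and find_variant are
hypothetical and are not the TLII interface from the patches. Each parameter is
encoded as one character, 'v' for a vector argument and 'u' for a uniform (scalar)
one; vpow1 and vpow2 are the two variants named above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one vector variant known to the vectorizer. */
struct vec_variant {
    const char *scalar_name; /* name of the scalar function, e.g. "pow" */
    const char *param_kinds; /* one char per parameter: 'v' vector, 'u' uniform */
    const char *vector_name; /* symbol to call from the vectorised loop */
};

static const struct vec_variant variants[] = {
    { "pow", "vv", "vpow1" }, /* plain "declare simd" */
    { "pow", "vu", "vpow2" }, /* "declare simd uniform(y)" */
};

/* Return the vector symbol matching both the scalar name and the requested
 * signature, or NULL when no variant matches. */
const char *find_variant(const char *scalar_name, const char *param_kinds) {
    for (size_t i = 0; i < sizeof variants / sizeof variants[0]; ++i) {
        if (strcmp(variants[i].scalar_name, scalar_name) == 0 &&
            strcmp(variants[i].param_kinds, param_kinds) == 0)
            return variants[i].vector_name;
    }
    return NULL;
}
```

A lookup keyed only on scalar name and vectorisation factor could not tell the two
"pow" entries apart, which is why the signature is part of the key.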

Notes:

1. To enable SIMD only for OpenMP, leaving all the multithread/target behaviour
behind, we should enable this also with a new option:
-fopenmp-simd
2. The AArch64 vector ABI in the code is essentially the same as the Intel one
(apart from the prefix and the masking argument), and it is based on the clauses
associated with "declare simd" in OpenMP 4.0. For OpenMP 4.5, the parameters section
of the mangled name should be updated. This update will not change the vectorizer
behaviour, as all the vectorizer needs to detect a vectorizable function is the
original scalar name and a compatible vector function signature. Of course, any
changes/updates in the ABI will have to be reflected in the symbols of the binary
file of the library.
3. Whilst this is working only for function declarations, the same functionality can
be used when (if) clang implements the declare simd OpenMP pragma for function
definitions.
4. I have enabled this for any loop that invokes the scalar function call, not just
for those annotated with "#pragma omp for simd". I don't have any preference here,
but at the same time I don't see any reason why this shouldn't be enabled by default
for non-annotated loops. Let me know if you disagree; I'd happily change the
functionality if there are sound reasons for that.
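
The two cases in note 4 can be illustrated as follows. This is a sketch only: halve,
plain_loop and annotated_loop are hypothetical names, and "#pragma omp simd" is used
as the simplest SIMD annotation in place of the "#pragma omp for simd" mentioned in
the note. Both loops compute the same result; the question in note 4 is only whether
the first one should also be a candidate for the vector variants.

```c
/* Scalar routine advertised as having vector variants. */
#pragma omp declare simd
double halve(double x);

double halve(double x) { return 0.5 * x; }

/* Not annotated: under the behaviour described in note 4, this loop is still
 * a candidate for vectorisation through the vector variants of halve. */
void plain_loop(double *out, const double *in, int n) {
    for (int i = 0; i < n; ++i)
        out[i] = halve(in[i]);
}

/* Explicitly annotated by the user as a SIMD loop. */
void annotated_loop(double *out, const double *in, int n) {
    #pragma omp simd
    for (int i = 0; i < n; ++i)
        out[i] = halve(in[i]);
}
```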

_______________________________________________
LLVM Developers mailing list
llvm-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev