List:       gcc-bugs
Subject:    [Bug tree-optimization/82374] New: #pragma GCC optimize is not applied to openmp-generated functions
From:       "mikulas at artax dot karlin.mff.cuni.cz" <gcc-bugzilla () gcc ! gnu ! org>
Date:       2017-09-30 13:59:28
Message-ID: bug-82374-4 () http ! gcc ! gnu ! org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82374

            Bug ID: 82374
           Summary: #pragma GCC optimize is not applied to
                    openmp-generated functions
           Product: gcc
           Version: 7.2.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, openmp
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mikulas at artax dot karlin.mff.cuni.cz
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-pc-linux-gnu
             Build: x86_64-pc-linux-gnu

#pragma GCC optimize (and the optimize function attribute) is not applied to
the functions that GCC generates for OpenMP constructs.

The example code at the end of this report demonstrates the problem. Compile
it with -fopenmp (with or without -O2): the function "f_nonomp" is properly
vectorized (it uses the "addps" instruction), but the function "f" is not
(the generated function f._omp_fn.0 adds single elements with the "addss"
instruction and no vectorization is done).

If we use __attribute((optimize("-O2","-ftree-vectorize"))) on the function
"f" instead of the pragma, the result is the same: the optimization is not
applied to the OpenMP-generated function "f._omp_fn.0".
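
For reference, the attribute variant can be sketched as follows. This is only
an illustration of the case described above, not the exact file that was
tested: the names f_attr and check_f_attr and the small array size N are
assumptions made so the sketch links and runs on any machine, and the
self-check only confirms that the loop computes correct sums. Whether the
generated f_attr._omp_fn.0 uses "addps" or "addss" still has to be checked in
the assembly output (for example with gcc -fopenmp -O2 -S).

```c
#include <assert.h>

#define N 4096  /* hypothetical small size; the report uses 1024 * 1024 * 1024 */

static float a[N];
static float b[N];
static float c[N];

/* Per-function form of the optimization request. According to the report,
   GCC 7.2.0 applies it to f_attr itself but not to the OpenMP-generated
   function f_attr._omp_fn.0. */
__attribute__((optimize("-O2", "-ftree-vectorize")))
void f_attr(void)
{
        int i;
#pragma omp parallel for
        for (i = 0; i < N; i++)
                c[i] = a[i] + b[i];
}

/* Fill the inputs with small integers (exactly representable as float),
   run the loop and verify every sum. This checks correctness only, not
   vectorization. */
void check_f_attr(void)
{
        int i;
        for (i = 0; i < N; i++) {
                a[i] = (float)i;
                b[i] = (float)(2 * i);
        }
        f_attr();
        for (i = 0; i < N; i++)
                assert(c[i] == (float)(3 * i));
}
```

Without -fopenmp the omp pragma is ignored and the loop runs serially, so the
sketch also builds as plain C.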

If we use "-O3" or "-ftree-vectorize" on the command-line, then both functions
are properly vectorized.


#pragma GCC optimize("-O2", "-ftree-vectorize")

#define SIZE    (1024 * 1024 * 1024)

float a[SIZE];
float b[SIZE];
float c[SIZE];

void f(void)
{
        int i;
#pragma omp parallel for
        for (i = 0; i < SIZE; i++)
                c[i] = a[i] + b[i];
}

void f_nonomp(void)
{
        int i;
        for (i = 0; i < SIZE; i++)
                c[i] = a[i] + b[i];
}