List: pcc-list
Subject: Re: [Pcc] libatomic-ops, pcc and memory barriers?
From: scj () yaccman ! com
Date: 2015-01-28 19:16:50
Message-ID: af846c8a81bc9a6c4d38ea6c3bf3b318.squirrel () webmail ! yaccman ! com
> Thanks for the details Anders!
>
I confess that I don't know much about how memory barriers work with
current C/C++ compilers and dialects. But I wanted to point out an
excellent optimization that you might want to be aware of when
implementing such a feature...

Suppose you are computing a recurrence relation, something like:

    double a[N];
    ...
    <code that sets a[0] and a[1]>
    ...
    for( int i=1; i<(N-2); ++i ) {
        a[i+1] = <some computation involving a[i] and a[i-1]>;
    }

Assuming that a[i] and a[i-1] get loaded into registers, and a[i+1] is
computed into a register before being stored, then when you loop around to
do the next iteration, you can simply move the a[i] register into the
a[i-1] register and the a[i+1] register into the a[i] register, and you
are ready to go. This optimization can easily double or triple the speed
of execution for such a loop, since the only memory operations after you
get started are stores.
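To make the register rotation concrete, here is a minimal sketch in C of what the optimized loop amounts to when written out by hand. The names recur_naive, recur_rotated and step are hypothetical, and step is only a stand-in for the unspecified "<some computation>"; a real compiler would do this on registers internally rather than on named locals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical recurrence body standing in for "<some computation>". */
static double step(double cur, double prev)
{
    return 0.5 * cur + 0.25 * prev;
}

/* Straightforward form: a[i] and a[i-1] are loaded from memory each time. */
void recur_naive(double *a, size_t n)
{
    assert(n >= 2);
    for (size_t i = 1; i < n - 2; ++i)
        a[i + 1] = step(a[i], a[i - 1]);
}

/* Hand-rotated form: after the two initial loads, the loop only stores. */
void recur_rotated(double *a, size_t n)
{
    assert(n >= 2);
    double prev = a[0];            /* plays the role of the a[i-1] register */
    double cur  = a[1];            /* plays the role of the a[i] register   */
    for (size_t i = 1; i < n - 2; ++i) {
        double next = step(cur, prev);
        a[i + 1] = next;           /* the only memory operation in the loop */
        prev = cur;                /* a[i]   becomes a[i-1] for next pass   */
        cur  = next;               /* a[i+1] becomes a[i]   for next pass   */
    }
}
```

Both functions should fill the array identically as long as nothing else touches a[] while the loop runs, which is exactly the assumption the optimization depends on.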

However, if someone wants to effect a memory barrier that "caused values
loaded into registers to be reloaded", the programmer might well expect
that changing the memory value of a[i] or a[i-1] before entering an
iteration that uses it would affect the computation. But it would be a
very savvy compiler that could properly unravel what to do with the
optimized loop to achieve this behavior...
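The mismatch can be illustrated with a small single-threaded simulation in C. This is only a sketch under stated assumptions: poke is a hypothetical hook that stands in for "some other agent changed a[i] before this iteration", and it is not a real memory barrier. The helper names and the simple step function are likewise invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical recurrence body standing in for "<some computation>". */
static double step(double cur, double prev)
{
    return cur + prev;
}

/* Hypothetical hook: simulates an outside write to a[i] just before
 * iteration `at` uses it. */
static void poke(double *a, size_t i, size_t at)
{
    if (i == at)
        a[i] = 100.0;
}

/* Loads from memory every iteration, so the outside write is seen. */
double naive_with_poke(double *a, size_t n, size_t at)
{
    assert(n >= 3);
    for (size_t i = 1; i < n - 2; ++i) {
        poke(a, i, at);
        a[i + 1] = step(a[i], a[i - 1]);
    }
    return a[n - 2];
}

/* Keeps values in locals ("registers"), so the outside write is missed. */
double rotated_with_poke(double *a, size_t n, size_t at)
{
    assert(n >= 3);
    double prev = a[0];
    double cur  = a[1];
    for (size_t i = 1; i < n - 2; ++i) {
        poke(a, i, at);                /* memory changes, cur does not */
        double next = step(cur, prev);
        a[i + 1] = next;
        prev = cur;
        cur  = next;
    }
    return a[n - 2];
}
```

With a starting array of {1, 1, 0, 0, 0, 0} and the write landing at i = 3, the naive loop picks up the new a[3] while the rotated loop keeps using its stale copy. To honor a barrier at that point, the rotated loop would have to reload prev and cur from a[i-1] and a[i], which gives back the loads the optimization removed.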

There is a paper I wrote with Randy Allen (in PLDI '88) that discusses
some of the challenges of compiling C on a vector/parallel machine...
_______________________________________________
Pcc mailing list
Pcc@lists.ludd.ltu.se
https://lists.ludd.ltu.se/cgi-bin/mailman/listinfo/pcc