List: cfe-dev
Subject: Re: [cfe-dev] [LLVMdev] Clang devirtualization proposal
From: Reid Kleckner <rnk () google ! com>
Date: 2015-08-01 1:18:16
Message-ID: CACs=ty+Uhra9S_FTUoBygFVk7FEJ2Da9DXsaM_1DOBGLz7ofiQ () mail ! gmail ! com
On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <listmail@philipreames.com>
wrote:
>
> I'm wondering if there's a problematic interaction with CSE here.
> Consider this example in pseudo LLVM IR:
> v1 = load i64, %p, !invariant.group !Type1
> ; I called destructor/placement new for the same type, but that was optimized
> entirely away
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> store i64 0, %p2, !invariant.group !Type1
> v2 = load i64, %p2, !invariant.group !Type1
> ret i64 v1 - v2
>
> (Assume that !Type1 is used to describe a write-once integer field within
> some class. Not all instances have the same integer value.)
>
> Having CSE turn this into:
> v1 = load i64, %p, !invariant.group !Type1
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> store i64 0, %p1, !invariant.group !Type1
> v2 = load i64, %p1, !invariant.group !Type1
> ret i64 v1 - v2
>
> And then GVN turn this into:
> v1 = load i64, %p, !invariant.group !Type1
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> ret i64 v1 - v1 (-> 0)
>
> This doesn't seem like the result I'd expect. Is there something about my
> initial IR which is wrong/invalid in some way? Is the invariant.group
> required to be specific to a single bitpattern across all usages within a
> function/module/context? That would be reasonable, but I don't think it is
> explicitly stated right now. It also makes !invariant.group effectively
> useless for describing constant fields which are constant per instance
> rather than per-class.
>
Yes, this family of examples scares me. :) It seems we've discovered a new
device for testing IR soundness. We used it to build a test case that shows
that 'readonly' on arguments without 'nocapture' doesn't let you forward
stores across such a call.

Consider this pseudo-IR and some possible transforms that I would expect to
be semantics preserving:

void f(i32* readonly %a, i32* %b) {
  llvm.assume(%a == %b)
  store i32 42, i32* %b
}
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p, i32* %p)
  %r = load i32, i32* %p

; Propagate llvm.assume info
void f(i32* readonly %a, i32* %b) {
  store i32 42, i32* %a
}
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p, i32* %p)
  %r = load i32, i32* %p

; Delete dead args
void f(i32* readonly %a) {
  store i32 42, i32* %a
}
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p)
  %r = load i32, i32* %p

; Forward store %p to load %p, since the only use of %p is readonly
void f(i32* readonly %a) {
  store i32 42, i32* %a
}
  ...
  %p = alloca i32
  call f(i32* readonly %p)
  %r = i32 13

Today LLVM will not do the final transform, because forwarding the store
requires either readonly on the entire function or nocapture on the argument.
nocapture cannot be inferred here because of the comparison inside the assume.
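
To make the hazard concrete, here is a rough toy model of the first and last
versions of the pseudo-IR, written in Python rather than real IR. It is only a
sketch under loud assumptions: memory is a one-element list, a "pointer" is an
index into it, and every function name is hypothetical. It does not run any
LLVM pass; it just hand-encodes the program before and after the chain of
transforms so the two results can be compared.

```python
# Toy model only: memory is a list of ints, a "pointer" is an index into it.
# All names are hypothetical and merely mirror the pseudo-IR in this message.


def f_original(mem, a, b):
    """f(i32* readonly %a, i32* %b) as first written."""
    assert a == b  # stands in for llvm.assume(%a == %b)
    mem[b] = 42    # store i32 42, i32* %b


def f_after_dead_arg_elim(mem, a):
    """f after the assume is propagated and %b is deleted."""
    mem[a] = 42    # store i32 42, i32* %a (a write through a 'readonly' arg)


def caller_original():
    mem = [0]
    p = 0                  # %p = alloca i32
    mem[p] = 13            # store i32 13, i32* %p
    f_original(mem, p, p)  # call f(i32* readonly %p, i32* %p)
    return mem[p]          # %r = load i32, i32* %p


def caller_after_forwarding():
    mem = [0]
    p = 0                  # %p = alloca i32
    f_after_dead_arg_elim(mem, p)  # call f(i32* readonly %p)
    return 13              # %r = i32 13 (store forwarded across the call)
```

In this model caller_original() sees the 42 written by f, while
caller_after_forwarding() returns the stale 13, so the chain of transforms as
a whole is not semantics preserving even though each step looks plausible in
isolation.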