'Re: [cfe-dev] [LLVMdev] Clang devirtualization proposal'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       cfe-dev
Subject:    Re: [cfe-dev] [LLVMdev] Clang devirtualization proposal
From:       Reid Kleckner <rnk () google ! com>
Date:       2015-08-01 1:18:16
Message-ID: CACs=ty+Uhra9S_FTUoBygFVk7FEJ2Da9DXsaM_1DOBGLz7ofiQ () mail ! gmail ! com
[Download RAW message or body]

[Attachment #2 (multipart/alternative)]

On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <listmail@philipreames.com>
wrote:
>
> I'm wondering if there's a problematic interaction with CSE here.
> Consider this example is pseudo LLVM IR:
> v1 = load i64, %p, !invariant.group !Type1
> ; I called destructor/placement new for the same type, but that optimized
> entirely away
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> store i64 0, %p2, !invariant.group !Type1
> v2 = load i64, %p2, !invariant.group !Type1
> ret i64 v1 - v2
>
> (Assume that !Type is used to describe a write once integer field within
> some class.  Not all instances have the same integer value.)
>
> Having CSE turn this into:
> v1 = load i64, %p, !invariant.group !Type1
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> store i64 0, %p1, !invariant.group !Type1
> v2 = load i64, %p1, !invariant.group !Type1
> ret i64 v1 - v2
>
> And then GVN turn this into:
> v1 = load i64, %p, !invariant.group !Type1
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> ret i64 v1 - v1 (-> 0)
>
> This doesn't seem like the result I'd expect.  Is there something about my
> initial IR which is wrong/invalid in some way?  Is the invariant.group
> required to be specific to a single bitpattern across all usages within a
> function/module/context?  That would be reasonable, but I don't think is
> explicit said right now.  It also makes !invariant.group effectively
> useless for describing constant fields which are constant per instance
> rather than per-class.
>

Yes, this family of examples scares me. :) It seems we've discovered a new
device testing IR soundness. We used it to build a test case that shows
that 'readonly' on arguments without 'nocapture' doesn't let you forward
stores across such a call.

Consider this pseudo-IR and some possible transforms that I would expect to
be semantics preserving:

void f(i32* readonly %a, i32* %b) {
  llvm.assume(%a == %b)
  store i32 42, i32* %b
}
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p, i32* %p)
  %r = load i32, i32* %p

; Propagate llvm.assume info
void f(i32* readonly %a, i32* %b) {
  store i32 42, i32* %a
}
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p, i32* %p)
  %r = load i32, i32* %p

; Delete dead args
void f(i32* readonly %a) {
  store i32 42
}
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p)
  %r = load i32, i32* %p

; Forward store %p to load %p, since the only use of %p is readonly
void f(i32* readonly %a) {
  store i32 42
}
  ...
  %p = alloca i32
  call f(i32* readonly %p)
  %r = i32 13

Today LLVM will not do the final transform because it requires readonly on
the entire function, or nocapture on the argument. nocapture cannot be
inferred due to the assume comparison.

[Attachment #5 (text/html)]

<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Jul 31, 2015 \
at 3:53 PM, Philip Reames <span dir="ltr">&lt;<a \
href="mailto:listmail@philipreames.com" \
target="_blank">listmail@philipreames.com</a>&gt;</span> wrote:<blockquote \
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc \
solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000">  I&#39;m wondering if \
there&#39;s a problematic interaction with CSE here.    Consider this example is \
pseudo LLVM IR:<br>  v1 = load i64, %p, !invariant.group !Type1<br>
    ; I called destructor/placement new for the same type, but that
    optimized entirely away<br>
    p2 = invariant.group.barrier(p1)<br>
    if (p1 != p2) return.<br>
    store i64 0, %p2, !invariant.group !Type1<br>
    v2 = load i64, %p2, !invariant.group !Type1<br>
    ret i64 v1 - v2<br>
    <br>
    (Assume that !Type is used to describe a write once integer field
    within some class.   Not all instances have the same integer value.)<br>
    <br>
    Having CSE turn this into:<br>
    v1 = load i64, %p, !invariant.group !Type1<br>
    p2 = invariant.group.barrier(p1)<br>
    if (p1 != p2) return.<br>
    store i64 0, %p1, !invariant.group !Type1<br>
    v2 = load i64, %p1, !invariant.group !Type1<br>
    ret i64 v1 - v2<br>
    <br>
    And then GVN turn this into:<br>
    v1 = load i64, %p, !invariant.group !Type1<br>
    p2 = invariant.group.barrier(p1)<br>
    if (p1 != p2) return.<br>
    ret i64 v1 - v1 (-&gt; 0)<br>
    <br>
    This doesn&#39;t seem like the result I&#39;d expect.   Is there something
    about my initial IR which is wrong/invalid in some way?   Is the
    invariant.group required to be specific to a single bitpattern
    across all usages within a function/module/context?   That would be
    reasonable, but I don&#39;t think is explicit said right now.   It also
    makes !invariant.group effectively useless for describing constant
    fields which are constant per instance rather than per-class.   \
<br></div></blockquote><div><br></div><div>Yes, this family of examples scares me. :) \
It seems we&#39;ve discovered a new device testing IR soundness. We used it to build \
a test case that shows that &#39;readonly&#39; on arguments without \
&#39;nocapture&#39; doesn&#39;t let you forward stores across such a \
call.</div><div><br></div><div>Consider this pseudo-IR and some possible transforms \
that I would expect to be semantics preserving:</div><div><br></div><div>void f(i32* \
readonly %a, i32* %b) {</div><div>   llvm.assume(%a == %b)</div><div>   store i32 42, \
i32* %b<br>}</div><div>   ...</div>   %p = alloca i32<div>   store i32 13, i32* \
%p</div><div>   call f(i32* readonly %p, i32* %p)</div><div>   %r = load i32, i32* \
%p</div><div><br></div><div><div>; Propagate llvm.assume info</div><div>void f(i32* \
readonly %a, i32* %b) {</div><div><div>   store i32 42, i32* \
%a<br></div></div><div>}<br></div></div><div><div>   ...</div>   %p = alloca i32<div> \
store i32 13, i32* %p</div><div>   call f(i32* readonly %p, i32* %p)</div><div>   %r \
= load i32, i32* %p</div></div><div><br></div><div>; Delete dead args</div><div>void \
f(i32* readonly %a) {</div><div>   store i32 42</div><div>}</div><div><div>   \
...</div><div>   %p = alloca i32</div><div>   store i32 13, i32* %p</div><div>   call \
f(i32* readonly %p)</div><div>   %r = load i32, i32* \
%p</div></div><div><br></div><div>; Forward store %p to load %p, since the only use \
of %p is readonly</div><div><div>void f(i32* readonly %a) {</div><div>   store i32 \
42</div><div>}</div><div><div>   ...</div>   %p = alloca i32<div>   call f(i32* \
readonly %p)</div><div>   %r = i32 13</div></div></div><div><br></div><div>Today LLVM \
will not do the final transform because it requires readonly on the entire function, \
or nocapture on the argument. nocapture cannot be inferred due to the assume \
comparison.</div></div></div></div>

_______________________________________________
cfe-dev mailing list
cfe-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

[prev in list] [next in list] [prev in thread] [next in thread]