
List:       cfe-dev
Subject:    Re: [cfe-dev] C++11 and enhanced devirtualization
From:       Richard Smith <richard () metafoo ! co ! uk>
Date:       2015-07-17 0:12:19
Message-ID: CAOfiQqnpQeUO3C7K3oHGWYTdKeXADOsVxaCMXiKi5_BW6=dAZw () mail ! gmail ! com


On Thu, Jul 16, 2015 at 3:19 PM, John McCall <rjmccall@apple.com> wrote:

> On Jul 16, 2015, at 2:38 PM, Richard Smith <richard@metafoo.co.uk> wrote:
> On Thu, Jul 16, 2015 at 2:03 PM, John McCall <rjmccall@apple.com> wrote:
>
>> On Jul 16, 2015, at 11:46 AM, Richard Smith <richard@metafoo.co.uk>
>> wrote:
>> On Thu, Jul 16, 2015 at 11:29 AM, John McCall <rjmccall@apple.com> wrote:
>>
>>> > On Jul 15, 2015, at 10:11 PM, Hal Finkel <hfinkel@anl.gov> wrote:
>>> >
>>> > Hi everyone,
>>> >
>>> > C++11 added features that allow for certain parts of the class
>>> hierarchy to be closed, specifically the 'final' keyword and the semantics
>>> of anonymous namespaces, and I think we can take advantage of these to enhance
>>> our ability to perform devirtualization. For example, given this situation:
>>> >
>>> > struct Base {
>>> >  virtual void foo() = 0;
>>> > };
>>> >
>>> > void external();
>>> > struct Final final : Base {
>>> >  void foo() {
>>> >    external();
>>> >  }
>>> > };
>>> >
>>> > void dispatch(Base *B) {
>>> >  B->foo();
>>> > }
>>> >
>>> > void opportunity(Final *F) {
>>> >  dispatch(F);
>>> > }
>>> >
>>> > When we optimize this code, we do the expected thing and inline
>>> 'dispatch' into 'opportunity' but we don't devirtualize the call to foo().
>>> The fact that we know what the vtable of F is at that callsite is not
>>> exploited. To a lesser extent, we can do similar things for final virtual
>>> methods, and derived classes in anonymous namespaces (because Clang could
>>> determine whether or not a class (or method) there is effectively final).
>>> >
>>> > One possibility might be to use @llvm.assume to say something about what
>>> the vtable ptr of F might be/contain should it be needed later when we emit
>>> the initial IR for 'opportunity' (and then teach the optimizer to use that
>>> information), but I'm not at all sure that's the best solution. Thoughts?
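The "effectively final" cases mentioned above (a final virtual method, and a class in an anonymous namespace) can be sketched in source form as follows. This is only an illustration under stated assumptions: Hidden, Open, call_hidden, call_open, demo_hidden and demo_final_method are made-up names, and the code shows the shape of the opportunity, not whether any particular compiler devirtualizes it.

```cpp
#include <string>

struct Base {
  virtual ~Base() = default;
  virtual std::string foo() = 0;
};

std::string dispatch(Base *b) { return b->foo(); }

namespace {
// Not marked 'final', but no other translation unit can name this class,
// so a compiler that sees no subclass here could treat it as final.
struct Hidden : Base {
  std::string foo() override { return "Hidden::foo"; }
};

// After inlining dispatch(), the callee could be resolved to Hidden::foo.
std::string call_hidden(Hidden *h) { return dispatch(h); }
}  // namespace

// The class stays open for derivation, but foo() itself cannot be overridden.
struct Open : Base {
  std::string foo() final { return "Open::foo"; }
};

// Any object reached through an Open* must run Open::foo.
std::string call_open(Open *o) { return dispatch(o); }

// Hypothetical entry points, present only to exercise the calls above.
std::string demo_hidden() {
  Hidden h;
  return call_hidden(&h);
}

std::string demo_final_method() {
  Open o;
  return call_open(&o);
}
```

In both cases the static type of the pointer pins down the callee, which is the information the original example says is lost once dispatch() only sees a Base*.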
>>>
>>> The problem with any sort of @llvm.assume-encoded information about
>>> memory contents is that C++ does actually allow you to replace objects in
>>> memory, up to and including stuff like:
>>>
>>> {
>>>   MyClass c;
>>>
>>>   // Reuse the storage temporarily.  UB to access the object through 'c' now.
>>>   c.~MyClass();
>>>   auto c2 = new (&c) MyOtherClass();
>>>
>>>   // The storage has to contain a 'MyClass' when it goes out of scope.
>>>   c2->~MyOtherClass();
>>>   new (&c) MyClass();
>>> }
>>>
>>> The standard frontend devirtualization optimizations are permitted under
>>> a couple of different language rules, specifically that:
>>> 1. If you access an object through an l-value of a type, it has to
>>> dynamically be an object of that type (potentially a subobject).
>>> 2. Object replacement as above only "forwards" existing formal
>>> references under specific conditions, e.g. the dynamic type has to be the
>>> same, 'const' members have to have the same value, etc.  Using an
>>> unforwarded reference (like the name of the local variable 'c' above)
>>> doesn't formally refer to a valid object and thus has undefined behavior.
>>>
>>> You can apply those rules much more broadly than the frontend does, of
>>> course; but those are the language tools you get.
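A compilable version of the storage-reuse example above might look like the sketch below. The name() members, the reuse_storage function and the static_asserts are assumptions added for illustration: the original snippet does not define either class. The asserts only make sure the second type fits in the first one's storage.

```cpp
#include <new>
#include <string>

struct MyClass {
  virtual ~MyClass() = default;
  virtual const char *name() const { return "MyClass"; }
};

struct MyOtherClass {
  virtual ~MyOtherClass() = default;
  virtual const char *name() const { return "MyOtherClass"; }
};

// Assumption for this sketch: the replacement object must fit in c's storage.
static_assert(sizeof(MyOtherClass) <= sizeof(MyClass), "storage too small");
static_assert(alignof(MyOtherClass) <= alignof(MyClass), "storage misaligned");

std::string reuse_storage() {
  MyClass c;
  const char *first = c.name();

  // Reuse the storage temporarily. The name 'c' must not be used to access
  // the object while a MyOtherClass lives there; only the pointer returned
  // by placement new refers to it.
  c.~MyClass();
  MyOtherClass *c2 = new (&c) MyOtherClass();
  const char *second = c2->name();

  // Put a MyClass back before 'c' goes out of scope and is destroyed.
  c2->~MyOtherClass();
  MyClass *c3 = new (&c) MyClass();
  const char *third = c3->name();

  return std::string(first) + "," + second + "," + third;
}
```

The same bytes hold two different vtable pointers over the lifetime of 'c', which is why a fact about the vptr attached to the storage itself cannot simply be assumed to hold forever.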
>>
>>
>> Right. Our current plan for modelling this is:
>>
>> 1) Change the meaning of the existing !invariant.load metadata (or add
>> another parallel metadata kind) so that it allows load-load forwarding
>> (even if the memory is not known to be unmodified between the loads) if:
>>
>>
>> invariant.load currently allows the load to be reordered pretty
>> aggressively, so I think you need a new metadata.
>>
>
> Our thoughts were:
> 1) The existing !invariant.load is redundant because it's exactly
> equivalent to a call to @llvm.invariant.start and a load.
>
>
> No, that would not be arbitrarily hoistable.
>
> 2) The new semantics are a more strict form of the old semantics, so no
> special action is required to upgrade old IR.
> ... so changing the meaning of the existing metadata seemed preferable to
> adding a new, similar-but-not-quite-identical, form of the metadata. But
> either way seems fine.
>
>>   a) both loads have !invariant.load metadata with the same operand, and
>>   b) the pointer operands are the same SSA value (being must-alias is not
>> sufficient)
>> 2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that produces a
>> new pointer that is different for the purpose of !invariant.load. (Some
>> other optimizations are permitted to look through the barrier.)
>>
>>
>> In particular, "new (&c) MyOtherClass()" would be emitted as something
>> like this:
>>
>>   %1 = call @operator new(size, %c)
>>   %2 = call @llvm.invariant.barrier(%1)
>>   call @MyOtherClass::MyOtherClass(%2)
>>   %vptr = load %2, !invariant.load !MyBaseClass.vptr
>>   %known.vptr = icmp eq %vptr, @MyOtherClass::vptr
>>   call @llvm.assume(%known.vptr)
>>
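The forwarding rule in the plan above (same metadata operand, same SSA value, with the barrier producing a fresh value) can be mimicked by a toy model. Everything here is hypothetical and is not LLVM code or an LLVM API: SsaValue, InvariantLoadModel and their members are invented for this sketch, and strings stand in for vtable pointers.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy stand-in for an SSA pointer value: distinct ids may share an address.
struct SsaValue {
  int id;
  int address;
};

class InvariantLoadModel {
 public:
  SsaValue fresh(int address) { return {next_id_++, address}; }

  // Models the proposed barrier: same address, new SSA identity.
  SsaValue barrier(SsaValue p) { return fresh(p.address); }

  void store(int address, std::string value) {
    memory_[address] = std::move(value);
  }

  // Models a load carrying the metadata with operand 'tag'. A second load
  // through the same SSA value and tag is forwarded without re-reading memory.
  std::string invariant_load(SsaValue p, const std::string &tag) {
    auto key = std::make_pair(p.id, tag);
    auto it = forwarded_.find(key);
    if (it != forwarded_.end()) return it->second;
    std::string value = memory_[p.address];
    forwarded_[key] = value;
    return value;
  }

 private:
  int next_id_ = 0;
  std::map<int, std::string> memory_;
  std::map<std::pair<int, std::string>, std::string> forwarded_;
};
```

In this model, a load through the old value after the storage has been overwritten still returns the old contents, which corresponds to the source program using an unforwarded reference. A load through the value returned by barrier() sees the new contents.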
>>
>> Hmm.  And all v-table loads have this invariant metadata?
>>
>
> That's the idea (but it's not essential that they do, we just lose
> optimization power if not).
>
>
>> I am concerned about mixing files with and without barriers.
>>
>
> I think we'd need to always generate the barrier (even at -O0, to support
> LTO between non-optimized and optimized code). I don't think we can support
> LTO between IR using the metadata and old IR that didn't contain the
> relevant barriers. How important is that use case?
>
>
> Well, no current IR contains the barrier, so this would be a
> statement that current C++ IR will never be correctly LTO-able with future
> C++ IR.  That is generally something we try to avoid, yes.
>
> We were probably going to put this behind a -fstrict-something flag, at
> least to start off with, so we can create a transition period where we
> generate the barrier by default but don't generate the metadata if
> necessary.
>
>
> If this goes into the function flags, perhaps there's a way to prevent LTO
> between functions that disagree about the flag.
>

OK, that seems doable.





