'Re: [cfe-dev] C++11 and enhanced devirtualization'

List:       cfe-dev
Subject:    Re: [cfe-dev] C++11 and enhanced devirtualization
From:       John McCall <rjmccall () apple ! com>
Date:       2015-07-16 22:19:13
Message-ID: 26975AC3-A357-4AF0-A1A3-4206813BC500 () apple ! com

> On Jul 16, 2015, at 2:38 PM, Richard Smith <richard@metafoo.co.uk> wrote:
> On Thu, Jul 16, 2015 at 2:03 PM, John McCall <rjmccall@apple.com> wrote:
> > On Jul 16, 2015, at 11:46 AM, Richard Smith <richard@metafoo.co.uk> wrote:
> > On Thu, Jul 16, 2015 at 11:29 AM, John McCall <rjmccall@apple.com> wrote:
> > > On Jul 15, 2015, at 10:11 PM, Hal Finkel <hfinkel@anl.gov> wrote:
> > >
> > > Hi everyone,
> > > 
> > > C++11 added features that allow for certain parts of the class hierarchy to be
> > > closed, specifically the 'final' keyword and the semantics of anonymous
> > > namespaces, and I think we can take advantage of these to enhance our ability to
> > > perform devirtualization. For example, given this situation:
> > >
> > > struct Base {
> > >   virtual void foo() = 0;
> > > };
> > >
> > > void external();
> > > struct Final final : Base {
> > >   void foo() {
> > >     external();
> > >   }
> > > };
> > >
> > > void dispatch(Base *B) {
> > >   B->foo();
> > > }
> > >
> > > void opportunity(Final *F) {
> > >   dispatch(F);
> > > }
> > > 
> > > When we optimize this code, we do the expected thing and inline 'dispatch' into
> > > 'opportunity', but we don't devirtualize the call to foo(). The fact that we
> > > know what the vtable of F is at that call site is not exploited. To a lesser
> > > extent, we can do similar things for final virtual methods, and derived classes
> > > in anonymous namespaces (because Clang could determine whether or not a class
> > > (or method) there is effectively final).
> > >
> > > One possibility might be to use @llvm.assume to say something about what the
> > > vtable ptr of F might be/contain, should it be needed later, when we emit the
> > > initial IR for 'opportunity' (and then teach the optimizer to use that
> > > information), but I'm not at all sure that's the best solution. Thoughts?
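
An illustrative sketch, not part of the original message, of the three "closed hierarchy" cases mentioned above: a class in an anonymous namespace, a final virtual method, and a final class. The names Hidden, Open, Closed and the call_*/demo_* helpers are hypothetical. Whether a given compiler actually emits direct calls here is not guaranteed; the code only shows where the static type already determines the final overrider.

```cpp
#include <cassert>

struct Base {
  virtual ~Base() = default;
  virtual int foo() const = 0;
};

namespace {
// Internal linkage: no other translation unit can name Hidden, so nothing
// outside this file can derive from it. If nothing in this file derives from
// it either, the class is effectively final even without the keyword.
struct Hidden : Base {
  int foo() const override { return 1; }
};
}  // namespace

// The class stays open for derivation, but foo() itself cannot be overridden.
struct Open : Base {
  int foo() const final { return 2; }
};

// Explicitly closed class, like 'Final' in the example above.
struct Closed final : Base {
  int foo() const override { return 3; }
};

// In each helper the static type of the parameter pins down the final
// overrider of foo(), so a compiler may emit a direct call instead of a
// vtable dispatch. The observable result is the same either way.
namespace {
int call_hidden(const Hidden &h) { return h.foo(); }
}  // namespace
int call_open(const Open &o) { return o.foo(); }
int call_closed(const Closed &c) { return c.foo(); }

int demo_hidden() { Hidden h; return call_hidden(h); }
int demo_open() { Open o; return call_open(o); }
int demo_closed() { Closed c; return call_closed(c); }
```

The case from the message is harder than these: there the call goes through a Base* inside 'dispatch', and the knowledge that the object is a Final only becomes available after inlining.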
> > 
> > The problem with any sort of @llvm.assume-encoded information about memory
> > contents is that C++ does actually allow you to replace objects in memory, up to
> > and including stuff like:
> >
> > {
> >   MyClass c;
> >
> >   // Reuse the storage temporarily.  UB to access the object through 'c' now.
> >   c.~MyClass();
> >   auto c2 = new (&c) MyOtherClass();
> >
> >   // The storage has to contain a 'MyClass' when it goes out of scope.
> >   c2->~MyOtherClass();
> >   new (&c) MyClass();
> > }
> > 
> > The standard frontend devirtualization optimizations are permitted under a couple
> > of different language rules, specifically that:
> > 1. If you access an object through an l-value of a type, it has to dynamically
> >    be an object of that type (potentially a subobject).
> > 2. Object replacement as above only "forwards" existing formal references under
> >    specific conditions, e.g. the dynamic type has to be the same, 'const' members
> >    have to have the same value, etc.  Using an unforwarded reference (like the
> >    name of the local variable 'c' above) doesn't formally refer to a valid object
> >    and thus has undefined behavior.
> >
> > You can apply those rules much more broadly than the frontend does, of course;
> > but those are the language tools you get.
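
A minimal sketch, not from the thread, of the storage-reuse pattern described above, written so that it can be compiled. The types Shape, Circle and Square and the function name_while_replaced are hypothetical stand-ins for MyClass and MyOtherClass. It assumes the replacement type fits in the original object's storage, which the static_asserts check.

```cpp
#include <cassert>
#include <new>
#include <string>

struct Shape {
  virtual ~Shape() = default;
  virtual std::string name() const = 0;
};
struct Circle : Shape {
  std::string name() const override { return "Circle"; }
};
struct Square : Shape {
  std::string name() const override { return "Square"; }
};

// The replacement object must fit in the storage of the original.
static_assert(sizeof(Square) <= sizeof(Circle), "Square must fit in a Circle");
static_assert(alignof(Square) <= alignof(Circle), "Square must not be over-aligned");

std::string name_while_replaced() {
  Circle c;
  c.~Circle();
  // The dynamic type changes, so the name 'c' is not forwarded to the new
  // object. Only the pointer returned by placement new is used from here on.
  Shape *s = new (&c) Square();
  std::string seen = s->name();  // virtual dispatch reaches Square::name
  s->~Shape();                   // virtual destructor: runs ~Square
  new (&c) Circle();             // a Circle must occupy the storage at scope exit
  return seen;
}
```

Calling name_while_replaced() returns "Square". An optimizer that assumed the vtable pointer stored in c's memory could never change after construction would get this case wrong, which is the concern raised above.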
> >
> > Right. Our current plan for modelling this is:
> >
> > 1) Change the meaning of the existing !invariant.load metadata (or add another
> > parallel metadata kind) so that it allows load-load forwarding (even if the
> > memory is not known to be unmodified between the loads) if:
> 
> invariant.load currently allows the load to be reordered pretty aggressively, so I
> think you need a new metadata kind.
>
> Our thoughts were:
> 1) The existing !invariant.load is redundant because it's exactly equivalent to a
> call to @llvm.invariant.start and a load.

No, that would not be arbitrarily hoistable.

> 2) The new semantics are a stricter form of the old semantics, so no special
> action is required to upgrade old IR.
> ... so changing the meaning of the existing metadata seemed preferable to adding a
> new, similar-but-not-quite-identical, form of the metadata. But either way seems
> fine.
>
> >   a) both loads have !invariant.load metadata with the same operand, and
> >   b) the pointer operands are the same SSA value (being must-alias is not
> >      sufficient)
> > 2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that produces a new
> > pointer that is different for the purpose of !invariant.load. (Some other
> > optimizations are permitted to look through the barrier.)
> >
> > In particular, "new (&c) MyOtherClass()" would be emitted as something like this:
> >
> >   %1 = call @operator new(size, %c)
> >   %2 = call @llvm.invariant.barrier(%1)
> >   call @MyOtherClass::MyOtherClass(%2)
> >   %vptr = load %2
> >   %known.vptr = icmp eq %vptr, @MyOtherClass::vptr, !invariant.load !MyBaseClass.vptr
> >   call @llvm.assume(%known.vptr)
> 
> Hmm.  And all v-table loads have this invariant metadata?
> 
> That's the idea (but it's not essential that they do; we just lose optimization
> power if not).
>
> I am concerned about mixing files with and without barriers.
>
> I think we'd need to always generate the barrier (even at -O0, to support LTO
> between non-optimized and optimized code). I don't think we can support LTO between
> IR using the metadata and old IR that didn't contain the relevant barriers. How
> important is that use case?

Well, no current IR contains the barrier, so this would be a statement that
current C++ IR will never be correctly LTO-able with future C++ IR.  That is
generally something we try to avoid, yes.

> We were probably going to put this behind a -fstrict-something flag, at least to
> start off with, so we can create a transition period where we generate the barrier
> by default but don't generate the metadata if necessary.

If this goes into the function flags, perhaps there's a way to prevent LTO between
functions that disagree about the flag.

John.
