
List:       cfe-dev
Subject:    Re: [cfe-dev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang
From:       henry miller <hank () millerfarm ! com>
Date:       2013-01-14 12:27:32
Message-ID: 1cb4efa1-1c3a-4d58-bc75-79654c0b4c72 () email ! android ! com



As a user I like this: it is hard to understand what each level does. I know of
projects that use -O3 because 'more must be better'. I don't know how to explain
to them that it might be slower, and measuring performance is tricky. (Many
programs do multiple things and spend most of their time waiting on the event loop.)
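
To put a rough number on the event-loop point, here is a minimal sketch of an Amdahl-style estimate. The function names and the input values are hypothetical and only illustrate the arithmetic; they are not measurements of any real program.

```c
#include <stdio.h>

/* Amdahl-style estimate: only the compute share of wall-clock time
 * gets faster; time spent waiting on the event loop is unchanged.
 * Both parameters are hypothetical inputs, not measurements. */
double overall_speedup(double compute_fraction, double compute_speedup)
{
    double remaining = (1.0 - compute_fraction)
                     + compute_fraction / compute_speedup;
    return 1.0 / remaining;
}

/* Convenience wrapper that prints the estimate for one scenario. */
void report(double compute_fraction, double compute_speedup)
{
    printf("compute share %.0f%%, compute %.1fx faster -> overall %.3fx\n",
           compute_fraction * 100.0, compute_speedup,
           overall_speedup(compute_fraction, compute_speedup));
}
```

Calling report(0.10, 2.0) models a program that computes for 10% of its run time and waits for the rest: even if a higher level doubled the speed of the compute part, the whole program would only be about 1.05x faster, which is easy to lose in measurement noise.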

Would it be unreasonable to ask for a new, separate set of optimizations: optimize
for debug? This would apply aggressive optimizations, but without "significantly"
changing the order of the code.

I don't know the optimizer, but as a user of compilers I know that minimal
optimization is often the difference between painfully slow program execution and
okay performance. However, debugging optimized programs can be difficult because
the debugger jumps all over, making the problem hard to understand.
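
A partial workaround, sketched below, is to switch optimization off for just the function being debugged and leave the rest of the program optimized. This is not the "optimize debug" level asked about above and not part of the quoted proposal: it assumes a compiler that provides Clang's `optnone` attribute or GCC's `optimize("O0")` attribute. `KEEP_DEBUGGABLE`, `checksum`, and `checksum_twice` are hypothetical names made up for the illustration.

```c
#include <stddef.h>

/* KEEP_DEBUGGABLE is a hypothetical helper macro, not part of any
 * compiler. It expands to a per-function "do not optimize" attribute
 * where one is assumed to exist, and to nothing elsewhere. */
#if defined(__clang__)
#  if __has_attribute(optnone)
#    define KEEP_DEBUGGABLE __attribute__((optnone))
#  else
#    define KEEP_DEBUGGABLE
#  endif
#elif defined(__GNUC__)
#  define KEEP_DEBUGGABLE __attribute__((optimize("O0")))
#else
#  define KEEP_DEBUGGABLE
#endif

/* The function under investigation: compiled without optimization so
 * a debugger can step through it in source order, even when the
 * command line asks for -O2 or -O3. */
KEEP_DEBUGGABLE
long checksum(const unsigned char *data, size_t len)
{
    long sum = 0;
    for (size_t i = 0; i < len; ++i) {
        sum += data[i];   /* one addition per loop iteration */
    }
    return sum;
}

/* Everything else is still optimized at the command-line level. */
long checksum_twice(const unsigned char *data, size_t len)
{
    return 2 * checksum(data, len);
}
```

The trade-off is that the marked function runs at unoptimized speed, so this only helps when the slow part of the program is somewhere other than the code being stepped through.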

I'll leave it to experts to debate shades.


Chandler Carruth <chandlerc@gmail.com> wrote:

> This has been an idea floating around in my head for a while and after
> several discussions with others it continues to hold up so I thought I
> would mail it out. Sorry for cross posting to both lists, but this is
> an issue that would significantly impact both LLVM and Clang.
> 
> Essentially, LLVM provides canned optimization "levels" for frontends
> to re-use. This is nothing new. However, we don't have good names for
> them, we don't expose them at the IR level, and we have a hard time
> figuring out which optimizations belong in which levels. I'd like to
> try addressing that by coming up with names and a description of the
> basic intended goal of each level. I would like, if folks are happy
> with these ideas, to add these types of descriptions alongside these
> attributes to the langref. Ideas on other (better?) places to document
> this would be welcome. Certainly, Clang's documentation would need to
> be updated to reflect this.
> 
> Hopefully we can minimally debate this until the bikeshed is a
> tolerable shade. Note that I'm absolutely biased based on the behavior
> of Clang and GCC with these optimization levels, and the relevant
> history there. However, I'm adding and deviating from the purely
> historical differences to try and better model the latest developments
> in LLVM's optimizer... So here goes:
> 
> 
> 1) The easiest: 'Minimize Size' or '-Oz'
> - Attribute: minsize (we already have it, nothing to do here)
> - Goal: minimize the size of the resulting binary, at (nearly) any cost.
> 
> 
> 2) Optimize for size or '-Os'
> - Attribute: optsize (we already have it, nothing to do here)
> - Goal: Optimize the execution of the binary without unreasonably[1]
> increasing the binary size.
> This one is a bit fuzzy, but usually people don't have a hard time
> figuring out where the line is. The primary difference between minsize
> and optsize is that with minsize a pass is free to *hurt* performance
> to shrink the size.
> 
> [1] The definition of 'unreasonable' is of course subjective, but here
> is at least one strong indicator: any code size growth which is
> inherently *speculative* (that is, there isn't a known, demonstrable
> performance benefit, but rather it is "often" or "maybe" a benefit) is
> unlikely to be a good fit in optsize. The canonical example IMO is a
> vectorizer -- while it is reasonable to vectorize a loop, if the vector
> version might not be executed, and thus the scalar loop remains as
> well, then it is a poor fit for optsize.
> 
> 
> 3) Optimize quickly or '-O1'
> - Attribute: quickopt (this would be a new attribute)
> - Goal: Perform basic optimizations to improve both performance and
> simplicity of the code, but perform them *quickly*.
> This level is all about compile time, but in a holistic sense. It tries
> to perform basic optimizations to get reasonably efficient code, and
> get it very quickly.
> 
> 
> 4) Good, well-balanced optimizations, or '-O2'
> - Attribute: opt (new attribute)
> - Goal: produce a well optimized binary trading off compile time,
> space, and runtime efficiency.
> This should be an excellent default for general purpose programs. The
> idea is to do as much optimization as we can, in as reasonable a time
> frame, and with as reasonable a code size impact, as possible. This
> level should always produce binaries at least as fast as optsize, but
> they might be both bigger and faster. This level should always produce
> binaries at least as fast as quickopt, but they might be slower to
> compile.
> 
> 
> 5) Optimize to the max or '-O3'
> - Attribute: maxopt (new attribute)
> - Goal: produce the fastest binary possible.
> This level has historically been almost exclusively about trading off
> more binary size for speed than '-O2', but I would propose we change it
> to be more about trading off either binary size or compilation time to
> achieve a better performing binary. This level should always produce
> binaries at least as fast as opt, but they might be faster at the cost
> of them being larger and taking more time to compile. This would in
> some cases be a change for LLVM and is definitely a deviation from GCC
> where O3 will in many cases produce *slower* binaries due to code size
> increases that are not accompanied by corresponding performance
> increases.
> 
> 
> To go with these LLVM attributes I'd like to add support for adding
> attributes in Clang, both compatible with GCC and with the names above
> for clarity. The goal being to allow a specific function to have its
> optimization level overridden from the command line based level.
> 
> 
> A final note: I would like to remove all other variations on the '-O'
> flag. That includes the really strange '-O4' behavior. Whether the
> compilation is LTO should be an orthogonal decision to the particular
> level of optimization, and we have -flto to achieve this.
> 
> -Chandler
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> cfe-dev mailing list
> cfe-dev@cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.






