'Re: [llvm-dev] CTMark - regular LLVM and CLANG compile-time tracking'


List:       llvm-dev
Subject:    Re: [llvm-dev] CTMark - regular LLVM and CLANG compile-time tracking
From:       Mehdi Amini via llvm-dev <llvm-dev () lists ! llvm ! org>
Date:       2016-11-28 21:18:53
Message-ID: 7782D0F2-1ECD-4647-B6EB-B9909943141F () apple ! com


> On Nov 17, 2016, at 3:15 PM, Matthias Braun <mbraun@apple.com> wrote:
> 
> > 
> > On Nov 17, 2016, at 3:10 PM, Mehdi Amini <mehdi.amini@apple.com> wrote:
> > > 
> > > On Nov 17, 2016, at 3:00 PM, Matthias Braun <mbraun@apple.com> wrote:
> > > 
> > > > On Nov 17, 2016, at 2:55 PM, Mehdi Amini <mehdi.amini@apple.com> wrote:
> > > > Hi Gerolf,
> > > > 
> > > > This is really cool!
> > > > I'm very excited about this initiative and I hope we'll be able to get to a \
> > > > stage where compile-time regressions are handled like other regressions: if \
> > > > they are not expected / justified by the commit author promptly, the commit \
> > > > should be reverted in the meantime! 
> > > > I'd like to suggest adding to CTMark the "empty" compile test (and maybe \
> > > > "empty + one empty function"), unless it is too noisy to measure. It is an \
> > > > interesting test to complement the existing ones because it measures the \
> > > > general overhead of setting up all the "infrastructure" (static initializers, \
> > > > creating a pass pipeline, etc.)
> > > That would indeed be a very interesting test, however this will be way too \
> > > short to measure predictably on its own. 
> > 
> > Are you afraid of the measurement noise for this?
> > 
> > > I could see it working if we had a flag that artificially runs the compilation \
> > > pipeline hundreds of times or alternatively puts the whole compiler into \
> > > something like googlebenchmark.
> > 
> > Since I'm interested in the startup time in general, it'd have to be a loop in a \
> > shell script that invokes clang ~1000 times (or better: with a statistical \
> > measurement of "confidence" that stops the loop when it reaches a threshold or \
> > stops making progress), and returns something like the geometric mean.
> Can someone report actual experiences with this? I would expect extra noise if we \
> get the operating system involved as well by creating thousands of processes. 
> > 
> > I think I remember Michael G. doing something like that for Swift performance \
> > testing?
> Well, to come back to the end-to-end vs. unit-test comparison: if startup time gets \
> out of hand, it will show up in CTMark as well. If it does not get out of hand and \
> requires special measurement techniques because it is a lot less than 0.5s, then \
> end users probably won't care either. 

(Repeating here what we discussed offline): a 2x increase for the empty test may not \
show up on a test where, for instance, SelectionDAG takes most of the time. It may \
still matter on some codebases and with -O0, or for LTO, for instance, where clang \
does not go through CodeGen.
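
For concreteness, a rough sketch of the measurement loop discussed above might look like the following. It is only an illustration, not an existing CTMark or LNT facility: the function names, the 1% relative confidence target, and the normal-approximation 95% interval are all assumptions made for the example. It launches one process per sample, so it includes the OS process-creation cost that Matthias mentions as a possible noise source. It also stops only on the confidence threshold or the run cap, and does not try to detect a loop that "stops making progress".

```python
import math
import statistics
import subprocess
import time


def geometric_mean(samples):
    """Geometric mean of positive samples, computed through logarithms."""
    return math.exp(sum(math.log(s) for s in samples) / len(samples))


def relative_ci_half_width(samples, z=1.96):
    """Approximate 95% confidence half-width of the mean, relative to the mean.

    Assumption: a normal approximation is good enough for this illustration.
    """
    if len(samples) < 2:
        return float("inf")
    stderr = statistics.stdev(samples) / math.sqrt(len(samples))
    return z * stderr / statistics.mean(samples)


def time_once(cmd):
    """Wall-clock seconds for one process launch (includes OS process creation)."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def measure(cmd, min_runs=10, max_runs=1000, target=0.01, timer=time_once):
    """Sample until the relative half-width is <= target or max_runs is reached.

    Returns (geometric mean of the samples in seconds, number of runs).
    The timer parameter is a hypothetical hook so the loop can be tested
    without spawning processes.
    """
    samples = []
    while len(samples) < max_runs:
        samples.append(timer(cmd))
        if len(samples) >= min_runs and relative_ci_half_width(samples) <= target:
            break
    return geometric_mean(samples), len(samples)
```

On a Unix-like machine with clang on the PATH, something like measure(["clang", "-x", "c", "-c", "/dev/null", "-o", "/dev/null"]) would time the "empty" compile. That command line is an assumed stand-in for the empty test, not something the thread specifies.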

-- 
Mehdi



_______________________________________________
LLVM Developers mailing list
llvm-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
