List: gcc-bugs
Subject: [Bug tree-optimization/46590] long compile time with -O2 and many loops
From: "tkoenig at gcc dot gnu.org" <gcc-bugzilla () gcc ! gnu ! org>
Date: 2019-03-31 19:47:12
Message-ID: bug-46590-4-YOgtokvRgn () http ! gcc ! gnu ! org/bugzilla/
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46590
--- Comment #48 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
The test case from comment #5 and comment #6 has regressed for GCC 7/8/9:
$ time gfortran-4.8 -O1 gener-4.f90
real 0m11.509s
user 0m11.356s
sys 0m0.148s
$ time gfortran-7 -O1 gener-4.f90
real 0m23.630s
user 0m23.475s
sys 0m0.142s
$ time gfortran-8 -O1 gener-4.f90
real 0m23.702s
user 0m23.356s
sys 0m0.335s
$ time gfortran -O1 gener-4.f90
real 0m24.708s
user 0m24.577s
sys 0m0.107s
(where gfortran is a recent trunk, built without checking).
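
To put a number on the regression, the wall-clock times can be compared
against the 4.8 run. This is only a sketch: the dictionary keys are
labels chosen here for illustration, and the values are the "real"
times copied from the runs above.

```python
# Wall-clock ("real") times in seconds, copied from the runs above.
# The key names are illustrative labels, not anything gfortran prints.
real_seconds = {
    "gfortran-4.8": 11.509,
    "gfortran-7": 23.630,
    "gfortran-8": 23.702,
    "gfortran (trunk)": 24.708,
}

# Use the 4.8 run as the baseline and express every run as a multiple of it.
baseline = real_seconds["gfortran-4.8"]
slowdown = {name: t / baseline for name, t in real_seconds.items()}

for name, factor in slowdown.items():
    print(f"{name:18s} {real_seconds[name]:7.3f}s  {factor:.2f}x")
```

All three newer compilers come out at slightly more than twice the 4.8
compile time.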
About half the time is spent in df live&initialized regs (12.11 s user,
50%), with another big chunk in tree copy headers (3.15 s, 13%):
$ gfortran -O1 -ftime-report gener-4.f90
Time variable                        usr           sys          wall              GGC
phase setup                      :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)      182 kB (  0%)
phase parsing                    :   0.30 (  1%)   0.02 (  8%)   0.32 (  1%)    18037 kB ( 11%)
phase opt and generate           :  23.81 ( 99%)   0.24 ( 92%)  24.06 ( 99%)   143289 kB ( 89%)
callgraph construction           :   0.04 (  0%)   0.00 (  0%)   0.03 (  0%)     4980 kB (  3%)
ipa function summary             :   0.05 (  0%)   0.00 (  0%)   0.04 (  0%)     1414 kB (  1%)
ipa inlining heuristics          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)        0 kB (  0%)
ipa pure const                   :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)        0 kB (  0%)
cfg construction                 :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)      890 kB (  1%)
cfg cleanup                      :   0.05 (  0%)   0.00 (  0%)   0.08 (  0%)        0 kB (  0%)
trivially dead code              :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)        0 kB (  0%)
df scan insns                    :   0.07 (  0%)   0.00 (  0%)   0.07 (  0%)        0 kB (  0%)
df multiple defs                 :   0.04 (  0%)   0.00 (  0%)   0.03 (  0%)        0 kB (  0%)
df reaching defs                 :   0.97 (  4%)   0.01 (  4%)   0.99 (  4%)        0 kB (  0%)
df live regs                     :   0.24 (  1%)   0.00 (  0%)   0.21 (  1%)        0 kB (  0%)
df live&initialized regs         :  12.11 ( 50%)   0.01 (  4%)  12.03 ( 49%)        0 kB (  0%)
df use-def / def-use chains      :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)        0 kB (  0%)
df reg dead/unused notes         :   0.24 (  1%)   0.00 (  0%)   0.24 (  1%)     2811 kB (  2%)
register information             :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)        0 kB (  0%)
alias analysis                   :   0.06 (  0%)   0.00 (  0%)   0.08 (  0%)     2048 kB (  1%)
alias stmt walking               :   1.55 (  6%)   0.06 ( 23%)   1.65 (  7%)       92 kB (  0%)
register scan                    :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)      189 kB (  0%)
rebuild jump labels              :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)        0 kB (  0%)
parser (global)                  :   0.30 (  1%)   0.02 (  8%)   0.32 (  1%)    18037 kB ( 11%)
inline parameters                :   0.04 (  0%)   0.00 (  0%)   0.03 (  0%)      513 kB (  0%)
tree gimplify                    :   0.07 (  0%)   0.01 (  4%)   0.08 (  0%)    13934 kB (  9%)
tree eh                          :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)        0 kB (  0%)
tree CFG construction            :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)     5209 kB (  3%)
tree CFG cleanup                 :   0.34 (  1%)   0.01 (  4%)   0.37 (  2%)     1697 kB (  1%)
tree copy propagation            :   0.03 (  0%)   0.00 (  0%)   0.04 (  0%)        0 kB (  0%)
tree PTA                         :   0.21 (  1%)   0.00 (  0%)   0.22 (  1%)     1269 kB (  1%)
tree PHI insertion               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     2644 kB (  2%)
tree SSA rewrite                 :   0.05 (  0%)   0.00 (  0%)   0.05 (  0%)     3119 kB (  2%)
tree SSA other                   :   0.01 (  0%)   0.02 (  8%)   0.03 (  0%)        0 kB (  0%)
tree SSA incremental             :   0.08 (  0%)   0.00 (  0%)   0.10 (  0%)     4729 kB (  3%)
tree operand scan                :   0.04 (  0%)   0.01 (  4%)   0.06 (  0%)     3526 kB (  2%)
dominator optimization           :   0.27 (  1%)   0.01 (  4%)   0.23 (  1%)     5850 kB (  4%)
tree SRA                         :   0.13 (  1%)   0.00 (  0%)   0.14 (  1%)      562 kB (  0%)
tree CCP                         :   0.32 (  1%)   0.01 (  4%)   0.30 (  1%)     1226 kB (  1%)
tree reassociation               :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)        0 kB (  0%)
tree FRE                         :   0.60 (  2%)   0.02 (  8%)   0.59 (  2%)     2505 kB (  2%)
tree code sinking                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)       60 kB (  0%)
tree linearize phis              :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)        2 kB (  0%)
tree backward propagate          :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)        0 kB (  0%)
tree forward propagate           :   0.05 (  0%)   0.00 (  0%)   0.04 (  0%)      816 kB (  1%)
tree phiprop                     :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)        0 kB (  0%)
tree conservative DCE            :   0.02 (  0%)   0.02 (  8%)   0.07 (  0%)        0 kB (  0%)
tree aggressive DCE              :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)      768 kB (  0%)
tree DSE                         :   0.05 (  0%)   0.00 (  0%)   0.04 (  0%)        0 kB (  0%)
tree loop invariant motion       :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)        0 kB (  0%)
tree canonical iv                :   0.04 (  0%)   0.00 (  0%)   0.05 (  0%)     2262 kB (  1%)
scev constant prop               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)      228 kB (  0%)
complete unrolling               :   0.25 (  1%)   0.02 (  8%)   0.24 (  1%)     9319 kB (  6%)
tree iv optimization             :   0.13 (  1%)   0.01 (  4%)   0.15 (  1%)     7884 kB (  5%)
tree copy headers                :   3.15 ( 13%)   0.01 (  4%)   3.17 ( 13%)     4763 kB (  3%)
dominance frontiers              :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)        0 kB (  0%)
dominance computation            :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)        0 kB (  0%)
out of ssa                       :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)        0 kB (  0%)
expand vars                      :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      589 kB (  0%)
expand                           :   0.10 (  0%)   0.00 (  0%)   0.10 (  0%)    23667 kB ( 15%)
post expand cleanups             :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)        0 kB (  0%)
forward prop                     :   0.09 (  0%)   0.00 (  0%)   0.07 (  0%)      908 kB (  1%)
CSE                              :   0.11 (  0%)   0.00 (  0%)   0.11 (  0%)     1458 kB (  1%)
dead code elimination            :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)        0 kB (  0%)
dead store elim1                 :   0.08 (  0%)   0.00 (  0%)   0.07 (  0%)     2334 kB (  1%)
dead store elim2                 :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)     2436 kB (  2%)
loop init                        :   0.18 (  1%)   0.00 (  0%)   0.21 (  1%)     8614 kB (  5%)
loop invariant motion            :   0.34 (  1%)   0.00 (  0%)   0.45 (  2%)      151 kB (  0%)
loop fini                        :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)        0 kB (  0%)
branch prediction                :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)      778 kB (  0%)
combiner                         :   0.11 (  0%)   0.00 (  0%)   0.10 (  0%)     1168 kB (  1%)
integrated RA                    :   0.28 (  1%)   0.00 (  0%)   0.26 (  1%)     9050 kB (  6%)
LRA non-specific                 :   0.12 (  0%)   0.00 (  0%)   0.15 (  1%)      394 kB (  0%)
LRA virtuals elimination         :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)     1547 kB (  1%)
LRA create live ranges           :   0.06 (  0%)   0.00 (  0%)   0.05 (  0%)      121 kB (  0%)
LRA hard reg assignment          :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)        0 kB (  0%)
reload CSE regs                  :   0.11 (  0%)   0.00 (  0%)   0.10 (  0%)     1373 kB (  1%)
thread pro- & epilogue           :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)        4 kB (  0%)
hard reg cprop                   :   0.03 (  0%)   0.01 (  4%)   0.04 (  0%)        0 kB (  0%)
reorder blocks                   :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)      121 kB (  0%)
shorten branches                 :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)        0 kB (  0%)
final                            :   0.07 (  0%)   0.00 (  0%)   0.07 (  0%)     1178 kB (  1%)
straight-line strength reduction :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)      508 kB (  0%)
rest of compilation              :   0.15 (  1%)   0.00 (  0%)   0.16 (  1%)     1247 kB (  1%)
remove unused locals             :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)        0 kB (  0%)
address taken                    :   0.03 (  0%)   0.00 (  0%)   0.01 (  0%)        0 kB (  0%)
TOTAL                            :  24.11          0.26         24.39          161510 kB
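
For anyone who wants to rank the passes without reading the whole
report, a small script along these lines may help. It is a rough
sketch, not part of GCC: the names SAMPLE, ROW and top_passes are made
up here, the sample rows are copied from the report above, and the
regular expression assumes each pass row starts with
"name : usr ( pct%)" as it does in this report.

```python
import re

# A few rows copied from the report above (SAMPLE is an illustrative name).
SAMPLE = """\
phase opt and generate : 23.81 ( 99%) 0.24 ( 92%) 24.06 ( 99%) 143289 kB ( 89%)
df reaching defs : 0.97 ( 4%) 0.01 ( 4%) 0.99 ( 4%) 0 kB ( 0%)
df live&initialized regs : 12.11 ( 50%) 0.01 ( 4%) 12.03 ( 49%) 0 kB ( 0%)
alias stmt walking : 1.55 ( 6%) 0.06 ( 23%) 1.65 ( 7%) 92 kB ( 0%)
tree copy headers : 3.15 ( 13%) 0.01 ( 4%) 3.17 ( 13%) 4763 kB ( 3%)
TOTAL : 24.11 0.26 24.39 161510 kB
"""

# Pass name, a colon, then the user time followed by its "( NN%)" share.
# The TOTAL row has no percentage after the time, so it does not match.
ROW = re.compile(r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)")

def top_passes(report, n=3):
    """Return the n (name, user_seconds) rows with the largest user time."""
    rows = []
    for line in report.splitlines():
        m = ROW.match(line)
        # "phase ..." rows are summaries of the other rows, so skip them.
        if m and not m["name"].startswith("phase "):
            rows.append((m["name"], float(m["usr"])))
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]

for name, usr in top_passes(SAMPLE):
    print(f"{usr:6.2f}s  {name}")
```

On the sample rows this puts df live&initialized regs first, followed
by tree copy headers and alias stmt walking.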