[prev in list] [next in list] [prev in thread] [next in thread]
List: gcc-bugs
Subject: BUG x86 fp stack overflow compiling at -O0
From: grahams <grahams () rcp ! co ! uk>
Date: 1999-12-31 19:07:31
[Download RAW message or body]
Hi
Bootstrapping egcs-19991228 using installed egcs-19991221 fails because of
differences in nearly all files.
Investigation showed that it was local register allocation selecting the
registers in a different order. I tracked this down to the calculation
of register priorities which determines the order in which register are
allocated.
The following macro is used in local-alloc.c to calculate allocation priorities
and this was sometimes returning bogus results (i.e. -2147483648).
#define QTY_CMP_PRI(q) \
((int) (((double) (floor_log2 (qty[q].n_refs) * qty[q].n_refs * qty[q].size) \
/ (qty[q].death - qty[q].birth)) * 10000))
This was being miscompiled by the installed egcs-19991221 compiler at -O0
because the fix_truncsi_1 pattern was failing to pop the FP stack.
Here is an analysis of the bug and a possible fix.
In i386.md there are define_insn's for *fix_truncsi_1 and *fix_truncdi_1
instruction patterns and corresponding define_split's. The define_insn's
use output_fix_truncate () in i386.c which uses REG_DEAD notes to determine
if the top of the FP stack dies and thus needs to be popped. These REG_DEAD
notes are added during the reg_to_stack pass. So after the reg_to_stack pass
has completed the insns will have any necessary REG_DEAD notes.
Consider the following insn post reg_to_stack pass
(insn 32 30 34 (parallel[
(set (reg:SI 1 edx)
(fix:SI (reg:DF 8 st(0))))
(clobber (mem:SI (plus:SI (reg:SI 6 ebp)
(const_int -4 [0xfffffffc])) 0))
(clobber (mem:SI (plus:SI (reg:SI 6 ebp)
(const_int -8 [0xfffffff8])) 0))
(clobber (reg:SI 0 eax))
] ) 152 {*fix_truncsi_1} (insn_list 30 (nil))
(expr_list:REG_DEAD (reg:DF 8 st(0))
(expr_list:REG_UNUSED (reg:SI 0 eax)
(nil))))
Final instruction selection is now performed and the above
insn gets split by the define_split which matches the fix_truncsi_1
pattern. When the instruction is split any reg notes on the original
insn are lost.
Later output_fix_truncate () is called decides that the top of the FP
stack does NOT die because there is no REG_DEAD note and hence does
not pop the FP stack.
The conclusion is that the define_split's corresponding to the
fix_truncsi_1/fix_truncdi_1 insn cannot be used after the reg_to_stack
because any REG_DEAD notes will be lost and output_fix_truncate () will
generate the WRONG code if the original insn had a REG_DEAD note for
the top of FP stack.
In my local tree I have modified the define_split to be disabled
after the reg_to_pass if the insn has a REG_DEAD note for operands[1]
Here's the original define_insn define_split for fix_truncsi_1
(define_insn "*fix_truncsi_1"
[(set (match_operand:SI 0 "nonimmediate_operand" "=m,?r")
(fix:SI (match_operand 1 "register_operand" "f,f")))
(clobber (match_operand:SI 2 "memory_operand" "=o,o"))
(clobber (match_operand:SI 3 "memory_operand" "=m,m"))
(clobber (match_scratch:SI 4 "=&r,r"))]
"TARGET_80387 && FLOAT_MODE_P (GET_MODE (operands[1]))"
"* return output_fix_trunc (insn, operands);"
[(set_attr "type" "multi")])
(define_split
[(set (match_operand:SI 0 "register_operand" "")
(fix:SI (match_operand 1 "register_operand" "")))
(clobber (match_operand:SI 2 "memory_operand" ""))
(clobber (match_operand:SI 3 "memory_operand" ""))
(clobber (match_scratch:SI 4 ""))]
"reload_completed
&& find_regno_note (insn, REG_DEAD, REGNO(operands[1])))"
[(parallel [(set (match_dup 3) (fix:SI (match_dup 1)))
(clobber (match_dup 2))
(clobber (match_dup 3))
(clobber (match_dup 4))])
(set (match_dup 0) (match_dup 3))]
"")
and here's the modified define_split (the define_insn is unchanged)
(define_split
[(set (match_operand:SI 0 "register_operand" "")
(fix:SI (match_operand 1 "register_operand" "")))
(clobber (match_operand:SI 2 "memory_operand" ""))
(clobber (match_operand:SI 3 "memory_operand" ""))
(clobber (match_scratch:SI 4 ""))]
"reload_completed
&& !(reg_to_stack_started
&& find_regno_note (insn, REG_DEAD, REGNO(operands[1])))"
[(parallel [(set (match_dup 3) (fix:SI (match_dup 1)))
(clobber (match_dup 2))
(clobber (match_dup 3))
(clobber (match_dup 4))])
(set (match_dup 0) (match_dup 3))]
"")
Note reg_to_stack is a new global set at the beginning of the
reg_to_stack pass and a similar change was added to the define_split
for the fix_truncdi_1 insn.
I have attached a small program which shows the effects of the
bug when compiled with egcs-19991221/28 on x86 using -O0 or -O1
but works at -O2 (it works using -O2 because the instruction gets
split before the reg_to_stack pass and not post reg_to-stack pass
as occurs using -O0 or -O1)
The correct out should be
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
but when it fails the output is
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=10000
priority(4,1,16,24)=-2147483648
priority(4,1,16,24)=-2147483648
priority(4,1,16,24)=-2147483648
We get the wrong answers because the FP stack has overflowed.
With the above modification to the define_split patterns
I get correct output for all optimization levels.
Graham
["bug3.c" (application/x-unknown-content-type-cfile)]
#include <stdio.h>
static int floor_log2(unsigned long long int x)
{
int log = -1;
while (x != 0)
{
log++;
x >>= 1;
}
return log;
}
static int priority(int nrefs, int size, int birth, int death)
{
return (int)((((double)(floor_log2 (nrefs) * nrefs * size))
/ (death - birth)) * 10000);
}
int nrefs = 4;
int size = 1;
int birth = 16;
int death = 24;
int main()
{
int i;
for (i = 0; i < 10; i++)
{
int p = priority(nrefs, size, birth, death);
printf("priority(%d,%d,%d,%d)=%d\n",
nrefs, size, birth, death, p);
}
return 0;
}
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic