[prev in list] [next in list] [prev in thread] [next in thread] 

List:       gcc
Subject:    Re: IRA vs CANNOT_CHANGE_MODE_CLASS, + 4.7 IRA regressions?
From:       Vladimir Makarov <vmakarov () redhat ! com>
Date:       2011-07-29 18:13:42
Message-ID: 4E32F856.5070606 () redhat ! com
[Download RAW message or body]

On 07/27/2011 05:59 PM, Sandra Loosemore wrote:
> Consider this bit of code:
>
> extern double a[20];
>
> double test1 (int n)
> {
>   double accum = 0.0;
>   int i;
>
>   for (i=0; i<n; i++) { accum -= a[i]; }
>   accum = fabs (accum);
>   return accum;
> }
>
> which is compiled for MIPS using
>
> mipsisa32r2-sde-elf-gcc -O3 -fno-inline -fno-unroll-loops 
> -march=74kf1_1 -S abstest.c
>
> With a GCC 4.6 compiler, this produces:
> ...
> .L3:
>     mtc1    $3,$f2
>     ldc1    $f0,0($5)
>     addiu    $5,$5,8
>     mtc1    $2,$f3
>     sub.d    $f2,$f2,$f0
>     mfc1    $3,$f2
>     bne    $5,$4,.L3
>     mfc1    $2,$f3
>
>     ext    $5,$2,0,31
>     move    $4,$3
> .L2:
>     mtc1    $4,$f0
>     j    $31
>     mtc1    $5,$f1
> ...
>
> This is terrible code, with all that pointless register-shuffling 
> inside the loop -- what's gone wrong?  Well, the bit-twiddling 
> expansion of "fabs" produced by optabs.c uses subreg expressions, and 
> on MIPS CANNOT_CHANGE_MODE_CLASS disallows use of FP registers for 
> integer operations.  And, when IRA sees that, it decides it cannot 
> alloc "accum" to a FP reg at all, even if it obviously makes sense to 
> put it there for the rest of its lifetime.
>
> On mainline trunk, things are even worse as it's spilling to memory, 
> not just shuffling between registers:
>
> .L3:
>     ldc1    $f0,0($2)
>     addiu    $2,$2,8
>     sub.d    $f2,$f2,$f0
>     bne    $2,$3,.L3
>     sdc1    $f2,0($sp)
>
>     lw    $2,0($sp)
>     ext    $3,$2,0,31
>     lw    $2,4($sp)
> .L2:
>     sw    $2,4($sp)
>     sw    $3,0($sp)
>     lw    $3,4($sp)
>     lw    $2,0($sp)
>     addiu    $sp,$sp,8
>     mtc1    $3,$f0
>     j    $31
>     mtc1    $2,$f1
>
> I've been experimenting with a patch to the MIPS backend to add 
> define_insn_and_split patterns for floating-point abs -- the idea is 
> to attach some constraints to the insns to tell IRA it needs a GP reg 
> for these operations, so it can apply its usual cost analysis and 
> reload logic instead of giving up.  Then the split to introduce the 
> subreg expansion happens after reload when we already have the right 
> register class.  This seems to work well enough on 4.6; for this 
> particular example, I'm getting:
>
> .L3:
>     ldc1    $f2,0($2)
>     addiu    $2,$2,8
>     bne    $2,$4,.L3
>     sub.d    $f0,$f0,$f2
>
>     mfc1    $2,$f1
>     ext    $2,$2,0,31
>     j    $31
>     mtc1    $2,$f1
>
> However, same patch on mainline is still giving spills to memory.  :-(
>
> So, here's my question.  Is it worthwhile for me to continue this 
> approach of trying to make the MIPS backend smarter?  Or is the way 
> IRA deals with CANNOT_CHANGE_MODE_CLASS fundamentally broken and in 
> need of fixing in a target-inspecific way?  And/or is there some other 
> regression in IRA on mainline that's causing it to spill to memory 
> when it didn't used to in 4.6?
>
I think the second ("fixing in a target-inspecific way").  Instead of 
prohibiting class for a pseudo (that what is happening for class 
FP_REGS) because the pseudo can change its mode, impossibility of 
changing mode should be reflected in the class cost (through some reload 
cost evaluation).

I'll try to fix it.  The only problem is that it will take sometime 
because the fix should be tested on a few platforms.  It would be nice 
to make PR not to forget about the problem.
> BTW, the unary "neg" operator has the same problem as "abs" on MIPS; 
> can't use the hardware instruction because it does the wrong thing 
> with NaNs, and can't twiddle the sign bit directly in a FP register.  
> With both abs/neg now generating unnecessary memory spills, this seems 
> like a fairly important performance regression....

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic