'Re: [LLVMdev] 64bit MRV problem: { float, float, float} -> { double,'


List:       llvm-dev
Subject:    Re: [LLVMdev] 64bit MRV problem: { float, float, float} -> { double,
From:       Duncan Sands <baldrick () free ! fr>
Date:       2010-01-29 14:25:12
Message-ID: 4B62EFC8.1050607 () free ! fr

Hi Ralf,

> llvm-gcc -c -emit-llvm -O3 produces this:
> 
> %struct.float3 = type { float, float, float }
> define void @test(double %a.0, float %a.1, %struct.float3* nocapture
> %res) nounwind noinline {
> entry:
>   %tmp8 = bitcast double %a.0 to i64              ; <i64> [#uses=1]
>   %tmp9 = zext i64 %tmp8 to i96                   ; <i96> [#uses=1]
>   %tmp1 = lshr i96 %tmp9, 32                      ; <i96> [#uses=1]
>   %tmp2 = trunc i96 %tmp1 to i32                  ; <i32> [#uses=1]
>   %tmp3 = bitcast i32 %tmp2 to float              ; <float> [#uses=1]
>   %0 = getelementptr inbounds %struct.float3* %res, i64 0, i32 1 ; <float*> [#uses=1]
>   store float %tmp3, float* %0, align 4
>   ret void
> }

It is reasonable to expect the optimizers to turn this into at least

   %tmp8 = bitcast double %a.0 to i64
   %tmp1 = lshr i64 %tmp8, 32
   %tmp2 = trunc i64 %tmp1 to i32
   %tmp3 = bitcast i32 %tmp2 to float
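
As a sanity check on that claim, here is a rough Python model of both sequences: the i96 version from the quoted IR and an i64-only version. It is only a sketch; the helper names are made up for illustration, and masking with `&` stands in for the fixed-width zext/trunc operations.

```python
import struct

MASK32 = (1 << 32) - 1
MASK96 = (1 << 96) - 1

def double_bits(x: float) -> int:
    """Model of 'bitcast double to i64' (hypothetical helper)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bits_to_float(b: int) -> float:
    """Model of 'bitcast i32 to float' (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def high_word_wide(x: float) -> int:
    """The sequence llvm-gcc emitted: widen to i96, shift, truncate."""
    tmp8 = double_bits(x)
    tmp9 = tmp8 & MASK96      # zext i64 -> i96 (value unchanged)
    tmp1 = tmp9 >> 32         # lshr i96, 32
    return tmp1 & MASK32      # trunc i96 -> i32

def high_word_narrow(x: float) -> int:
    """The shorter sequence: stay in i64 the whole time."""
    tmp8 = double_bits(x)
    tmp1 = tmp8 >> 32         # lshr i64, 32
    return tmp1 & MASK32      # trunc i64 -> i32

# Two floats laid out back to back and reinterpreted as one double,
# which is how %a.0 carries the first two struct fields.
packed = struct.unpack("<d", struct.pack("<ff", 1.5, 2.5))[0]
second_field = bits_to_float(high_word_narrow(packed))
```

Both functions return the same 32-bit pattern for any input, since the zext only adds zero bits above bit 63 and the shift never moves them into the low word.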

> define void @xyz(float %aX, float %aY, float %aZ, float* noalias
> nocapture %resX, float* noalias nocapture %resY, float* noalias
> nocapture %resZ) nounwind {
> entry:
>   %0 = fadd float %aZ, 5.000000e-01          ; <float> [#uses=1]
>   %1 = fadd float %aY, 5.000000e-01          ; <float> [#uses=1]
>   %2 = fadd float %aX, 5.000000e-01          ; <float> [#uses=1]
>   %tmp16.i.i = bitcast float %1 to i32            ; <i32> [#uses=1]
>   %tmp17.i.i = zext i32 %tmp16.i.i to i96         ; <i96> [#uses=1]
>   %tmp18.i.i = shl i96 %tmp17.i.i, 32             ; <i96> [#uses=1]
>   %tmp19.i = zext i96 %tmp18.i.i to i128          ; <i128> [#uses=1]
>   %tmp8.i = lshr i128 %tmp19.i, 32                ; <i128> [#uses=1]
>   %tmp9.i = trunc i128 %tmp8.i to i32             ; <i32> [#uses=1]
>   %tmp10.i = bitcast i32 %tmp9.i to float         ; <float> [#uses=1]
>   store float %2, float* %resX, align 4
>   store float %tmp10.i, float* %resY, align 4
>   store float %0, float* %resZ, align 4
>   ret void
> }

Likewise, here it is reasonable to expect the optimizers to be able to get rid
of all the mucking around with %1, and understand that %tmp10.i is the same as
%1.
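
To see why %tmp10.i must equal %1, the integer chain in @xyz can be modelled in a few lines of Python. This is an illustrative sketch, not LLVM code; the function name is made up, and the masks emulate the i96 and i32 widths.

```python
MASK32 = (1 << 32) - 1
MASK96 = (1 << 96) - 1

def xyz_chain(bits: int) -> int:
    """Apply the zext/shl/zext/lshr/trunc chain to a 32-bit pattern."""
    assert 0 <= bits <= MASK32
    tmp17 = bits                      # zext i32 -> i96
    tmp18 = (tmp17 << 32) & MASK96    # shl i96, 32
    tmp19 = tmp18                     # zext i96 -> i128 (value unchanged)
    tmp8 = tmp19 >> 32                # lshr i128, 32
    return tmp8 & MASK32              # trunc i128 -> i32

samples = [0, 1, 0x3FC00000, 0x80000000, MASK32]
results = [xyz_chain(v) for v in samples]
```

The shift left by 32 followed by the shift right by 32 cancels out, because no bit of a 32-bit value is pushed past bit 95, so the chain is the identity on the bit pattern.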

So I think you should open a bug report for this.  Please include everything
in this email, as well as the original C for the second testcase.

> Any ideas where that code comes from or why it cannot be removed?

I think it comes directly from the frontend ABI logic.  The optimizers should
be able to clean this up; since they currently don't, they need to be
improved.
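
For illustration, the kind of coercion the @test signature implies (a { float, float, float } arriving as a double plus a float) can be sketched in Python as below. This assumes a little-endian target such as x86-64 and that the first eight bytes of the struct are simply reinterpreted as a double; the function names are hypothetical and do not correspond to anything in llvm-gcc.

```python
import struct

def coerce_float3(x: float, y: float, z: float):
    """Split a 12-byte float3 into (double, float), assuming little-endian."""
    raw = struct.pack("<fff", x, y, z)
    a0 = struct.unpack("<d", raw[:8])[0]   # fields 0 and 1 share one double
    a1 = struct.unpack("<f", raw[8:])[0]   # field 2 travels on its own
    return a0, a1

def recover_float3(a0: float, a1: float):
    """Undo the split: reassemble the bytes and read three floats back."""
    raw = struct.pack("<d", a0) + struct.pack("<f", a1)
    return struct.unpack("<fff", raw)

a0, a1 = coerce_float3(1.5, 2.5, 3.5)
fields = recover_float3(a0, a1)
```

Getting field 1 back out of the double is what requires the bitcast, shift and truncate sequence seen in the IR.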

Ciao,

Duncan.