
List:       mono-devel-list
Subject:    Re: [Mono-dev] Mono.SIMD
From:       Alan McGovern <alan.mcgovern () gmail ! com>
Date:       2009-02-23 13:32:56
Message-ID: 117799f00902230532i1a6bf9c9j85b2643e00df0450 () mail ! gmail ! com

Hey,

The C++ code seems very similar to the C# SIMD code, so I don't know what
would make the C# version any faster. This question would be best directed
at the JIT guys, who may know what causes the difference.

If you want to try speeding up the Mono version, the practical approach is
trial and error: rewrite things and measure whether performance improves.
For example, unrolling the loop may improve performance noticeably.
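
To make the unrolling suggestion concrete, here is a minimal sketch in C++ (matching the C++ test code quoted below), not a measured optimisation. `Vec4`, `gradient_rolled` and `gradient_unrolled2` are hypothetical names for illustration; `Vec4` is a plain four-float stand-in for `VectorSIMD` / `Vector4f`, so this shows only the shape of the loop. The unrolled version handles two pixels per iteration with two independent accumulators and assumes an even width. Whether it actually helps under Mono's JIT is something only measurement can tell.

```cpp
#include <cassert>

// Hypothetical plain stand-in for VectorSIMD / Vector4f (illustration only).
struct Vec4 {
    float x, y, z, w;
};

inline Vec4 operator+(Vec4 a, Vec4 b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}

inline Vec4 operator*(Vec4 a, Vec4 b) {
    return {a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w};
}

// Reference: the loop as posted in the thread, one pixel per iteration.
Vec4 gradient_rolled(int w, int h) {
    const float inv = 1.0f / (static_cast<float>(w) * static_cast<float>(h));
    const Vec4 finv{inv, inv, inv, inv};
    const Vec4 a{1.5f, 1.5f, 2.0f, 4.0f};  // sum of the four colours in the thread
    Vec4 ret{0.0f, 0.0f, 0.0f, 0.0f};
    Vec4 yVec{static_cast<float>(h), static_cast<float>(h), 0.0f, 0.0f};
    const Vec4 yDiff{-1.0f, -1.0f, 1.0f, 1.0f};
    for (int y = 0; y < h; y++) {
        const Vec4 factor = yVec * finv;
        yVec = yVec + yDiff;
        Vec4 xVec{static_cast<float>(w), 0.0f, static_cast<float>(w), 0.0f};
        const Vec4 xDiff{-1.0f, 1.0f, -1.0f, 1.0f};
        for (int x = 0; x < w; x++) {
            ret = ret + (a * xVec * factor);
            xVec = xVec + xDiff;
        }
    }
    return ret;
}

// Unrolled by two: two pixels per iteration, two independent accumulators.
Vec4 gradient_unrolled2(int w, int h) {
    assert(w % 2 == 0);  // assumption of this sketch: even width, no remainder loop
    const float inv = 1.0f / (static_cast<float>(w) * static_cast<float>(h));
    const Vec4 finv{inv, inv, inv, inv};
    const Vec4 a{1.5f, 1.5f, 2.0f, 4.0f};
    Vec4 acc0{0.0f, 0.0f, 0.0f, 0.0f};
    Vec4 acc1{0.0f, 0.0f, 0.0f, 0.0f};
    Vec4 yVec{static_cast<float>(h), static_cast<float>(h), 0.0f, 0.0f};
    const Vec4 yDiff{-1.0f, -1.0f, 1.0f, 1.0f};
    for (int y = 0; y < h; y++) {
        const Vec4 factor = yVec * finv;
        yVec = yVec + yDiff;
        const Vec4 scaled = a * factor;  // invariant in x, hoisted out of the inner loop
        Vec4 x0{static_cast<float>(w), 0.0f, static_cast<float>(w), 0.0f};          // pixel x
        Vec4 x1{static_cast<float>(w - 1), 1.0f, static_cast<float>(w - 1), 1.0f};  // pixel x + 1
        const Vec4 step{-2.0f, 2.0f, -2.0f, 2.0f};
        for (int x = 0; x < w; x += 2) {
            acc0 = acc0 + scaled * x0;
            acc1 = acc1 + scaled * x1;
            x0 = x0 + step;
            x1 = x1 + step;
        }
    }
    return acc0 + acc1;
}
```

Calling `gradient_rolled(8, 8)` and `gradient_unrolled2(8, 8)` gives the same sums. For sizes that are not powers of two, the two versions may differ in the last bits because the additions happen in a different order.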

Alan.

On Mon, Feb 23, 2009 at 1:16 PM, Johann Nadalutti <jnadalutti@gmail.com> wrote:

> Hey,
>  thanks a lot for your modifications.
>  The SIMD version is now 3x faster than the 4D float version!
>  I wrote the same code in C++ and it is 3x faster than Mono.SIMD.
> I just want to know why, and how to optimize my Mono code.
>  Which IDE do you use to develop and debug Mono?
>
>
> My Visual C++ code for test:
>
> #include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_mul_ps, _mm_add_ps, _mm_load1_ps
>
> class VectorSIMD
> {
> public:
>
>     VectorSIMD();
>     VectorSIMD(float x, float y, float z, float w);
>
>     // Component-wise multiply
>     VectorSIMD operator*(const VectorSIMD& other) const
>     {
>         VectorSIMD r;
>         r.vec = _mm_mul_ps(vec, other.vec);
>         return r;
>     }
>
>     // Multiply every component by the scalar f
>     VectorSIMD operator*(float f) const
>     {
>         VectorSIMD r;
>         __m128 b = _mm_load1_ps(&f);
>         r.vec = _mm_mul_ps(vec, b);
>         return r;
>     }
>
>
>     // Component-wise add
>     VectorSIMD operator+(const VectorSIMD& other) const
>     {
>         VectorSIMD r;
>         r.vec = _mm_add_ps(vec, other.vec);
>         return r;
>     }
>
>     // Data: the same 16 bytes viewed as one SSE register or as four floats
>     union
>     {
>         __m128 vec;
>         struct { float x, y, z, w; };
>     };
>
> };
>
> VectorSIMD::VectorSIMD()
> {
> }
>
> VectorSIMD::VectorSIMD(float _x, float _y, float _z, float _w)
> {
>     x=_x;    y=_y; z=_z; w=_w;
> }
>
>
> VectorSIMD GradientSIMD()
> {
>     VectorSIMD finv_WH(1.0f / (_W*_H), 1.0f / (_W*_H), 1.0f / (_W*_H), 1.0f / (_W*_H));
>     VectorSIMD ret(0.0f, 0.0f, 0.0f, 0.0f);
>
>     VectorSIMD a(0.0f, 0.0f, 1.0f, 1.0f);
>     a = a + VectorSIMD(0.0f, 1.0f, 0.0f, 1.0f);
>     a = a + VectorSIMD(1.0f, 0.0f, 0.0f, 1.0f);
>     a = a + VectorSIMD(0.5f, 0.5f, 1.0f, 1.0f);
>
>
>     //Process operator
>     VectorSIMD yVec(_H, _H, 0, 0);
>     VectorSIMD yDiff(-1.0f, -1.0f, 1.0f, 1.0f);
>     for (int y=0; y<_H; y++)
>     {
>         VectorSIMD factor = yVec * finv_WH;
>         yVec = yVec + yDiff;
>
>         VectorSIMD xVec(_W, 0, _W, 0);
>         VectorSIMD xDiff(-1.0f, 1.0f, -1.0f, 1.0f);
>         for (int x=0; x<_W; x++)
>         {
>             ret=ret+(a*xVec*factor);
>             xVec=xVec+xDiff;
>         }
>     }
>
>     return ret;
> }
>
>
> Johann.
>
>
>
>
> 2009/2/23 Alan McGovern <alan.mcgovern@gmail.com>
>
> Hey,
>>
>> The big issue you're having is that you haven't implemented a SIMD
>> algorithm ;) I spent 15 mins 'optimising' your code and came up with this.
>> Notice that I made everything a SIMD operation. There is no scalar code in
>> the method anymore. This tripled performance as compared to the non-SIMD
>> version. On my machine:
>>
>> -FLOAT 00:00:00.3888930 Color
>> -SIMD   00:00:00.1266820 Mono.Simd.Vector4f
>>
>> You'd want to double check the result just in case I made a mistake with
>> my alterations.
>>
>> Alan.
>>
>>         public static Vector4f GradientSIMD()
>>         {
>>             Vector4f finv_WH = new Vector4f (1.0f / (w*h), 1.0f / (w*h), 1.0f / (w*h), 1.0f / (w*h));
>>             Vector4f ret = new Vector4f();
>>
>>             Vector4f a = new Vector4f(0.0f, 0.0f, 1.0f, 1.0f);
>>             a += new Vector4f(0.0f, 1.0f, 0.0f, 1.0f);
>>             a += new Vector4f(1.0f, 0.0f, 0.0f, 1.0f);
>>             a += new Vector4f(0.5f, 0.5f, 1.0f, 1.0f);
>>
>>             //Process operator
>>             Vector4f yVec = new Vector4f (h, h, 0, 0);
>>             Vector4f yDiff = new Vector4f (-1, -1, 1, 1);
>>             for (int y=0; y<h; y++)
>>             {
>>                 Vector4f factor = yVec * finv_WH;
>>                 yVec += yDiff;
>>
>>                 Vector4f xVec = new Vector4f (w, 0, w, 0);
>>                 Vector4f xDiff = new Vector4f (-1, 1, -1, 1);
>>                 for (int x=0; x<w; x++)
>>                 {
>>                     ret += (a * xVec * factor);
>>                     xVec += xDiff;
>>                 }
>>             }
>>             return ret;
>>         }
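
One way to double check the alterations above is to compare the incrementally updated weights (`xVec * factor`) with the weights computed directly per pixel, as the earlier SIMD loop did with `p.X` to `p.W`. The following is a minimal sketch in C++ under stated assumptions: `max_weight_error` is a hypothetical helper, and plain float arrays stand in for `Vector4f`. It checks only the weights against the earlier SIMD loop; the scalar `Gradient()` combines the colours differently (each colour gets its own weight), so it is not the reference here.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: largest absolute difference, over a w x h grid, between
// the per-pixel weights computed directly and the same weights obtained with
// the incremental vector updates (yVec += yDiff, xVec += xDiff).
float max_weight_error(int w, int h) {
    const float inv = 1.0f / (static_cast<float>(w) * static_cast<float>(h));
    float worst = 0.0f;

    float yVec[4] = {static_cast<float>(h), static_cast<float>(h), 0.0f, 0.0f};
    const float yDiff[4] = {-1.0f, -1.0f, 1.0f, 1.0f};
    for (int y = 0; y < h; y++) {
        float factor[4];
        for (int i = 0; i < 4; i++) {
            factor[i] = yVec[i] * inv;
            yVec[i] += yDiff[i];
        }

        float xVec[4] = {static_cast<float>(w), 0.0f, static_cast<float>(w), 0.0f};
        const float xDiff[4] = {-1.0f, 1.0f, -1.0f, 1.0f};
        for (int x = 0; x < w; x++) {
            // Direct form, as in the first SIMD version of the loop.
            const float direct[4] = {
                static_cast<float>((w - x) * (h - y)) * inv,
                static_cast<float>(x * (h - y)) * inv,
                static_cast<float>((w - x) * y) * inv,
                static_cast<float>(x * y) * inv,
            };
            for (int i = 0; i < 4; i++) {
                worst = std::max(worst, std::fabs(xVec[i] * factor[i] - direct[i]));
                xVec[i] += xDiff[i];
            }
        }
    }
    return worst;
}
```

A result at or near zero for a few grid sizes suggests the incremental updates reproduce the per-pixel weights; small non-zero values come from floating-point rounding.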
>>
>> On Fri, Feb 20, 2009 at 8:12 AM, Johann_fxgen <jnadalutti@gmail.com> wrote:
>>
>>>
>>> I have done some performance tests of SIMD under Windows.
>>>
>>> Test results in ms:
>>> In MS C       235   (Visual Studio Release Mode With SIMD)
>>> In MS C       360   (Visual Studio Release Mode With 4D Float)
>>> In Mono C#    453   (With Mono SIMD)
>>> In Mono C#    562   (With Mono 4D Float)
>>> In MS C#      609   (Visual Studio With 4D Float)
>>> In MS C       672   (Visual Studio Debug Mode)
>>>
>>> I'm just surprised by the difference between the C SIMD and Mono SIMD versions.
>>>
>>> Is Mono.SIMD faster under Linux than under Windows?
>>>
>>> Johann.
>>>
>>> My mono code for test:
>>>
>>>        using Mono.Simd;
>>>        using System;
>>>        using Mono;
>>>
>>>        public struct Color
>>>        {
>>>                public float r,g,b,a;
>>>        };
>>>
>>>        public class TestMonoSIMD
>>>        {
>>>                public  Color m_pixels;
>>>                const int w = 4096;
>>>                const int h = 4096;
>>>
>>>                public static void Main ()
>>>                {
>>>                        //Debug
>>>                        Console.WriteLine("AccelMode: {0}", Mono.Simd.SimdRuntime.AccelMode);
>>>
>>>                        //Without SIMD
>>>                        DateTime start1 = DateTime.Now;
>>>                        Color ret1 = Gradient();
>>>                        TimeSpan ts1 = DateTime.Now - start1;
>>>                        Console.WriteLine("-FLOAT {0} {1}", ts1, ret1);
>>>
>>>                        //With SIMD
>>>                        DateTime start2 = DateTime.Now;
>>>                        Vector4f ret2 = GradientSIMD();
>>>                        TimeSpan ts2 = DateTime.Now - start2;
>>>                        Console.WriteLine("-SIMD  {0} {1}", ts2, ret2);
>>>                }
>>>
>>>                public static Color Gradient()
>>>                {
>>>                        float finv_WH = 1.0f / (float)(w*h);
>>>                        Color ret = new Color();
>>>                        ret.r=ret.g=ret.b=ret.a=0.0f;
>>>
>>>                        Color a = new Color();
>>>                        Color b = new Color();
>>>                        Color c = new Color();
>>>                        Color d = new Color();
>>>                        a.r=0.0f;       a.g=0.0f; a.b=1.0f; a.a=1.0f;
>>>                        b.r=0.0f;       b.g=1.0f; b.b=0.0f; b.a=1.0f;
>>>                        c.r=1.0f;       c.g=0.0f; c.b=0.0f; c.a=1.0f;
>>>                        d.r=0.5f;       d.g=0.5f; d.b=1.0f; d.a=1.0f;
>>>
>>>                        //Process operator
>>>                        for (int y=0; y<h; y++)
>>>                        {
>>>                                for (int x=0; x<w; x++)
>>>                                {
>>>                                        //Calc percent A,B,C,D
>>>                                        float pa = (float)((w-x) * (h-y)) * finv_WH;
>>>                                        float pb = (float)((x)   * (h-y)) * finv_WH;
>>>                                        float pc = (float)((w-x) * (y))   * finv_WH;
>>>                                        float pd = (float)((x)   * (y))   * finv_WH;
>>>
>>>                                        float cr= ((a.r*pa) + (b.r*pb) + (c.r*pc) + (d.r*pd));
>>>                                        float cg= ((a.g*pa) + (b.g*pb) + (c.g*pc) + (d.g*pd));
>>>                                        float cb= ((a.b*pa) + (b.b*pb) + (c.b*pc) + (d.b*pd));
>>>                                        float ca= ((a.a*pa) + (b.a*pb) + (c.a*pc) + (d.a*pd));
>>>                                        ret.r+=cr;      ret.g+=cg;      ret.b+=cb;      ret.a+=ca;
>>>                                }
>>>                        }
>>>                        return ret;
>>>                }
>>>
>>>                public static Vector4f GradientSIMD()
>>>                {
>>>                        float finv_WH = 1.0f / (float)(w*h);
>>>                        Vector4f ret = new Vector4f(0.0f, 0.0f, 0.0f, 0.0f);
>>>
>>>                        Vector4f a = new Vector4f(0.0f, 0.0f, 1.0f, 1.0f);
>>>                        Vector4f b = new Vector4f(0.0f, 1.0f, 0.0f, 1.0f);
>>>                        Vector4f c = new Vector4f(1.0f, 0.0f, 0.0f, 1.0f);
>>>                        Vector4f d = new Vector4f(0.5f, 0.5f, 1.0f, 1.0f);
>>>
>>>                        //Process operator
>>>                        Vector4f p = new Vector4f();
>>>                        Vector4f r = new Vector4f();
>>>                        for (int y=0; y<h; y++)
>>>                        {
>>>                                for (int x=0; x<w; x++)
>>>                                {
>>>                                        //Calc percent A,B,C,D
>>>                                        p.X = (float)((w-x) * (h-y)) * finv_WH;
>>>                                        p.Y = (float)((x)   * (h-y)) * finv_WH;
>>>                                        p.Z = (float)((w-x) * (y))   * finv_WH;
>>>                                        p.W = (float)((x)   * (y))   * finv_WH;
>>>
>>>                                        ret+=a*p + b*p + c*p + d*p;
>>>                                }
>>>                        }
>>>                        return ret;
>>>                }
>>>
>>>        }
>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Mono.SIMD-tp22116483p22116483.html
>>> Sent from the Mono - Dev mailing list archive at Nabble.com.
>>>
>>> _______________________________________________
>>> Mono-devel-list mailing list
>>> Mono-devel-list@lists.ximian.com
>>> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>>>
>>
>>
>





