[prev in list] [next in list] [prev in thread] [next in thread] 

List:       freedesktop-xorg-devel
Subject:    Re: [PATCH] fb: Fix memcpy abuse
From:       Soeren Sandmann <sandmann () cs ! au ! dk>
Date:       2011-04-30 12:39:42
Message-ID: ye862pvoqup.fsf () llama03 ! cs ! au ! dk
[Download RAW message or body]

Siarhei Siamashka <siarhei.siamashka@gmail.com> writes:

>>  1            2                 3                 4           Operation
>> ------   ---------------   ---------------   ---------------   ------------
>> 258000   269000 (  1.04)   544000 (  2.11)   552000 (  2.14)   Copy 10x10
>>  21300    23000 (  1.08)    43700 (  2.05)    47100 (  2.21)   Copy 100x100
>>   960      962 (  1.00)     1990 (  2.09)     1990 (  2.07)   Copy 500x500
>>
>> So it's a modest performance hit, but correctness demands it, and it's
>> probably worth keeping the 2x speedup from having the fast path in the
>> first place.
>
> In my opinion, still the best solution is to do this stuff in pixman,
> because it has its own SIMD optimized code and can do a lot better job
> than memcpy/memmove (for example, it can be improved to use MOVNTx
> instructions for x86 when scrolling large areas exceeding L2/L3 cache
> size, and this optimization can't be easily done by glibc if
> memcpy/memmove is used separately for each individual scanline). I did
> some experiments with improving scrolling performance when using
> non-hardware accelerated framebuffer earlier and it showed a really
> major speedup on ARM:
> http://lists.x.org/archives/xorg-devel/2009-November/003536.html

If I remember correctly, the main objection I had to the overlapped blt
patch was that it inlined the C fallback into all the SIMD versions
instead of just calling down through the delegate.

Other than that, I agree that doing this in pixman would be best. In
fact, I think it would make sense to have a full CopyArea implementation
in pixman that would also handle rasterops and planemasks etc, but
simply supporting overlapping in pixman_blt() is useful too.

>> -    if (alu == GXcopy && pm == FB_ALLONES && !reverse &&
>> +    careful = !((srcLine < dstLine && srcLine + width * (bpp/8) > dstLine) ||
>> +                (dstLine < srcLine && dstLine + width * (bpp/8) > srcLine)) ||
>> +              (bpp % 8);
>> +
>> +    if (alu == GXcopy && pm == FB_ALLONES && !careful &&
>>             !(srcX & 7) && !(dstX & 7) && !(width & 7)) {
>>         int i;
>>         CARD8 *src = (CARD8 *) srcLine;
>>         CARD8 *dst = (CARD8 *) dstLine;
>> -
>> +
>>         srcStride *= sizeof(FbBits);
>>         dstStride *= sizeof(FbBits);
>>         width >>= 3;
>
> What is the story with the 'reverse' variable? Does it have no use
> anymore and needs to be purged later?

It's an argument to the function that indicates whether each scanline
should be copied back-to-front. In the non-overlapping case, it can be
ignored, but it is used in other places in fbBlt().


Soren
_______________________________________________
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic