
List:       zlib-devel
Subject:    [Zlib-devel] [JustForFun][Patch] Throw SSE2 at longest_match()
From:       kaffeemonster () googlemail ! com (Jan Seiffert)
Date:       2011-05-08 14:54:08
Message-ID: BANLkTin6OjZgAeKyw1nR7a9BSkJJ4BFRZw () mail ! gmail ! com

This is no serious patch, just a little hack put together in 1 hour.

We boost longest_match a little by giving it a better mousetrap^w memcmp.
Maybe something similar can be done for IBM Power6.
This is a very blunt method, so it isn't without downside, and this
patch is not meant to go anywhere.
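
For readers without the attachment, here is a minimal sketch of the general
idea (comparing 16 bytes per step with SSE2 instead of a few bytes at a time).
It is not the code from the patch; the helper name match_length() and its
signature are made up for illustration, and it assumes both buffers have at
least max_len readable bytes. On targets without SSE2 it falls back to the
plain byte loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper, not taken from zlib or from the patch:
 * returns how many leading bytes of a and b are equal, capped at max_len.
 * Both buffers must have at least max_len readable bytes. */
static size_t match_length(const unsigned char *a, const unsigned char *b,
                           size_t max_len)
{
    size_t len = 0;

    assert(a != NULL && b != NULL);

#if defined(__SSE2__)
    /* Compare 16 bytes per iteration with unaligned loads. */
    while (len + 16 <= max_len) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + len));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + len));
        /* One mask bit per byte: 1 = bytes equal, 0 = bytes differ. */
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb));

        if (mask != 0xFFFFu) {
            /* Only scan for the bit position once a mismatch is known. */
            unsigned diff = ~mask & 0xFFFFu;
            size_t i = 0;
            while ((diff & 1u) == 0) {
                diff >>= 1;
                i++;
            }
            return len + i;
        }
        len += 16;
    }
#endif

    /* Scalar tail, and the whole job on non-SSE2 targets. */
    while (len < max_len && a[len] == b[len])
        len++;
    return len;
}
```

The setup cost of the vector loop is paid on every call, even when the
strings differ within the first few bytes, which is the kind of fixed
overhead that hurts on short matches.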

Maybe the USE_SSE part of amd64-match.S could be made faster with
slightly different coding (use xor instead of notw, don't use bsf until
a match is found), but I think the main problem cannot be fixed:
the downside shows with incompressible input, here a 440MB movie:
============ orig ===============
18.44 real
18.33 real
18.32 real
============ sse2 ===============
19.22 real
19.11 real
19.07 real

The fixed startup overhead is too high for short matches.

But maybe it is still useful for someone. For the good news, some other numbers:
Compressing linux-2.6.37.tar (412Meg -> 89Meg) with minigzip, user time

Intel Core i5 750
============ orig ===============
17.77 real
17.76 real
17.74 real
============ sse2 ===============
16.52 real
16.54 real
16.67 real

b2b: 1.0738

Intel Core2 4300
============ orig ===============
17.77 real
17.76 real
17.74 real
============ sse2 ===============
16.52 real
16.54 real
16.67 real

b2b: 1.0738

AMD Sempron 140
============ orig ===============
23.56 real
23.51 real
23.57 real
============ sse2 ===============
20.60 real
20.66 real
20.66 real

b2b: 1.1413

So a 7 to 14% speedup while compressing, at 1:4.6 compression, with the
whole code (read file, compress (LZ, Huffman), write out) in place.
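
The "b2b" figures above match the fastest original run divided by the
fastest SSE2 run, so the sketch below assumes b2b means "best to best".
The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest (fastest) time among n runs. */
static double best_of(const double *runs, size_t n)
{
    double best;
    size_t i;

    assert(runs != NULL && n > 0);
    best = runs[0];
    for (i = 1; i < n; i++) {
        if (runs[i] < best)
            best = runs[i];
    }
    return best;
}

/* Assumed meaning of "b2b": best original time / best SSE2 time.
 * A value above 1.0 means the SSE2 build was faster. */
static double best_to_best(const double *orig, size_t n_orig,
                           const double *sse2, size_t n_sse2)
{
    return best_of(orig, n_orig) / best_of(sse2, n_sse2);
}
```

With the Core i5 750 numbers this gives 17.74 / 16.52, about 1.0738, and
with the AMD numbers 23.51 / 20.60, about 1.1413.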
-------------- next part --------------
A non-text attachment was scrubbed...
Name: longest_match.patch
Type: text/x-patch
Size: 3005 bytes
Desc: not available
URL: <http://madler.net/pipermail/zlib-devel_madler.net/attachments/20110508/3bfd6a76/attachment.bin>
