List: zlib-devel
Subject: [Zlib-devel] [JustForFun][Patch] Throw SSE2 at longest_match()
From: kaffeemonster () googlemail ! com (Jan Seiffert)
Date: 2011-05-08 14:54:08
Message-ID: BANLkTin6OjZgAeKyw1nR7a9BSkJJ4BFRZw () mail ! gmail ! com
This is no serious patch, just a little hack put together in 1 hour.
We boost longest_match a little by giving it a better mousetrap^w memcmp.
Maybe something similar can be done for IBM Power6.
This is a very blunt method, so it isn't without downside, and this
patch is not meant to go anywhere.
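For readers without the attachment, the general idea of an SSE2-assisted compare can be sketched roughly as below. This is not the code from the patch: the function name match_length_sse2 is hypothetical, the intrinsics come from <emmintrin.h>, and __builtin_ctz assumes GCC or Clang. It compares 16 bytes per step and only looks for the bit position once a difference is known to exist.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper (not from the patch): number of leading bytes
 * that are equal in a and b, looking at no more than max_len bytes. */
static size_t match_length_sse2(const unsigned char *a,
                                const unsigned char *b,
                                size_t max_len)
{
    size_t len = 0;

    /* Compare 16 bytes at a time with unaligned loads. */
    while (len + 16 <= max_len) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + len));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + len));
        /* One mask bit per byte, set where the two bytes are equal. */
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb));
        /* Flip the low 16 bits so that a set bit marks a differing byte. */
        mask ^= 0xFFFFu;
        if (mask != 0) {
            /* Lowest set bit is the first differing byte in this chunk. */
            return len + (size_t)__builtin_ctz(mask);
        }
        len += 16;
    }

    /* Scalar loop for the remaining (fewer than 16) bytes. */
    while (len < max_len && a[len] == b[len])
        len++;

    return len;
}
```

The two loads, the compare and the mask extraction are paid on every call, even when the strings already differ in the first byte or two, which is one plausible reading of where a fixed per-call cost comes from.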
Maybe the USE_SSE part of amd64-match.S could be made faster with
slightly different coding (use xor instead of notw, don't use bsf until
a match is found), but I think the main problem cannot be fixed:
The downside shows with incompressible input, here a 440MB movie:
============ orig ===============
18.44 real
18.33 real
18.32 real
============ sse2 ===============
19.22 real
19.11 real
19.07 real
The fixed startup overhead is too high for short matches.
But maybe it is still useful for someone. For the good news, some other numbers:
Compressing linux-2.6.37.tar (412Meg -> 89Meg) with minigzip, user time
Intel Core i5 750
============ orig ===============
17.77 real
17.76 real
17.74 real
============ sse2 ===============
16.52 real
16.54 real
16.67 real
b2b: 1.0738
Intel Core2 4300
============ orig ===============
17.77 real
17.76 real
17.74 real
============ sse2 ===============
16.52 real
16.54 real
16.67 real
b2b: 1.0738
AMD Sempron 140
============ orig ===============
23.56 real
23.51 real
23.57 real
============ sse2 ===============
20.60 real
20.66 real
20.66 real
b2b: 1.1413
So a 7 to 14% speedup while compressing, at 1:4.6 compression, with the
whole code (read file, compress (LZ, Huffman), write out) in place.
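The "b2b" figures are not defined in the mail. They are consistent with a best-to-best ratio, i.e. the fastest original run divided by the fastest SSE2 run, and the sketch below assumes that reading. The helper names best_time and best_to_best are made up for illustration.

```c
#include <stddef.h>

/* Smallest value in an array of n timings (n must be at least 1). */
static double best_time(const double *t, size_t n)
{
    double best = t[0];
    for (size_t i = 1; i < n; i++) {
        if (t[i] < best)
            best = t[i];
    }
    return best;
}

/* Assumed meaning of "b2b": best original time over best patched time.
 * A value above 1.0 means the patched build was faster. */
static double best_to_best(const double *orig, size_t n_orig,
                           const double *patched, size_t n_patched)
{
    return best_time(orig, n_orig) / best_time(patched, n_patched);
}
```

With the Core i5 750 timings above (best runs 17.74 and 16.52) this gives about 1.0738, and with the Sempron timings (23.51 and 20.60) about 1.1413, which is where the 7 to 14% range comes from.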
-------------- next part --------------
A non-text attachment was scrubbed...
Name: longest_match.patch
Type: text/x-patch
Size: 3005 bytes
Desc: not available
URL: <http://madler.net/pipermail/zlib-devel_madler.net/attachments/20110508/3bfd6a76/attachment.bin>