List: dragonfly-bugs
Subject: [DragonFlyBSD - Bug #2824] New higher speed CRC code
From: bugtracker-admin () leaf ! dragonflybsd ! org
Date: 2015-06-09 13:04:55
Message-ID: redmine.journal-12667.20150609130455.3010929c3606fd17 () leaf ! dragonflybsd ! org
Issue #2824 has been updated by alexh.
It doesn't save any operation/instruction with an optimizing compiler.
Even though it should be obvious, just to back it up with some real generated code,
here are the critical loops of both versions (compiled with gcc -O3). The only
difference is a 1-byte saving in the encoding of the xor. No real savings, and really
no point in "optimizing" like that. The compiler does a better job :)

singletable_crc32c (original):
  10:   48 83 c6 01             add    $0x1,%rsi
  14:   89 c1                   mov    %eax,%ecx
  16:   c1 e8 08                shr    $0x8,%eax
  19:   32 4e ff                xor    -0x1(%rsi),%cl
  1c:   0f b6 c9                movzbl %cl,%ecx
  1f:   33 04 8d 00 00 00 00    xor    0x0(,%rcx,4),%eax
  26:   48 39 d6                cmp    %rdx,%rsi
  29:   75 e5                   jne    10 <singletable_crc32c+0x10>

singletable_crc32c_carey (proposed):
  40:   89 c1                   mov    %eax,%ecx
  42:   32 0e                   xor    (%rsi),%cl
  44:   48 83 c6 01             add    $0x1,%rsi
  48:   c1 e8 08                shr    $0x8,%eax
  4b:   0f b6 c9                movzbl %cl,%ecx
  4e:   33 04 8d 00 00 00 00    xor    0x0(,%rcx,4),%eax
  55:   48 39 d6                cmp    %rdx,%rsi
  58:   75 e6                   jne    40 <singletable_crc32c_carey+0x10>
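
For anyone who wants to reproduce the comparison, here is a minimal sketch that puts both loop forms in one file so they can be compiled with gcc -O3 and disassembled side by side. It makes a few assumptions that are not in the issue: the table is generated at run time from the reflected Castagnoli polynomial (the kernel file uses a precomputed table), init_crc32c_table() and crc32c_variants_agree() are hypothetical helpers, and the 0xFFFFFFFF initial value and final inversion follow the usual CRC-32C convention rather than anything stated in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: built at run time here; the kernel ships a static table. */
static uint32_t crc32Table[256];

/* Hypothetical helper: fill the table for the reflected CRC-32C polynomial. */
static void
init_crc32c_table(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;

		for (int bit = 0; bit < 8; bit++)
			c = (c & 1) ? (c >> 1) ^ 0x82F63B78u : (c >> 1);
		crc32Table[i] = c;
	}
}

/* Loop form from sys/libkern/icrc32.c: count down, post-increment pointer. */
static uint32_t
singletable_crc32c(uint32_t crc, const void *buf, size_t size)
{
	const uint8_t *p = buf;

	while (size--)
		crc = crc32Table[(crc ^ *p++) & 0xff] ^ (crc >> 8);
	return crc;
}

/* Loop form proposed in the issue: a single index variable. */
static uint32_t
singletable_crc32c_carey(uint32_t crc, const void *buf, size_t size)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i < size; ++i)
		crc = crc32Table[(crc ^ p[i]) & 0xff] ^ (crc >> 8);
	return crc;
}

/* Hypothetical helper: returns 1 if both forms give the same CRC for buf. */
static int
crc32c_variants_agree(const void *buf, size_t size)
{
	uint32_t a = singletable_crc32c(0xFFFFFFFFu, buf, size) ^ 0xFFFFFFFFu;
	uint32_t b = singletable_crc32c_carey(0xFFFFFFFFu, buf, size) ^ 0xFFFFFFFFu;

	return a == b;
}
```

Call init_crc32c_table() once before either function. Both forms perform the same table lookups in the same order, so they should return identical values for any input; the only open question is the generated code, which is what the listings above compare.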
----------------------------------------
Bug #2824: New higher speed CRC code
http://bugs.dragonflybsd.org/issues/2824#change-12667
* Author: robin.carey1
* Status: New
* Priority: Normal
* Assignee:
* Category:
* Target version:
----------------------------------------
Dear DragonFlyBSD bugs,
This isn't really a bug. I noticed there is the possibility of improving
the performance of the recently committed new CRC code ("fast iscsi crc
code").
In the following function:
sys/libkern/icrc32.c
<http://gitweb.dragonflybsd.org/dragonfly.git/blob/d557434b1f5510b6fed895379af444f0d034c07b:/sys/libkern/icrc32.c>
static uint32_t
singletable_crc32c(uint32_t crc, const void *buf, size_t size)
{
	const uint8_t *p = buf;

	while (size--)
		crc = crc32Table[(crc ^ *p++) & 0xff] ^ (crc >> 8);
	return crc;
}
The two separate operations of "size--" and "*p++" could be combined into
one operation. The way that I would do that would be something like:
...
	size_t i;

	for (i = 0; i < size; ++i) {
		crc = crc32Table[(crc ^ p[i]) & 0xff] ^ (crc >> 8);
	}
...
So you would be saving one operation per loop iteration, which should improve performance.
I haven't looked at the rest of the code, so perhaps there are other
performance improvements that could be had.
Hope this helps ...
--
Sincerely,
Robin Carey BSc