'[DragonFlyBSD - Bug #2824] New higher speed CRC code'


List:       dragonfly-bugs
Subject:    [DragonFlyBSD - Bug #2824] New higher speed CRC code
From:       bugtracker-admin () leaf ! dragonflybsd ! org
Date:       2015-06-09 13:04:55
Message-ID: redmine.journal-12667.20150609130455.3010929c3606fd17 () leaf ! dragonflybsd ! org

Issue #2824 has been updated by alexh.


It doesn't save any operation/instruction with an optimizing compiler.

Even though it should be obvious, just to back it up with some real generated code,
here are the critical loops of both versions (compiled with gcc -O3): first the
original singletable_crc32c, then the proposed singletable_crc32c_carey. Both loops
are eight instructions long; the only difference is a 1-byte saving in the encoding
of the first xor. No real savings, and really no point in "optimizing" like that.
The compiler does a better job :)

  10:   48 83 c6 01             add    $0x1,%rsi
  14:   89 c1                   mov    %eax,%ecx
  16:   c1 e8 08                shr    $0x8,%eax
  19:   32 4e ff                xor    -0x1(%rsi),%cl
  1c:   0f b6 c9                movzbl %cl,%ecx
  1f:   33 04 8d 00 00 00 00    xor    0x0(,%rcx,4),%eax
  26:   48 39 d6                cmp    %rdx,%rsi
  29:   75 e5                   jne    10 <singletable_crc32c+0x10>


  40:   89 c1                   mov    %eax,%ecx
  42:   32 0e                   xor    (%rsi),%cl
  44:   48 83 c6 01             add    $0x1,%rsi
  48:   c1 e8 08                shr    $0x8,%eax
  4b:   0f b6 c9                movzbl %cl,%ecx
  4e:   33 04 8d 00 00 00 00    xor    0x0(,%rcx,4),%eax
  55:   48 39 d6                cmp    %rdx,%rsi
  58:   75 e6                   jne    40 <singletable_crc32c_carey+0x10>
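
For anyone who wants to reproduce the comparison, here is a minimal sketch of a
standalone file holding both loops. It is not the kernel source: the functions are
made non-static so they survive in the object file, and crc32c_init_table() is a
hypothetical helper that fills crc32Table at run time using the reflected CRC-32C
polynomial 0x82F63B78, whereas icrc32.c uses a precomputed table. The name
singletable_crc32c_carey is taken from the listing above.

```c
#include <stddef.h>
#include <stdint.h>

/* Lookup table for reflected CRC-32C; filled by crc32c_init_table(). */
static uint32_t crc32Table[256];

/* Hypothetical helper (assumption): build the table at run time. */
void
crc32c_init_table(void)
{
	for (uint32_t n = 0; n < 256; n++) {
		uint32_t c = n;

		/* Process one bit at a time, least significant bit first. */
		for (int k = 0; k < 8; k++)
			c = (c & 1) ? (c >> 1) ^ 0x82F63B78u : c >> 1;
		crc32Table[n] = c;
	}
}

/* Loop as committed: decrementing counter plus post-incremented pointer. */
uint32_t
singletable_crc32c(uint32_t crc, const void *buf, size_t size)
{
	const uint8_t *p = buf;

	while (size--)
		crc = crc32Table[(crc ^ *p++) & 0xff] ^ (crc >> 8);

	return crc;
}

/* Loop as proposed in the report: a single index variable. */
uint32_t
singletable_crc32c_carey(uint32_t crc, const void *buf, size_t size)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i < size; ++i)
		crc = crc32Table[(crc ^ p[i]) & 0xff] ^ (crc >> 8);

	return crc;
}
```

Assuming the file is saved as crc.c (a made-up name), "gcc -O3 -c crc.c" followed by
"objdump -d crc.o" should print both loops for comparison. The exact instructions
will vary with compiler version and target.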

----------------------------------------
Bug #2824: New higher speed CRC code
http://bugs.dragonflybsd.org/issues/2824#change-12667

* Author: robin.carey1
* Status: New
* Priority: Normal
* Assignee: 
* Category: 
* Target version: 
----------------------------------------
Dear DragonFlyBSD bugs,

This isn't really a bug. I noticed there is the possibility of improving
the performance of the recently committed new CRC code ("fast iscsi crc
code").

In the following function:

sys/libkern/icrc32.c
<http://gitweb.dragonflybsd.org/dragonfly.git/blob/d557434b1f5510b6fed895379af444f0d034c07b:/sys/libkern/icrc32.c>


static uint32_t
singletable_crc32c(uint32_t crc, const void *buf, size_t size)
{
       const uint8_t *p = buf;


       while (size--)
               crc = crc32Table[(crc ^ *p++) & 0xff] ^ (crc >> 8);

       return crc;
}

The two separate operations of "size--" and "*p++" could be combined into
one operation. The way that I would do that would be something like:

...
size_t i;
for (i = 0; i < size; ++i) {
  crc = crc32Table[(crc ^ p[i]) & 0xff] ^ (crc >> 8);
}
...

So you would be saving one operation per loop iteration, which should improve
performance.

I haven't looked at the rest of the code, so perhaps there are other
performance improvements that could be had.

Hope this helps ...




-- 
Sincerely,

Robin Carey BSc




