List:       glibc-alpha
Subject:    Example of optimized strlen
From:       Bonz <bonzini () gnu ! org>
Date:       2001-02-28 10:42:58

I attach a fast strlen that I wrote, along with a commented copy of the
strlen from glibc's CVS repository.  The comments include cycle counts and
highlight three partial register stalls.

Here are the results for a Pentium.  Counting clocks for the P6 is
difficult, but keep in mind that on the P6 up to 12 clocks are lost to
partial register stalls in the finalization, and that *each* iteration
of the inner loop loses 6 clocks to the other stall.  I am considering
neither cache misses nor branch mispredictions.

phase (Pentium clocks)                  my strlen    glibc strlen
---------------------------------------------------------------------
startup if aligned                          2              2
startup if misaligned (worst case)          7             12
---------------------------------------------------------------------
inner loop                                  n           1.25*n
---------------------------------------------------------------------
finalization (worst case)                   9              9
---------------------------------------------------------------------

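My version has a lower startup cost in the misaligned case (7 vs. 12
clocks) and a faster inner loop (n vs. 1.25*n clocks).  The aligned
startup and the worst-case finalization cost the same in both.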
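
For readers without the attachments, here is a rough C sketch of the three
phases timed above (startup, inner loop, finalization).  It is not the
attached strlen.S and not glibc's code; it only illustrates the general
word-at-a-time approach.  The name sketch_strlen, the 4-byte word size and
the zero-byte test used here are assumptions made for the illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration of a word-at-a-time strlen.
 * Like the assembly versions, it may read up to 3 bytes past the
 * terminator, but only within the same aligned 4-byte word. */
size_t sketch_strlen(const char *s)
{
    const char *p = s;

    /* Startup: advance byte by byte until p is 4-byte aligned.
     * If s is already aligned, this loop body never runs. */
    while (((uintptr_t)p & 3) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Inner loop: examine four bytes per iteration. */
    for (;;) {
        uint32_t w;
        memcpy(&w, p, sizeof w);    /* aligned 32-bit load */

        /* Nonzero exactly when at least one byte of w is zero. */
        if ((w - 0x01010101u) & ~w & 0x80808080u)
            break;
        p += 4;
    }

    /* Finalization: the word at p contains the terminator, so this
     * loop stops within at most three steps. */
    while (*p != '\0')
        p++;

    return (size_t)(p - s);
}
```

Calling sketch_strlen on a string that starts at an odd address exercises
the misaligned startup path; a string that starts on a 4-byte boundary goes
straight to the inner loop.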

(My strlen has no support for bounded pointers yet.)

Paolo
["strlen.S" (application/x-unknown-content-type-txtfile)]
["glibc-strlen.S" (application/x-unknown-content-type-txtfile)]
