List:       freebsd-amd64
Subject:    Re: libc assembly optimizations?
From:       James Van Artsdalen <james-freebsd-amd64 () jrv ! org>
Date:       2003-12-30 10:16:29
Message-ID: 200312301016.hBUAGT4Q085640 () bigtex ! jrv ! org

Here's an alternative for fabs (3):

ENTRY(fabs)
	psllq	$1,%xmm0	/* 64-bit shift left discards the sign bit */
	psrlq	$1,%xmm0	/* logical shift right clears sign */
	ret

/usr/src/lib/libc/amd64/gen/fabs.S currently uses the code below,
and gcc generates essentially the same code.
The shift version above seems to work and looks better to me.
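
For anyone who wants to check the shift trick without assembling
anything, here is a rough C sketch of the same idea. It assumes an
IEEE 754 double with the sign in bit 63, and fabs_by_shifts is a
made-up name for illustration, not anything in libc.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libc): clear the sign bit of a
 * double with the same shift pair as psllq/psrlq above.
 * Assumes a 64-bit IEEE 754 double.
 */
static double
fabs_by_shifts(double x)
{
	uint64_t bits;

	memcpy(&bits, &x, sizeof(bits));	/* copy out the raw bit pattern */
	bits <<= 1;				/* sign bit falls off the top */
	bits >>= 1;				/* logical shift brings in a zero */
	memcpy(&x, &bits, sizeof(x));		/* copy the pattern back */
	return (x);
}
```

The exponent and mantissa bits end up where they started, so only
the sign changes; -0.0 comes back as +0.0.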

The string ops can be significantly improved if they are allowed to
read extra bytes around the string but within the same 16-byte
paragraph as the start or end of the string.  This seems safe in
userland.
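
The reasoning behind "seems safe" can be sketched in C. This assumes
memory protection is page-granular and that the page size is a
multiple of 16 (4096 is used here as an assumption); the names
para_base and para_within_page are made up for illustration. An
aligned 16-byte paragraph can then never straddle a page boundary,
so reading the whole paragraph touches no page that the string
itself does not touch.

```c
#include <stdint.h>

#define PARA_SIZE		16u	/* 16-byte paragraph */
#define PAGE_SIZE_ASSUMED	4096u	/* assumption: 4 KB pages */

/* Round an address down to the start of its 16-byte paragraph. */
static uintptr_t
para_base(uintptr_t addr)
{
	return (addr & ~(uintptr_t)(PARA_SIZE - 1));
}

/*
 * Return 1 if the first and last byte of the paragraph holding addr
 * lie in the same page as addr itself, 0 otherwise.
 */
static int
para_within_page(uintptr_t addr)
{
	uintptr_t first = para_base(addr);
	uintptr_t last = first + PARA_SIZE - 1;
	uintptr_t page = addr / PAGE_SIZE_ASSUMED;

	return (first / PAGE_SIZE_ASSUMED == page &&
	    last / PAGE_SIZE_ASSUMED == page);
}
```

This only argues that such a read cannot fault; the extra bytes
still have to be masked off before they are used.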

Finally, can the SSE2 regs be safely used in kernel mode?
Page fill and aligned-bulk bcopy calls can be improved this way.

/*
 * Ok, this sucks. Is there really no way to push an xmm register onto
 * the FP stack directly?
 */

ENTRY(fabs)
	movsd	%xmm0, -8(%rsp)
	fldl	-8(%rsp)
	fabs
	fstpl	-8(%rsp)
	movsd	-8(%rsp),%xmm0
	ret