
List:       gcc-bugs
Subject:    [Bug target/79748] [Enhancement] no_callee_saved_registers function attribute (on x86)
From:       "katsunori.kumatani at gmail dot com" <gcc-bugzilla () gcc ! gnu ! org>
Date:       2017-02-28 21:44:03
Message-ID: bug-79748-4-QqUtYTgS2F () http ! gcc ! gnu ! org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79748

--- Comment #3 from Katsunori Kumatani <katsunori.kumatani at gmail dot com> ---
Well, I remembered that I left out another simple situation in which it would
help. Because this attribute does not exist right now, I'll have to show you
the output code as it could look if it did exist.

Let's say we have something like the following code (with that attribute).
This time, however, we don't use 'rbx' in 'bar' at all, but we call 'foo'
multiple times:

#include <stdio.h>

static __attribute__((noinline,no_callee_saved_registers)) int foo(int a)
{
  asm("incl %0":"+b"(a));  // use ebx just to demonstrate
  return a;
}

void bar(int x)
{
  printf("%d", foo(x));
  printf("%d", foo(x));
  printf("%d", foo(x));
}

I didn't run this through an actual compiler because the attribute is
obviously "imagined". Let's ignore any optimizations GCC could apply to the
printf calls and assume that all 3 calls are made.

So, with that attribute, the generated code would look like this (pseudo
assembly):


foo(int):
        movl    %edi, %ebx     # assume the attribute works, so
        incl    %ebx           # it doesn't save ebx at all
        movl    %ebx, %eax
        ret

bar(int):
        pushq   %rbx           # even though bar does NOT use rbx,
        call    foo(int)       # it gets saved because 'foo' clobbers it
                               # just as if you had clobbered it in asm
        [ ... ]                # this here is the printf's and other calls
                               # to 'foo', not important...

        popq    %rbx           # restores it here due to ABI
        ret


It might not be obvious at first glance why the above code is better, but
think about it. 'rbx' gets saved only *once*, in the outer function 'bar'
(which must preserve it for its own caller to comply with the ABI, even though
'bar' itself does not use it), instead of in the inner function 'foo', which
gets called 3 times.
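
For comparison, here is a baseline sketch that should compile with a current
GCC: the same example without the proposed attribute. The names
'foo_baseline', 'bar_baseline' and 'self_check_baseline' are made up for
illustration, and the inline asm is guarded so that non-x86-64 targets fall
back to a plain increment. Compiling it with something like  gcc -O2 -S
should show 'foo_baseline' pushing and popping rbx itself on every call, which
is the cost the attribute is meant to move into the caller.

```c
#include <assert.h>
#include <stdio.h>

/* Baseline: same shape as 'foo' above, but WITHOUT the proposed attribute,
   so the callee has to preserve rbx itself under the normal ABI rules. */
__attribute__((noinline)) int foo_baseline(int a)
{
#if defined(__x86_64__)
  __asm__("incl %0" : "+b"(a));  /* force 'a' into ebx, as in the example */
#else
  a += 1;                        /* portable fallback, assumption for non-x86-64 */
#endif
  return a;
}

/* Calls the callee three times, like 'bar' above. */
void bar_baseline(int x)
{
  printf("%d", foo_baseline(x));
  printf("%d", foo_baseline(x));
  printf("%d", foo_baseline(x));
}

/* Sanity check: the asm is just an increment. */
void self_check_baseline(void)
{
  assert(foo_baseline(41) == 42);
}
```
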

Of course, the effect grows as more callee-saved registers are used
(especially on x86-64) and as more functions are involved. I'm not saying it's
"crucial" for performance (most little things aren't, but they tend to add
up), but if this attribute is easy to implement (I hope it is?), it might as
well be added. I'm only trying to justify its inclusion in case that's the
only thing stopping it from being accepted. :)

The reason 'bar' saves rbx is that 'bar' does NOT have the attribute, so the
ABI requires it to preserve rbx. 'foo' does have the attribute, so it does not
save rbx at all. Thanks to ipa-ra, 'bar' can see that rbx is the only
callee-saved register 'foo' clobbers, just as it would for any other register.
Otherwise 'bar' would have to save ALL ABI callee-saved registers, because it
would have to assume that 'foo' clobbers them all due to the attribute.

This attribute would not generate invalid code anyway, as long as the caller
knows the function has the attribute (which is true of almost any attribute or
calling convention). But without the IPA-RA optimization, the caller would
have to assume all registers are dead and clobbered after the call, as if you
had written  asm volatile("":::"memory", [...clobber all regs...]);  which is
not a good idea if the attribute is abused on "externally visible" functions.
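
To make the clobber comparison concrete, here is a minimal sketch of such a
barrier written out by hand. The macro name 'CLOBBER_CALLEE_SAVED' and the
function names are hypothetical. The register list is my assumption of the
SysV x86-64 callee-saved set minus rbp, not an exhaustive "all regs" list, and
other targets only get the memory clobber.

```c
#include <assert.h>

/* Hypothetical helper (not from the bug report): tells the compiler that
   memory and the SysV x86-64 callee-saved registers are clobbered here.
   rbp is left out because GCC can reject it as a clobber when it is in use
   as the frame pointer. */
#if defined(__x86_64__)
#define CLOBBER_CALLEE_SAVED() \
  __asm__ volatile("" ::: "memory", "rbx", "r12", "r13", "r14", "r15")
#else
/* Fallback for other targets: only a compiler memory barrier. */
#define CLOBBER_CALLEE_SAVED() __asm__ volatile("" ::: "memory")
#endif

/* A function containing the barrier has to save and restore the listed
   registers itself to stay ABI-compliant towards its own caller. */
int add_across_barrier(int a, int b)
{
  int sum = a + b;
  CLOBBER_CALLEE_SAVED();  /* empty asm template, only constraints */
  return sum + a;
}

/* Sanity check: the barrier must not change the result. */
void self_check_barrier(void)
{
  assert(add_across_barrier(2, 3) == 7);
}
```

On x86-64, the prologue of 'add_across_barrier' should end up saving every
listed register, which is roughly what a caller of an attributed function
would have to do at each call site if it could not see which registers the
callee really clobbers.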

Of course, that applies only without the optimization in place. Since we
already have it, we can take advantage of it here, which gives GCC more
freedom to generate optimal code.

This mostly applies to inner functions, though, and to those that are only
visible within the project and don't interface with the outside world. (Using
it on externally visible functions is not "wrong", however: nothing would
crash as long as the declaration is marked with the attribute. It's just not a
good idea for code generation.)

But like any attribute, it has to be used responsibly by the programmer. :)