List:       busybox
Subject:    FAST_FUNC not working well with LTO (Link Time Optimization)
From:       Kang-Che Sung <explorer09 () gmail ! com>
Date:       2016-12-25 10:20:17
Message-ID: CADDzAfP4WgLbR2Xu9vuSJbnbRsoDrfeQYs=XjZ9wsLj51a5yMw () mail ! gmail ! com

Busybox uses the FAST_FUNC macro to tweak the IA-32 calling convention in
order to make function calls slightly smaller or slightly faster.
However, when I experimented with GCC's LTO (Link Time Optimization), I
discovered that FAST_FUNC can hinder LTO's optimization, so that the
resulting executable becomes a few bytes larger than one compiled
without FAST_FUNC.
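
For readers unfamiliar with the macro, here is a minimal sketch of what
FAST_FUNC does. The attribute list is the one from include/platform.h;
the __i386__ test and the names sum3, sum3_fn and call_sum3 are
assumptions made up for illustration and are not Busybox code.

```c
/* Assumption: same attribute as include/platform.h, but keyed on the
 * compiler-defined __i386__ rather than the header's own checks. */
#if defined(__GNUC__) && defined(__i386__)
# define FAST_FUNC __attribute__((regparm(3),stdcall))
#else
# define FAST_FUNC /* other targets keep their default ABI */
#endif

/* Hypothetical internal helper. On IA-32, regparm(3) passes the three
 * arguments in EAX, EDX and ECX instead of on the stack. */
int FAST_FUNC sum3(int a, int b, int c);

int FAST_FUNC sum3(int a, int b, int c)
{
	return a + b + c;
}

/* A pointer to such a function has to carry the same attribute,
 * otherwise the call through the pointer would use the wrong ABI. */
typedef int FAST_FUNC (*sum3_fn)(int, int, int);

int call_sum3(sum3_fn fn)
{
	return fn(1, 2, 3);
}
```

Calling call_sum3(sum3) gives the same result on every target; only the
generated call sequence differs on IA-32.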

Although I can comment out the FAST_FUNC lines in include/platform.h to
achieve the level of optimization I want, may I suggest a way for users
to disable FAST_FUNC conveniently?

For example, let me specify CONFIG_EXTRA_CFLAGS="-DFAST_FUNC= -flto"
so that I can compile with LTO without a source code hack. It seems that
GCC does not yet provide a macro or any other way to detect LTO in code,
so this is the best suggestion I have.
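
The override relies on an #ifndef guard around the default definition.
Here is a minimal, standalone sketch of that idea. The helpers STRINGIFY,
fast_func_expansion and twice are hypothetical names used only to make
the effect visible, and the __i386__ test is an assumption standing in
for the header's own condition.

```c
/* Supply a default only when the build has not already defined
 * FAST_FUNC, e.g. with -DFAST_FUNC= on the compiler command line. */
#ifndef FAST_FUNC
# if defined(__GNUC__) && defined(__i386__)
#  define FAST_FUNC __attribute__((regparm(3),stdcall))
# else
#  define FAST_FUNC
# endif
#endif

/* Two-level stringification so that FAST_FUNC is expanded first. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

/* Returns what FAST_FUNC expanded to in this build ("" if empty). */
const char *fast_func_expansion(void)
{
	return STRINGIFY(FAST_FUNC);
}

/* Hypothetical function; its source is the same with or without
 * the override. */
int FAST_FUNC twice(int x)
{
	return 2 * x;
}
```

Building this file with -DFAST_FUNC= makes fast_func_expansion() return
an empty string on any target, because the guard skips the default
definition; without the flag, an IA-32 build keeps the attribute.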

The change would be something like the patch below. I would appreciate
comments on this problem and on my suggestion.

Kang-Che Sung ("Explorer")

diff --git a/include/platform.h b/include/platform.h
index c987d418c..7e537b950 100644
--- a/include/platform.h
+++ b/include/platform.h
@@ -108,13 +108,19 @@
  * and/or smaller by using modified ABI. It is usually only needed
  * on non-static, busybox internal functions. Recent versions of gcc
  * optimize statics automatically. FAST_FUNC on static is required
- * only if you need to match a function pointer's type */
-#if __GNUC_PREREQ(3,0) && defined(i386) /* || defined(__x86_64__)? */
+ * only if you need to match a function pointer's type.
+ * FAST_FUNC may not work well with -flto so allow user to disable this.
+ * (-DFAST_FUNC= ) */
+#ifndef FAST_FUNC
+# if __GNUC_PREREQ(3,0) && defined(i386)
 /* stdcall makes callee to pop arguments from stack, not caller */
-# define FAST_FUNC __attribute__((regparm(3),stdcall))
+#  define FAST_FUNC __attribute__((regparm(3),stdcall))
 /* #elif ... - add your favorite arch today! */
-#else
-# define FAST_FUNC
+/* x86_64 doesn't need this - its ABI can't be tweaked like IA-32 (can't use
+ * stdcall; the ABI uses 6 regparms already). */
+# else
+#  define FAST_FUNC
+# endif
 #endif

 /* Make all declarations hidden (-fvisibility flag only affects definitions) */
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox