[prev in list] [next in list] [prev in thread] [next in thread] 

List:       gcc
Subject:    Re: increase alignment of global structs in increase_alignment pass
From:       Prathamesh Kulkarni <prathamesh.kulkarni () linaro ! org>
Date:       2016-02-23 16:31:37
Message-ID: CAAgBjMkWYk9XKkugc1SXPqW2cWVWJ2MsT5Pb=QEGCxLgw9rC7A () mail ! gmail ! com
[Download RAW message or body]

On 23 February 2016 at 17:31, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 23 Feb 2016, Prathamesh Kulkarni wrote:
>
>> On 22 February 2016 at 17:36, Richard Biener <rguenther@suse.de> wrote:
>> > On Mon, 22 Feb 2016, Prathamesh Kulkarni wrote:
>> >
>> >> Hi Richard,
>> >> As discussed in private mail, this version of patch attempts to
>> >> increase alignment
>> >> of global struct decl if it contains an an array field(s) and array's
>> >> offset is a multiple of the alignment of vector type corresponding to
>> >> it's scalar type and recursively checks for nested structs.
>> >> eg:
>> >> static struct
>> >> {
>> >>   int a, b, c, d;
>> >>   int k[4];
>> >>   float f[10];
>> >> };
>> >> k is a candidate array since it's offset is 16 and alignment of
>> >> "vector (4) int" is 8.
>> >> Similarly for f.
>> >>
>> >> I haven't been able to create a test-case where there are
>> >> multiple candidate arrays and vector alignment of arrays are different.
>> >> I suppose in this case we will have to increase alignment
>> >> of the struct by the max alignment ?
>> >> eg:
>> >> static struct
>> >> {
>> >>   <fields>
>> >>   T1 k[S1]
>> >>   <fields>
>> >>   T2 f[S2]
>> >>   <fields>
>> >> };
>> >>
>> >> if V1 is vector type corresponding to T1, and V2 corresponding vector
>> >> type to T2,
>> >> offset (k) % align(V1) == 0 and offset (f) % align(V2) == 0
>> >> and align (V1) > align(V2) then we will increase alignment of struct
>> >> by align(V1).
>> >>
>> >> Testing showed FAIL for g++.dg/torture/pr31863.C due to program timeout.
>> >> Initially it appeared to me, it went in infinite loop. However
>> >> on second thoughts I think it's probably not an infinite loop, rather
>> >> taking (extraordinarily) large amount of time
>> >> to compile the test-case with the patch.
>> >> The test-case  builds quickly for only 2 instantiations of ClassSpec
>> >> (ClassSpec<Key, A001, 1>,
>> >>  ClassSpec<Key, A002, 2>)
>> >> Building with 22 instantiations (upto ClassSpec<Key, A0023, 22>) takes up
>> >> to ~1m to compile.
>> >> with:
>> >> 23  instantiations: ~2m
>> >> 24 instantiations: ~5m
>> >> For 30 instantiations I terminated cc1plus after 13m (by SIGKILL).
>> >>
>> >> I guess it shouldn't go in an infinite loop because:
>> >> a) structs cannot have circular references.
>> >> b) works for lower number of instantiations
>> >> However I have no sound evidence that it cannot be in infinite loop.
>> >> I don't understand why a decl node is getting visited more than once
>> >> for that test-case.
>> >>
>> >> Using a hash_map to store alignments of decl's so that decl node gets visited
>> >> only once prevents the issue.
>> >
>> > Maybe aliases.  Try not walking vnode->alias == true vars.
>> Hi,
>> I have a hypothesis why decl node gets visited multiple times.
>>
>> Consider the test-case:
>> template <typename T, unsigned N>
>> struct X
>> {
>>   T a;
>>   virtual int foo() { return N; }
>> };
>>
>> typedef struct X<int, 1> x_1;
>> typedef struct X<int ,2> x_2;
>>
>> static x_1 obj1 __attribute__((used));
>> static x_2 obj2 __attribute__((used));
>>
>> Two additional structs are created by C++FE, c++filt shows:
>> _ZTI1XIiLj1EE -> typeinfo for X<int, 1u>
>> _ZTI1XIiLj2EE -> typeinfo for X<int, 2u>
>>
>> Both of these structs have only one field D.2991 and it appears it's
>> *shared* between them:
>>  struct  D.2991;
>>     const void * D.2980;
>>     const char * D.2981;
>>
>> Hence the decl node D.2991 and it's fields (D.2890, D.2981) get visited twice:
>> once when walking _ZTI1XIiLj1EE and 2nd time when walking _ZTI1XIiLj2EE
>>
>> Dump of walking over the global structs for above test-case:
>> http://pastebin.com/R5SABW0c
>>
>> So it appears to me to me a DAG (interior node == struct decl, leaf ==
>> non-struct field,
>> edge from node1 -> node2 if node2 is field of node1) is getting
>> created when struct decl
>> is a type-info object.
>>
>> I am not really clear on how we should proceed:
>> a) Keep using hash_map to store alignments so that every decl gets
>> visited only once.
>> b) Skip walking artificial record decls.
>> I am not sure if skipping all artificial struct decls would be a good
>> idea, but I don't
>> think it's possible to identify if a struct-decl is typeinfo struct at
>> middle-end ?
>
> You shouldn't end up walking those when walking the type of
> global decls.  That is, don't walk typeinfo decls - yes, practically
> that means just not walking DECL_ARTIFICIAL things.
Hi,
I have done the changes in the patch (attached) and cross-tested
on arm*-*-* and aarch64*-*-* without failures.
Is it OK for stage-1 ?

Thanks,
Prathamesh
>
> Richard.

["patch.diff" (text/plain)]

diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 2b25b45..a6af535 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -794,6 +794,75 @@ make_pass_slp_vectorize (gcc::context *ctxt)
      This should involve global alignment analysis and in the future also
      array padding.  */

+static unsigned get_vec_alignment_for_decl (tree);
+
+static unsigned
+get_vec_alignment_for_array_decl (tree array_decl)
+{
+  tree type = TREE_TYPE (array_decl);
+  gcc_assert (TREE_CODE (type) == ARRAY_TYPE);
+
+  tree vectype = get_vectype_for_scalar_type (strip_array_types (type));
+  return (vectype) ? TYPE_ALIGN (vectype) : 0;
+}
+
+static unsigned
+get_vec_alignment_for_record_decl (tree record_decl)
+{
+  tree type = TREE_TYPE (record_decl);
+  gcc_assert (TREE_CODE (type) == RECORD_TYPE);
+  unsigned max_align = 0, alignment;
+  HOST_WIDE_INT offset;
+
+  if (DECL_ARTIFICIAL (record_decl) || TYPE_PACKED (type))
+    return 0;
+
+  for (tree field = first_field (type);
+       field != NULL_TREE;
+       field = DECL_CHAIN (field))
+    {
+      /* C++FE puts node "._0" of code TYPE_DECL. skip that.  */
+      if (TREE_CODE (field) != FIELD_DECL)
+	continue;
+
+      offset = int_byte_position (field);
+      alignment = get_vec_alignment_for_decl (field);
+      if (alignment
+	  && (offset % (alignment / BITS_PER_UNIT) == 0)
+	  && (alignment > max_align))
+	max_align = alignment;
+    }
+
+  return max_align;
+}
+
+static unsigned
+get_vec_alignment_for_decl (tree decl)
+{
+  if (decl == NULL_TREE)
+    return 0;
+
+  gcc_assert (DECL_P (decl));
+
+  static unsigned alignment = 0;
+  tree type = TREE_TYPE (decl);
+
+  switch (TREE_CODE (type))
+    {
+      case ARRAY_TYPE:
+	alignment = get_vec_alignment_for_array_decl (decl);
+	break;
+      case RECORD_TYPE:
+	alignment = get_vec_alignment_for_record_decl (decl);
+	break;
+      default:
+	alignment = 0;
+	break;
+    }
+
+  return (alignment > DECL_ALIGN (decl)) ? alignment : 0;
+}
+
 static unsigned int
 increase_alignment (void)
 {
@@ -804,23 +873,14 @@ increase_alignment (void)
   /* Increase the alignment of all global arrays for vectorization.  */
   FOR_EACH_DEFINED_VARIABLE (vnode)
     {
-      tree vectype, decl = vnode->decl;
-      tree t;
+      tree decl = vnode->decl;
       unsigned int alignment;

-      t = TREE_TYPE (decl);
-      if (TREE_CODE (t) != ARRAY_TYPE)
-        continue;
-      vectype = get_vectype_for_scalar_type (strip_array_types (t));
-      if (!vectype)
-        continue;
-      alignment = TYPE_ALIGN (vectype);
-      if (DECL_ALIGN (decl) >= alignment)
-        continue;
+      alignment = get_vec_alignment_for_decl (decl);

-      if (vect_can_force_dr_alignment_p (decl, alignment))
+      if (alignment && vect_can_force_dr_alignment_p (decl, alignment))
         {
-	  vnode->increase_alignment (TYPE_ALIGN (vectype));
+	  vnode->increase_alignment (alignment);
           dump_printf (MSG_NOTE, "Increasing alignment of decl: ");
           dump_generic_expr (MSG_NOTE, TDF_SLIM, decl);
           dump_printf (MSG_NOTE, "\n");

["ChangeLog" (application/octet-stream)]

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic