[prev in list] [next in list] [prev in thread] [next in thread]
List: gcc
Subject: Re: Validity of SUBREG+AND-imm transformations
From: Kyrill Tkachov <kyrylo.tkachov () foss ! arm ! com>
Date: 2016-02-29 10:51:24
Message-ID: 56D422AC.2000703 () foss ! arm ! com
[Download RAW message or body]
Hi Jeff,
On 26/02/16 21:24, Jeff Law wrote:
> On 02/26/2016 06:40 AM, Kyrill Tkachov wrote:
> > Hi all,
> >
> > I'm looking at a case where some RTL passes create an RTL expression of
> > the form:
> > (subreg:QI (and:SI (reg:SI x1)
> > (const_int 31)) 0)
> >
> > which I'd like to simplify to:
> > (and:QI (subreg:QI (reg:SI x1) 0)
> > (const_int 31))
> I can think of cases where the first is better and other cases where the second is \
> better -- a lot depends on context. I don't have a good sense for which is better \
> in general.
> Note that as-written these don't trigger the subtle issues in what happens with \
> upper bits. That's more for extensions.
> (subreg:SI (whatever:QI))
>
> vs
>
> {zero,sign}_extend:SI (whatever:QI))
>
> vs
>
> (and:SI (subreg:SI (whatever:QI) (const_int 0x255)))
>
>
> The first leave the bits beyond QI as "undefined" and sometimes (but I doubt all \
> that often in practice) the compiler will use the undefined nature of those bits to \
> enable optimizations.
>
> The second & 3rd variants crisply define the upper bits.
>
Thanks for the explanation.
>
> >
> > It's easy enough to express in RTL but I'm trying to convince myself on
> > its validity.
> > I know there are some subtle points in this area. combine_simplify_rtx
> > in combine.c
> > has a comment:
> > /* Note that we cannot do any narrowing for non-constants since
> > we might have been counting on using the fact that some bits were
> > zero. We now do this in the SET. */
> That comment makes no sense. Unfortunately it goes back to a change from Kenner in \
> 1994 -- which predates having patch discussions here and consistently adding tests \
> to the testsuite.
> The code used to do this:
>
>
> - if (GET_MODE_CLASS (mode) == MODE_INT
> - && GET_MODE_CLASS (GET_MODE (SUBREG_REG (x))) == MODE_INT
> - && GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)))
> - && subreg_lowpart_p (x))
> - return force_to_mode (SUBREG_REG (x), mode, GET_MODE_MASK (mode),
> - NULL_RTX, 0);
>
> Which appears to check that we've got a narrowing subreg expression, and if we do \
> try to force the SUBREG_REG into the right mode using force_to_mode.
> But if we had a narrowing SUBREG_REG, then I can't see how anything would have been \
> dependign on the upper bits being zero.
>
> >
> > and if I try to implement this transformation in simplify_subreg from
> > simplify-rtx.c
> > I get some cases where combine goes into an infinite recursion in
> > simplify_comparison
> > because it tries to do:
> >
> > /* If this is (and:M1 (subreg:M1 X:M2 0) (const_int C1)) where C1
> > fits in both M1 and M2 and the SUBREG is either paradoxical
> > or represents the low part, permute the SUBREG and the AND
> > and try again. */
> Right. I think you just end up ping-ponging between the two equivalent \
> representations. Which may indeed argue that the existing representation is \
> preferred and we should look deeper into why the existing representation isn't \
> being handled as well as it should be.
>
After a bit more experimentation I found it's easy enough to avoid looping on this \
spot. Just check if the gen_lowpart part of that transformation returned something \
different from the original expression...
> >
> > Performing this transformation would help a lot with recognition of some
> > patterns that
> > I'm working on, so would it be acceptable to teach combine or
> > simplify-rtx to do this?
> How does it help recognition? What kinds of patterns are you trying to recognize?
>
On aarch64 the normal integer-side variable shift/rotate instructions truncate their \
shift amount. However, we can't enable SHIFT_COUNT_TRUNCATED by default on aarch64 \
because some of the alternatives in our shift patterns use the vector shift \
instructions, which don't truncate the shift amount. So I'm trying to add a combine \
pattern to eliminate redundant truncations where possible when using the normal shift \
instructions.
So I'm trying to create a define_insn to match something like:
[(set (match_operand:SI 0 "register_operand" "=r")
(ashift:SI
(match_operand:SI 1 "register_operand" "r")
(and:QI
(match_operand:QI 2 "register_operand" "r")
(match_operand:QI 3 "const_int_operand" "n"))))]
where operands[3] is 31 for SImode. The 'and' expression has to be in QImode because \
our shift expanders expand the shift amount to QImode.
However, when combine tries to combine the and-mask operation with the shift, for \
example from the code: unsigned f1(unsigned x, int y) { return x << (y & 31); }
the expression it tries to match is:
(set (reg/i:SI 0 x0)
(ashift:SI (reg:SI 0 x0 [ x ])
(subreg:QI (and:SI (reg:SI 1 x1 [ y ])
(const_int 31 [0x1f])) 0)))
That is, the subreg is not propagated inside. I think the define_insn pattern for \
that would look somewhat unnatural, so I'm trying to get the subreg to be propagated \
to the reg inside.
Thanks,
Kyrill
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic