'Re: RFR: 8282365: Consolidate and improve division by constant idealizations [v47]'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-hotspot-compiler-dev
Subject:    Re: RFR: 8282365: Consolidate and improve division by constant idealizations [v47]
From:       Kim Barrett <kbarrett () openjdk ! org>
Date:       2024-01-31 19:20:17
Message-ID: v8cl1oPhCegBt2GcpaMcm6CsqiSFjBlF7QJ9B5oy34c=.f074820e-99f6-40d2-ad18-976f98d169ad () github ! com
[Download RAW message or body]

On Wed, 31 Jan 2024 17:32:42 GMT, Quan Anh Mai <qamai@openjdk.org> wrote:

> > This patch implements idealisation for unsigned divisions to change a division by \
> > a constant into a series of multiplication and shift. I also change the \
> > idealisation of `DivI` to get a more efficient series when the magic constant \
> > overflows an int32. 
> > In general, the idea behind a signed division transformation is that for a \
> > positive constant `d`, we would need to find constants `c` and `m` so that: 
> > floor(x / d) = floor(x * c / 2**m) for 0 < x < 2**(N - 1) (1)
> > ceil(x / d) = floor(x * c / 2**m) + 1 for -2**(N - 1) <= x < 0 (2)
> > 
> > The implementation in the original book takes into consideration that the machine \
> > may not be able to perform the full multiplication `x * c`, so the constant \
> > overflow and we need to add back the dividend as in `DivLNode::Ideal` cases. \
> > However, for int32 division, `x * c` cannot overflow an int64. As a result, it is \
> > always feasible to just calculate the product and shift the result. 
> > For unsigned multiplication, the situation is somewhat trickier because the \
> > condition needs to be twice as strong (the condition (1) and (2) above are mostly \
> > the same). This results in the magic constant `c` calculated based on the method \
> > presented in Hacker's Delight by Henry S. Warren, Jr. may overflow an uintN. For \
> > int division, we can depend on the theorem devised by Arch D. Robison in N-Bit \
> > Unsigned Division Via N-Bit Multiply-Add, which states that there exists either: 
> > c1 in uint32 and m1, such that floor(x / d) = floor(x * c1 / 2**m1) for 0 < x < \
> > 2**32 (3) c2 in uint32 and m2, such that floor(x / d) = floor((x + 1) * c2 / \
> > 2**m2) for 0 < x < 2**32 (4) 
> > which means that either `x * c1` never overflows an uint64 or `(x + 1) * c2` \
> > never overflows an uint64. And we can perform a full multiplication. 
> > For longs, there is no way to do a full multiplication so we do some basic \
> > transformations to achieve a computable formula. The details I have written as \
> > comments in the overflow case. 
> > More tests are added to cover the possible patterns.
> > 
> > Please take a look and have some reviews. Thank you very much.
> 
> Quan Anh Mai has updated the pull request incrementally with one additional commit \
> since the last revision: 
> further clarify variable meanings

As mentioned previously, I'm not reviewing the core magic number stuff, just the
surrounding structure.  That all looks okay to me now.

-------------

Marked as reviewed by kbarrett (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/9947#pullrequestreview-1854641163


[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic