List:       openjdk-hotspot-compiler-dev
Subject:    Re: RFR: 8312569: RISC-V: Missing intrinsics for Math.ceil, floor, rint [v10]
From:       Feilong Jiang <fjiang () openjdk ! org>
Date:       2023-08-29 14:43:11
Message-ID: 3PcE05aE3Ti8SkcG6PZmPgq8SZcPHThe6iP2MsiY4Oo=.b27d01fd-cb37-404c-80af-c5cc9cfd03b2 () github ! com

On Tue, 29 Aug 2023 08:28:42 GMT, Ilya Gavrilin <duke@openjdk.org> wrote:

> > Please review this change, which adds RISC-V intrinsics for double rounding.
> > 
> > On RISC-V, intrinsics for rounding doubles with a mode (Math.ceil/floor/rint)
> > were missing. RISC-V has no dedicated instruction for this conversion, so a
> > two-step conversion is used: double -> long int -> double (using fcvt.l.d and
> > fcvt.d.l).
> > We also have to provide a rounding mode to each fcvt.x.x instruction.
> > 
> > Rounding mode selection for ceil (floor and rint are similar), according to
> > the Math.ceil requirements:
> > > Returns the smallest (closest to negative infinity) value that is greater than
> > > or equal to the argument and is equal to a mathematical integer
> > > (Math.java:475).
> > 
> > For double -> long int we choose the rup (round towards +inf) mode to get an
> > integer that is greater than or equal to the input value. For long int ->
> > double we choose the rdn (round towards -inf) mode to get the smallest
> > (closest to -inf) representation of the integer obtained from the first
> > conversion.
> > When the input is inf, NaN, or a value greater than 2^63, we return the input
> > value unchanged (a double greater than 2^63 is guaranteed to be an integer).
> > When we store the result we also copy the sign from the input value (needed
> > because ceil must return -0.0 for inputs in (-1.0, 0.0)).
> > We have observed a significant improvement on HiFive and T-Head boards.
> > 
> > testing: tier1, tier2 and hotspot:tier3 on hifive
> > 
> > Performance results on hifive (FpRoundingBenchmark.testceil/floor/rint):
> > 
> > Without intrinsic:
> > 
> > Benchmark                      (TESTSIZE)   Mode  Cnt   Score   Error   Units
> > FpRoundingBenchmark.testceil         1024  thrpt   25  39.297  ± 0.037  ops/ms
> > FpRoundingBenchmark.testfloor        1024  thrpt   25  39.398  ± 0.018  ops/ms
> > FpRoundingBenchmark.testrint         1024  thrpt   25  36.388  ± 0.844  ops/ms
> > 
> > With intrinsic:
> > 
> > Benchmark                      (TESTSIZE)   Mode  Cnt   Score   Error   Units
> > FpRoundingBenchmark.testceil         1024  thrpt   25  80.560  ± 0.053  ops/ms
> > FpRoundingBenchmark.testfloor        1024  thrpt   25  80.541  ± 0.081  ops/ms
> > FpRoundingBenchmark.testrint         1024  thrpt   25  80.603  ± 0.071  ops/ms
> 
> Ilya Gavrilin has updated the pull request incrementally with one additional
> commit since the last revision:
> Fix typo in c2_MacroAssembler_riscv.cpp

Marked as reviewed by fjiang (Committer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/14991#pullrequestreview-1600611526
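
For readers following the thread, the ceil scheme described in the quoted PR
text can be sketched in plain Java. This is only an illustration under stated
assumptions, not the code from c2_MacroAssembler_riscv.cpp: the class and method
names (CeilSketch, ceilViaLong) are hypothetical, the round-up step is emulated
because a Java cast always truncates toward zero, and the guard is written as
|x| >= 2^63, which is one reading of "value more than 2^63".

```java
final class CeilSketch {
    // 2^63 as a double. At or beyond this magnitude the value does not fit in
    // a long and is already an integer, so it is returned as-is.
    private static final double TWO_POW_63 = 0x1.0p63;

    // Hypothetical helper: emulates ceil as double -> long -> double.
    static double ceilViaLong(double x) {
        // NaN, infinities and very large magnitudes: return the input unchanged.
        if (Double.isNaN(x) || Double.isInfinite(x) || Math.abs(x) >= TWO_POW_63) {
            return x;
        }
        // double -> long. The cast truncates toward zero, so bump the result
        // by one when truncation moved it below x (emulating the rup mode).
        long t = (long) x;
        if ((double) t < x) {
            t++;
        }
        // long -> double. In this range the conversion is exact, so the
        // rounding mode of the second conversion does not show up here.
        double r = (double) t;
        // Copy the sign of the input so that inputs in (-1.0, 0.0) and -0.0
        // yield -0.0 rather than +0.0.
        return Math.copySign(r, x);
    }
}
```

Without the final Math.copySign, ceilViaLong(-0.5) would produce +0.0, whereas
Math.ceil(-0.5) is -0.0; that is the case the sign copy in the PR description
addresses.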

