List:       openjdk-core-libs-dev
Subject:    Integrated: 8266054: VectorAPI rotate operation optimization
From:       Jatin Bhateja <jbhateja () openjdk ! java ! net>
Date:       2021-07-28 2:22:41
Message-ID: BDbnRETp0PWKWTnIJghtiMPlewGp5tXpnJPzeBTExuY=.db1e5321-6bab-4a30-9c88-18f1f8f1df5f () github ! com

On Tue, 27 Apr 2021 17:56:04 GMT, Jatin Bhateja <jbhateja@openjdk.org> wrote:

> The current Vector API Java-side implementation expresses the rotateLeft and
> rotateRight operations using the following operations:
> vec1 = lanewise(VectorOperators.LSHL, n)
> vec2 = lanewise(VectorOperators.LSHR, n)
> res = lanewise(VectorOperators.OR, vec1, vec2)
> 
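
A scalar sketch of the shift-and-OR sequence quoted above, one int lane at a time. It assumes the right-shift count is the complement of the left-shift count (32 - n for int lanes), which the quoted pseudo-code abbreviates to n. The class and method names are hypothetical and are not taken from the patch.

```java
class RotateSketch {
    // Rotate every int lane left by n using two shifts and an OR.
    // Assumption: the right shift uses the complementary count (32 - n).
    static int[] rotateLeftLanes(int[] lanes, int n) {
        int[] res = new int[lanes.length];
        for (int i = 0; i < lanes.length; i++) {
            int hi = lanes[i] << n;    // logical shift left; Java masks the count to 5 bits
            int lo = lanes[i] >>> -n;  // logical shift right by (32 - n) mod 32
            res[i] = hi | lo;          // combine both halves
        }
        return res;
    }

    public static void main(String[] args) {
        int[] in = {1, 0x80000001, -1, 0x12345678};
        int[] out = rotateLeftLanes(in, 7);
        for (int i = 0; i < in.length; i++) {
            // Cross-check against the JDK's scalar rotate.
            if (out[i] != Integer.rotateLeft(in[i], 7)) {
                throw new AssertionError("lane " + i + " mismatch");
            }
        }
        System.out.println("all lanes match Integer.rotateLeft");
    }
}
```

Writing the right-shift count as -n relies on Java masking int shift distances to their low five bits, so a count of 0 also works without a special case.
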
> This patch moves the above handling from the Java side to the C2 compiler, which
> facilitates dismantling the rotate operation if the target ISA does not support a
> direct rotate instruction.
> AVX512 added vector rotate instructions vpro[rl][v][dq], which operate on long and
> integer type vectors. For other cases (i.e. sub-word type vectors, or targets
> that do not support direct rotate operations), an instruction sequence comprising
> vector SHIFT (LEFT/RIGHT) and vector OR is emitted.
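
The sub-word fallback described above can be pictured with a scalar byte version. This is only an illustration under two assumptions: the rotate count is reduced modulo the lane width (8 bits), and the lane is zero-extended before the right shift so that no sign bits leak in. The names are hypothetical, and the code C2 actually emits is not shown in this message.

```java
class SubwordRotateSketch {
    // Rotate one byte lane left by n using shifts and OR.
    static byte rotateLeftByte(byte v, int n) {
        int bits = v & 0xFF;  // zero-extend the lane to avoid sign-bit smearing
        int s = n & 7;        // assumed: count taken modulo the 8-bit lane width
        // For s == 0 the right shift by 8 yields 0, so the value is unchanged.
        return (byte) ((bits << s) | (bits >>> (8 - s)));
    }

    // Apply the same rotate to every lane of a byte "vector".
    static byte[] rotateLeftLanes(byte[] lanes, int n) {
        byte[] res = new byte[lanes.length];
        for (int i = 0; i < lanes.length; i++) {
            res[i] = rotateLeftByte(lanes[i], n);
        }
        return res;
    }

    public static void main(String[] args) {
        byte[] out = rotateLeftLanes(new byte[]{(byte) 0x81, 0x01, 0x5A}, 7);
        for (byte b : out) {
            System.out.println(Integer.toHexString(b & 0xFF));
        }
    }
}
```
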
> Please find below the performance data for the included JMH benchmark.
> Machine:  Cascade Lake Server (Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz)
> 
> 
> Benchmark | (bits) | (shift) | (size) | Baseline Score (ops/ms) | With Opts (ops/ms) | Gain
> -- | -- | -- | -- | -- | -- | --
> RotateBenchmark.testRotateLeftB | 128 | 7 | 256 | 3939.136 | 3836.133 | 0.973851372
> RotateBenchmark.testRotateLeftB | 128 | 7 | 512 | 1984.231 | 1918.27 | 0.966757399
> RotateBenchmark.testRotateLeftB | 128 | 15 | 256 | 3925.165 | 4043.842 | 1.030234907
> RotateBenchmark.testRotateLeftB | 128 | 15 | 512 | 1962.723 | 1936.551 | 0.986665464
> RotateBenchmark.testRotateLeftB | 128 | 31 | 256 | 3945.6 | 3817.883 | 0.967630525
> RotateBenchmark.testRotateLeftB | 128 | 31 | 512 | 1944.458 | 1914.229 | 0.984453766
> RotateBenchmark.testRotateLeftB | 256 | 7 | 256 | 4612.149 | 4514.874 | 0.978908964
> RotateBenchmark.testRotateLeftB | 256 | 7 | 512 | 2296.252 | 2270.237 | 0.988670669
> RotateBenchmark.testRotateLeftB | 256 | 15 | 256 | 4576.628 | 4515.53 | 0.986649996
> RotateBenchmark.testRotateLeftB | 256 | 15 | 512 | 2288.278 | 2270.923 | 0.992415694
> RotateBenchmark.testRotateLeftB | 256 | 31 | 256 | 4624.243 | 4511.46 | 0.975610495
> RotateBenchmark.testRotateLeftB | 256 | 31 | 512 | 2305.459 | 2273.788 | 0.986262605
> RotateBenchmark.testRotateLeftB | 512 | 7 | 256 | 7748.283 | 7777.105 | 1.003719792
> RotateBenchmark.testRotateLeftB | 512 | 7 | 512 | 3906.214 | 3912.647 | 1.001646863
> RotateBenchmark.testRotateLeftB | 512 | 15 | 256 | 7764.653 | 7763.482 | 0.999849188
> RotateBenchmark.testRotateLeftB | 512 | 15 | 512 | 3916.061 | 3919.363 | 1.000843194
> RotateBenchmark.testRotateLeftB | 512 | 31 | 256 | 7779.754 | 7770.239 | 0.998776954
> RotateBenchmark.testRotateLeftB | 512 | 31 | 512 | 3916.471 | 3912.718 | 0.999041739
> RotateBenchmark.testRotateLeftI | 128 | 7 | 256 | 4043.39 | 13461.814 | 3.329338501
> RotateBenchmark.testRotateLeftI | 128 | 7 | 512 | 1996.217 | 6455.425 | 3.233829288
> RotateBenchmark.testRotateLeftI | 128 | 15 | 256 | 4028.614 | 13077.277 | 3.246098286
> RotateBenchmark.testRotateLeftI | 128 | 15 | 512 | 1997.612 | 6452.918 | 3.230315997
> RotateBenchmark.testRotateLeftI | 128 | 31 | 256 | 4123.357 | 13079.045 | 3.171940969
> RotateBenchmark.testRotateLeftI | 128 | 31 | 512 | 2003.356 | 6452.716 | 3.22095324
> RotateBenchmark.testRotateLeftI | 256 | 7 | 256 | 7666.949 | 25658.625 | 3.34665393
> RotateBenchmark.testRotateLeftI | 256 | 7 | 512 | 3855.826 | 12278.106 | 3.18429981
> RotateBenchmark.testRotateLeftI | 256 | 15 | 256 | 7670.901 | 24625.466 | 3.210244272
> RotateBenchmark.testRotateLeftI | 256 | 15 | 512 | 3765.786 | 12272.771 | 3.259019764
> RotateBenchmark.testRotateLeftI | 256 | 31 | 256 | 7660.599 | 25678.864 | 3.352069988
> RotateBenchmark.testRotateLeftI | 256 | 31 | 512 | 3773.401 | 12006.469 | 3.181869353
> RotateBenchmark.testRotateLeftI | 512 | 7 | 256 | 11900.948 | 31242.989 | 2.625252123
> RotateBenchmark.testRotateLeftI | 512 | 7 | 512 | 5830.878 | 15727.149 | 2.697217983
> RotateBenchmark.testRotateLeftI | 512 | 15 | 256 | 12171.847 | 33180.067 | 2.72596813
> RotateBenchmark.testRotateLeftI | 512 | 15 | 512 | 5830.544 | 16740.182 | 2.871118372
> RotateBenchmark.testRotateLeftI | 512 | 31 | 256 | 11909.553 | 31250.882 | 2.624018047
> RotateBenchmark.testRotateLeftI | 512 | 31 | 512 | 5846.747 | 15738.831 | 2.691895339
> RotateBenchmark.testRotateLeftL | 128 | 7 | 256 | 2047.243 | 6888.484 | 3.364761291
> RotateBenchmark.testRotateLeftL | 128 | 7 | 512 | 1005.029 | 3245.931 | 3.229688895
> RotateBenchmark.testRotateLeftL | 128 | 15 | 256 | 1996.921 | 6985.256 | 3.498013191
> RotateBenchmark.testRotateLeftL | 128 | 15 | 512 | 986.906 | 3217.778 | 3.260470602
> RotateBenchmark.testRotateLeftL | 128 | 31 | 256 | 1999.06 | 6977.672 | 3.490476524
> RotateBenchmark.testRotateLeftL | 128 | 31 | 512 | 987.258 | 3236.63 | 3.278403416
> RotateBenchmark.testRotateLeftL | 256 | 7 | 256 | 3752.412 | 12995.954 | 3.4633601
> RotateBenchmark.testRotateLeftL | 256 | 7 | 512 | 1824.093 | 5809.576 | 3.184912173
> RotateBenchmark.testRotateLeftL | 256 | 15 | 256 | 3759.99 | 13262.631 | 3.52730486
> RotateBenchmark.testRotateLeftL | 256 | 15 | 512 | 1823.393 | 5803.872 | 3.183006626
> RotateBenchmark.testRotateLeftL | 256 | 31 | 256 | 3757.134 | 13284.633 | 3.535842214
> RotateBenchmark.testRotateLeftL | 256 | 31 | 512 | 1822.192 | 5824.178 | 3.196248255
> RotateBenchmark.testRotateLeftL | 512 | 7 | 256 | 5794.005 | 15567.753 | 2.686872552
> RotateBenchmark.testRotateLeftL | 512 | 7 | 512 | 2969.393 | 7694.79 | 2.591368
> RotateBenchmark.testRotateLeftL | 512 | 15 | 256 | 5817.292 | 15726.597 | 2.703422314
> RotateBenchmark.testRotateLeftL | 512 | 15 | 512 | 2944.655 | 7664.954 | 2.603005785
> RotateBenchmark.testRotateLeftL | 512 | 31 | 256 | 5822.131 | 16718.64 | 2.871567129
> RotateBenchmark.testRotateLeftL | 512 | 31 | 512 | 2944.763 | 7657.814 | 2.600485676
> RotateBenchmark.testRotateLeftS | 128 | 7 | 256 | 8006.155 | 7976.701 | 0.99632108
> RotateBenchmark.testRotateLeftS | 128 | 7 | 512 | 4031.753 | 4003.43 | 0.992975016
> RotateBenchmark.testRotateLeftS | 128 | 15 | 256 | 8003.879 | 7952.752 | 0.993612222
> RotateBenchmark.testRotateLeftS | 128 | 15 | 512 | 4026.359 | 4014.757 | 0.997118488
> RotateBenchmark.testRotateLeftS | 128 | 31 | 256 | 8000.842 | 7995.733 | 0.999361442
> RotateBenchmark.testRotateLeftS | 128 | 31 | 512 | 4044.421 | 4007.426 | 0.990852832
> RotateBenchmark.testRotateLeftS | 256 | 7 | 256 | 15078.471 | 15034.395 | 0.997076892
> RotateBenchmark.testRotateLeftS | 256 | 7 | 512 | 7236.509 | 7620.334 | 1.053040078
> RotateBenchmark.testRotateLeftS | 256 | 15 | 256 | 15093.661 | 15024.17 | 0.995396014
> RotateBenchmark.testRotateLeftS | 256 | 15 | 512 | 7308.568 | 7724.381 | 1.056893909
> RotateBenchmark.testRotateLeftS | 256 | 31 | 256 | 15332.233 | 15432.113 | 1.006514381
> RotateBenchmark.testRotateLeftS | 256 | 31 | 512 | 7317.18 | 7626.679 | 1.042297579
> RotateBenchmark.testRotateLeftS | 512 | 7 | 256 | 24079.012 | 23939.263 | 0.994196232
> RotateBenchmark.testRotateLeftS | 512 | 7 | 512 | 11441.41 | 11921.21 | 1.041935391
> RotateBenchmark.testRotateLeftS | 512 | 15 | 256 | 23563.675 | 23590.959 | 1.001157884
> RotateBenchmark.testRotateLeftS | 512 | 15 | 512 | 11418.634 | 11949.391 | 1.046481654
> RotateBenchmark.testRotateLeftS | 512 | 31 | 256 | 24035.69 | 23595.385 | 0.9816812
> RotateBenchmark.testRotateLeftS | 512 | 31 | 512 | 11668.091 | 11899.536 | 1.019835721
> RotateBenchmark.testRotateRightB | 128 | 7 | 256 | 3852.421 | 3816.521 | 0.990681185
> RotateBenchmark.testRotateRightB | 128 | 7 | 512 | 1956.766 | 1923.638 | 0.983070025
> RotateBenchmark.testRotateRightB | 128 | 15 | 256 | 3899.136 | 4038.945 | 1.035856405
> RotateBenchmark.testRotateRightB | 128 | 15 | 512 | 1957.733 | 2030.973 | 1.037410617
> RotateBenchmark.testRotateRightB | 128 | 31 | 256 | 3902.5 | 4043.736 | 1.03619116
> RotateBenchmark.testRotateRightB | 128 | 31 | 512 | 1957.728 | 1920.434 | 0.980950367
> RotateBenchmark.testRotateRightB | 256 | 7 | 256 | 4565.887 | 4515.083 | 0.988873137
> RotateBenchmark.testRotateRightB | 256 | 7 | 512 | 2300.057 | 2278.065 | 0.990438498
> RotateBenchmark.testRotateRightB | 256 | 15 | 256 | 4570.754 | 4527.692 | 0.990578797
> RotateBenchmark.testRotateRightB | 256 | 15 | 512 | 2300.524 | 2268.659 | 0.986148808
> RotateBenchmark.testRotateRightB | 256 | 31 | 256 | 4577.569 | 4513.29 | 0.98595783
> RotateBenchmark.testRotateRightB | 256 | 31 | 512 | 2304.335 | 2273.178 | 0.986478962
> RotateBenchmark.testRotateRightB | 512 | 7 | 256 | 7772.483 | 7842.671 | 1.009030319
> RotateBenchmark.testRotateRightB | 512 | 7 | 512 | 3907.265 | 3917.325 | 1.002574691
> RotateBenchmark.testRotateRightB | 512 | 15 | 256 | 7855.653 | 7865.25 | 1.001221668
> RotateBenchmark.testRotateRightB | 512 | 15 | 512 | 3909.845 | 3976.813 | 1.017128045
> RotateBenchmark.testRotateRightB | 512 | 31 | 256 | 7746.765 | 7870.159 | 1.015928455
> RotateBenchmark.testRotateRightB | 512 | 31 | 512 | 3919.596 | 3981.934 | 1.01590419
> RotateBenchmark.testRotateRightI | 128 | 7 | 256 | 4125.151 | 13056.878 | 3.165187893
> RotateBenchmark.testRotateRightI | 128 | 7 | 512 | 2045.201 | 6501.447 | 3.17887924
> RotateBenchmark.testRotateRightI | 128 | 15 | 256 | 4111.736 | 13318.124 | 3.23905134
> RotateBenchmark.testRotateRightI | 128 | 15 | 512 | 2055.355 | 6497.289 | 3.161151723
> RotateBenchmark.testRotateRightI | 128 | 31 | 256 | 4109.353 | 13073.3 | 3.181352393
> RotateBenchmark.testRotateRightI | 128 | 31 | 512 | 2055.431 | 6463.902 | 3.14479153
> RotateBenchmark.testRotateRightI | 256 | 7 | 256 | 7804.976 | 24585.962 | 3.150036848
> RotateBenchmark.testRotateRightI | 256 | 7 | 512 | 3815.818 | 11985.145 | 3.140911071
> RotateBenchmark.testRotateRightI | 256 | 15 | 256 | 7644.977 | 25863.841 | 3.383115606
> RotateBenchmark.testRotateRightI | 256 | 15 | 512 | 3822.508 | 12280.58 | 3.212702236
> RotateBenchmark.testRotateRightI | 256 | 31 | 256 | 7709.635 | 25655.108 | 3.327668301
> RotateBenchmark.testRotateRightI | 256 | 31 | 512 | 3801.5 | 12271.65 | 3.228107326
> RotateBenchmark.testRotateRightI | 512 | 7 | 256 | 12223.711 | 31239.788 | 2.555671351
> RotateBenchmark.testRotateRightI | 512 | 7 | 512 | 5973.571 | 16740.852 | 2.802486486
> RotateBenchmark.testRotateRightI | 512 | 15 | 256 | 12205.47 | 31248.025 | 2.560165647
> RotateBenchmark.testRotateRightI | 512 | 15 | 512 | 5966.513 | 15728.168 | 2.6360737
> RotateBenchmark.testRotateRightI | 512 | 31 | 256 | 12209.405 | 33181.105 | 2.71766765
> RotateBenchmark.testRotateRightI | 512 | 31 | 512 | 5981.527 | 15727.496 | 2.629344647
> RotateBenchmark.testRotateRightL | 128 | 7 | 256 | 2054.509 | 6980.849 | 3.397818652
> RotateBenchmark.testRotateRightL | 128 | 7 | 512 | 997.375 | 3242.374 | 3.250907633
> RotateBenchmark.testRotateRightL | 128 | 15 | 256 | 2051.459 | 6892.389 | 3.359749817
> RotateBenchmark.testRotateRightL | 128 | 15 | 512 | 1002.906 | 3223.342 | 3.21400211
> RotateBenchmark.testRotateRightL | 128 | 31 | 256 | 2044.749 | 6984.157 | 3.415654929
> RotateBenchmark.testRotateRightL | 128 | 31 | 512 | 1004.273 | 3237.496 | 3.22372104
> RotateBenchmark.testRotateRightL | 256 | 7 | 256 | 3811.551 | 13347.75 | 3.501920872
> RotateBenchmark.testRotateRightL | 256 | 7 | 512 | 1892.883 | 5840.85 | 3.085689924
> RotateBenchmark.testRotateRightL | 256 | 15 | 256 | 3821.705 | 14034.823 | 3.672398314
> RotateBenchmark.testRotateRightL | 256 | 15 | 512 | 1799.193 | 5817.533 | 3.233412424
> RotateBenchmark.testRotateRightL | 256 | 31 | 256 | 3816.666 | 14022.31 | 3.673968327
> RotateBenchmark.testRotateRightL | 256 | 31 | 512 | 1796.649 | 5824.13 | 3.241662673
> RotateBenchmark.testRotateRightL | 512 | 7 | 256 | 5943.986 | 15586.254 | 2.622188881
> RotateBenchmark.testRotateRightL | 512 | 7 | 512 | 3022.686 | 7662.241 | 2.534911334
> RotateBenchmark.testRotateRightL | 512 | 15 | 256 | 5958.008 | 15726.859 | 2.639616966
> RotateBenchmark.testRotateRightL | 512 | 15 | 512 | 2998.469 | 7654.703 | 2.552870482
> RotateBenchmark.testRotateRightL | 512 | 31 | 256 | 5937.491 | 15741.207 | 2.651154671
> RotateBenchmark.testRotateRightL | 512 | 31 | 512 | 3014.699 | 7656.837 | 2.539834657
> RotateBenchmark.testRotateRightS | 128 | 7 | 256 | 8172.896 | 8003.474 | 0.979270261
> RotateBenchmark.testRotateRightS | 128 | 7 | 512 | 4111.074 | 4047.267 | 0.984479238
> RotateBenchmark.testRotateRightS | 128 | 15 | 256 | 8225.79 | 8040.421 | 0.9774649
> RotateBenchmark.testRotateRightS | 128 | 15 | 512 | 4129.801 | 4011.919 | 0.971455767
> RotateBenchmark.testRotateRightS | 128 | 31 | 256 | 8176.102 | 8052.686 | 0.984905276
> RotateBenchmark.testRotateRightS | 128 | 31 | 512 | 4117.735 | 4046.522 | 0.982705784
> RotateBenchmark.testRotateRightS | 256 | 7 | 256 | 15213.617 | 15169.51 | 0.997100821
> RotateBenchmark.testRotateRightS | 256 | 7 | 512 | 7530.289 | 7625.581 | 1.012654494
> RotateBenchmark.testRotateRightS | 256 | 15 | 256 | 15238.384 | 15069.978 | 0.988948566
> RotateBenchmark.testRotateRightS | 256 | 15 | 512 | 7275.098 | 7620.764 | 1.047513587
> RotateBenchmark.testRotateRightS | 256 | 31 | 256 | 15299.821 | 15043.765 | 0.983264118
> RotateBenchmark.testRotateRightS | 256 | 31 | 512 | 7273.028 | 7630.97 | 1.04921499
> RotateBenchmark.testRotateRightS | 512 | 7 | 256 | 23998.152 | 23920.046 | 0.996745333
> RotateBenchmark.testRotateRightS | 512 | 7 | 512 | 11582.679 | 11916.382 | 1.02881052
> RotateBenchmark.testRotateRightS | 512 | 15 | 256 | 23982.797 | 23434.756 | 0.977148579
> RotateBenchmark.testRotateRightS | 512 | 15 | 512 | 11629.806 | 11918.759 | 1.0248459
> RotateBenchmark.testRotateRightS | 512 | 31 | 256 | 23988.549 | 23475.629 | 0.978618132
> RotateBenchmark.testRotateRightS | 512 | 31 | 512 | 11650.146 | 11916.47 | 1.022860143
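
The Gain column in the quoted data appears to be the optimized score divided by the baseline score; the two rows recomputed here agree with that reading. A minimal sketch, with a hypothetical helper that is not part of the JMH benchmark:

```java
class GainSketch {
    // Gain as the ratio of throughputs (ops/ms); a value above 1.0 means the patched build is faster.
    static double gain(double baselineOpsPerMs, double withOptsOpsPerMs) {
        return withOptsOpsPerMs / baselineOpsPerMs;
    }

    public static void main(String[] args) {
        // testRotateLeftI, 128 bits, shift 7, size 256 (int lanes, direct rotate available)
        System.out.println(gain(4043.39, 13461.814));
        // testRotateLeftB, 128 bits, shift 7, size 256 (byte lanes, shift/OR fallback)
        System.out.println(gain(3939.136, 3836.133));
    }
}
```

Read this way, the int and long rows show roughly 2.5x to 3.7x gains, while the byte and short rows stay within a few percent of 1.0, consistent with the sub-word types still using the shift and OR sequence.
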

This pull request has now been integrated.

Changeset: d994b93e
Author:    Jatin Bhateja <jbhateja@openjdk.org>
URL:       https://git.openjdk.java.net/jdk/commit/d994b93e211d49af79212d765633ba3457365a08
                
Stats:     4438 lines in 57 files changed: 4219 ins; 58 del; 161 mod

8266054: VectorAPI rotate operation optimization

Reviewed-by: psandoz, sviswanathan

-------------

PR: https://git.openjdk.java.net/jdk/pull/3720

