List:       openjdk-hotspot-runtime-dev
Subject:    Re: RFR: JDK-8312018: Improve reservation of class space and CDS [v4]
From:       Thomas Stuefe <stuefe () openjdk ! org>
Date:       2023-08-30 17:51:35
Message-ID: G-sqerabqBAXrZptJHg4KsXblzv2gFkwmJDh6UxY2v8=.326f2300-8431-4aa8-b203-85049a4f18c2 () github ! com

On Mon, 28 Aug 2023 16:32:30 GMT, Ioi Lam <iklam@openjdk.org> wrote:

> > > > @iklam is there anything missing from your point of view?
> > >
> > > I just realized this -- for the above 32GB allocations, do we need to use
> > > the new algorithm for all platforms? As far as I know, only aarch64 and
> > > ppc64 need it because they want to use a single "load immediate"
> > > instruction. For the other CPUs, we can just ask the OS. That will be
> > > faster, always succeed, and be at the "right" location as decided by the OS.
> > 
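
To make the "single load immediate" point concrete, here is a minimal sketch of one such check. It assumes the instruction in question is an AArch64 `movz` (one 16-bit immediate, shifted by 0, 16, 32 or 48 bits). The class and method names are hypothetical and not taken from the PR, and other single-instruction forms (`movn`, logical immediates) are deliberately ignored.

```java
final class SingleMovzCheck {
    // Returns true if 'base' has at most one nonzero 16-bit quarter, i.e. it
    // could be materialized by a single AArch64 movz. Illustration only; the
    // real placement logic in HotSpot may use different criteria.
    static boolean loadableWithSingleMovz(long base) {
        int nonZeroQuarters = 0;
        for (int shift = 0; shift < 64; shift += 16) {
            if (((base >>> shift) & 0xFFFFL) != 0) {
                nonZeroQuarters++;
            }
        }
        return nonZeroQuarters <= 1;
    }

    public static void main(String[] args) {
        // 0x800000000 (32GB) only has bits set in the third quarter.
        System.out.println(loadableWithSingleMovz(0x800000000L));
        // An arbitrary OS-chosen address usually spans several quarters.
        System.out.println(loadableWithSingleMovz(0x7f13e7320000L));
    }
}
```

Under that assumption, an address the OS picks freely will rarely qualify, which is why the location has to be chosen deliberately on those platforms.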
> > The argument for doing it on the remaining platforms (x64 and risc) would be
> > that those, too, could profit from using 16-bit moves and short immediates,
> > instead of - e.g. in the case of x64 - always emitting a giant 8-byte
> > immediate for addi.
> 
> Do you mean this?
> 
> 
> 0x00007f13e73204e9:   mov    0x8(%rax),%ebx    ;; 1141:   __ load_klass(rbx, rax, rscratch1);
> 0x00007f13e73204ec:   movabs $0x800000000,%r10
> 0x00007f13e73204f6:   add    %r10,%rbx
> 
> 
> I am not familiar with x64 instructions. I thought 64-bit immediate moves to a
> register must be 10 bytes (8-byte immediate value) if the value is larger than
> 32 bits. So you can't make the `movabs` instruction any shorter.
> 
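
For reference, the size rule under discussion can be sketched as follows. This is a rough illustration, assuming the destination is `%r10` as in the disassembly quoted above and considering only the three common x86-64 "load constant" encodings. The class and method names are hypothetical.

```java
final class X64MovImmLength {
    // Encoded length in bytes of the shortest of three common x86-64 forms
    // for loading a constant into %r10. Sketch only, not an assembler.
    static int length(long value) {
        if ((value >>> 32) == 0) {
            // mov $imm32,%r10d: REX + opcode + 4-byte immediate
            // (writing the 32-bit register zeroes the upper half).
            return 1 + 1 + 4;
        }
        if (value == (long) (int) value) {
            // mov $imm32,%r10: REX.W + opcode + ModRM + 4-byte
            // sign-extended immediate.
            return 1 + 1 + 1 + 4;
        }
        // movabs $imm64,%r10: REX.W + opcode + 8-byte immediate.
        return 1 + 1 + 8;
    }

    public static void main(String[] args) {
        System.out.println(length(0x800000000L)); // the base from the listing
        System.out.println(length(0x80000000L));  // a base below 4GB
    }
}
```

For `0x800000000` this gives 10 bytes, which matches the listing: the `movabs` starts at `...04ec` and the next instruction at `...04f6`. A shorter encoding only becomes available when the base fits in 32 bits.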
> > And that the code would be better tested, since all platforms run through it.
> > 
> > OTOH, this could also be done in a follow-up. So, if you prefer it that way,
> > I'll make that section aarch/ppc only.
> 
> 
> For this PR, I would prefer doing it only on aarch64/ppc for the above 32GB
> allocations (otherwise we will have a regression for the other platforms --
> there's now a chance of failure, at least theoretically).
> The algorithm is still used on all platforms for the lower allocations, right?
> So we will get some test mileage that way.

Thanks @iklam, @adinn and @rkennke!

Also thanks to @TheRealMDoerr for testing at SAP. There is one remaining test
issue on AIX, but I identified it as an old problem
(https://bugs.openjdk.org/browse/JDK-8315321) and already opened a PR for it.
For now, I disabled the test on AIX until that is fixed.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15041#issuecomment-1699602865
