[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-hotspot-runtime-dev
Subject:    Re: RFR: JDK-8198285: More consistent Access API for arraycopy
From:       Roman Kennke <rkennke () redhat ! com>
Date:       2018-05-31 22:25:21
Message-ID: 3bb2415e-4d15-fbd3-dde2-73a25c7bd65b () redhat ! com
[Download RAW message or body]

Hi Erik,

It took me a while to get back to this.

I wrote a little wrapper around the extended arraycopy API that allows
to basically write exactly what you suggested:


> ArrayAccess<ARRAYCOPY_ATOMIC>::arraycopy_from_native(ptr, obj, index,
> length);

I agree that this seems the cleanest solution.

The backend still sees both obj+off OR raw-ptr, but I think this is ok.

Differential:
http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02.diff/
Full:
http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.02/

What do you think?

Roman

> Each required parameter is clear when you read the API.
> 
> But I am open to suggestions of course.
> 
> Thanks,
> /Erik
> 
> On 2018-04-25 17:59, Roman Kennke wrote:
>>> 1) Resolve the addresses as we do today, but you think you get better
>>> Shenandoah performance if we either
>>> 2) Have 3 different access calls, (from heap to native, from native to
>>> heap, from heap to heap) for copying, or
>>> 3) Have 1 access call that can encode the above 3 variants, but looks
>>> ugly at the call sites.
>> There's also the idea to pass Address arguments to arraycopy, one for
>> src, one for dst, and have 2 different subclasses: one for obj+offset
>> (heap access) and one with direct pointer (raw). Your comment gave me
>> the idea to also provide arrayOop+idx. This would look clean on the
>> caller side I think.
>>
>> It would also be useful on the GC side: BarrierSets would specialize
>> only in the variants that they are interested in, for example, in case
>> of Shenandoah:
>> 1. arraycopy(HeapAddress,HeapAddress) Java->Java
>> 2. arraycopy(HeapAddress,RawAddress)   Java->native
>> 3. arraycopy(RawAddress,HeapAddress)   native->Java
>>
>> other barriersets can ignore the exact type and only call the args
>> address->resolve() or so to get the actual raw address.
>>
>> This would *also* be beneficial for the other APIs: instead of having
>> all the X() and X_at() variants, we can just use one X variant that
>> either takes RawAddress or HeapAddress.
>>
>> I made a little (not yet compiling/working) prototype of this a while
>> ago:
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8199801-2.patch
>>
>>
>> What do you think? Would it make sense to go further down that road?
>>
>> Roman
>>
>>> You clearly went for 3, which leaves the callsites looking rather hard
>>> to read. It is for example not obvious for me what is going on here
>>> (javaClasses.cpp line 313):
>>>
>>> HeapAccess<>::arraycopy<jbyte>(NULL, 0, reinterpret_cast<const
>>> jbyte*>(utf8_str), value(h_obj()),
>>> typeArrayOopDesc::element_offset<jbyte>(0), NULL, length);
>>>
>>> ...without looking very carefully at the long list of arguments encoding
>>> what is actually going on (copy from native to the heap). What is worse
>>> is that this will probably not compile without adding the template
>>> keyword to the call (since you have a qualified template member function
>>> behind a template class), like this:
>>> HeapAccess<>::template arraycopy<jbyte>(NULL, 0, reinterpret_cast<const
>>> jbyte*>(utf8_str), value(h_obj()),
>>> typeArrayOopDesc::element_offset<jbyte>(0), NULL, length);
>>>
>>> ...which as a public API leaves me feeling a bit like this:         :C
>>>
>>> May I suggest adding an array access helper. The idea is to keep a
>>> single call through Access, and a single backend point for array copy,
>>> but let the helper provide the three different types of copying as
>>> different functions, so that the call sites still look pretty and easy
>>> to read and follow.
>>>
>>> This helper could additionally have load_at, and store_at use array
>>> indices as opposed to offsets, and hence hide the offset calculations we
>>> perform today (typically involving checking if we are using compressed
>>> oops or not).
>>>
>>> I am thinking something along the lines of
>>> ArrayAccess<>::arraycopy_to_native(readable_arguments),
>>> ArrayAccess<>::arraycopy_from_native(readable_arguments),
>>> ArrayAccess<>::arraycopy(readable_arguments), which translates to some
>>> form of Access<>::arraycopy(unreadable_arguments). And for example
>>> ArrayAccess<>::load_at(obj, index) would translate to some kind of
>>> HeapAccess<IN_HEAP_ARRAY>::load_at(obj, offset_for_index(index)) as a
>>> bonus, making everyone using the API jump with happiness.
>>>
>>> What do you think about this idea? Good or bad? I guess the question is
>>> whether this helper should be in access.hpp, or somewhere else (like in
>>> arrayOop). Thoughts are welcome.
>>>
>>> Thanks,
>>> /Erik
>>>
>>> On 2018-04-11 19:54, Roman Kennke wrote:
>>>>      Currently, the arraycopy API in access.hpp gets the src and dst
>>>> oops,
>>>> plus the src and dst addresses. In order to be most useful to garbage
>>>> collectors, it should receive the src and dst oops together with the
>>>> src
>>>> and dst offsets instead, and let the Access API / GC calculate the src
>>>> and dst addresses.
>>>>
>>>> For example, Shenandoah needs to resolve the src and dst objects for
>>>> arraycopy, and then apply the corresponding offsets. With the current
>>>> API (obj+ptr) it would calculate the ptr-diff from obj to ptr, then
>>>> resolve obj, then re-add the calculate ptr-diff. This is fragile
>>>> because
>>>> we also may resolve obj in the runtime before calculating ptr (e.g. via
>>>> arrayOop::base()). If we then pass in the original obj and a ptr
>>>> calculated from another copy of the same obj, the above resolution
>>>> logic
>>>> would not work. This is currently the case for obj-arraycopy.
>>>>
>>>> I propose to change the API to accept obj+offset, in addition to ptr
>>>> for
>>>> both src and dst. Only one or the other should be used. Heap accesses
>>>> should use obj+offset and pass NULL for raw-ptr, off-heap accesses (or
>>>> heap accesses that are already resolved.. use with care) should pass
>>>> NULL+0 for obj+offset and the raw-ptr. Notice that this also allows the
>>>> API to be used for Java<->native array bulk transfers.
>>>>
>>>> An alternative would be to break the API up into 4 variants:
>>>>
>>>> Java->Java transfer:
>>>> arraycopy(oop src, size_t src_offs, oop dst, size_t dst_offs, size_t
>>>> len)
>>>>
>>>> Java->Native transfer:
>>>> arraycopy(oop src, size_t src_offs, D* raw_dst, size_t len)
>>>>
>>>> Native->Java transfer:
>>>> arraycopy(S* src_raw, oop dst, size_t dst_offs, size_t len)
>>>>
>>>> 'Unsafe' transfer:
>>>> arraycopy(S* src_raw, D* dst_raw, size_t len)
>>>>
>>>>
>>>> But that seemed to be too much boilerplate copy+pasting for my taste.
>>>> (See how having this overly complicated template layer hurts us?)
>>>>
>>>> Plus, I had a better idea: instead of accepting oop+offset OR T* for
>>>> almost every Access API, we may want to abstract that and take an
>>>> Address type argument, which would be either HeapAddress(obj,
>>>> offset) or
>>>> RawAddress(T* ptr). GCs may then just call addr->address() to get the
>>>> actual address, or specialize for HeapAddress variants and resolve the
>>>> objs and then resolve the address. This would also allow us to get rid
>>>> of almost half of the API (all the *_at variants would go) and some
>>>> other simplifications. However, this seemed to explode the scope of
>>>> this
>>>> RFE, and would be better handled in another RFE.
>>>>
>>>> This changes makes both typeArrayKlass and objArrayKlass use the
>>>> changed
>>>> API, plus I identified all (hopefully) places where we do bulk
>>>> Java<->native array transfers and make them use the API too. Gets us
>>>> rid
>>>> of a bunch of memcpy calls :-)
>>>>
>>>> Please review the change:
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8198285/webrev.00/
>>>>
>>>> Thanks, Roman
>>>>
>>
> 


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic