[prev in list] [next in list] [prev in thread] [next in thread]
List: openjdk-hotspot-runtime-dev
Subject: Re: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CD
From: Ioi Lam <ioi.lam () oracle ! com>
Date: 2017-07-31 4:07:50
Message-ID: 4172849a-55e6-7133-90d6-2c75b58b0391 () oracle ! com
[Download RAW message or body]
Hi Jiangli,
Here are my comments. I've not reviewed the GC code and I'll leave that
to the GC experts :-)
stringTable.cpp: StringTable::archive_string
add assert for DumpSharedSpaces only
filemap.cpp
525 void
FileMapInfo::write_archive_heap_regions(GrowableArray<MemRegion> *regions,
526 int first_region, int
num_regions) {
When I first read this function, I found it hard to follow, especially
this part that coalesces the trailing regions:
537 int len = regions->length();
538 if (len > 1) {
539 start = (char*)regions->at(1).start();
540 size = (char*)regions->at(len - 1).end() - start;
541 }
542 }
The rest of filemap.cpp always perform identical operations on MemRegion
arrays, which are either 1 or 2 in size. However, this function doesn't
follow that pattern; it also has a very different notion of "region",
and the confusing part is regions->size() is not the same as num_regions.
How about we change the API to something like the following? Before
calling this API, the caller needs to coalesce the trailing G1 regions
into a single MemRegion.
FileMapInfo::write_archive_heap_regions(MemRegion *regions, int
first, int num_regions) {
if (first == MetaspaceShared::first_string) {
assert(num_regons <= MetaspaceShared::max_strings, "...");
} else {
assert(first ==
MetaspaceShared::first_open_archive_heap_region, "...");
assert(num_regons <=
MetaspaceShared::max_open_archive_heap_region, "...");
}
....
756 if (!string_data_mapped) {
757 StringTable::ignore_shared_strings(true);
758 assert(string_ranges == NULL && num_string_ranges == 0, "sanity");
759 }
760
761 if (open_archive_heap_data_mapped) {
762 MetaspaceShared::set_open_archive_heap_region_mapped();
763 } else {
764 assert(open_archive_heap_ranges == NULL &&
num_open_archive_heap_ranges == 0, "sanity");
765 }
Maybe the two "if" statements should be more consistent? Instead of
StringTable::ignore_shared_strings, how about
StringTable::set_shared_strings_region_mapped()?
FileMapInfo::map_heap_data() --
818 char* addr = (char*)regions[i].start();
819 char* base = os::map_memory(_fd, _full_path, si->_file_offset,
820 addr, regions[i].byte_size(),
si->_read_only,
821 si->_allow_exec);
What happens when the first region succeeds to map but the second region
fails to map? Will both regions be unmapped? I don't see where you store
the return value (base) from os::map_memory(). Does it mean the code
assumes that (addr == base). If so, we need an assert here.
constantPool.cpp
Handle refs_handle;
...
refs_handle = Handle(THREAD, (oop)archived);
This will first create a NULL handle, then construct a temporary handle,
and then assign the temp handle back to the null handle. This means two
handles will be pushed onto THREAD->metadata_handles()
I think it's more efficient if you merge these into a single statement
Handle refs_handle(THREAD, (oop)archived);
Is this experimental code? Maybe it should be removed?
664 if (tag_at(index).is_unresolved_klass()) {
665 #if 0
666 CPSlot entry = cp->slot_at(index);
667 Symbol* name = entry.get_symbol();
668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD);
669 if (k != NULL) {
670 klass_at_put(index, k);
671 }
672 #endif
673 } else
cpCache.hpp:
u8 _archived_references
shouldn't this be declared as an narrowOop to avoid the type casts when
it's used?
cpCache.cpp:
add assert so that one of these is used only at dump time and the
other only at run time?
610 oop ConstantPoolCache::archived_references() {
611 return oopDesc::decode_heap_oop((narrowOop)_archived_references);
612 }
613
614 void ConstantPoolCache::set_archived_references(oop o) {
615 _archived_references = (u8)oopDesc::encode_heap_oop(o);
616 }
Thanks!
- Ioi
On 7/27/17 1:37 PM, Jiangli Zhou wrote:
> Sorry, the mail didn't handle the rich text well. I fixed the format below.
>
> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string \
> entries and cache resolved_reference arrays in CDS archive). Currently, the CDS \
> archive can contain cached class metadata, interned java.lang.String objects. This \
> RFE adds the constant pool ‘resolved_references' arrays (hotspot specific) to the \
> archive for startup/runtime performance enhancement. The ‘resolved_references' \
> arrays are used to hold references of resolved constant pool entries including \
> Strings, mirrors, etc. With the 'resolved_references' being cached, string \
> constants in shared classes can now be resolved to existing interned \
> java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required.
> The GC changes in the RFE were discussed and guided by Thomas Schatzl and GC team. \
> Part of the changes were contributed by Thomas himself.
> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302
> hotspot: http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/
> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/
>
> Please see below for details of supporting cached ‘resolved_references' and \
> pre-resolving string constants.
> Types of Pinned G1 Heap Regions
>
> The pinned region type is a super type of all archive region types, which include \
> the open archive type and the closed archive type.
> 00100 0 [ 8] Pinned Mask
> 01000 0 [16] Old Mask
> 10000 0 [32] Archive Mask
> 11100 0 [56] Open Archive: ArchiveMask | PinnedMask | OldMask
> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1
>
>
> Pinned Regions
>
> Objects within the region are 'pinned', which means GC does not move any live \
> objects. GC scans and marks objects in the pinned region as normal, but skips \
> forwarding live objects. Pointers in live objects are updated. Dead objects \
> (unreachable) can be collected and freed.
> Archive Regions
>
> The archive types are sub-types of 'pinned'. There are two types of archive region \
> currently, open archive and closed archive. Both can support caching java heap \
> objects via the CDS archive.
> An archive region is also an old region by design.
>
> Open Archive (GC-RW) Regions
>
> Open archive region is GC writable. GC scans & marks objects within the region and \
> adjusts (updates) pointers in live objects the same way as a pinned region. Live \
> objects (reachable) are pinned and not forwarded by GC. Open archive region does \
> not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects \
> are not collected and freed by GC.
> Adjustable Outgoing Pointers
>
> As GC can adjust pointers within the live objects in open archive heap region, \
> objects can have outgoing pointers to another java heap region, including closed \
> archive region, open archive region, pinned (or humongous) region, and normal \
> generational region. When a referenced object is moved by GC, the pointer within \
> the open archive region is updated accordingly.
> Closed Archive (GC-RO) Regions
>
> The closed archive region is GC read-only region. GC cannot write into the region. \
> Objects are not scanned and marked by GC. Objects are pinned and not forwarded. \
> Pointers are not updated by GC either. Hence, objects within the archive region \
> cannot have any outgoing pointers to another java heap region. Objects however can \
> still have pointers to other objects within the closed archive regions (we might \
> allow pointers to open archive regions in the future). That restricts the type of \
> java objects that can be supported by the archive region. In JDK 9 we support \
> archive Strings with the archive regions.
> The GC-readonly archive region makes java heap memory sharable among different JVM \
> processes. NOTE: synchronization on the objects within the archive heap region can \
> still cause writes to the memory page.
> Dormant Objects
>
> Dormant objects are unreachable java objects within the open archive heap region.
> A java object in the open archive heap region is a live object if it can be reached \
> during scanning. Some of the java objects in the region may not be reachable during \
> scanning. Those objects are considered as dormant, but not dead. For example, a \
> constant pool 'resolved_references' array is reachable via the klass root if its \
> container klass (shared) is already loaded at the time during GC scanning. If a \
> shared klass is not yet loaded, the klass root is not scanned and it's constant \
> pool 'resolved_reference' array (A) in the open archive region is not reachable. \
> Then A is a dormant object.
> Object State Transition
>
> All java objects are initially dormant objects when open archive heap regions are \
> mapped to the runtime java heap. A dormant object becomes live object when the \
> associated shared class is loaded at runtime. Explicit call to \
> G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes \
> live. That should be the case for cached objects with strong roots as well, since \
> strong roots are only scanned at the start of GC marking (the initial marking) but \
> not during Remarking/Final marking. If a cached object becomes live during \
> concurrent marking phase, G1 may not find it and mark it live unless a call to \
> G1SATBCardTableModRefBS::enqueue() is made for the object.
> Currently, a live object in the open archive heap region cannot become dormant \
> again. This restriction simplifies GC requirement and guarantees all outgoing \
> pointers are updated by GC correctly. Only objects for shared classes from the \
> builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are \
> supported for caching.
> Caching Java Objects at Archive Dump Time
>
> The closed archive and open archive regions are allocated near the top of the dump \
> time java heap. Archived java objects are copied into the designated archive heap \
> regions. For example, String objects and the underlying 'value' arrays are copied \
> into the closed archive regions. All references to the archived objects (from \
> shared class metadata, string table, etc) are set to the new heap locations. A hash \
> table is used to keep track of all archived java objects during the copying process \
> to make sure java object is not archived more than once if reached from different \
> roots. It also makes sure references to the same archived object are updated using \
> the same new address location.
> Caching Constant Pool resolved_references Array
>
> The 'resolved_references' is an array that holds references of resolved constant \
> pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has \
> one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' \
> arrays are copied into the open archive regions during dump process. Prior to \
> copying the 'resolved_references' arrays, JVM iterates through constant pool \
> entries and resolves all JVM_CONSTANT_String entries to existing interned Strings \
> for all archived classes. When resolving, JVM only looks up the string table and \
> finds existing interned Strings without inserting new ones. If a string entry \
> cannot be resolved to an existing interned String, the constant pool entry remain \
> as unresolved. That prevents memory waste if a constant pool string entry is never \
> used at runtime.
> All String objects referenced by the string table are copied first into the closed \
> archive regions. The string table entry is updated with the new location when each \
> String object is archived. The JVM updates the resolved constant pool string \
> entries with the new object locations when copying the 'resolved_references' arrays \
> to the open archive regions. References to the 'resolved_references' arrays in the \
> ConstantPoolCache are also updated. At runtime as part of \
> ConstantPool::restore_unshareable_info() work, call \
> G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is \
> becoming live. A handle is created for the cached object and added to the \
> loader_data's handles.
> Runtime Java Heap With Cached Java Objects
>
>
> The closed archive regions (the string regions) and open archive regions are mapped \
> to the runtime java heap at the same offsets as the dump time offsets from the \
> runtime java heap base.
> Preliminary test execution and status:
>
> JPRT: passed
> Tier2-rt: passed
> Tier2-gc: passed
> Tier2-comp: passed
> Tier3-rt: passed
> Tier3-gc: passed
> Tier3-comp: passed
> Tier4-rt: passed
> Tier4-gc: passed
> Tier4-comp:6 jobs timed out, all other tests passed
> Tier5-rt: one test failed but passed when running locally, all other tests passed
> Tier5-gc: passed
> Tier5-comp: running
> hotspot_gc: two jobs timed out, all other tests passed
> hotspot_gc in CDS mode: two jobs timed out, all other tests passed
> vm.gc: passed
> vm.gc in CDS mode: passed
> Kichensink: passed
> Kichensink in CDS mode: passed
>
> Thanks,
> Jiangli
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic