[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-hotspot-runtime-dev
Subject:    Re: RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CD
From:       Ioi Lam <ioi.lam () oracle ! com>
Date:       2017-07-31 4:07:50
Message-ID: 4172849a-55e6-7133-90d6-2c75b58b0391 () oracle ! com
[Download RAW message or body]

Hi Jiangli,

Here are my comments. I've not reviewed the GC code and I'll leave that 
to the GC experts :-)

stringTable.cpp: StringTable::archive_string

     add assert for DumpSharedSpaces only

filemap.cpp

  525 void 
FileMapInfo::write_archive_heap_regions(GrowableArray<MemRegion> *regions,
  526                                              int first_region, int 
num_regions) {

When I first read this function, I found it hard to follow, especially 
this part that coalesces the trailing regions:

  537         int len = regions->length();
  538         if (len > 1) {
  539           start = (char*)regions->at(1).start();
  540           size = (char*)regions->at(len - 1).end() - start;
  541         }
  542       }

The rest of filemap.cpp always perform identical operations on MemRegion 
arrays, which are either 1 or 2 in size. However, this function doesn't 
follow that pattern; it also has a very different notion of "region", 
and the confusing part is regions->size() is not the same as num_regions.

How about we change the API to something like the following? Before 
calling this API, the caller needs to coalesce the trailing G1 regions 
into a single MemRegion.

      FileMapInfo::write_archive_heap_regions(MemRegion *regions, int 
first, int num_regions) {
         if (first == MetaspaceShared::first_string) {
            assert(num_regons <=  MetaspaceShared::max_strings, "...");
         } else {
            assert(first == 
MetaspaceShared::first_open_archive_heap_region, "...");
            assert(num_regons <= 
MetaspaceShared::max_open_archive_heap_region, "...");
}
         ....



  756   if (!string_data_mapped) {
  757     StringTable::ignore_shared_strings(true);
  758     assert(string_ranges == NULL && num_string_ranges == 0, "sanity");
  759   }
  760
  761   if (open_archive_heap_data_mapped) {
  762     MetaspaceShared::set_open_archive_heap_region_mapped();
  763   } else {
  764     assert(open_archive_heap_ranges == NULL && 
num_open_archive_heap_ranges == 0, "sanity");
  765   }

Maybe the two "if" statements should be more consistent? Instead of 
StringTable::ignore_shared_strings, how about 
StringTable::set_shared_strings_region_mapped()?

FileMapInfo::map_heap_data() --

  818     char* addr = (char*)regions[i].start();
  819     char* base = os::map_memory(_fd, _full_path, si->_file_offset,
  820                                 addr, regions[i].byte_size(), 
si->_read_only,
  821                                 si->_allow_exec);

What happens when the first region succeeds to map but the second region 
fails to map? Will both regions be unmapped? I don't see where you store 
the return value (base) from os::map_memory(). Does it mean the code 
assumes that (addr == base). If so, we need an assert here.

constantPool.cpp

      Handle refs_handle;
      ...
      refs_handle = Handle(THREAD, (oop)archived);

This will first create a NULL handle, then construct a temporary handle, 
and then assign the temp handle back to the null handle. This means two 
handles will be pushed onto THREAD->metadata_handles()

I think it's more efficient if you merge these into a single statement

      Handle refs_handle(THREAD, (oop)archived);

Is this experimental code? Maybe it should be removed?

  664     if (tag_at(index).is_unresolved_klass()) {
  665 #if 0
  666       CPSlot entry = cp->slot_at(index);
  667       Symbol* name = entry.get_symbol();
  668       Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD);
  669       if (k != NULL) {
  670         klass_at_put(index, k);
  671       }
  672 #endif
  673     } else

cpCache.hpp:

      u8   _archived_references

shouldn't this be declared as an narrowOop to avoid the type casts when 
it's used?

cpCache.cpp:

     add assert so that one of these is used only at dump time and the 
other only at run time?

  610 oop ConstantPoolCache::archived_references() {
  611   return oopDesc::decode_heap_oop((narrowOop)_archived_references);
  612 }
  613
  614 void ConstantPoolCache::set_archived_references(oop o) {
  615   _archived_references = (u8)oopDesc::encode_heap_oop(o);
  616 }

Thanks!
- Ioi

On 7/27/17 1:37 PM, Jiangli Zhou wrote:
> Sorry, the mail didn't handle the rich text well. I fixed the format below.
> 
> Please help review the changes for JDK-8179302 (Pre-resolve constant pool string \
> entries and cache resolved_reference arrays in CDS archive). Currently, the CDS \
> archive can contain cached class metadata, interned java.lang.String objects. This \
> RFE adds the constant pool ‘resolved_references' arrays (hotspot specific) to the \
> archive for startup/runtime performance enhancement. The ‘resolved_references' \
> arrays are used to hold references of resolved constant pool entries including \
> Strings, mirrors, etc. With the 'resolved_references' being cached, string \
> constants in shared classes can now be resolved to existing interned \
> java.lang.Strings at CDS dump time. G1 and 64-bit platforms are required. 
> The GC changes in the RFE were discussed and guided by Thomas Schatzl and GC team. \
>                 Part of the changes were contributed by Thomas himself.
> RFE:        https://bugs.openjdk.java.net/browse/JDK-8179302
> hotspot:   http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/
> whitebox: http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/
> 
> Please see below for details of supporting cached ‘resolved_references' and \
> pre-resolving string constants. 
> Types of Pinned G1 Heap Regions
> 
> The pinned region type is a super type of all archive region types, which include \
> the open archive type and the closed archive type. 
> 00100 0 [ 8] Pinned Mask
> 01000 0 [16] Old Mask
> 10000 0 [32] Archive Mask
> 11100 0 [56] Open Archive:   ArchiveMask | PinnedMask | OldMask
> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | OldMask + 1
> 
> 
> Pinned Regions
> 
> Objects within the region are 'pinned', which means GC does not move any live \
> objects. GC scans and marks objects in the pinned region as normal, but skips \
> forwarding live objects. Pointers in live objects are updated. Dead objects \
> (unreachable) can be collected and freed. 
> Archive Regions
> 
> The archive types are sub-types of 'pinned'. There are two types of archive region \
> currently, open archive and closed archive. Both can support caching java heap \
> objects via the CDS archive. 
> An archive region is also an old region by design.
> 
> Open Archive (GC-RW) Regions
> 
> Open archive region is GC writable. GC scans & marks objects within the region and \
> adjusts (updates) pointers in live objects the same way as a pinned region. Live \
> objects (reachable) are pinned and not forwarded by GC. Open archive region does \
> not have 'dead' objects. Unreachable objects are 'dormant' objects. Dormant objects \
> are not collected and freed by GC. 
> Adjustable Outgoing Pointers
> 
> As GC can adjust pointers within the live objects in open archive heap region, \
> objects can have outgoing pointers to another java heap region, including closed \
> archive region, open archive region, pinned (or humongous) region, and normal \
> generational region. When a referenced object is moved by GC, the pointer within \
> the open archive region is updated accordingly. 
> Closed Archive (GC-RO) Regions
> 
> The closed archive region is GC read-only region. GC cannot write into the region. \
> Objects are not scanned and marked by GC. Objects are pinned and not forwarded. \
> Pointers are not updated by GC either. Hence, objects within the archive region \
> cannot have any outgoing pointers to another java heap region. Objects however can \
> still have pointers to other objects within the closed archive regions (we might \
> allow pointers to open archive regions in the future). That restricts the type of \
> java objects that can be supported by the archive region. In JDK 9 we support \
> archive Strings with the archive regions. 
> The GC-readonly archive region makes java heap memory sharable among different JVM \
> processes. NOTE: synchronization on the objects within the archive heap region can \
> still cause writes to the memory page. 
> Dormant Objects
> 
> Dormant objects are unreachable java objects within the open archive heap region.
> A java object in the open archive heap region is a live object if it can be reached \
> during scanning. Some of the java objects in the region may not be reachable during \
> scanning. Those objects are considered as dormant, but not dead. For example, a \
> constant pool 'resolved_references' array is reachable via the klass root if its \
> container klass (shared) is already loaded at the time during GC scanning. If a \
> shared klass is not yet loaded, the klass root is not scanned and it's constant \
> pool 'resolved_reference' array (A) in the open archive region is not reachable. \
> Then A is a dormant object. 
> Object State Transition
> 
> All java objects are initially dormant objects when open archive heap regions are \
> mapped to the runtime java heap. A dormant object becomes live object when the \
> associated shared class is loaded at runtime. Explicit call to \
> G1SATBCardTableModRefBS::enqueue() needs to be made when a dormant object becomes \
> live. That should be the case for cached objects with strong roots as well, since \
> strong roots are only scanned at the start of GC marking (the initial marking) but \
> not during Remarking/Final marking. If a cached object becomes live during \
> concurrent marking phase, G1 may not find it and mark it live unless a call to \
> G1SATBCardTableModRefBS::enqueue() is made for the object. 
> Currently, a live object in the open archive heap region cannot become dormant \
> again. This restriction simplifies GC requirement and guarantees all outgoing \
> pointers are updated by GC correctly. Only objects for shared classes from the \
> builtin class loaders (boot, PlatformClassLoaders, and AppClassLoaders) are \
> supported for caching. 
> Caching Java Objects at Archive Dump Time
> 
> The closed archive and open archive regions are allocated near the top of the dump \
> time java heap. Archived java objects are copied into the designated archive heap \
> regions. For example, String objects and the underlying 'value' arrays are copied \
> into the closed archive regions. All references to the archived objects (from \
> shared class metadata, string table, etc) are set to the new heap locations. A hash \
> table is used to keep track of all archived java objects during the copying process \
> to make sure java object is not archived more than once if reached from different \
> roots. It also makes sure references to the same archived object are updated using \
> the same new address location. 
> Caching Constant Pool resolved_references Array
> 
> The 'resolved_references' is an array that holds references of resolved constant \
> pool entries including Strings, mirrors and methodTypes, etc. Each loaded class has \
> one 'resolved_references' array (in ConstantPoolCache). The 'resolved_references' \
> arrays are copied into the open archive regions during dump process. Prior to \
> copying the 'resolved_references' arrays, JVM iterates through constant pool \
> entries and resolves all JVM_CONSTANT_String entries to existing interned Strings \
> for all archived classes. When resolving, JVM only looks up the string table and \
> finds existing interned Strings without inserting new ones. If a string entry \
> cannot be resolved to an existing interned String, the constant pool entry remain \
> as unresolved. That prevents memory waste if a constant pool string entry is never \
> used at runtime. 
> All String objects referenced by the string table are copied first into the closed \
> archive regions. The string table entry is updated with the new location when each \
> String object is archived. The JVM updates the resolved constant pool string \
> entries with the new object locations when copying the 'resolved_references' arrays \
> to the open archive regions. References to the 'resolved_references' arrays in the \
> ConstantPoolCache are also updated. At runtime as part of \
> ConstantPool::restore_unshareable_info() work, call \
> G1SATBCardTableModRefBS::enqueue() to let GC know the 'resolved_references' is \
> becoming live. A handle is created for the cached object and added to the \
> loader_data's handles. 
> Runtime Java Heap With Cached Java Objects
> 
> 
> The closed archive regions (the string regions) and open archive regions are mapped \
> to the runtime java heap at the same offsets as the dump time offsets from the \
> runtime java heap base. 
> Preliminary test execution and status:
> 
> JPRT: passed
> Tier2-rt: passed
> Tier2-gc: passed
> Tier2-comp: passed
> Tier3-rt: passed
> Tier3-gc: passed
> Tier3-comp: passed
> Tier4-rt: passed
> Tier4-gc: passed
> Tier4-comp:6 jobs timed out, all other tests passed
> Tier5-rt: one test failed but passed when running locally, all other tests passed
> Tier5-gc: passed
> Tier5-comp: running
> hotspot_gc: two jobs timed out, all other tests passed
> hotspot_gc in CDS mode: two jobs timed out, all other tests passed
> vm.gc: passed
> vm.gc in CDS mode: passed
> Kichensink: passed
> Kichensink in CDS mode: passed
> 
> Thanks,
> Jiangli


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic