[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
From:       Daniil Titov <daniil.x.titov () oracle ! com>
Date:       2019-12-12 3:25:49
Message-ID: 0569FA2A-AEB3-41E1-8306-11290159B030 () oracle ! com
[Download RAW message or body]

Hi Bob, David, Mandy, and Serguei,

Thank you for reviewing this change!

Best regards,
Daniil

On 12/11/19, 9:21 AM, "Bob Vandette" <bob.vandette@oracle.com> wrote:

    Yes, I defer to Mandy on the best way to express the various Java exceptions.
    I'm ok with the changes.
    
    Thanks for getting this done for JDK14!
    
    Bob.
    
    > On Dec 11, 2019, at 12:12 PM, Daniil Titov <daniil.x.titov@oracle.com> wrote:
    > 
    > Hi Serguei,
    > 
    > Thank you for your comments. I will correct this nits before pushing the \
changes.  > 
    > Hi Bob and David,
    > 
    >> [Mandy Chung]
    >>> I reviewed Metrics and Subsystem in this version.
    >>> I don't need to see a new webrev.
    > 
    > As I understood Mandy finished reviewing this fix. Just wanted to confirm with \
you if you are okey with that version of the fix (webrev.06) ?  > 
    > Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests \
passed.   > 
    > Thank you,
    > Daniil
    > 
    > 
    > 
    > On 12/9/19, 6:02 PM, "serguei.spitsyn@oracle.com" \
<serguei.spitsyn@oracle.com> wrote:  > 
    >    Hi Daniil,
    > 
    >    It is not a full review, just some minor comments.
    >    In fact, I do not see real problems yet.
    > 
    >    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
  > 
    >       55     public long getTotalSwapSpaceSize() {
    >       56         if (containerMetrics != null) {
    >       57             long limit = containerMetrics.getMemoryAndSwapLimit();
    >       58             // The memory limit metrics is not available if JVM 
    >    runs on Linux host ( not in a docker container)
    >       59             // or if a docker container was started without 
    >    specifying a memory limit ( without '--memory='
    >       60             // Docker option). In latter case there is no limit on 
    >    how much memory the container can use and
    >       61             // it can use as much memory as the host's OS allows.
    >       62             long memLimit = containerMetrics.getMemoryLimit();
    >       63             if (limit >= 0 && memLimit >= 0) {
    >       64                 return limit - memLimit;
    >       65             }
    >       66         }
    >       67         return getTotalSwapSpaceSize0();
    >       68     }
    > 
    >       Unneeded space after brackets '('.
    >       Do we need to check if the (limit - memLimit) value is negative?
    >       The same question is for getFreeSwapSpaceSize():
    >         memSwapLimit - memLimit - (memSwapUsage - memUsage)
    > 
    >       and getFreeMemorySize():
    >         101 return limit - usage;
    > 
    >       81                         // If this happens just retry the loop for 
    >    a few iterations
    > 
    >       Dot is missed at the end of comment.
    > 
    > 
    >    http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
  > 
    >       34 System.out.println(String.format("Runtime.availableProcessors: 
    >    %d", Runtime.getRuntime().availableProcessors()));
    >       35 
    >    System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: \
  >    %d", osBean.getAvailableProcessors()));
    >       36 
    >    System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize: 
    >    %d", osBean.getTotalMemorySize()));
    >       37 
    >    System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: \
  >    %d", osBean.getTotalPhysicalMemorySize()));
    >       38 
    >    System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize: 
    >    %d", osBean.getFreeMemorySize()));
    >       39 
    >    System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: \
  >    %d", osBean.getFreePhysicalMemorySize()));
    >       40 
    >    System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: \
  >    %d", osBean.getTotalSwapSpaceSize()));
    >       41 
    >    System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: \
  >    %d", osBean.getFreeSwapSpaceSize()));
    >       42 
    >    System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f", 
    >    osBean.getCpuLoad()));
    >       43 
    >    System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad: 
    >    %f", osBean.getSystemCpuLoad()));
    > 
    > 
    >       To make the above lines a little bit shorter I'd suggest to define a 
    >    log() method like this:
    >          private static void log(String msg) ( System.out.println(msg(; }
    > 
    >       34         log(String.format("Runtime.availableProcessors: %d", 
    >    Runtime.getRuntime().availableProcessors()));
    >       35 log(String.format("OperatingSystemMXBean.getAvailableProcessors: 
    >    %d", osBean.getAvailableProcessors()));
    >       36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d", 
    >    osBean.getTotalMemorySize()));
    >       37 
    >    log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: 
    >    %d", osBean.getTotalPhysicalMemorySize()));
    >       38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d", 
    >    osBean.getFreeMemorySize()));
    >       39 
    >    log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d", 
    >    osBean.getFreePhysicalMemorySize()));
    >       40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: 
    >    %d", osBean.getTotalSwapSpaceSize()));
    >       41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: 
    >    %d", osBean.getFreeSwapSpaceSize()));
    >       42         log(String.format("OperatingSystemMXBean.getCpuLoad: %f", 
    >    osBean.getCpuLoad()));
    >       43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f", 
    >    osBean.getSystemCpuLoad()));
    > 
    > 
    >    Thanks,
    >    Serguei
    > 
    > 
    > 
    >    On 12/6/19 17:41, Daniil Titov wrote:
    >> Hi David, Mandy, and Bob,
    >> 
    >> Thank you for reviewing this fix.
    >> 
    >> Please review a new version of the fix [1] that includes the following changes \
comparing to the previous version of the webrev ( webrev.04)  >> 1. The changes in \
Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded.  \
>> 2.  The implementation of methods getFreeMemorySize, getTotalMemorySize, \
> > getFreeSwapSpaceSize and getTotalSwapSpaceSize
    >>      was also reverted to webrev.03 version that return host's values if the \
metrics are unavailable or cannot be properly read.  >>      I would like to mention \
that  currently the native implementation of these methods de-facto may return -1 at \
some circumstances,  >>      but I agree that the changes proposed in the previous \
version of the webrev increase such probability.  >>      I filed the follow-up issue \
[4] as Mandy suggested.  >> 3.  The legacy methods were renamed as David suggested.
    >> 
    >> 
    >>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
    >>> !     static int initialized=1;
    >>> 
    >>>  Am I reading this right that the code currently fails to actually do the
    >>> initialization because of this ???
    >> Yes, currently the code fails to do the initialization but it was unnoticed \
since method  >> get_cpuload_internal(...) was never called for a specific CPU, the \
first parameter "which"  >> was always -1.
    >> 
    >>>  test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>> 
    >>> System.out.println(String.format(...)
    >>> 
    >>> Why not simply
    >>> 
    >>> System.out.printf(..)
    >> As I tried explain it earlier it would make the tests unstable.
    >> System.out.printf(...) just delegates the call to System.out.format(...) that \
doesn't emit the string atomically.  >> Instead it parses the format string into a \
list of FormatString objects and then iterates over the list.  >> As a result, the \
other traces occasionally got printed between these iterations  and break the pattern \
the test is expected to find  >> in the output.
    >> 
    >> For example, here is the sample of a such output that has the trace message \
printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:"  >> and \
"1030762496".  >> 
    >> <skipped>
    >> [0.304s][trace][os,container] Memory Usage is: 42983424
    >> OperatingSystemMXBean.getFreeMemorySize: 1030758400
    >> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is \
/sys/fs/cgroup/memory/memory.usage_in_bytes  >> [0.305s][trace][os,container] Memory \
Usage is: 42979328  >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes \
is /sys/fs/cgroup/memory/memory.usage_in_bytes  >> \
OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory \
Usage is: 42975232  >> 1030762496
    >> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
    >> 
    >> <skipped>
    >> java.lang.RuntimeException: \
'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from \
stdout/stderr  >> 
    >> 	at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
    >> 	at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
  >> 	at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
    >> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native \
Method)  >> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  >> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  >> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    >> 	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
  >> 	at java.base/java.lang.Thread.run(Thread.java:832)
    >> 
    >> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests \
passed. Tier4-tier6 tests are still running.  >> 
    >> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
    >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
    >> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
    >> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
    >> 
    >> Thank you,
    >> Daniil
    >> 
    >> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung@oracle.com> wrote:
    >> 
    >> 
    >> 
    >>     On 12/6/19 5:59 AM, Bob Vandette wrote:
    >>>> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes@oracle.com>  wrote:
    >>>> 
    >>>> 
    >>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
  >>>> 
    >>>> The changes to allow for a return of -1 are somewhat more extensive than we \
have previously discussed. These methods previously were (per the spec) guaranteed to \
return some (assumably) meaningful value but now they are effectively allowed to fail \
by returning -1. No existing code is expecting to have to handle a return of -1 so I \
see this as a significant compatibility issue.  >> 
    >>     I thought that the error case we are referring to is limit == 0 which
    >>     indicates something unexpected goes wrong.  So the compatibility concern
    >>     should be low.  This is very specific to Metrics implementation for
    >>     cgroup v1 and let me know if I'm wrong.
    >> 
    >>>> Surely there must always be some information available from the operating \
environment? I see from the impl file:  >>>> 
    >>>>    // the host data, value 0 indicates that something went wrong while the \
metric was read and  >>>>   // in this case we return "information unavailable" code \
-1.  >>>> 
    >>>> I don't agree with this. If the container metrics are messed up somehow we \
should either fallback to the host value or else abort with some kind of exception. \
Returning -1 is not an option here IMO.  >>> I agree with David on the compatibility \
concern.  I originally thought that -1 was already a specified return for all of \
these methods.  >>> Since the 0 return failure from the Metrics API should only occur \
if one of the cgroup subsystems is not enabled while others  >>> are, I'd suggest we \
keep Daniil's original logic to fall back to the host value since a disabled \
subsystem is equivalent to no  >>> limits.
    >>> 
    >> 
    >>     It's important to consider carefully if the monitoring API indicates an
    >>     error vs unavailable and an application should continue to run when the
    >>     monitoring system fails to get the metrics.
    >> 
    >>     There are several choices to report "something goes wrong" scenarios
    >>     (should unlikely happen???):
    >>     1. fall back to a random positive value  (e.g. host value)
    >>     2. return a negative value
    >>     3. throw an exception
    >> 
    >>     #3 is not an option as the application is not expecting this.  For #2,
    >>     the application can filter bad values if desirable.
    >> 
    >>     I'm okay if you want to file a JBS issue to follow up and thoroughly
    >>     look at the cases that the metrics are unavailable and the cases when
    >>     fails to obtain.
    >> 
    >>>> ---
    >>>> 
    >>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
    >>>> 
    >>>> System.out.println(String.format(...)
    >>>> 
    >>>> Why not simply
    >>>> 
    >>>> System.out.printf(..)
    >>>> 
    >>>> ?
    >> 
    >>     or simply (as I commented [1])
    >>          System.out.format
    >> 
    >>     Mandy
    >>     [1]
    >>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
  >> 
    >> 
    >> 
    >> 
    > 
    > 
    > 
    > 
    
    


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic