[prev in list] [next in list] [prev in thread] [next in thread]
List: openjdk-serviceability-dev
Subject: Re: RFR: 8226575: OperatingSystemMXBean should be made container aware
From: Daniil Titov <daniil.x.titov () oracle ! com>
Date: 2019-12-12 3:25:49
Message-ID: 0569FA2A-AEB3-41E1-8306-11290159B030 () oracle ! com
[Download RAW message or body]
Hi Bob, David, Mandy, and Serguei,
Thank you for reviewing this change!
Best regards,
Daniil
On 12/11/19, 9:21 AM, "Bob Vandette" <bob.vandette@oracle.com> wrote:
Yes, I defer to Mandy on the best way to express the various Java exceptions.
I'm ok with the changes.
Thanks for getting this done for JDK14!
Bob.
> On Dec 11, 2019, at 12:12 PM, Daniil Titov <daniil.x.titov@oracle.com> wrote:
>
> Hi Serguei,
>
> Thank you for your comments. I will correct this nits before pushing the \
changes. >
> Hi Bob and David,
>
>> [Mandy Chung]
>>> I reviewed Metrics and Subsystem in this version.
>>> I don't need to see a new webrev.
>
> As I understood Mandy finished reviewing this fix. Just wanted to confirm with \
you if you are okey with that version of the fix (webrev.06) ? >
> Mach5 testing: tier1-tier6 and open/test/hotspot/jtreg/containers/docker tests \
passed. >
> Thank you,
> Daniil
>
>
>
> On 12/9/19, 6:02 PM, "serguei.spitsyn@oracle.com" \
<serguei.spitsyn@oracle.com> wrote: >
> Hi Daniil,
>
> It is not a full review, just some minor comments.
> In fact, I do not see real problems yet.
>
> http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java.frames.html
>
> 55 public long getTotalSwapSpaceSize() {
> 56 if (containerMetrics != null) {
> 57 long limit = containerMetrics.getMemoryAndSwapLimit();
> 58 // The memory limit metrics is not available if JVM
> runs on Linux host ( not in a docker container)
> 59 // or if a docker container was started without
> specifying a memory limit ( without '--memory='
> 60 // Docker option). In latter case there is no limit on
> how much memory the container can use and
> 61 // it can use as much memory as the host's OS allows.
> 62 long memLimit = containerMetrics.getMemoryLimit();
> 63 if (limit >= 0 && memLimit >= 0) {
> 64 return limit - memLimit;
> 65 }
> 66 }
> 67 return getTotalSwapSpaceSize0();
> 68 }
>
> Unneeded space after brackets '('.
> Do we need to check if the (limit - memLimit) value is negative?
> The same question is for getFreeSwapSpaceSize():
> memSwapLimit - memLimit - (memSwapUsage - memUsage)
>
> and getFreeMemorySize():
> 101 return limit - usage;
>
> 81 // If this happens just retry the loop for
> a few iterations
>
> Dot is missed at the end of comment.
>
>
> http://cr.openjdk.java.net/~dtitov/8226575/webrev.05/test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java.html
>
> 34 System.out.println(String.format("Runtime.availableProcessors:
> %d", Runtime.getRuntime().availableProcessors()));
> 35
> System.out.println(String.format("OperatingSystemMXBean.getAvailableProcessors: \
> %d", osBean.getAvailableProcessors()));
> 36
> System.out.println(String.format("OperatingSystemMXBean.getTotalMemorySize:
> %d", osBean.getTotalMemorySize()));
> 37
> System.out.println(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize: \
> %d", osBean.getTotalPhysicalMemorySize()));
> 38
> System.out.println(String.format("OperatingSystemMXBean.getFreeMemorySize:
> %d", osBean.getFreeMemorySize()));
> 39
> System.out.println(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: \
> %d", osBean.getFreePhysicalMemorySize()));
> 40
> System.out.println(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize: \
> %d", osBean.getTotalSwapSpaceSize()));
> 41
> System.out.println(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize: \
> %d", osBean.getFreeSwapSpaceSize()));
> 42
> System.out.println(String.format("OperatingSystemMXBean.getCpuLoad: %f",
> osBean.getCpuLoad()));
> 43
> System.out.println(String.format("OperatingSystemMXBean.getSystemCpuLoad:
> %f", osBean.getSystemCpuLoad()));
>
>
> To make the above lines a little bit shorter I'd suggest to define a
> log() method like this:
> private static void log(String msg) ( System.out.println(msg(; }
>
> 34 log(String.format("Runtime.availableProcessors: %d",
> Runtime.getRuntime().availableProcessors()));
> 35 log(String.format("OperatingSystemMXBean.getAvailableProcessors:
> %d", osBean.getAvailableProcessors()));
> 36 log(String.format("OperatingSystemMXBean.getTotalMemorySize: %d",
> osBean.getTotalMemorySize()));
> 37
> log(String.format("OperatingSystemMXBean.getTotalPhysicalMemorySize:
> %d", osBean.getTotalPhysicalMemorySize()));
> 38 log(String.format("OperatingSystemMXBean.getFreeMemorySize: %d",
> osBean.getFreeMemorySize()));
> 39
> log(String.format("OperatingSystemMXBean.getFreePhysicalMemorySize: %d",
> osBean.getFreePhysicalMemorySize()));
> 40 log(String.format("OperatingSystemMXBean.getTotalSwapSpaceSize:
> %d", osBean.getTotalSwapSpaceSize()));
> 41 log(String.format("OperatingSystemMXBean.getFreeSwapSpaceSize:
> %d", osBean.getFreeSwapSpaceSize()));
> 42 log(String.format("OperatingSystemMXBean.getCpuLoad: %f",
> osBean.getCpuLoad()));
> 43 log(String.format("OperatingSystemMXBean.getSystemCpuLoad: %f",
> osBean.getSystemCpuLoad()));
>
>
> Thanks,
> Serguei
>
>
>
> On 12/6/19 17:41, Daniil Titov wrote:
>> Hi David, Mandy, and Bob,
>>
>> Thank you for reviewing this fix.
>>
>> Please review a new version of the fix [1] that includes the following changes \
comparing to the previous version of the webrev ( webrev.04) >> 1. The changes in \
Javadoc made in the webrev.04 comparing to webrev.03 and to CSR [3] were discarded. \
>> 2. The implementation of methods getFreeMemorySize, getTotalMemorySize, \
> > getFreeSwapSpaceSize and getTotalSwapSpaceSize
>> was also reverted to webrev.03 version that return host's values if the \
metrics are unavailable or cannot be properly read. >> I would like to mention \
that currently the native implementation of these methods de-facto may return -1 at \
some circumstances, >> but I agree that the changes proposed in the previous \
version of the webrev increase such probability. >> I filed the follow-up issue \
[4] as Mandy suggested. >> 3. The legacy methods were renamed as David suggested.
>>
>>
>>> src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c
>>> ! static int initialized=1;
>>>
>>> Am I reading this right that the code currently fails to actually do the
>>> initialization because of this ???
>> Yes, currently the code fails to do the initialization but it was unnoticed \
since method >> get_cpuload_internal(...) was never called for a specific CPU, the \
first parameter "which" >> was always -1.
>>
>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>>
>>> System.out.println(String.format(...)
>>>
>>> Why not simply
>>>
>>> System.out.printf(..)
>> As I tried explain it earlier it would make the tests unstable.
>> System.out.printf(...) just delegates the call to System.out.format(...) that \
doesn't emit the string atomically. >> Instead it parses the format string into a \
list of FormatString objects and then iterates over the list. >> As a result, the \
other traces occasionally got printed between these iterations and break the pattern \
the test is expected to find >> in the output.
>>
>> For example, here is the sample of a such output that has the trace message \
printed between " OperatingSystemMXBean.getFreePhysicalMemorySize:" >> and \
"1030762496". >>
>> <skipped>
>> [0.304s][trace][os,container] Memory Usage is: 42983424
>> OperatingSystemMXBean.getFreeMemorySize: 1030758400
>> [0.305s][trace][os,container] Path to /memory.usage_in_bytes is \
/sys/fs/cgroup/memory/memory.usage_in_bytes >> [0.305s][trace][os,container] Memory \
Usage is: 42979328 >> [0.306s][trace][os,container] Path to /memory.usage_in_bytes \
is /sys/fs/cgroup/memory/memory.usage_in_bytes >> \
OperatingSystemMXBean.getFreePhysicalMemorySize: [0.306s][trace][os,container] Memory \
Usage is: 42975232 >> 1030762496
>> OperatingSystemMXBean.getTotalSwapSpaceSize: 499122176
>>
>> <skipped>
>> java.lang.RuntimeException: \
'OperatingSystemMXBean\\.getFreePhysicalMemorySize: [1-9][0-9]+' missing from \
stdout/stderr >>
>> at jdk.test.lib.process.OutputAnalyzer.shouldMatch(OutputAnalyzer.java:306)
>> at TestMemoryAwareness.testOperatingSystemMXBeanAwareness(TestMemoryAwareness.java:151)
>> at TestMemoryAwareness.main(TestMemoryAwareness.java:73)
>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native \
Method) >> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>> at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
>> at java.base/java.lang.Thread.run(Thread.java:832)
>>
>> Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests \
passed. Tier4-tier6 tests are still running. >>
>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8226575/webrev.05
>> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575
>> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
>> [4] https://bugs.openjdk.java.net/browse/JDK-8235522
>>
>> Thank you,
>> Daniil
>>
>> On 12/6/19, 1:38 PM, "Mandy Chung" <mandy.chung@oracle.com> wrote:
>>
>>
>>
>> On 12/6/19 5:59 AM, Bob Vandette wrote:
>>>> On Dec 6, 2019, at 2:49 AM, David Holmes<David.Holmes@oracle.com> wrote:
>>>>
>>>>
>>>> src/jdk.management/share/classes/com/sun/management/OperatingSystemMXBean.java
>>>>
>>>> The changes to allow for a return of -1 are somewhat more extensive than we \
have previously discussed. These methods previously were (per the spec) guaranteed to \
return some (assumably) meaningful value but now they are effectively allowed to fail \
by returning -1. No existing code is expecting to have to handle a return of -1 so I \
see this as a significant compatibility issue. >>
>> I thought that the error case we are referring to is limit == 0 which
>> indicates something unexpected goes wrong. So the compatibility concern
>> should be low. This is very specific to Metrics implementation for
>> cgroup v1 and let me know if I'm wrong.
>>
>>>> Surely there must always be some information available from the operating \
environment? I see from the impl file: >>>>
>>>> // the host data, value 0 indicates that something went wrong while the \
metric was read and >>>> // in this case we return "information unavailable" code \
-1. >>>>
>>>> I don't agree with this. If the container metrics are messed up somehow we \
should either fallback to the host value or else abort with some kind of exception. \
Returning -1 is not an option here IMO. >>> I agree with David on the compatibility \
concern. I originally thought that -1 was already a specified return for all of \
these methods. >>> Since the 0 return failure from the Metrics API should only occur \
if one of the cgroup subsystems is not enabled while others >>> are, I'd suggest we \
keep Daniil's original logic to fall back to the host value since a disabled \
subsystem is equivalent to no >>> limits.
>>>
>>
>> It's important to consider carefully if the monitoring API indicates an
>> error vs unavailable and an application should continue to run when the
>> monitoring system fails to get the metrics.
>>
>> There are several choices to report "something goes wrong" scenarios
>> (should unlikely happen???):
>> 1. fall back to a random positive value (e.g. host value)
>> 2. return a negative value
>> 3. throw an exception
>>
>> #3 is not an option as the application is not expecting this. For #2,
>> the application can filter bad values if desirable.
>>
>> I'm okay if you want to file a JBS issue to follow up and thoroughly
>> look at the cases that the metrics are unavailable and the cases when
>> fails to obtain.
>>
>>>> ---
>>>>
>>>> test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java
>>>>
>>>> System.out.println(String.format(...)
>>>>
>>>> Why not simply
>>>>
>>>> System.out.printf(..)
>>>>
>>>> ?
>>
>> or simply (as I commented [1])
>> System.out.format
>>
>> Mandy
>> [1]
>> https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-December/029930.html
>>
>>
>>
>>
>
>
>
>
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic