[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-jmx-dev
Subject:    Re: jmx-dev RFR 6523160: RuntimeMXBean.getUptime() returns negative values
From:       Staffan Larsen <staffan.larsen () oracle ! com>
Date:       2013-10-18 11:02:29
Message-ID: B5A388E3-EF9A-4723-82F0-8550CE22290B () oracle ! com
[Download RAW message or body]

Looks good to me.

Thanks,
/Staffan

On 17 okt 2013, at 14:09, Jaroslav Bachorik <jaroslav.bachorik@oracle.com> wrote:

> On 17.10.2013 13:07, David Holmes wrote:
>> On 17/10/2013 8:10 PM, Jaroslav Bachorik wrote:
>>> Hi David,
>>> 
>>> On 17.10.2013 04:26, David Holmes wrote:
>>>> Hi Jaroslav,
>>>> 
>>>> Minor nit: os::elapsed_time should really be defined in terms of the
>>>> other functions ie:
>>>> 
>>>> return ((double) os::elapsed_counter()) / os::elapsed_frequency();
>>> 
>>> Ok. I've changed it. It better communicates the way the elapsedTime is
>>> calculated anyway.
>>> 
>>>> 
>>>> I also prefer the cast above as it is very clear that we will be doing a
>>>> floating-point division.
>>>> 
>>>> Aside: AFAICS os::elapsed_time() is never actually used ??
>>> 
>>> Actually, it is os::elapsedTime() and this one is used quite a lot. The
>>> "elapsed_time()" form is used only in bytecodeHistogram.hpp,
>>> parNewGeneration.hpp and g1CollectedHeap.hpp, where it is also declared.
>> 
>> AFAICS they all define their own elapsed_time() functions they don't use
>> os::elapsed_time().
>> 
>>>> 
>>>> I agree that it appears that changing the frequency should be okay.
>>> 
>>> Thanks for the feedback.
>>> 
>>> Updated webrev: http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03
>> 
>> That should be .04 version :)
> 
> Yep :( copy-paste ...
> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.04
> 
>> 
>> Looks okay.
>> 
>> Thanks,
>> David
>> 
>>> -JB-
>>> 
>>>> 
>>>> Thanks,
>>>> David
>>>> 
>>>> On 17/10/2013 2:16 AM, Jaroslav Bachorik wrote:
>>>>> On 15.10.2013 08:49, David Holmes wrote:
>>>>>> Hi Jaroslav,
>>>>>> 
>>>>>> os_bsd.cpp / os_linux.cpp:
>>>>>> 
>>>>>> If you don't have a monotonic clock you leave timer_frequency set
>>>>>> to 0!
>>>>>> (So you need to test on a system without a monotonic clock, or else
>>>>>> force it to act as-if not present.)
>>>>>> 
>>>>>> That aside I don't trust clock_getres to give values that actually
>>>>>> allow
>>>>>> the timer frequency to be determined. As per the comments in
>>>>>> os_linux.cpp:
>>>>>> 
>>>>>> // It's fixed in newer kernels, however clock_getres() still returns
>>>>>> // 1/HZ. We check if clock_getres() works, but will ignore its
>>>>>> reported
>>>>>> // resolution for now. Hopefully as people move to new kernels, this
>>>>>> // won't be a problem.
>>>>>> 
>>>>>> we don't know what kernels provide real values here and which provide
>>>>>> dummy ones.
>>>>>> 
>>>>>> On BSD you haven't modified os::elapsed_counter.
>>>>>> 
>>>>>> Looking at the linux changes I don't think the logic is correct
>>>>>> even if
>>>>>> clock_getres is accurate. In the existing code we have:
>>>>>> 
>>>>>> elapsed_counter -> elapsed time in microseconds
>>>>>> elapsed_frequency -> 1000 * 1000 (ie micros per second)
>>>>>> elapsed_time -> elapsed_counter*0.000001 -> time in seconds
>>>>>> 
>>>>>> Now we have:
>>>>>> 
>>>>>> elapsed_counter -> elapsed time in nanoseconds
>>>>>> elapsed_frequency -> 1x10^9 / whatever clock_getres says
>>>>>> elapsed_time -> counter/frequency -> ???
>>>>>> 
>>>>>> So elapsed_time not, in general, going to give the elapsed time in
>>>>>> seconds. And elapsed_time is not dependent on the "frequency" at all
>>>>>> because elapsed_counter is not reporting ticks but an actual elapsed
>>>>>> "time" in nanoseconds.
>>>>>> 
>>>>>> 
>>>>>> Also note that we constants for:
>>>>>> 
>>>>>> NANOSECS_PER_SEC
>>>>>> NANOSECS_PER_MILLISEC
>>>>>> 
>>>>>> to aid with time conversions.
>>>>>> 
>>>>>> The linux webrev contains unrelated UseLargePages changes!
>>>>> 
>>>>> Sorry for the mess with UseLargePages changes :/
>>>>> 
>>>>> I've fixed the problems with the frequency (using a fixed number as
>>>>> before) and I kept the changes to minimum.
>>>>> 
>>>>> I was hesitating about changing the elapsed_counter precision from
>>>>> microseconds to nanoseconds but since solaris and windows versions
>>>>> already use nanosecond ticks for elapsed_counter I think the change is
>>>>> safe.
>>>>> 
>>>>> The update webrev:
>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.03
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> David
>>>>>> -----
>>>>>> 
>>>>>> 
>>>>>> On 15/10/2013 12:13 AM, Jaroslav Bachorik wrote:
>>>>>>> On 10.10.2013 13:15, Staffan Larsen wrote:
>>>>>>>> 
>>>>>>>> On 10 okt 2013, at 13:02, Jaroslav Bachorik
>>>>>>>> <jaroslav.bachorik@oracle.com> wrote:
>>>>>>>> 
>>>>>>>>> On 10.10.2013 05:44, David Holmes wrote:
>>>>>>>>>> On 10/10/2013 4:12 AM, Staffan Larsen wrote:
>>>>>>>>>>> 
>>>>>>>>>>> On 9 okt 2013, at 16:19, Jaroslav Bachorik
>>>>>>>>>>> <jaroslav.bachorik@oracle.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> On 9.10.2013 16:10, Staffan Larsen wrote:
>>>>>>>>>>>>> There is now an awful amount of different timestamps in the
>>>>>>>>>>>>> Management class. Can they be consolidated somehow? At least
>>>>>>>>>>>>> _begin_vm_creation_time and the new _begin_vm_creation_ns.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This discussion also implies that the "elapsed time" we
>>>>>>>>>>>>> print in
>>>>>>>>>>>>> the
>>>>>>>>>>>>> hs_err file is also wrong. It uses os::elapsedTime() which uses
>>>>>>>>>>>>> os::elapsed_counter().
>>>>>>>>>>>>> 
>>>>>>>>>>>>> And I guess the same thing for the VM.uptime Diagnostic Command
>>>>>>>>>>>>> (class VMUptimeDCmd) which also relies on
>>>>>>>>>>>>> os::elapsed_counter().
>>>>>>>>>>>> 
>>>>>>>>>>>> Also the reported GC pauses duration might be wrong since it
>>>>>>>>>>>> uses
>>>>>>>>>>>> Management::timestamp().
>>>>>>>>>>>> 
>>>>>>>>>>>> On the first sight the change looks rather trivial. But,
>>>>>>>>>>>> honestly,
>>>>>>>>>>>> I'm not sure which other parts could for whatever reason break
>>>>>>>>>>>> once
>>>>>>>>>>>> the time-of-day timestamp is replaced with a monotonic
>>>>>>>>>>>> equivalent.
>>>>>>>>>>>> One would think that it shouldn't matter but one never knows ...
>>>>>>>>>>>> 
>>>>>>>>>>>> Staffan, do you think this kind of change is suitable for the
>>>>>>>>>>>> current
>>>>>>>>>>>> phase of JDK release cycle? I think I could improve the patch in
>>>>>>>>>>>> few
>>>>>>>>>>>> days and then it should probably be able to pass the review
>>>>>>>>>>>> before
>>>>>>>>>>>> ZBB. But, it's only P3  ...
>>>>>>>>>>> 
>>>>>>>>>>> I think it is a bit late in the release cycle to clean this up in
>>>>>>>>>>> the
>>>>>>>>>>> way it should be cleaned up. I think we should wait until the
>>>>>>>>>>> first 8
>>>>>>>>>>> update release and do a more thorough job than we have time for
>>>>>>>>>>> right
>>>>>>>>>>> now.
>>>>>>>>>> 
>>>>>>>>>> I second that. The elapsed_counter/elpased_timer APIs and
>>>>>>>>>> implementations are a tangled mess. But part of the problem has
>>>>>>>>>> been
>>>>>>>>>> that people want/expect monotonic time-of-day based timestamps
>>>>>>>>>> (yes a
>>>>>>>>>> contradiction - though some people make sure TOD does not get
>>>>>>>>>> modified
>>>>>>>>>> on their production systems). The use of timestamps in logging has
>>>>>>>>>> to be
>>>>>>>>>> examined carefully - mainly GC logging. I recall a "simple"
>>>>>>>>>> attempted
>>>>>>>>>> change in the past that resulted in trying to compare a nanoTime
>>>>>>>>>> based
>>>>>>>>>> timestamp with the TOD. :(
>>>>>>>>> 
>>>>>>>>> Actually, if I'm reading the sources right for Solaris and Win the
>>>>>>>>> monotonic clock source is used to provide elapsed_counter()
>>>>>>>>> value. It
>>>>>>>>> falls back to TOD when the monotonic clock source is not available.
>>>>>>>>> For Linux/BSD the TOD is used directly.
>>>>>>>>> 
>>>>>>>>> This makes me wonder if changing the linux/bsd implementation to
>>>>>>>>> follow the same logic would be really that disruptive.
>>>>>>>> 
>>>>>>>> Good point. I would like a world where elapsed_counter is monotonic
>>>>>>>> (where possible). Still a bit scary this late in the release, but an
>>>>>>>> interesting experiment.
>>>>>>> 
>>>>>>> The change is rather simple and tests ok. All the means to get a
>>>>>>> monotonic timestamp are already there and proved to work. The core
>>>>>>> tests
>>>>>>> in JPRT went fine.
>>>>>>> 
>>>>>>> The updated webrev is at
>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.02
>>>>>>> 
>>>>>>> -JB-
>>>>>>> 
>>>>>>>> 
>>>>>>>> /Staffan
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -JB-
>>>>>>>>>> 
>>>>>>>>>> David
>>>>>>>>>> -----
>>>>>>>>>> 
>>>>>>>>>>> /Staffan
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> -JB-
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 9 okt 2013, at 13:26, Jaroslav Bachorik
>>>>>>>>>>>>> <jaroslav.bachorik@oracle.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On 8.10.2013 23:46, David Holmes wrote:
>>>>>>>>>>>>>>> On 8/10/2013 10:36 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>>> On 8.10.2013 09:34, David Holmes wrote:
>>>>>>>>>>>>>>>>> Jaroslav,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On 2/10/2013 6:47 PM, Jaroslav Bachorik wrote:
>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> currently the JVM uptime reported by the RuntimeMXBean is
>>>>>>>>>>>>>>>>>> based on
>>>>>>>>>>>>>>>>>> System.currentTimeMillis() which makes it susceptible to
>>>>>>>>>>>>>>>>>> changes of the
>>>>>>>>>>>>>>>>>> OS time (eg. changing timezone, NTP synchronization etc.).
>>>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>>>> uptime
>>>>>>>>>>>>>>>>>> should not depend on the system time and should be
>>>>>>>>>>>>>>>>>> calculated
>>>>>>>>>>>>>>>>>> using a
>>>>>>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> There is already the way to get the actual JVM uptime in
>>>>>>>>>>>>>>>>>> ticks.
>>>>>>>>>>>>>>>>>> It is
>>>>>>>>>>>>>>>>>> accessible as Management::timestamp() and the ticks are
>>>>>>>>>>>>>>>>>> convertible to
>>>>>>>>>>>>>>>>>> milliseconds using Management::ticks_to_ms(ts_ticks) thus
>>>>>>>>>>>>>>>>>> making it
>>>>>>>>>>>>>>>>>> very
>>>>>>>>>>>>>>>>>> easy to switch to the monotonic clock based uptime.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Maybe I'm missing something but TiumeStamp updates using
>>>>>>>>>>>>>>>>> os::elapsed_counter() which on Linux uses gettimeofday
>>>>>>>>>>>>>>>>> so is
>>>>>>>>>>>>>>>>> not a
>>>>>>>>>>>>>>>>> monotonic clock source.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hm, yes. I wasn't aware of this linux/bsd specific.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Is there any reason why a non monotonic clock source is used
>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>> timestamping except of the historical one?
>>>>>>>>>>>>>>>> os::javaTimeNanos()
>>>>>>>>>>>>>>>> uses
>>>>>>>>>>>>>>>> montonic clock when available - why can't be the same used
>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>> os::elapsed_counter() especially when a counter based on
>>>>>>>>>>>>>>>> "gettimeofday"
>>>>>>>>>>>>>>>> is not really a counter?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> It is all historical. These elapsed_counters and
>>>>>>>>>>>>>>> elapsed_timers
>>>>>>>>>>>>>>> make me
>>>>>>>>>>>>>>> cringe. But changing it has a lot of potential consequences
>>>>>>>>>>>>>>> because of
>>>>>>>>>>>>>>> the way these are used in logging etc. Certainly not
>>>>>>>>>>>>>>> something
>>>>>>>>>>>>>>> to be
>>>>>>>>>>>>>>> contemplated at this stage of JDK 8.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Perhaps a simpler fix here is to expose a startUpTimeNanos
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>> can then
>>>>>>>>>>>>>>> be used for the uptime.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> My attempt at this is at
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.01/hotspot
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I am using os::javaTimeNanos() to get the monotonic ticks
>>>>>>>>>>>>>> where
>>>>>>>>>>>>>> possible.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The JDK part stays the same as for webrev.00
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>> -----
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> The patch consists of the hotspot and jdk parts.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> For the hotspot a new constant needs to be introduced in
>>>>>>>>>>>>>>>>>> src/share/vm/services/jmm.h and the actual logic to obtain
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> uptime in
>>>>>>>>>>>>>>>>>> milliseconds is added in
>>>>>>>>>>>>>>>>>> src/share/vm/services/management.cpp.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> For the jdk the changes comprise of adding the necessary
>>>>>>>>>>>>>>>>>> JNI
>>>>>>>>>>>>>>>>>> bridging
>>>>>>>>>>>>>>>>>> methods in order to get the new uptime, introducing the
>>>>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>> constant
>>>>>>>>>>>>>>>>>> that is used in hotspot and changes to mapfile-vers
>>>>>>>>>>>>>>>>>> files in
>>>>>>>>>>>>>>>>>> order to
>>>>>>>>>>>>>>>>>> properly build the native library.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Issue:   https://bugs.openjdk.java.net/browse/JDK-6523160
>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jbachorik/6523160/webrev.00
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>> 
>>> 
> 

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic