[prev in list] [next in list] [prev in thread] [next in thread]
List: openjdk-hotspot-runtime-dev
Subject: Integrated: 8297154: Improve safepoint cleanup logging
From: Aleksey Shipilev <shade () openjdk ! org>
Date: 2022-11-23 16:25:27
Message-ID: sj5J9y5n_-e_woJgKinlkSLTsbP9noScavFrLpcVx3U=.84eae836-c4e4-423e-9111-cfd09d085038 () github ! com
[Download RAW message or body]
On Wed, 16 Nov 2022 18:52:50 GMT, Aleksey Shipilev <shade@openjdk.org> wrote:
> When backporting [JDK-8280784](https://bugs.openjdk.org/browse/JDK-8280784), I \
> realized that safepoint cleanup logging added by \
> [JDK-8219436](https://bugs.openjdk.org/browse/JDK-8219436) is misleading. \
> `-Xlog:safepoint` output includes "cleanup" time into "Reaching safepoint", which \
> users can arguably expect to be only the time-to-safepoint. With large reported \
> "Reaching safepoint" time, users would chase a safepoint polling / scheduling \
> problem, rather than suspect the safepoint cleanup costs. I was confused about this \
> myself, and only reading the safepoint timing code made the picture more clear.
> It can be observed if you enable both `-Xlog:safepoint` and `-Xlog:safepoint+stats` \
> and run this on busy machine:
>
> $ build/linux-x86_64-server-fastdebug/images/jdk/bin/java -XX:+SafepointALot \
> -XX:GuaranteedSafepointInterval=1000 [1.647s][info][safepoint,stats] VM Operation \
> [ threads: total initial_running ][ time: sync cleanup vmop \
> total ] page_trap_count [1.647s][info][safepoint,stats] Cleanup \
> [ 15 1 ][ 26541 240544 30371 \
> 297456 ] 0 [1.647s][info][safepoint ] Safepoint "Cleanup", Time \
> since last: 1499772987 ns, Reaching safepoint: 267085 ns, At safepoint: 30371 ns, \
> Total: 297456 ns
>
> Note how "sync" takes only 26541 ns, "cleanup" takes 240544 ns, yet "Reaching \
> safepoint" reports the sums of both.
> To fix this, I propose we report "Cleanup" times separately. This changes the \
> logging statement. If we want to keep the logging statement intact, we can move \
> "cleanup" to "At safepoint" costs, which would be slightly less confusing. This is \
> what we do with `RuntimeService::record_safepoint_end` anyway: management layer \
> clearly separates "sync" and "safepoint" time, and "cleanup" more clearly belongs \
> to "safepoint" time here.
> While we are at it, we can also track and report max cleanup time.
>
> Example output after:
>
>
> [1.099s][info][safepoint,stats] VM Operation [ threads: total \
> initial_running ][ time: sync cleanup vmop total ] \
> page_trap_count [1.099s][info][safepoint,stats] Cleanup [ \
> 15 1 ][ 77391 124662 5420 207473 ] \
> 0 [1.099s][info][safepoint ] Safepoint "Cleanup", Time since last: 1013526788 \
> ns, Reaching safepoint: 77391 ns, Cleanup: 124662 ns, At safepoint: 5420 ns, Total: \
> 207473 ns
>
> Additional testing:
> - [x] Eyeballing the safepoint logs
> - [x] Linux x86_64 fastdebug `tier1`, `tier2`
This pull request has now been integrated.
Changeset: 086763a6
Author: Aleksey Shipilev <shade@openjdk.org>
URL: https://git.openjdk.org/jdk/commit/086763a629fe8b0592f5566c6e346716868d5e63
Stats: 11 lines in 2 files changed: 9 ins; 0 del; 2 mod
8297154: Improve safepoint cleanup logging
Reviewed-by: jsjolen, rehn
-------------
PR: https://git.openjdk.org/jdk/pull/11194
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic