List: openbsd-bugs
Subject: Re: Kernel panic upon resume of Linux/KVM VM (OpenBSD 6.6)
From: Mike Larkin <mlarkin () nested ! page>
Date: 2019-10-22 23:52:08
Message-ID: 20191022235208.GC9217 () azathoth ! net
On Tue, Oct 22, 2019 at 04:25:19PM -0700, guenther@openbsd.org wrote:
> On Tue, 22 Oct 2019, Andreas Rottmann wrote:
> > >Synopsis: panic: pvclock0: unstable result on stable clock
> > >Category: virtualization
> > >Environment:
> > System : OpenBSD 6.6
> > Details : OpenBSD 6.6 (GENERIC.MP) #372: Sat Oct 12 10:56:27 MDT 2019
> > deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> >
> > Architecture: OpenBSD.amd64
> > Machine : amd64
> > >Description:
> >
> > I've just experienced a kernel panic when resuming my laptop from
> > suspend-to-RAM while my OpenBSD 6.6 VM was running; the first few lines
> > of the crash read like this:
> >
> > panic: pvclock0: unstable result on stable clock
> > Stopped at db_enter+0x10: popq %rbp
> > TID PID UID PRFLAGS PFLAGS CPU COMMAND
> > db_enter() at db_enter+0x10
> > panic() at panic+0x128
> > pvclock_get_timecount(ffffffff81f14360) at pvclock_get_timecount+0xc2
> >
> > The full ddb session, including backtraces for both cores, and the `ps`
> > output is attached as `ddb.txt`.
>
> So the immediate code of the panic is this:
>     /* This bit must be set as we attached based on the stable flag */
>     if ((flags & PVCLOCK_FLAG_TSC_STABLE) == 0)
>         panic("%s: unstable result on stable clock", DEVNAME(sc));
>
> That is, the pvclock driver currently assumes that if the host advertises
> a stable clock when the OpenBSD guest is booted, then it'll remain stable
> forever. That apparently is not a safe assumption across a suspend/resume
> cycle of the Linux/KVM host.
>
If you're correct above, it probably isn't a safe assumption in a live
migration scenario either.
-ml
> To fix this, the driver would have to get the system to stop using it as
> the active timecounter whenever it's marked unstable. Perhaps it could
> just adjust its quality (sc->sc_tc->tc_quality) downward while that's the
> case? I'm not sure if that would be enough, but you could try
> implementing that.
>
> Lacking that, I guess you'll want to have KVM stop the guest before you
> suspend the host, and then on resume wait a bit until the clock
> settles--not sure how long that takes or how you would know--before
> restarting the guest.
>
>
> Philip Guenther
>
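
For illustration, the suggestion quoted above (lower the timecounter's
quality while the stable flag is clear, instead of panicking) could look
roughly like the userspace model below. This is only a sketch, not driver
code: the model_* names, the struct layouts and the demoted quality of -1
are all made up for the example, and the 0x01 value for
PVCLOCK_FLAG_TSC_STABLE is an assumption. As noted in the quoted text, it
is not clear that changing the quality alone would make the kernel switch
to another timecounter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag name from the quoted driver code; the value 0x01 is an assumption. */
#define PVCLOCK_FLAG_TSC_STABLE	0x01

/* Hypothetical stand-in for the kernel's struct timecounter. */
struct model_timecounter {
	const char *name;
	int quality;			/* models tc_quality */
};

/* Hypothetical stand-in for the pvclock softc plus the host's shared data. */
struct model_pvclock_softc {
	struct model_timecounter tc;
	int stable_quality;		/* quality used while the host says "stable" */
	uint8_t host_flags;		/* models the flags the host publishes */
	uint64_t host_time;		/* models the time value the host publishes */
	unsigned long unstable_reads;	/* reads seen with the flag clear */
};

/*
 * Models the read path: where the real driver panics on a cleared stable
 * flag, this version demotes the timecounter and restores it once the
 * host reports a stable clock again.
 */
static uint64_t
model_pvclock_get_timecount(struct model_pvclock_softc *sc)
{
	assert(sc != NULL);

	if ((sc->host_flags & PVCLOCK_FLAG_TSC_STABLE) == 0) {
		sc->unstable_reads++;
		sc->tc.quality = -1;	/* assumed "do not pick automatically" */
	} else {
		sc->tc.quality = sc->stable_quality;
	}

	return sc->host_time;
}
```

To exercise it, clear host_flags in a model_pvclock_softc, call
model_pvclock_get_timecount() and check that tc.quality went negative;
setting the flag again and reading once more restores stable_quality.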