List: linux-ha-dev
Subject: Re: [Linux-ha-dev] [PATCH] change timeouts, startup behaviour ocf:heartbeat:ManageVE (OpenVZ VE clus
From: Dejan Muhamedagic <dejan () suse ! de>
Date: 2013-04-03 15:52:11
Message-ID: 20130403155210.GA3757 () squib
Hi,

On Thu, Mar 21, 2013 at 02:59:17PM +0000, Tim Small wrote:
> On 13/03/13 16:18, Dejan Muhamedagic wrote:
> > On Tue, Mar 12, 2013 at 12:58:44PM +0000, Tim Small wrote:
> >
> > > The attached patch changes the behaviour of the OpenVZ virtual machine
> > > cluster resource agent, so that:
> > >
> > > 1. The default resource stop timeout is greater than the hardcoded
> > >
> > Just for the record: where is this hardcoded actually? Is it
> > also documented?
> >
>
> Defined here:
>
> http://git.openvz.org/?p=vzctl;a=blob;f=include/env.h#l26
>
> /** Shutdown timeout.
> */
> #define MAX_SHTD_TM 120
>
>
>
> Used by env_stop() here:
>
> http://git.openvz.org/?p=vzctl;a=blob;f=src/lib/env.c#l821
> <http://git.openvz.org/?p=vzctl;a=blob;f=src/lib/env.c;h=2da848d87904d9e572b7da5c0e7dc5d93217ae5b;hb=HEAD#l818>
>
>
>
> for (i = 0; i < MAX_SHTD_TM; i++) {
>     sleep(1);
>     if (!vps_is_run(h, veid)) {
>         ret = 0;
>         goto out;
>     }
> }
>
> kill_vps:
>     logger(0, 0, "Killing container ...");
>
>
>
> Perhaps something based on wall time would be more consistent, and I can
> think of cases where users might want it to be a bit higher, or a bit
> lower, but currently it's just fixed at 120s.
>
>
> I can't find the timeout documented anywhere.

That makes it hard to reference in other software products. But
we can anyway increase the advised timeout in the metadata.

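
For illustration only, a minimal sketch of what a stop wait bounded by
elapsed time rather than by a loop count might look like. This is not
vzctl code: container_is_running() is a hypothetical stand-in for
vps_is_run(), wait_for_stop() and monotonic_now() are made-up names, and
the choice of CLOCK_MONOTONIC (so that clock adjustments do not stretch
or shrink the wait) is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for vzctl's vps_is_run(): the simulated
 * container reports "running" for a fixed number of polls. */
int polls_left = 3;

bool container_is_running(void)
{
    return polls_left-- > 0;
}

/* Current reading of the monotonic clock, in seconds. */
double monotonic_now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Poll until the container has stopped or timeout_s seconds have
 * elapsed. Returns 0 if it stopped in time, -1 if the deadline passed
 * (the point at which the caller would fall back to killing it). */
int wait_for_stop(double timeout_s, long poll_ms)
{
    assert(timeout_s >= 0.0 && poll_ms > 0);

    struct timespec pause = {
        .tv_sec = poll_ms / 1000,
        .tv_nsec = (poll_ms % 1000) * 1000000L,
    };
    double deadline = monotonic_now() + timeout_s;

    while (monotonic_now() < deadline) {
        if (!container_is_running())
            return 0;
        nanosleep(&pause, NULL);
    }
    return -1;
}
```

A caller would use something like wait_for_stop(120.0, 1000) to keep
today's 120s limit, with the limit then being a parameter rather than a
compile-time constant.
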
> > > 2. The start operation now waits for resource startup to complete i.e.
> > > for the VE to "boot up" (so that the cluster manager can detect VEs
> > > which are hanging on startup, and also throttle simultaneous startups,
> > > so as not to overburden the node in question). Since the start
> > > operation now does a lot more, the default start operation timeout has
> > > been increased.
> > >
> > I'm not sure if we can introduce this just like that. It changes
> > significantly the agent's behaviour.
> >
>
> Yes. I think it probably makes the agent's behaviour a bit more correct,
> but that depends what your definition of a VE resource having "started"
> is, I suppose. Currently the agent says that it has started
> as soon as it has begun the boot process, whereas with the proposed
> change, it would mean that it has started when it has booted up (which
> should imply "is operational").
>
> Although my personal reason for the change was so that I had a
> reasonable way to avoid booting tens of VEs on the host machine at the
> same time, I can think of other benefits - such as making other
> resources depend on the fully-booted VE, or detecting the case where a
> faulty VE host node causes the VE to hang during start-up.
>
>
> I suppose other options are:
>
> 1. Make start --wait the default, but make starting without waiting
> selectable using a RA parameter.
>
> 2. Make start without waiting the default, but make --wait selectable
> using a RA parameter.
>
>
> I suppose that the change will break configurations where the
> administrator has hard coded a short timeout, and this change is
> introduced as part of an upgrade, which I suppose is a bad thing...

Yes, it could be so. I think that we should go for option 2.

> > BTW, how does vzctl know when the VE is started?
> >
>
> The vzctl manual page says that 'vzctl start --wait' will "attempt to
> wait till the default runlevel is reached" within the container.

OK. Though that may mean different things depending on which
init system is running.

> > If the description above matches
> > the code modifications, then there should be three instead of
> > one patch.
> >
>
> Fair enough - I was being lazy! :-)

Cheers,
Dejan
>
> Tim.
>
> --
> South East Open Source Solutions Limited
> Registered in England and Wales with company number 06134732.
> Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
> VAT number: 900 6633 53 http://seoss.co.uk/ +44-(0)1273-808309
>
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/