List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] lxc RA merged
From:       Darren Thompson <darrent () akurit ! com ! au>
Date:       2011-06-07 0:48:26
Message-ID: 1307407690.20891.4.camel () localhost

Florian/Team

Please find attached an updated version of the 'lxc resource agent'.

As Florian pointed out, I had not properly initialised/set the new
"use_screen" parameter.

The attached file includes that correction.
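
For reference, the correction is the usual default-assignment idiom. The following is only a minimal stand-alone sketch of that idiom, not an excerpt from the agent: the helper name resolve_use_screen is hypothetical, and nothing from ocf-shellfuncs is sourced.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only (not part of the attached agent).
# ": ${VAR=default}" assigns the default only when VAR is unset; the ":"
# builtin then throws away the expanded value instead of executing it.
resolve_use_screen() {
	# Subshell, so the caller's environment is left untouched.
	(
		OCF_RESKEY_use_screen_default="false"
		: ${OCF_RESKEY_use_screen=${OCF_RESKEY_use_screen_default}}
		echo "${OCF_RESKEY_use_screen}"
	)
}

unset OCF_RESKEY_use_screen
resolve_use_screen		# prints: false

OCF_RESKEY_use_screen="true"
resolve_use_screen		# prints: true
```

Note that "=" only applies the default when the variable is unset; a parameter explicitly set to the empty string stays empty (":=" would cover that case as well).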

PS (OK, I'm a complete newbie and I know this should be obvious, but) I
had attempted to update the original on GitHub (by forking) but am now
unable to edit it to add this missing attribute. How do I re-edit my
change to include a second change?

Darren

On Mon, 2011-06-06 at 12:36 +0930, Darren Thompson wrote:

> Florian
> 
> I have done some "live fire" testing of the updated lxc resource.
> 
> I noted that "screens" has been deprecated in favour of running the
> lxc as a daemon, with output going to a new log file.
> 
> Unfortunately, when I used it in this configuration I could not connect to
> the running container, and all the log output shows is "
> more processes left in this runlevel for 5 minutesstty: standard
> input: Inappropriate ioctl for device
> Master Resource Control: previous runlevel: N, switching to runlevel:
> 3
> tcgetattr: Inappropriate ioctl for device
> Master Resource Control: runlevel 3 has been reached
> stty: standard input: Inappropriate ioctl for device".
> 
> The cluster shows the container as running, but I cannot ping the IP
> address that the container should be using so cannot confirm that it
> is running correctly.
> 
> I suspect that the container is having trouble running as there is not
> a "root console" device when run as a daemon.
> 
> Without the "root console" available "via screens" it's very
> difficult to diagnose the issue with the container and be certain as to
> what is causing the problem.
> 
> I may create a modified version with the "screens" re-added as an
> "option", as that is my personal preference and will also help
> diagnose the error I'm currently getting with the lxc resource.
> 
> I'll send out the updated version as an attachment to this list (I
> still have no idea how to create patches/submissions on GitHub).
> 
> I'm also now getting errors on the original links to your fork on
> GitHub. I'm assuming it's because the driver has now been pulled into
> the core (or something), making your fork redundant.
> 
> Darren
> 
> 
> On Mon, 2011-05-30 at 15:45 +0200, Florian Haas wrote: 
> 
> > Hello,
> > 
> > after much useful testing from Christoph Mitasch and a number of
> > necessary changes highlighted by ocf-tester, I've now merged and pushed
> > the lxc resource agent that was originally contributed by Darren Thompson.
> > 
> > The resource agent is here:
> > 
> > https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/lxc
> > 
> > Its commit history up to this point can be reviewed here:
> > 
> > https://github.com/ClusterLabs/resource-agents/commits/master/heartbeat/lxc
> > 
> > Hope this is useful.
> > 
> > Cheers,
> > Florian
> > 
> > 
> > _______________________________________________________
> > Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > Home Page: http://linux-ha.org/


["lxc" (application/x-shellscript)]

#!/bin/bash
# Should now conform to guidelines: http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html
#
#	LXC (Linux Containers) OCF RA. 
#	Used to cluster enable the start, stop and monitoring of a LXC container.
#
# Copyright (c) 2011 AkurIT.com.au, Darren Thompson
#                    All Rights Reserved.
#
# Without limiting the rights of the original copyright holders
# This resource is licensed under GPL version 2
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.

# OCF instance parameters
#       OCF_RESKEY_container
#       OCF_RESKEY_config
#       OCF_RESKEY_log
#       OCF_RESKEY_use_screen

# Initialization:
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

# Defaults
OCF_RESKEY_log_default="${HA_RSCTMP}/${OCF_RESOURCE_INSTANCE}.log"
OCF_RESKEY_use_screen_default="false"

: ${OCF_RESKEY_log=${OCF_RESKEY_log_default}}
: ${OCF_RESKEY_use_screen=${OCF_RESKEY_use_screen_default}}


# Set default TRANS_RES_STATE (temporary file to "flag" if resource was started but not stopped)
TRANS_RES_STATE="${HA_RSCTMP}/${OCF_RESOURCE_INSTANCE}.state"

meta_data() {
cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="lxc" version="0.1">
<version>0.1</version>
<longdesc lang="en">Allows LXC containers to be managed by the cluster.
If the container is running "init" it will also perform an orderly shutdown.
It is 'assumed' that the 'init' system will do an orderly shutdown if presented with a 'kill -PWR' signal.
On a 'sysvinit' this would require the container to have an inittab file containing "p0::powerfail:/sbin/init 0"
I have absolutely no idea how this is done with 'upstart' or 'systemd', YMMV if your container is using one of them.</longdesc>
<shortdesc lang="en">Manages LXC containers</shortdesc>

<parameters>
<parameter name="container" required="1" unique="1">
<longdesc lang="en">The unique name for this 'Container Instance' e.g. 'test1'.</longdesc>
<shortdesc lang="en">Container Name</shortdesc>
<content type="string" default=""/>
</parameter>

<parameter name="config" required="1" unique="0">
<longdesc lang="en">Absolute path to the file holding the specific configuration for this container e.g. '/etc/lxc/test1/config'.</longdesc>
<shortdesc lang="en">The LXC config file.</shortdesc>
<content type="string" default=""/>
</parameter>

<parameter name="log" required="0" unique="0">
<longdesc lang="en">Absolute path to the container log file</longdesc>
<shortdesc lang="en">Container log file</shortdesc>
<content type="string" default="${OCF_RESKEY_log_default}"/>
</parameter>

<parameter name="use_screen" required="0" unique="0">
<longdesc lang="en">Provides the option of capturing the 'root console' from the container and showing it on a separate screen.
To see the screen output run 'screen -r {container name}'
The default value is set to 'false', change to 'true' to activate this option</longdesc>
<shortdesc lang="en">Use 'screen' for container 'root console' output</shortdesc>
<content type="boolean" default="false"/>
</parameter>


</parameters>

<actions>
<action name="start"        timeout="10" />
<action name="stop"         timeout="30" />
<action name="monitor"      timeout="20" interval="60" depth="0"/>
<action name="validate-all" timeout="20" />
<action name="meta-data"    timeout="5" />
</actions>
</resource-agent>
END
}


LXC_usage() {
	cat <<END
	usage: $0 {start|stop|monitor|validate-all|meta-data}

	Expects to have a fully populated OCF RA-compliant environment set.
END
}

cgroup_mounted() {
# test cgroup_mounted, mount if required
	# Various possible overrides to cgroup mount point.
	# If kernel supplies cgroup mount point, prefer it.
	CGROUP_MOUNT_POINT=/var/run/lxc/cgroup
	CGROUP_MOUNT_NAME=lxc
	CGROUP_MOUNTED=false
	[[ -d /sys/fs/cgroup ]] && CGROUP_MOUNT_POINT=/sys/fs/cgroup CGROUP_MOUNT_NAME=cgroup
	# If cgroup already mounted, use it no matter where it is.
	# If multiple cgroup mounts, prefer the one named lxc if any.
	# (The awk test must be a comparison, "==": a plain "=" assigns, is always
	# true, and would stop at the first cgroup mount regardless of its name.)
	eval `awk 'BEGIN{P="";N=""}END{print("cgmp="P" cgmn="N)}($3=="cgroup"){N=$1;P=$2;if($1=="lxc")exit}' /proc/mounts`
	[[ "$cgmn" && "$cgmp" && -d "$cgmp" ]] && CGROUP_MOUNT_POINT=$cgmp CGROUP_MOUNT_NAME=$cgmn CGROUP_MOUNTED=true
	$CGROUP_MOUNTED || {
		[[ -d $CGROUP_MOUNT_POINT ]] || ocf_run mkdir -p $CGROUP_MOUNT_POINT
		ocf_run mount -t cgroup $CGROUP_MOUNT_NAME $CGROUP_MOUNT_POINT
	}
	echo 1 >${CGROUP_MOUNT_POINT}/notify_on_release
	return 0
}

LXC_start() {
	# put this here as it's so long it gets messy later!!!
	if ocf_is_true $OCF_RESKEY_use_screen; then
		STARTCMD="screen -dmS ${OCF_RESKEY_container} lxc-start -f ${OCF_RESKEY_config} -n ${OCF_RESKEY_container} -o ${OCF_RESKEY_log}"
	else
		STARTCMD="lxc-start -f ${OCF_RESKEY_config} -n ${OCF_RESKEY_container} -o ${OCF_RESKEY_log} -d"
	fi

	LXC_status
	if [ $? -eq $OCF_SUCCESS ]; then
		ocf_log debug "Resource $OCF_RESOURCE_INSTANCE is already running"
		ocf_run touch "${TRANS_RES_STATE}" || exit $OCF_ERR_GENERIC
		return $OCF_SUCCESS
	fi

	cgroup_mounted
	if [ $? -ne 0 ]; then
		ocf_log err "Unable to find cgroup mount"
		exit $OCF_ERR_GENERIC
	fi

	ocf_log info "Starting" ${OCF_RESKEY_container}
	ocf_run ${STARTCMD} || exit $OCF_ERR_GENERIC

	# Spin on status, wait for the cluster manager to time us out if
	# we fail
	while ! LXC_status; do
		ocf_log info "Container ${OCF_RESKEY_container} has not started, waiting"
		sleep 1
	done

	ocf_run touch "${TRANS_RES_STATE}" || exit $OCF_ERR_GENERIC
	return $OCF_SUCCESS
}



LXC_stop() {
	local shutdown_timeout
	local now
	LXC_status
	if [ $? -eq $OCF_NOT_RUNNING ]; then
		ocf_log debug "Resource $OCF_RESOURCE_INSTANCE is already stopped"
		ocf_run rm -f $TRANS_RES_STATE
		return $OCF_SUCCESS
	fi

	cgroup_mounted
	if [ $? -ne 0 ]; then
		ocf_log err "Unable to find cgroup mount"
		exit $OCF_ERR_GENERIC
	fi

	# If the container is running "init" and is able to perform an orderly shutdown, then it should be done.
	# It is 'assumed' that the 'init' system will do an orderly shutdown if presented with a 'kill -PWR' signal.
	# On a 'sysvinit' this would require the container to have an inittab file containing "p0::powerfail:/sbin/init 0"
	typeset -i PID=0
	# This should work for traditional 'sysvinit' and 'upstart'
	lxc-ps -C init -opid |while read CN PID ;do
		[ $PID -gt 1 ] || continue
		[ "$CN" = "${OCF_RESKEY_container}" ] || continue
		ocf_log info "Sending \"OS shut down\" instruction to" ${OCF_RESKEY_container} "as it was found to be using \"sysV init\" or \"upstart\""
		kill -PWR $PID
	done
	# This should work for containers using 'systemd' instead of 'init'
	lxc-ps -C systemd -opid |while read CN PID ;do
		[[ $PID -gt 1 ]] || continue
		[[ "$CN" = "${OCF_RESKEY_container}" ]] || continue
		ocf_log info "Sending \"OS shut down\" instruction to" ${OCF_RESKEY_container} "as it was found to be using \"systemd\""
		kill -PWR $PID
	done
	# The "shutdown_timeout" we use here is the operation
	# timeout specified in the CIB, minus 5 seconds
	now=$(date +%s)
	shutdown_timeout=$(( $now + ($OCF_RESKEY_CRM_meta_timeout/1000) -5 ))
	# Loop on status until we reach $shutdown_timeout
	while [ $now -lt $shutdown_timeout ]; do
	    LXC_status
	    status=$?
	    case $status in
		"$OCF_NOT_RUNNING")
		    ocf_run rm -f $TRANS_RES_STATE
		    return $OCF_SUCCESS
		    ;;
		"$OCF_SUCCESS")
		    # Container is still running, keep waiting (until
		    # shutdown_timeout expires)
		    sleep 1
		    ;;
		*)
		    # Something went wrong. Bail out and
		    # resort to forced stop (destroy).
		    break;
	    esac
	    now=$(date +%s)
	done

	# If the container is still running, it will be stopped now, regardless of state!
	ocf_run lxc-stop -n ${OCF_RESKEY_container} || exit $OCF_ERR_GENERIC
	ocf_log info "Container" ${OCF_RESKEY_container} "stopped"
	ocf_run rm -f $TRANS_RES_STATE

	return $OCF_SUCCESS
}

LXC_status() {
	S=`lxc-info -n ${OCF_RESKEY_container}`
	ocf_log debug "State of ${OCF_RESKEY_container}: $S"
	if [[ "${S##* }" = "RUNNING" ]] ; then 
		return $OCF_SUCCESS
	fi
	return $OCF_NOT_RUNNING
}

LXC_monitor() {
	LXC_status && return $OCF_SUCCESS
	if [ -f $TRANS_RES_STATE ]; then
		ocf_log err "${OCF_RESKEY_container} is not running, but state file ${TRANS_RES_STATE} exists."
		exit $OCF_ERR_GENERIC
	fi
	return $OCF_NOT_RUNNING
}


LXC_validate() {
	# Quick check that all required attributes are set
	if [ -z "${OCF_RESKEY_container}" ]; then
		ocf_log err "LXC container name not set!"
		exit $OCF_ERR_CONFIGURED
	fi
	if [ -z "${OCF_RESKEY_config}" ]; then
		ocf_log err "LXC configuration file name not set!"
		exit $OCF_ERR_CONFIGURED
	fi

	# Tests that apply only to non-probes
	if ! ocf_is_probe; then
		if ! [ -f "${OCF_RESKEY_config}" ]; then
			ocf_log err "LXC configuration file \"${OCF_RESKEY_config}\" missing or not found!"
			exit $OCF_ERR_INSTALLED
		fi

		if ocf_is_true $OCF_RESKEY_use_screen; then
			check_binary screen
		fi

	    check_binary lxc-start
	    check_binary lxc-stop
	    check_binary lxc-ps
	    check_binary lxc-info
	fi
	return $OCF_SUCCESS
}

if [ $# -ne 1 ]; then
  LXC_usage
  exit $OCF_ERR_ARGS
fi

case $__OCF_ACTION in
    meta-data)	meta_data
	exit $OCF_SUCCESS
	;;
    usage|help)	LXC_usage
	exit $OCF_SUCCESS
	;;
esac

# Everything except usage and meta-data must pass the validate test
LXC_validate

case $__OCF_ACTION in
start)		LXC_start;;
stop)		LXC_stop;;
status)	LXC_status;;
monitor)	LXC_monitor;;
validate-all)	;;
*)		LXC_usage
		ocf_log err "$0 was called with unsupported arguments: $*"
		exit $OCF_ERR_UNIMPLEMENTED
		;;
esac
rc=$?
ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
exit $rc




