
List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] Fwd: [Debian-ha-maintainers] Bug#598549:
From:       Lars Ellenberg <lars.ellenberg () linbit ! com>
Date:       2010-10-15 17:51:33
Message-ID: 20101015175133.GF32593 () barkeeper1-xen ! linbit

On Fri, Oct 15, 2010 at 01:15:51PM -0400, Michael Smith wrote:
> Lars Ellenberg wrote:
> >     BTW, looking at those ocf-shellfuncs, did anyone notice that
> > 	ocf_take_lock is broken, because it's racy?
> > 	Not sure if/how we should solve that, though.
> 
> ln is atomic so this ought to work:
> 
>      local tmp
>      tmp=$(mktemp "$lockfile.XXXXXX")
>      echo "$$" > "$tmp"
> 
>      while :
>      do
>          if ! ocf_pidfile_status "$lockfile"
>          then
>              ln "$tmp" "$lockfile" 2>/dev/null && break
>          fi
>          ocf_log info  "Sleeping until $lockfile is released..."

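You cannot rely on the lockfile being removed:
the other process may die without cleaning up after itself.

And if you remove it yourself (once you have detected that it is stale),
you open another race window: between your staleness check and your rm,
someone else may already have replaced it with a valid lock...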

>          sleep 0.$rnd
>      done
> 
> > 	possibly this could do it?
> > 	while :; do
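> > 		# first line of the lock file is the PID of the current holder
> > 		pid=$(head -n 1 "$file" 2>/dev/null)
> > 		[ "x$pid" = "x$$" ] && return 0 # won the race
> > 		# no holder recorded, or the recorded holder is dead: claim it
> > 		if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
> > 			echo $$ > "$file"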
> > 		else
> > 			# other still running
> > 			sleep 1
> > 		fi
> > 	done
> 
> Looks like that would work, too.
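
To make the stale-lock problem with the ln variant concrete, here is a
minimal sketch. The helper names try_lock and lock_holder_alive are made
up for this mail and are not part of ocf-shellfuncs; the "dead holder" is
simulated with a background job that has already exited. It only shows
that ln refuses to replace an existing lock file, and that a lock left
behind by a dead process therefore blocks everyone until somebody removes
it. It does not attempt to solve the removal race.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only (not from ocf-shellfuncs).

# Succeeds if the PID on the first line of the lock file is a live process.
lock_holder_alive() {
    holder=$(head -n 1 "$1" 2>/dev/null)
    [ -n "$holder" ] && kill -0 "$holder" 2>/dev/null
}

# One non-blocking attempt to take the lock: write our PID to a temp file
# and hard-link it to the lock name. ln fails if the name already exists.
try_lock() {
    lockfile=$1
    tmp=$(mktemp "$lockfile.XXXXXX") || return 2
    echo "$$" > "$tmp"
    if ln "$tmp" "$lockfile" 2>/dev/null; then
        rc=0
    else
        rc=1
    fi
    rm -f "$tmp"    # the temp name is no longer needed either way
    return $rc
}

demo_dir=$(mktemp -d)
lock="$demo_dir/demo.lock"

first=0;  try_lock "$lock" || first=$?            # lock is free
second=0; try_lock "$lock" || second=$?           # already held (by us)
alive=0;  lock_holder_alive "$lock" || alive=$?   # holder is this shell

# Simulate a holder that died without cleaning up after itself.
true &
dead_pid=$!
wait "$dead_pid"
echo "$dead_pid" > "$lock"

third=0; try_lock "$lock" || third=$?             # ln still refuses
stale=0; lock_holder_alive "$lock" || stale=$?    # but nobody holds it

rm -rf "$demo_dir"
echo "first=$first second=$second alive=$alive third=$third stale=$stale"
```

The third attempt fails although no live process holds the lock, so a
waiter in the ln loop would sleep forever. Any fix has to remove or
overwrite the stale file, and that step is where the new race comes in.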

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
