List: linux-raid
Subject: Re: mdadm --stop goes off and never comes back?
From: "Jon Nelson" <jnelson-linux-raid () jamponi ! net>
Date: 2007-12-22 13:01:44
Message-ID: cccedfc60712220501n28cd9a6dx57e437c01dd1c1fb () mail ! gmail ! com
On 12/22/07, Neil Brown <neilb@suse.de> wrote:
> On Wednesday December 19, jnelson-linux-raid@jamponi.net wrote:
> > On 12/19/07, Jon Nelson <jnelson-linux-raid@jamponi.net> wrote:
> > > On 12/19/07, Neil Brown <neilb@suse.de> wrote:
> > > > On Tuesday December 18, jnelson-linux-raid@jamponi.net wrote:
> > > > >
> > > > > I tried to stop the array:
> > > > >
> > > > > mdadm --stop /dev/md2
> > > > >
> > > > > and mdadm never came back. It's off in the kernel somewhere. :-(
>
> Looking at your stack traces, you have the "mdadm -S" holding
> an md lock and trying to get a sysfs lock as part of tearing down the
> array, and 'hald' is trying to read some attribute in
> /sys/block/md....
> and is holding the sysfs lock and trying to get the md lock.
> A classic AB-BA deadlock.
>
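For anyone reading this in the archive, the AB-BA pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the kernel code: `md_lock` and `sysfs_lock` are plain `threading.Lock` objects standing in for the real locks, `simulate`, `mdadm_stop` and `hald_read` are hypothetical names, and a timeout replaces the indefinite hang so the demo terminates. The `consistent_order=True` variant shows the generic lock-ordering remedy, which is not necessarily how mainline sysfs fixed it.

```python
import threading


def simulate(consistent_order, timeout=0.2):
    """Return, per thread, whether it managed to take its second lock."""
    md_lock = threading.Lock()      # stand-in for the md lock
    sysfs_lock = threading.Lock()   # stand-in for the sysfs lock
    barrier = threading.Barrier(2)  # used only to force the unlucky interleaving
    results = {}

    def worker(name, first, second):
        with first:
            if not consistent_order:
                # Wait until both threads hold their first lock.
                barrier.wait()
            # A real deadlock would block here forever; the timeout
            # lets the demo report the failure instead of hanging.
            got = second.acquire(timeout=timeout)
            results[name] = got
            if not consistent_order:
                # Keep holding the first lock until the other attempt is over.
                barrier.wait()
            if got:
                second.release()

    if consistent_order:
        # Both threads take md_lock first, then sysfs_lock.
        plan = [("mdadm_stop", md_lock, sysfs_lock),
                ("hald_read", md_lock, sysfs_lock)]
    else:
        # mdadm: md lock, then sysfs lock. hald: sysfs lock, then md lock.
        plan = [("mdadm_stop", md_lock, sysfs_lock),
                ("hald_read", sysfs_lock, md_lock)]

    threads = [threading.Thread(target=worker, args=p) for p in plan]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("AB-BA order:     ", simulate(consistent_order=False))
    print("consistent order:", simulate(consistent_order=True))
```

With the opposite orders, neither thread can get its second lock while the other still holds it; with a single agreed order, both succeed one after the other.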
> >
> > NOTE: kernel is stock openSUSE 10.3 kernel, x86_64, 2.6.22.13-0.3-default.
> >
>
> It is fixed in mainline with some substantial changes to sysfs.
> I don't imagine they are likely to get back ported to openSUSE, but
> you could try logging a bugzilla if you like.
Nah - I'm eagerly awaiting new kernels anyway as I have some network
cards that work much better (read: they work) with 2.6.24rc3+.
> The 'hald' process is interruptible and killing it would release the
> deadlock.
Cool.
> I suspect you have to be fairly unlucky to lose the race but it is
> obviously quite possible.
Sometimes we are all a little unlucky. In my case it cost me a reboot;
in other cases it might cost nothing at all. Fortunately this was not a
production system with lots of users.
> I don't think there is anything I can do on the md side to avoid the
> bug.
In this situation I don't think that such a change would be warranted anyway.
Thanks again for looking at this. I'm a big believer in the 'canary in
a coal mine' mentality - some problems may be indications of much more
serious issues, but in this case, it would appear that the issue has
already been taken care of. Happy Holidays.
--
Jon