'Re: [Linux-ha-dev] [PATCH] add fsck execution mode parameter for'


List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] [PATCH] add fsck execution mode parameter for
From:       Dejan Muhamedagic <dejanmm () fastmail ! fm>
Date:       2010-06-09 12:28:43
Message-ID: 20100609122842.GB7609 () rondo ! homenet

On Mon, Jun 07, 2010 at 08:42:35PM +0200, Bernd Schubert wrote:
> On Monday 07 June 2010, Dejan Muhamedagic wrote:
> > Hi,
> > 
> > On Mon, Jun 07, 2010 at 11:52:49AM +0900, Takatoshi MATSUO wrote:
> > > Hi
> > >
> > > (2010/06/04 23:18), Dejan Muhamedagic wrote:
> > > > Hi,
> > > >
> > > > On Fri, Jun 04, 2010 at 02:16:42PM +0200, Bernd Schubert wrote:
> > > >> On Friday 04 June 2010, Dejan Muhamedagic wrote:
> > > >>> Hi Takatoshi-san,
> > > >>>
> > > >>> On Fri, Jun 04, 2010 at 02:19:42PM +0900, Takatoshi MATSUO wrote:
> > > >>>> Hello
> > > >>>>
> > > > >>>> I suggest adding a parameter to the Filesystem RA that lets the user
> > > > >>>> decide whether fsck is executed.
> > > > >>>>
> > > > >>>> The current RA does not check ext3, because whether fsck is executed
> > > > >>>> depends on the filesystem type.
> > > > >>>> But ext3 is sometimes broken and remounted read-only even though it
> > > > >>>> has a journal, so
> > > >>>
> > > >>> Under which circumstances does this happen?
> > >
> > > It happens during tests such as pulling the disk cable,
> > > crashing the OS, and so on.
> > > But I don't know the detailed circumstances because it's very, very rare.
> > >
> > > As Bernd says, no filesystem is perfect,
> > > and no hardware or driver is perfect either.
> > >
> > > >> No filesystem is perfect ;) And any kind of hardware issue can cause
> > > >> filesystem and data corruption.
> > > >
> > > > Filesystem corruption? That's not exactly what I had in mind :)
> > > > I'm not sure if fsck would help in that case anyway. Not saying
> > > > that that never happens (hw or bugs), but what I meant is
> > > > "normal" (say, on stonith) failovers where no fs corruption occurs.
> > > >
> > > > >> Takatoshi-san, you should note, however, that e2fsck, for example,
> > > > >> will start to run in non-auto mode even if only a journal recovery is
> > > > >> required. With default extX parameters, it might then easily perform a
> > > > >> complete filesystem check, which can take hours. Not only might you
> > > > >> get unexpectedly long downtime, you also need to be aware that
> > > > >> fsck time is often MUCH longer than the resource start timeout. If
> > > > >> that happens, pacemaker will kill fsck in the middle of a run, which
> > > > >> might damage your filesystem even more.
> > >
> > > I was aware of the complete filesystem check,
> > > so if I use this parameter with force, I'll disable "max-mount-counts"
> > > and "time-last-checked" using tune2fs commands.
> > >
> > > But I was not aware that pacemaker will kill fsck.
> 
> That depends on the size of your filesystem, of course. The larger the 
> filesystem and the more inodes it has, the more time fsck will require. With 
> ext4 and the uninit blockgroups feature, fsck time will be much smaller than
> with ext3, but only if you don't have too many inodes/files.
> On Lustre OSTs fsck time is now usually between 10 minutes and 4 hours, on the 
> MDT with a lfsck database run, it can be more than 24 hours.
> 
> Anyway, all of that is too large for the RA.
> 
> > >
> > > >> That is all fine if you know about possible consequences, but I really
> > > >> doubt that most admins are aware of that.
> > > >
> > > > Most admins are not aware of most things ;-)
> > > >
> > > >>>> I want to decide myself executing fsck before mount to operate more
> > > >>>> safely.
> > > >>>>
> > > > >>>> This new parameter has three modes: "auto", "force", and "no".
> > > > >>>> The default is "auto", which does the same thing as before.
> > > >>>> "force" and "no" mean what they say.
> > > >>>
> > > >>> Patch applied. Many thanks!
> > >
> > > Thanks.
> > >
> > > > >> That brings up an idea here: with extX, we could easily use
> > > >>
> > > >> dumpe2fs -h | grep "Filesystem state:"
> > > >>
> > > >> to check if fsck needs to be run. So the agent could refuse to mount
> > > > >> the device and make you run it manually in the foreground without any
> > > >> timeouts... I will implement that for our lustre_server agent (a
> > > >> heavily modified Filesystem agent) and then possibly back-port the
> > > >> patch.
> > > >
> > > > That may be a good idea, provided that one can tell how long
> > > > e2fsck would take.
> > >
> > > It sounds good, but "dumpe2fs" is specific to ext2 and ext3.
> > > If killing fsck damages the filesystem, can I rewrite this patch
> > > based on this idea?
> > 
> > It's probably a good idea not to interrupt fsck. I don't know if
> > there is a way to figure out if e2fsck is going to do the journal
> > recovery or a full filesystem check.
> 
> You don't need e2fsck for a journal recovery - the kernel can do that. 

Right.

> However, running e2fsck on a device that was not properly unmounted will make
> e2fsck run the journal recovery on its own.

It seems that in most configurations we just can't run fsck.
If it makes no difference when only the journal needs to be
replayed, then the only safe setting for the new parameter
would be "no".
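
For reference, a minimal sketch of the dumpe2fs check Bernd proposed
above, so that an agent could refuse to mount instead of starting fsck
under a timeout. The function names fs_state and state_is_clean are
hypothetical and not part of the Filesystem RA; the sketch assumes that
"dumpe2fs -h" prints a "Filesystem state:" line and that only the exact
value "clean" is treated as safe to mount.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the Filesystem RA.

# Reads `dumpe2fs -h` output on stdin and prints the value of the
# "Filesystem state:" line (for example "clean" or "not clean").
# Prints nothing if no such line is present.
fs_state() {
    sed -n 's/^Filesystem state:[[:space:]]*//p'
}

# Reads `dumpe2fs -h` output on stdin.
# Returns 0 only if the state is exactly "clean"; returns 1 otherwise,
# including when no state line was found (assumption: be conservative).
state_is_clean() {
    [ "$(fs_state)" = "clean" ]
}

# Usage sketch (DEVICE and MOUNTPOINT are placeholders):
#   if dumpe2fs -h "$DEVICE" 2>/dev/null | state_is_clean; then
#       mount "$DEVICE" "$MOUNTPOINT"
#   else
#       echo "refusing to mount $DEVICE: run e2fsck manually" >&2
#   fi
```

This only applies to ext2/ext3/ext4, as noted earlier in the thread, and
it does not tell you how long a manual e2fsck would take.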

Cheers,

Dejan

> 
> Cheers,
> Bernd
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/