
List:       linux-btrfs
Subject:    Re: [UNRESOLVED] Re: errors found in extent allocation tree or chunk allocation after power failure
From:       "Pallissard, Matthew" <matt () pallissard ! net>
Date:       2019-09-28 0:03:19
Message-ID: 20190928000319.3vdv2kbxpcs5kdyj () matt-laptop-p01


On 2019-09-27T17:01:27, Pallissard, Matthew wrote:
> 
> On 2019-09-25T14:32:31, Pallissard, Matthew wrote:
> > On 2019-09-25T15:05:44, Chris Murphy wrote:
> > > On Wed, Sep 25, 2019 at 1:34 PM Pallissard, Matthew <matt@pallissard.net> wrote:
> > > > On 2019-09-25T13:08:34, Chris Murphy wrote:
> > > > > On Wed, Sep 25, 2019 at 8:50 AM Pallissard, Matthew <matt@pallissard.net> wrote:
> > > > > > 
> > > > > > Version:
> > > > > > Kernel: 5.2.2-arch1-1-ARCH #1 SMP PREEMPT Sun Jul 21 19:18:34 UTC 2019 x86_64 GNU/Linux
> > > > > 
> > > > > You need to upgrade to the Arch kernel 5.2.14 or newer (they backported the
> > > > > fix first appearing in stable 5.2.15), or you need to downgrade to the 5.1
> > > > > series.
> > > > > https://lore.kernel.org/linux-btrfs/20190911145542.1125-1-fdmanana@kernel.org/T/#u
> > > > > 
> > > > > That's a nasty bug. I don't offhand see evidence that you've hit it, but I'm
> > > > > not certain, so the first thing to do is use a different kernel.
> > > > 
> > > > Interesting, I'll go ahead with a kernel upgrade, as that's easy enough.
> > > > However, that looks like it's related to a stack trace for a hung process,
> > > > which is not the original problem I had. Based on the output in my previous
> > > > email, I've been working under the assumption that there is a problem
> > > > on-disk.  Is that not correct?
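> > > > 
> > > > For anyone reading this in the archives, that upgrade on Arch is roughly just
> > > > the usual package update followed by a reboot into the new kernel (assuming
> > > > the stock packaged kernel; double-check the repo version first):
> > > > 
> > > >     uname -r          # kernel currently running
> > > >     pacman -Si linux  # version the repo would install
> > > >     pacman -Syu       # full system upgrade, then reboot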
> > > 
> > > That bug does cause filesystem corruption that is not repairable.
> > > Whether you have that problem or a different problem, I'm not sure.
> > > But it's best to avoid combining problems.
> > > 
> > > The file system mounts rw now? Or still only mounts ro?
> > 
> > It mounts RW, but I have yet to attempt an actual write.
> > 
> > 
> > > I think most of the errors reported by btrfs check, if they still exist after
> > > doing a scrub, should be repaired by 'btrfs check --repair', but I don't advise
> > > that until later. I'm not a developer; maybe Qu can offer some advice on those
> > > errors.
> > 
> > 
> > > > > Next, anytime there is a crash or power failure with Btrfs raid56, you need
> > > > > to do a complete scrub of the volume. Obviously it will take time, but
> > > > > that's what needs to be done first.
> > > > 
> > > > I'm using raid 10, not 5 or 6.
> > > 
> > > Same advice, but it's not as important to raid10 because it doesn't have the
> > > write hole problem.
> > 
> > 
> > > > > OK actually, before the scrub you need to confirm that each drive's SCT ERC
> > > > > time is *less* than the kernel's SCSI command timer. e.g.
> > > > 
> > > > I gather that I should probably do this before any scrub, be it raid 5, 6, or
> > > > 10.  But is a scrub the operation I should attempt on this raid 10 array to
> > > > repair the specific errors mentioned in my previous email?
> > > 
> > > Definitely deal with the timing issue first. If by chance there are bad sectors
> > > on any of the drives, they must be properly reported by the drive with a
> > > discrete read error in order for Btrfs to do a proper fixup. If the times are
> > > mismatched, then Linux can get tired of waiting and do a link reset on the
> > > drive before the read error happens. Then the whole command queue is lost and
> > > the problem isn't fixed.
> > 
> > Good to know; that seems like a critical piece of information.  A few searches
> > turned up this page: https://wiki.debian.org/Btrfs#FAQ.
> > 
> > Should this be noted on the 'gotchas' or 'getting started' pages as well?  I'd be
> > happy to make edits should the powers that be allow it.
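> > 
> > For reference, the check ends up being something along these lines, with sdX
> > standing in for each member drive (untested as written here, so verify before
> > trusting it):
> > 
> >     smartctl -l scterc /dev/sdX        # drive's SCT ERC read/write timeouts
> >     cat /sys/block/sdX/device/timeout  # kernel SCSI command timer, in seconds
> > 
> > If the drive supports it, 'smartctl -l scterc,70,70 /dev/sdX' sets ERC to 7
> > seconds, comfortably under the kernel's 30-second default; otherwise the usual
> > workaround is to raise the kernel timer instead, e.g.
> > 'echo 180 > /sys/block/sdX/device/timeout'.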
> > 
> > > There are myriad errors, and the advice I'm giving to scrub is a safe first
> > > step to make sure the storage stack is sane - or at least we know where the
> > > simpler problems are. And then move to the less simple ones that have higher
> > > risk.  It also changes the volume the least. Everything else, like balance and
> > > chunk recover and btrfs check --repair, makes substantial changes to the file
> > > system and has a higher risk of making things worse.
> > 
> > This sounds sensible.
> > 
> > 
> > > In theory, if the storage stack does exactly what Btrfs says, then at worst you
> > > should lose some data, but the file system itself should be consistent. And
> > > that includes power failures. The fact that problems are reported suggests a
> > > bug somewhere - it could be Btrfs, it could be device mapper, it could be
> > > controller or drive firmware.
> > 
> > I'll go ahead with the kernel upgrade and make sure the timing issues are
> > squared away, then kick off a scrub.
> > 
> > I'll report back when the scrub is complete or something interesting happens,
> > whichever comes first.
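> > 
> > (For completeness, the scrub itself is just something along the lines of
> > 
> >     btrfs scrub start /path/to/array   # kicks the scrub off in the background
> >     btrfs scrub status /path/to/array  # progress and error counts
> > 
> > where /path/to/array is a stand-in for wherever this array is mounted.)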
> 
> As a followup:
> 1. I took care of the timing issues.
> 2. Ran a scrub.
> 3. Ran a balance; it kept failing with about 20% left.
>    - stack traces in dmesg showed spinlock stuff
> 
> 4. Got I/O errors on one file during my final backup.
>    - post-backup hash sums of everything else checked out
>    - the errors during the copy were csum mismatches, should anyone care
> 
> 5. Ran a bunch of potentially disruptive btrfs check commands, in alphabetical
>    order, because "why not at this point?" (a rough sketch of those steps is
>    below).
>    - they had zero effect as far as I can tell; all the same files were readable
>      and the btrfs check errors looked identical (admittedly I didn't put them
>      side by side)
> 6. Re-provisioned the array and restored from backups.
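> 
> For anyone curious, the riskier steps above were roughly of this shape; the mount
> point and device are placeholders and I'm not reconstructing the exact flags I
> used:
> 
>     btrfs balance start /path/to/array  # the balance that kept dying at ~20%
>     btrfs check --repair /dev/sdX       # one of the destructive check variants, run unmounted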
> 
> As I thought about it, it may not have been an issue with the original power
> outage; I only ran a check after the power outage. My array could have had an
> issue due to a previous bug, since I was on a 5.2.x kernel for several weeks under
> high load. Anyway, there are enough unknowns to make a root cause analysis not
> worth my time.
> 
> Marking this as unresolved for folks in the future who may be looking for answers.
> 

Man, I should have read that over one more time for typos. Oh well.

Matt Pallissard


["signature.asc" (application/pgp-signature)]
