List: linux-btrfs
Subject: Re: What to do with damaged root fllesystem (opensuse leap 42.2)
From: Duncan <1i5t5.duncan () cox ! net>
Date: 2018-10-05 9:17:32
Message-ID: pan$7d1b$960e48ab$8c11d075$52585620 () cox ! net
Beat Meier posted on Wed, 03 Oct 2018 16:20:14 -0300 as excerpted:
> Hello
>
> I'm using btrfs on opensuse leap 42.2.
>
> A few days ago I had a power loss and the system no longer mounts the
> root filesystem with its subvolumes.
>
> My original problem in dmesg was skinny extents and space cache
> generation (...) does not match inode (...) errors.
Those are not a big deal and should be handled automatically, at least
on a reasonably current kernel, so either there were other problems or
you were using an old kernel. (Not being on openSUSE, I can't tell from
the Leap number what the kernel is, but the 4.12 kernel below is both a
bit old and not a mainstream LTS kernel; those were 4.14 and 4.9.)
> After investigating a little bit I did the following commands, which
> already told me there was an error...
>
> btrfsck /dev/sdc18
>
> several times
OK, plain btrfsck (aka btrfs check) is normally read-only: it reports
problems but does not attempt to fix them, unless --repair or one of the
other write-mode options (--init-csum-tree, etc.) is used. Repair mode
is not recommended until after checking with the list, as it only knows
how to fix some problems and can cause further damage with others.
Assuming you didn't try repair mode at this point, why would you run it
several times, as that does nothing but report the same problems several
times? And if you did try repair mode, who told you to do so and why?
> After that
>
> btrfs rescue zero-log
Again, that's a fix for specific problems and should only be run after
checking with the list.
> And at least
>
> btrfs check --repair
As above, this should only be run after checking with the list, and with
the knowledge that if it doesn't fix the problem, it might actually make
it worse, so best to try to scrap what you can off the filesystem using a
read-only mount if possible, or btrfs restore, /before/ trying it.
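To spell out that ordering, a sketch of the recovery sequence (mount
point and destination path are assumptions for illustration):

```shell
# 1. Try a read-only mount first; usebackuproot asks the kernel to
#    fall back to older tree roots if the current one is damaged.
mount -o ro,usebackuproot /dev/sdc18 /mnt

# 2. If it mounts, copy everything off before doing anything else.
cp -a /mnt/. /path/to/safe/storage/

# 3. If it won't mount, try btrfs restore (reads the device, never
#    writes to it) to scrape files off instead.
btrfs restore /dev/sdc18 /path/to/safe/storage

# 4. Only after the above, and after asking the list, consider the
#    risky write-mode repair:
# btrfs check --repair /dev/sdc18
```
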
> All this was done on a rescue system or live system of opensuse
> Now they told me that I should do
>
> "btrfs restore"
>
> with guidance of the list
>
> So please can you guide me what to do to recover the filesystem....
What btrfs restore does is try to recover files off the unmountable
filesystem, putting what it recovers elsewhere. This is actually a good
idea and should have been done earlier, since it doesn't further damage
the existing filesystem, and gives you a chance at getting at the files
before trying riskier operations like btrfs check --repair.
Of course, as the admin's first rule of backups states, the true value of
data isn't defined by arbitrary claims, but rather, by the number of
backups you consider it worth having of that data, just in case. Thus,
only data of such trivial value that it's not worth the time/trouble/
resources to back it up won't have any backups at all.
Which means that the only thing you should need btrfs restore for is a
chance at recovering the data that changed since your last backup:
data of trivial enough value that it wasn't yet worth doing another
backup, or that backup would already have been done.
So it shouldn't be a big deal if btrfs restore doesn't work, and/or if
you lose everything on the filesystem, since if it was of more than
trivial value, you can simply restore from the backup that you made,
because that's the /definition/ of data value. Otherwise, you were
simply defining the data as of throw-away value, not worth the trouble to
backup, so losing it isn't a big deal.
Which takes the pressure off trying to restore or otherwise recover,
since in any case, you always saved what was of most value to you, either
the data because you had it backed up, or the time/trouble/resources you
would have otherwise put into that backup, if saving that time/trouble/
resources was more valuable to you than the data you otherwise would have
backed up.
> I have now removed the disk from the original system and tried to
> mount it on leap 15 and of course it won't work :-(
>
> Information of my leap 15 system which has not damaged root fs of my
> leap 42.2
>
> btrfs --version
>
> btrfs-progs v4.15
>
> uname -a
>
> Linux laptop 4.12.14-lp150.12.16-default #1 SMP Tue Aug 14 17:51:27 UTC
> 2018 (28574e6) x86_64 x86_64 x86_64 GNU/Linux
FWIW, when the filesystem is still mountable, it's the kernel version
that's critical, and commands such as btrfs balance and btrfs scrub
actually call kernel functionality to do what they do, so for them a
current kernel will normally work best.
But once the btrfs won't mount and you're using commands like btrfs
check, btrfs rescue, btrfs restore, etc, on the unmountable filesystem,
it's the btrfs-progs version that's critical, and you'll normally want
the very latest version, since that has the latest fixes and the greatest
chance at fixing things or for restore, scraping files off the damaged
filesystem.
So before doing the btrfs restore, you should find a current btrfs-progs,
4.17.1 ATM, to do it with, as that should give you the best results. Try
Fedora Rawhide or Arch (or the Gentoo I run), as they tend to have more
current versions.
Then you need some place to put the scraped files, a writable filesystem
with enough space to put what you're trying to restore.
Once you have some place to put the scraped files, with luck, it's a
simple case of running...
btrfs restore <options> <device> <path>
... where ...
<device> is the damaged filesystem
<path> is the path on the writable filesystem where you want to dump the
restored files
and <options> can include various options as found in the btrfs-restore
manpage, such as -m/--metadata to restore owner/times/permissions for
the files, -s/--symlink to restore symlinks, -x/--xattr to restore
extended attributes, etc.
You may want to do a dry-run with -D/--dry-run first, to get some idea of
whether it's looking like it can restore many of the files or not, and
thus, of the sort of free space you may need on the writable filesystem
to store the files it can restore.
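Putting that together, a typical session might look like this (the
device and destination path are assumptions for illustration; the
destination must be a separate, writable filesystem):

```shell
# Dry run first: lists what restore believes it can recover,
# without writing anything, so you can size the destination.
btrfs restore -D -v /dev/sdc18 /mnt/recovery

# If that looks reasonable, do the real run, also restoring
# metadata (owner/times/perms), symlinks, and xattrs.
btrfs restore -m -s -x -v /dev/sdc18 /mnt/recovery
```
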
If a simple btrfs restore doesn't seem to get anything, there is an
advanced mode as well, with a link to the wiki page covering it in the
btrfs-restore manpage, but it does get quite technical, and results may
vary. You will likely need help with that if you decide to try it, but
as they say, that's a bridge we can cross when/if we get to it, no need
to deal with it just yet.
Meanwhile, again, don't worry too much about whether you can recover
anything here or not, because in any case you already have what was most
important to you, either backups you can restore from if you considered
the data worth having them, or the time and trouble you would have put
into those backups, if you considered saving that more important than
making the backups. So losing the data on the filesystem, whether from
filesystem error as seems to be the case here, due to admin fat-fingering
(the infamous rm -rf .* or the like), or due to physical device loss if
the
disks/ssds themselves went bad, can never be a big deal, because the
maximum value of the data in question is always strictly limited to that
of the point at which having a backup is more important than the time/
trouble/resources you save(d) by not having one.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman