On 2020-03-16 19:26, Qu Wenruo wrote:
> On 2020/3/16 1:19 PM, Tomasz Chmielewski wrote:
>> On 2020-03-16 14:06, Qu Wenruo wrote:
>>> On 2020/3/16 11:13 AM, Tomasz Chmielewski wrote:
>>>> After upgrading to Linux 5.5 (tried 5.5.6, 5.5.9, also 5.6.0-rc5), the
>>>> system panics shortly after mounting and starting to use a btrfs
>>>> filesystem. Here is a dmesg - please advise how to deal with it.
>>>> It has since crashed several times because of the panic=10 parameter
>>>> (system boots, runs for a while, crashes, boots again, and so on).
>>>>
>>>> Mount options:
>>>>
>>>> noatime,ssd,space_cache=v2,user_subvol_rm_allowed
>>>>
>>>> [   65.777428] BTRFS info (device sda2): enabling ssd optimizations
>>>> [   65.777435] BTRFS info (device sda2): using free space tree
>>>> [   65.777436] BTRFS info (device sda2): has skinny extents
>>>> [   98.225099] BTRFS error (device sda2): parent transid verify failed
>>>> on 19718118866944 wanted 664218442 found 674530371
>>>> [   98.225594] BTRFS error (device sda2): parent transid verify failed
>>>> on 19718118866944 wanted 664218442 found 674530371
>>>
>>> This is the root cause, not quota.
>>>
>>> The metadata is already corrupted, and quota is just the first thing to
>>> complain about it.
>>
>> Still, should it crash the server, putting it into a cycle of
>> crash-boot-crash-boot, possibly breaking the filesystem even more?
>
> The transid mismatch is the cause in the first place, and I'm not sure
> how it happened.
>
> Do you have any history of the kernels used on that server?
>
> One potential corruption source is kernels v5.2.0~v5.2.14, which could
> leave some tree blocks unwritten to disk.

Yes, it has run a lot of kernels, starting with 4.18 or perhaps even earlier.

>> Also, how do I fix that corruption?
>>
>> This server had a drive added, a full balance (to RAID-10 for data and
>> metadata) and a scrub a few weeks ago, with no errors. Running a scrub
>> now to see if it shows up anything.
>
> Then at least at that time it was not corrupted.
>
> Was there any sudden power loss in recent days?
> Another potential cause is out-of-spec FLUSH/FUA behavior, i.e. the
> hard disk controller not correctly reporting FLUSH/FUA completion.
>
> That means if you use the same disk/controller and manually cause a
> power loss, it would fail after just a few cycles.

Power loss - possibly there was.

Tomasz
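
For anyone hitting the same "parent transid verify failed" errors, a rough,
read-only sequence for gathering more information before attempting any
repair. The mount point /srv is only an example (the thread does not name
one), and btrfs check needs the filesystem unmounted:

  # Read-only checks only; none of these should modify the filesystem.

  # Scrub the mounted filesystem and look for checksum/transid errors.
  btrfs scrub start -B /srv      # -B runs in the foreground and prints stats
  btrfs scrub status /srv

  # With the filesystem unmounted, run a read-only metadata check.
  umount /srv
  btrfs check --readonly /dev/sda2

  # If a normal mount keeps crashing, try mounting read-only from a
  # backup tree root to see whether the data is still reachable.
  mount -o ro,usebackuproot /dev/sda2 /mnt

This is only a diagnostic sketch; it deliberately avoids btrfs check
--repair, which the btrfs documentation warns against running unless a
developer advises it for the specific corruption.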