'Re: Fedora 32 System-Wide Change proposal: Enable fstrim.timer by default'


List:       fedora-devel-list
Subject:    Re: Fedora 32 System-Wide Change proposal: Enable fstrim.timer by default
From:       Chris Murphy <lists () colorremedies ! com>
Date:       2019-12-22 21:00:08
Message-ID: CAJCQCtRkujLDd1uhBuKZEAg4GA=kC3egwq-Z8WrwqU30WfB+WA () mail ! gmail ! com

On Fri, Dec 20, 2019 at 2:24 AM Lennart Poettering <mzerqung@0pointer.de> wrote:
> So, if this is desirable, why doesn't the kernel do this on its own?

A relevant recent article on the subject:

Issues around discard
https://lwn.net/Articles/787272/

There is a huge range of flash translation layer (FTL) behaviors
across device firmware. There's no mechanism for that behavior to be
announced to the file system or block layer so that the kernel could
adapt accordingly. For example, the FTL can change behavior as the SSD
fills up. That leads to users reporting different problems for the
same workload, with the same SSD make/model/firmware, differing only
in how full the SSD is.

The file systems aren't all on the same page either. Nor is the SSD
industry.


> Why do we need a userspace component that just gets an event from the
> kernel and then tells the kernel to do something? If this is generally
> desirable, why is something as trivial as that not a kernel
> functionality anyway?

Second attempt at answering this.

As it relates to online discards (using the filesystem 'discard'
mount option): adding a delay before issuing discards means the kernel
has to track possibly thousands of delayed discards. And what happens
when the delay expires? An even bigger set of discards, fairly likely
to cause a more noticeable device hang than without this hypothetical
feature. Drives don't offer any kind of rate limiting for discard.
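
To put rough numbers on that, here is a toy Python model. It is purely
illustrative and not how any kernel code works. It assumes extents are
freed at known times, and that the hypothetical delay feature simply
holds discards and issues everything pending once per delay period.
The function name and its parameters are made up for this sketch.

```python
def delayed_discard_bursts(free_times, delay, horizon):
    """Toy model of a hypothetical 'delayed discard' feature.

    free_times: times (in seconds) at which an extent is freed.
    delay:      how long discards are held before being issued together.
    horizon:    stop simulating after this many seconds.

    Returns a list of (flush_time, discards_issued) tuples.
    """
    events = sorted(free_times)
    bursts = []
    pending = 0      # discards the kernel is currently tracking
    i = 0
    t = delay
    while t <= horizon:
        # Everything freed up to this flush point is still being held.
        while i < len(events) and events[i] <= t:
            pending += 1
            i += 1
        # The delay expires: the whole backlog hits the device at once.
        bursts.append((t, pending))
        pending = 0
        t += delay
    return bursts
```

With one extent freed per second over 100 seconds, a 10 second delay
produces bursts of roughly 10 discards each, while a 50 second delay
produces bursts of roughly 50. The total is the same; it just arrives
in bigger lumps.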

Conversely, the batch discard method (e.g. fstrim.timer->fstrim
command) causes the file system to search for unused blocks. And yes,
that once-a-week batch could be huge. Hence doing it at midnight on a
Monday; if that's missed, it happens at some point during the first
boot after that. Free space fragmentation, as a result of file system
aging, affects the duration of the batch discard. Combined with some
(perhaps many) SSDs that have slow discard behavior, this can make the
batch discard take anywhere from a second to minutes. For the same
SSD.
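
A rough sketch of why fragmentation matters for the batch case: the
file system has to walk its free space and issue one discard per free
extent. This toy Python version treats a list of booleans as a block
allocation bitmap (True = in use). The function name is hypothetical,
and min_len is only loosely modeled on the minimum extent length that
fstrim accepts via --minimum; real file systems do this very
differently internally.

```python
def free_extents(bitmap, min_len=1):
    """Return (start, length) for each run of free blocks in `bitmap`.

    bitmap:  sequence of booleans, True meaning the block is in use.
    min_len: skip free runs shorter than this many blocks.
    """
    extents = []
    start = None  # start index of the free run we are inside, if any
    for i, used in enumerate(bitmap):
        if not used and start is None:
            start = i                      # a free run begins here
        elif used and start is not None:
            if i - start >= min_len:       # free run ended at i - 1
                extents.append((start, i - start))
            start = None
    # Handle a free run that extends to the end of the bitmap.
    if start is not None and len(bitmap) - start >= min_len:
        extents.append((start, len(bitmap) - start))
    return extents
```

For a "fresh" layout such as [True] * 8 + [False] * 8 this finds a
single 8-block extent. For an "aged" layout such as [True, False] * 8,
which has the same amount of free space, it finds eight 1-block
extents, so eight separate discards for the same number of free blocks.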

One idea kernel developers have, literally, is to enable the 'discard'
mount option (online discard) by default. The idea is that inevitably
irate end users will compel vendors to fix their shit via ridicule. I
find this hilarious because the most civil "go fuck yourself" invented
in the modern era is the expired warranty. I expect this idea would
just lead to a lot of finger pointing, and they really should have
just done this 9 years ago, hindsight being 20/20.

So yeah. *shrug*

-- 
Chris Murphy
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
