
List:       fedora-devel-list
Subject:    Re: Fedora 32 System-Wide Change proposal (late): Enable EarlyOOM
From:       Benjamin Berg <bberg@redhat.com>
Date:       2020-01-09 12:57:24
Message-ID: 4db4d8de4cb280409e906c4fefca31e847992764.camel@redhat.com

On Wed, 2020-01-08 at 12:24 -0700, Chris Murphy wrote:
> On Mon, Jan 6, 2020 at 11:09 AM Lennart Poettering <mzerqung@0pointer.de> wrote:
> > - Facebook is working on making oomd something that just works for
> >   everyone. They are in the final rounds of canonicalizing the
> >   configuration so that it can just work for all workloads without
> >   tuning. The last bits needed for this to be deployable are
> >   currently being done on the kernel side ("iocost"); once that is
> >   in, they'll submit oomd (or simplified parts of it) to systemd, so
> >   that it's just there and works. It is their expressed intention to
> >   make this something that also works for desktop workloads and
> >   requires no further tuning, and they will also do the necessary
> >   systemd work. Time frame: half a year, maybe one year, but no
> >   guarantees.
> 
> Looks like PSI-based OOM killing doesn't work without swap, so oomd
> can't be considered a universal solution. Quite a lot of developers
> have workstations with a decent amount of RAM, ~64 GiB, and no swap at
> all. Bare-metal servers are likewise mixed, depending on workload, and
> in the cloud it's rare for swap to exist.
> 
> https://github.com/facebookincubator/oomd/issues/80
> 
> We think earlyoom can be adjusted to work well for both the swap and
> no-swap use cases.

But so can oomd: after all, they are willing to implement a plugin that
uses the MemAvailable heuristic. It just won't be available in the
short term.
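
For illustration, the core of such a MemAvailable heuristic is very
simple. A rough sketch (this is not earlyoom's actual code, and the 10%
threshold is only an example):

  # Read MemAvailable and MemTotal (both in kB) from /proc/meminfo.
  avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
  total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  # Below 10% available, an early-OOM daemon would pick a victim
  # (earlyoom goes for the process with the highest oom_score) and
  # kill it; here we only report the condition.
  if [ $(( avail * 100 / total )) -lt 10 ]; then
      echo "low memory: ${avail} kB of ${total} kB available"
  fi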

In principle, I think what we are trying to achieve here is to keep the
system mostly responsive from the user's perspective. That seems to
imply keeping the pages that belong to "important" processes in main
memory.

Should oomd not manage to do this well enough out of the box, I see two
main methods to improve things:

 * Aggressively kill when we think important pages might get evicted:
   - earlyoom does this based on MemAvailable
   - an oomd plugin could do the same if that is deemed the right
     approach
 * Actively protect important processes[1] (see the sketch after this
   list):
   - set MemoryMin/MemoryLow on important units
   - limit "normal" processes more, e.g. via MemoryHigh for applications
   - in the long run: adjust OOMScoreAdjust/MemoryHigh dynamically,
     based on whether the user is interacting with an application at
     the time
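
As a sketch of what the second approach could look like in practice
(configuration examples only, not tested recommendations; the slice and
scope names are assumptions about how the session is laid out):

  # See how the session is organised first; slice/scope names differ
  # between setups:
  $ systemctl --user list-units --type=slice,scope

  # Persistently cap what "normal" applications may use, assuming they
  # are grouped under the user manager's app.slice:
  $ systemctl edit --user app.slice
  [Slice]
  MemoryHigh=80%

  # Or temporarily throttle a single running application scope (the
  # scope name below is made up, substitute a real one):
  $ systemctl --user set-property --runtime \
        app-gnome-Foo-1234.scope MemoryHigh=2G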

earlyoom implements the first approach and has the big advantage that
it can be shipped in F32. However, it is not clear to me that this
aggressive heuristic is actually better overall. And even if it is, we
would likely still want to move it into oomd in the long run.

Finally, for F32 we might already be able to improve things quite a lot
simply by setting a few configuration options in the GNOME systemd
units.
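
To see which GNOME units a user session actually has, and which of them
could be given such options, something like this is a starting point:

  $ systemctl --user list-units 'gnome-*' 'org.gnome.*'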

Benjamin

[1] I do not know how well this works, so it would be nice if people
experimented with it[2]. For GNOME you can easily add a systemd drop-in
for various services; e.g. to protect the shell (in a Wayland session),
run the following and put the [Service] lines into the drop-in file
that systemctl opens in your editor:

$ systemctl edit --user gnome-shell-wayland.service
[Service]
MemoryMin=250M
MemoryLow=500M

I suspect this should already help a lot in many scenarios.
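
Whether the drop-in took effect can be checked with something like

  $ systemctl --user show gnome-shell-wayland.service \
        -p MemoryMin -p MemoryLow

which reports the configured values in bytes; systemd-cgtop can then be
used to watch per-cgroup memory use.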

[2] Unfortunately, I suspect that such measurements may be skewed quite
a bit on systems that use swap, due to unrelated lags; see e.g. Jan
Grulich's mail from earlier today titled "Lagging system with latest
kernels".
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
