[prev in list] [next in list] [prev in thread] [next in thread] 

List:       mesos-commits
Subject:    (mesos) branch master updated: Stopped checking if destroyed cgroup is empty.
From:       bmahler () apache ! org
Date:       2024-01-17 20:55:53
Message-ID: 170552495369.935445.14389825858437426645 () gitbox2-he-fi ! apache ! org
[Download RAW message or body]

This is an automated email from the ASF dual-hosted git repository.

bmahler pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git


The following commit(s) were added to refs/heads/master by this push:
     new 41a170661 Stopped checking if destroyed cgroup is empty.
41a170661 is described below

commit 41a170661ae94dade2d30df167c75cafddaaae79
Author: Ilya Pronin <ipronin@twitter.com>
AuthorDate: Mon May 18 11:14:06 2020 -0700

    Stopped checking if destroyed cgroup is empty.
    
    Linux patch 9c974c77246460fa6a92c18554c3311c8c83c160 makes exiting
    processes that are still attached to a cgroup visible in cgroup.procs.
    It is a fix/workaround for the issue where the cgroup cannot be removed
    even though it appears empty.
    
    We already have a workaround for the removal problem in place: cgroup
    destroyer retries rmdir until it is successful.
    
    With this change we add a workaround for the new cgroup.procs behavior.
    We can no longer expect a cgroup to look empty after TasksKiller reaps
    all processes that used to run inside it. So, instead, we rely on the
    original workaround to ensure that the cgroup is successfuly removed and
    keep process listing only for debugging purposes.
    
    Alternatively, we could await for the cgroup to become empty before
    successfuly returning from TasksKiller. But it appears unnecessary with
    the remove workaround in place.
---
 src/linux/cgroups.cpp | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/linux/cgroups.cpp b/src/linux/cgroups.cpp
index 2af654610..c1272fbca 100644
--- a/src/linux/cgroups.cpp
+++ b/src/linux/cgroups.cpp
@@ -1515,17 +1515,23 @@ private:
       return;
     }
 
-    // Verify the cgroup is now empty.
+    // With the Linux patch 9c974c77246460fa6a92c18554c3311c8c83c160 it is
+    // possible that the cgroup might not be empty even though processes inside
+    // it are actually dead. This should be OK because Destroyer keeps
+    // attempting to remove the cgroup until all terminated processes are
+    // detached from it. But we would like some visibility into what's going on.
     Try<set<pid_t>> processes = cgroups::processes(hierarchy, cgroup);
-
-    // If the `cgroup` is already removed, treat this as a success.
-    if ((processes.isError() || !processes->empty()) &&
-        os::exists(path::join(hierarchy, cgroup))) {
-      promise.fail("Failed to kill all processes in cgroup: " +
-                   (processes.isError() ? processes.error()
-                                        : "processes remain"));
-      terminate(self());
-      return;
+    if (processes.isError()) {
+      if (os::exists(path::join(hierarchy, cgroup))) {
+        promise.fail("Failed to kill all processes in cgroup: " +
+                     processes.error());
+        terminate(self());
+        return;
+      }
+    } else if (!processes->empty()) {
+      LOG(WARNING) << "PIDs " << strings::join(",", processes.get())
+                   << " are still attached to cgroup "
+                   << path::join(hierarchy, cgroup);
     }
 
     promise.set(Nothing());

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic