[prev in list] [next in list] [prev in thread] [next in thread] 

List:       linux-keyrings
Subject:    Re: [RFC PATCH 02/27] containers: Implement containers as kernel objects
From:       Ian Kent <raven () themaw ! net>
Date:       2019-02-21 10:39:28
Message-ID: 6c907b21aca7b93c3b637ba0e30de4c6acb356f4.camel () themaw ! net
[Download RAW message or body]

On Wed, 2019-02-20 at 14:26 +0100, Christian Brauner wrote:
> On Wed, Feb 20, 2019 at 10:46:24AM +0800, Ian Kent wrote:
> > On Fri, 2019-02-15 at 16:07 +0000, David Howells wrote:
> > > Implement a kernel container object such that it contains the following
> > > things:
> > > 
> > >  (1) Namespaces.
> > > 
> > >  (2) A root directory.
> > > 
> > >  (3) A set of processes, including one designated as the 'init' process.
> > 
> > Yeah, I think a name other than init needs to be used for this
> > process.
> > 
> > The problem being that there is no requirement for container
> > process 1 to behave in any way like an "init" process is
> > expected to behave and that leads to confusion (at least
> > it certainly did for me).
> 
> If you look at the documentation for pid namespaces(7) you can see that
> the pid 1 inside a pid namespace is expected to behave like an init
> process:
> -  "The  first  process created in a new namespace [...] has  the PID 1,
>    and is the "init" process for the namespace (see init(1))."
> - "[...] child process that is orphaned within the namespace will be
>   reparented to this process rather than init(1) [...]"
> - "If the "init" process of a PID namespace terminates, the kernel
>   terminates all of the processes in the  namespace  via a SIGKILL
>   signal. This behavior reflects the fact that the "init" process is
>   essential for the cor- rect operation of a PID namespace."
> - "Only signals for which the "init" process has established a signal
>   handler can be sent to the  "init" process by other members of the
>   PID namespace."
> - "[...] the reboot(2) system call causes a signal to be sent to the
>   namespace "init" process."
> 
> This is one of the reasons why all major current container runtimes
> finally after years of failing to realize this run a stub init process
> that mimicks a dumb init. Sure, you get away with not having an init
> that behaves like an init but this is inherently broken or at least
> against the way pid namespaces were designed.

TBH I wasn't sure why the signal I sent didn't arrive, AFAICS
it should have regardless of what signals the container init
process was accepting. But it could have been due to a
different problem in my kernel code (that's very likely).

In any case it wasn't worth perusing because even if I did work
it out I had already found that the request_key sub-system wasn't
playing well with others when trying to run something within a
container's namespaces, so no point in going further ...

Ian

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic