[prev in list] [next in list] [prev in thread] [next in thread] 

List:       freebsd-hackers
Subject:    Re: poll() POLLHUP behaviour inconsistency
From:       Mateusz Guzik <mjguzik () gmail ! com>
Date:       2020-11-04 23:22:57
Message-ID: CAGudoHEK04aXMvAsa4ufPfUUzX6RUJ2Uiv6K06RccGmR3=+V+A () mail ! gmail ! com
[Download RAW message or body]

On 11/4/20, Jérémie Galarneau <jeremie.galarneau@efficios.com> wrote:
> On Tue, 3 Nov 2020 at 19:12, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>
>> On 11/4/20, Jérémie Galarneau <jeremie.galarneau@efficios.com> wrote:
>> > Hi,
>> >
>> > Michael and myself are porting code from Linux to FreeBSD and we have
>> > noticed a
>> > peculiar difference in the way poll() events are handled.
>> >
>> > In short, we have a process that monitors the lifetime of other
>> > processes.
>> > It
>> > does so by sharing a pipe between the parent and the child on every
>> > fork:
>> > read-end in the parent, write-end in the child. The pipe is not used to
>> > communicate; it's only used to poll() on the death of the child
>> > process.
>> >
>> > On Linux, poll() is called with a POLLHUP event and nothing else. When
>> > the child process dies, the poll() call returns with 'revents ==
>> > POLLHUP'.
>> >
>> > After some head scratching, we figured that on FreeBSD (12.1 and 12.2)
>> > if
>> > the
>> > child process died while the parent was not waiting in poll(), we would
>> > get
>> > 'revents == POLLHUP' on the next invocation of poll(), like we do on
>> > Linux.
>> > However, if the parent is in poll() when the child dies, the call to
>> > poll()
>> > never unblocks. This resulted in occasional hangs in the application.
>> >
>> > I am joining a reproducer [1].
>> >
>> >
>> > As indicated, changing the 'events' to 'POLLIN | POLLHUP' causes both
>> > events
>> > to
>> > be delivered in both cases (child dies before/during calling poll()).
>> >
>> > The following excerpts of the FreeBSD, Linux, and Open Specification
>> > seem
>> > in agreement that passing POLLHUP is unnecessary as it is checked
>> > implicitly.
>> >
>> > FreeBSD (POLL(2))
>> >   This flag is always checked, even if not present in the events
>> > bitmask
>> > [...]
>> >
>> > Open Group:
>> >   This flag is only valid in the revents bitmask; it shall be ignored in
>> > the
>> >   events member.
>> >
>> > Linux (poll(2)):
>> >   Hang up (only  returned  in revents; ignored in events).
>> >
>> >
>> > I am surprised by the behaviour being different depending on the moment
>> > the
>> > child process' death occurs.
>> >
>> > This is not a big deal for us to work-around, but I would like to know
>> > if I
>> > should open a bug and try to fix it or if this is intentional (and
>> > perhaps
>> > documented?) behaviour.
>> >
>> > Thanks!
>> > Jérémie Galarneau
>> >
>> > [1] https://gist.github.com/jgalar/5c3c2673b69fa0df652bda80a88f860c
>> >
>>
>> Thanks for the detailed report with a reproducer.
>>
>> pipe_poll checks for POLLIN | POLLRDNORM and POLLOUT | POLLWRNORM in
>> order to decide whether to add itself to the list of waiters. Since
>> you don't specify any of it and POLLHUP condition is not met, it
>> neglects to do anything but at the same time does not return any
>> events to poll itself. Then poll blocks waiting for wakeups which
>> never come since pipe_poll did not add us anywhere.
>>
>> A trivial hack looks like this:
>> diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
>> index 239cf3bbdfb..59bc03e032a 100644
>> --- a/sys/kern/sys_pipe.c
>> +++ b/sys/kern/sys_pipe.c
>> @@ -1458,13 +1458,13 @@ pipe_poll(struct file *fp, int events, struct
>> ucred *active_cred,
>>         }
>>
>>         if (revents == 0) {
>> -               if (fp->f_flag & FREAD && events & (POLLIN | POLLRDNORM))
>> {
>> +               if (fp->f_flag & FREAD) {
>>                         selrecord(td, &rpipe->pipe_sel);
>>                         if (SEL_WAITING(&rpipe->pipe_sel))
>>                                 rpipe->pipe_state |= PIPE_SEL;
>>                 }
>>
>> -               if (fp->f_flag & FWRITE && events & (POLLOUT |
>> POLLWRNORM)) {
>> +               if (fp->f_flag & FWRITE) {
>>                         selrecord(td, &wpipe->pipe_sel);
>>                         if (SEL_WAITING(&wpipe->pipe_sel))
>>                                 wpipe->pipe_state |= PIPE_SEL;
>>
>> With this in place the reproducer passes. I don't know yet if this is
>> just a pipe or general poll problem.
>>
>> I don't know what the right fix is right now, may take few days. This
>> may or may not be a candidate for errata for the 12.2 release,
>> depending on how the extensive the real fix will turn out to be.
>>
>> That said you may need to implement a workaround regardless of the
>> issue getting fixed.
>
> Hi,
>
> Thanks for the great (and quick!) reply.
> Let me know if I can help with the fix in any way.
>

The fix landed in https://svnweb.freebsd.org/changeset/base/367352 in
the development branch and I'm going to merge it in few days to the
stable/12 -- that is it will be a part of future releases.

I don't intend to work on issuing an errata for 12.2 as the problem
seems to have an easy workaround. However, if fixing this is of
importance I can look into it.

-- 
Mateusz Guzik <mjguzik gmail.com>
_______________________________________________
freebsd-hackers@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic