
List:       python-dev
Subject:    [Python-Dev] Intended invariants for signals in CPython
From:       Yonatan Zunger via Python-Dev <python-dev () python ! org>
Date:       2020-06-24 21:34:37
Message-ID: CAFk=nbRwoGhYXhS7cu4=KrYYCq6V4SUQHC7wzAp8cDhU=68+4Q () mail ! gmail ! com


Hi everyone,

I'm in the process of writing some code to defer signals during critical
regions, which has involved a good deal of reading through the CPython
implementation to understand the behaviors. Something I've found is that
there appears to be a lot of thoughtfulness about where the signal handlers
can be triggered, but this thoughtfulness is largely undocumented. I've put
together a working list of behaviors from staring at the code, but what I'd
like to figure out is which of these behaviors the devs think of as
intended to be invariants, versus which are just accidents of how the code
currently works and might change unpredictably.

And if there are things which are intended to be genuine invariants, would
it be reasonable to document these formally and make them part of the
language, not just of the CPython codebase?

What appears to be true is this:

   - Signal handlers are only invoked in the main thread. (Documented in
   the `signal` module.)
   - High-level: Signal handlers may be invoked at any instruction
   boundary. External C libraries *may* invoke them as well, but there are
   no general guarantees. (Documented in the `signal` module.)
   - Low-level: Certain functions can be described as "interruptible," and
   signal handlers may be invoked whenever these functions are called.
   - Signal handlers are thus partially reentrant: a signal handler may be
   interrupted by another signal iff it invokes an interruptible function.
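
One way to probe which reading holds is a small experiment. This is a minimal sketch, assuming a Unix platform and a script running in the main thread; the names `events`, `wait_for`, `on_usr1` and `on_usr2` are made up for illustration. The first handler sends a second signal to its own process and then spins in a plain Python loop, calling nothing from the "interruptible" list, to see whether the second handler runs before the first one returns.

```python
import os
import signal
import threading
import time

events = []            # order in which the handlers ran
ran_in_main = []       # whether each handler ran in the main thread


def wait_for(predicate, timeout=1.0):
    # Plain Python busy loop (hypothetical helper). Each iteration crosses
    # instruction boundaries, where CPython may run pending handlers.
    deadline = time.monotonic() + timeout
    while not predicate() and time.monotonic() < deadline:
        pass


def on_usr2(signum, frame):
    ran_in_main.append(threading.current_thread() is threading.main_thread())
    events.append("usr2")


def on_usr1(signum, frame):
    ran_in_main.append(threading.current_thread() is threading.main_thread())
    events.append("usr1 enter")
    # Raise a second signal while this handler is still running.
    os.kill(os.getpid(), signal.SIGUSR2)
    wait_for(lambda: "usr2" in events)
    events.append("usr1 exit")


signal.signal(signal.SIGUSR1, on_usr1)
signal.signal(signal.SIGUSR2, on_usr2)

os.kill(os.getpid(), signal.SIGUSR1)
wait_for(lambda: len(events) >= 3)
print(events, ran_in_main)
```

On the CPython versions I know of, I would expect "usr2" to land between "usr1 enter" and "usr1 exit", meaning the second handler was nested inside the first at an ordinary instruction boundary. That is exactly the behavior whose status as an invariant is in question here.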

In particular, what I'm not sure is intentional is whether the notion of an
interruptible function or instruction is meant to be an actual property of
the language and/or of the CPython runtime, or whether it's actually
intended that only the "high-level" rule above be true, and that all signal
handlers should be considered to be fully reentrant at all times. The
comments in sysmodule.c about avoiding triggering PyErr_CheckSignals()
suggest that there definitely is some thinking about this within the
CPython code itself.

The reason it would be useful to document this is so that if I'm trying to
write a fairly generic library that handles signals (like the one I'm
writing now) I can reason about where I need to be defensive about an
instruction being interrupted by yet another signal, and maybe avoid calls
to certain functions which are known to be interruptible, much like I would
avoid calling malloc() in a C signal handler.
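
For concreteness, here is one possible shape for deferring signals around a critical region. It is a minimal sketch rather than the library I'm writing: it assumes Unix, a single-threaded process, and a hypothetical helper name `defer_signals`. It blocks the signals at the OS level with `signal.pthread_sigmask`, so they stay pending until the region ends.

```python
import contextlib
import os
import signal


@contextlib.contextmanager
def defer_signals(signums):
    """Block `signums` for the duration of the with-block (hypothetical helper)."""
    # pthread_sigmask returns the previous mask, so nested uses restore correctly.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signums)
    try:
        yield
    finally:
        # Restoring the old mask lets the OS deliver anything left pending.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


received = []
signal.signal(signal.SIGUSR1, lambda signum, frame: received.append(signum))

with defer_signals({signal.SIGUSR1}):
    os.kill(os.getpid(), signal.SIGUSR1)   # stays pending while blocked
    seen_inside = list(received)           # snapshot taken inside the region
seen_after = list(received)                # snapshot taken after unblocking

print(seen_inside, seen_after)
```

Two caveats with this approach: standard signals do not queue, so several deliveries of the same signal during the region collapse into one, and the mask is per-thread, so in a multi-threaded process another thread could still receive the signal.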

In the current implementation, the interruptible functions and instructions
are:

Big categories:

   - Any function which calls PyErr_SetFromErrno, *if* errno == EINTR.
   (A catalogue of these still needs to be made; it's a much smaller set
   than the set of all calls to PyErr_SetFromErrno.)
   - Basically any open, read, or write method of a raw or buffered file
   object.
   - Likewise, any open, read, or write method on a socket.
   - In any interactive console readline, or in input().
   - object.__str__, object.__repr__, and PyObject_Print, and anything that
   falls back to these.

Specific instructions:

   - Multiplication, division, or stringification of long integers.

More specific functions:

   - In `multiprocessing.shared_memory`, SharedMemory.__init__, .close, and
   .unlink.
   - In `multiprocessing.semaphore`, Semaphore.acquire. (But interestingly,
   *not* threading.Semaphore.acquire)
   - In `signal`, pause, signal, sigwaitinfo, sigtimedwait, pthread_kill,
   and pthread_sigmask.
   - In `fcntl`, fcntl and ioctl.
   - In `traceback`, any of the print methods.
   - In `faulthandler`, dump_traceback.
   - In `select`, all of the methods. (select, epoll, etc.)
   - In `time`, sleep.
   - In `curses`, whenever you look for key input.
   - In `tkinter`, during the main loop of a Tcl/Tk app.
   - During an SSL handshake.
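
As an illustration of what "interruptible" means for one entry on this list, the sketch below arms a one-shot timer and then calls `time.sleep`. It assumes Unix (for `SIGALRM` and `signal.setitimer`) and Python 3.5 or later, where per PEP 475 the sleep resumes after the handler returns without raising. The names `fired_at` and `on_alarm` are made up for the example.

```python
import signal
import time

fired_at = []


def on_alarm(signum, frame):
    # Record when the Python-level handler actually ran.
    fired_at.append(time.monotonic())


signal.signal(signal.SIGALRM, on_alarm)

start = time.monotonic()
signal.setitimer(signal.ITIMER_REAL, 0.05)   # one-shot SIGALRM in about 50 ms
time.sleep(0.5)                              # a single call, no Python bytecode in between
end = time.monotonic()

handler_delay = fired_at[0] - start
total_sleep = end - start
print(f"handler ran after {handler_delay:.3f}s, sleep returned after {total_sleep:.3f}s")
```

If the handler's timestamp falls well before the sleep returns, the handler ran in the middle of the `time.sleep` call rather than at an instruction boundary after it.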

-- 

Yonatan Zunger

Distinguished Engineer and Chief Ethics Officer

He / Him

zunger@humu.com

100 View St, Suite 101

Mountain View, CA 94041

Humu.com <https://www.humu.com>   · LinkedIn
<https://www.linkedin.com/company/humuhq>   · Twitter
<https://twitter.com/humuinc>

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/W5LGEEWGGO7ODIAJXM54YSI2PZR5UO6Y/
 Code of Conduct: http://python.org/psf/codeofconduct/


