
List:       python-capi-sig
Subject:    [capi-sig]Re: CPython C API Design Guidelines
From:       Jack Jansen <jack.jansen () cwi ! nl>
Date:       2019-03-01 23:49:42
Message-ID: 999CABEF-E1A4-4FE9-81BF-54DD0625AB0E () cwi ! nl

I'm very late chiming in, but here goes anyway. I really like this
because I think the orthogonality of layers and rings solves the
problems that all simplifications of the API (going back to the days
when object.h was given hands and feet) have struggled with: different
includers of the headers have different ideas of what the layering is.
I have the feeling that issues like the GC issue aren't showstoppers,
and can be solved by requiring (for future code) that anything that
relies on refcounting must include "cpython/refcounting.h" early on,
and if possible and desirable GC can be abstracted out later with a
more generally applicable API.

One of the things I'm wondering: would it be an idea to sacrifice the
toplevel include files for backward compatibility? In other words: move
things like "Python.h" into "api/Python.h" for all future code, so that
the toplevel Python.h is initially kept for backward compatibility,
then at some future release starts producing compile-time warnings, and
in a later release disappears?

And another thing I'm wondering about is cost/benefit analysis. I'm
assuming (and I'm assuming you're assuming) this whole thing will be
beneficial in the long run for backward/forward compatibility of
extenders, embedders and ports, and thereby make life easier for the
people maintaining those. So I wonder whether it would be worth it to
see what the effect would be on some of those. I can think of ports of
Python to strange and wondrous platforms (MicroPython, or MVS Python,
if MVS is actually still strange and wondrous) or with strange and
wondrous embedding needs (I can think of PyObjC with its two-way
transparent bridging of objects).

Jack

> On 21 Feb 2019, at 19:25, Steve Dower <steve.dower@python.org> wrote:
> 
> I've had enough ideas bouncing around in my head that I had to get
> them written up :)
> 
> So I'm proposing to produce an informational PEP to describe what a
> "good" C API looks like and act as guidance as we implement new APIs
> or change existing ones.
> 
> This is a rough, incomplete first draft that nonetheless I think is
> enough to trigger useful discussions. It's a brain dump, but I've
> already dumped most of this before.
> 
> They're in the text below, but I'll repeat here:
> * this is NOT a brand-new API
> * this is NOT exactly what we currently have implemented
> * this is NOT a proposal to stop shipping half the standard library
> * this IS meant to provide context for discussing both the issues with
> our current API and to help drive discussions of any new API or API
> changes
> 
> I don't have any particular desire to own the entire doc, so if anyone
> wants to become a co-author I'm very open to that. However, I do have
> strong opinions on this topic after a number of years working with
> *excellent* API designs, designers and processes. If you want to
> propose a _totally_ different vision from this, please consider
> writing an alternative rather than trying to co-opt this one :)
> 
> (Doc in approximate Markdown, automatically wrapped to 72 cols for
> email and I haven't checked if that broke stuff. Sorry if it did)
> 
> Cheers,
> Steve
> 
> ---
> 
> CPython C API Design Guidelines
> 
> 
> # Abstract
> 
> This document is intended to be a set of guiding principles for
> development of the current CPython C API. Future additions and
> enhancements to the CPython C API should follow, or at least be
> influenced by, the principles described here. At a minimum, any new or
> modified C APIs should be able to be categorised according to the
> terminology defined here, even if exceptions have to be made.
> 
> 
> # Things this document is NOT
> 
> This document is NOT a design of a completely new API (though a
> hypothetical new API should follow this design).
> 
> This document is NOT documentation of the current API (though the
> current API should come to resemble it over time).
> 
> This document is NOT a set of binding rules in the same sense as PEP 7
> and PEP 8 (though designs should be tested against it and exceptions
> should be rare).
> 
> This document is NOT permission to make backwards-incompatible
> modifications to the current API (though backwards-incompatible
> modifications should still be made where warranted).
> 
> 
> # Definitions
> 
> A common understanding of certain terms is necessary for talking about
> the CPython C API. This section has two goals: to clarify existing
> common terminology, and to introduce new terminology. Terms are
> presented in a logical order, rather than alphabetically.
> 
> ## Existing terms
> 
> **Application**: Any independent program that can be launched directly.
> Compare and contrast with *extension*. CPython is normally considered an
> application.
> 
> **Extension**: A program that integrates into an application, and cannot
> be launched directly but must be loaded by that application. Python
> modules, native or otherwise, are considered extensions. When embedded
> into another application, CPython is considered an extension.
> 
> **Native extension**: A subset of all extensions that are compiled to
> the same language as the application they integrate with. When embedded
> into an application that is written in C or uses C-compatible
> conventions, CPython is considered a native extension.
> 
> **API**: Application Programming Interface. The set of interactions
> defined by an application to allow extensions to extend, control, and
> interact with the first. Typically refers to OOP objects and functions
> in the abstract. CPython has one API that applies for all scenarios in
> all contexts, though each scenario will likely only use a subset of this
> API.
> 
> **ABI**: Application Binary Interface. The implementation of an API such
> that its interactions can be realized by a digital computer. Typically
> includes memory layouts and binary representations, and is a function of
> the build tools used to compile CPython. CPython has different ABIs in
> different contexts, and a different ABI for native extensions compared
> to extensions.
> 
> **Stdlib**: Standard library. Components that build upon the Python
> language in order to provide useful building blocks and pre-written
> functionality for users.
> 
> ## New terms
> 
> These terms are introduced briefly here and described in much greater
> detail below.
> 
> **API ring**: One subset of an API for the purpose of extension
> compatibility. Extensions to CPython care about rings. Extensions choose
> to target a particular ring to trade off between deeper integration and
> tighter coupling. Targeting one ring includes access to all rings
> outside of that one. Rings are orthogonal to layers.
> 
> **API layer**: One subset of an API for the purpose of application and
> internal compatibility. Applications that embed CPython, and the CPython
> implementation itself, care about layers. Applications choose to adopt
> or implement a particular layer, implicitly including all lower layers.
> Layers are orthogonal to rings.
> 
> 
> # Quick Overview
> 
> For context as you continue reading, these are the API **rings**
> provided by CPython:
> 
> * Python ring (equivalent of the Python language)
> * CPython ring (CPython-specific APIs)
> * Internal ring (intended for internal use only)
> 
> These are the API **layers** provided by CPython:
> 
> * Optional stdlib layer (dependencies that must be explicitly required)
> * Required stdlib layer (dependencies that can be assumed)
> * Platform adaptation layer (ability to interact with the platform)
> * Core layer ("pure" mode with no platform interactivity)
> 
> (Reminder that this document does not reflect the current state of
> CPython, but is both aspirational and defining terms for the purposes of
> discussion. This is not a proposal to remove anything from the standard
> distribution!)
> 
> 
> # API Rings
> 
> CPython provides three API rings, listed here from outermost to
> innermost:
> 
> * Python ring
> * CPython ring
> * Internal ring
> 
> An extension that targets the Python ring does not have access to the
> CPython or Internal rings. Likewise, an extension that targets the
> CPython ring does not have access to the Internal ring, but does use the
> Python ring.
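
The access rule above can be modelled in a few lines of Python. This is
only a sketch for discussion: the `Ring` enum and `can_access` helper
are hypothetical names, not part of any CPython API, and the numeric
values merely encode the outermost-to-innermost ordering given in the
draft.

```python
from enum import IntEnum


class Ring(IntEnum):
    """API rings, ordered from outermost (0) to innermost (2)."""
    PYTHON = 0
    CPYTHON = 1
    INTERNAL = 2


def can_access(target: Ring, api: Ring) -> bool:
    """Return True if an extension targeting `target` may use an API in ring `api`.

    Targeting a ring grants that ring plus every ring outside it.
    """
    return api <= target
```

With this model, `can_access(Ring.CPYTHON, Ring.PYTHON)` holds while
`can_access(Ring.CPYTHON, Ring.INTERNAL)` does not.
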
> 
> When CPython is an extension of another application, that application
> can also select which ring to target.
> 
> The expectation is that all Python implementations can provide an
> equivalent Python ring, CPython officially supports extensions using the
> CPython ring when targeting CPython, and the Internal ring is available
> but unsupported.
> 
> ## Python API ring
> 
> The Python ring provides functionality that should be equivalent across
> all Python implementations - in essence, the Python language itself
> defines this ring.
> 
> The C implementation of the Python API allows native code to interact
> with Python objects as if it were written in Python. The Python API
> supports duck-typing and should correctly handle the substitution of
> alternative types.
> 
> For a concrete example, `PyObject_GetItem` is part of the Python ring
> while `PyDict_GetItem` is in the CPython ring.
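
A rough Python-level analogy of that distinction, assuming it helps to
see it without any C: `operator.getitem` goes through the generic item
protocol (as `obj[key]` does), while calling `dict.__getitem__` directly
insists on a real `dict`. The `Registry` class is a made-up example, and
the C functions differ in detail (error reporting, reference ownership),
so treat this purely as an illustration of duck-typing versus a
concrete-type API.

```python
import operator
from collections import UserDict


class Registry(UserDict):
    """A mapping that behaves like a dict but is not a dict subclass."""


reg = Registry({"answer": 42})

# Generic protocol: works for any object that implements __getitem__.
generic = operator.getitem(reg, "answer")

# Concrete-type access: only accepts genuine dict instances.
try:
    dict.__getitem__(reg, "answer")
    concrete_ok = True
except TypeError:
    concrete_ok = False
```

The generic lookup succeeds on the substitute type, whereas the
dict-specific call rejects it.
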
> 
> Compatibility requirements for the Python API match the language
> version. Specifically, code relying on the Python API should only break
> or change behaviour if the equivalent code written in Python would also
> break or change behaviour.
> 
> For CPython, including `Python.h` should only provide access to the
> Python ring. Accessing any other rings should produce a compile error.
> 
> ## CPython API ring
> 
> The CPython ring provides functionality that is specific to CPython.
> Extensions that opt in to the CPython ring are tied directly to CPython,
> but have access to functions that are specific to CPython.
> 
> Functions in the CPython ring may require the caller to be using C or be
> able to provide C structures allocated in memory.
> 
> In general, most applications that embed CPython will use the CPython
> ring. Also, native extensions in the Optional stdlib layer will
> generally use the CPython ring.
> 
> For a concrete example, the `PyCapsule` type belongs in the CPython ring
> (that is, other implementations are not required to provide this
> particular way to smuggle C pointers through Python objects).
> 
> As a second concrete example, `PyType_FromSpec` belongs in the CPython
> ring. (The equivalent in the Python ring would be to call the `type`
> object, while the equivalent in the internal ring would be to define a
> static `PyTypeObject`.)
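
For readers less familiar with the Python-ring equivalent mentioned
here, this is what "calling the `type` object" looks like. The `Vector`
class and its helper functions are invented for the example; only the
three-argument `type(name, bases, namespace)` call is the point.

```python
def _init(self, x, y):
    self.x = x
    self.y = y


def _norm(self):
    # Euclidean length of the 2-D vector.
    return (self.x ** 2 + self.y ** 2) ** 0.5


# type(name, bases, namespace) builds a new class at runtime.
Vector = type("Vector", (object,), {"__init__": _init, "norm": _norm})

v = Vector(3, 4)
```

The resulting `Vector` is an ordinary class, so any Python
implementation can support this form without exposing C structures.
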
> 
> Compatibility requirements for the CPython API match the CPython
> major.minor version. Specifically, code relying on the CPython API
> should only break or change behaviour if the major.minor version
> changes.
> 
> For CPython, as well as `Python.h`, also include `cpython/<header>.h` to
> obtain access to APIs in the CPython ring.
> 
> ## Internal API ring
> 
> The Internal ring provides functionality that is used to implement
> CPython. Extensions that opt in to the Internal ring may need to rebuild
> for every CPython build.
> 
> In general, most of the Required stdlib layer will use the Internal
> ring.
> 
> For CPython, as well as `Python.h`, also include `internal/<header>.h`
> to obtain access to APIs in the Internal ring.
> 
> 
> # API Layers
> 
> CPython provides four API layers, listed here from top to bottom:
> 
> * Optional stdlib layer
> * Required stdlib layer
> * Platform adaptation layer
> * Core layer
> 
> An application embedding Python targets one layer and all those below
> it, which affects the functionality available in Python.
> 
> Higher layers may depend on the APIs provided by lower layers, but not
> the other way around. In general, layers should aim to maximise
> interaction with the next layer down and avoid skipping it, but this is
> not a strict requirement.
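
The "higher may depend on lower, never the reverse" rule lends itself to
a mechanical check. The sketch below assumes a hypothetical table that
assigns each component to a layer; the placements loosely follow the
draft's examples, and the dependency lists are made up purely to show
one allowed and one forbidden edge.

```python
from enum import IntEnum


class Layer(IntEnum):
    """API layers, ordered from bottom (0) to top (3)."""
    CORE = 0
    PLATFORM = 1
    REQUIRED_STDLIB = 2
    OPTIONAL_STDLIB = 3


def upward_dependencies(layer_of, depends_on):
    """Return sorted (component, dependency) pairs where a component
    depends on something in a higher layer than its own."""
    bad = []
    for component, deps in depends_on.items():
        for dep in deps:
            if layer_of[dep] > layer_of[component]:
                bad.append((component, dep))
    return sorted(bad)


# Hypothetical layer assignments and dependencies, for illustration only.
layer_of = {
    "import": Layer.CORE,
    "os": Layer.PLATFORM,
    "functools": Layer.REQUIRED_STDLIB,
    "encodings": Layer.OPTIONAL_STDLIB,
    "socket": Layer.OPTIONAL_STDLIB,
}
depends_on = {
    "socket": ["os", "functools"],  # downward edges: allowed
    "import": ["encodings"],        # upward edge: violates the rule
}
violations = upward_dependencies(layer_of, depends_on)
```

Here `violations` contains only the `("import", "encodings")` pair,
which is the kind of edge the layering is meant to rule out.
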
> 
> Lower layers are required to maintain backwards compatibility more
> strictly than the layers above them.
> 
> Components within a layer that depend on other components within that
> layer must be treated as a single component for determining whether it
> may be included or omitted.
> 
> Standard Python distributions (that is, anything that may be launched
> with the `python` command) will depend upon most components in the
> Optional stdlib layer, and hence will require _everything_ from the
> Required stdlib layer and below. Only embedders and potentially
> deployment tools will use reduced layers.
> 
> (Reminder: this document does not present the current state of CPython.)
> 
> ## Core layer
> 
> This layer is the core language and evaluation engine. By adopting this
> layer, an application can provide platform-independent Python execution.
> However, it may require providing implementations of a number of
> callbacks in order to be functional (e.g. for dynamic memory
> allocation).
> 
> Examples of current components that fit into the core layer:
> 
> * Most of most built-in types (str, int, list, dict, etc.)
> * compile, exec, eval
> * read-only members of the sys module
> * import
> 
> Important but potentially non-obvious implications of relying only on
> the core layer:
> 
> * Dynamic memory allocation/deallocation is part of the Platform
> adaptation layer, but there is no way to avoid it here. So any user of
> the core API will need to provide allocators and deallocators. The
> CPython Platform adaptation layer provides the "default"
> implementations, but if an embedder does not want to use these then
> targeting the Core layer will omit them.
> * File system and standard streams are part of the Platform adaptation
> layer, which leaves `open` and `sys.stdout` (among others) without a
> default implementation. An application that wants to support these
> without adding more layers needs to provide its own implementations
> * The core layer only exposes UTF-8 APIs. Encoding and decoding for the
> current platform requires the Platform adaptation layer, while arbitrary
> encoding and decoding requires the Optional stdlib layer.
> * Imports in the core layer are satisfied by a "blind" callback. The
> Platform adaptation layer provides the support for frozen, bytecode and
> natively-encoded source imports, while the Optional stdlib layer is
> required for arbitrary encodings in source files
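
To make the "blind" import callback a bit more tangible, here is an
analogy built on today's `importlib` machinery. It is not the proposed
core-layer API (which the draft does not specify); `CallbackFinder`,
`SOURCES` and the module name `hello_embedded` are all hypothetical. The
host supplies one callable that maps a module name to source text, and
the finder knows nothing else about where modules come from.

```python
import importlib.abc
import importlib.util
import sys


class CallbackFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Resolve imports by asking a host-supplied callback for source text."""

    def __init__(self, get_source):
        # get_source(name) returns source text, or None if unknown.
        self._get_source = get_source

    def find_spec(self, fullname, path=None, target=None):
        if self._get_source(fullname) is None:
            return None  # let the next finder on sys.meta_path try
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        source = self._get_source(module.__name__)
        code = compile(source, f"<callback:{module.__name__}>", "exec")
        exec(code, module.__dict__)


# The "host application" side: modules live in an in-memory table.
SOURCES = {"hello_embedded": "GREETING = 'hi from the host'\n"}
sys.meta_path.insert(0, CallbackFinder(SOURCES.get))

import hello_embedded
```

An embedder targeting only the core layer would presumably register
something of this shape in C, with file-system, frozen and bytecode
imports layered on top by the Platform adaptation layer.
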
> 
> ## Platform adaptation layer
> 
> This layer provides the CPython implementation of platform-specific
> adapters to support the core layer.
> 
> * Memory allocation/deallocation
> * File system access
> * Standard input/output streams
> * Cryptographic random number generation
> * os module
> * CPython imports
> 
> Important but potentially non-obvious implications of relying only on
> the platform adaptation layer:
> 
> * File system access generally requires text encodings, but the full set
> of codecs is in the optional stdlib layer. To fully separate these
> layers, an implementation of the current file system encoding would be
> required in the Platform adaptation layer. (But arbitrarily
> encoding/decoding the _contents_ of a file may require higher layers.)
> * Importing from source code may also require arbitrary encodings, but
> imports that can be fully satisfied without this are provided here (e.g.
> native extension modules, precompiled bytecode, frozen modules, natively
> encoded source files)
> 
> ## Required stdlib layer
> 
> This layer provides common APIs for interactions between other modules.
> All components in the Optional stdlib layer may assume that if _they_
> are present, everything in this layer is also present.
> 
> * standard ABCs
> * compiler services (e.g. `copy`, `functools`, `traceback`)
> * standard interop types (e.g. `pathlib`, `enum`, `dataclasses`)
> 
> ## Optional stdlib layer
> 
> This layer provides modules that fundamentally stand alone. None of the
> lower layers may depend on these components being present, and
> components in this layer should explicitly declare dependencies on
> others in the same layer.
> 
> This layer is valuable for embedders and distributors that want to omit
> certain functionality. For example, omitting `socket` should be possible
> when that functionality is not required, as it is in the Optional stdlib
> layer, and omitting it should only affect those components in the
> Optional stdlib layer that have explicitly required it.
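
If components declare their requirements explicitly, working out the
impact of omitting one becomes a small graph walk. The following is a
sketch under that assumption; the `requires` table is hypothetical and
does not claim to describe the real standard library's dependencies.

```python
def affected_by_omitting(requires, omitted):
    """Return every component that directly or transitively requires `omitted`."""
    affected = set()
    frontier = {omitted}
    while frontier:
        current = frontier.pop()
        for component, deps in requires.items():
            if current in deps and component not in affected:
                affected.add(component)
                frontier.add(component)
    return affected


# Hypothetical declarations, for illustration only.
requires = {
    "ssl": {"socket"},
    "ftplib": {"socket", "ssl"},
    "email": set(),
    "statistics": set(),
}
broken = affected_by_omitting(requires, "socket")
```

With these declarations, dropping `socket` affects `ssl` and `ftplib`
but leaves `email` and `statistics` untouched, which is the behaviour
the paragraph above asks for.
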
> 
> * platform-independent algorithms (e.g. `itertools`, `statistics`)
> * application-specific functionality (e.g. `email`, `socket`, `ftplib`,
> `ssl`)
> * additional compiler services (e.g. `ast`)
> * text codecs (e.g. `base64`, `codecs`, `encodings`)
> * Python-level FFI (e.g. `ctypes`)
> * tools (e.g. `idlelib`, `pynche`, `distutils`, `msilib`)
> * configuration/information (e.g. `site`, `sysconfig`, `platform`)
> 
> Components in the Optional stdlib layer may be independently versioned.
> _______________________________________________
> capi-sig mailing list -- capi-sig@python.org
> To unsubscribe send an email to capi-sig-leave@python.org

