
List:       python-dev
Subject:    [Python-Dev] Re: (PEP 620) C API for efficient loop iterating on a sequence of PyObject** or other C
From:       Serhiy Storchaka <storchaka () gmail ! com>
Date:       2020-06-28 10:19:14
Message-ID: rd9qr3$2s9a$1 () ciao ! gmane ! io

23.06.20 12:18, Victor Stinner wrote:
> For example, we can consider continuing to provide raw access to a
> PyObject** array, but an object can reply "sorry, I don't support this
> PyObject** protocol". Also, I expect to have a function call to notify
> the object when the PyObject** view is no longer needed. Something
> like the Py_buffer protocol's PyBuffer_Release(). Maybe an object can
> generate a temporary PyObject** view that requires allocating
> resources (like memory), and the release function would release these
> resources.
> 
> Pseudo-code:
> 
> void iterate(PyObject *obj)
> {
>     PyObjectPP_View view;
> 
>     if (PyObjectPP_View_Get(&view, obj)) {
>         // fast-path: the object provides a PyObject** view
>         for (Py_ssize_t i = 0; i < view.len; i++) {
>             PyObject *item = view.array[i];
>             ...
>         }
>         PyObjectPP_View_Release(&view);
>     }
>     else {
>         // slow code path using PySequence_GetItem() or anything else
>         ...
>     }
> }
> 
> Maybe PyObjectPP_View_Get() should increment the object reference
> counter to ensure that the object cannot be destroyed in the loop (if
> the loop calls arbitrary Python code), and PyObjectPP_View_Release()
> would decrement its reference counter.

It is not enough. A list can change its content and size during
iteration. You need to either add an "export" count that prevents the
list from being mutated, copy the list, or use a trick such as
temporarily swapping its content with an empty list for the duration of
the iteration. In every case it is a user-visible change in behavior.
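
To make the point concrete, here is a minimal Python-level sketch of the
options above. The function names are made up for illustration. The
first function shows how iteration currently observes a mutation, the
second shows the "copy the list" option changing what the loop sees, and
the third uses bytearray, which already keeps an export count for the
buffer protocol, as a stand-in for a list with such a count.

```python
def visit_with_mutation(items):
    """Iterate a list while the loop body mutates it (current behavior)."""
    visited = []
    for item in items:
        visited.append(item)
        if item == 2:
            del items[:]  # stands in for arbitrary Python code run by the loop
    return visited


def visit_snapshot(items):
    """Same loop, but over a copy: the mutation is no longer observed."""
    visited = []
    for item in list(items):
        visited.append(item)
        if item == 2:
            del items[:]
    return visited


def resize_blocked_by_export():
    """bytearray refuses to resize while a buffer export is alive."""
    data = bytearray(b"abc")
    with memoryview(data):
        try:
            data.append(0)
        except BufferError:
            return True
    return False
```

With the input [1, 2, 3, 4], the first loop stops after 2 while the
snapshot loop visits all four items, which is exactly the kind of
difference a user could notice.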

> "PyObjectPP_View" protocol looks like PySequence_Fast() API, but IMO
> PySequence_Fast() is not generic enough. For example, the first issue
> is that it cannot reply "no, sorry, the object doesn't support
> PyObject**". It always creates a temporary list if the object is not a
> tuple or a list, that may be inefficient for a large sequence.

If you want to avoid the conversion to a list, you can check that the
object is a tuple or a list before using the PySequence_Fast*() API. I
don't see a need for a new API for this.

> Also, the "view" protocol should be allowed to query other types than
> just PyObject**. For example, what if I would like to iterate on a
> sequence of integers? bytes, array.array and memoryview can be seen as
> sequences of integers.

Weren't the buffer protocol and the memoryview object designed for
this? In Python you can call `memoryview(obj).cast('I')` and then
iterate over the integers. I think there is something similar in C.
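
A small sketch of the Python side, assuming the object exports a
C-contiguous buffer whose byte length is a multiple of the size of a
native unsigned int. The helper name native_uints is hypothetical. It
casts to 'B' first because memoryview.cast() only converts between a
byte format and another format, so an array.array('I') cannot be cast to
'I' directly.

```python
import array
import struct


def native_uints(obj):
    """Return the buffer of obj reinterpreted as native unsigned ints."""
    # Going through 'B' makes the helper work for bytes and for buffers
    # that already use a non-byte format such as array.array('I').
    view = memoryview(obj).cast('B').cast('I')
    return list(view)


packed = struct.pack('3I', 1, 2, 70000)      # bytes holding three unsigned ints
from_bytes = native_uints(packed)
from_array = native_uints(array.array('I', [5, 6]))
```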
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/AAPDO2M4OMNLVTQ4ZBWGFUFGG4IBFJZ4/