List: pykde
Subject: Re: [PyQt] QString API v2 concern...
From: Matt Newell <newellm () blur ! com>
Date: 2013-05-09 21:17:39
Message-ID: 201305091417.39787.newellm () blur ! com
On Thursday, May 09, 2013 12:40:13 PM you wrote:
> Hi Matt,
>
> I'm in no position to comment on your wider point, but...
> >
> FWIW, the Python-uses-roughly-utf16 meme is a common oversimplification.
> First, as I'm sure most people know, there are significant changes between
> Python2 str/unicode and Python3 str. That cannot but be reflected in
> differences between the CPython usage across the 2/3 boundary.
>
> What is less well known is that there is a significant change to CPython
> between 3.2 and 3.3 where the latter can store a str as either an array of
> 8, 16 or 32 bit values with automatic run-time conversions between them
> (and API changes to match). So whatever else happens within PyQt, I don't
> think the aspiration to the old 1-copy model can be relied on. In the
> event, this is what I came up with for the QString to str direction
> (corrections/optimisations welcome!):
>
>
> PyObject *Python::unicode(const QString &string)
> {
> #if PY_MAJOR_VERSION < 3
>     /* Python 2.x. http://docs.python.org/2/c-api/unicode.html */
>     PyObject *s = PyString_FromString(PQ(string));
>     PyObject *u = PyUnicode_FromEncodedObject(s, "utf-8", "strict");
>     Py_DECREF(s);
>     return u;
> #elif PY_MINOR_VERSION < 3
>     /* Python 3.2 or less. http://docs.python.org/3.2/c-api/unicode.html#unicode-objects */
> #ifdef Py_UNICODE_WIDE
>     return PyUnicode_DecodeUTF16((const char *)string.constData(), string.length() * 2, 0, 0);
> #else
>     return PyUnicode_FromUnicode(string.constData(), string.length());
> #endif
> #else
>     /* Python 3.3 or greater. http://docs.python.org/3.3/c-api/unicode.html#unicode-objects */
>     return PyUnicode_FromKindAndData(PyUnicode_2BYTE_KIND, string.constData(), string.length());
> #endif
> }
>
>
> The referenced URLs contain more material.
>
> Hth, Shaheed
>
Interesting. Thanks for the info. Looking at the Python source code for
3.3.1, it looks like Python will scan the string and then either convert it
to 8-bit data if every value falls in the Latin-1 range, or do a direct copy.
That means the situation is probably a bit worse wrt CPU usage, and a bit
better wrt memory, at least in the common case.
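
To make the trade-off concrete, here is a rough standalone sketch of the scan-then-narrow idea described above. It is not CPython's actual code: the names CompactString and from_utf16_units are made up for illustration, and it only models the 8-bit and 16-bit cases.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a compact string object (not a CPython type).
struct CompactString {
    int bytes_per_unit;              // 1 if everything fits in Latin-1, else 2
    std::size_t length;              // number of code units
    std::vector<std::uint8_t> data;  // length * bytes_per_unit bytes of storage
};

// Hypothetical helper: build a CompactString from 16-bit code units,
// such as the buffer behind a QString.
CompactString from_utf16_units(const std::uint16_t *units, std::size_t length)
{
    // Pass 1: scan for the largest code unit. This is the extra CPU cost.
    std::uint16_t max_unit = 0;
    for (std::size_t i = 0; i < length; ++i)
        max_unit = std::max(max_unit, units[i]);

    CompactString out;
    out.length = length;
    if (max_unit < 0x100) {
        // Pass 2a: every unit is in the Latin-1 range, so narrow to 8 bits.
        // This is the memory saving in the common case.
        out.bytes_per_unit = 1;
        out.data.resize(length);
        for (std::size_t i = 0; i < length; ++i)
            out.data[i] = static_cast<std::uint8_t>(units[i]);
    } else {
        // Pass 2b: at least one unit needs 16 bits, so copy the buffer as-is.
        out.bytes_per_unit = 2;
        out.data.resize(length * 2);
        std::memcpy(out.data.data(), units, length * 2);
    }
    return out;
}
```

With this sketch, a two-unit buffer holding "Qt" ends up in 2 bytes of storage, while one containing U+20AC stays at 2 bytes per unit. Either way the input is read twice, once to scan and once to copy.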
Matt
_______________________________________________
PyQt mailing list PyQt@riverbankcomputing.com
http://www.riverbankcomputing.com/mailman/listinfo/pyqt