List:       pykde
Subject:    Re: [PyQt] QString API v2 concern...
From:       Matt Newell <newellm () blur ! com>
Date:       2013-05-09 21:17:39
Message-ID: 201305091417.39787.newellm () blur ! com

On Thursday, May 09, 2013 12:40:13 PM you wrote:
> Hi Matt,
> 
> I'm in no position to comment on your wider point, but...
> > 
> FWIW, the Python-uses-roughly-utf16 meme is a common oversimplification.
> First, as I'm sure most people know, there are significant changes between
> Python2 str/unicode and Python3 str. That cannot but be reflected in
> differences between the CPython usage across the 2/3 boundary.
> 
> What is less well known is that there is a significant change to CPython
> between 3.2 and 3.3 where the latter can store a str as either an array of
> 8, 16 or 32 bit values with automatic run-time conversions between them
> (and API changes to match). So whatever else happens within PyQt, I don't
> think the aspiration to the old 1-copy model can be relied on. In the
> event, this is what I came up with for the QString to str direction
> (corrections/optimisations welcome!):
> 
> 
> PyObject *Python::unicode(const QString &string)
> {
> #if PY_MAJOR_VERSION < 3
>     /* Python 2.x. http://docs.python.org/2/c-api/unicode.html */
>     PyObject *s = PyString_FromString(PQ(string));
>     PyObject *u = PyUnicode_FromEncodedObject(s, "utf-8", "strict");
>     Py_DECREF(s);
>     return u;
> #elif PY_MINOR_VERSION < 3
>     /* Python 3.2 or less. http://docs.python.org/3.2/c-api/unicode.html#unicode-objects */
> #ifdef Py_UNICODE_WIDE
>     return PyUnicode_DecodeUTF16((const char *)string.constData(), string.length() * 2, 0, 0);
> #else
>     return PyUnicode_FromUnicode(string.constData(), string.length());
> #endif
> #else
>     /* Python 3.3 or greater. http://docs.python.org/3.3/c-api/unicode.html#unicode-objects */
>     return PyUnicode_FromKindAndData(PyUnicode_2BYTE_KIND, string.constData(), string.length());
> #endif
> }
> 
> 
> The referenced URLs contain more material.
> 
> Hth, Shaheed
> 

Interesting.  Thanks for the info.  Looking at the CPython source code for 
3.3.1, it looks like Python will first scan the string and then either convert 
it to 8-bit data, if every character falls in the Latin-1 range, or do a direct 
16-bit copy.  That means the situation is probably a bit worse wrt CPU usage 
(an extra pass over the data), and a bit better wrt memory (half the storage), 
at least in the common case.
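
To make the scan-then-store behaviour described above concrete, here is a rough, self-contained C++ model of it. This is only a sketch of the idea, not the CPython source: the names CompactString and from_ucs2 are hypothetical, and the real implementation also has a 4-byte representation and other details that are ignored here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a compact string that stores either
// 1 or 2 bytes per code unit. Not a CPython type.
struct CompactString {
    int kind;                        // 1 = Latin-1 storage, 2 = 16-bit storage
    std::vector<std::uint8_t> data;  // raw storage, kind * length bytes
    std::size_t length;              // number of code units
};

// Build a CompactString from 16-bit code units (hypothetical helper).
CompactString from_ucs2(const std::uint16_t *units, std::size_t length)
{
    // Pass 1: scan for the largest code unit. This is the extra CPU cost.
    std::uint16_t max_unit = 0;
    for (std::size_t i = 0; i < length; ++i)
        max_unit = std::max(max_unit, units[i]);

    CompactString result;
    result.length = length;

    if (max_unit < 0x100) {
        // Everything fits in Latin-1: narrow each unit to one byte.
        // This is the memory saving in the common case.
        result.kind = 1;
        result.data.reserve(length);
        for (std::size_t i = 0; i < length; ++i)
            result.data.push_back(static_cast<std::uint8_t>(units[i]));
    } else {
        // At least one unit needs 16 bits: copy the buffer byte for byte.
        result.kind = 2;
        const std::uint8_t *bytes =
            reinterpret_cast<const std::uint8_t *>(units);
        result.data.assign(bytes, bytes + length * 2);
    }

    assert(result.data.size() == result.length * result.kind);
    return result;
}
```

In this model, a two-character ASCII string such as "Qt" ends up with kind 1 and 2 bytes of storage, while a string containing U+20AC (the euro sign) keeps kind 2 and 2 bytes per code unit.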

Matt

_______________________________________________
PyQt mailing list    PyQt@riverbankcomputing.com
http://www.riverbankcomputing.com/mailman/listinfo/pyqt