List: kwrite-devel
Subject: Re: Using i18n
From: "Philipp A." <flying-sheep () web ! de>
Date: 2013-05-04 19:32:13
Message-ID: CAN8d9gn3MEcUTrzoeREZoyffqQrDNRO68BSRrOoWUSNu=tejqw () mail ! gmail ! com
besides, an internationalization function that only works with ascii would
be pretty pointless, no? =)
the whole point of unicode is that it works with foreign languages and
glyphs, so i18n better eats every unicode i throw at it and spits out
unicode strings!
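for illustration, here is a minimal sketch of the kind of workaround this implies, assuming i18n behaves as described in the quoted mail below (utf-8 bytes are fine, unicode strings get pushed through the ascii codec). the names as_i18n_bytes, make_safe_i18n and fake_i18n are made up for this example. fake_i18n is only a stand-in that mimics the observed behaviour so the snippet runs without PyKDE4; it is not the real kdecore.i18n, and its naive %1 substitution is just there to make the demo work.

```python
def as_i18n_bytes(text, encoding='utf-8'):
    """Hypothetical helper: return bytes, encoding unicode strings as UTF-8."""
    if isinstance(text, bytes):
        return text
    return text.encode(encoding)


def make_safe_i18n(i18n_func):
    """Wrap an i18n-like callable so the message always arrives as UTF-8 bytes."""
    def safe_i18n(message, *args):
        return i18n_func(as_i18n_bytes(message), *args)
    return safe_i18n


def fake_i18n(message, *args):
    """Stand-in that mimics the behaviour reported in this thread (assumption)."""
    if isinstance(message, bytes):
        # byte strings are decoded as UTF-8
        text = message.decode('utf-8')
    else:
        # unicode strings are forced through the ascii codec, which fails
        # for anything outside the ascii range
        text = message.encode('ascii').decode('ascii')
    # naive Qt-style placeholder substitution, for demonstration only
    for index, arg in enumerate(args, start=1):
        text = text.replace('%{}'.format(index), str(arg))
    return text


safe = make_safe_i18n(fake_i18n)
print(safe('\xb7 foo %1.', 'bar'))
```

with the real function one would presumably write safe = make_safe_i18n(i18n) after importing i18n from PyKDE4.kdecore, but i have not checked how the extra arguments need to be encoded, so treat that part as an open question.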
2013/5/4 Philipp A. <flying-sheep@web.de>
> > i18n only works on utf8 formatted "ascii strings"
>
> no, as i said, this works: i18n(' ·'.encode('utf-8'))
>
> " ·" is unicode, not ascii. when i encode it to utf-8 bytes, and call i18n
> on it, i18n flawlessly returns a unicode string, which means that it
> decodes the bytes it's passed as utf-8, not ascii. only if i pass it a
> unicode string, it inexplicably tries to *encode* it to bytes with the
> ascii codec. no idea why, but it's true.
>
> so what actually happens is that i18n works on either
> 1. *utf-8-encoded* byte strings of the whole unicode range
> 2. unicode strings which happen to only contain characters from the ascii
> range.
>
> so it accepts only objects that survive the following treatment:
>
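> import sys
>
> def test(t):
>     # text type: str on python 3, unicode on python 2
>     u = str if sys.version_info.major == 3 else unicode
>     if isinstance(t, u):
>         # unicode strings only survive if they are pure ascii
>         return t.encode('ascii')
>     else:
>         # byte strings pass through unchanged
>         assert isinstance(t, bytes)
>         return t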
>
>
> 2013/5/4 Albert Astals Cid <aacid@kde.org>
>
>> El Divendres, 3 de maig de 2013, a les 22:17:29, Philipp A. va escriure:
>> > well, it is:
>> > >>> from sys import version_info
>> > >>> version_info[:2]
>> >
>> > (3, 3)
>> >
>> > >>> from PyKDE4.kdecore import versionString, i18n
>> > >>> versionString()
>> >
>> > '4.10.2'
>> >
>> > >>> i18n(' ·'.encode('utf-8'))
>> >
>> > ' ·'
>> >
>> > >>> print(i18n(' ·'))
>> >
>> > Traceback (most recent call last):
>> >   File "<stdin>", line 1, in <module>
>> > UnicodeEncodeError: 'ascii' codec can't encode character '\xb7' in position 0: ordinal not in range(128)
>> >
>> > note that the result of i18n *is* a unicode string, and i18n *accepts*
>> > unicode strings, but only if those unicode strings happen to only contain
>> > ascii, just like in the bad old python2 times.
>> >
>> > so i18n is buggy on KDE 4.10, and we have to work around it.
>>
>> Why is it buggy? i18n only works on utf8 formatted "ascii strings"
>>
>> Are you expecting something in the python part to do some magic?
>>
>> Cheers,
>> Albert
>>
>> >
>> >
>> > 2013/5/3 Shaheed Haque <srhaque@theiet.org>
>> >
>> > > Just after I hit "send", I found this:
>> > >
>> > > http://www.mail-archive.com/pyqt@riverbankcomputing.com/msg14058.html
>> > >
>> > > which suggests this is not an issue???
>> > >
>> > > On 3 May 2013 20:42, Shaheed Haque <srhaque@theiet.org> wrote:
>> > >> Hi Philipp,
>> > >>
>> > >> On 3 May 2013 19:56, Philipp A. <flying-sheep@web.de> wrote:
>> > >>> Hi, i've seen some uses of kdecore.i18n popping up in Paté plugins, and
>> > >>> have some recommendations:
>> > >>>
>> > >>> 2. It takes more than one argument. so for the sake of consistency
>> > >>> instead of doing the ugly
>> > >>>
>> > >>> i18n(b'foo %(name)s.') % { 'name': 'bar'}
>> > >>>
>> > >>> or even the better
>> > >>>
>> > >>> i18n(b'foo {name}.').format(name='bar')
>> > >>>
>> > >>> we should do the Qt-style
>> > >>>
>> > >>> i18n(b'foo %1.', 'bar')
>> > >>>
>> > >>> 1. i18n takes byte strings. even on python3. this means that every time
>> > >>> a developer accustomed to python2 who doesn't know it tries to use it, the
>> > >>> plugin WILL break for python3 users.
>> > >>
>> > >> I've been using the argument syntax of the third form, but simply
>> > >> specified quoted strings (i.e. without the "b" prefix). Without really
>> > >> thinking about it, I had assumed that i18n would have done something
>> > >> plausible on Python2 (not sure exactly what though!), and on Python3 it
>> > >> would just be Unicode all the way. I'd certainly prefer to avoid having to
>> > >> use "b" all over the place.
>> > >>
>> > >> https://github.com/Werkov/PyQt4/blob/master/examples/tools/i18n/i18n.py
>> > >>
>> > >> seems to suggest that something like that is possible, but when I went
>> > >> looking for some docs on this, I could not see an obvious spec. Do you
>> > >> have a reference handy?
>> > >>
>> > >> Thanks, Shaheed
>> > >>
>> > >>> we have to come up with a solution.
>> > >>>
>> > >>> there is a possible solution here, but it involves a fairly convoluted
>> > >>> i18n replacement:
>> > >>>
>> > >>> https://projects.kde.org/projects/kde/applications/kate/repository/revisions/master/entry/addons/kate/pate/src/plugins/python_console_ipython/python_console_ipython.py#L36
>> > >>>
>> > >>> should we add that function to libkatepate and call it a day?
>> > >>>
>> > >>> _______________________________________________
>> > >>> KWrite-Devel mailing list
>> > >>> KWrite-Devel@kde.org
>> > >>> https://mail.kde.org/mailman/listinfo/kwrite-devel
>> > >
>>
>
>