
List:       kwrite-devel
Subject:    Re: Using i18n
From:       "Philipp A." <flying-sheep () web ! de>
Date:       2013-05-04 19:32:13
Message-ID: CAN8d9gn3MEcUTrzoeREZoyffqQrDNRO68BSRrOoWUSNu=tejqw () mail ! gmail ! com


besides, an internationalization function that only works with ascii would
be pretty pointless, no? =)

the whole point of unicode is that it works with foreign languages and
glyphs, so i18n had better eat every unicode string i throw at it and spit
out unicode strings!
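
A minimal sketch of the kind of workaround this implies, assuming Python 3 and assuming i18n behaves as described in this thread (utf-8 bytes are fine, unicode strings are pushed through the ascii codec). `make_unicode_safe` and `fake_i18n` are made-up names for illustration; `fake_i18n` is only a stand-in so the snippet runs without PyKDE4, not the real binding:

```python
def make_unicode_safe(translate):
    """Wrap a translate function so unicode text is utf-8 encoded first."""
    def wrapper(text, *args):
        if isinstance(text, bytes):
            # byte strings are assumed to be utf-8 already
            return translate(text, *args)
        return translate(text.encode('utf-8'), *args)
    return wrapper


def fake_i18n(message, *args):
    """Stand-in (assumption) mimicking the behaviour reported in this thread."""
    if not isinstance(message, bytes):
        # the reported quirk: unicode input goes through the ascii codec
        message = message.encode('ascii')
    return message.decode('utf-8')


safe_i18n = make_unicode_safe(fake_i18n)
print(safe_i18n('\u00b7'))  # the middle dot now survives the round trip
```

With the real function one would presumably write `make_unicode_safe(i18n)` instead; whether extra arguments need the same treatment is not covered here.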


2013/5/4 Philipp A. <flying-sheep@web.de>

> > i18n only works on utf8 formatted "ascii strings"
>
> no, as i said, this works: i18n('·'.encode('utf-8'))
>
> "·" is unicode, not ascii. when i encode it to utf-8 bytes, and call i18n
> on it, i18n flawlessly returns a unicode string, which means that it
> decodes the bytes it's passed as utf-8, not ascii. only if i pass it a
> unicode string, it inexplicably tries to *encode* it to bytes with the
> ascii codec. no idea why, but it's true.
>
> so in fact i18n works on either
> 1. *utf-8-encoded* byte strings covering the whole unicode range, or
> 2. unicode strings which happen to contain only characters from the ascii
> range.
>
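> so it accepts only objects that survive the following treatment:
>
> import sys
>
> def test(t):
>     # text type: str on python 3, unicode on python 2
>     u = str if sys.version_info.major == 3 else unicode
>     if isinstance(t, u):
>         # text is encoded with the ascii codec, which fails for anything else
>         return t.encode('ascii')
>     else:
>         # byte strings pass through unchanged
>         assert isinstance(t, bytes)
>         return t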
>
>
> 2013/5/4 Albert Astals Cid <aacid@kde.org>
>
>> On Friday, 3 May 2013, at 22:17:29, Philipp A. wrote:
>> > well, it is:
>> > >>> from sys import version_info
>> > >>> version_info[:2]
>> >
>> > (3, 3)
>> >
>> > >>> from PyKDE4.kdecore import versionString, i18n
>> > >>> versionString()
>> >
>> > '4.10.2'
>> >
>> > >>> i18n('·'.encode('utf-8'))
>> >
>> > '·'
>> >
>> > >>> print(i18n('·'))
>> >
>> > Traceback (most recent call last):
>> >   File "<stdin>", line 1, in <module>
>> > UnicodeEncodeError: 'ascii' codec can't encode character '\xb7' in position 0: ordinal not in range(128)
>> >
>> > note that the result of i18n *is* a unicode string, and i18n *accepts*
>> > unicode strings, but only if those unicode strings happen to only contain
>> > ascii – just like in the bad old python2 times.
>> >
>> > so i18n is buggy on KDE 4.10, and we have to work around it.
>>
>> Why is it buggy? i18n only works on utf8 formatted "ascii strings"
>>
>> Are you expecting something in the python part to do some magic?
>>
>> Cheers,
>>   Albert
>>
>> >
>> >
>> > 2013/5/3 Shaheed Haque <srhaque@theiet.org>
>> >
>> > > Just after I hit "send", I found this:
>> > >
>> > > http://www.mail-archive.com/pyqt@riverbankcomputing.com/msg14058.html
>> > >
>> > > which suggests this is not an issue???
>> > >
>> > > On 3 May 2013 20:42, Shaheed Haque <srhaque@theiet.org> wrote:
>> > >> Hi Philipp,
>> > >>
>> > >> On 3 May 2013 19:56, Philipp A. <flying-sheep@web.de> wrote:
>> > >>> Hi, i've seen some uses of kdecore.i18n popping up in Paté plugins, and
>> > >>> have some recommendations:
>> > >>>
>> > >>> 2. It takes more than one argument. so for the sake of consistency
>> > >>> instead of doing the ugly
>> > >>>
>> > >>>     i18n(b'foo %(name)s.') % { 'name': 'bar'}
>> > >>>
>> > >>> or the somewhat better
>> > >>>
>> > >>>     i18n(b'foo {name}.').format(name='bar')
>> > >>>
>> > >>> we should do the Qt-style
>> > >>>
>> > >>>     i18n(b'foo %1.', 'bar')
>> > >>>
>> > >>> 1.  i18n takes byte strings. even on python3. this means that every time
>> > >>> a developer accustomed to python2 uses it without knowing this, the
>> > >>> plugin WILL break for python3 users.
>> > >>
>> > >> I've been using the argument syntax of the third form, but simply
>> > >> specified quoted strings (i.e. without the "b" prefix). Without really
>> > >> thinking about it, I had assumed that i18n would have done something
>> > >> plausible on Python2 (not sure exactly what though!), and on Python3 it
>> > >> would just be Unicode all the way. I'd certainly prefer to avoid having
>> > >> to use "b" all over the place.
>> > >>
>> > >>
>> > >> https://github.com/Werkov/PyQt4/blob/master/examples/tools/i18n/i18n.py
>> > >>
>> > >> seems to suggest that something like that is possible, but when I went
>> > >> looking for some docs on this, I could not see an obvious spec. Do you
>> > >> have a reference handy?
>> > >>
>> > >> Thanks, Shaheed
>> > >>
>> > >>> we have to come up with a solution.
>> > >>>
>> > >>> there is a possible solution here, but it involves a fairly convoluted
>> > >>> i18n replacement:
>> > >>>
>> > >>> https://projects.kde.org/projects/kde/applications/kate/repository/revisions/master/entry/addons/kate/pate/src/plugins/python_console_ipython/python_console_ipython.py#L36
>> > >>>
>> > >>> should we add that function to libkatepate and call it a day?
>> > >>>
>> > >>> _______________________________________________
>> > >>> KWrite-Devel mailing list
>> > >>> KWrite-Devel@kde.org
>> > >>> https://mail.kde.org/mailman/listinfo/kwrite-devel
>> > >
>>
>
>
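
For reference, the Qt-style call quoted above, i18n(b'foo %1.', 'bar'), relies on numbered placeholders. A rough pure-python emulation of that substitution follows. This is only an illustration of the idea, not KDE's actual implementation, and `substitute` is a made-up name:

```python
import re


def substitute(template, *args):
    """Replace %1, %2, ... in template with the matching positional argument."""
    def lookup(match):
        index = int(match.group(1)) - 1  # placeholders are 1-based
        if 0 <= index < len(args):
            return str(args[index])
        return match.group(0)  # leave unknown placeholders untouched
    return re.sub(r'%(\d+)', lookup, template)


print(substitute('foo %1.', 'bar'))
print(substitute('%2 before %1', 'a', 'b'))
```

Because the placeholders are numbered, a translated template can reorder them without the call site changing.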





