List:       libguestfs
Subject:    Re: [Libguestfs] [PATCH] python: Fix UnicodeError in inspect_list_applications2() (RHBZ#1684004)
From:       Sam Eiderman <sameid () google ! com>
Date:       2020-04-20 12:38:29
Message-ID: CAFr6bUnie_uF6-cBix2J1vZi0AjFQ0Y4v7cSi2wCXMs3yoPSWw () mail ! gmail ! com

I uploaded a v2 that does what you requested, applied globally across all
the Python bindings. Tell me what you think.

On Mon, Apr 20, 2020 at 2:42 PM Daniel P. Berrangé <berrange@redhat.com>
wrote:

> On Mon, Apr 20, 2020 at 01:17:35PM +0300, Sam Eiderman wrote:
> > The python3 bindings create unicode objects from application strings
> > on the guest (e.g. installed rpm or deb packages).
> > It is documented that rpm package fields such as description should be
> > UTF-8 encoded. However, in some cases they are not valid UTF-8. On
> > SLES11 SP4 the following packages fail to be converted to unicode
> > using guestfs_int_py_fromstring() (which invokes
> > PyUnicode_FromString()):
> >
> >  PackageKit
> >  aaa_base
> >  coreutils
> >  dejavu
> >  desktop-data-SLED
> >  gnome-utils
> >  hunspell
> >  hunspell-32bit
> >  hunspell-tools
> >  libblocxx6
> >  libexif
> >  libgphoto2
> >  libgtksourceview-2_0-0
> >  libmpfr1
> >  libopensc2
> >  libopensc2-32bit
> >  liborc-0_4-0
> >  libpackagekit-glib10
> >  libpixman-1-0
> >  libpixman-1-0-32bit
> >  libpoppler-glib4
> >  libpoppler5
> >  libsensors3
> >  libtelepathy-glib0
> >  m4
> >  opensc
> >  opensc-32bit
> >  permissions
> >  pinentry
> >  poppler-tools
> >  python-gtksourceview
> >  splashy
> >  syslog-ng
> >  tar
> >  tightvnc
> >  xorg-x11
> >  xorg-x11-xauth
> >  yast2-mouse
> >
> > This is a surgical fix for inspect_list_applications2()'s description
> > field.
> >
> > Signed-off-by: Sam Eiderman <sameid@google.com>
> > ---
> >  generator/python.ml | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/generator/python.ml b/generator/python.ml
> > index f0d6b5d96..7394a943a 100644
> > --- a/generator/python.ml
> > +++ b/generator/python.ml
> > @@ -170,6 +170,14 @@ and generate_python_structs () =
> >          function
> >          | name, FString ->
> >              pr "  value = guestfs_int_py_fromstring (%s->%s);\n" typ name;
> > +            (match typ, name with
> > +            | "application", "app_description"
> > +            | "application2", "app2_description" ->
> > +                pr "  if (value == NULL) {\n";
> > +                pr "    value = guestfs_int_py_fromstring (\"\");\n";
> > +                pr "    PyErr_Clear ();\n";
> > +                pr "  }\n";
>
> I don't think this is especially friendly/helpful to users.
>
> I'm assuming that there's just a handful of characters that are not
> valid UTF-8. I think we really want a graceful conversion that will
> convert as much as possible, replacing any invalid UTF-8 with some
> generic placeholder character.
>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
>
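
For illustration only, here is a minimal Python sketch of the kind of
graceful conversion Daniel describes above: decode as much as possible and
substitute a placeholder (U+FFFD) for each invalid byte sequence. It is not
the v2 patch, which works in the generated C bindings. The helper name
decode_lossy and the sample bytes are hypothetical.

```python
def decode_lossy(raw: bytes) -> str:
    """Decode bytes as UTF-8, replacing each invalid sequence with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


# Hypothetical package description containing a Latin-1 byte (0xE9),
# which is not a valid UTF-8 sequence on its own.
sample = b"caf\xe9 tools"

# Strict decoding fails, comparable to PyUnicode_FromString() returning NULL.
try:
    sample.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lossy decoding keeps the valid text and marks the bad byte.
lossy = decode_lossy(sample)
```

At the C level, CPython's PyUnicode_DecodeUTF8() accepts the same
error-handler names (for example "replace"); whether v2 uses that call is
not stated in this message.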




_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://www.redhat.com/mailman/listinfo/libguestfs
