List: libguestfs
Subject: Re: [Libguestfs] [PATCH] python: Fix UnicodeError in inspect_list_applications2() (RHBZ#1684004)
From: Sam Eiderman <sameid () google ! com>
Date: 2020-04-20 12:38:29
Message-ID: CAFr6bUnie_uF6-cBix2J1vZi0AjFQ0Y4v7cSi2wCXMs3yoPSWw () mail ! gmail ! com
I uploaded a v2 that does what you requested, applied more globally (across
all Python bindings). Tell me what you think.
On Mon, Apr 20, 2020 at 2:42 PM Daniel P. Berrangé <berrange@redhat.com>
wrote:
> On Mon, Apr 20, 2020 at 01:17:35PM +0300, Sam Eiderman wrote:
> > The python3 bindings create unicode objects from application strings
> > on the guest (i.e. installed rpm, deb packages).
> > It is documented that rpm package fields such as description should be
> > UTF-8 encoded. However, in some cases they are not valid UTF-8.
> > On SLES11 SP4 the following packages fail to be converted to
> > unicode using guestfs_int_py_fromstring() (which invokes
> > PyUnicode_FromString()):
> >
> > PackageKit
> > aaa_base
> > coreutils
> > dejavu
> > desktop-data-SLED
> > gnome-utils
> > hunspell
> > hunspell-32bit
> > hunspell-tools
> > libblocxx6
> > libexif
> > libgphoto2
> > libgtksourceview-2_0-0
> > libmpfr1
> > libopensc2
> > libopensc2-32bit
> > liborc-0_4-0
> > libpackagekit-glib10
> > libpixman-1-0
> > libpixman-1-0-32bit
> > libpoppler-glib4
> > libpoppler5
> > libsensors3
> > libtelepathy-glib0
> > m4
> > opensc
> > opensc-32bit
> > permissions
> > pinentry
> > poppler-tools
> > python-gtksourceview
> > splashy
> > syslog-ng
> > tar
> > tightvnc
> > xorg-x11
> > xorg-x11-xauth
> > yast2-mouse
> >
> > This is a surgical fix for inspect_list_applications2()'s description
> > field.
> >
> > Signed-off-by: Sam Eiderman <sameid@google.com>
> > ---
> > generator/python.ml | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/generator/python.ml b/generator/python.ml
> > index f0d6b5d96..7394a943a 100644
> > --- a/generator/python.ml
> > +++ b/generator/python.ml
> > @@ -170,6 +170,14 @@ and generate_python_structs () =
> > function
> > | name, FString ->
> > pr " value = guestfs_int_py_fromstring (%s->%s);\n" typ name;
> > + (match typ, name with
> > + | "application", "app_description"
> > + | "application2", "app2_description" ->
> > + pr " if (value == NULL) {\n";
> > + pr " value = guestfs_int_py_fromstring (\"\");\n";
> > + pr " PyErr_Clear ();\n";
> > + pr " }\n";
>
> I don't think this is especially friendly/helpful to users.
>
> I'm assuming that there's just a handful of characters that are not
> valid UTF-8. I think we really want a graceful conversion that will
> convert as much as possible, replacing any invalid UTF-8 with some
> generic placeholder character.
>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>
>
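For reference, here is a minimal sketch of the "graceful conversion" suggested above, written in Python rather than in the generated C. It is not the v2 patch itself. The function names and the sample byte string are made up for illustration. It only shows the difference between a strict UTF-8 decode, which fails on the first bad byte, and a decode with the "replace" error handler, which substitutes U+FFFD for each invalid sequence and keeps the rest of the text. On the C side the same error-handler names can be passed to PyUnicode_DecodeUTF8(); whether that is the right call for the bindings is an assumption here.

```python
# Hypothetical sample: 0xE9 is "é" in Latin-1 but is not valid UTF-8
# when it is followed by an ASCII byte.
SAMPLE = b"Caf\xe9 utilities"


def decode_strict(data: bytes) -> str:
    """Strict decode: raises UnicodeDecodeError on invalid UTF-8."""
    return data.decode("utf-8")


def decode_graceful(data: bytes) -> str:
    """Lenient decode: each invalid sequence becomes U+FFFD."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    try:
        decode_strict(SAMPLE)
    except UnicodeDecodeError as exc:
        print("strict decode failed:", exc)
    # The valid parts of the description survive the lenient decode.
    print(repr(decode_graceful(SAMPLE)))
```

With this approach a description containing a few bad bytes is still returned almost intact, instead of being dropped or replaced by an empty string.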