
List:       perl5-porters
Subject:    [perl #72414] UTF-16 filters do not handle all surrogates gracefully
From:       "James E Keenan via RT" <perlbug-followup () perl ! org>
Date:       2017-02-27 16:35:47
Message-ID: rt-4.0.24-19765-1488213347-1620.72414-15-0 () perl ! org

On Mon, 27 Feb 2017 12:12:18 GMT, hv wrote:
> On Sun, 26 Feb 2017 19:10:09 -0800, jkeenan wrote:
> > However, I haven't been able to figure out how to use
> > Porting/bisect.pl to determine the commit at which the program first
> > completed successfully.  Suggestions?
> 
> Verify that the testcase exits non-zero on failure and zero on
> success:
> 
> % perl-5.10 ~/72414-script.pl
> Malformed UTF-16 surrogate.
> % echo $?
> 9
> % perl-blead ~/72414-script.pl
> Hello world at /home/hv/72414-script.pl line 1.
> % echo $?
> 0
>  %
> 
> Check the docs for an example of "when was this fixed":
> 
> % perldoc Porting/bisect-runner.pl | grep -A1 'stop being an error'
>         # When did this stop being an error?
>         .../Porting/bisect.pl --expect-fail -e '1 // 2'
>  %
> 
> Bisect:
> 
> % Porting/bisect.pl --expect-fail -- ./perl -Ilib ~/72414-script.pl
> [...]
> ba77e4cc9d1ceebf472c9c5c18b2377ee47062e6 is the first bad commit
> commit ba77e4cc9d1ceebf472c9c5c18b2377ee47062e6
> Author: Nicholas Clark <nick@ccl4.org>
> Date:   Thu Oct 22 19:39:30 2009 +0100
> 
> S_utf16_textfilter() needs to avoid splitting UTF-16 surrogate pairs.
> 
> Easier said than done.
> 
> :040000 040000 00e64049450c3e91b8d09afa4b676520cc75836e
> f73afa6dfba581efaa53915a40b8c611e07cf23f M       t
> :100644 100644 f795707e0d90fbc38ebad23b3b8944647530c5e0
> f105505ea49664c0a0d00a89ecff57ccb32ee284 M       toke.c
> bisect run success
> That took 1277 seconds.
>  %
> 
> The bisector could helpfully s/bad commit/good commit/ under
> expect-fail.
> 
> Hugo

Bisection confirmed:

#####
# bad
$ git show | head -1
commit b3766b12c64c46e0bcc2c1dc58cc7b96d8bef10c
$ ./perl -Ilib /home/jkeenan/learn/perl/p5p/72414-script.pl
Malformed UTF-16 surrogate.

# good
$ git show | head -1
commit ba77e4cc9d1ceebf472c9c5c18b2377ee47062e6
$ ./perl -Ilib /home/jkeenan/learn/perl/p5p/72414-script.pl
Hello world at /home/jkeenan/learn/perl/p5p/72414-script.pl line 1.
#####
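
For anyone repeating this check by hand, here is a minimal sketch of the exit-status test that the bisection above relies on. The helper name check_status is hypothetical and is not part of Porting/bisect.pl; it only reports whether a command exited zero or non-zero, which is the property Hugo suggested verifying before bisecting.

```shell
#!/bin/sh
# check_status: hypothetical helper (not part of Porting/bisect.pl).
# Runs the given command, discards its output, and reports the exit
# status, since bisection only looks at zero vs. non-zero exit.
check_status() {
    "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "pass (exit 0)"
    else
        echo "fail (exit $status)"
    fi
}

# Stand-in commands so the sketch runs anywhere; in a perl checkout one
# would instead run something like:
#   check_status ./perl -Ilib /path/to/72414-script.pl
check_status true
check_status sh -c 'exit 9'
```

Run in a checkout at the commit before the fix, the real testcase should report a failure; at ba77e4cc or later it should report a pass.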

Hugo, Tux, alh +++ for assistance in bisection.

Marking ticket Resolved.

Thank you very much.
-- 
James E Keenan (jkeenan@cpan.org)

---
via perlbug:  queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=72414