'[perl.git] branch khw/ebcdic, created. v5.17.11-243-gb407d77'

List:       perl5-changes
Subject:    [perl.git]  branch khw/ebcdic, created. v5.17.11-243-gb407d77
From:       "Karl Williamson" <public () khwilliamson ! com>
Date:       2013-04-30 18:19:55
Message-ID: E1UXF9f-0002By-B0 () camel ! ams6 ! corp ! booking ! com

In perl.git, the branch khw/ebcdic has been created

<http://perl5.git.perl.org/perl.git/commitdiff/b407d7766c85bc6bf2e30c2cd2ef9a306402bbda?hp=0000000000000000000000000000000000000000>


        at  b407d7766c85bc6bf2e30c2cd2ef9a306402bbda (commit)

- Log -----------------------------------------------------------------
commit b407d7766c85bc6bf2e30c2cd2ef9a306402bbda
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 23 18:58:54 2013 -0600

    XXX experimental pp_pack.c: 'u'

M	pp_pack.c

commit 02820fd88b0561ecbd47970f3708f617d397707b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 17:25:08 2013 -0700

    XXX CPAN Normalize
    
    This converts Unicode::Normalize to use the native tables that are used
    by Perl starting in XXX, while using the Unicode-ordered ones that were
    used before then.
    
    Another alternative would be to have mktables generate just these tables
    in Unicode ordering.

M	cpan/Unicode-Normalize/Normalize.xs

commit 5a197cd4a6f6ad23bcf3c1aa8201b30fe24c9f4f
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 17:22:55 2013 -0700

    XXX CPAN prob wrong Collate
    
    This changes to implicitly use native code points.  This is likely wrong,
    as the module comes with its own data, which are probably in terms of
    Unicode.

M	cpan/Unicode-Collate/Collate.xs

commit 50e8ac62d0449940ab8cb4f02fe79722627d1204
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 27 22:14:02 2013 -0600

    utf8.c: Remove wrapper functions.
    
    Now that the Unicode data is stored in native character set order, it is
    rare to need to work with the Unicode order.  Traditionally, the real
    work was done in functions that worked with the Unicode order, and
    wrapper functions (or macros) were used to translate to/from native.
    
    There are two groups of functions: one that translates from code point
    to UTF-8, and the other group goes the opposite direction.
    
    This commit changes the base function that translates from UTF-8 to code
    point to output native instead of Unicode.  Those extremely rare
    instances where Unicode output is needed instead will have to hand-wrap
    calls to this function with a translation macro, as now described in the
    API pod.  Prior to this, it was the other way around: the native was
    wrapped, and the rare, strict Unicode wasn't.  This eliminates a layer of
    function call overhead for a common case.
    
    The base function that translates from code point to UTF-8 retains its
    Unicode input, as that is more natural to process.  However, it is
    de-emphasized in the pod, with the functionality description moved to
    the pod for a native input wrapper function.  And, those wrappers are
    now macros in all cases; previously there was function call overhead
    sometimes.  (Equivalent exported functions are retained, however, for XS
    code that uses the Perl_foo() form.)
    
    I had hoped to rebase this commit, squashing it with an earlier commit
    in this series, eliminating the use of a temporary function name change,
    but the work involved turns out to be large, with no real payoff.

M	embed.fnc
M	embed.h
M	mathoms.c
M	proto.h
M	utf8.c
M	utf8.h

commit 0b861599c9110cd3c8a4bc9011cd870262d7531c
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 30 09:13:35 2013 -0600

    perlapi vis utf8.c: Nits

M	utf8.c

commit 21ac04cab11fdc7e4111a540dfac51ce49aa98bd
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 30 08:04:45 2013 -0600

    utf8.c: Move 2 functions to earlier in file
    
    This moves these two functions to be adjacent to the function they each
    call, thus keeping like things together.

M	utf8.c

commit b998a6105de76a79d466ece6d5779a46faf42db1
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 27 08:59:19 2013 -0600

    embed.fnc: Slight clarification in comments

M	embed.fnc

commit 673abbdc3af3fe6675936e723a9a11ad32e6fc1d
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Apr 22 14:44:08 2013 -0600

    mg.c: White-space only
    
    I found this multi-line 'if' easier to understand once re-formatted.

M	mg.c

commit b989379c1ae25341231b1cec97d50e8fd0618945
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Apr 22 14:34:47 2013 -0600

    toke.c: Remove redundant test
    
    This checks that something is both not-printable and not a word
    character, but all word characters are printable, so just the
    non-printable test suffices.
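
    The reasoning above can be checked with a small sketch.  It assumes
    the default C locale and uses a hypothetical is_word_char() (alphanumeric
    or underscore) as a stand-in; the names are illustrative and are not
    the macros toke.c actually uses.

```c
#include <ctype.h>

/* Hypothetical stand-in for a "word character" test: alphanumeric or '_'.
 * Callers must pass a value in 0..255, as the <ctype.h> functions require. */
int is_word_char(int c)
{
    return isalnum(c) || c == '_';
}

/* The two-part condition: not printable AND not a word character. */
int old_test(int c)
{
    return !isprint(c) && !is_word_char(c);
}

/* The simplified condition: not printable. */
int new_test(int c)
{
    return !isprint(c);
}

/* Count the byte values on which the two conditions disagree. */
int count_mismatches(void)
{
    int c, n = 0;

    for (c = 0; c < 256; c++)
        if (old_test(c) != new_test(c))
            n++;
    return n;
}
```

    Since every word character is printable here, the second half of the
    old condition never changes the result, and count_mismatches() finds
    no disagreement.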

M	toke.c

commit ede1131e5af2cb3ea627d4ff1fc0af13dd279e34
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 20 17:04:08 2013 -0600

    gv.c: Add comment

M	gv.c

commit fa39d8516839a395557c32e30dd192047491897e
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 19 17:02:25 2013 -0600

    XXX rebase, finish up: reenable fold_grind.t

M	t/re/fold_grind.t

commit 398e205cc26021df34185831aadc71d37d36f509
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 19 13:58:12 2013 -0600

    t/op/coreamp.t: Generalize for non-ASCII platforms

M	t/op/coreamp.t

commit b4036f082cf0a6c4c245e16b1c72e246b2b3ccea
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 19 13:19:44 2013 -0600

    XXX temporary lib/warnings.pm: Add debugging info

M	lib/warnings.pm

commit f0d98c60179636f156568e36e7dbf2e17a5fa54b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 19 13:18:20 2013 -0600

    regcomp.c: Add missing (parens) to expression
    
    A pair of parentheses was missing, leading to this 'if' not acting as
    intended.
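
    The expression itself is not quoted here, so the following is only a
    generic illustration of how missing parentheses can change an 'if'
    in C.  FLAG_MASK, FLAG_WANT and both function names are hypothetical
    and are not taken from regcomp.c.

```c
#define FLAG_MASK 0x0Fu   /* hypothetical mask */
#define FLAG_WANT 0x03u   /* hypothetical wanted value */

/* Written without inner parentheses, "flags & FLAG_MASK == FLAG_WANT"
 * groups as shown here, because == binds more tightly than &. */
int as_parsed_without_parens(unsigned flags)
{
    return (flags & (FLAG_MASK == FLAG_WANT)) != 0;
}

/* The intended test: mask first, then compare. */
int with_parens(unsigned flags)
{
    return (flags & FLAG_MASK) == FLAG_WANT;
}
```

    With flags set to 0x13 the intended form is true, while the
    unparenthesized grouping is false.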

M	regcomp.c

commit cd346eb7f5b365e7ef54a806a8cd7497ea1b746b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Apr 17 21:49:10 2013 -0600

    t/re/re_tests: Some tests are platform-specific

M	t/re/re_tests

commit 5104b329bd4d7259a4a11d600a033b4e7722c0c0
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Apr 17 21:47:41 2013 -0600

    t/re/regexp.t: Add ability to skip depending on platform
    
    This adds the capability to specify that a test is to be done only on an
    ASCII platform, or only on an EBCDIC.

M	t/re/regexp.t

commit a57313a9aa9ff8a9b4b523563e95423ba9da92cf
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Apr 17 08:22:36 2013 -0600

    t/io/crlf.t: Generalize for non-ASCII platforms

M	t/io/crlf.t

commit c6c2a05c0bbfcf6de7c20654ea2408aa5e700712
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 16 20:15:08 2013 -0600

    unicode_constants.h: regened for ebcdic

M	unicode_constants.h

commit 0557369ba11ba0e625a613e1790212371c8bcc46
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 16 15:49:06 2013 -0600

    XXX finish up t/re/regexp.t: Generalize for non-ASCII platforms
    
    This adds code to the processing of the tests in t/re/re_tests to
    automatically convert from Unicode to native character sets.
    
    Add comment about circular tests
    XXX better commit message

M	t/re/regexp.t

commit f11e45e9d5d10f4bc33c1756f39f5249c4c90f91
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 16 12:13:07 2013 -0600

    ext/B/t/b.t: Generalize for non-ASCII platforms

M	ext/B/t/b.t

commit d829285f59a911f3575680271c9a069fc1af1e44
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 16 12:02:26 2013 -0600

    dist/Safe/t/safeutf8.t: Generalize to non-ASCII platform

M	dist/Safe/t/safeutf8.t

commit 673d4136b1da614dd35f8bcc61744389399b644a
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 16 11:50:04 2013 -0600

    t/op/warn.t: Generalize for non-ASCII platforms

M	t/op/warn.t

commit f3dff4bbbd0d971ddf7872b0c2f3abc1e316dcc5
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 16 10:18:02 2013 -0600

    re/reg_email.t: Generalize for non-ASCII platforms
    
    This replaces all the hard-coded hex character values.  It uses the new
    (?[ ]) notation.  I checked that the compiled regex matches the exact
    same code points as before these changes.

M	t/re/reg_email.t

commit 8a0391e583ba9146ba3635c6682f12d8e986715a
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 16 09:04:50 2013 -0600

    t/porting/regen.t: Add file to check

M	t/porting/regen.t

commit 3f772bf8974203e1c761b7b7119cbadfc98cff24
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 16 09:03:47 2013 -0600

    dist/ExtUtils-Install/t/InstallWithMM.t: Skip if EBCDIC
    
    Because it uses JSON

M	dist/ExtUtils-Install/t/InstallWithMM.t

commit 353f2f9337cff31ba9a13b5efe9e198aa973800a
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Apr 14 21:31:04 2013 -0600

    XXX: t/lib/warnings/utf8: Experiment with malformed utf8

M	t/lib/warnings/utf8

commit e1d6b933f7d26cf087fea5fd9eabd760695d57be
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 22:04:50 2013 -0600

    XXX skip cpan tests

M	t/TEST

commit d87b511d45950997cb8428765c81519ccd2944a1
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 16:19:20 2013 -0600

    ext/XS-APItest/t/svpeek.t: Generalize for non-ASCII platforms

M	ext/XS-APItest/t/svpeek.t

commit 89e1406e5640b8d7d1984c79fb5a72cfb24ee377
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 16:14:35 2013 -0600

    ext/XS-APItest/t/svpv_magic.t: Generalize for non-ASCII platforms

M	ext/XS-APItest/t/svpv_magic.t

commit ea4b078f4e69979796a74a769446732cb33ecbf8
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 15:54:37 2013 -0600

    lib/DBM_Filter/t/encode.t: Generalize for non-ASCII platforms

M	lib/DBM_Filter/t/encode.t

commit 9708080ab55f4e9ea45b0bf5eba3c997b2aa3ed7
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 15:48:06 2013 -0600

    XXX finish up lib/dumpvar.pl: Generalize for EBCDIC
    
    Has octal constants

M	lib/dumpvar.pl

commit 9c1c5b86114bf90d7485fabc4b7927f110e2c378
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 15:35:52 2013 -0600

    XXX finish up lib/utf8.t: Generalize for non-ASCII platforms
    
    This includes choosing a different code point that has 3 bytes in both
    UTF-8 and UTF-EBCDIC, so that the pos numbers work for both.

M	lib/utf8.t

commit 8aae8cd02cbf8a4ef258270a1147e0c682918d07
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 15:16:44 2013 -0600

    t/uni/parser.t: Generalize for non-ASCII platforms

M	t/uni/parser.t

commit 268a6dfc5343ed087c87539798a1511d4363e429
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 14:41:46 2013 -0600

    t/uni/method.t: Generalize for non-ASCII platforms
    
    I couldn't figure out a way to not use the hard-coded values

M	t/uni/method.t

commit 5975c25b801ffcd66fecb6bb5aac620ad95dd2c6
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 14:26:09 2013 -0600

    t/op/magic.t: Generalize for non-ASCII platforms

M	t/op/magic.t

commit 62d44001515993d7cdca388783935c686befec61
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 13:36:41 2013 -0600

    t/io/through.t: Generalize for non-ASCII platforms
    
    This uses hard-coded values for EBCDIC because of the shell issues

M	t/io/through.t

commit e105e4bc303b4fbc8ab4dcd77f7445c682e79f33
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 13:16:00 2013 -0600

    toke.c: Fix EBCDIC bugs with single char variable names
    
    Latin1 single-character variable names should all be legal,
    but the test was not for non-ASCII, it was for variant characters.  On
    EBCDIC platforms, this isn't the same as non-ASCII.
    
    The legal control character variable names are not the same as the C0
    and DEL controls, but are \001 through \037 minus those that traditionally match
    \s on ASCII platforms, plus \c?.

M	toke.c

commit a3f37bf4a50fd3db66ee7370973b5b6782182a58
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 12:55:09 2013 -0600

    toke.c: An EBCDIC fix
    
    toCTRL(0..31) yields a printing character.  This is different from
    toCTRL(control) on EBCDIC machines.
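
    A minimal ASCII-only sketch of the first sentence, assuming the
    conventional ASCII mapping of "upper-case, then XOR with 64".  The
    function names are hypothetical, and the EBCDIC behavior is
    deliberately not modeled.

```c
#include <ctype.h>

/* ASCII-only assumption: map a character to its control counterpart
 * (and back) by upper-casing and flipping bit 6. */
int ascii_to_ctrl(int c)
{
    return toupper(c) ^ 64;
}

/* Count how many of the inputs 0..31 produce a printable result. */
int count_printable_results(void)
{
    int c, n = 0;

    for (c = 0; c < 32; c++)
        if (isprint(ascii_to_ctrl(c)))
            n++;
    return n;
}
```

    On an ASCII platform, inputs 0 through 31 land on '@' through '_',
    all of which are printable.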

M	toke.c

commit 778836bfb6c64b6a48ead242d78ad3caec50f446
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 12:52:17 2013 -0600

    XXX \c must be followed by printable
    
    This should be revised and included in 5.18 or 5.19, depending on RFC outcome.

M	dquote_static.c

commit c538d599928e657fe08a8c9e263caabea06fc63d
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 11:41:04 2013 -0600

    XXX temp toCTRL

M	dquote_static.c
M	ext/B/B.pm
M	handy.h
M	pod/perlebcdic.pod
M	t/op/chars.t

commit 22edbc1630e77d62fb420a0f6cc95d009b4aa4a4
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 09:18:41 2013 -0600

    perlio.c: Generalize for EBCDIC
    
    This code had the hex constants for CARRIAGE RETURN and LINE FEED
    hard-coded in.  It appears to me from the comments that '\r' and '\n'
    are not suitable to use instead.  This commit changes the constants to
    use the native values instead.

M	perlio.c

commit 72b96dc1bdfc38ed93befdc8940bc28bfc6bef97
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 13 09:51:34 2013 -0600

    unicode_constants.h: Add #defines for CR, LF

M	regen/unicode_constants.pl
M	unicode_constants.h

commit d044e82436ca83e2651e320368577aa68de65ec3
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Apr 7 10:45:14 2013 -0600

    t/op/goto.t: Generalize for EBCDIC

M	t/op/goto.t

commit ac1844613833b8b60b75cc4573fe8516dd285924
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 6 21:03:44 2013 -0600

    regcomp.c: White-space only, wrap comment to fit

M	regcomp.c

commit 0748f2ff52f149ab4507202c197e50f2dbb6e12b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Apr 3 20:15:17 2013 -0600

    t/re/pat.t: Generalize for EBCDIC

M	t/re/pat.t

commit 9e6fd46a036ddf939ea6d4d095577fee07252483
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Apr 3 21:56:02 2013 -0600

    XXX t/op/pack.t: Generalize for EBCDIC
    
    One unknown: what to do about uuencode

M	t/op/pack.t

commit cf3cd58b9c837cecd70e6a2a8e527e86ef5e32a5
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 6 12:56:52 2013 -0600

    regcomp.c: In EBCDIC [i-j] exclude also ASCII
    
    i and j are not adjacent in EBCDIC.  This excluded any alphabetic
    characters between them, but allowed other ASCII ones.
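
    A sketch of the idea for lower-case ranges, assuming the letter
    layout of EBCDIC code pages 037 and 1047 (a-i at 0x81-0x89, j-r at
    0x91-0x99, s-z at 0xA2-0xA9).  The helper names are hypothetical;
    this is not the code in regcomp.c.

```c
/* Lower-case letters in EBCDIC code pages 037 and 1047 sit in three runs. */
int ebcdic_is_lower(unsigned char c)
{
    return (c >= 0x81 && c <= 0x89)     /* a-i */
        || (c >= 0x91 && c <= 0x99)     /* j-r */
        || (c >= 0xA2 && c <= 0xA9);    /* s-z */
}

/* Hypothetical helper: does native byte c match the lower-case range
 * [lo-hi]?  Only letters inside the range count; bytes that fall in the
 * gaps between the runs are rejected. */
int ebcdic_lower_range_matches(unsigned char lo, unsigned char hi,
                               unsigned char c)
{
    return c >= lo && c <= hi && ebcdic_is_lower(c);
}
```

    With these values, [i-j] (0x89 to 0x91) matches only 0x89 and 0x91,
    and [a-z] does not match 0xA1, which is '~' in those code pages.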

M	regcomp.c
M	t/re/pat_advanced.t

commit 42aa56769696320c8aefc47f86091cc805ad9648
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 6 12:54:42 2013 -0600

    utf8.c: Don't use slower general-purpose function
    
    There is a macro that accomplishes the same task for a two byte UTF-8
    encoded character, and avoids the overhead of the general purpose
    function call.

M	utf8.c

commit d994ea7004bd7786446f6755c1f5ebd0b887bc7c
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 6 12:53:07 2013 -0600

    utf8.c: Don't do ++ in macro parameter
    
    The formal parameter gets evaluated multiple times on an EBCDIC
    platform, thus incrementing more than the intended once.
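
    The hazard can be reproduced with a hypothetical macro that mentions
    its parameter twice.  TWICE_EVALUATED and the two functions below are
    illustrative only and are not the macro or code from utf8.c.

```c
#include <stddef.h>

/* Hypothetical macro whose parameter appears twice in the expansion. */
#define TWICE_EVALUATED(c) (((c) < 0x80) ? (c) : 0xFF)

/* Buggy form: the ++ is inside the macro argument, so it runs once in
 * the comparison and again when the value is produced. */
size_t advance_buggy(const unsigned char *s)
{
    const unsigned char *p = s;
    int out = TWICE_EVALUATED(*p++);

    (void) out;
    return (size_t) (p - s);
}

/* Fixed form: the increment is a separate statement. */
size_t advance_fixed(const unsigned char *s)
{
    const unsigned char *p = s;
    int out = TWICE_EVALUATED(*p);

    p++;
    (void) out;
    return (size_t) (p - s);
}
```

    For an input whose first byte is below 0x80, the buggy form advances
    the pointer by two and the fixed form by one.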

M	utf8.c

commit 3c3a66716c7a569bf8b558544dcc0cd345125f68
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 6 12:50:48 2013 -0600

    utf8.c: Use macro instead of duplicating code
    
    There is a macro that accomplishes this task, and is easier to read.

M	utf8.c

commit a1ab7fcb2a13a19d0acf0d0c1aed8bfb3d2ab7c3
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 6 10:15:05 2013 -0600

    t/io/bom.t: Fix to run under EBCDIC

M	t/io/bom.t

commit 6c00a3974820781dbb06de02fb05b30866be63aa
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 5 23:34:50 2013 -0600

    t/uni/overload.t: EBCDIC fixes

M	t/uni/overload.t

commit 5d98a9019a51910542a677c14fdc22d8d935338f
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 5 23:34:13 2013 -0600

    t/uni/method.t: EBCDIC fixes

M	t/uni/method.t

commit 9ef4d43c8d8b35f9adb816601ad8b66f39337969
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 5 23:33:28 2013 -0600

    t/op/utf8magic.t: EBCDIC fixes

M	t/op/utf8magic.t

commit 1bb871086122e9d1525eb2f7efca8d517ea6fe16
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 5 23:32:57 2013 -0600

    t/op/evalbytes.t: EBCDIC fixes

M	t/op/evalbytes.t

commit 15aef1cfb50d3fd8e2486aa09ba463166be6522c
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 5 16:20:20 2013 -0600

    lib/utf8.pm: Fix pod verbatim line wrap

M	lib/utf8.pm
M	t/porting/known_pod_issues.dat

commit f869dfd4716ea58b9d1bec1a574890155ef40494
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 5 13:27:42 2013 -0600

    t/op/length.t: EBCDIC fixes

M	t/op/length.t

commit e24ea3159f0193b8b6de0e0478e74faf7baa2a7b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 6 13:01:54 2013 -0600

    t/op/utfhash.t: XXX Add debug

M	t/op/utfhash.t

commit 6207f508136411d5ddeeec17383744f6289cb6f6
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 5 12:21:21 2013 -0600

    Data-Dumper/Dumper.pm: Fix for EBCDIC

M	dist/Data-Dumper/Dumper.pm

commit fdfea61ded453db682807db7f4f47b7b173a4277
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Apr 5 12:15:58 2013 -0600

    Dumper.xs: Don't translate character twice
    
    utf8_to_uvchr() already returns the native code point; no need to
    convert again.  This code is only executed on Perls before 5.15

M	dist/Data-Dumper/Dumper.xs

commit eb2c9790c32486a7abda26616604a3da6e31f6d6
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Apr 6 20:39:22 2013 -0600

    dist/IO/t/io_utf8argv.t: Generalize and enable EBCDIC
    
    Infrastructure now exists to have this test run on EBCDIC platforms.

M	dist/IO/t/io_utf8argv.t

commit 3c0a09008c6b14659bd1886ab00a78a60355e62c
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Apr 3 21:59:16 2013 -0600

    utf8.h: Clarify comments

M	utf8.h

commit 60725362b8884d108f95a7b3e0d3eb0e3ed93a4b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Apr 3 19:06:52 2013 -0600

    XXX CPAN cpan/Test/lib/Test.pm: Fixes for EBCDIC

M	cpan/Test/lib/Test.pm

commit c37f5efa9c4daa3f210b9dcbee24361ef31e946e
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Apr 1 22:29:16 2013 -0600

    t/re/pat_re_eval.t: Some EBCDIC fixes

M	t/re/pat_re_eval.t

commit dba798f061c2cf30edf9799b70f6248edde7e95e
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 2 07:11:19 2013 -0600

    t/test.pl:  Add fcn for UTF-EBCDIC conversion
    
    This adds the function byte_utf8a_to_utf8n().  This takes the bytes that
    form a UTF-8 string and converts them to the bytes that form that string
    on the native platform.

M	t/test.pl

commit e5e96b333d1cfab6d0ed62838612f2849a2e460e
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Apr 1 22:28:43 2013 -0600

    dist/Storable/t/utf8.t: Fix to run under EBCDIC

M	dist/Storable/t/utf8.t

commit 756afbfd0700f3c9547236ce9837a3b8976d9f2d
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Apr 1 22:28:08 2013 -0600

    t/uni/variables.t: Fix to run under EBCDIC

M	t/uni/variables.t

commit 3669cee44dbc045980413f7a7f2f00a818bea52b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Apr 1 21:08:20 2013 -0600

    t/op/split.t: EBCDIC fixes

M	t/op/split.t

commit deeb721978cdf7720d04931abb648b9e841e8ffe
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Apr 1 20:43:03 2013 -0600

    re/pat_advanced.t: EBCDIC fixes
    
    This includes no longer skipping some EBCDIC tests that formerly were,
    since we now have testing infrastructure that makes this easy.

M	t/re/pat_advanced.t

commit 59450a879007f7a7300480293e65d05410cb82c0
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Apr 1 20:01:04 2013 -0600

    t/io/utf8.t: EBCDIC fixes

M	t/io/utf8.t

commit 2061779dddcbc467b02c3e9b6e65d8abf2baa3ed
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 30 21:13:38 2013 -0600

    Unicode::UCD.pm: Nits

M	lib/Unicode/UCD.pm

commit 39f2ee9c1188559421255e4d77363d626889b4c4
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 30 12:32:09 2013 -0600

    t/uni/fold.t: Generalize for non-ASCII platforms

M	t/uni/fold.t

commit 9d7c2bc40ae4dc79093ffc27fe752b590eb2d2c2
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Mar 29 15:22:28 2013 -0600

    XXX t/op/tiehandle.t: skip for now; deep recursion

M	t/op/tiehandle.t

commit 24c684efcbe68d82a3ee38c40d5e6523bc0557e4
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Mar 29 14:56:16 2013 -0600

    XXX better commit msg utf8.c: Avoid unnecessary UTF-8 conversions
    
    This changes the code so that converting to UTF-8 is avoided unless
    necessary.  For such inputs, the conversion back from UTF-8 is also
    avoided.  The cost of doing this is that the first swatches are combined
    into one that contains the values for all characters 0-255, instead of
    having multiple swatches.  That means when first calculating the swatch
    it calculates all 256, instead of 128 (160 on EBCDIC).
    
    This also fixes an EBCDIC bug in which characters in this range were
    being translated twice.

M	utf8.c

commit f1b9465fca2317758f65380fa5541ba7298276be
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Mar 29 13:34:59 2013 -0600

    utf8.c: No need to check for UTF-8 malformations
    
    This function assumes that the input is well-formed UTF-8, even though
    until this commit, the prefatory comments didn't say so.  The API does
    not pass the buffer length, so there is no way it could check for
    reading off the end of the buffer.  One code path already calls
    valid_utf8_to_uvchr(); this changes the remaining code path to correspond.

M	utf8.c

commit 5e1efb7b0527fda29b8e5afdeb07e7dad6f2ca04
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Mar 28 19:56:39 2013 -0600

    utf8.c: Remove redundant assignment.
    
    This variable is always set just below.

M	utf8.c

commit fed3816f422e1daa6e8b0db7eb4d1b566196bbd0
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Mar 28 17:19:16 2013 -0600

    XXX enable _invlist_dump;

M	embed.fnc
M	embed.h
M	proto.h

commit e64aad44709d90379f522ff10d4649d2e4d00e55
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Mar 8 11:01:32 2013 -0700

    XXX EBCDIC header files

M	charclass_invlists.h
M	l1_char_class_tab.h
M	regcharclass.h
M	unicode_constants.h

commit 019beee02470827bc1f004ddf6b516402af48260
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Mar 15 12:26:15 2013 -0600

    hints/os390.sh: Suppress bogus compiler message

M	hints/os390.sh

commit 11a30f20c4da10f0feeced233bf2684e00c75ece
Author: John Goodyear <johngood@us.ibm.com>
Date:   Sat Mar 2 12:31:25 2013 -0700

    XXX Temporary for z/OS long long support

M	Configure
M	hints/os390.sh

commit 119a0715d8c01f5cfec0682311f858c18e7597a7
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 27 18:17:28 2013 -0600

    Add test that to/from native character set works
    
    For non-ASCII systems, there are character set translation tables.  This
    makes sure the two accessible ones are inverses of each other.  If not,
    nothing can be expected to work right.
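
    The new test itself is written in Perl; the check it performs can be
    sketched in C as follows.  The tables here are toy data (a rotation
    by 64), not a real EBCDIC mapping, and all names are hypothetical.

```c
/* Toy 256-entry translation tables; NOT real EBCDIC data. */
unsigned char to_native[256];
unsigned char to_unicode[256];

/* Fill the toy tables with a rotation by 64 and its inverse. */
void build_toy_tables(void)
{
    int i;

    for (i = 0; i < 256; i++) {
        unsigned char n = (unsigned char) ((i + 64) & 0xFF);

        to_native[i] = n;
        to_unicode[n] = (unsigned char) i;
    }
}

/* Return 1 if each table undoes the other for all 256 byte values. */
int tables_are_inverses(const unsigned char *a, const unsigned char *b)
{
    int i;

    for (i = 0; i < 256; i++)
        if (b[a[i]] != i || a[b[i]] != i)
            return 0;
    return 1;
}

/* Build the tables, optionally corrupt one entry, then run the check. */
int check_toy_tables(int corrupt)
{
    build_toy_tables();
    if (corrupt)
        to_unicode[5] = 0;
    return tables_are_inverses(to_native, to_unicode);
}
```

    A single wrong entry is enough to make the round trip fail, which is
    why such a check belongs early in the test suite.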

M	MANIFEST
A	t/base/translate.t

commit 5f0c5bfbb8d6f4e845ecf89f1d0e317a0b372b18
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 27 16:55:55 2013 -0600

    lib/feature/bundle: Fix some things to pass under EBCDIC

M	t/lib/feature/bundle

commit b9ad138530a0c76aefcad57a322f3a942f02d0a6
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 27 16:08:04 2013 -0600

    XS-APItest/t/fetch_pad_names.t: Skip if EBCDIC
    
    This could be ported, but there's a lot of stuff to convert; would need
    a function to convert byte strings that form legal UTF-8 into legal
    UTF-EBCDIC

M	ext/XS-APItest/t/fetch_pad_names.t

commit 5599ad9706a3902a83b5b2af0bdf8d6994773312
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 27 12:05:53 2013 -0600

    XXX ext/XS-APItest/t/utf8.t: Fix so passes EBCDIC
    
    This involves skipping many of the tests.  Reexamine later

M	ext/XS-APItest/t/utf8.t

commit 962c60330d68e4f268d3b4dc9537f8b0ec907fed
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 27 11:27:06 2013 -0600

    ext/re/t/re_funcs_u.t: Fix to work under EBCDIC

M	ext/re/t/re_funcs_u.t

commit e9c6dee6411d20d25aa338093f8750bf52c830f5
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 27 11:11:22 2013 -0600

    XXX dist/IO/t/io_utf8argv.t: Temporarily skip if EBCDIC

M	dist/IO/t/io_utf8argv.t

commit 54a2162dc70b00a569fb7bd11390d9ec23b718fe
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 27 10:33:44 2013 -0600

    t/op/print.t: Skip an EBCDIC test
    
    This could be written (the values would probably change depending on the
    code page), but the code that would get exercised is unlikely to vary
    depending on character set.

M	t/op/print.t

commit 436abc614332a5d92e050d6eb7e1226ae1b36d18
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 26 15:44:59 2013 -0600

    XXX t/TEST: Avoid SIGPIPEs

M	t/TEST

commit 4e3f5bb71d96b774eaa30c5ede2669b25cc5ceef
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 26 15:49:08 2013 -0600

    XXX Temporarily test normalization

M	cpan/Unicode-Normalize/t/fcdc.t
M	cpan/Unicode-Normalize/t/form.t
M	cpan/Unicode-Normalize/t/func.t
M	cpan/Unicode-Normalize/t/illegal.t
M	cpan/Unicode-Normalize/t/norm.t
M	cpan/Unicode-Normalize/t/null.t
M	cpan/Unicode-Normalize/t/partial1.t
M	cpan/Unicode-Normalize/t/partial2.t
M	cpan/Unicode-Normalize/t/proto.t
M	cpan/Unicode-Normalize/t/split.t
M	cpan/Unicode-Normalize/t/test.t
M	cpan/Unicode-Normalize/t/tie.t

commit 261067894f4bdeae223fb15ea21b455babc9a57d
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 26 14:06:50 2013 -0600

    op/index.t: Fix tests for EBCDIC
    
    Commit 8a38a836 erroneously translates literals into the native
    encoding, causing a double translation, which is garbage.

M	t/op/index.t

commit 8ad90e8e1c90f0215c0e88dcb982ad379250f950
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 25 20:43:38 2013 -0600

    op/chop.t: Fix for EBCDIC
    
    One test is skipped because the code point is not representable on
    EBCDIC platforms.  Another test is modified to work on EBCDIC.

M	t/op/chop.t

commit 40ed7d874aff44b2b5220dd287dc2caa045f4696
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 25 19:56:50 2013 -0600

    t/op/lc.t: Fix to work under EBCDIC
    
    This had code that attempted this, but it was wrong.  The conversion to
    EBCDIC must be done before the \U, or similar.

M	t/op/lc.t

commit c6d1c65247594d26b41d2f3893a008d48b1a76fe
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 25 15:33:55 2013 -0600

    Skip some tests under EBCDIC
    
    EBCDIC won't work on these because of inherent differences from ASCII

M	t/porting/customized.t
M	t/porting/manifest.t

commit 1cf20253ac5e2a1ddea67cd48cfae2762e19219a
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 25 15:04:14 2013 -0600

    porting/bincompat.t: Skip under EBCDIC
    
    because the sorting order is different

M	t/porting/bincompat.t

commit 5e33bbfec0ddd0037d9496d9bba20b79ac39fd36
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 25 14:59:50 2013 -0600

    t/re/regex_sets.t: So will pass under EBCDIC

M	t/re/regex_sets.t

commit b308d6eab702974802c62eebc70b063a2319e6a9
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 25 14:59:26 2013 -0600

    t/porting/bincompat.t: Typo in comment

M	t/porting/bincompat.t

commit db01a8b91d76b5ccd32502a0cf47804297d78775
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 25 13:09:09 2013 -0600

    XXX fix \x{too large}

M	dist/IO/IO.xs
M	doop.c
M	inline.h
M	pp.c
M	pp_pack.c
M	regcomp.c
M	sv.c
M	toke.c
M	utf8.c
M	utf8.h

commit df2fe75ccdafbf1642fd72c6a6cba2069a862be1
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Mar 24 17:59:59 2013 -0600

    mktables: Fix typos in comments
    
    One of these fixes is for where a real CTRL-X was specified, instead of
    $^X

M	lib/unicore/mktables

commit c744de11174926be5c58fc7acdd72e81f73d61dc
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Mar 24 13:16:08 2013 -0600

    utf8.c: Fix so UTF-16 to UTF-8 conversion works under EBCDIC

M	utf8.c

commit 1ff9a68f3ce0591da90be6b6ddd4179894be5599
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Mar 24 13:14:34 2013 -0600

    utf8.h, utfebcdic.h: Add #define

M	utf8.h
M	utfebcdic.h

commit bcafb326a821e97201b0a7bc058b91a82e3390b9
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Mar 24 13:11:25 2013 -0600

    utf8.c: Use mnemonics instead of hex numbers

M	utf8.c

commit 23733f9feb1625182b4f7a96ccaa38a9a1ec04cd
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 20 22:15:58 2013 -0600

    lib/Unicode/UCD.t: Allow to run under EBCDIC

M	lib/Unicode/UCD.t

commit d3f160ec3d8703769674a7fd35c21d9795900b40
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 19 15:27:31 2013 -0600

    t/op/quotemeta.t: EBCDIC fixes

M	t/op/quotemeta.t

commit fb91c20b51a78c33b3d469a6dab33c3c805debf6
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 19 11:32:55 2013 -0600

    t/re/fold_grind.t: Fixes for EBCDIC

M	t/re/fold_grind.t

commit 4358a3a102ab4ad200abed892a1298415a71f51a
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 19 11:21:09 2013 -0600

    t/lib/charnames/alias: Fix some EBCDIC problems

M	t/lib/charnames/alias

commit f1f753bf96ad4ebd4fd8c4cc671002c6e1bb5fc3
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 19 11:20:24 2013 -0600

    t/uni/class.t: Make work on EBCDIC

M	t/uni/class.t

commit ddcf29077cffdaa1c8dae8b69bd931d4455017dd
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 19 11:01:57 2013 -0600

    feature/unicode_strings.t: Fix to work on EBCDIC

M	lib/feature/unicode_strings.t

commit 5d946d7851e9906d20a53e494c7aa25904ddd268
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 19 10:10:46 2013 -0600

    XXX rebase regen/regcharclass.pl: make more EBCDIC friendly
    
    XXX regen/regcharclass.pl: maybe temp comment out utf8_char
    One of the possible inputs to this process is a string.  This clarifies
    that it must be specified in Unicode characters, and adds code to
    translate it to native, if necessary.

M	regen/regcharclass.pl
M	regen/regcharclass_multi_char_folds.pl

commit 0e62f9e6d907fca9ffc9349e705848075e4fc0d6
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 19 10:09:53 2013 -0600

    XXX temporarily skip some folding tests

M	regen/regcharclass.pl
M	t/re/fold_grind.t
M	t/re/reg_fold.t

commit b6d2168f900353073833e2092d3c6f9e34bbd829
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 18 22:00:29 2013 -0600

    XXX temp skip perl5db.t

M	lib/perl5db.t

commit b0ee64fee4d97b5733463ce1b2568b6cf5dc0995
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 18 11:45:06 2013 -0600

    pp.c: White-space only
    
    Make a ternary operation more clear

M	pp.c

commit 82ffeb8eb0556d6dcc0dca4491a3ce1e8435278e
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 18 11:43:42 2013 -0600

    Fix valid_utf8_to_uvchr() for EBCDIC

M	utf8.c

commit 27ab3e38d590ae2ca0c1c9e78039e72746ec7463
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Mar 17 21:42:20 2013 -0600

    t/test.pl: Add comment about EBCDIC

M	t/test.pl

commit 4f15892a9e91315f863b1c20188174267d3b8044
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Mar 17 17:39:33 2013 -0600

    XXX makedepend.SH: Why does 255 work and 250 not?

M	makedepend.SH

commit ba56cfd7ff276c1a82d918d60c3eadfaffeacc46
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 16 22:48:22 2013 -0600

    XXX regen/mk_PL_charclass.pl: Make EBCDIC friendly
    
    need more of a commit message

M	regen/mk_PL_charclass.pl

commit f2ef559a2c7516570de894265b294f97a7060cff
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 16 22:44:44 2013 -0600

    XXX make various things more EBCDIC friendly
    
    Adds trailing white space errors
    Need to know what to do about ^A meaning 0x1, and M-foo meaning meta

M	lib/DB.pm
M	lib/dumpvar.pl
M	lib/perl5db.pl
M	lib/sigtrap.pm

commit 222fd450cbb588df3645dd756ca7427321dd8406
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 16 22:41:15 2013 -0600

    XXX: Fixup commit message.
    
    Fix UTF8_ACCUMULATE, utf8.c

M	utf8.c
M	utf8.h

commit 0db7087d577d25c662cd89eed175c7d6c2de7b42
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 16 16:52:45 2013 -0600

    regcomp.c: Fix bug in EBCDIC
    
    The POSIXA and NPOSIXA regnodes need to set the bits on only the ASCII
    code points, but under EBCDIC those code points are not 0-127.

M	regcomp.c

commit 6f9a7a015291e2855aaa3f71a610d8a52fd48188
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Mar 15 11:57:24 2013 -0600

    re/charset.t: Allow to work on EBCDIC
    
    This just converts the hard-coded character numbers to native, so will
    work on any platform.

M	t/re/charset.t

commit 37d1f528847997b09f1564b84f12b6c6166f56dc
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Mar 15 11:50:35 2013 -0600

    XS-APItest/t/handy.t: Change output message
    
    On EBCDIC platforms, the output is not in terms of \N{U+}; change text
    to \x{ }

M	ext/XS-APItest/t/handy.t

commit ef85adcdcd6b58903a6de8dccf1b7f3f417d1173
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 13 21:44:16 2013 -0600

    XXX Dumper.xs: Don't know why this stopped compiling

M	dist/Data-Dumper/Dumper.xs

commit 9339920e391408b24ee039f0365d72cd8b17f4d7
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 13 16:20:23 2013 -0600

    toke.c: Simplify some code
    
    We don't have to test separately for lower vs uppercase here, as
    upper/lower case A-Z and a-z are not intermixed in the gaps in A-Z and
    a-z under EBCDIC.
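
    As a hedged illustration of the layout this relies on (not the toke.c
    code itself), the sketch below encodes the three runs of upper- and
    lowercase letters as they sit in common EBCDIC code pages.  The function
    names are hypothetical, and the byte values are hard-coded rather than
    taken from any Perl table.

```c
#include <stdbool.h>

/* Illustrative only: letter positions in common EBCDIC code pages. */
static bool ebcdic_is_upper_az(unsigned char c)
{
    return (c >= 0xC1 && c <= 0xC9)     /* A-I */
        || (c >= 0xD1 && c <= 0xD9)     /* J-R */
        || (c >= 0xE2 && c <= 0xE9);    /* S-Z */
}

static bool ebcdic_is_lower_az(unsigned char c)
{
    return (c >= 0x81 && c <= 0x89)     /* a-i */
        || (c >= 0x91 && c <= 0x99)     /* j-r */
        || (c >= 0xA2 && c <= 0xA9);    /* s-z */
}

/* Returns true if any byte in first..last (inclusive) satisfies pred. */
static bool span_contains(unsigned char first, unsigned char last,
                          bool (*pred)(unsigned char))
{
    for (unsigned int c = first; c <= last; c++) {
        if (pred((unsigned char)c)) {
            return true;
        }
    }
    return false;
}
```

    Calling span_contains(0xC1, 0xE9, ebcdic_is_lower_az) checks that no
    lowercase letter falls anywhere between 'A' and 'Z', gaps included, which
    is the property that lets a single combined test suffice.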

M	toke.c

commit 0d053611833136e45c69aa262e2a58b139099578
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 13 16:18:12 2013 -0600

    genpacksizetables.pl: Correct comment typo

M	genpacksizetables.pl

commit c8e470c43aaa8c4e3db8bd278912b857d5d0d15c
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 13 16:17:39 2013 -0600

    APItest/t/handy.t: Make EBCDIC-friendly

M	ext/XS-APItest/t/handy.t

commit 70f3ef564fea4ae891b50083134131d04f8c1231
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 13 16:16:14 2013 -0600

    Data-Dumper: Make EBCDIC-friendly

M	dist/Data-Dumper/Dumper.xs

commit 481b9f8fc41292acf791da1d85fcc9eac3142556
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 13 16:14:31 2013 -0600

    sv.c: Make less ASCII-centric

M	sv.c

commit 3abedf4a6b4d618c2cb80b54b260140b5289e430
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 13 16:07:52 2013 -0600

    charnames.t: Generalize for non-ASCII platforms

M	lib/charnames.t

commit eea453362c4060ac7955b15e437785cafc88815e
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 13 16:05:46 2013 -0600

    dump.c: Make less ASCII-centric:
    
    This has the added advantage of being clearer as to what is going on.

M	dump.c

commit fdbe387ef0afd4914d08664ac41e6d8c5362bdf7
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 13 16:02:52 2013 -0600

    hv.c: Stop being ASCII-centric
    
    This uses macros which work cross-platform.  This has the added
    advantage that it is much clearer what is going on.

M	hv.c

commit 57fbf4c60e98ba17fb2a7f6e231797434b97ad15
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Mar 12 22:34:17 2013 -0600

    t/TEST: Don't bail if fails in t/base unless minitest
    
    In order to completely compile Perl, many modules must have been parsed
    and compiled, so if there is a full perl, we know that things basically
    work.  The purpose of bailing out is that if these supposedly very base
    level functionality tests don't work, there's no point in continuing.
    But over the years, tests of more esoteric functionality have been
    added here, and if one of them doesn't work, it still could be that Perl
    pretty much does work.
    
    I believe it would be best to move such non-basic tests elsewhere, but
    that's work, and hasn't bitten us much so far; this change lessens the
    severity of the biting even more.  Where it will really bite is if
    things are so bad that a full perl binary can't be compiled, and we are
    trying to figure out why using minitest.

M	t/TEST

commit 7b38e7564441570de7d2f76126da190800f4e591
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 11 15:11:10 2013 -0600

    Added Porting/reorder_charclass_invlists.pl
    
    This program is used to bootstrap perl onto a non-ASCII platform with
    no pre-existing perl.

M	MANIFEST
A	Porting/reorder_charclass_invlists.pl

commit a1d64e6244113e63632fea9aa961ecc270ae2901
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Mar 10 22:17:31 2013 -0600

    t/base/lex.t: Use char suitable for both ASCII and EBCDIC
    
    \xE2 is 'S' in EBCDIC, and so is going to be legal.  \xDF is an alpha
    which has no ASCII equivalent in either character set

M	t/base/lex.t

commit 46eb04e6055bf8c3d0269580f01ba592599171c0
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Mar 10 13:11:07 2013 -0600

    XXX Temporarily comment out ParseXS check
    
    this is to get things to compile for now

M	dist/ExtUtils-ParseXS/lib/ExtUtils/ParseXS.pm

commit 52fe59e7f579e6c8c8c1f75364552289c3d0372d
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Mar 10 11:34:10 2013 -0600

    XXX Collate, Normalize: Allow to compile under EBCDIC

M	cpan/Unicode-Collate/Collate.pm
M	cpan/Unicode-Collate/mkheader
M	cpan/Unicode-Normalize/Normalize.pm
M	cpan/Unicode-Normalize/mkheader

commit 6179f495ff10bd45debd0d683c7ef7ae2b58cea4
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 9 21:57:38 2013 -0700

    XXX dquote_static.c: Silence wrong warning on EBCDIC
    
    Unsure of whether to add the 2nd !isCNTRL_L1 to silence return trip,
    which should be a separate commit anyway.
    
    This silences an inappropriate warning that doesn't happen on ASCII
    platforms.  CTRL-T maps to 0x14 on both ASCII and EBCDIC platforms.  But
    0x14 is a C1 control on EBCDIC, a C0 on ASCII.  Therefore the test that
    it's a control should include both C0 and C1, which isCNTRL_L1() does.
    
    Also has a white-space change, outdenting a line so it doesn't wrap in
    an 80 column window.

M	dquote_static.c

commit b2aa076a112b143b69c9f4b1460fee7bb5579ee6
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Mar 7 12:08:41 2013 -0700

    utfebcdic.h: Change 'unsigned char' to U8
    
    This is for consistency with the rest of Perl

M	utfebcdic.h

commit 9a6eb35c78871f9a5c1032222ed8101b4bb8076f
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Mar 8 08:11:38 2013 -0700

    regen/regcharclass.pl: Make more EBCDIC-friendly
    
    This commit changes the code generated by the macros so that they work
    right out-of-the-box on non-ASCII platforms for non-UTF-8 inputs.  THEY
    ARE WRONG for UTF-8, but this is good enough to get perl bootstrapped
    onto the target platform, and regcharclass.pl can be run there,
    generating macros correct for UTF-8.

M	regcharclass.h
M	regen/regcharclass.pl

commit b4c11692cc40c4b9fab12b46d170de09f57fb4c1
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 6 21:30:01 2013 -0700

    utfebcdic.h: Add (UV) cast
    
    The operand of this macro is implicitly a UV.  Make sure that it is.

M	utfebcdic.h

commit 3a1e680113176c50e0ce1dde59d9e3d6f4fed584
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 6 17:04:58 2013 -0700

    handy.h: Allow bootstrapping to non-ASCII platform
    
    This adds a bunch of macros and moves things around to support
    conditional compilation when Configure is called with
    -DBOOTSTRAP_CHARSET.  Doing so causes the usual macros that are
    table-driven to not be used, since the table may not be valid when
    bringing Perl up for the first time on a non-ASCII platform.
    
    This allows it to compile using the platform's native C library ctype
    functions, which should work enough to compile miniperl, and allow the
    table to be changed to be valid.  Then Configure can be re-run to not
    bootstrap, and normal compilation can proceed.

M	handy.h
M	inline.h

commit b80e7f6aaeb7956c9f7048bbc4b711d977d46dd7
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 4 13:00:47 2013 -0700

    toke.c: Remove EBCDIC dependency

M	toke.c

commit 4e743d7b770b668c3de5afa2e7dcff5f81d72a1e
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 4 09:14:25 2013 -0700

    toke.c: Remove character set dependency
    
    Instead of hard-coding the bit patterns that comprise the Byte Order
    Mark in the UTF-8 or UTF-EBCDIC encodings, use the generated ones for
    the current platform.
    
    This removes some EBCDIC-only code.
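
    A minimal sketch of the idea, assuming a per-platform generated constant
    is available: the check compares against that constant instead of
    spelling out byte values at the point of use.  SKETCH_BOM_BYTES and
    starts_with_bom() are hypothetical names, and the value shown is the
    UTF-8 encoding of U+FEFF; a UTF-EBCDIC build would generate different
    bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a generated, platform-specific constant.
 * These are the UTF-8 bytes of U+FEFF (the Byte Order Mark). */
#define SKETCH_BOM_BYTES "\xEF\xBB\xBF"

/* Returns true if the buffer begins with the platform's encoded BOM. */
static bool starts_with_bom(const unsigned char *s, size_t len)
{
    const size_t bom_len = sizeof(SKETCH_BOM_BYTES) - 1;   /* drop the NUL */
    return len >= bom_len && memcmp(s, SKETCH_BOM_BYTES, bom_len) == 0;
}
```

    Because the caller never names the individual bytes, the same source
    compiles unchanged whichever encoding the constant was generated for.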

M	toke.c

commit 838879b15d00d7526dc770fb4f8f3caf1d4233b5
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Mar 4 09:10:27 2013 -0700

    unicode_constants.h: Add #defines for Byte Order Mark
    
    These will be used in future commits

M	regen/unicode_constants.pl
M	unicode_constants.h

commit 422651554888435b3dd26603547005758ab81503
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 2 15:04:18 2013 -0700

    XXX: Find a cleaner way. Handle missing is_UTF8_CHAR_utf8_safe
    
    This macro may not be present, and is currently used exclusively in
    IS_UTF8_CHAR, which itself may be undefined, and code should cope with
    that.  This is a work-around until a better solution is found.

M	utf8.c
M	utf8.h

commit 11c10d20d69304ffc4fe1924626c7acf1cf04cc9
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 2 14:09:04 2013 -0700

    Add Porting tool for help with non-ASCII platforms
    
    Porting/reorder_l1_char_class_tab.pl is used to bootstrap Perl onto a
    non-ASCII platform with no working Perl.

M	MANIFEST
A	Porting/reorder_l1_char_class_tab.pl
M	regen/mk_PL_charclass.pl

commit 4ea480159561c6964e56a2c9655959211dab0d24
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 2 13:06:58 2013 -0700

    inline.h: Reorder functions
    
    The comment implied that the functions below it in the file were
    deprecated, but in fact only the next two functions were.  This
    clarifies that and moves them so they are the final ones in the file

M	inline.h

commit d75ca1d943fba58d53ff6b65c78b3fb108f6656b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 2 12:33:42 2013 -0700

    utfebcdic.h: Add comment

M	utfebcdic.h

commit 753553b58a147fa717ab92c924ff086ab5aea6e2
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 2 12:12:11 2013 -0700

    utf8.h: Clean up START_MARK definition and use
    
    The previous definition broke good encapsulation rules.  UTF_START_MARK
    should return something that fits in a byte; it shouldn't be the caller
    that does this.  So the mask is moved into the definition.  This means
    it can apply only to the portion that creates something larger than a
    byte.  Further, the EBCDIC version can be simplified, since 7 is the
    largest possible number of bytes in an EBCDIC UTF8 character.
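
    The encapsulation point can be sketched as below for standard UTF-8.
    SKETCH_START_MARK is a hypothetical name, not the macro in utf8.h, and
    the all-ones result for lengths above 7 is an assumption made for
    illustration.  The 0xFF mask sits inside the macro, on the shifted
    portion only, so every result already fits in a byte.

```c
/* Hypothetical sketch: start-byte marker for a sequence of 'len' bytes.
 * The mask is applied here, not by the caller, and only to the shifted
 * value, which is the part that can exceed a byte. */
#define SKETCH_START_MARK(len) \
    (((len) > 7) ? 0xFF : (0xFF & (0xFE << (7 - (len)))))
```

    With this definition a 2-byte sequence gets 0xC0, a 3-byte sequence
    0xE0, and a 4-byte sequence 0xF0, with no masking needed at the call
    site.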

M	utf8.h
M	utfebcdic.h

commit 3bf13bf4c83ca46e026663f7896c275184502515
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Mar 2 12:05:26 2013 -0700

    utf8.h: Move #includes
    
    These two files were only being #included for non-ebcdic compiles; they
    should be included always.

M	utf8.h

commit b2a404f3215758291aca7b0a736a9ac70acc8bb3
Author: John Goodyear <johngood@us.ibm.com>
Date:   Sat Mar 2 11:49:14 2013 -0700

    utfebcdic.h: Remove extra parameter expansions
    
    These two macros were improperly expanding the parameters as well as
    defining the operation, leading to compile errors.

M	utfebcdic.h

commit 53f465f0919f80fbf3b03b914a0e4687500a0950
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Mar 1 08:28:52 2013 -0700

    utf8.h: Simplify UTF8_EIGHT_BIT_foo on EBCDIC
    
    These macros were previously defined in terms of UTF8_TWO_BYTE_HI and
    UTF8_TWO_BYTE_LO.  But the EIGHT_BIT versions can use the less general
    and simpler NATIVE_TO_LATIN1 instead of NATIVE_TO_UNI because the input
    domain is restricted in the EIGHT_BIT.  Note that on ASCII platforms,
    these both expand to the same thing, so the difference matters only on
    EBCDIC.

M	utf8.h

commit f45545334d2b75b597b13d207d36eef5ac69e3e1
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 28 09:25:27 2013 -0700

    XXX temp:  show makedepend cerr

M	makedepend.SH

commit e5c798734146470ca89f57802875947bfe1f8179
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 27 21:59:11 2013 -0700

    makedepend.SH: Split too long lines; properly join
    
    I had thought that a continuation introduced a space.  But no,
    a continuation can happen in the middle of a token.
    
    And this splits lines that are getting very long to avoid preprocessor
    limitations.
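
    The point about continuations can be seen in a small standalone C
    fragment (the names are made up for the example).  A backslash-newline
    is removed before tokenizing, so the two physical lines of the #define
    join with no space between them, even though the break falls inside an
    identifier.

```c
/* The backslash-newline below falls in the middle of the identifier
 * SPLIT_VALUE; the lines are spliced before the preprocessor tokenizes,
 * so this defines SPLIT_VALUE as 42.  The second line must start in
 * column 1, or the leading blanks would end the identifier early. */
#define SPLIT_VAL\
UE 42

static int joined_value(void)
{
    return SPLIT_VALUE;
}
```

    A script that joins continued lines therefore has to concatenate them
    exactly, without inserting a blank at the join.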

M	makedepend.SH

commit 73349c9551037874d97aff1321311b76fd5d86fe
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 27 15:51:28 2013 -0700

    makedepend.SH: White-space only
    
    Align continuation backslashes

M	makedepend.SH

commit 6d296269a9504f87ff859618ec1d54cd6b40388a
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 27 14:39:28 2013 -0700

    makedepend.SH: Remove some unnecessary white space
    
    Multi-line preprocessor directives are now joined into single lines.
    This can create lines too long for the preprocessor to handle.  This
    commit removes blanks adjoining comments that get deleted.  This makes
    things somewhat less likely to exceed the limit.
    
    This commit also fixes several [] which were meant to each match a tab
    or a blank, but editors converted the tabs to blanks.

M	makedepend.SH

commit 5a37d0f3ada71c13ccec9a5472eb67b89d4588a8
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 27 14:30:51 2013 -0700

    makedepend.SH: Retain '/**/' comments
    
    These comments may actually be necessary.

M	makedepend.SH

commit 11e86360babb332401e188a3158ec23db31a366b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 27 08:38:19 2013 -0700

    handy.h: Remove extraneous parens

M	handy.h

commit 47cb1301507f6ea4368f06be8a914a3b49346dd3
Author: Andy Dougherty <doughera@lafayette.edu>
Date:   Wed Feb 27 13:06:07 2013 -0500

    Disable gcc-style function attributes on z/OS.
    
    John Goodyear <johngood@us.ibm.com> reports that the z/OS C compiler
    supports the attribute keyword, but not exactly the same as gcc.
    Instead of a "warning", the compiler emits an "INFORMATIONAL" message
    that Configure fails to detect.  Until Configure is fixed, just disable
    the attributes altogether.
    
    John Goodyear

M	hints/os390.sh

commit 9bbf0d837f8663a55d6286f85309adc7c9e8633c
Author: Andy Dougherty <doughera@lafayette.edu>
Date:   Wed Feb 27 09:12:13 2013 -0500

    Change os390 custom cppstdin script to use fgrep.
    
    Grep appears to be limited to 2048 characters, and truncates
    the output for cppstdin.  Fgrep apparently doesn't have that limit.
    Thanks to John Goodyear <johngood@us.ibm.com> for reporting this.

M	hints/os390.sh

commit 459510bc9bde35c6abf5192433f335bdb3b1d1a7
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Feb 26 13:45:19 2013 -0700

    utf8.c: Use more clearly named macro
    
    In the case of invariants these two macros should do the same thing,
    but it seems to me that the latter name more clearly indicates what is
    going on.

M	utf8.c

commit b1c7d4a6f599f64609766296acb680646179b7ba
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Feb 26 13:35:12 2013 -0700

    Add macro OFFUNISKIP
    
    This means use official Unicode code point numbering, not native.  Doing
    this converts the existing UNISKIP calls in the code to refer to native
    code points, which is what they meant anyway.  The terminology is
    somewhat ambiguous, but I don't think it will cause real confusion.
    NATIVE_SKIP is also introduced for situations where it is important to
    be precise.

M	toke.c
M	utf8.c
M	utf8.h
M	utfebcdic.h

commit 6e2e8834ce68a5055531af3d67ce9ca66c9893dc
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Feb 26 13:22:19 2013 -0700

    toke.c: white space only

M	toke.c

commit f08df806248f24b2f7bd042c33b93f1f751fd3c0
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 17 14:00:13 2013 -0700

    toke.c: Don't remap \N{} for EBCDIC
    
    Everything is now in native.

M	toke.c

commit f9ea71e52139371605224f1f29db1570b0e71d61
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Feb 26 12:08:50 2013 -0700

    utf8.c: Deprecate two functions
    
    This is to force any code that has been using these functions to change.
    Since the Unicode tables are now stored in native order, these functions
    should only rarely be needed.
    
    However, the functionality of these is needed, and in actuality, on
    ASCII platforms, the native functions are #defined to these.  So what
    this commit does is rename the functions to something else, and create
    wrappers with the old names, so that anyone using them will get the
    deprecation.

M	embed.fnc
M	embed.h
M	mathoms.c
M	proto.h
M	utf8.c
M	utf8.h

commit fd3538552d64ce1f82d9ed1c518a8a22cfcae412
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Feb 26 11:26:09 2013 -0700

    Deprecate uvuni_to_utf8()
    
    Code should almost never be dealing with non-native code points

M	embed.fnc
M	embed.h
M	proto.h
M	utf8.c
M	utf8.h

commit 3937884de10fcef9ad0cad2e65b95b10c8d539e3
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Feb 26 11:02:33 2013 -0700

    Deprecate utf8_to_uni_buf()
    
    Now that the tables are stored in native order, there is almost no need
    for code to be dealing in Unicode order.

M	embed.fnc
M	proto.h
M	utf8.c

commit 37cb54589d58515758170a337f38c5946064de06
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Feb 26 09:00:18 2013 -0700

    makedepend.SH: Comment out unnecessary code
    
    This causes problems currently for z/OS.  But, since we don't know why
    it was there, I'm leaving it in as a placeholder.

M	makedepend.SH

commit 165c4169c8e84d321fcbe8519e70b7b4922c6d9c
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 20:26:44 2013 -0700

    Deprecate valid_utf8_to_uvuni()
    
    Now that all the tables are stored in native format, there is very
    little reason to use this function; and those who do need this kind of
    functionality should be using the bottom level routine, so as to make it
    clear they are doing nonstandard stuff.

M	embed.fnc
M	proto.h
M	utf8.c

commit f00eeb1ed7f48cf3aad18d17bee44d43cb8ed657
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 20:14:26 2013 -0700

    utf8.c: Swap which fcn wraps the other
    
    This is in preparation for the current wrapee becoming deprecated

M	embed.fnc
M	embed.h
M	proto.h
M	utf8.c
M	utf8.h

commit 5493ecd8a71030306af709a57985b64a10e38b32
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 19:29:34 2013 -0700

    utf8.c: Skip a no-op
    
    Since the value is invariant under both UTF-8 and not, we already have
    it in 'uv'; no need to do anything else to get it

M	utf8.c

commit b1ae2e3e8c60988b792fe705f86214929c40fe9f
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 19:26:50 2013 -0700

    utf8.c: Move comment to where makes more sense

M	utf8.c

commit 6644c1f80c59ca38f615945a927a7e357d5d8d6a
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 17:30:10 2013 -0700

    APItest: Test native code points, instead of Unicode

M	ext/XS-APItest/APItest.pm
M	ext/XS-APItest/APItest.xs
M	ext/XS-APItest/t/utf8.t

commit c00f03b4a5a49c0ea5fc59bee3b0eb878c242672
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 17:12:53 2013 -0700

    XXX CPAN Encode.xs
    
    Use core function if available.  This will insulate this code from any
    future changes.

M	cpan/Encode/Encode.xs

commit 2b13180bc5418f673052abb6846015c2057d09d1
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 17:04:24 2013 -0700

    XXX CPAN and unsure Encode

M	cpan/Encode/Encode.xs
M	cpan/Encode/Unicode/Unicode.xs

commit 31bc4b76738416ca92e00b243f4bf6223471ac78
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 25 17:00:47 2013 -0700

    XXX CPAN Encode.xs: fix indent

M	cpan/Encode/Encode.xs

commit 700cf1d3741691faca852ca1052102b4b90122a4
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 24 17:23:15 2013 -0700

    Don't refer to U+XXXX when mean native
    
    These messages say the output number is Unicode, but it is really
    native, so change to saying it is 0xXXXX.

M	regen/regcharclass_multi_char_folds.pl
M	regexec.c

commit e064dda9987b53d13aa9aa3d62fc27fe283b8683
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 24 16:43:59 2013 -0700

    Convert some uvuni() to uvchr()
    
    All the tables are now based on the native character set, so using
    uvuni() in almost all cases is wrong.

M	cygwin/cygwin.c
M	doop.c
M	op.c
M	pp_pack.c
M	regcomp.c
M	regexec.c
M	toke.c
M	utf8.c

commit fb99b48bd985aaaabc588da6507e4727e8778a01
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 24 16:25:47 2013 -0700

    handy.h: White space only

M	handy.h

commit cc107ecf8d68e66271d34308df231be78011f13d
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 24 16:19:49 2013 -0700

    t/test.pl: Allow native/latin1 string conversions to work on utf8.
    
    These functions no longer have the hard-coded definitions in them,
    but now end up resolving to internal functions, so that new encodings
    could be added and these would automatically understand them.
    
    Instead of using tr///, these now go character by character,
    converting to/from ord, which is slower, but allows them to operate on
    utf8 strings.
    
    Peephole optimization should make these essentially no-ops on ascii
    platforms.

M	t/test.pl

commit bd039685ed98d2bd793680d9c80c1e0002d8a4eb
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 24 16:05:55 2013 -0700

    t/test.pl: Simplify ord to/from native fcns
    
    This commit changes these functions from converting to/from a string to
    calling utf8:: functions which operate on ordinals instead.

M	t/test.pl

commit f320ccd9d77559d193b782d8db3692b5d830735b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 24 15:35:38 2013 -0700

    Make casing tables native
    
    These are the final tables that hadn't yet been converted to native
    character set casing.

M	perl.h
M	utfebcdic.h

commit 558f7ad4bd1bad806eaae899c54f0e25fa8de703
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 24 15:32:30 2013 -0700

    utfebcdic.h: Remove trailing spaces

M	utfebcdic.h

commit 508eef76a19429b8bf83fdc4184fce991a6bb70b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Fri Feb 22 18:55:26 2013 -0700

    EBCDIC has the unicode bug too
    
    We have not had a working modern Perl on EBCDIC for some years.  When I
    started out, comments and code led me to conclude erroneously that
    natively it supported semantics for all 256 characters 0-255.  It turns
    out that I was wrong; it natively (at least on some platforms) has the
    same rules (essentially none) for the characters which don't correspond
    to ASCII ones, as the rules for these on ASCII platforms.
    
    A previous commit for 5.18 changed the docs about this issue.  This
    current commit forces ASCII rules on EBCDIC platforms (even should there
    be one that natively uses all 256).  To get all 256, the same things
    like 'use feature "unicode_strings"' must now be done.

M	handy.h

commit 87e00a68a95d83f2eeb3e96d097274f82914a168
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 21 13:47:52 2013 -0700

    handy.h: Solve a failure to compile problem under EBCDIC
    
    handy.h is included in files that don't include perl.h, and hence not
    utf8.h.  We can't rely therefore on the ASCII/EBCDIC conversion
    macros being available to us.  The best way to cope is to use the native
    ctype functions.  Most, but not all, of the macros in this commit
    currently resolve to use those native ones, but a future commit will
    change that.

M	handy.h

commit 97948cde7a9fadc67193f663f73cf3038833142c
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 21 13:35:12 2013 -0700

    handy.h: Simplify some macro definitions
    
    Now, only one of the macros relies on magic numbers (isPRINT), leading
    to clearer definitions.

M	handy.h

commit 221aaddfd42b0e604b42959b1bb562aa01b1d928
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 21 13:26:49 2013 -0700

    handy.h: Combine macros that are same in ASCII, EBCDIC
    
    These 4 macros can have the same RHS for their ASCII and EBCDIC
    versions, so no need to duplicate their definitions
    
    This also enables the EBCDIC versions to not have undefined expansions
    when compiling without perl.h

M	handy.h

commit 47f8763f40ab564a050e8e28603ec89ef4157fde
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 20 10:39:48 2013 -0700

    Deprecate NATIVE_TO_NEED and ASCII_TO_NEED
    
    These macros are no longer called in the Perl core.  This commit turns
    them into functions so that they can use gcc's deprecation facility.
    
    I believe these were defective right from the beginning, and I have
    struggled to understand what's going on.  From the name, it appears
    NATIVE_TO_NEED takes a native byte and turns it into UTF-8 if the
    appropriate parameter indicates that.  But that is impossible to do
    correctly from that API, as for variant characters, it needs to return
    two bytes.  It could only work correctly if ch is an I8 byte, which
    isn't native, and hence the name would be wrong.
    
    Similar arguments for ASCII_TO_NEED.
    
    The function S_append_utf8_from_native_byte(const U8 byte, U8** dest)
    does what I think NATIVE_TO_NEED intended.
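
    To make the API problem concrete, here is a rough sketch of that shape
    of function for an ASCII/Latin-1 platform only.  It is not the core
    routine; the name append_utf8_from_latin1_byte() and its return value
    are assumptions for illustration.  A variant byte (0x80 and above)
    expands to two output bytes, which an interface returning a single byte
    cannot express.

```c
#include <stddef.h>

/* Sketch for an ASCII/Latin-1 platform: append the UTF-8 encoding of one
 * native byte at *dest and advance *dest.  Returns the number of bytes
 * written: 1 for an invariant, 2 for a variant. */
static size_t append_utf8_from_latin1_byte(unsigned char byte,
                                           unsigned char **dest)
{
    if (byte < 0x80) {                  /* invariant: same byte in UTF-8 */
        *(*dest)++ = byte;
        return 1;
    }
    *(*dest)++ = (unsigned char)(0xC0 | (byte >> 6));      /* start byte */
    *(*dest)++ = (unsigned char)(0x80 | (byte & 0x3F));    /* continuation */
    return 2;
}
```

    Passing a pointer to the output position lets the function write one or
    two bytes as needed, which is why this shape works where a
    byte-in/byte-out macro cannot.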

M	embed.fnc
M	mathoms.c
M	proto.h
M	toke.c
M	utf8.h
M	utfebcdic.h

commit 303ab06940594e3900839ecb7a65c81cc458b1f2
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 20 10:26:43 2013 -0700

    Remove remaining calls of NATIVE_TO_NEED
    
    These calls are just copying the input to the output byte by byte.
    There is no need to worry about UTF-8 or not, as the output is just an
    exact copy of the input

M	toke.c

commit 18ad69985d835ec25e4fe56fd1001c6000ecb9e8
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 20 08:12:15 2013 -0700

    toke.c: Remove some NATIVE_TO_NEED calls
    
    I believe NATIVE_TO_NEED is defective, and will remove it in a future
    commit.  But, just in case I'm wrong, I'm doing it in small steps so
    bisects will show the culprit.  This removes the calls to it where the
    parameter is clearly invariant under UTF-8 and UTF-EBCDIC, and so the
    result can't be other than just the parameter.

M	toke.c

commit 53783c493d0cdfd209b83af002e608bb6479cec8
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 20 08:22:07 2013 -0700

    toke.c: in [A-Za-z] use macros that exclude non-ASCII alphas
    
    This code is attempting to deal with the problem of holes in the ranges
    a-z and A-Z in EBCDIC.  Prior to this patch, it accepted things like A
    WITH GRAVE, etc, which shouldn't have the special processing to deal
    with the holes

M	toke.c

commit c591456017bf17d02c8e795a2de5025277dde1ee
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Feb 19 15:13:19 2013 -0700

    Use real illegal UTF-8 byte
    
    The code here was wrong in assuming that \xFF is not legal in UTF-8
    encoded strings.  It currently doesn't work due to a bug, but that may
    eventually be fixed: [perl #116867].  The comments are also wrong that
    all bytes are legal in UTF-EBCDIC.
    
    It turns out that in well-formed UTF-8, the bytes C0 and C1 never appear
    (C2, C3, and C4 as well in UTF-EBCDIC), as they would be the start byte
    of an illegal overlong sequence.
    
    This creates a #define for an illegal byte using one of the real illegal
    ones, and changes the code to use that.
    
    No test is included due to #116867.
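
    The overlong argument for C0 and C1 can be checked with a few lines of
    C for standard UTF-8 (the UTF-EBCDIC case is not covered here).  The
    helper decode_two_byte() is a hypothetical name and does no validation;
    it only shows the arithmetic.

```c
/* Decode a two-byte standard UTF-8 sequence with no validity checks:
 * five payload bits from the start byte, six from the continuation. */
static unsigned int decode_two_byte(unsigned char start, unsigned char cont)
{
    return ((unsigned int)(start & 0x1F) << 6) | (unsigned int)(cont & 0x3F);
}
```

    The largest value reachable with a C1 start byte is
    decode_two_byte(0xC1, 0xBF), which is 0x7F, a code point that already
    has a one-byte encoding.  The first start byte that yields something
    needing two bytes is C2, where decode_two_byte(0xC2, 0x80) is 0x80.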

M	op.c
M	toke.c
M	utf8.h

commit 0b0a7324a2ed247395052be4451d7966c9a873b8
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 17 13:50:45 2013 -0700

    toke.c: Remove remapping for EBCDIC for octal
    
    The code prior to this commit converted something like \04 into its
    EBCDIC equivalent only in double-quoted strings.  This was not done in
    patterns, and so gave inconsistent results.  The correct thing to do
    should be to do the native thing, what someone who works on a platform
    would think \04 does.  Platform independent characters are available
    through \N{}, either by name or by U+.
    
    The comment changed by this was wrong, as in some cases it was native,
    and in some cases Unicode.

M	toke.c

commit 8ca217e7fe9d418de701d393d92915c353cd9cff
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 17 13:47:13 2013 -0700

    Remove EBCDIC remappings
    
    Now that the tables are stored in native format, we shouldn't be doing
    remapping.
    
    Note that this assumes that the Latin1 casing tables are stored in
    native order; not all of this has been done yet.

M	handy.h
M	perly.c
M	pp.c
M	regcomp.c
M	regexec.c
M	utf8.c

commit a7c763b3e26791fef499deeb72be224446d646a9
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 17 12:46:05 2013 -0700

    Add and use macro to return EBCDIC
    
    The conversion from UTF-8 to code point should generally be to the
    native code point.  This adds a macro to do that, and converts the
    core calls to the existing macro to use the new one instead.  The old
    macro is retained for possible backwards compatibility, though it
    probably should be deprecated.

M	handy.h
M	pp.c
M	regcomp.c
M	regexec.c
M	toke.c
M	utf8.c
M	utf8.h

commit 83804835bde2c7217025a7b2821e4b22dce1332e
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sun Feb 17 09:18:06 2013 -0700

    charnames: fix nit in comment

M	lib/_charnames.pm

commit 891d8b28094b465b3b5de7296ecc733ca8e2cb7d
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Feb 16 11:05:44 2013 -0700

    charnames: Make work in EBCDIC
    
    Now that mktables generates native tables, we need to make U+ mean
    Unicode instead of native.

M	lib/_charnames.pm
M	lib/charnames.pm

commit a6e2e1b818da962b8045794d76066358e3a92d14
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Feb 16 09:35:56 2013 -0700

    Unicode::UCD: Work on non-ASCII platforms
    
    Now that mktables generates native tables, it is a fairly simple matter
    to get Unicode::UCD to work on those platforms.

M	lib/Unicode/UCD.pm

commit f598e6e5ca7f7b8511ebf86ee1ad4e920ee5458b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Mar 27 17:01:24 2013 -0600

    Unicode::UCD: Typo in comment

M	lib/Unicode/UCD.pm

commit d416db013e4d6a9aa1f344d5debcf3b015cffd07
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 14 22:16:38 2013 -0700

    mktables: Generate native code-point tables
    
    The output tables for mktables are now in the platform's native
    character set.  This means there is no change for ASCII platforms, but
    is a change for EBCDIC ones.
    
    Since we currently don't have any EBCDIC test platforms, I tested this
    by faking it out to generate EBCDIC data, and then eye-balled the
    results.
    
    Code that didn't realize there was a potential difference between EBCDIC
    and non-EBCDIC platforms will now start to work; code that tried to do
    the right thing under these circumstances will no longer work.  Fixing
    that comes in later commits.

M	lib/unicore/mktables

commit e332a9315da70d49d85e95d4dc386d7cb70cd74b
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 2 21:36:28 2013 -0600

    mktables: Move table creation code
    
    This code is moved later in the process.  This is in preparation for
    mktables generating tables in the native character set.  By moving it to
    later, the translation to native has already been done, and special
    coding need not be done.
    
    This also caught 7 code points that were somehow omitted by the previous
    logic.

M	lib/unicore/mktables

commit fcf234326588538bdcbb86bbbe2eb1609ffbaefc
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 14 10:50:00 2013 -0700

    Fix some EBCDIC problems
    
    These spots have native code points, so should be using the macros for
    native code points, instead of Unicode ones.

M	regcomp.c
M	sv.c
M	toke.c

commit 0b3fb2e83ff47e3b8c83661518974099c69df7ba
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 13 22:10:19 2013 -0700

    Remove unnecessary temp variable in converting to UTF-8
    
    These areas of code included a temporary that is unnecessary.

M	inline.h
M	regcomp.c
M	sv.c

commit 93aca4b19abed6c277f9e83c764e155e0b3d8eaa
Author: Karl Williamson <public@khwilliamson.com>
Date:   Wed Feb 13 22:00:55 2013 -0700

    utf8.h: Correct macros for EBCDIC
    
    These macros were incorrect for EBCDIC.  The 3 step process given in
    utfebcdic.h wasn't being followed.

M	utf8.h

commit 43feeeaaa546966bab913c926848d4b83bcf9043
Author: Karl Williamson <public@khwilliamson.com>
Date:   Sat Feb 9 21:23:30 2013 -0700

    Extract common code to an inline function
    
    This fairly short paradigm is repeated in several places; a later commit
    will improve it.

M	embed.fnc
M	embed.h
M	inline.h
M	pp_pack.c
M	proto.h
M	sv.c
M	toke.c
M	utf8.c

commit c885fb95b554def1cd86209ecf1842e1f6995565
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 7 21:35:57 2013 -0700

    Don't use EBCDIC macro for a C language escape
    
    C recognizes '\a' (for BEL); just use that instead of a look-up.
    
    regen/unicode_constants.pl could be used to generate the character for
    the ESC (set in surrounding code), but I didn't do that because of
    potential bootstrapping problems when porting to an EBCDIC platform
    without a working perl.  (The other characters generated in that .pl are
    less likely to cause problems when compiling perl.)

M	regcomp.c
M	toke.c
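
To illustrate the point about '\a' in the commit above, here is a minimal sketch. The function name native_bel is hypothetical and is not taken from the Perl source. The compiler translates the escape into the execution character set, so no table look-up is needed: on an ASCII platform the value is 7, and on an EBCDIC platform it would be the native BEL (0x2F). ISO C defines no equivalent escape for ESC, which is why ESC still needs separate handling.

```c
/* Hypothetical helper, for illustration only. */

/* Returns the BEL character in the platform's native character set.
 * The compiler resolves '\a' at compile time, so the same source is
 * correct on both ASCII and EBCDIC builds. */
static int native_bel(void)
{
    return '\a';
}
```

On an ASCII build, native_bel() returns 7; the source never has to spell out that number.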

commit c071bf17185da45ae38f4fd061f9d9c83ef0a6f4
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 7 19:53:38 2013 -0700

    Use byte domain EBCDIC/LATIN1 macro where appropriate
    
    The macros like NATIVE_TO_UNI will work on EBCDIC, but operate on the
    whole Unicode range.  In the locations affected by this commit, it is
    known that the domain is limited to a single byte, so the simpler ones
    whose names contain LATIN1 may be used.
    
    On ASCII platforms, all the macros are null, so there is no effective
    change.

M	handy.h
M	regcomp.c
M	utf8.c

commit d8ba14798825c15295f863aafe8751c42e356cb2
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 7 14:31:09 2013 -0700

    Use new clearer named #defines
    
    This converts several areas of code to use the more clearly named macros
    introduced in a recent commit.

M	op.c
M	toke.c
M	utf8.c
M	utf8.h
M	utfebcdic.h

commit a0f98dc2f4bd314ccaeac36dfbb47cfa1383421a
Author: Karl Williamson <public@khwilliamson.com>
Date:   Thu Feb 7 13:52:31 2013 -0700

    utf8.h, utfebcdic.h: Create less confusing #defines
    
    This commit creates macros whose names mean something to me and that I
    don't find confusing.  The older names are retained for backwards
    compatibility.  Future commits will fix bugs I introduced from
    misunderstanding the meaning of the older names.
    
    The older names are now #defined in terms of the newer ones, and moved
    so that they are only defined once, valid for both ASCII and EBCDIC
    platforms.

M	utf8.h
M	utfebcdic.h

commit c6b660e92c6c3112981b8b7ecf9ab554c8f6cde2
Author: Karl Williamson <public@khwilliamson.com>
Date:   Mon Feb 4 14:22:02 2013 -0700

    pp_ctl.c: Use isCNTRL instead of hard-coded mask
    
    This is clearer and portable to EBCDIC.

M	pp_ctl.c
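
As a rough sketch of why a classification macro ports better than a bit mask: the mask below is a hypothetical example, not the expression that was removed from pp_ctl.c, and the standard C iscntrl() is used only as a stand-in for Perl's isCNTRL (unlike isCNTRL, it depends on the current locale). Both function names are made up for this illustration.

```c
#include <ctype.h>

/* Hypothetical ASCII-only test: true for codes 0x00..0x1F only.
 * It misses DEL (0x7F) on ASCII, and on EBCDIC code pages the control
 * characters are not confined to this range (ESC is 0x27, for example). */
static int is_cntrl_by_mask(unsigned char c)
{
    return (c & 0xE0) == 0;
}

/* Portable test: asks the character classification routine instead of
 * assuming where the controls sit in the code page. */
static int is_cntrl_portable(unsigned char c)
{
    return iscntrl(c) != 0;
}
```

In the default "C" locale on an ASCII platform, the two functions agree for 0x1B (ESC) and disagree for 0x7F (DEL), which only the classification routine treats as a control.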

commit 355b90ed4f79405fe4bffb202667db3911ac03b5
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Feb 26 13:51:05 2013 -0700

    utf8.c: is_utf8_char_slow() should use native length
    
    What is passed is the actual length of the native utf8 character.  What
    this was calculating was the length it would have if it were a Unicode
    character, and then comparing the two: apples to oranges.

M	utf8.c
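
The mismatch described in the is_utf8_char_slow() commit above can be sketched as follows. This is an illustration under stated assumptions, not the code in utf8.c: the function names are hypothetical, and the UTF-EBCDIC boundaries are taken from Unicode Technical Report #16, where each continuation byte carries 5 bits of payload rather than the 6 bits of UTF-8.

```c
#include <assert.h>

/* Hypothetical helpers, for illustration only. */

/* Number of bytes needed to encode Unicode code point cp in UTF-8. */
static int utf8_length(unsigned long cp)
{
    assert(cp <= 0x10FFFFUL);
    if (cp < 0x80UL)    return 1;
    if (cp < 0x800UL)   return 2;
    if (cp < 0x10000UL) return 3;
    return 4;
}

/* Number of bytes needed to encode Unicode code point cp in UTF-EBCDIC.
 * The single-byte range extends to U+009F, and every later boundary
 * grows by 5 bits instead of 6. */
static int utfebcdic_length(unsigned long cp)
{
    assert(cp <= 0x10FFFFUL);
    if (cp < 0xA0UL)    return 1;
    if (cp < 0x400UL)   return 2;
    if (cp < 0x4000UL)  return 3;
    if (cp < 0x40000UL) return 4;
    return 5;
}
```

For U+0400 the two encodings need 2 and 3 bytes respectively, so comparing a length computed by one rule against a sequence encoded by the other fails even for well-formed input.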

commit 7b95e84edb7afc7a9529a12b07a4cdaa611c3319
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 30 08:42:08 2013 -0600

    autodoc.pl: Don't list undocumented deprecated fcns in API
    
    autodoc creates a list of all the undocumented functions that are part
    of the API.  It omits ones that are experimental and whose API may
    change; and now it omits ones that are deprecated (and whose API is
    planned to change to be non-existent).

M	autodoc.pl

commit 73acb38ffb609e4996d74f3b2a0148f2f363c045
Author: Karl Williamson <public@khwilliamson.com>
Date:   Tue Apr 30 08:39:44 2013 -0600

    autodoc.pl: Add note for deprecated functions
    
    This causes each deprecated function to have a prominent note to that
    effect in its API documentation.

M	autodoc.pl
M	mg.c
M	utf8.c
-----------------------------------------------------------------------

--
Perl5 Master Repository

