List: perl5-changes
Subject: [perl.git] branch khw/ebcdic, created. v5.17.11-243-gb407d77
From: "Karl Williamson" <public () khwilliamson ! com>
Date: 2013-04-30 18:19:55
Message-ID: E1UXF9f-0002By-B0 () camel ! ams6 ! corp ! booking ! com
In perl.git, the branch khw/ebcdic has been created
<http://perl5.git.perl.org/perl.git/commitdiff/b407d7766c85bc6bf2e30c2cd2ef9a306402bbda?hp=0000000000000000000000000000000000000000>
at b407d7766c85bc6bf2e30c2cd2ef9a306402bbda (commit)
- Log -----------------------------------------------------------------
commit b407d7766c85bc6bf2e30c2cd2ef9a306402bbda
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 23 18:58:54 2013 -0600
XXX experimental pp_pack.c: 'u'
M pp_pack.c
commit 02820fd88b0561ecbd47970f3708f617d397707b
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 17:25:08 2013 -0700
XXX CPAN Normalize
This converts Unicode::Normalize to use the native tables that are used
by Perl starting in XXX, while using the Unicode-ordered ones that were
used before then.
Another alternative would be to have mktables generate just these tables
in Unicode ordering.
M cpan/Unicode-Normalize/Normalize.xs
commit 5a197cd4a6f6ad23bcf3c1aa8201b30fe24c9f4f
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 17:22:55 2013 -0700
XXX CPAN prob wrong Collate
This changes the module to implicitly use native code points. This is
likely wrong, as the module comes with its own data, which are probably
in terms of Unicode.
M cpan/Unicode-Collate/Collate.xs
commit 50e8ac62d0449940ab8cb4f02fe79722627d1204
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 27 22:14:02 2013 -0600
utf8.c: Remove wrapper functions.
Now that the Unicode data is stored in native character set order, it is
rare to need to work with the Unicode order. Traditionally, the real
work was done in functions that worked with the Unicode order, and
wrapper functions (or macros) were used to translate to/from native.
There are two groups of functions: one that translates from code point
to UTF-8, and the other group goes the opposite direction.
This commit changes the base function that translates from UTF-8 to code
point to output native instead of Unicode. Those extremely rare
instances where Unicode output is needed instead will have to hand-wrap
calls to this function with a translation macro, as now described in the
API pod. Prior to this, it was the other way, the native was wrapped,
and the rare, strict Unicode wasn't. This eliminates a layer of
function call overhead for a common case.
The base function that translates from code point to UTF-8 retains its
Unicode input, as that is more natural to process. However, it is
de-emphasized in the pod, with the functionality description moved to
the pod for a native input wrapper function. And, those wrappers are
now macros in all cases; previously there was function call overhead
sometimes. (Equivalent exported functions are retained, however, for XS
code that uses the Perl_foo() form.)
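A minimal sketch of the layering described above, not the actual utf8.c code. All names are hypothetical, the "decoder" handles only a single byte, and the toy translation merely swaps two code points. The point is only that the base function now yields the native code point, and the rare caller that needs Unicode wraps it by hand:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical native-to-Unicode translation for a toy platform that
 * swaps code points 0x41 and 0x42 and leaves everything else alone. */
static unsigned long toy_native_to_uni(unsigned long cp)
{
    if (cp == 0x41) return 0x42;
    if (cp == 0x42) return 0x41;
    return cp;
}
#define TOY_NATIVE_TO_UNI(cp) toy_native_to_uni(cp)

/* Base decoder: returns the NATIVE code point directly, so the common
 * case pays for no extra wrapper call.  (Single-byte input only.) */
unsigned long toy_decode_native(const unsigned char *s)
{
    assert(s != NULL);
    return (unsigned long) *s;
}

/* The rare caller that needs Unicode output wraps the base call itself. */
unsigned long toy_decode_unicode(const unsigned char *s)
{
    return TOY_NATIVE_TO_UNI(toy_decode_native(s));
}
```

Before the change the roles were reversed: the Unicode-returning function was the base and the native one was the wrapper.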
I had hoped to rebase this commit, squashing it with an earlier commit
in this series, eliminating the use of a temporary function name change,
but the work involved turns out to be large, with no real payoff.
M embed.fnc
M embed.h
M mathoms.c
M proto.h
M utf8.c
M utf8.h
commit 0b861599c9110cd3c8a4bc9011cd870262d7531c
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 30 09:13:35 2013 -0600
perlapi vis utf8.c: Nits
M utf8.c
commit 21ac04cab11fdc7e4111a540dfac51ce49aa98bd
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 30 08:04:45 2013 -0600
utf8.c: Move 2 functions to earlier in file
This moves these two functions to be adjacent to the function they each
call, thus keeping like things together.
M utf8.c
commit b998a6105de76a79d466ece6d5779a46faf42db1
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 27 08:59:19 2013 -0600
embed.fnc: Slight clarification in comments
M embed.fnc
commit 673abbdc3af3fe6675936e723a9a11ad32e6fc1d
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Apr 22 14:44:08 2013 -0600
mg.c: White-space only
I found this multi-line 'if' easier to understand after re-formatting
M mg.c
commit b989379c1ae25341231b1cec97d50e8fd0618945
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Apr 22 14:34:47 2013 -0600
toke.c: Remove redundant test
This checks that something is both not-printable and not a word
character, but all word characters are printable, so just the
non-printable test suffices.
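The equivalence can be checked exhaustively for single bytes. The sketch below is an illustration only: it uses the C library's isprint() and isalnum() in the default "C" locale as stand-ins for Perl's own classification macros, and the function names are made up:

```c
#include <assert.h>
#include <ctype.h>

/* Word character in the "C" locale: letter, digit, or underscore. */
static int is_word_char(int c)
{
    return isalnum(c) || c == '_';
}

/* The test as originally written, and the simplified one. */
int original_test(int c)   { return !isprint(c) && !is_word_char(c); }
int simplified_test(int c) { return !isprint(c); }

/* Number of byte values 0..255 on which the two tests disagree. */
int count_disagreements(void)
{
    int c, n = 0;
    for (c = 0; c <= 255; c++) {
        if (original_test(c) != simplified_test(c))
            n++;
    }
    return n;
}

/* Aborts if any byte value distinguishes the two tests. */
void check_equivalence(void)
{
    assert(count_disagreements() == 0);
}
```

Because every word character is printable, the second clause can never change the result.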
M toke.c
commit ede1131e5af2cb3ea627d4ff1fc0af13dd279e34
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 20 17:04:08 2013 -0600
gv.c: Add comment
M gv.c
commit fa39d8516839a395557c32e30dd192047491897e
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 19 17:02:25 2013 -0600
XXX rebase, finish up: reenable fold_grind.t
M t/re/fold_grind.t
commit 398e205cc26021df34185831aadc71d37d36f509
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 19 13:58:12 2013 -0600
t/op/coreamp.t: Generalize for non-ASCII platforms
M t/op/coreamp.t
commit b4036f082cf0a6c4c245e16b1c72e246b2b3ccea
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 19 13:19:44 2013 -0600
XXX temporary lib/warnings.pm: Add debugging info
M lib/warnings.pm
commit f0d98c60179636f156568e36e7dbf2e17a5fa54b
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 19 13:18:20 2013 -0600
regcomp.c: Add missing (parens) to expression
A pair of parentheses was missing, leading to this 'if' not acting as
intended.
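The log does not say which expression was affected. As a purely hypothetical illustration of this class of bug, here is a common C precedence trap, with made-up names and values: `==` binds tighter than `&`, so an unparenthesized mask test compares the wrong things.

```c
#include <assert.h>

#define TOY_MASK  0x0Cu
#define TOY_VALUE 0x04u

/* How `flags & TOY_MASK == TOY_VALUE` actually parses.  The implied
 * parentheses are written out here so the compiler does not warn.
 * TOY_MASK == TOY_VALUE is 0, so this test never succeeds. */
int buggy_test(unsigned flags)
{
    return (flags & (TOY_MASK == TOY_VALUE)) != 0;
}

/* What was intended: mask first, then compare. */
int fixed_test(unsigned flags)
{
    return (flags & TOY_MASK) == TOY_VALUE;
}

/* Aborts unless the two versions differ on an input that should match. */
void check_precedence_demo(void)
{
    assert(buggy_test(0x04u) == 0);
    assert(fixed_test(0x04u) == 1);
}
```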
M regcomp.c
commit cd346eb7f5b365e7ef54a806a8cd7497ea1b746b
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Apr 17 21:49:10 2013 -0600
t/re/re_tests: Some tests are platform-specific
M t/re/re_tests
commit 5104b329bd4d7259a4a11d600a033b4e7722c0c0
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Apr 17 21:47:41 2013 -0600
t/re/regexp.t: Add ability to skip depending on platform
This adds the capability to specify that a test is to be done only on an
ASCII platform, or only on an EBCDIC one.
M t/re/regexp.t
commit a57313a9aa9ff8a9b4b523563e95423ba9da92cf
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Apr 17 08:22:36 2013 -0600
t/io/crlf.t: Generalize for non-ASCII platforms
M t/io/crlf.t
commit c6c2a05c0bbfcf6de7c20654ea2408aa5e700712
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 16 20:15:08 2013 -0600
unicode_constants.h: regened for ebcdic
M unicode_constants.h
commit 0557369ba11ba0e625a613e1790212371c8bcc46
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 16 15:49:06 2013 -0600
XXX finish up t/re/regexp.t: Generalize for non-ASCII platforms
This adds code to the processing of the tests in t/re/re_tests to
automatically convert from Unicode to the native character set.
Add comment about circular tests
XXX better commit message
M t/re/regexp.t
commit f11e45e9d5d10f4bc33c1756f39f5249c4c90f91
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 16 12:13:07 2013 -0600
ext/B/t/b.t: Generalize for non-ASCII platforms
M ext/B/t/b.t
commit d829285f59a911f3575680271c9a069fc1af1e44
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 16 12:02:26 2013 -0600
dist/Safe/t/safeutf8.t: Generalize to non-ASCII platform
M dist/Safe/t/safeutf8.t
commit 673d4136b1da614dd35f8bcc61744389399b644a
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 16 11:50:04 2013 -0600
t/op/warn.t: Generalize for non-ASCII platforms
M t/op/warn.t
commit f3dff4bbbd0d971ddf7872b0c2f3abc1e316dcc5
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 16 10:18:02 2013 -0600
re/reg_email.t: Generalize for non-ASCII platforms
This replaces all the hard-coded hex character values. It uses the new
(?[ ]) notation. I checked that the compiled regex matches the exact
same code points as before these changes.
M t/re/reg_email.t
commit 8a0391e583ba9146ba3635c6682f12d8e986715a
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 16 09:04:50 2013 -0600
t/porting/regen.t: Add file to check
M t/porting/regen.t
commit 3f772bf8974203e1c761b7b7119cbadfc98cff24
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 16 09:03:47 2013 -0600
dist/ExtUtils-Install/t/InstallWithMM.t: Skip if EBCDIC
Because it uses JSON
M dist/ExtUtils-Install/t/InstallWithMM.t
commit 353f2f9337cff31ba9a13b5efe9e198aa973800a
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Apr 14 21:31:04 2013 -0600
XXX: t/lib/warnings/utf8: Experiment with malformed utf8
M t/lib/warnings/utf8
commit e1d6b933f7d26cf087fea5fd9eabd760695d57be
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 22:04:50 2013 -0600
XXX skip cpan tests
M t/TEST
commit d87b511d45950997cb8428765c81519ccd2944a1
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 16:19:20 2013 -0600
ext/XS-APItest/t/svpeek.t: Generalize for non-ASCII platforms
M ext/XS-APItest/t/svpeek.t
commit 89e1406e5640b8d7d1984c79fb5a72cfb24ee377
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 16:14:35 2013 -0600
ext/XS-APItest/t/svpv_magic.t: Generalize for non-ASCII platforms
M ext/XS-APItest/t/svpv_magic.t
commit ea4b078f4e69979796a74a769446732cb33ecbf8
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 15:54:37 2013 -0600
lib/DBM_Filter/t/encode.t: Generalize for non-ASCII platforms
M lib/DBM_Filter/t/encode.t
commit 9708080ab55f4e9ea45b0bf5eba3c997b2aa3ed7
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 15:48:06 2013 -0600
XXX finish up lib/dumpvar.pl: Generalize for EBCDIC
Has octal constants
M lib/dumpvar.pl
commit 9c1c5b86114bf90d7485fabc4b7927f110e2c378
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 15:35:52 2013 -0600
XXX finish up lib/utf8.t: Generalize for non-ASCII platforms
This includes choosing a different code point that has 3 bytes in both
UTF-8 and UTF-EBCDIC, so that the pos numbers work for both.
M lib/utf8.t
commit 8aae8cd02cbf8a4ef258270a1147e0c682918d07
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 15:16:44 2013 -0600
t/uni/parser.t: Generalize for non-ASCII platforms
M t/uni/parser.t
commit 268a6dfc5343ed087c87539798a1511d4363e429
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 14:41:46 2013 -0600
t/uni/method.t: Generalize for non-ASCII platforms
I couldn't figure out a way to not use the hard-coded values
M t/uni/method.t
commit 5975c25b801ffcd66fecb6bb5aac620ad95dd2c6
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 14:26:09 2013 -0600
t/op/magic.t: Generalize for non-ASCII platforms
M t/op/magic.t
commit 62d44001515993d7cdca388783935c686befec61
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 13:36:41 2013 -0600
t/io/through.t: Generalize for non-ASCII platforms
This uses hard-coded values for EBCDIC because of the shell issues
M t/io/through.t
commit e105e4bc303b4fbc8ab4dcd77f7445c682e79f33
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 13:16:00 2013 -0600
toke.c: Fix EBCDIC bugs with single char variable names
Latin1 single-character variable names should all be legal, but the
test was not for non-ASCII, it was for variant characters. On EBCDIC
platforms, this isn't the same as non-ASCII.
The legal control character variable names are not the same as the C0
and DEL controls, but are \001 through \037 minus those that
traditionally match \s on ASCII platforms, plus \c?.
M toke.c
commit a3f37bf4a50fd3db66ee7370973b5b6782182a58
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 12:55:09 2013 -0600
toke.c: An EBCDIC fix
toCTRL(0..31) yields a printing character. This is different from
toCTRL(control) on EBCDIC machines.
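For reference, a sketch of the ASCII-platform behavior only, assuming the conventional definition of the mapping as "uppercase, then XOR with 64". The function name is made up, and the EBCDIC mapping is not shown:

```c
#include <assert.h>
#include <ctype.h>

/* ASCII-only control mapping: fold to uppercase, then flip bit 6.
 * The operation is its own inverse, so applying it to a control
 * character (0..31) gives back a printing character. */
int ascii_to_ctrl(int c)
{
    assert(c >= 0 && c <= 127);
    return toupper(c) ^ 64;
}
```

Here ascii_to_ctrl('A') is 1 and ascii_to_ctrl(1) is 'A'. That symmetry is what the message says does not carry over to EBCDIC.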
M toke.c
commit 778836bfb6c64b6a48ead242d78ad3caec50f446
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 12:52:17 2013 -0600
XXX \c must be followed by printable
This should be revised and included in 5.18 or 5.19, depending on the RFC outcome.
M dquote_static.c
commit c538d599928e657fe08a8c9e263caabea06fc63d
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 11:41:04 2013 -0600
XXX temp toCTRL
M dquote_static.c
M ext/B/B.pm
M handy.h
M pod/perlebcdic.pod
M t/op/chars.t
commit 22edbc1630e77d62fb420a0f6cc95d009b4aa4a4
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 09:18:41 2013 -0600
perlio.c: Generalize for EBCDIC
This code had the hex constants for CARRIAGE RETURN and LINE FEED
hard-coded in. It appears to me from the comments that '\r' and '\n'
are not suitable to use instead. This commit changes the constants to
use the native values instead.
M perlio.c
commit 72b96dc1bdfc38ed93befdc8940bc28bfc6bef97
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 13 09:51:34 2013 -0600
unicode_constants.h: Add #defines for CR, LF
M regen/unicode_constants.pl
M unicode_constants.h
commit d044e82436ca83e2651e320368577aa68de65ec3
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Apr 7 10:45:14 2013 -0600
t/op/goto.t: Generalize for EBCDIC
M t/op/goto.t
commit ac1844613833b8b60b75cc4573fe8516dd285924
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 6 21:03:44 2013 -0600
regcomp.c: White-space only, wrap comment to fit
M regcomp.c
commit 0748f2ff52f149ab4507202c197e50f2dbb6e12b
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Apr 3 20:15:17 2013 -0600
t/re/pat.t: Generalize for EBCDIC
M t/re/pat.t
commit 9e6fd46a036ddf939ea6d4d095577fee07252483
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Apr 3 21:56:02 2013 -0600
XXX t/op/pack.t: Generalize for EBCDIC
One unknown: what to do about uuencode
M t/op/pack.t
commit cf3cd58b9c837cecd70e6a2a8e527e86ef5e32a5
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 6 12:56:52 2013 -0600
regcomp.c: In EBCDIC [i-j] exclude also ASCII
i and j are not adjacent in EBCDIC. This excluded any alphabetic
characters between them, but allowed other ascii ones.
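A rough sketch of the intended range semantics, not the regcomp.c code. It assumes the usual EBCDIC layout in which lowercase letters sit in three separate runs (a-i at 0x81-0x89, j-r at 0x91-0x99, s-z at 0xA2-0xA9), and the function names are hypothetical. When both endpoints are lowercase letters, only lowercase letters should match:

```c
#include <assert.h>

/* EBCDIC lowercase letters: three runs with gaps between them. */
int ebcdic_is_lower(unsigned char b)
{
    return (b >= 0x81 && b <= 0x89)     /* a-i */
        || (b >= 0x91 && b <= 0x99)     /* j-r */
        || (b >= 0xA2 && b <= 0xA9);    /* s-z */
}

/* Membership in [lo-hi] when both endpoints are lowercase letters:
 * bytes that fall in the gaps are excluded. */
int ebcdic_lower_range_matches(unsigned char lo, unsigned char hi,
                               unsigned char b)
{
    assert(ebcdic_is_lower(lo) && ebcdic_is_lower(hi) && lo <= hi);
    return b >= lo && b <= hi && ebcdic_is_lower(b);
}

/* Number of byte values matched by [lo-hi] under that rule. */
int ebcdic_lower_range_size(unsigned char lo, unsigned char hi)
{
    int c, n = 0;
    for (c = 0; c <= 255; c++) {
        if (ebcdic_lower_range_matches(lo, hi, (unsigned char) c))
            n++;
    }
    return n;
}
```

With these values [i-j] spans 0x89 through 0x91 but matches only two bytes, and [a-z] matches exactly 26.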
M regcomp.c
M t/re/pat_advanced.t
commit 42aa56769696320c8aefc47f86091cc805ad9648
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 6 12:54:42 2013 -0600
utf8.c: Don't use slower general-purpose function
There is a macro that accomplishes the same task for a two byte UTF-8
encoded character, and avoids the overhead of the general purpose
function call.
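A sketch of what such a macro can look like for standard UTF-8 on an ASCII platform. The macro name is made up and this is not the definition in utf8.h. UTF-EBCDIC uses a different bit layout and is not covered:

```c
#include <assert.h>

/* Combine the 5 payload bits of the start byte with the 6 payload
 * bits of the continuation byte. */
#define TOY_TWO_BYTE_UTF8_TO_CP(hi, lo) \
    ((((unsigned)(hi) & 0x1Fu) << 6) | ((unsigned)(lo) & 0x3Fu))

/* Decode a two-byte sequence known to be well-formed; the asserts
 * only document that precondition. */
unsigned decode_two_byte(const unsigned char *s)
{
    assert((s[0] & 0xE0) == 0xC0);   /* 110xxxxx start byte */
    assert((s[1] & 0xC0) == 0x80);   /* 10xxxxxx continuation byte */
    return TOY_TWO_BYTE_UTF8_TO_CP(s[0], s[1]);
}
```

For example, the bytes 0xC3 0xA9 decode to 0xE9 with no loop and no length lookup.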
M utf8.c
commit d994ea7004bd7786446f6755c1f5ebd0b887bc7c
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 6 12:53:07 2013 -0600
utf8.c: Don't do ++ in macro parameter
The formal parameter gets evaluated multiple times on an EBCDIC
platform, thus incrementing more than the intended once.
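The hazard is easy to reproduce with any macro that mentions its parameter more than once. In this sketch the macro is hypothetical and stands in for a translation macro that is an identity on ASCII platforms but expands to something larger on EBCDIC:

```c
#include <assert.h>

/* Hypothetical translation macro that uses its argument twice.
 * The ?: operator sequences the two uses, so the behavior is
 * defined, just not what the caller wanted. */
#define TOY_XLATE(b) \
    (((b) < 0x80) ? (unsigned)(b) : ((unsigned)(b) ^ 0xFFu))

/* Buggy: *p++ is evaluated twice, so p advances by two and the
 * value stored comes from the second byte.  Returns bytes consumed. */
int advance_buggy(const unsigned char *s, unsigned *out)
{
    const unsigned char *p = s;
    *out = TOY_XLATE(*p++);
    return (int)(p - s);
}

/* Fixed: increment outside the macro argument. */
int advance_fixed(const unsigned char *s, unsigned *out)
{
    const unsigned char *p = s;
    *out = TOY_XLATE(*p);
    p++;
    return (int)(p - s);
}

/* Aborts unless the fixed version consumes exactly one byte. */
void check_single_increment(const unsigned char *s)
{
    unsigned v;
    assert(advance_fixed(s, &v) == 1);
}
```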
M utf8.c
commit 3c3a66716c7a569bf8b558544dcc0cd345125f68
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 6 12:50:48 2013 -0600
utf8.c: Use macro instead of duplicating code
There is a macro that accomplishes this task, and is easier to read.
M utf8.c
commit a1ab7fcb2a13a19d0acf0d0c1aed8bfb3d2ab7c3
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 6 10:15:05 2013 -0600
t/io/bom.t: Fix to run under EBCDIC
M t/io/bom.t
commit 6c00a3974820781dbb06de02fb05b30866be63aa
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 5 23:34:50 2013 -0600
t/uni/overload.t: EBCDIC fixes
M t/uni/overload.t
commit 5d98a9019a51910542a677c14fdc22d8d935338f
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 5 23:34:13 2013 -0600
t/uni/method.t: EBCDIC fixes
M t/uni/method.t
commit 9ef4d43c8d8b35f9adb816601ad8b66f39337969
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 5 23:33:28 2013 -0600
t/op/utf8magic.t: EBCDIC fixes
M t/op/utf8magic.t
commit 1bb871086122e9d1525eb2f7efca8d517ea6fe16
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 5 23:32:57 2013 -0600
t/op/evalbytes.t: EBCDIC fixes
M t/op/evalbytes.t
commit 15aef1cfb50d3fd8e2486aa09ba463166be6522c
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 5 16:20:20 2013 -0600
lib/utf8.pm: Fix pod verbatim line wrap
M lib/utf8.pm
M t/porting/known_pod_issues.dat
commit f869dfd4716ea58b9d1bec1a574890155ef40494
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 5 13:27:42 2013 -0600
t/op/length.t: EBCDIC fixes
M t/op/length.t
commit e24ea3159f0193b8b6de0e0478e74faf7baa2a7b
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 6 13:01:54 2013 -0600
t/op/utfhash.t: XXX Add debug
M t/op/utfhash.t
commit 6207f508136411d5ddeeec17383744f6289cb6f6
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 5 12:21:21 2013 -0600
Data-Dumper/Dumper.pm: Fix for EBCDIC
M dist/Data-Dumper/Dumper.pm
commit fdfea61ded453db682807db7f4f47b7b173a4277
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Apr 5 12:15:58 2013 -0600
Dumper.xs: Don't translate character twice
utf8_to_uvchr() already returns the native code point; no need to
convert again. This code is only executed on Perls before 5.15
M dist/Data-Dumper/Dumper.xs
commit eb2c9790c32486a7abda26616604a3da6e31f6d6
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Apr 6 20:39:22 2013 -0600
dist/IO/t/io_utf8argv.t: Generalize and enable EBCDIC
Infrastructure now exists to have this test run on EBCDIC platforms.
M dist/IO/t/io_utf8argv.t
commit 3c0a09008c6b14659bd1886ab00a78a60355e62c
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Apr 3 21:59:16 2013 -0600
utf8.h: Clarify comments
M utf8.h
commit 60725362b8884d108f95a7b3e0d3eb0e3ed93a4b
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Apr 3 19:06:52 2013 -0600
XXX CPAN cpan/Test/lib/Test.pm: Fixes for EBCDIC
M cpan/Test/lib/Test.pm
commit c37f5efa9c4daa3f210b9dcbee24361ef31e946e
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Apr 1 22:29:16 2013 -0600
t/re/pat_re_eval.t: Some EBCDIC fixes
M t/re/pat_re_eval.t
commit dba798f061c2cf30edf9799b70f6248edde7e95e
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 2 07:11:19 2013 -0600
t/test.pl: Add fcn for UTF-EBCDIC conversion
This adds the function byte_utf8a_to_utf8n(). This takes the bytes that
form a UTF-8 string and converts them to the bytes that form that string
on the native platform.
M t/test.pl
commit e5e96b333d1cfab6d0ed62838612f2849a2e460e
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Apr 1 22:28:43 2013 -0600
dist/Storable/t/utf8.t: Fix to run under EBCDIC
M dist/Storable/t/utf8.t
commit 756afbfd0700f3c9547236ce9837a3b8976d9f2d
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Apr 1 22:28:08 2013 -0600
t/uni/variables.t: Fix to run under EBCDIC
M t/uni/variables.t
commit 3669cee44dbc045980413f7a7f2f00a818bea52b
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Apr 1 21:08:20 2013 -0600
t/op/split.t: EBCDIC fixes
M t/op/split.t
commit deeb721978cdf7720d04931abb648b9e841e8ffe
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Apr 1 20:43:03 2013 -0600
re/pat_advanced.t: EBCDIC fixes
This includes no longer skipping some EBCDIC tests that formerly were
skipped, since we now have testing infrastructure that makes this easy.
M t/re/pat_advanced.t
commit 59450a879007f7a7300480293e65d05410cb82c0
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Apr 1 20:01:04 2013 -0600
t/io/utf8.t: EBCDIC fixes
M t/io/utf8.t
commit 2061779dddcbc467b02c3e9b6e65d8abf2baa3ed
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 30 21:13:38 2013 -0600
Unicode::UCD.pm: Nits
M lib/Unicode/UCD.pm
commit 39f2ee9c1188559421255e4d77363d626889b4c4
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 30 12:32:09 2013 -0600
t/uni/fold.t: Generalize for non-ASCII platforms
M t/uni/fold.t
commit 9d7c2bc40ae4dc79093ffc27fe752b590eb2d2c2
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Mar 29 15:22:28 2013 -0600
XXX t/op/tiehandle.t: skip for now; deep recursion
M t/op/tiehandle.t
commit 24c684efcbe68d82a3ee38c40d5e6523bc0557e4
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Mar 29 14:56:16 2013 -0600
XXX better commit msg utf8.c: Avoid unnecessary UTF-8 conversions
This changes the code so that converting to UTF-8 is avoided unless
necessary. For such inputs, the conversion back from UTF-8 is also
avoided. The cost of doing this is that the first swatches are combined
into one that contains the values for all characters 0-255, instead of
having multiple swatches. That means when first calculating the swatch
it calculates all 256, instead of 128 (160 on EBCDIC).
This also fixes an EBCDIC bug in which characters in this range were
being translated twice.
M utf8.c
commit f1b9465fca2317758f65380fa5541ba7298276be
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Mar 29 13:34:59 2013 -0600
utf8.c: No need to check for UTF-8 malformations
This function assumes that the input is well-formed UTF-8, even though
until this commit, the prefatory comments didn't say so. The API does
not pass the buffer length, so there is no way it could check for
reading off the end of the buffer. One code path already calls
valid_utf8_to_uvchr(); this changes the remaining code path to correspond.
M utf8.c
commit 5e1efb7b0527fda29b8e5afdeb07e7dad6f2ca04
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Mar 28 19:56:39 2013 -0600
utf8.c: Remove redundant assignment.
This variable is always set just below.
M utf8.c
commit fed3816f422e1daa6e8b0db7eb4d1b566196bbd0
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Mar 28 17:19:16 2013 -0600
XXX enable _invlist_dump;
M embed.fnc
M embed.h
M proto.h
commit e64aad44709d90379f522ff10d4649d2e4d00e55
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Mar 8 11:01:32 2013 -0700
XXX EBCDIC header files
M charclass_invlists.h
M l1_char_class_tab.h
M regcharclass.h
M unicode_constants.h
commit 019beee02470827bc1f004ddf6b516402af48260
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Mar 15 12:26:15 2013 -0600
hints/os390.sh: Suppress bogus compiler message
M hints/os390.sh
commit 11a30f20c4da10f0feeced233bf2684e00c75ece
Author: John Goodyear <johngood@us.ibm.com>
Date: Sat Mar 2 12:31:25 2013 -0700
XXX Temporary for z/OS long long support
M Configure
M hints/os390.sh
commit 119a0715d8c01f5cfec0682311f858c18e7597a7
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 27 18:17:28 2013 -0600
Add test that to/from native character set works
For non-ASCII systems, there are character set translation tables. This
makes sure the two accessible ones are inverses of each other. If not,
nothing can be expected to work right.
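The new test itself is a Perl script; the check it performs can be sketched in C as follows. The tables here are filled with a made-up permutation purely for illustration, and all names are hypothetical:

```c
#include <assert.h>

/* Fill a toy pair of 256-entry translation tables.  Multiplying by
 * an odd number modulo 256 is a bijection, so the pair is valid. */
void toy_fill_tables(unsigned char to_native[256],
                     unsigned char to_unicode[256])
{
    int i;
    for (i = 0; i < 256; i++) {
        unsigned char n = (unsigned char)((i * 7 + 3) & 0xFF);
        to_native[i] = n;
        to_unicode[n] = (unsigned char) i;
    }
}

/* Return 1 if the two tables are inverses of each other in both
 * directions, 0 otherwise. */
int tables_are_inverses(const unsigned char a[256],
                        const unsigned char b[256])
{
    int i;
    assert(a && b);
    for (i = 0; i < 256; i++) {
        if (b[a[i]] != i || a[b[i]] != i)
            return 0;
    }
    return 1;
}
```

A single duplicated entry in either table is enough to make the check fail, which is the kind of corruption the test is meant to catch early.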
M MANIFEST
A t/base/translate.t
commit 5f0c5bfbb8d6f4e845ecf89f1d0e317a0b372b18
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 27 16:55:55 2013 -0600
lib/feature/bundle: Fix some things to pass under EBCDIC
M t/lib/feature/bundle
commit b9ad138530a0c76aefcad57a322f3a942f02d0a6
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 27 16:08:04 2013 -0600
XS-APItest/t/fetch_pad_names.t: Skip if EBCDIC
This could be ported, but there's a lot of stuff to convert; would need
a function to convert byte strings that form legal UTF-8 into legal
UTF-EBCDIC
M ext/XS-APItest/t/fetch_pad_names.t
commit 5599ad9706a3902a83b5b2af0bdf8d6994773312
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 27 12:05:53 2013 -0600
XXX ext/XS-APItest/t/utf8.t: Fix so passes EBCDIC
This involves skipping much of the tests. Reexamine later
M ext/XS-APItest/t/utf8.t
commit 962c60330d68e4f268d3b4dc9537f8b0ec907fed
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 27 11:27:06 2013 -0600
ext/re/t/re_funcs_u.t: Fix to work under EBCDIC
M ext/re/t/re_funcs_u.t
commit e9c6dee6411d20d25aa338093f8750bf52c830f5
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 27 11:11:22 2013 -0600
XXX dist/IO/t/io_utf8argv.t: Temporarily skip if EBCDIC
M dist/IO/t/io_utf8argv.t
commit 54a2162dc70b00a569fb7bd11390d9ec23b718fe
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 27 10:33:44 2013 -0600
t/op/print.t: Skip an EBCDIC test
This could be written (the values would probably change depending on the
code page), but the code that would get exercised is unlikely to vary
depending on character set.
M t/op/print.t
commit 436abc614332a5d92e050d6eb7e1226ae1b36d18
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 26 15:44:59 2013 -0600
XXX t/TEST: Avoid SIGPIPEs
M t/TEST
commit 4e3f5bb71d96b774eaa30c5ede2669b25cc5ceef
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 26 15:49:08 2013 -0600
XXX Temporarily test normalization
M cpan/Unicode-Normalize/t/fcdc.t
M cpan/Unicode-Normalize/t/form.t
M cpan/Unicode-Normalize/t/func.t
M cpan/Unicode-Normalize/t/illegal.t
M cpan/Unicode-Normalize/t/norm.t
M cpan/Unicode-Normalize/t/null.t
M cpan/Unicode-Normalize/t/partial1.t
M cpan/Unicode-Normalize/t/partial2.t
M cpan/Unicode-Normalize/t/proto.t
M cpan/Unicode-Normalize/t/split.t
M cpan/Unicode-Normalize/t/test.t
M cpan/Unicode-Normalize/t/tie.t
commit 261067894f4bdeae223fb15ea21b455babc9a57d
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 26 14:06:50 2013 -0600
op/index.t: Fix tests for EBCDIC
Commit 8a38a836 erroneously translates literals into the native
encoding, causing a double translation, which is garbage.
M t/op/index.t
commit 8ad90e8e1c90f0215c0e88dcb982ad379250f950
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 25 20:43:38 2013 -0600
op/chop.t: Fix for EBCDIC
One test is skipped because the code point is not representable on
EBCDIC platforms. Another test is modified to work on EBCDIC.
M t/op/chop.t
commit 40ed7d874aff44b2b5220dd287dc2caa045f4696
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 25 19:56:50 2013 -0600
t/op/lc.t: Fix to work under EBCDIC
This had code that attempted this, but it was wrong. The conversion to
EBCDIC must be done before the \U, or similar.
M t/op/lc.t
commit c6d1c65247594d26b41d2f3893a008d48b1a76fe
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 25 15:33:55 2013 -0600
Skip some tests under EBCDIC
EBCDIC won't work on these because of inherent differences from ASCII
M t/porting/customized.t
M t/porting/manifest.t
commit 1cf20253ac5e2a1ddea67cd48cfae2762e19219a
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 25 15:04:14 2013 -0600
porting/bincompat.t: Skip under EBCDIC
because the sorting order is different
M t/porting/bincompat.t
commit 5e33bbfec0ddd0037d9496d9bba20b79ac39fd36
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 25 14:59:50 2013 -0600
t/re/regex_sets.t: So will pass under EBCDIC
M t/re/regex_sets.t
commit b308d6eab702974802c62eebc70b063a2319e6a9
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 25 14:59:26 2013 -0600
t/porting/bincompat.t: Typo in comment
M t/porting/bincompat.t
commit db01a8b91d76b5ccd32502a0cf47804297d78775
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 25 13:09:09 2013 -0600
XXX fix \x{too large}
M dist/IO/IO.xs
M doop.c
M inline.h
M pp.c
M pp_pack.c
M regcomp.c
M sv.c
M toke.c
M utf8.c
M utf8.h
commit df2fe75ccdafbf1642fd72c6a6cba2069a862be1
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Mar 24 17:59:59 2013 -0600
mktables: Fix typos in comments
One of these fixes is for where a real CTRL-X was specified, instead of
$^X
M lib/unicore/mktables
commit c744de11174926be5c58fc7acdd72e81f73d61dc
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Mar 24 13:16:08 2013 -0600
utf8.c: Fix so UTF-16 to UTF-8 conversion works under EBCDIC
M utf8.c
commit 1ff9a68f3ce0591da90be6b6ddd4179894be5599
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Mar 24 13:14:34 2013 -0600
utf8.h, utfebcdic.h: Add #define
M utf8.h
M utfebcdic.h
commit bcafb326a821e97201b0a7bc058b91a82e3390b9
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Mar 24 13:11:25 2013 -0600
utf8.c: Use mnemonics instead of hex numbers
M utf8.c
commit 23733f9feb1625182b4f7a96ccaa38a9a1ec04cd
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 20 22:15:58 2013 -0600
lib/Unicode/UCD.t: Allow to run under EBCDIC
M lib/Unicode/UCD.t
commit d3f160ec3d8703769674a7fd35c21d9795900b40
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 19 15:27:31 2013 -0600
t/op/quotemeta.t: EBCDIC fixes
M t/op/quotemeta.t
commit fb91c20b51a78c33b3d469a6dab33c3c805debf6
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 19 11:32:55 2013 -0600
t/re/fold_grind.t: Fixes for EBCDIC
M t/re/fold_grind.t
commit 4358a3a102ab4ad200abed892a1298415a71f51a
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 19 11:21:09 2013 -0600
t/lib/charnames/alias: Fix some EBCDIC problems
M t/lib/charnames/alias
commit f1f753bf96ad4ebd4fd8c4cc671002c6e1bb5fc3
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 19 11:20:24 2013 -0600
t/uni/class.t: Make work on EBCDIC
M t/uni/class.t
commit ddcf29077cffdaa1c8dae8b69bd931d4455017dd
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 19 11:01:57 2013 -0600
feature/unicode_strings.t: Fix to work on EBCDIC
M lib/feature/unicode_strings.t
commit 5d946d7851e9906d20a53e494c7aa25904ddd268
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 19 10:10:46 2013 -0600
XXX rebase regen/regcharclass.pl: make more EBCDIC friendly
XXX regen/regcharclass.pl: maybe temp comment out utf8_char
One of the possible inputs to this process is a string. This clarifies
that it must be specified in Unicode characters, and adds code to
translate it to native, if necessary.
M regen/regcharclass.pl
M regen/regcharclass_multi_char_folds.pl
commit 0e62f9e6d907fca9ffc9349e705848075e4fc0d6
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 19 10:09:53 2013 -0600
XXX temporarily skip some folding tests
M regen/regcharclass.pl
M t/re/fold_grind.t
M t/re/reg_fold.t
commit b6d2168f900353073833e2092d3c6f9e34bbd829
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 18 22:00:29 2013 -0600
XXX temp skip perl5db.t
M lib/perl5db.t
commit b0ee64fee4d97b5733463ce1b2568b6cf5dc0995
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 18 11:45:06 2013 -0600
pp.c: White-space only
Make a ternary operation more clear
M pp.c
commit 82ffeb8eb0556d6dcc0dca4491a3ce1e8435278e
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 18 11:43:42 2013 -0600
Fix valid_utf8_to_uvchr() for EBCDIC
M utf8.c
commit 27ab3e38d590ae2ca0c1c9e78039e72746ec7463
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Mar 17 21:42:20 2013 -0600
t/test.pl: Add comment about EBCDIC
M t/test.pl
commit 4f15892a9e91315f863b1c20188174267d3b8044
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Mar 17 17:39:33 2013 -0600
XXX makedepend.SH: Why does 255 work and 250 not?
M makedepend.SH
commit ba56cfd7ff276c1a82d918d60c3eadfaffeacc46
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 16 22:48:22 2013 -0600
XXX regen/mk_PL_charclass.pl: Make EBCDIC friendly
need more of a commit message
M regen/mk_PL_charclass.pl
commit f2ef559a2c7516570de894265b294f97a7060cff
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 16 22:44:44 2013 -0600
XXX make various things more EBCDIC friendly
Adds trailing white space errors
Need to know what to do about ^A meaning 0x1, and M-foo meaning meta
M lib/DB.pm
M lib/dumpvar.pl
M lib/perl5db.pl
M lib/sigtrap.pm
commit 222fd450cbb588df3645dd756ca7427321dd8406
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 16 22:41:15 2013 -0600
XXX: Fixup commit message.
Fix UTF8_ACCUMULATE, utf8.c
M utf8.c
M utf8.h
commit 0db7087d577d25c662cd89eed175c7d6c2de7b42
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 16 16:52:45 2013 -0600
regcomp.c: Fix bug in EBCDIC
The POSIXA and NPOSIXA regnodes need to set the bits on only the ASCII
code points, but under EBCDIC those code points are not 0-127.
M regcomp.c
commit 6f9a7a015291e2855aaa3f71a610d8a52fd48188
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Mar 15 11:57:24 2013 -0600
re/charset.t: Allow to work on EBCDIC
This just converts the hard-coded character numbers to native, so will
work on any platform.
M t/re/charset.t
commit 37d1f528847997b09f1564b84f12b6c6166f56dc
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Mar 15 11:50:35 2013 -0600
XS-APItest/t/handy.t: Change output message
On EBCDIC platforms, the output is not in terms of \N{U+}; change text
to \x{ }
M ext/XS-APItest/t/handy.t
commit ef85adcdcd6b58903a6de8dccf1b7f3f417d1173
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 13 21:44:16 2013 -0600
XXX Dumper.xs: Don't know why this stopped compiling
M dist/Data-Dumper/Dumper.xs
commit 9339920e391408b24ee039f0365d72cd8b17f4d7
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 13 16:20:23 2013 -0600
toke.c: Simplify some code
We don't have to test separately for lower vs uppercase here, as
upper/lower case A-Z and a-z are not intermixed in the gaps in A-Z and
a-z under EBCDIC.
M toke.c
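The observation in the toke.c commit above can be checked with a small sketch. It assumes the letter positions shared by the common EBCDIC code pages (A-I at 0xC1-0xC9, J-R at 0xD1-0xD9, S-Z at 0xE2-0xE9, and the lowercase letters 0x40 below those). The function names are hypothetical and this is not the toke.c code.

```c
/* Letter positions assumed from the common EBCDIC code pages (e.g. 037, 1047). */
int ebcdic_is_upper(unsigned char c)
{
    return (c >= 0xC1 && c <= 0xC9)   /* A-I */
        || (c >= 0xD1 && c <= 0xD9)   /* J-R */
        || (c >= 0xE2 && c <= 0xE9);  /* S-Z */
}

int ebcdic_is_lower(unsigned char c)
{
    return (c >= 0x81 && c <= 0x89)   /* a-i */
        || (c >= 0x91 && c <= 0x99)   /* j-r */
        || (c >= 0xA2 && c <= 0xA9);  /* s-z */
}

/* Count bytes in [lo, hi] that are letters of either case. */
int count_alphas_in_range(unsigned char lo, unsigned char hi)
{
    int count = 0;
    for (int c = lo; c <= hi; c++) {
        if (ebcdic_is_upper((unsigned char)c) || ebcdic_is_lower((unsigned char)c))
            count++;
    }
    return count;
}
```

Scanning from 'A' to 'Z' (0xC1-0xE9) with the either-case test still finds only the 26 uppercase letters, because no lowercase letter sits in the gaps; the same holds for 'a' to 'z'.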
commit 0d053611833136e45c69aa262e2a58b139099578
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 13 16:18:12 2013 -0600
genpacksizetables.pl: Correct comment typo
M genpacksizetables.pl
commit c8e470c43aaa8c4e3db8bd278912b857d5d0d15c
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 13 16:17:39 2013 -0600
APItest/t/handy.t: Make EBCDIC-friendly
M ext/XS-APItest/t/handy.t
commit 70f3ef564fea4ae891b50083134131d04f8c1231
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 13 16:16:14 2013 -0600
Data-Dumper: Make EBCDIC-friendly
M dist/Data-Dumper/Dumper.xs
commit 481b9f8fc41292acf791da1d85fcc9eac3142556
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 13 16:14:31 2013 -0600
sv.c: Make less ASCII-centric
M sv.c
commit 3abedf4a6b4d618c2cb80b54b260140b5289e430
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 13 16:07:52 2013 -0600
charnames.t: Generalize for non-ASCII platforms
M lib/charnames.t
commit eea453362c4060ac7955b15e437785cafc88815e
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 13 16:05:46 2013 -0600
dump.c: Make less ASCII-centric
This has the added advantage of being clearer as to what is going on.
M dump.c
commit fdbe387ef0afd4914d08664ac41e6d8c5362bdf7
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 13 16:02:52 2013 -0600
hv.c: Stop being ASCII-centric
This uses macros which work cross-platform. This has the added advantage
that it is much clearer what is going on.
M hv.c
commit 57fbf4c60e98ba17fb2a7f6e231797434b97ad15
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Mar 12 22:34:17 2013 -0600
t/TEST: Don't bail if fails in t/base unless minitest
In order to completely compile Perl, many modules must have been parsed
and compiled, so if there is a full perl, we know that things basically
work. The purpose of bailing out is that if these supposedly very base
level functionality tests don't work, there's no point in continuing.
But over the years, tests of more esoteric functionality have been
added here, and if one of them doesn't work, it still could be that Perl
pretty much does work.
I believe it would be best to move such non-basic tests elsewhere, but
that's work, and hasn't bitten us much so far; this change lessens the
severity of the biting even more. Where it will really bite is if
things are so bad that a full perl binary can't be compiled, and we are
trying to figure out why using minitest.
M t/TEST
commit 7b38e7564441570de7d2f76126da190800f4e591
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 11 15:11:10 2013 -0600
Added Porting/reorder_charclass_invlists.pl
This program is used to bootstrap perl onto a non-ASCII platform with
no pre-existing perl.
M MANIFEST
A Porting/reorder_charclass_invlists.pl
commit a1d64e6244113e63632fea9aa961ecc270ae2901
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Mar 10 22:17:31 2013 -0600
t/base/lex.t: Use char suitable for both ASCII and EBCDIC
\xE2 is 'S' in EBCDIC, and so is going to be legal. \xDF is an alpha
which has no ASCII equivalent in either character set
M t/base/lex.t
commit 46eb04e6055bf8c3d0269580f01ba592599171c0
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Mar 10 13:11:07 2013 -0600
XXX Temporarily comment out ParseXS check
this is to get things to compile for now
M dist/ExtUtils-ParseXS/lib/ExtUtils/ParseXS.pm
commit 52fe59e7f579e6c8c8c1f75364552289c3d0372d
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Mar 10 11:34:10 2013 -0600
XXX Collate, Normalize: Allow to compile under EBCDIC
M cpan/Unicode-Collate/Collate.pm
M cpan/Unicode-Collate/mkheader
M cpan/Unicode-Normalize/Normalize.pm
M cpan/Unicode-Normalize/mkheader
commit 6179f495ff10bd45debd0d683c7ef7ae2b58cea4
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 9 21:57:38 2013 -0700
XXX dquote_static.c: Silence wrong warning on EBCDIC
Unsure of whether to add the 2nd !isCNTRL_L1 to silence return trip,
which should be a separate commit anyway.
This silences an inappropriate warning that doesn't happen on ASCII
platforms. CTRL-T maps to 0x14 on both ASCII and EBCDIC platforms. But
0x14 is a C1 control on EBCDIC, a C0 on ASCII. Therefore the test that
it's a control should include both C0 and C1, which isCNTRL_L1() does.
Also has a white-space change, outdenting a line so it doesn't wrap in
an 80 column window.
M dquote_static.c
commit b2aa076a112b143b69c9f4b1460fee7bb5579ee6
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Mar 7 12:08:41 2013 -0700
utfebcdic.h: Change 'unsigned char' to U8
This is for consistency with the rest of Perl
M utfebcdic.h
commit 9a6eb35c78871f9a5c1032222ed8101b4bb8076f
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Mar 8 08:11:38 2013 -0700
regen/regcharclass.pl: Make more EBCDIC-friendly
This commit changes the code generated by the macros so that they work
right out-of-the-box on non-ASCII platforms for non-UTF-8 inputs. THEY
ARE WRONG for UTF-8, but this is good enough to get perl bootstrapped
onto the target platform, and regcharclass.pl can be run there,
generating macros correct for UTF-8.
M regcharclass.h
M regen/regcharclass.pl
commit b4c11692cc40c4b9fab12b46d170de09f57fb4c1
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 6 21:30:01 2013 -0700
utfebcdic.h: Add (UV) cast
The operand of this macro is implicitly a UV. Make sure that it is.
M utfebcdic.h
commit 3a1e680113176c50e0ce1dde59d9e3d6f4fed584
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 6 17:04:58 2013 -0700
handy.h: Allow bootstrapping to non-ASCII platform
This adds a bunch of macros and moves things around to support
conditional compilation when Configure is called with
-DBOOTSTRAP_CHARSET. Doing so causes the usual macros that are
table-driven to not be used, since the table may not be valid when
bringing Perl up for the first time on a non-ASCII platform.
This allows it to compile using the platform's native C library ctype
functions, which should work enough to compile miniperl, and allow the
table to be changed to be valid. Then Configure can be re-run to not
bootstrap, and normal compilation can proceed
M handy.h
M inline.h
commit b80e7f6aaeb7956c9f7048bbc4b711d977d46dd7
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 4 13:00:47 2013 -0700
toke.c: Remove EBCDIC dependency
M toke.c
commit 4e743d7b770b668c3de5afa2e7dcff5f81d72a1e
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 4 09:14:25 2013 -0700
toke.c: Remove character set dependency
Instead of hard-coding the bit patterns that comprise the Byte Order
Mark in the UTF-8 or UTF-EBCDIC encodings, use the generated ones for
the current platform.
This removes some EBCDIC-only code.
M toke.c
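The idea in the toke.c commit above, testing against a per-platform constant instead of hard-coded bytes at each use, might look roughly like this. BOM_BYTES_SKETCH and bom_length are hypothetical names for this sketch, not the #defines that unicode_constants.h actually generates. The value shown is the UTF-8 encoding of U+FEFF; a UTF-EBCDIC build would define different bytes.

```c
#include <stddef.h>
#include <string.h>

/* Assumed per-platform constant: the bytes of U+FEFF in UTF-8.
 * A generated header would supply a different value on EBCDIC. */
#define BOM_BYTES_SKETCH "\xEF\xBB\xBF"

/* Return the length of a leading Byte Order Mark in buf, or 0 if none. */
size_t bom_length(const unsigned char *buf, size_t len)
{
    const size_t bom_len = sizeof(BOM_BYTES_SKETCH) - 1;
    if (len >= bom_len && memcmp(buf, BOM_BYTES_SKETCH, bom_len) == 0)
        return bom_len;
    return 0;
}
```

The calling code is then free of any character-set conditional; only the constant changes between platforms.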
commit 838879b15d00d7526dc770fb4f8f3caf1d4233b5
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Mar 4 09:10:27 2013 -0700
unicode_constants.h: Add #defines for Byte Order Mark
These will be used in future commits
M regen/unicode_constants.pl
M unicode_constants.h
commit 422651554888435b3dd26603547005758ab81503
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 2 15:04:18 2013 -0700
XXX: Find a cleaner way. Handle missing is_UTF8_CHAR_utf8_safe
This macro may not be present, and is currently used exclusively in
IS_UTF8_CHAR, which itself may be undefined, and code should cope with
that. This is a work-around until a better solution is found.
M utf8.c
M utf8.h
commit 11c10d20d69304ffc4fe1924626c7acf1cf04cc9
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 2 14:09:04 2013 -0700
Add Porting tool for help with non-ASCII platforms
Porting/reorder_l1_char_class_tab.pl is used to bootstrap Perl onto a
non-ASCII platform with no working Perl.
M MANIFEST
A Porting/reorder_l1_char_class_tab.pl
M regen/mk_PL_charclass.pl
commit 4ea480159561c6964e56a2c9655959211dab0d24
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 2 13:06:58 2013 -0700
inline.h: Reorder functions
The comment implied that the functions below it in the file were
deprecated, but in fact only the next two functions were. This
clarifies that and moves them so they are the final ones in the file
M inline.h
commit d75ca1d943fba58d53ff6b65c78b3fb108f6656b
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 2 12:33:42 2013 -0700
utfebcdic.h: Add comment
M utfebcdic.h
commit 753553b58a147fa717ab92c924ff086ab5aea6e2
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 2 12:12:11 2013 -0700
utf8.h: Clean up START_MARK definition and use
The previous definition broke good encapsulation rules. UTF_START_MARK
should return something that fits in a byte; it shouldn't be the caller
that does this. So the mask is moved into the definition. This means
it can apply only to the portion that creates something larger than a
byte. Further, the EBCDIC version can be simplified, since 7 is the
largest possible number of bytes in an EBCDIC UTF8 character.
M utf8.h
M utfebcdic.h
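A sketch of the encapsulation point in the START_MARK commit above: the mask lives inside the helper, so the result always fits in a byte and callers need not mask it themselves. This assumes the conventional UTF-8 layout, where an n-byte sequence starts with n one-bits followed by a zero bit, and covers only lengths 2 through 7. The function name is hypothetical and this is not the macro from utf8.h or utfebcdic.h.

```c
#include <assert.h>

/* Start-byte marker for a 'len'-byte sequence, already masked to 8 bits.
 * Sketch only: assumes the conventional UTF-8 bit layout, 2 <= len <= 7. */
unsigned char start_mark_sketch(int len)
{
    assert(len >= 2 && len <= 7);
    return (unsigned char)(0xFF & (0xFE << (7 - len)));
}
```

For example, a length of 2 yields 0xC0 and a length of 3 yields 0xE0, the familiar UTF-8 lead-byte prefixes.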
commit 3bf13bf4c83ca46e026663f7896c275184502515
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Mar 2 12:05:26 2013 -0700
utf8.h: Move #includes
These two files were only being #included for non-ebcdic compiles; they
should be included always.
M utf8.h
commit b2a404f3215758291aca7b0a736a9ac70acc8bb3
Author: John Goodyear <johngood@us.ibm.com>
Date: Sat Mar 2 11:49:14 2013 -0700
utfebcdic.h: Remove extra parameter expansions
These two macros were improperly expanding the parameters as well as
defining the operation, leading to compile errors.
M utfebcdic.h
commit 53f465f0919f80fbf3b03b914a0e4687500a0950
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Mar 1 08:28:52 2013 -0700
utf8.h: Simplify UTF8_EIGHT_BIT_foo on EBCDIC
These macros were previously defined in terms of UTF8_TWO_BYTE_HI and
UTF8_TWO_BYTE_LO. But the EIGHT_BIT versions can use the less general
and simpler NATIVE_TO_LATIN1 instead of NATIVE_TO_UNI because the input
domain is restricted in the EIGHT_BIT. Note that on ASCII platforms,
these both expand to the same thing, so the difference matters only on
EBCDIC.
M utf8.h
commit f45545334d2b75b597b13d207d36eef5ac69e3e1
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 28 09:25:27 2013 -0700
XXX temp: show makedepend cerr
M makedepend.SH
commit e5c798734146470ca89f57802875947bfe1f8179
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 27 21:59:11 2013 -0700
makedepend.SH: Split too long lines; properly join
I had thought that a continuation introduced a space. But no,
a continuation can happen in the middle of a token.
And this splits lines that are getting very long to avoid preprocessor
limitations.
M makedepend.SH
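The joining rule described in the makedepend.SH commit above can be sketched in C. makedepend.SH itself is a shell script, so this only illustrates the rule and is not a translation of it. A backslash-newline pair is deleted outright, with no space inserted, because the break may fall in the middle of a token. The function name is hypothetical.

```c
#include <stddef.h>

/* Delete every backslash-newline pair from 'text' in place, inserting
 * nothing in its place.  Returns the new string length. */
size_t join_continuations(char *text)
{
    size_t r = 0, w = 0;
    while (text[r] != '\0') {
        if (text[r] == '\\' && text[r + 1] == '\n') {
            r += 2;                 /* skip the pair entirely */
        } else {
            text[w++] = text[r++];  /* keep everything else */
        }
    }
    text[w] = '\0';
    return w;
}
```

Given "#def", a backslash-newline, then "ine X 1", this produces "#define X 1"; inserting a space at the join would instead split the directive name in two.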
commit 73349c9551037874d97aff1321311b76fd5d86fe
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 27 15:51:28 2013 -0700
makedepend.SH: White-space only
Align continuation backslashes
M makedepend.SH
commit 6d296269a9504f87ff859618ec1d54cd6b40388a
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 27 14:39:28 2013 -0700
makedepend.SH: Remove some unnecessary white space
Multi-line preprocessor directives are now joined into single lines.
This can create lines too long for the preprocessor to handle. This
commit removes blanks adjoining comments that get deleted. This makes
things somewhat less likely to exceed the limit.
This commit also fixes several [] which were meant to each match a tab
or a blank, but editors converted the tabs to blanks
M makedepend.SH
commit 5a37d0f3ada71c13ccec9a5472eb67b89d4588a8
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 27 14:30:51 2013 -0700
makedepend.SH: Retain '/**/' comments
These comments may actually be necessary.
M makedepend.SH
commit 11e86360babb332401e188a3158ec23db31a366b
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 27 08:38:19 2013 -0700
handy.h: Remove extraneous parens
M handy.h
commit 47cb1301507f6ea4368f06be8a914a3b49346dd3
Author: Andy Dougherty <doughera@lafayette.edu>
Date: Wed Feb 27 13:06:07 2013 -0500
Disable gcc-style function attributes on z/OS.
John Goodyear <johngood@us.ibm.com> reports that the z/OS C compiler
supports the attribute keyword, but not exactly the same as gcc.
Instead of a "warning", the compiler emits an "INFORMATIONAL" message
that Configure fails to detect. Until Configure is fixed, just disable
the attributes altogether.
John Goodyear
M hints/os390.sh
commit 9bbf0d837f8663a55d6286f85309adc7c9e8633c
Author: Andy Dougherty <doughera@lafayette.edu>
Date: Wed Feb 27 09:12:13 2013 -0500
Change os390 custom cppstdin script to use fgrep.
Grep appears to be limited to 2048 characters, and truncates
the output for cppstdin. Fgrep apparently doesn't have that limit.
Thanks to John Goodyear <johngood@us.ibm.com> for reporting this.
M hints/os390.sh
commit 459510bc9bde35c6abf5192433f335bdb3b1d1a7
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Feb 26 13:45:19 2013 -0700
utf8.c: Use more clearly named macro
In the case of invariants these two macros should do the same thing,
but it seems to me that the latter name more clearly indicates what is
going on.
M utf8.c
commit b1c7d4a6f599f64609766296acb680646179b7ba
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Feb 26 13:35:12 2013 -0700
Add macro OFFUNISKIP
This means use official Unicode code point numbering, not native. Doing
this converts the existing UNISKIP calls in the code to refer to native
code points, which is what they meant anyway. The terminology is
somewhat ambiguous, but I don't think will cause real confusion.
NATIVE_SKIP is also introduced for situations where it is important to
be precise.
M toke.c
M utf8.c
M utf8.h
M utfebcdic.h
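For readers unfamiliar with what a "skip" value is: it is the number of bytes a code point occupies once encoded. The sketch below computes it for official Unicode code points in standard UTF-8, covering only the range up to U+10FFFF. The function name is hypothetical; this is not the OFFUNISKIP macro itself, and it does not model UTF-EBCDIC, where the thresholds differ.

```c
#include <stdint.h>

/* Bytes needed to encode Unicode code point 'cp' in standard UTF-8.
 * Sketch only: valid for cp <= 0x10FFFF. */
int utf8_skip_sketch(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}
```

On an ASCII platform native and Unicode numbering coincide, so the native and official variants give the same answer; they differ only where the platform's code points are not Unicode's.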
commit 6e2e8834ce68a5055531af3d67ce9ca66c9893dc
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Feb 26 13:22:19 2013 -0700
toke.c: white space only
M toke.c
commit f08df806248f24b2f7bd042c33b93f1f751fd3c0
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 17 14:00:13 2013 -0700
toke.c: Don't remap \N{} for EBCDIC
Everything is now in native.
M toke.c
commit f9ea71e52139371605224f1f29db1570b0e71d61
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Feb 26 12:08:50 2013 -0700
utf8.c: Deprecate two functions
This is to force any code that has been using these functions to change.
Since the Unicode tables are now stored in native order, these functions
should only rarely be needed.
However, the functionality of these is needed, and in actuality, on
ASCII platforms, the native functions are #defined to these. So what
this commit does is rename the functions to something else, and create
wrappers with the old names, so that anyone using them will get the
deprecation.
M embed.fnc
M embed.h
M mathoms.c
M proto.h
M utf8.c
M utf8.h
commit fd3538552d64ce1f82d9ed1c518a8a22cfcae412
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Feb 26 11:26:09 2013 -0700
Deprecate uvuni_to_utf8()
Code should almost never be dealing with non-native code points
M embed.fnc
M embed.h
M proto.h
M utf8.c
M utf8.h
commit 3937884de10fcef9ad0cad2e65b95b10c8d539e3
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Feb 26 11:02:33 2013 -0700
Deprecate utf8_to_uni_buf()
Now that the tables are stored in native order, there is almost no need
for code to be dealing in Unicode order.
M embed.fnc
M proto.h
M utf8.c
commit 37cb54589d58515758170a337f38c5946064de06
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Feb 26 09:00:18 2013 -0700
makedepend.SH: Comment out unnecessary code
This causes problems currently for z/OS. But, since we don't know why
it was there, I'm leaving it in as a placeholder.
M makedepend.SH
commit 165c4169c8e84d321fcbe8519e70b7b4922c6d9c
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 20:26:44 2013 -0700
Deprecate valid_utf8_to_uvuni()
Now that all the tables are stored in native format, there is very
little reason to use this function; and those who do need this kind of
functionality should be using the bottom level routine, so as to make it
clear they are doing nonstandard stuff.
M embed.fnc
M proto.h
M utf8.c
commit f00eeb1ed7f48cf3aad18d17bee44d43cb8ed657
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 20:14:26 2013 -0700
utf8.c: Swap which fcn wraps the other
This is in preparation for the current wrapee becoming deprecated
M embed.fnc
M embed.h
M proto.h
M utf8.c
M utf8.h
commit 5493ecd8a71030306af709a57985b64a10e38b32
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 19:29:34 2013 -0700
utf8.c: Skip a no-op
Since the value is invariant under both UTF-8 and not, we already have
it in 'uv'; no need to do anything else to get it
M utf8.c
commit b1ae2e3e8c60988b792fe705f86214929c40fe9f
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 19:26:50 2013 -0700
utf8.c: Move comment to where makes more sense
M utf8.c
commit 6644c1f80c59ca38f615945a927a7e357d5d8d6a
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 17:30:10 2013 -0700
APItest: Test native code points, instead of Unicode
M ext/XS-APItest/APItest.pm
M ext/XS-APItest/APItest.xs
M ext/XS-APItest/t/utf8.t
commit c00f03b4a5a49c0ea5fc59bee3b0eb878c242672
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 17:12:53 2013 -0700
XXX CPAN Encode.xs
Use core function if available. This will insulate this code from any
future changes.
M cpan/Encode/Encode.xs
commit 2b13180bc5418f673052abb6846015c2057d09d1
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 17:04:24 2013 -0700
XXX CPAN and unsure Encode
M cpan/Encode/Encode.xs
M cpan/Encode/Unicode/Unicode.xs
commit 31bc4b76738416ca92e00b243f4bf6223471ac78
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 25 17:00:47 2013 -0700
XXX CPAN Encode.xs: fix indent
M cpan/Encode/Encode.xs
commit 700cf1d3741691faca852ca1052102b4b90122a4
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 24 17:23:15 2013 -0700
Don't refer to U+XXXX when mean native
These messages say the output number is Unicode, but it is really
native, so change to saying is 0xXXXX.
M regen/regcharclass_multi_char_folds.pl
M regexec.c
commit e064dda9987b53d13aa9aa3d62fc27fe283b8683
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 24 16:43:59 2013 -0700
Convert some uvuni() to uvchr()
All the tables are now based on the native character set, so using
uvuni() in almost all cases is wrong.
M cygwin/cygwin.c
M doop.c
M op.c
M pp_pack.c
M regcomp.c
M regexec.c
M toke.c
M utf8.c
commit fb99b48bd985aaaabc588da6507e4727e8778a01
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 24 16:25:47 2013 -0700
handy.h: White space only
M handy.h
commit cc107ecf8d68e66271d34308df231be78011f13d
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 24 16:19:49 2013 -0700
t/test.pl: Allow native/latin1 string conversions to work on utf8.
These functions no longer have the hard-coded definitions in them,
but now end up resolving to internal functions, so that new encodings
could be added and these would automatically understand them.
Instead of using tr///, these now go character by character and
converting to/from ord, which is slower, but allows them to operate on
utf8 strings.
Peephole optimization should make these essentially no-ops on ascii
platforms.
M t/test.pl
commit bd039685ed98d2bd793680d9c80c1e0002d8a4eb
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 24 16:05:55 2013 -0700
t/test.pl: Simplify ord to/from native fcns
This commit changes these functions from converting to/from a string to
calling utf8:: functions which operate on ordinals instead.
M t/test.pl
commit f320ccd9d77559d193b782d8db3692b5d830735b
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 24 15:35:38 2013 -0700
Make casing tables native
These are the final tables that hadn't yet been converted to native
character set casing.
M perl.h
M utfebcdic.h
commit 558f7ad4bd1bad806eaae899c54f0e25fa8de703
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 24 15:32:30 2013 -0700
utfebcdic.h: Remove trailing spaces
M utfebcdic.h
commit 508eef76a19429b8bf83fdc4184fce991a6bb70b
Author: Karl Williamson <public@khwilliamson.com>
Date: Fri Feb 22 18:55:26 2013 -0700
EBCDIC has the unicode bug too
We have not had a working modern Perl on EBCDIC for some years. When I
started out, comments and code led me to conclude erroneously that
natively it supported semantics for all 256 characters 0-255. It turns
out that I was wrong; it natively (at least on some platforms) has the
same rules (essentially none) for the characters which don't correspond
to ASCII ones, as the rules for these on ASCII platforms.
A previous commit for 5.18 changed the docs about this issue. This
current commit forces ASCII rules on EBCDIC platforms (even should there
be one that natively uses all 256). To get all 256, the same things
like 'use feature "unicode_strings"' must now be done.
M handy.h
commit 87e00a68a95d83f2eeb3e96d097274f82914a168
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 21 13:47:52 2013 -0700
handy.h: Solve a failure to compile problem under EBCDIC
handy.h is included in files that don't include perl.h, and hence not
utf8.h. We can't rely therefore on the ASCII/EBCDIC conversion
macros being available to us. The best way to cope is to use the native
ctype functions. Most, but not all, of the macros in this commit
currently resolve to use those native ones, but a future commit will
change that.
M handy.h
commit 97948cde7a9fadc67193f663f73cf3038833142c
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 21 13:35:12 2013 -0700
handy.h: Simplify some macro definitions
Now, only one of the macros relies on magic numbers (isPRINT), leading
to clearer definitions.
M handy.h
commit 221aaddfd42b0e604b42959b1bb562aa01b1d928
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 21 13:26:49 2013 -0700
handy.h: Combine macros that are same in ASCII, EBCDIC
These 4 macros can have the same RHS for their ASCII and EBCDIC
versions, so no need to duplicate their definitions
This also enables the EBCDIC versions to not have undefined expansions
when compiling without perl.h
M handy.h
commit 47f8763f40ab564a050e8e28603ec89ef4157fde
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 20 10:39:48 2013 -0700
Deprecate NATIVE_TO_NEED and ASCII_TO_NEED
These macros are no longer called in the Perl core. This commit turns
them into functions so that they can use gcc's deprecation facility.
I believe these were defective right from the beginning, and I have
struggled to understand what's going on. From the name, it appears
NATIVE_TO_NEED takes a native byte and turns it into UTF-8 if the
appropriate parameter indicates that. But that is impossible to do
correctly from that API, as for variant characters, it needs to return
two bytes. It could only work correctly if ch is an I8 byte, which
isn't native, and hence the name would be wrong.
Similar arguments for ASCII_TO_NEED.
The function S_append_utf8_from_native_byte(const U8 byte, U8** dest)
does what I think NATIVE_TO_NEED intended.
M embed.fnc
M mathoms.c
M proto.h
M toke.c
M utf8.h
M utfebcdic.h
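The API problem described in the NATIVE_TO_NEED commit above is easy to see in a sketch: a byte-in, byte-out interface cannot return the two bytes a variant character needs, whereas a function that appends through a destination pointer can. The code below assumes an ASCII platform and plain UTF-8. It is modeled on the signature quoted in the commit message but is not the core's S_append_utf8_from_native_byte(), and its name is hypothetical.

```c
#include <stdint.h>

/* Append the UTF-8 encoding of a single byte (0-255) at *dest and advance
 * *dest past it.  Sketch for ASCII platforms only: bytes below 0x80 are
 * invariant; the rest expand to a two-byte sequence. */
void append_utf8_from_byte_sketch(uint8_t byte, uint8_t **dest)
{
    if (byte < 0x80) {
        *(*dest)++ = byte;
    } else {
        *(*dest)++ = (uint8_t)(0xC0 | (byte >> 6));
        *(*dest)++ = (uint8_t)(0x80 | (byte & 0x3F));
    }
}
```

The caller must provide room for up to two output bytes per input byte.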
commit 303ab06940594e3900839ecb7a65c81cc458b1f2
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 20 10:26:43 2013 -0700
Remove remaining calls of NATIVE_TO_NEED
These calls are just copying the input to the output byte by byte.
There is no need to worry about UTF-8 or not, as the output is just an
exact copy of the input
M toke.c
commit 18ad69985d835ec25e4fe56fd1001c6000ecb9e8
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 20 08:12:15 2013 -0700
toke.c: Remove some NATIVE_TO_NEED calls
I believe NATIVE_TO_NEED is defective, and will remove it in a future
commit. But, just in case I'm wrong, I'm doing it in small steps so
bisects will show the culprit. This removes the calls to it where the
parameter is clearly invariant under UTF-8 and UTF-EBCDIC, and so the
result can't be other than just the parameter.
M toke.c
commit 53783c493d0cdfd209b83af002e608bb6479cec8
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 20 08:22:07 2013 -0700
toke.c: in [A-Za-z] use macros that exclude non-ASCII alphas
This code is attempting to deal with the problem of holes in the ranges
a-z and A-Z in EBCDIC. Prior to this patch, it accepted things like A
WITH GRAVE, etc, which shouldn't have the special processing to deal
with the holes
M toke.c
commit c591456017bf17d02c8e795a2de5025277dde1ee
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Feb 19 15:13:19 2013 -0700
Use real illegal UTF-8 byte
The code here was wrong in assuming that \xFF is not legal in UTF-8
encoded strings. It currently doesn't work due to a bug, but that may
eventually be fixed: [perl #116867]. The comments are also wrong that
all bytes are legal in UTF-EBCDIC.
It turns out that in well-formed UTF-8, the bytes C0 and C1 never appear
(C2, C3, and C4 as well in UTF-EBCDIC), as they would be the start byte
of an illegal overlong sequence.
This creates a #define for an illegal byte using one of the real illegal
ones, and changes the code to use that.
No test is included due to #116867.
M op.c
M toke.c
M utf8.h
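Why C0 and C1 can never start a well-formed sequence can be verified with a little arithmetic. The sketch below uses plain UTF-8 on an ASCII platform (it does not model UTF-EBCDIC) and decodes a two-byte sequence with the largest possible continuation byte. If even that value is below 0x80, every sequence with that start byte is overlong. The function names are hypothetical.

```c
/* Decode a two-byte UTF-8 sequence without any validation. */
int decode_two_byte(unsigned char start, unsigned char cont)
{
    return ((start & 0x1F) << 6) | (cont & 0x3F);
}

/* True if every two-byte sequence beginning with 'start' encodes a value
 * that already fits in one byte, i.e. the sequence is always overlong. */
int is_always_overlong_start(unsigned char start)
{
    return decode_two_byte(start, 0xBF) < 0x80;
}
```

C0 tops out at 0x3F and C1 at 0x7F, both representable in a single byte, while C2 reaches 0xBF and so is a legitimate start byte.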
commit 0b0a7324a2ed247395052be4451d7966c9a873b8
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 17 13:50:45 2013 -0700
toke.c: Remove remapping for EBCDIC for octal
The code prior to this commit converted something like \04 into its
EBCDIC equivalent only in double-quoted strings. This was not done in
patterns, and so gave inconsistent results. The correct thing to do
should be to do the native thing, what someone who works on a platform
would think \04 does. Platform independent characters are available
through \N{}, either by name or by U+.
The comment changed by this was wrong, as in some cases it was native,
and in some cases Unicode.
M toke.c
commit 8ca217e7fe9d418de701d393d92915c353cd9cff
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 17 13:47:13 2013 -0700
Remove EBCDIC remappings
Now that the tables are stored in native format, we shouldn't be doing
remapping.
Note that this assumes that the Latin1 casing tables are stored in
native order; not all of this has been done yet.
M handy.h
M perly.c
M pp.c
M regcomp.c
M regexec.c
M utf8.c
commit a7c763b3e26791fef499deeb72be224446d646a9
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 17 12:46:05 2013 -0700
Add and use macro to return EBCDIC
The conversion from UTF-8 to code point should generally be to the
native code point. This adds a macro to do that, and converts the
core calls to the existing macro to use the new one instead. The old
macro is retained for possible backwards compatibility, though it
probably should be deprecated.
M handy.h
M pp.c
M regcomp.c
M regexec.c
M toke.c
M utf8.c
M utf8.h
commit 83804835bde2c7217025a7b2821e4b22dce1332e
Author: Karl Williamson <public@khwilliamson.com>
Date: Sun Feb 17 09:18:06 2013 -0700
charnames: fix nit in comment
M lib/_charnames.pm
commit 891d8b28094b465b3b5de7296ecc733ca8e2cb7d
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Feb 16 11:05:44 2013 -0700
charnames: Make work in EBCDIC
Now that mktables generates native tables, we need to make U+ mean
Unicode instead of native.
M lib/_charnames.pm
M lib/charnames.pm
commit a6e2e1b818da962b8045794d76066358e3a92d14
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Feb 16 09:35:56 2013 -0700
Unicode::UCD: Work on non-ASCII platforms
Now that mktables generates native tables, it is a fairly simple matter
to get Unicode::UCD to work on those platforms.
M lib/Unicode/UCD.pm
commit f598e6e5ca7f7b8511ebf86ee1ad4e920ee5458b
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Mar 27 17:01:24 2013 -0600
Unicode::UCD: Typo in comment
M lib/Unicode/UCD.pm
commit d416db013e4d6a9aa1f344d5debcf3b015cffd07
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 14 22:16:38 2013 -0700
mktables: Generate native code-point tables
The output tables for mktables are now in the platform's native
character set. This means there is no change for ASCII platforms, but
is a change for EBCDIC ones.
Since we currently don't have any EBCDIC test platforms, I tested this
by faking it out to generate EBCDIC data, and then eye-balled the
results.
Code that didn't realize there was a potential difference between EBCDIC
and non-EBCDIC platforms will now start to work; code that tried to do
the right thing under these circumstances will no longer work. Fixing
that comes in later commits.
M lib/unicore/mktables
commit e332a9315da70d49d85e95d4dc386d7cb70cd74b
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 2 21:36:28 2013 -0600
mktables: Move table creation code
This code is moved later in the process. This is in preparation for
mktables generating tables in the native character set. By moving it to
later, the translation to native has already been done, and special
coding need not be done.
This also caught 7 code points that were omitted somehow in the previous
logic
M lib/unicore/mktables
commit fcf234326588538bdcbb86bbbe2eb1609ffbaefc
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 14 10:50:00 2013 -0700
Fix some EBCDIC problems
These spots have native code points, so should be using the macros for
native code points, instead of Unicode ones.
M regcomp.c
M sv.c
M toke.c
commit 0b3fb2e83ff47e3b8c83661518974099c69df7ba
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 13 22:10:19 2013 -0700
Remove unnecessary temp variable in converting to UTF-8
These areas of code included a temporary that is unnecessary.
M inline.h
M regcomp.c
M sv.c
commit 93aca4b19abed6c277f9e83c764e155e0b3d8eaa
Author: Karl Williamson <public@khwilliamson.com>
Date: Wed Feb 13 22:00:55 2013 -0700
utf8.h: Correct macros for EBCDIC
These macros were incorrect for EBCDIC. The 3 step process given in
utfebcdic.h wasn't being followed.
M utf8.h
commit 43feeeaaa546966bab913c926848d4b83bcf9043
Author: Karl Williamson <public@khwilliamson.com>
Date: Sat Feb 9 21:23:30 2013 -0700
Extract common code to an inline function
This fairly short paradigm is repeated in several places; a later commit
will improve it.
M embed.fnc
M embed.h
M inline.h
M pp_pack.c
M proto.h
M sv.c
M toke.c
M utf8.c
commit c885fb95b554def1cd86209ecf1842e1f6995565
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 7 21:35:57 2013 -0700
Don't use EBCDIC macro for a C language escape
C recognizes '\a' (for BEL); just use that instead of a look-up.
regen/unicode_constants.pl could be used to generate the character for
the ESC (set in surrounding code), but I didn't do that because of
potential bootstrapping problems when porting to an EBCDIC platform
without a working perl. (The other characters generated in that .pl are
less likely to cause problems when compiling perl.)
M regcomp.c
M toke.c
commit c071bf17185da45ae38f4fd061f9d9c83ef0a6f4
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 7 19:53:38 2013 -0700
Use byte domain EBCDIC/LATIN1 macro where appropriate
The macros like NATIVE_TO_UNI will work on EBCDIC, but operate on the
whole Unicode range. In the locations affected by this commit, it is
known that the domain is limited to a single byte, so the simpler ones
whose names contain LATIN1 may be used.
On ASCII platforms, all the macros are null, so there is no effective
change.
M handy.h
M regcomp.c
M utf8.c
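The distinction drawn in the message above can be sketched as follows.
All names are hypothetical stand-ins, not the macros in handy.h, and the
EBCDIC table is assumed to exist elsewhere and is not shown. The sketch
also assumes that only code points below 256 differ between the native
and Unicode orderings.

```c
#include <assert.h>
#include <stdint.h>

#ifdef SKETCH_EBCDIC
/* Assumed 256-entry platform table, supplied elsewhere (hypothetical). */
extern const uint8_t sketch_native_to_latin1_table[256];
#  define SKETCH_NATIVE_TO_LATIN1(ch) \
          (sketch_native_to_latin1_table[(uint8_t)(ch)])
#else
/* On ASCII platforms the byte-domain macro changes nothing. */
#  define SKETCH_NATIVE_TO_LATIN1(ch) ((uint8_t)(ch))
#endif

/* Whole-range conversion: needs a range test before it can use the
 * single-byte mapping (assumption: code points >= 256 are unchanged). */
uint32_t sketch_native_to_uni(uint32_t cp)
{
    return (cp < 256) ? (uint32_t)SKETCH_NATIVE_TO_LATIN1(cp) : cp;
}

/* Usage: when the caller already knows the value fits in one byte,
 * the simpler macro gives the same answer without the range test. */
void sketch_check_byte_domain(void)
{
    uint8_t byte = (uint8_t)'A';
    assert(SKETCH_NATIVE_TO_LATIN1(byte) == sketch_native_to_uni(byte));
}
```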
commit d8ba14798825c15295f863aafe8751c42e356cb2
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 7 14:31:09 2013 -0700
Use new clearer named #defines
This converts several areas of code to use the more clearly named macros
introduced in a recent commit.
M op.c
M toke.c
M utf8.c
M utf8.h
M utfebcdic.h
commit a0f98dc2f4bd314ccaeac36dfbb47cfa1383421a
Author: Karl Williamson <public@khwilliamson.com>
Date: Thu Feb 7 13:52:31 2013 -0700
utf8.h, utfebcdic.h: Create less confusing #defines
This commit creates macros whose names mean something to me and that I
don't find confusing. The older names are retained for backwards
compatibility. Future commits will fix bugs I introduced from
misunderstanding the meaning of the older names.
The older names are now #defined in terms of the newer ones, and moved
so that they are only defined once, valid for both ASCII and EBCDIC
platforms.
M utf8.h
M utfebcdic.h
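The arrangement described in the message above might look roughly like
this. Every name is hypothetical, since this log does not list the real
macro names: the platform-specific part defines the clearer name, and
the older name is then defined exactly once in terms of it.

```c
#include <assert.h>

/* Platform-specific part: each platform defines the clearer name. */
#ifdef SKETCH_EBCDIC
/* Hypothetical conversion table, assumed to be supplied elsewhere. */
extern const unsigned char sketch_conversion_table[256];
#  define SKETCH_CLEARER_NAME(c) (sketch_conversion_table[(unsigned char)(c)])
#else
#  define SKETCH_CLEARER_NAME(c) ((unsigned char)(c))
#endif

/* Shared part, valid for both kinds of platform: the older name is
 * kept for backwards compatibility and simply forwards to the new one. */
#define SKETCH_OLDER_NAME(c) SKETCH_CLEARER_NAME(c)

/* Usage: existing callers of the older name keep working unchanged. */
void sketch_check_alias(void)
{
    assert(SKETCH_OLDER_NAME('x') == SKETCH_CLEARER_NAME('x'));
}
```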
commit c6b660e92c6c3112981b8b7ecf9ab554c8f6cde2
Author: Karl Williamson <public@khwilliamson.com>
Date: Mon Feb 4 14:22:02 2013 -0700
pp_ctl.c: Use isCNTRL instead of hard-coded mask
This is clearer and portable to EBCDIC.
M pp_ctl.c
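To illustrate why a classification call is more portable than a mask,
here is a sketch that uses the standard C iscntrl() as a stand-in for
isCNTRL. The mask shown is a hypothetical example of an ASCII-only test;
it is not the expression that this commit removed from pp_ctl.c.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical hard-coded test: true only for values 0x00..0x1F. That
 * range is an ASCII layout detail; it misses DEL on ASCII, and control
 * characters are not confined to it on EBCDIC. */
int sketch_is_cntrl_masked(int c)
{
    return (c & ~0x1F) == 0;
}

/* Classification by meaning rather than by bit pattern; the C library
 * answers for whatever character set the platform uses. */
int sketch_is_cntrl(int c)
{
    return iscntrl((unsigned char)c) != 0;
}

/* Usage: these checks do not depend on any numeric character value. */
void sketch_check_cntrl(void)
{
    assert(sketch_is_cntrl('\n'));
    assert(!sketch_is_cntrl('A'));
}
```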
commit 355b90ed4f79405fe4bffb202667db3911ac03b5
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Feb 26 13:51:05 2013 -0700
utf8.c: is_utf8_char_slow() should use native length
What is passed is the actual length of the native UTF-8 character. The
function was instead calculating the length the character would have if
it were a Unicode character and then comparing the two, apples to
oranges.
M utf8.c
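The mismatch described above can be sketched by computing the encoded
length of one code point under both schemes. The function names are
hypothetical, this is not the code in utf8.c, and the UTF-EBCDIC ranges
are given as I understand them from Unicode Technical Report #16, so
treat them as an assumption to verify. Both functions take a Unicode
code point no larger than 0x10FFFF.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to encode cp in standard UTF-8. */
unsigned sketch_utf8_len(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Bytes needed to encode cp in UTF-EBCDIC (assumed ranges: each
 * continuation byte carries 5 payload bits instead of 6). */
unsigned sketch_utfebcdic_len(uint32_t cp)
{
    if (cp < 0xA0)    return 1;
    if (cp < 0x400)   return 2;
    if (cp < 0x4000)  return 3;
    if (cp < 0x40000) return 4;
    return 5;
}

/* Usage: the same code point has different lengths in the two
 * encodings, so a length from one must not be compared with the other. */
void sketch_check_lengths(void)
{
    assert(sketch_utf8_len(0x400) != sketch_utfebcdic_len(0x400));
}
```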
commit 7b95e84edb7afc7a9529a12b07a4cdaa611c3319
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 30 08:42:08 2013 -0600
autodoc.pl: Don't list undocumented deprecated fcns in API
autodoc creates a list of all the undocumented functions that are part
of the API. It omits ones that are experimental and whose API may
change; and now it omits ones that are deprecated (and whose API is
planned to change to be non-existent).
M autodoc.pl
commit 73acb38ffb609e4996d74f3b2a0148f2f363c045
Author: Karl Williamson <public@khwilliamson.com>
Date: Tue Apr 30 08:39:44 2013 -0600
autodoc.pl: Add note for deprecated functions
This causes each deprecated function to have a prominent note to that
effect in its API documentation.
M autodoc.pl
M mg.c
M utf8.c
-----------------------------------------------------------------------
--
Perl5 Master Repository