List:       xfree-i18n
Subject:    Re: [I18n]Re: The case for __STDC_ISO_10646__
From:       Ienup Sung <ienup.sung () eng ! sun ! com>
Date:       2001-02-24 1:23:56

Again, wchar_t is an opaque data type, which means you shouldn't
assume anything about the value of a wchar_t in your application. Access to
wchar_t characters and strings should be done only through the WPIs; as
a result, we will be able to support not only Unicode but also any other
codeset without having to provide mappings as you suggested.
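
To illustrate the point, here is a minimal sketch in C (not code from
this thread; the function name count_alpha is made up for illustration).
It assumes a hosted C99 environment and uses only standard wide-character
functions, so it never looks at the numeric value of a wchar_t:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the alphabetic characters in a multibyte
   string encoded in the current locale's codeset.
   Returns (size_t)-1 on an invalid sequence or allocation failure. */
size_t count_alpha(const char *mbs)
{
    assert(mbs != NULL);

    /* First pass: ask how many wide characters the string converts to. */
    mbstate_t st;
    memset(&st, 0, sizeof st);
    const char *p = mbs;
    size_t n = mbsrtowcs(NULL, &p, 0, &st);
    if (n == (size_t)-1)
        return (size_t)-1;

    wchar_t *ws = malloc((n + 1) * sizeof *ws);
    if (ws == NULL)
        return (size_t)-1;

    /* Second pass: do the actual conversion from a fresh shift state. */
    memset(&st, 0, sizeof st);
    p = mbs;
    mbsrtowcs(ws, &p, n + 1, &st);

    /* Classify through iswalpha() only; no comparison against code points. */
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (iswalpha((wint_t)ws[i]))
            count++;
    }

    free(ws);
    return count;
}
```

A caller would normally run setlocale(LC_ALL, "") first so that the
user's codeset is in effect. The same function then works whether the
wchar_t encoding behind it is UCS or something else.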

By using the macro, you are not actually getting away from
the hassle of multiple codesets. You are locking yourself into a single
codeset, practically disabling all other codesets, locales,
and systems instead of solving the problem for everyone. Considering that
many users and much data rely on various other codesets, such an action will
actually limit your application to a few users.
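
As a rough sketch of the difference (again not code from this thread;
wchar_is_ucs and wide_to_locale are names invented for illustration),
the following assumes a hosted C99 environment. The first function only
reports whether the implementation defines the macro. The second
converts a wide character through wcrtomb(), which does not depend on
the macro at all:

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: 1 if the implementation promises ISO 10646
   values in wchar_t, 0 if the wchar_t encoding is unspecified. */
int wchar_is_ucs(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: write wc as a NUL-terminated multibyte sequence
   in the current locale's codeset. Returns the number of bytes written
   (excluding the NUL), or -1 on error or if the buffer is too small. */
int wide_to_locale(wchar_t wc, char *out, size_t outsize)
{
    assert(out != NULL);

    char buf[MB_LEN_MAX];
    mbstate_t st;
    memset(&st, 0, sizeof st);

    /* wcrtomb() does the mapping; the value of wc is never inspected. */
    size_t len = wcrtomb(buf, wc, &st);
    if (len == (size_t)-1 || len >= outsize)
        return -1;

    memcpy(out, buf, len);
    out[len] = '\0';
    return (int)len;
}
```

Code that branches on wchar_is_ucs() needs a second path for every
system where it returns 0. Code written like wide_to_locale() has only
one path.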

With regards,

Ienup


] X-URL: http://www.cl.cam.ac.uk/~mgk25/
] Date: Thu, 22 Feb 2001 13:35:20 +0000
] From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
] Subject: [I18n]Re: The case for __STDC_ISO_10646__
] To: i18n@XFree86.Org
] MIME-version: 1.0
] 
] Tomohiro KUBOTA wrote on 2001-02-22 12:45 UTC:
] > On systems with __STDC_ISO_10646__, encodings which are incompatible
] > with Unicode cannot be handled as "multibyte character".
] 
] No. This is definitely wrong. As I explained before, you can map these
] easily into the private use ranges of UCS. Just stay out of the range of
] values allocated by UCS in wchar_t and then you can also declare
] __STDC_ISO_10646__, because at least your encoding is not incompatible
] with UCS any more and there is no out-of-band agreement on the wchar_t
] encoding necessary between the compiler and the run-time library.
] 
] You just use a subset of UCS that consists only of private use
] characters. Perfectly valid. Technical detail: you still must map at
] least ASCII to the UCS range, because the ISO 10646 standard does not
] allow UCS subsets that do not contain ASCII, and this will also significantly
] help with code portability because a lot of code assumes (wchar_t)0 ..
] (wchar_t)127 to be used for ASCII.
] 
] Using the huge private use ranges of UCS for non-UCS encodings is
] admittedly a bit of a hack, but it is still far better than having
] different incompatible encodings in wchar_t.
] 
] > Again:
] > 1. I don't understand why you are willing to work to degrade XTerm.
] >    What is your motivation to do so with spending your time and labor?
] 
] A year ago, xterm was an 8-bit terminal emulator. We turned it into a
] UTF-8 terminal emulator that does everything internally in UCS. This was
] done as part of an effort to turn Linux into a UTF-8 environment. This
] activity also attracted some of the ISO 2022 crowd, but that was never
] the plan originally. If you want to support any of the CJK legacy mess,
] just use kterm. It exists already and is widely deployed and people are
] happy with it. Proper support for ISO 2022 and multiple wide character
] encoding locales is one to two orders of magnitude more complicated than
] UCS support. Such added complexity quickly scares off future developers.
] It has happened with Emacs, where the ISO 2022 support that came in via
] the MULE integration has significantly decelerated the further evolution
] of Emacs and people are planning to rip it out again in favour of
] simpler UCS based i18n to get a more manageable code base again. I hope
] that this will not happen to XTerm as well.
] 
] Pretty much all modern programming languages have now hardwired UCS into
] their specification (Java, ECMAScript, Perl, Python, TCL, Ada, C#,
] VisualBasic, etc.). Why does one have to be called a religious fanatic
] if one follows the same sound engineering principle for xterm and
] eventually the rest of X11?
] 
] Why do you need switchable wide character encodings and all the pain and
] complicated support infrastructure they cause in the end?
] 
] Markus
] 
] -- 
] Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
] Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>
] 
] _______________________________________________
] I18n mailing list
] I18n@XFree86.Org
] http://XFree86.Org/mailman/listinfo/i18n