List: dragonfly-submit
Subject: Re: UTF8 locale MFC for DragonflyBSD
From: Matthew Dillon <dillon () apollo ! backplane ! com>
Date: 2004-03-29 16:46:49
Message-ID: 200403291646.i2TGkn4p049638 () apollo ! backplane ! com
:> I don't really want to force us to UCS2, just because MS did. It is pretty
:> pointless if you think about Unicode as a means to encode every _written_
:> script in the world. Therefore if we want to apply any length checks, the
:> correct way is as specified by at least Unicode 3 e.g. UCS4.
:
:Well, not just MS; a lot of folks (notably Sun/Java) were caught off
:guard when Unicode was extended beyond the base 64k characters. I won't
:replicate the flame wars here, they're all on Google. :-)
:
:My personal opinion: UCS-4 wastes a lot of space given that Unicode 3.1
:is a ~21-bit set and nobody is really using the >=U+10000 space in a
:practical manner (yet?). But if you need to have a one-to-one mapping,
:you don't have much choice.
:
:Unless you have a machine which uses 21-bit bytes, of course. ;-)
UTF8 is the way we should go. I severely dislike the wasted space as
well, and it's a mistake to try to use a direct-encoding representation
when most programs already deal in 'strings' rather than 'characters'
for most things.
-Matt
Matthew Dillon
<dillon@backplane.com>
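
The space argument in this thread can be made concrete with a rough sketch. The sample strings and the helper name encoded_sizes are hypothetical and not from the original messages; the sketch only counts how many bytes the same text takes in UTF-8, UTF-16 and UCS-4 (UTF-32), without a byte-order mark.

```python
# Hypothetical sample strings, chosen for illustration only.
SAMPLES = {
    "ascii": "hello",
    "latin": "na\u00efve",           # contains U+00EF, outside ASCII
    "cjk": "\u65e5\u672c\u8a9e",     # three characters inside the base 64k range
    "supplementary": "\U00010348",   # one character at or above U+10000
}


def encoded_sizes(text):
    """Return the byte count of text in each encoding (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "ucs-4": len(text.encode("utf-32-le")),
    }


for name, text in SAMPLES.items():
    # Print the number of code points next to the per-encoding byte counts.
    print(name, len(text), encoded_sizes(text))
```

UCS-4 always costs four bytes per character, which is the one-to-one mapping mentioned above, while UTF-8 costs one byte for ASCII and up to four for characters at or above U+10000. For mostly-ASCII text, UTF-8 is therefore about a quarter of the size.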