List:       dragonfly-submit
Subject:    Re: UTF8 locale MFC for DragonflyBSD
From:       Matthew Dillon <dillon () apollo ! backplane ! com>
Date:       2004-03-29 16:46:49
Message-ID: 200403291646.i2TGkn4p049638 () apollo ! backplane ! com

:> I don't really want to force us to UCS2, just because MS did. It is pretty
:> pointless if you think about Unicode as a means to encode every _written_
:> script in the world. Therefore if we want to apply any length checks, the
:> correct way is as specified by at least Unicode 3, i.e. UCS4.
:
:Well, not just MS; a lot of folks (notably Sun/Java) were caught off 
:guard when Unicode was extended beyond the base 64k characters.  I won't 
:replicate the flame wars here, they're all on Google. :-)
:
:My personal opinion: UCS-4 wastes a lot of space given that Unicode 3.1 
:is a ~21-bit set and nobody is really using the >=U+10000 space in a 
:practical manner (yet?).  But if you need to have a one-to-one mapping, 
:you don't have much choice.
:
:Unless you have a machine which uses 21-bit bytes, of course. ;-)

    UTF8 is the way we should go.  I severely dislike the wasted space as
    well, and it's a mistake to try to use a direct-encoding representation
    when most programs already deal in 'strings' rather than 'characters'
    for most things.
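
    To put rough numbers on the space argument, here is a minimal sketch
    that compares the encoded size of a few strings in UTF-8 and in a fixed
    4-byte-per-code-point encoding (UTF-32, standing in for UCS-4). The
    sample strings and the helper name encoded_sizes are assumptions made
    for illustration; they are not from the thread.

```python
# Illustrative samples (assumed, not from the original discussion).
samples = {
    "ascii": "hello",
    "latin": "h\u00e9llo",            # contains U+00E9
    "cjk": "\u65e5\u672c\u8a9e",      # three BMP code points
    "supplementary": "\U00010348",    # one code point beyond the base 64k
}


def encoded_sizes(text):
    """Return (utf8_bytes, ucs4_bytes) for the given string."""
    utf8 = len(text.encode("utf-8"))
    # utf-32-be is a fixed 4 bytes per code point and emits no BOM.
    ucs4 = len(text.encode("utf-32-be"))
    return utf8, ucs4


for name, text in samples.items():
    utf8, ucs4 = encoded_sizes(text)
    print(f"{name:14s} utf8={utf8:2d} ucs4={ucs4:2d}")
```

    For ASCII-heavy text the fixed-width form is four times larger; the two
    only meet for code points at or above U+10000, where UTF-8 also needs
    four bytes.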

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>