List: postgresql-general
Subject: Re: [HACKERS] Bug in UTF8-Validation Code?
From: Mark Dilger <pgsql () markdilger ! com>
Date: 2007-04-03 15:47:14
Message-ID: 46127702.8060100 () markdilger ! com
Martijn van Oosterhout wrote:
> On Tue, Apr 03, 2007 at 11:43:21AM +0200, Albe Laurenz wrote:
>> IMHO this is the only good and intuitive way for CHR() and ASCII().
>
> Hardly. The comment earlier about mbtowc was much closer to the mark.
> And wide characters are defined as Unicode points.
>
> Basically, CHR() takes a unicode point and returns that character
> in a string appropriately encoded. ASCII() does the reverse.
>
> Just about every multibyte encoding other than Unicode has the problem
> of not distinguishing between the code point and the encoding of it.
> Unicode is a collection of encodings based on the same set.
>
> Have a nice day,
Thanks for the feedback. Would you say that the way I implemented things in the
example code is correct for multibyte non-Unicode encodings? I don't see
how to avoid the endianness issue for those encodings.
mark
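
To make the distinction discussed above concrete, here is a minimal Python sketch. It is not the example code referred to in this thread and not PostgreSQL's implementation; the function names chr_unicode, ascii_unicode, and raw_bytes_as_int are hypothetical, chosen only for illustration. It shows CHR()/ASCII() treated as a mapping between a Unicode code point and its encoded bytes, and, for contrast, what happens when the encoded bytes themselves are read as an integer, which is where a byte-order choice appears.

```python
def chr_unicode(code_point: int, encoding: str = "utf-8") -> bytes:
    """CHR() sketch: Unicode code point -> that character's bytes in `encoding`."""
    return chr(code_point).encode(encoding)


def ascii_unicode(data: bytes, encoding: str = "utf-8") -> int:
    """ASCII() sketch: first character of an encoded string -> Unicode code point."""
    return ord(data.decode(encoding)[0])


def raw_bytes_as_int(data: bytes, byteorder: str = "big") -> int:
    """Alternative reading: treat the encoded bytes themselves as the number.

    The result depends on `byteorder`, which is the endianness issue that
    arises when the code and its encoding are not kept separate.
    """
    return int.from_bytes(data, byteorder)


if __name__ == "__main__":
    hiragana_a = 0x3042  # U+3042 HIRAGANA LETTER A

    utf8 = chr_unicode(hiragana_a)             # bytes E3 81 82
    eucjp = chr_unicode(hiragana_a, "euc_jp")  # bytes A4 A2

    # The code point round-trips regardless of the target encoding.
    print(hex(ascii_unicode(utf8)))
    print(hex(ascii_unicode(eucjp, "euc_jp")))

    # Reading the EUC-JP bytes as an integer gives a byte-order-dependent value.
    print(hex(raw_bytes_as_int(eucjp, "big")))
    print(hex(raw_bytes_as_int(eucjp, "little")))
```

With the code-point interpretation, the same argument names the same character in every server encoding. With the raw-bytes interpretation, the number is only meaningful once a byte order has been fixed by convention.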