List: icu
Subject: UTF_EXPECTED_LENGTH
From: "Zartaj T. Majeed" <zmajeed () adobe ! com>
Date: 2002-09-24 19:16:53
No. UTF_CHAR_LENGTH takes a UChar32, a Unicode scalar value.
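I need a function that takes a UTF-8 byte and tells me the expected
length of the character it starts, so I know how many more bytes
should follow it for a complete character (the length minus one).
For a byte that cannot start a character it would return 0.
Something like: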
#define UTF8_EXPECTED_LENGTH(uchar) \
((uchar) < 0x80 ? 1 : \
(((uchar) & 0xe0) == 0xc0) ? 2 : \
(((uchar) & 0xf0) == 0xe0) ? 3 : \
(((uchar) & 0xf8) == 0xf0) ? 4 : 0)
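The same idea can be sketched as a plain C function, which sidesteps two pitfalls of a macro version: in C, == binds tighter than &, so every mask test needs its own parentheses, and a byte held in a signed char would compare as negative against 0x80. This is only an illustration, not an existing ICU API; the names utf8_expected_length and utf8_count_sequences are made up for this example, and the byte is assumed to be passed as an unsigned 8-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/* Expected total length in bytes of the UTF-8 sequence that starts with
 * lead byte b, or 0 if b cannot start a sequence (a trail byte 0x80..0xBF,
 * or 0xF8..0xFF). Only the bit pattern is checked: 0xC0, 0xC1 and
 * 0xF5..0xF7 are accepted here although they never occur in valid UTF-8. */
static int utf8_expected_length(uint8_t b)
{
    if (b < 0x80) return 1;            /* 0xxxxxxx: single byte */
    if ((b & 0xe0) == 0xc0) return 2;  /* 110xxxxx */
    if ((b & 0xf0) == 0xe0) return 3;  /* 1110xxxx */
    if ((b & 0xf8) == 0xf0) return 4;  /* 11110xxx */
    return 0;                          /* not a lead byte */
}

/* Hypothetical helper: count the sequences in buf by hopping from lead
 * byte to lead byte. Returns (size_t)-1 if a byte that cannot start a
 * sequence is found where a lead byte is expected, or if the last
 * sequence is cut short. Trail bytes are skipped, not validated. */
static size_t utf8_count_sequences(const uint8_t *buf, size_t len)
{
    size_t i = 0, n = 0;
    while (i < len) {
        int k = utf8_expected_length(buf[i]);
        if (k == 0 || (size_t)k > len - i) return (size_t)-1;
        i += (size_t)k;
        n++;
    }
    return n;
}
```

The number of bytes that should follow a lead byte is then utf8_expected_length(b) - 1 whenever the result is nonzero.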
Zartaj
> Are you asking about UTF_CHAR_LENGTH (or UTF8_CHAR_LENGTH or
> UTF16_CHAR_LENGTH)? Right now, none of them return an error value, and
> I'm not sure why an error value is needed (maybe if you had malformed
> UTF-8).
>
> George Rhoten
> IBM Globalization Center of Competency/ICU San Jose, CA, USA
>
>
>
>
> "Zartaj T. Majeed" <zmajeed@adobe.com>
> Sent by: icu-admin@www-124.southbury.usf.ibm.com
> 09/24/2002 11:42 AM
>
>
> To: "icu list" <icu@www-124.southbury.usf.ibm.com>
> cc:
> Subject: RE: icu4c api proposal: simplify UTF macros
>
>
>
>
> Is there a macro that returns the expected length of a character
> given a code unit? I.e. the number of subsequent code units
> needed to form a valid character would be one less.
> For a single-unit character or the first unit of a multi-unit character
> the macro would return the expected length. For any other code unit
> it would return an error value.
>
> Thanks,
> Zartaj
>
> ______________________________________________
> icu mailing list
> icu@oss.software.ibm.com
> http://oss.software.ibm.com/developerworks/oss/mailman/listinfo/icu
>
>