
List:       icu-bugrfe
Subject:    Notification: incoming/1587
From:       jtcsv () jtcsv ! com
Date:       2001-12-05 21:35:19

ICU bug tracking notification

new message incoming/1587

Message summary for PR#1587
	From: heninger@us.ibm.com
	Subject: rbbi, character break for Tamil
	Date: Wed, 5 Dec 2001 16:35:13 -0500 (EST)
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From jtcsv  Wed Dec  5 16:35:14 2001
Received: by w1424.hostcentric.net (8.10.1/8.9.0) id fB5LZEn22868
	for jtcsv; Wed, 5 Dec 2001 16:35:14 -0500 (EST)
Received: from localhost (w1424.hostcentric.net [66.40.230.254])
	by w1424.hostcentric.net (8.10.1/8.9.0) with ESMTP id fB5LZD222865
	for <jtcsv@jtcsv.com>; Wed, 5 Dec 2001 16:35:13 -0500 (EST)
Date: Wed, 5 Dec 2001 16:35:13 -0500 (EST)
Message-Id: <200112052135.fB5LZD222865@w1424.hostcentric.net>
From: heninger@us.ibm.com
To: jtcsv@jtcsv.com
Subject: rbbi, character break for Tamil

Full_Name: Andy Heninger
Version: 200
OS: all
ICU_Component: textbounds
project: ICU4C
Submission from: (NULL) (32.97.110.72)
Submitted by: andy


Hyangmi Cho/CAM/Lotus@LOTUS:

The last one [.brk file] you gave us is fine, but there is an error in handling
Indic scripts, as in the example below.

ICU breaks it into 2 characters, while the user wants it as one. I think we can
still wait. Thanks.

Eric Mader:

The current character break rules handle only Devanagari correctly (the example
below is Tamil). The right way to fix this is to follow the Unicode standard,
which gives a generic pattern that will work for all scripts.

Hyangmi:
Yes, it's Tamil. A simple example is \u0baa + \u0bc1, which should be handled as
one character.
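
The expected behavior for the \u0baa + \u0bc1 case can be sketched in a few
lines of Python. This is only a simplified illustration, not the ICU rule set
and not the full pattern from the Unicode standard: it assumes that a character
break is allowed before every code point except a combining mark (general
category starting with "M"), so a mark always stays attached to the base that
precedes it. The function name character_breaks is hypothetical.

```python
import unicodedata


def character_breaks(text):
    """Return the offsets where a character break is allowed.

    Simplified assumption: a combining mark (general category Mn, Mc or Me)
    never starts a new character; it attaches to the preceding code point.
    """
    if not text:
        return []
    breaks = [0]
    for i in range(1, len(text)):
        # Only allow a break before code points that are not combining marks.
        if not unicodedata.category(text[i]).startswith("M"):
            breaks.append(i)
    breaks.append(len(text))
    return breaks


# Tamil letter PA followed by vowel sign U: one user-perceived character.
print(character_breaks("\u0baa\u0bc1"))  # [0, 2]
```

With this rule the Tamil pair yields breaks only at offsets 0 and 2, that is, a
single character, which is the result the report asks for. Because the rule
depends only on the general category and not on the script, it behaves the same
way for any script whose dependent signs are encoded as combining marks.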


