
List:       icu-bugrfe
Subject:    Notification: incoming/1574
From:       jtcsv () jtcsv ! com
Date:       2001-12-02 23:08:57

ICU bug tracking notification

new message incoming/1574

Message summary for PR#1574
	From: shef31@yahoo.com
	Subject: French Contractions
	Date: Sun, 2 Dec 2001 18:08:54 -0500 (EST)
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From jtcsv  Sun Dec  2 18:08:55 2001
Received: by w1424.hostcentric.net (8.10.1/8.9.0) id fB2N8tF25640
	for jtcsv; Sun, 2 Dec 2001 18:08:55 -0500 (EST)
Received: from localhost (w1424.hostcentric.net [66.40.230.254])
	by w1424.hostcentric.net (8.10.1/8.9.0) with ESMTP id fB2N8ss25637
	for <jtcsv@jtcsv.com>; Sun, 2 Dec 2001 18:08:54 -0500 (EST)
Date: Sun, 2 Dec 2001 18:08:54 -0500 (EST)
Message-Id: <200112022308.fB2N8ss25637@w1424.hostcentric.net>
From: shef31@yahoo.com
To: jtcsv@jtcsv.com
Subject: French Contractions

Full_Name: Shef
Version: 131
OS: all
ICU_Component: textbounds
Submission from: (NULL) (216.43.222.90)


BreakIterator.getWordInstance() does not split French contractions.
"l'homme" is treated as one word, whereas it should be tokenized as "le" +
"homme" or as "l" + "homme". For the complete set of rules, see
http://french.about.com/library/pronunciation/bl-contractions.htm
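
To make the request concrete, here is a minimal sketch of the "l" + "homme" tokenization done as a post-processing step on top of a word BreakIterator. It is only an illustration under several assumptions: it uses the JDK's java.text.BreakIterator as a stand-in for the ICU class, the ElisionSplitter class and its ELIDED list are hypothetical names invented for this example, and the list of elided forms is deliberately short and incomplete rather than the full rule set from the page linked above.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of ICU or the JDK.
final class ElisionSplitter {
    // Assumed, incomplete list of French elided forms (l', d', j', qu', ...).
    private static final Set<String> ELIDED =
            Set.of("l", "d", "j", "m", "t", "s", "n", "c", "qu");

    // Returns the word tokens of text, splitting elided prefixes off the
    // word that follows them; the apostrophe itself is dropped.
    static List<String> tokenize(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String token = text.substring(start, end);
            // Skip whitespace and punctuation-only segments.
            if (token.codePoints().anyMatch(Character::isLetterOrDigit)) {
                splitElision(token, out);
            }
        }
        return out;
    }

    private static void splitElision(String token, List<String> out) {
        int apos = indexOfApostrophe(token);
        if (apos > 0 && ELIDED.contains(token.substring(0, apos).toLowerCase(Locale.ROOT))) {
            out.add(token.substring(0, apos));
            String rest = token.substring(apos + 1);
            if (!rest.isEmpty()) {
                out.add(rest);
            }
        } else {
            // Not a known elision (e.g. "aujourd'hui"): keep the token whole.
            out.add(token);
        }
    }

    // Index of the first ASCII or typographic apostrophe, or -1 if none.
    private static int indexOfApostrophe(String token) {
        int a = token.indexOf('\'');
        int b = token.indexOf('\u2019');
        if (a < 0) return b;
        if (b < 0) return a;
        return Math.min(a, b);
    }
}
```

Calling ElisionSplitter.tokenize("l'homme", Locale.FRENCH) yields "l" and "homme". Because only listed prefixes are split, a word such as "aujourd'hui" stays in one piece, which is one reason a fixed prefix list may be preferable to breaking at every apostrophe.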

