List: aspell-user
Subject: Re: [Aspell-user] Hyphens and apostrophes in words
From: Ciarán Ó Duibhín <ciaran () oduibhin ! freeserve ! co ! uk>
Date: 2013-06-02 16:54:33
Message-ID: 1824B0D9EC23448380BE2C445829FBD3 () InneallChiarin
As this thread seems to be at an end, I'll just tidy up a couple of
loose ends.
1. My problem 1, the treatment of hyphens during tokenization.
Carlo's suggestion of two-pass checking, the first pass with the hyphen
as a letter and the second with the hyphen as a punctuation mark, is
interesting. But won't the first pass object to all the productive
compounds like "half-moon", potentially infinite in number, which will
not be in the dictionary? It may still be workable if the input to the
second pass is not the whole text over again but the list of words
rejected in the first pass. Even so, interactive use of the checker
seems to be ruled out unless both checks can be done in the same pass,
as the MS spell checker does it.
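To make the "same pass" idea concrete, here is a minimal sketch in Python. It is not Aspell code and not a description of how the MS checker works; the word list, the tokenizer patterns and the function names are all assumptions made up for illustration. Each token is first looked up with the hyphen treated as a letter, and only a rejected token is split at its hyphens and re-checked part by part.

```python
import re

# Toy word list standing in for a real dictionary (hypothetical contents).
DICTIONARY = {"half", "moon", "well", "known", "a", "the", "is", "sight",
              "vis-a-vis"}

def tokenize(text, hyphen_is_letter):
    """Split text into words, optionally keeping internal hyphens."""
    if hyphen_is_letter:
        pattern = r"[A-Za-z]+(?:-[A-Za-z]+)*"
    else:
        pattern = r"[A-Za-z]+"
    return re.findall(pattern, text)

def check_single_pass(text):
    """Return (token, bad_parts) for every token that fails both checks."""
    rejected = []
    for token in tokenize(text, hyphen_is_letter=True):
        # Stage 1: hyphen as a letter, so "vis-a-vis" is found whole.
        if token.lower() in DICTIONARY:
            continue
        # Stage 2: hyphen as punctuation, applied only to this rejected
        # token, so "half-moon" passes via "half" and "moon".
        parts = tokenize(token, hyphen_is_letter=False)
        bad = [p for p in parts if p.lower() not in DICTIONARY]
        if bad:
            rejected.append((token, bad))
    return rejected

print(check_single_pass("the half-moon is a well-knwon sight vis-a-vis"))
```

With this toy dictionary only "well-knwon" is reported, because "knwon" fails as a part; "half-moon" is accepted through its parts and "vis-a-vis" through the whole-word lookup.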
2. Kevin referred to the file
http://aspell.net/man-html/Words-With-Symbols-in-Them.html from which I
quote:
"The case where the symbol can appear at the beginning or end of the
word is more difficult to deal with. The symbol may or may not actually
be part of the word. Aspell currently handles this case by first trying
to spell check the word with the symbol and if that fails, try it
without."
I cannot reconcile this with what I observed in my problem 2, where
aspell appeared to check the word without the symbol (apostrophe) only.
I found that 'twas was rejected when the dictionary contained 'twas but
not twas, and accepted when the dictionary contained twas but not 'twas.
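The difference between the two behaviours can be put as a small Python sketch. Neither function is taken from Aspell's source; both are hypothetical models, one of the rule as the manual states it and one of what I seemed to observe, and the dictionaries are plain sets invented for the example.

```python
def check_documented(word, dictionary, symbol="'"):
    """Model of the manual's rule: try with the symbol, then without."""
    if word in dictionary:
        return True
    stripped = word.strip(symbol)
    # Fall back to the bare form only if stripping changed something.
    return stripped != word and stripped in dictionary

def check_observed(word, dictionary, symbol="'"):
    """Model of the reported behaviour: only the bare form is looked up."""
    return word.strip(symbol) in dictionary

with_symbol = {"'twas"}     # dictionary contains 'twas but not twas
without_symbol = {"twas"}   # dictionary contains twas but not 'twas

for name, check in (("documented", check_documented),
                    ("observed", check_observed)):
    print(name,
          check("'twas", with_symbol),
          check("'twas", without_symbol))
```

The two models agree when the dictionary holds only twas and disagree when it holds only 'twas, which is exactly the case where the manual's description and my test results part company.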
Ciarán Ó Duibhín.