List:       aspell-user
Subject:    Re: [Aspell-user] Hyphens and apostrophes in words
From:       Ciarán Ó Duibhín <ciaran () oduibhin ! freeserve ! co ! uk>
Date:       2013-06-02 16:54:33
Message-ID: 1824B0D9EC23448380BE2C445829FBD3 () InneallChiarin

As this thread seems to be at an end, I'll just tidy up a couple of
loose ends.

1. My problem 1, concerning the treatment of hyphens during
tokenization. Carlo's suggestion of two-pass checking, the first pass
with the hyphen as a letter and the second with the hyphen as a
punctuation mark, is interesting. But won't the first pass object to all
the productive compounds like "half-moon", potentially infinite in
number, which will not be in the dictionary? It may still be workable if
the input to the second pass is not the whole text over again but only
the list of words rejected in the first pass. Interactive use of the
checker, however, seems to be ruled out unless both checks can be done
in the same pass, as the MS spell checker does.
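
To make the "same pass" idea concrete, here is a minimal sketch in
Python. It is not aspell code and does not use the aspell API: the word
list, the token pattern and the function names are all hypothetical,
chosen only for illustration. Each token is first looked up with the
hyphen treated as a letter; only if that fails is it split at the
hyphens and each part looked up separately.

```python
import re

# Hypothetical word list standing in for a real dictionary.
DICTIONARY = {"half", "moon", "well", "known", "co-op"}

# Assumed token shape: runs of letters joined by single internal hyphens.
TOKEN = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def check_token(token, dictionary=DICTIONARY):
    """Accept a token listed whole, or one whose hyphen-separated parts are all listed."""
    word = token.lower()
    # First try: hyphen treated as a letter, so the whole token must be listed.
    if word in dictionary:
        return True
    # Fallback: hyphen treated as punctuation, so each part must be listed.
    return all(part in dictionary for part in word.split("-"))


def misspelled(text, dictionary=DICTIONARY):
    """Return the tokens of text that fail both lookups, in order of appearance."""
    return [t for t in TOKEN.findall(text) if not check_token(t, dictionary)]
```

With this arrangement "co-op" is accepted because it is listed whole,
"half-moon" is accepted because both parts are listed, and only a token
that fails both lookups is reported, so nothing has to be carried over
to a second run.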

2. Kevin referred to the page
http://aspell.net/man-html/Words-With-Symbols-in-Them.html from which I
quote:

    The case where the symbol can appear at the beginning or end of the
    word is more difficult to deal with. The symbol may or may not
    actually be part of the word. Aspell currently handles this case by
    first trying to spell check the word with the symbol and if that
    fails, try it without.

I cannot reconcile this with what I observed in my problem 2, where
aspell appeared to check the word without the symbol (apostrophe) only.
I found that 'twas was rejected when the dictionary contained 'twas but
not twas, and accepted when the dictionary contained twas but not 'twas.
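
The difference between the two behaviours can be put as a small Python
model. Neither function is taken from aspell; both names and the
set-based dictionary are assumptions made for illustration. The first
follows the wording of the manual (with the symbol, then without), the
second follows what I seemed to observe (without the symbol only).

```python
def check_documented(word, dictionary, symbol="'"):
    """Model of the manual's wording: try the word with its symbol, then without."""
    if word in dictionary:
        return True
    stripped = word.strip(symbol)
    # Only retry if stripping actually removed a leading or trailing symbol.
    return stripped != word and stripped in dictionary


def check_observed(word, dictionary, symbol="'"):
    """Model of the apparent behaviour: look up only the form without the symbol."""
    return word.strip(symbol) in dictionary
```

The two models agree when the dictionary contains twas but not 'twas
(both accept 'twas). They disagree when the dictionary contains 'twas
but not twas: the documented model accepts 'twas, while the observed
model rejects it, which is the case I reported.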

Ciarán Ó Duibhín.

 
