In the past few weeks “stemming” support was added to the Open Text Summarizer. Stemming is the ability to take a word such as “running” and trace it back to its original form, the word “run”. We use this feature to group together all of the derivatives of a certain stem. For OTS, keywords equal ideas: we need to group words together to recognize that “I ran” and “I am running” express similar ideas.

The stemming process is governed by rules. At the moment there are two main rule groups: prefix and postfix. Each rule is defined as [“replace this” : “with that”], a pair of strings separated by a colon. The postfix rules will try to match the end of the word, while the prefix rules will try to match the beginning. The program will try to apply each of the rules, from top to bottom, until one is matched. It will apply ONLY ONE rule of each group.

The stem rules are defined in en.xml (or any other language code dot xml). They look like this:

    replaceThis:withThis
    sses:s
    ing:
    went:go

With the example file, the program will replace each “sses” at the end of a word with “s”, remove every “ing” from the end of a word, and replace the word “went” with “go”. In the example the program will be able to tell that:

    stem(“went”) == stem(“going”) == stem(“go”) == “go”

As Alan said, “There are some grammar rules for this but because English is such a bastard language they can be quite unreliable.” For example, we can't automatically drop the “s” at the end of a word to remove the plural, because first, the word might end with “es”, and second, it may be a word such as “was”. One trick would be to place the “es” rule before the “s” and “e” rules, and maybe to start with a list of words that break our algorithm. You can go wild with the list because checking it is O(N), where N is the number of words in the article. We already have O(N^2) in some other place.

In order to fully support the 24+ languages that OTS supports, we need to define the rules for each language to make this connection.
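
To make the matching described above concrete, here is a minimal Python sketch. It is not the actual OTS code (OTS itself is not written in Python): the function names, the order in which the two groups are tried (prefix first, then postfix), the choice to treat all three example rules as postfix rules, and the `exceptions` argument standing in for the "list of words that break our algorithm" are all assumptions made for illustration.

```python
# Hypothetical sketch of "first matching rule wins, one rule per group".
# Rules use the "replaceThis:withThis" format from the example above.

def parse_rule(rule):
    """Split 'replaceThis:withThis' into (replace_this, with_this)."""
    old, _, new = rule.partition(":")
    return old, new


def apply_first(word, rules, at_end):
    """Apply only the first rule that matches, scanning top to bottom."""
    for rule in rules:
        old, new = parse_rule(rule)
        if not old:
            continue  # skip malformed rules with nothing to match
        if at_end and word.endswith(old):
            return word[:len(word) - len(old)] + new
        if not at_end and word.startswith(old):
            return new + word[len(old):]
    return word  # no rule in this group matched


def stem(word, prefix_rules, postfix_rules, exceptions=frozenset()):
    """Apply at most one prefix rule and at most one postfix rule."""
    if word in exceptions:  # assumed: words that break the rules are left alone
        return word
    word = apply_first(word, prefix_rules, at_end=False)
    return apply_first(word, postfix_rules, at_end=True)


# The three example rules, assumed here to belong to the postfix group.
PREFIX_RULES = []
POSTFIX_RULES = ["sses:s", "ing:", "went:go"]

for w in ("went", "going", "go"):
    print(w, "->", stem(w, PREFIX_RULES, POSTFIX_RULES))  # each prints "... -> go"
```

Because only the first match in a group is applied, rule order matters: with the postfix list `["es:", "s:"]`, “boxes” becomes “box”, whereas putting “s:” first would leave “boxe”. Passing `exceptions={"was"}` keeps “was” from being cut down to “wa”.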
I know that for many languages this feature is critical (Russian, for example).

Shalom,
Nadav