Howdy to all the players out there interested in localization. So far I've never attempted to learn another spoken language, despite ample opportunities; I've always been interested in learning a foreign language, but apparently never interested enough. Unlike foreign languages and me, computers and I click. My broadening view of computers has returned me to this latent desire to learn a new language, sort of. While planning an extension to Nepomuk, it has become apparent that the feature I wish to implement could have profound effects on translation efforts. I would appreciate your time and constructive comments, but first bear with me as I explain this fluid concept.

From http://wordnet.princeton.edu/:
"WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts can be navigated with the browser. WordNet is also freely and publicly available for download. WordNet's structure makes it a useful tool for computational linguistics and natural language processing."

I've been very interested in WordNet since becoming aware of this amazing software. Of the software I'm aware of, few programs come close to utilizing WordNet to its potential, the Natural Language Toolkit (http://nltk.sourceforge.net/index.php/Screenshots) being the exception. Considering this, it seems evident that such functionality would have a profound impact if implemented in KDE, putting lexical utilities within easy reach of a broad spectrum of people. Such a utility would enable useful features, some of which I will list in increasing order of implementation complexity:

1) Reduction of index size for desktop search, while increasing the "connections" to key word sets extracted from documents, through the use of synonyms (see the sketch after this list).

2) Improvement of dictionary applications; as a quick example, imagine hooking up one of these to Wikipedia: http://kde-look.org/CONTENT/content-m1/m87173-1.png or http://www.visualthesaurus.com/landing

3) Improved tools for translation and knowledge representation.

It may be that these tools will expand to become the letters defining tomorrow's paragraphs, the egg coming before the chicken.
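As a rough illustration of point 1, the sketch below collapses synonyms onto a shared synset key, so one index entry serves queries for any word in the set. The indexing scheme is entirely hypothetical (a real desktop search would also need word-sense disambiguation; here I naively take the first, most common sense):

    from nltk.corpus import wordnet as wn

    def canonical_key(word):
        # Map a word to the name of its most common synset, so that
        # synonyms collapse into a single index entry.
        synsets = wn.synsets(word)
        return synsets[0].name() if synsets else word

    # 'car' and 'automobile' land on the same key ('car.n.01'),
    # shrinking the index while widening the matches.
    index = {}
    for doc_id, words in [('doc1', ['car', 'engine']), ('doc2', ['automobile'])]:
        for w in words:
            index.setdefault(canonical_key(w), set()).add(doc_id)

    print(index[canonical_key('car')])  # {'doc1', 'doc2'}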

One becomes aware of the contextual complexity of language when attempting to learn a specialized concept, or when having a conversation with a person who has been subjected to a different set of circumstances. This complexity is intensely magnified when comparing two different languages with this in mind. Almost everyone is able to use language without much conscious effort (even as a child), which interestingly indicates that written language lacks much of the information conveyed in spoken language (even with the most masterful use of literary devices). Communication between people who are not physically present has been a determining factor in the size of mankind's stride, justifying further improvements as its contextual scope increases. It seems that WordNet was a gigantic leap in the right direction, but such leaps are unlikely to have smooth landings. The time is now right to extend the lexical database WordNet by allowing users to refine the database and have those refinements rated by others, by context. Such an improvement would be even more versatile if done in a decentralized manner, allowing the comparison of different {interests, regions, languages} to strengthen our communication by conveying information that was once not there.
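To pin down what I mean by a refinement "rated by others, by context", here is one hypothetical shape such a record could take; every name in it is my own invention, not part of WordNet or any existing standard:

    from dataclasses import dataclass, field

    @dataclass
    class Refinement:
        # A user-proposed change to a synset, tagged with the context
        # ({interests, regions, languages}) in which it was proposed.
        synset: str    # e.g. 'bank.n.01'
        proposal: str  # e.g. an added synonym, a reworded gloss, a new link
        context: dict  # e.g. {'region': 'en-AU', 'interest': 'finance'}
        ratings: list = field(default_factory=list)  # scores from other users

        def score(self):
            # Ratings, ideally gathered indirectly, decide whether the
            # refinement surfaces for users who share this context.
            return sum(self.ratings) / len(self.ratings) if self.ratings else 0.0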

Translations as Semantic Mirrors: From Parallel Corpus to Wordnet by Helge Dyvik (www.hf.uib.no/i/LiLi/SLF/Dyvik/ICAMEpaper.pdf) explains how existing data can be used to produce translations of WordNet, while noting that verifying those translations is no small task. I'm assuming this means that the general objects that make up a sentence (nouns, verbs, adjectives, and adverbs) apply to all languages, just arranged according to different orders and rules. An infrastructure should be put in place, via Nepomuk, to make semantic data more modular. One such solution involves lexical servers (services) which can communicate with other servers, gathering information as needed or cross-referencing datasets. The clients of these servers, perhaps Lexikal, connect to them to retrieve information, submit changes (which will be rated, preferably indirectly, by other users), and link objects/concepts together. By integrating the features of "Lexikal" we could improve translation efforts immeasurably, in addition to improving data quality. Personally, I am most interested in improving data quality, so the perspective of people who know more than one language would be much appreciated. Also of interest to me is what kind of server would be best suited, although that isn't a topic for this list.
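To give the server/client idea some shape, here is a hypothetical sketch of the client-side interface I imagine for something like Lexikal; every class, method, and URL is invented for illustration and is not an existing Nepomuk or Lexikal API:

    class LexikalClient:
        # Hypothetical client for a decentralized lexical server.
        def __init__(self, server_url):
            self.server_url = server_url

        def lookup(self, concept, language='en'):
            # The server may gather information from peer servers or
            # cross-reference other datasets before answering.
            ...

        def submit_refinement(self, refinement):
            # Submit a change; other users' (preferably indirect)
            # ratings decide whether it propagates.
            ...

        def link(self, concept_a, concept_b, relation):
            # Link objects/concepts together, e.g. across languages.
            ...

    # Intended usage: compare how two communities carve up the same concept.
    en = LexikalClient('https://lex.example.org/en')
    de = LexikalClient('https://lex.example.org/de')
    bank_en = en.lookup('bank.n.01')
    bank_de = de.lookup('bank.n.01', language='de')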