On Thursday, 13 July 2006 12:16, Kevin Donnelly wrote:
>
> But I was very interested in your idea of a script or scripts to apply
> changes to the whole translation tree.  For instance, if I have decided
> that a word is incorrect in a specific context, and needs to be replaced by
> another, what do I do?  At present, I just update it when I see it, which
> is not very efficient.  What do other teams do?
>

E.g. something like this:

$ python checkword-en-de.py /home/kdestable/svn/l10n/de [pP]lugin [pP]lugin,[mM]odul
852 Messages <80 characters and <10 words with "[pP]lugin" in msgid and "[pP]lugin|[mM]odul" in msgstr found in 1457 files, 75 with different translation
315 Messages with "[pP]lugin"-"[pP]lugin"
537 Messages with "[pP]lugin"-"[mM]odul"
75 Messages with different translation:
['&noatun; is a fully-featured plugin-based media player for &kde;.','&noatun; ist ein vielseitiger und erweiterbarer Medienspieler für &kde;.','noatun.po','#6']
['Milk Chocolate is a simple, minimalist user interface plugin','Vollmilchschokolade ist eine einfache, minimalistische Benutzeroberfläche','noatun.po','#99']
<-- snip -->

$ python checkword-en-de.py -h
Usage: python checkword-en-de.py [OPTION] /path/to/pofiledir/ regexpr-word-en [regexpr-word-de,regexpr-word2-de,...]
Options:
  -h, --help : usage
  -s, --summary : print only summary
  -a, --all : list of all messages, default only list of messages with different translation
  -m, --msglen int : check only msgids with len

> Far better would be some sort of interface (ideally GUI) which searches all
> files in the tree for this target word, and lists the msgids/msgstrs where
> it occurs.  You could then scan these, tick the msgstrs to replace, and
> have the word replaced globally, and the po files saved.  I can do
> File/Replace on a single file easily enough, but doing this over the whole
> tree is not worth it.
> KFileReplace will do a global Find/Replace, but it
> is a blunt tool for this job, in that it only lists the words that were
> searched for, and not the context of the words.  A replace in these
> circumstances would be risky.  To look at the context, you have to open
> each file individually.
>
> A couple of years ago, Pedro Morais on the Portuguese team did some useful
> scripts in Python which did basic checking (for example, that the msgstr of
> a msgid ending in a full stop also had a full stop).  Some of these
> duplicated some of the functionality in KBabel, but were still useful as
> standalones.
>
> If anyone else is vaguely interested in this idea, can we get a discussion
> going, so that at least one positive thing will have come out of Rosetta?

Yes, I am very interested. I have so far:

checkutils.py: the main library (walk through l10n/LANG, search gui.po's for a program, extract all guiitems from a set of docbooks or docmessage.po's, generate dictionaries with guistrings from messages.po's, search for similar guiitems with Levenshtein distance, soundex or Python difflib, etc.)

kzwiebelfisch.py: find messages with typical translation errors
Usage: python kzwiebelfisch.py [OPTION] path/to/[app[doc.po]]
options:
  -h, --help : usage
  -b, --backup : backup files app-doc.po to app-doc.po.backup and apply all changes to app-doc.po
  -f, --fuzzy : set hits to fuzzy, but don't write "Muster" to first line of msgstr
  -m, --muster : set hits to fuzzy and write "Muster" to first line of msgstr
  -k, --konsole : output to konsole, don't change *.po files
  -t, --tag : choose to tag hits in konsole
  -p file, --pattern file : read pattern from file
  -p "regex1|regex2|etc", --pattern "regex1|regex2|etc" : use regex1,regex2,etc (quoted+separated with "|") as pattern
  default : no bfmktp, edit msgstr with readline, use internal pattern, change po-files with edited msgstr

checkdocbook.py: check guiitems in English documentation
Usage: python checkdocbook.py /path/to/l10n/documentation/[kdemodul/program/[name.docbook]]
options:
  -h, --help : usage
  -s, --summary : only summary, don't write bugreport-logs
  -t itemtype, --type itemtype : check only itemtype (button|menu|submenu|menuitem|label|icon)
Output: logs kdemodul-program-trunk|kdestable-en.log in working dir

checktrans.py: check guiitems in language documentation (docmessage.po's)
Usage: python checktrans.py /path/to/l10n/lang/docmessages/[kdemodul/program/[docname.po]]
Output: kdemodul-program-trunk|stable-lang-po.log in working dir

checkdefaulttranslations.py: check for default translations, e.g. the translations in visualdict.po

checkdocstabletrunk.py: compare docbooks in two dir trees (stable - trunk)

checktags.py:
Usage: python checktags.py [OPTION] /path/to/pofiledir/[pofile]
Options:
  -h, --help : usage
  -s, --summary : print only summary
  -a, --all : print all messages with tags, default only messages with different tags
  -f, --fuzzy : set messages with different tags to fuzzy, default false
Output : default print messages with different tags

checkshortcuts.py:
Usage: python checkshortcuts.py [OPTION] /path/to/pofiledir/[pofile]
Options:
  -h, --help : usage
  -s, --summary : print only summary
  -a, --all : print all messages with shortcuts, default only messages with different shortcuts
  -f, --fuzzy : set messages with different shortcuts to fuzzy, default false
  -o, --obsolete : include obsolete entities from user.entities, default false
Output : default print messages with different shortcuts

checkscreenshots.py:
Usage : python checkscreenshots.py [OPTION] /path/to/l10n/lang/docs/[moduldir/[progdir]]
Options:
  -h, --help : usage
  -s, --size int : check only png-files > int KB size, default 5 KB
defaults : -s 10 /home/kdestable/svn/l10n/de/docs/
Output : list of png-files in lang/docs, but not in documentation
         list of png-files in documentation, but not in lang/docs
         list of png-files newer in documentation than in lang/docs

checkmessagediffs.py:
Usage : python checkmessagediffs.py [OPTION] /path/to/l10n/de/[doc]messages/[dir/[file[.po]]]
Options:
  -h, --help : usage
  -s, --summary : print only summary
  -a, --all : list of all messages, default only list of messages with different translations
  -m, --msglen int : check only msgids with len in msgid's in docmessages.po, could be missing markup in documentation

roughtrans.py: translate all guiitems automatically

etc.

Problem with all these scripts: I am in the middle of a complete rewrite of the main library checkutils.py, but lack of time (too many untranslated German docs, too many errors in the documentation) prevents me from finishing it. Some of these scripts only work at a basic level and need a lot of improvement. As they were never intended to be used outside the German team, there are a lot of German comments in the scripts, and some of the scripts have to be extended to work outside the dir tree l10n/de.

The basic idea behind all this stuff: There is no automatic sync between strings in the GUI and these strings in the documentation. Every change in a message.po, especially in kdelibs.po, generates wrong guiitems in docmessage.po's or docbooks.
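
A minimal sketch of that idea in Python: pull the GUI elements out of a DocBook text and report the ones whose label matches no msgid from the GUI catalogs. This is not the code from checkutils.py. The function names, the regular expression, and the handling of the "&" accelerator marker are assumptions made for illustration, and real catalogs would first have to be parsed into a list of msgids.

```python
import re

# DocBook GUI element names, as they appear in the checkdocbook.py statistics.
GUI_TAGS = ("guibutton", "guimenu", "guisubmenu",
            "guimenuitem", "guilabel", "guiicon")
GUI_ITEM_RE = re.compile(r"<(%s)>(.*?)</\1>" % "|".join(GUI_TAGS), re.DOTALL)


def squeeze(text):
    """Collapse runs of whitespace (including line breaks) to single spaces."""
    return " ".join(text.split())


def strip_accel(msgid):
    """Assumption: '&' in a GUI msgid only marks the keyboard accelerator."""
    return squeeze(msgid.replace("&", ""))


def extract_guiitems(docbook_text):
    """Return (tag, label) pairs for every GUI element in a DocBook string."""
    return [(tag, squeeze(label))
            for tag, label in GUI_ITEM_RE.findall(docbook_text)]


def missing_guiitems(docbook_text, gui_msgids):
    """List GUI items used in the documentation that match no catalog msgid."""
    known = {strip_accel(msgid) for msgid in gui_msgids}
    return [item for item in extract_guiitems(docbook_text)
            if item[1] not in known]
```

Called with the text of one docbook and the msgids of the matching GUI catalogs, `missing_guiitems` returns the items that would need a closer look, either because the GUI string changed or because the documentation is wrong.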
There are also a lot of errors even in the English templates:

$ python checkdocbook.py -s /home/kdestable/svn/l10n/documentation/
No of documentations = 300
No of documentations with errors 221 of 300 = 74 %
No of documentations without errors 69 of 300 = 23 %
14521 guiitems found at 26247 locations in 886 docbook(s)
guiitems in docbooks found in 3912 messages catalogues: 9868 of 14521 = 68 %
guiitems in docbooks NOT found in 3912 messages catalogues: 4653 of 14521 = 32 %
itemtypes not in gui:
  307 guibutton
  1860 guimenu
  135 guisubmenu
  0 guimenuitem
  3573 guilabel
  151 guiicon
in gui:
  2366 guibutton
  14816 guimenu
  805 guisubmenu
  0 guimenuitem
  7264 guilabel
  283 guiicon

It is a waste of time to proofread the language documentation in 10 - 20 languages and check each of them for wrong guiitems. Let's do this once for the templates and then let a script detect all errors in the language documentation.

Burkhard Lück
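
For reference, the word check shown at the top of this message could be sketched roughly as follows. This is an illustration only, not the actual checkword-en-de.py: the function name `check_word` and its parameters are hypothetical, and the (msgid, msgstr) pairs are assumed to have been read from the .po files already.

```python
import re
from collections import Counter


def check_word(entries, source_pattern, target_patterns,
               max_chars=80, max_words=10):
    """Group short messages by how the source word was translated.

    entries: iterable of (msgid, msgstr) pairs.
    Returns (counts, different): counts maps each target pattern to its
    number of hits, different lists the pairs whose msgstr matches none
    of the target patterns.
    """
    source_re = re.compile(source_pattern)
    target_res = [(pattern, re.compile(pattern)) for pattern in target_patterns]
    counts = Counter()
    different = []
    for msgid, msgstr in entries:
        # Only short messages are checked: <80 characters and <10 words.
        if len(msgid) >= max_chars or len(msgid.split()) >= max_words:
            continue
        # Skip untranslated messages and messages without the source word.
        if not msgstr or not source_re.search(msgid):
            continue
        for pattern, target_re in target_res:
            if target_re.search(msgstr):
                counts[pattern] += 1
                break
        else:
            # No accepted translation found: candidate for manual review.
            different.append((msgid, msgstr))
    return counts, different
```

The `different` list is what a translator would then scan by hand, since a missing match can also be a perfectly good free translation.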