List:       kde-i18n-doc
Subject:    Re: GNOME appdata files translation
From:       "T.C. Hollingsworth" <tchollingsworth () gmail ! com>
Date:       2013-11-04 4:35:23
Message-ID: CAJVv0OmtDNKgQY3zTDz4JjbAuadozmjGeOYqqe9SQHuKf83_Yg () mail ! gmail ! com

Hi Yuri, Albert, other translators!

On Sat, Nov 2, 2013 at 1:55 AM, Yuri Chornoivan <yurchor@ukr.net> wrote:
> Hi,
>
> GNOME developers have announced their goal of adding appdata files [1] to
> every package and of hiding every package without such files from GNOME
> users within 1 year (GNOME 3.14).
>
> So far, even Fedora/KDE package manager (Apper) cannot show appdata by
> default (should be recompiled with a specific option). So there is no KDE
> distribution that can show appdata.

Our very awesome Rex Dieter is already working on getting that going.
I, for one, can't wait to have screenshots and other juicy goodness in
my apper. :-)

> From the technical PoV, is it possible for scripty to extract messages from
> such files (should they be added to the repos) and merge them back?

Attached is a rough patch against l10n-kde4/scripts to implement this.
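
To illustrate the extraction half, here is a minimal Python 3 sketch. It is
not the attached script itself: it pulls the untranslated `name` and `summary`
elements out of an appdata document and formats them as POT-style entries.
The sample XML, the `pot_escape` helper, and the check for a plain `lang`
attribute are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real appdata files live in the source modules.
SAMPLE = """<application>
  <name>Example</name>
  <name lang="de">Beispiel</name>
  <summary>An example application</summary>
</application>"""


def pot_escape(text):
    """Escape backslashes, quotes and newlines for a PO string literal."""
    return text.replace('\\', '\\\\').replace('"', '\\"').replace('\n', '\\n')


def extract_entries(xml_text, filename):
    """Return one POT entry per untranslated name/summary element."""
    root = ET.fromstring(xml_text)
    entries = []
    for context in ('name', 'summary'):
        for elem in root.findall(context):
            if 'lang' in elem.attrib:
                continue  # already a translation, not a source string
            entries.append(
                '#: {0}\nmsgctxt "{1}"\nmsgid "{2}"\nmsgstr ""\n'.format(
                    filename, context, pot_escape(elem.text or '')))
    return entries


entries = extract_entries(SAMPLE, 'example.appdata.xml')
print('\n'.join(entries))
```

Using the element name as `msgctxt` keeps identical strings in `name` and
`summary` apart, which is the same idea the .desktop extraction relies on.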

A fair bit of it is copy/pasted/otherwise heavily inspired from the
.desktop file stuff, since it's sort of doing the same thing.  It
might make sense to combine a couple of the scripts and just add args
to do different stuff depending on whether it's appdata or .desktop,
but I wanted to keep it simple for now.

Things I've tested:
- createappdatacontext.py spits out acceptable looking POTs when fed
reasonable contrived arguments (example at [1])
- merge_appdata_files.sh runs applyappdatacontext.py successfully when
fed reasonable contrived arguments and spits out a valid combined XML
file
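
The merge step can be sketched roughly as follows, in Python 3. This is a
simplified stand-in for what applyappdatacontext.py does: a plain dict
(hypothetical) replaces the compiled gettext catalogs, and only simple text
elements such as `name` are handled.

```python
import xml.etree.ElementTree as ET


def apply_translations(root, context, translations):
    """Add or update <context lang="..."> children of root.

    translations maps a language code to translated text; in the real
    script this would come from gettext catalogs instead.
    """
    source = next(
        (e for e in root.findall(context) if 'lang' not in e.attrib), None)
    if source is None:
        return  # nothing to translate for this field
    for lang, text in sorted(translations.items()):
        elem = root.find('{0}[@lang="{1}"]'.format(context, lang))
        if elem is None:
            # no existing translation: create a new node
            elem = ET.SubElement(root, context, {'lang': lang})
        elem.text = text


root = ET.fromstring(
    '<application><name>Example</name>'
    '<name lang="de">Alt</name></application>')
apply_translations(root, 'name', {'de': 'Beispiel', 'fr': 'Exemple'})
result = ET.tostring(root, encoding='unicode')
print(result)
```

Existing translated nodes are updated in place and new ones are appended, so
re-running the merge should not duplicate entries.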

Things I've not tested:
- findappdatafiles, though how bad can you screw up changing arguments to find?
- update_translations, because oh my god I already need a drink ;-)

Also, the indentation on translated entries looks nothing like the
originals, despite a chunk of awkward code that tries to remedy that.
Not sure whether it's possible to fix that, or if we need some sort of
project-wide policy on indentation to match what the script will do,
or if nobody cares.
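
For reference, the strip-and-restore idea behind that chunk of code looks
roughly like this in Python 3. The function names are hypothetical, and the
rule that extra translated lines get indented by default is an assumption,
not something the patch settles.

```python
def strip_indent(lines, indent):
    """Drop `indent` from lines that start with it; remember which did."""
    had_indent = [line.startswith(indent) for line in lines]
    stripped = [line[len(indent):] if flag else line
                for line, flag in zip(lines, had_indent)]
    return stripped, had_indent


def restore_indent(lines, indent, had_indent):
    """Re-apply `indent` using the flags recorded by strip_indent.

    A translation may have more lines than the original; lines beyond
    the recorded flags are indented (an assumed policy).
    """
    restored = []
    for i, line in enumerate(lines):
        flag = had_indent[i] if i < len(had_indent) else True
        restored.append(indent + line if flag else line)
    return restored


original = ['    <p>First paragraph.</p>', '    <p>Second paragraph.</p>', '']
stripped, flags = strip_indent(original, '    ')
roundtrip = restore_indent(stripped, '    ', flags)
print(stripped)
print(roundtrip == original)
```

The round trip is only exact when the translation keeps the same line
structure as the original, which is why a project-wide indentation policy
might be the simpler answer.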

I wouldn't be the least bit surprised if I missed something,
especially in update_translations, so please do have a look and let me
know how badly I would have broken scripty. ;-)

-T.C.

[1] https://gist.github.com/tchollingsworth/2e708dd8925c9ff2d939

["appdata-l10n.patch" (text/x-patch)]

Index: applyappdatacontext.py
===================================================================
--- applyappdatacontext.py	(revision 0)
+++ applyappdatacontext.py	(working copy)
@@ -0,0 +1,83 @@
+#!/usr/bin/env python
+
+from __future__ import unicode_literals
+import gettext
+import os
+import sys
+import xml.etree.ElementTree as ET
+
+infile = sys.argv[1]
+outfile = infile + '.new'
+langs = sys.argv[2:]
+
+localedir = os.path.join(os.environ['KDEDIR'], 'share', 'locale')
+
+# generate Python translation objects on the fly that are hooked up to our
+# temporary catalogs
+class GT(object):
+    def __init__(self):
+        self._objs = {}
+    
+    def __getitem__(self, name):
+        if name not in self._objs:
+            domain = 'apply_{0}'.format(lang)
+            self._objs[name] = gettext.translation(domain, localedir, languages=['abc'])
+        
+        return self._objs[name]
+gt = GT()
+
+try:
+    infh = open(infile)
+except IOError:
+    sys.stderr.write('Cannot open file {0}'.format(infile))
+    sys.exit(1)
+
+p = ET.parse(infh)
+root = p.getroot()
+
+for context in ('name', 'summary', 'description'):
+    msgid = None
+    
+    for elem in root.findall(context):
+        #find the msgid and indention we stripped previously
+        if not 'lang' in elem.attrib:
+            if context != 'description':
+                msgid = elem.text
+            else:
+                # reverse the stripping of indentation
+                start = '\n' if elem.text.startswith('\n') else ''
+                indent = elem.text.lstrip('\n')
+                omsg = ''.join(ET.tostring(e) for e in elem).split('\n')
+                msgid = '\n'.join([ line[len(indent):] if line.startswith(indent) else line
+                            for line in omsg ])
+                reindent = [ line.startswith(indent) for line in omsg ]
+                print elem.text
+            
+            break
+    
+    #this field doesn't exist
+    if msgid is None:
+        continue
+    
+    for lang in langs:
+        domain = 'apply_{0}'.format(lang)
+        omsgstr = gt[domain].gettext(msgid).split('\n')
+        msgstr = start + '\n'.join([ indent + line if reindent[i] else line for i, line in enumerate(omsg) ])
+        msgelem = ET.fromstring('<{0} lang="{1}">{2}</{0}>\n'.format(context, lang, msgstr))
+        
+        #try to find an existing translation in the XML
+        elem = root.find('.//{0}[@lang="{1}"]'.format(context, lang))
+        
+        #create a new node if we didn't find an existing one to use
+        if elem is None:
+            root.append(msgelem)
+        else:
+            for child in elem:
+                del child
+            for child in msgelem:
+                elem.append(child)
+            elem.text = msgelem.text
+
+x = p.write(outfile, encoding='utf-8', xml_declaration=True)
+
+infh.close()
Index: createappdatacontext.py
===================================================================
--- createappdatacontext.py	(revision 0)
+++ createappdatacontext.py	(working copy)
@@ -0,0 +1,79 @@
+#!/usr/bin/env python
+
+from __future__ import unicode_literals
+import codecs
+import datetime
+from optparse import OptionParser
+import os
+import sys
+import time
+import xml.etree.ElementTree as ET
+
+def potdate():
+    return time.strftime('%Y-%m-%d %H:%M+0000', time.gmtime())
+
+def prepare():
+    sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
+    sys.stderr = codecs.getwriter('utf-8')(sys.stderr)
+    
+    print "#, fuzzy"
+    print "msgid \"\""
+    print "msgstr \"\""
+    print "\"Project-Id-Version: appdata files\\n\""
+    print "\"Report-Msgid-Bugs-To: http://bugs.kde.org\\n\""
+    print "\"POT-Creation-Date: " + potdate() + "\\n\""
+    print "\"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\\n\""
+    print "\"Last-Translator: FULL NAME <EMAIL@ADDRESS>\\n\""
+    print "\"Language-Team: LANGUAGE <kde-i18n-doc@kde.org>\\n\""
+    print "\"MIME-Version: 1.0\\n\""
+    print "\"Content-Type: text/plain; charset=UTF-8\\n\""
+    print "\"Content-Transfer-Encoding: 8bit\\n\""
+    print
+    print
+    
+def processfiles(basedir, files):
+    for filename in files:
+        try:
+            infh = open(os.path.join(basedir, filename))
+        except IOError:
+            sys.stderr.write('Cannot open file {0}'.format(filename))
+            continue
+        
+        root = ET.parse(infh)
+        
+        for context in ('name', 'summary', 'description'):
+            for elem in root.findall(context):
+                if not 'lang' in elem.attrib:
+                    print "#: {0}".format(filename)
+                    print "msgctxt \"{0}\"".format(context)
+                    
+                    if context != 'description':
+                        print "msgid \"{0}\"".format(elem.text)
+                    else:
+                        # remove the indentation to make a nice message for translators
+                        indent = elem.text.lstrip('\n')
+                        msgid = ''.join(ET.tostring(e) for e in elem)
+                        lines = [ line[len(indent):] if line.startswith(indent) else line
+                                 for line in msgid.split('\n') ]
+                        print "msgid \"\"".format(msgid)
+                        for line in lines:
+                            print '"{0}"'.format(line)
+                        
+                    print "msgstr \"\""
+                    print
+        
+        infh.close()
+        
+def main():
+    p = OptionParser()
+    p.add_option('--file-list')
+    p.add_option('--base-dir')
+    options, args = p.parse_args()
+    
+    filelist = open(options.file_list).read().split('\n')
+    
+    prepare()
+    processfiles(options.base_dir, filelist)
+    
+if __name__ == '__main__':
+    main()

Property changes on: createappdatacontext.py
___________________________________________________________________
Added: svn:executable
## -0,0 +1 ##
+*
\ No newline at end of property
Index: findappdatafiles
===================================================================
--- findappdatafiles	(revision 0)
+++ findappdatafiles	(working copy)
@@ -0,0 +1,37 @@
+#! /usr/bin/env bash
+# This file extracts appdata files for translation
+
+if test -z "$1"; then
+  echo "call: $0 <filename>"
+  exit
+fi
+
+filelist=$1
+dir=`dirname $0`
+. $dir/get_paths
+
+ml="`list_modules $dir` l10n"
+
+rm -f "$filelist"_* $filelist
+
+: > $filelist
+
+for mod in $ml; do
+  dir=$BASEDIR/`get_path $mod`
+  if test ! -d $dir; then
+    echo "ERROR: module $mod cannot be found in directory $dir"
+    continue
+  fi
+  echo "$dir"
+  find $dir -name \*.appdata.xml  -a \( -type f -o -type l \) >> $filelist
+  initialdir=`pwd`
+  cd $initialdir
+done
+
+sort -o $filelist -u $filelist
+
+for mod in $ml; do 
+    subfile="$filelist"_$mod
+    fgrep $BASEDIR/`get_path $mod`/ $filelist > $subfile
+done
+
Index: merge_appdata_files.sh
===================================================================
--- merge_appdata_files.sh	(revision 0)
+++ merge_appdata_files.sh	(working copy)
@@ -0,0 +1,57 @@
+#! /bin/bash
+# kate: space-indent on; indent-width 2; replace-tabs on;
+langfile=`tempfile`
+KDEDIR=`tempfile`
+logfile=`tempfile`
+if test -f $KDEDIR; then rm -f $KDEDIR; mkdir $KDEDIR; fi
+export KDEDIR;
+lists=`ls -1 appdata_files_*`
+: > $logfile ;
+languages=`cat subdirs`; 
+for listfile in $lists; do 
+  mod=`echo $listfile | sed -e "s,appdata_files_,,"`
+  : > $langfile ;
+  for lang in $languages; do 
+    file=`find $lang/messages -name "appdata_$mod.po"`
+    if test -z "$file"; then 
+      continue
+    fi
+    charsetline=`egrep "^\"Content-Type: .*/.*;? charset=.*\n\"" $file`
+    if test -z "$charsetline"; then 
+      echo "ERROR: file $file contains no correct charset declaration!"
+      fgrep -i "Content-Type" $file
+      echo "--"
+      continue
+    else
+      charset=`echo $charsetline | sed -e "s#^.*charset=\(.*\)..\"#\1#"`
+      # The Gettext tools are strict about the spelling of UTF-8
+      if test "$charset" != "utf-8" -a "$charset" != "UTF-8"; then
+        echo "ERROR: file $file has non-UTF-8 charset: $charset"
+        continue
+      fi
+    fi
+    mkdir -p $KDEDIR/share/locale/abc/LC_MESSAGES
+    if ! msgfmt $file -o $KDEDIR/share/locale/abc/LC_MESSAGES/apply_$lang.mo; then 
+            echo "ERROR: file $file could not be processed by msgfmt!"
+            continue
+    fi
+    echo $lang >> $langfile
+  done
+  filelanguages=`sort -u $langfile`
+  list=`cat $listfile` 
+  for i in $list; do 
+    if python ./scripts/applyappdatacontext.py $i $filelanguages >> $logfile 2>&1; then
+      if cmp -s $i $i.new; then
+        rm $i.new
+      else
+        chmod --reference=$i $i.new
+        mv -f $i.new $i
+      fi
+    else
+      echo "ERROR: applyappdatacontext.py failed for file $i"
+    fi
+  done
+done
+sort $logfile
+rm -f $logfile $langfile
+rm -rf $KDEDIR
Index: update_translations
===================================================================
--- update_translations	(revision 1368487)
+++ update_translations	(working copy)
@@ -27,6 +27,24 @@
   rm -f desktop.$$ desktop.$$.tmp
 }
 
+extract_appdata() {
+  python ./scripts/createappdatacontext.py --file-list=./$1 --base-dir=$2 > appdata.$$.tmp
+  dest=$3
+  msguniq --to-code=UTF-8 -o appdata.$$ appdata.$$.tmp 2>/dev/null
+  if test -f appdata.$$; then
+    if test ! -f  $dest; then 
+      echo "File $dest is missing!" 
+      mv appdata.$$ $dest
+    elif diff -q -I^\"POT-Creation-Date: appdata.$$ $dest > /dev/null; then
+      rm -f appdata.$$
+      touch $dest
+    else
+      mv appdata.$$ $dest
+    fi
+  fi
+  rm -f appdata.$$ appdata.$$.tmp
+}
+
 postprocess_pot_file()
 {
 # $1: name of the file to process
@@ -330,14 +348,16 @@
     fi
     rm -f templatenames.tmp
     
-    test -z "$VERBOSE1" || echo "creating desktop*.pot files"
+    test -z "$VERBOSE1" || echo "creating (desktop|appdata)*.pot files"
     test -z "$TIMING1" || date
     bash scripts/findfiles `pwd`/all_files 
+    bash scripts/findappdatafiles `pwd`/appdata_files
 
     for mod in $releases l10n; do
       case "$mod" in
         extragear-*_*)
-          extract_desktop all_files_$mod $BASEDIR/`get_path $mod` templates/messages/`get_po_path $mod`/desktop_$mod.pot
+          extract_desktop all_files_$mod $BASEDIR/`get_path $mod` templates/messages/`get_po_path $mod`/desktop_$mod.pot
+          extract_appdata appdata_files_$mod $BASEDIR/`get_path $mod` templates/messages/`get_po_path $mod`/appdata_$mod.pot
           ;;
         extragear-*)
           basedir=$BASEDIR/`get_path $mod`
@@ -345,15 +365,20 @@
           for subdir in $subdirs; do
             mods="$mod""_$subdir"
             fgrep $basedir/$subdir all_files_$mod > all_files_$mods
+            fgrep $basedir/$subdir appdata_files_$mod > appdata_files_$mods
             extract_desktop all_files_$mods $basedir/$subdir templates/messages/$mod/desktop_$mods.pot
+            extract_appdata appdata_files_$mods $basedir/$subdir templates/messages/$mod/appdata_$mods.pot
           done
           rm -f all_files_$mod
+          rm -f appdata_files_$mod
           ;;
         l10n)
           extract_desktop all_files_$mod $BASEDIR/`get_path $mod` templates/messages/kdelibs/desktop_l10n.pot
+          extract_appdata appdata_files_$mod $BASEDIR/`get_path $mod` templates/messages/kdelibs/appdata_l10n.pot
           ;;
         *)
           extract_desktop all_files_$mod $BASEDIR/`get_path $mod` templates/messages/`get_po_path $mod`/desktop_$mod.pot
+          extract_appdata appdata_files_$mod $BASEDIR/`get_path $mod` templates/messages/`get_po_path $mod`/appdata_$mod.pot
           ;;
       esac
     done
@@ -362,7 +387,7 @@
     test -z "$TIMING1" || date
     if cd templates/messages; then
       list=`find . -name desktop\*.pot`
-      # desktop*.pot files have already a correct Content-Type, so we do not need to check it or even to modify it
+      # (desktop|appdata)*.pot files have already a correct Content-Type, so we do not need to check it or even to modify it
       for i in $list; do
         if test ! -f $BASEDIR/backup/templates/messages/$i; then
           echo "Adding desktop*.pot file: $i"
@@ -373,7 +398,7 @@
           cp -f $BASEDIR/backup/templates/messages/$i $i
         fi
       done
-      svn commit $SVNQUIETFLAG -m "SVN_SILENT made messages (desktop*.pot file committed)" > /dev/null
+      svn commit $SVNQUIETFLAG -m "SVN_SILENT made messages (desktop/appdata*.pot file committed)" > /dev/null
       cd ../..
     fi
     
@@ -474,14 +499,15 @@
 
   if true; then
     if cd $transmod; then
-      test -z "$VERBOSE1" || echo "applying desktop file translations"
+      test -z "$VERBOSE1" || echo "applying desktop/appdata file translations"
       test -z "$TIMING1" || date
       # Note: the executable should not be renamed to applycontext to avoid to have to change the script merge_desktop_files.sh
       g++ -O2 -march=nocona -o apply scripts/applycontext.cpp
       bash scripts/merge_desktop_files.sh 
+      bash scripts/merge_appdata_files.sh
       cd $BASEDIR
     fi
-    test -z "$VERBOSE1" || echo "commiting desktop files"
+    test -z "$VERBOSE1" || echo "commiting desktop/appdata files"
     test -z "$TIMING1" || date
     for i in $releases l10n; do
       if cd $BASEDIR/`get_path $i`; then
@@ -489,12 +515,12 @@
         branch=`get_branch $i`
         case "$vcs" in
           svn)
-            if ! svn commit $SVNQUIETFLAG -m "SVN_SILENT made messages (.desktop file)" > /dev/null; then
+            if ! svn commit $SVNQUIETFLAG -m "SVN_SILENT made messages (.desktop and appdata files)" > /dev/null; then
               # If the commit fails, then it means that a file was modified. Normally it will not be a .desktop file
               echo "Need to update $i"
               svn update $SVNQUIETFLAG
-              if ! svn commit $SVNQUIETFLAG -m "SVN_SILENT made messages (.desktop file, second try)"; then
-                echo "ERROR: commiting .desktop files failed for module $i!"
+              if ! svn commit $SVNQUIETFLAG -m "SVN_SILENT made messages (.desktop and appdata files, second try)"; then
+                echo "ERROR: commiting .desktop/appdata files failed for module $i!"
                 svn revert -R .
               fi
             fi
@@ -503,9 +529,9 @@
             if git pull $SVNQUIETFLAG origin $branch; then
               status=`git status -s`
               if [ "x$status" != "x" ]; then
-                if git commit -a $SVNQUIETFLAG -m "SVN_SILENT made messages (.desktop file)"; then
+                if git commit -a $SVNQUIETFLAG -m "SVN_SILENT made messages (.desktop and appdata files)"; then
                   if ! git push $SVNQUIETFLAG origin HEAD:$branch; then
-                    echo "ERROR: commiting .desktop files failed for module $i!"
+                    echo "ERROR: commiting .desktop/appdata files failed for module $i!"
                     git reset --hard $SVNQUIETFLAG origin/$branch
                   fi
                 else
@@ -513,7 +539,7 @@
                 fi
               fi
             else
-              echo "ERROR: commiting .desktop files failed for module $i (possible conflict)"
+              echo "ERROR: commiting .desktop/appdata files failed for module $i (possible conflict)"
               git reset --hard $SVNQUIETFLAG origin/$branch
             fi
             ;;
@@ -525,7 +551,7 @@
       fi
     done
   else
-    echo "Skipping processing of .desktop files"
+    echo "Skipping processing of .desktop/appdata files"
   fi
 
   rm -rf apply all_files* messages


