
List:       kde-i18n-doc
Subject:    [PATCH] [poxml] make internal entities translatable
From:       Marc Mutz <mutz () kde ! org>
Date:       2008-05-28 22:36:16
Message-ID: 200805290036.19343.mutz () kde ! org



[xposted to core-devel]

Hi,

You might remember me as the guy who, four years ago, used internal entities 
in kleopatra/index.docbook to have a handy shortcut for the menu entries:
  <!ENTITY view-redisplay "<menuchoice><shortcut><keycombo action='simul'>
  <keycap>F5</keycap></keycombo></shortcut>
  <guimenu>View</guimenu><guimenuitem>Redisplay</guimenuitem></menuchoice>">
...
  Use &view-redisplay; to foo bar.

Up to now, these entities were simply removed by the poxml parser, which 
made their contents inaccessible to translators. That, in turn, made it 
impossible for me to use these handy shortcuts.

Since that is clearly an unacceptable state, and since I think that 
translators could also save a bunch of work if these shortcuts were made 
available to them, I sat down and, after a few hours of staring at 
incomprehensible code, came up with the attached patch, which fixes this.

In order to be minimally intrusive, I've only enabled entity extraction for 
entities whose names start with "i18n-". That way, the standard entities such 
as kappname and language don't clutter the .po(t). It would be trivial to 
extract all internal entities, though.
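
To illustrate the prefix filter outside of Qt, here is a minimal sketch in 
standard C++. It is not the code from the patch: extractEntities and its 
prefix parameter are made-up names, and it uses std::regex where the patch 
uses QRegExp. It also assumes that the closing quote of the replacement text 
is the first occurrence of the opening quote character.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of poxml): return the complete declarations
// of all internal entities whose name starts with `prefix`.
std::vector<std::string> extractEntities(const std::string &doc,
                                         const std::string &prefix = "i18n-") {
    // Matches "<!ENTITY name " up to and including the opening quote.
    static const std::regex rx("<!ENTITY\\s+(\\S+)\\s+([\"'])");
    std::vector<std::string> result;
    auto searchFrom = doc.cbegin();
    std::smatch m;
    while (std::regex_search(searchFrom, doc.cend(), m, rx)) {
        const std::string name = m[1].str();
        const char delim = m[2].str()[0];
        const std::size_t start =
            static_cast<std::size_t>(m[0].first - doc.cbegin());
        const std::size_t afterOpen =
            static_cast<std::size_t>(m[0].second - doc.cbegin());
        // Find the closing quote first, then the '>' that ends the
        // declaration; the replacement text itself may contain '>'.
        const std::size_t close = doc.find(delim, afterOpen);
        if (close == std::string::npos)
            break; // malformed declaration: stop scanning
        const std::size_t end = doc.find('>', close);
        if (end == std::string::npos)
            break;
        if (name.compare(0, prefix.size(), prefix) == 0)
            result.push_back(doc.substr(start, end - start + 1));
        searchFrom = doc.cbegin() + static_cast<std::ptrdiff_t>(end) + 1;
    }
    return result;
}
```

Given a DTD subset that declares both kappname and i18n-file-quit, this 
returns only the i18n-file-quit declaration. Parameter entities such as 
"% English" do not match the pattern and are skipped.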

The patch is forward- and backward-compatible. Old po2xml will ignore the 
new data, and new po2xml will leave an entity definition in peace if there's 
no translation for it.
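
The fallback rule can be sketched like this in standard C++. This is only an 
illustration under my own assumptions, not po2xml code: resolveEntity and the 
catalog map (msgid to msgstr) are hypothetical names, and an empty msgstr is 
treated as "no translation".

```cpp
#include <map>
#include <string>

// Hypothetical helper (not po2xml code): use the translated entity
// declaration if the catalog has a non-empty msgstr for it, otherwise
// keep the original declaration untouched.
std::string resolveEntity(const std::string &original,
                          const std::map<std::string, std::string> &catalog) {
    const auto it = catalog.find(original);
    if (it == catalog.end() || it->second.empty())
        return original; // no translation: leave the definition alone
    return it->second;
}
```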

One thing I haven't tested is the effect on the other two programs of the 
poxml suite, split2po and swappo. I just don't know what they do...

I'd like to ask for permission to apply this patch now, in the feature freeze, 
because I think it's a bugfix, and because poxml is a development tool that's 
hardly user-visible, so the freeze shouldn't apply to it.

Another reason is that I intend to make much, much use of this new facility 
when updating kleopatra/index.docbook for Kleopatra 2.0.

I'll commit in ~24h unless someone objects.

Please keep my @kdab address CC'ed.

Thanks,
Marc


["poxml-internal-entities.diff" (text/x-diff)]

Index: parser.h
===================================================================
--- parser.h	(revision 813020)
+++ parser.h	(working copy)
@@ -64,6 +64,12 @@
     void addAnchor(QString anchor) { anchors.insert(anchor, current); }
     void increasePara() { current++; }
 
+    ParaCounter & operator+=( const ParaCounter & other ) {
+        current += other.current;
+        anchors.unite( other.anchors );
+        return *this;
+    }
+
     QMap<QString, int> anchors;
     int current;
 };
@@ -73,6 +79,12 @@
 public:
     MsgList() {}
     ParaCounter pc;
+
+    MsgList & operator+=( const MsgList & other ) {
+        Q3ValueList<MsgBlock>::operator+=( other );
+        pc += other.pc;
+        return *this;
+    }
 };
 
 class StructureParser : public QXmlDefaultHandler
Index: lauri.po
===================================================================
--- lauri.po	(revision 813020)
+++ lauri.po	(working copy)
@@ -14,6 +14,12 @@
 "Content-Transfer-Encoding: 8bit\n"
 "X-Generator: KBabel 0.9.2\n"
 
+#. Tag: !ENTITY
+#: lauri.xml:3
+#, no-c-format
+msgid "<!ENTITY i18n-file-quit \"<menuchoice><shortcut><keycombo action='simul'>&Ctrl;<keycap>Q</keycap></keycombo></shortcut><guimenu>File</guimenu><guimenuitem>Quit</guimenuitem></menuchoice>\">"
+msgstr "<!ENTITY i18n-file-quit \"<menuchoice><shortcut><keycombo action='simul'>&Ctrl;<keycap>Q</keycap></keycombo></shortcut><guimenu>Datei</guimenu><guimenuitem>Beenden</guimenuitem></menuchoice>\">"
+
 #. Tag: title
 #: lauri.xml:16
 #, no-c-format
Index: lauri.xml
===================================================================
--- lauri.xml	(revision 813020)
+++ lauri.xml	(working copy)
@@ -1,7 +1,7 @@
 <?xml version="1.0" ?>
 <!DOCTYPE book PUBLIC "-//KDE//DTD DocBook XML V4.2-Based Variant V1.1//EN" "dtd/kdex.dtd" [
 <!ENTITY % English "INCLUDE" > <!-- change language only here -->
-  <!ENTITY lauri "<emphasis>Lauri</emphasis>" >
+  <!ENTITY lauri "<emphasis>Lauri</emphasis>" > <!ENTITY i18n-file-quit "<menuchoice><shortcut><keycombo action='simul'>&Ctrl;<keycap>Q</keycap></keycombo></shortcut><guimenu>File</guimenu><guimenuitem>Quit</guimenuitem></menuchoice>">
  ]>
 
 <book>
Index: parser.cpp
===================================================================
--- parser.cpp	(revision 813020)
+++ parser.cpp	(working copy)
@@ -11,6 +11,17 @@
 
 using namespace std;
 
+static int countRev( const QString & str, QChar ch, int idx ) {
+    if ( idx < 0 )
+        idx += str.length();
+    if ( idx >= str.length() )
+        idx = str.length() - 1;
+    int count = 0;
+    for ( int i = 0 ; i <= idx ; ++i )
+        count += ( str[i] == ch );
+    return count;
+}
+
 static const char *singletags[] = {"beginpage","imagedata", "colspec", "spanspec",
                                    "anchor", "xref", "area",
                                    "footnoteref", "void", "inlinegraphic",
@@ -931,6 +942,38 @@
     QString contents = QString::fromUtf8( ccontents );
     StructureParser::cleanupTags(contents);
 
+    MsgList english;
+    {
+        // find internal entities that start with "i18n-", and extract
+        // their replacement texts:
+        QRegExp rx( "<!ENTITY\\s+([^\\s]+)\\s+([\"'])" );
+        for ( int index = rx.indexIn( contents, 0 ) ; index >= 0 ; index = rx.indexIn( contents, index ) ) {
+            const QString name = rx.cap( 1 );
+            const QChar delim = rx.cap( 2 ).at( 0 );
+            const int start = index;
+            index = contents.indexOf( delim, index + rx.matchedLength() );
+            index = contents.indexOf( '>', index );
+            if ( !name.startsWith( "i18n-" ) )
+                continue;
+            const QString entity = contents.mid( start, index - start + 1 );
+            MsgBlock block;
+            block.tag = "!ENTITY";
+            BlockInfo bi;
+            bi.start_line = countRev( contents, '\n', start ) + 1;
+            bi.start_col  = start - contents.lastIndexOf( '\n', start ) - 1;
+            bi.end_line   = bi.start_line + entity.count( '\n' );
+            bi.end_col    = index - contents.lastIndexOf( '\n', index ) + 1;
+#ifdef POXML_DEBUG
+            qDebug( "ENTITY %s @ i:%d l:%d c:%d->l:%d c:%d", qPrintable( name ),
+                    index, bi.start_line, bi.start_col, bi.end_line, bi.end_col );
+#endif
+            block.lines.push_back( bi );
+            block.msgid = entity;
+            english.push_back( block );
+        }
+    }
+
+    // Remove all entity definitions now:
     while (true) {
         int index = contents.find("<!ENTITY");
         if (index < 0)
@@ -966,7 +1009,7 @@
     reader.setDTDHandler( &handler );
     // reader.setErrorHandler( &handler );
     reader.parse( source );
-    MsgList english = handler.getList();
+    english += handler.getList();
 
     bool changed = false;
 


