
List:       fedora-docs-list
Subject:    Re: Legacy document translations
From:       Pete Travis <me () petetravis ! com>
Date:       2016-10-11 15:42:41
Message-ID: CAPDshLsWN+EuKcFwX8-JpM+FCh6StabcJo8=ALOzjwjXuL4Fkw () mail ! gmail ! com


On Oct 11, 2016 10:23, "Shaun McCance" <shaunm@gnome.org> wrote:
>
> Hi all,
>
> I've been working on getting translations from Zanata and merging them
> into DocBook. There are two big issues, and I'd like to propose an
> alternative for legacy documents. Here are the issues:
>
> * Pulling from Zanata is slow. It's basically just a bunch of HTTP
> calls, at least one per language per XML file. I don't see any way to
> use etags or similar to avoid redownloading the same content. I had a
> brief chat with bex on IRC about possibly having a git mirror of the PO
> files. That would be faster.
>

Could we solve this by periodically pulling POs into the release branches
of our docs? I'm picturing a nightly script that checks out, e.g., the f25
branch, pulls a language, runs some tests, commits if the tests pass, and
moves on to the next language. I read the suggestion as using a separate git
repo, which seems unnecessarily complex.
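
To make the nightly idea concrete, here is a rough sketch of the loop, not a
finished tool. The function name sync_translations and the pull/check
callables are hypothetical: pull(lang) would wrap whatever Zanata client
invocation we settle on, and check(lang) would wrap our tests (for example a
PO syntax check or a trial build). Only the git commands are real.

```python
import subprocess
from typing import Callable, Iterable, List


def run(cmd: List[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(cmd).returncode == 0


def sync_translations(
    branch: str,
    languages: Iterable[str],
    pull: Callable[[str], bool],    # hypothetical: fetch POs for one language
    check: Callable[[str], bool],   # hypothetical: run tests for one language
    run_cmd: Callable[[List[str]], bool] = run,
) -> List[str]:
    """Pull, test, and commit one language at a time on a release branch.

    Returns the languages for which a commit was made.
    """
    if not run_cmd(["git", "checkout", branch]):
        raise RuntimeError(f"could not check out {branch}")

    committed = []
    for lang in languages:
        if pull(lang) and check(lang):
            run_cmd(["git", "add", "--all"])
            # git commit exits non-zero when there is nothing to commit,
            # so unchanged languages are simply not recorded.
            if run_cmd(["git", "commit", "-m", f"Update {lang} translations"]):
                committed.append(lang)
        else:
            # Discard modifications to tracked files before the next language.
            run_cmd(["git", "checkout", "--", "."])
    return committed
```

A cron job could call sync_translations("f25", ["fr", "de", "ja"], pull, check)
once a night. A language that fails its tests is skipped without blocking the
others, which is the point of committing per language.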

> * Merging requires Publican, because the merge code lives there. There
> are two possible ways around this:
>
> 1) We pull Publican's Translate.pm into a standalone module and have a
> tool ("publican-po"?) that just does PO extraction and merging exactly
> the way Publican does. We'd have to maintain this, but it would be a
> lower maintenance burden than all of Publican.
>
> 2) We merge with itstool instead. itstool's PO files don't exactly
> match Publican's. So a 100% translated document might drop to 90% or
> so. I could probably write custom ITS rules that would make it match
> better. I don't know if I could get it to match 100%.
>
Can you elaborate on what itstool does not do? Entities? I like the idea
of using an established tool rather than a partial fork; perhaps a little
additional processing will get us there.

> So, an alternative: For any documents that are no longer edited in any
> way, we could do a one time merge of all translations and just put it
> in git on that branch. That way there's no downloading (aside from the
> git clone we do anyway), no merging, and no maintaining a legacy merge
> tool going forward.
>
> The downside is that we'd be putting a lot more content in git, which
> could slow down git clones. Alternatively, we could put them all in a
> separate repo. For example, all release-notes translations could go
> into a new repo called release-notes-translations.
>
> Thoughts?
>
> --
> Shaun
>

OK, you did get to the separate repo question. Time spent fetching remote
refs seems to be the only downside to continuing our POs-in-release-branch
SOP. I don't see enough need for speed in the process to warrant the
increased procedural and architectural complexity. IMO, publishing the
source language and the translated languages asynchronously would be fine.

That said, I have not personally done a multi-language build with pintail,
so there may well be something I'm missing.

-- Pete



