
List:       kde-kimageshop
Subject:    Re: Sphinx Application Documentation - Image duplication
From:       "Julius Künzel" <jk.kdedev () smartlab ! uber ! space>
Date:       2023-01-22 18:51:36
Message-ID: c28484ef483273bb4c4f65c8a8f6f88b55292311 () smartlab ! uber ! space

Hi Ben, hi all,

I did a little research on this recently, and unfortunately it seems to me that
there is no real solution on the Sphinx side. Each language needs its own build
directory, and Sphinx copies all static files (CSS, JS, images, ...) into every
build directory. That's just how it works :-/ (Correct me if anyone knows I am wrong.)
However, we can of course try to solve this on our end and make our deploy tools
smart enough to keep only one copy of each image file and replace the others
with symlinks. Translated images should be fairly easy to detect, since they
follow the pattern filename.de.png, where "de" is the language code: that image
is specific to German, while all other languages use filename.png.
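
To make the deploy-side idea a bit more concrete, here is a minimal sketch in
Python. It is not an existing KDE deploy tool. It assumes the layout visible in
Ben's sha256sum output below (<site>/<lang>/_images/<file>) and treats "en" as
the reference language, which is my assumption. Instead of relying on the
filename.de.png pattern, it compares SHA-256 hashes, so a translated image is
simply left alone because its content differs or because it has no
counterpart under "en". The names file_digest and dedupe_images are
hypothetical.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def dedupe_images(site_root: Path, reference_lang: str = "en") -> int:
    """Replace images identical to the reference language's copy with symlinks.

    Assumed layout: <site_root>/<lang>/_images/<file>.
    Returns the number of files that were replaced.
    """
    ref_dir = site_root / reference_lang / "_images"
    replaced = 0
    for lang_dir in sorted(p for p in site_root.iterdir() if p.is_dir()):
        if lang_dir.name == reference_lang:
            continue
        images = lang_dir / "_images"
        if not images.is_dir():
            continue
        for img in sorted(images.iterdir()):
            ref = ref_dir / img.name
            # Skip existing symlinks, non-files, and images that have no
            # counterpart in the reference language.
            if img.is_symlink() or not img.is_file() or not ref.is_file():
                continue
            if file_digest(img) != file_digest(ref):
                continue  # content differs, e.g. a translated screenshot
            # Relative target, so the tree can be moved as a whole.
            target = os.path.relpath(ref, start=img.parent)
            img.unlink()
            img.symlink_to(target)
            replaced += 1
    return replaced
```

It would be run once per generated site after each build, for example
dedupe_images(Path("/srv/www/generated/docs.krita.org")). Whether the web
server is configured to follow symlinks is something I have not checked.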

I hope that helps so far. I might be able to look into this, but probably not very
soon, so if anybody else can work on this I would be more than happy.

Cheers,
Julius




On 15 January 2023 at 07:45, "Ben Cooksley" <bcooksley@kde.org> wrote:


> 
> Hi all,
> 
> For some time now it has been known to me that the system for generating
> application documentation websites using Sphinx with l10n support has had issues
> with duplicating data - particularly images.
>
> That leads to the following outcome, where aside from sites that we expect to be
> quite large (like www.kde.org and api.kde.org) all of the application
> documentation sites are quite big as well:
>
> root@nicoda /srv/www # du -h --max-depth=1 ./generated/ | grep G
> 2.3G      ./generated/cutehmi.kde.org
> 3.7G      ./generated/docs.digikam.org
> 2.4G      ./generated/api.kde.org
> 2.3G      ./generated/docs.krita.org
> 1.4G      ./generated/www.kde.org
> 7.9G      ./generated/docs.kdenlive.org
> 29G       ./generated/
> 
> This stands in comparison to the Docbook documentation site for all other KDE
> applications:
>
> root@nicoda /srv/www # du -h --max-depth=1 . | grep G
> 29G       ./generated
> 16G       ./api.kde.org-legacy
> 6.0G      ./docs.kde.org
> 51G       .
> 
> It would be nice if we could please look into some fixes for this, as it looks like
> Sphinx is duplicating the images - once for every language - when that isn't
> necessary. I could understand if the screenshots were updated as part of the
> translation, but it looks like they're not in the majority of cases - below being
> just a sample:
>
> root@nicoda /srv/www/generated/docs.krita.org # sha256sum zh_CN/_images/Krita_cpb_mixing.gif
> 12eb4cbad29a5a6486d3438dabb888a0aa0b9579e55b3be2f3c1d6e1d76fc1d7  zh_CN/_images/Krita_cpb_mixing.gif
> root@nicoda /srv/www/generated/docs.krita.org # sha256sum en/_images/Krita_cpb_mixing.gif
> 12eb4cbad29a5a6486d3438dabb888a0aa0b9579e55b3be2f3c1d6e1d76fc1d7  en/_images/Krita_cpb_mixing.gif
>
> While this isn't a massive issue right now, it is a future scalability issue as for
> Krita at least each language costs 178MB or so, while for Digikam that sits at
> 415MB per language and Kdenlive is 392MB.
>
> Many thanks,
> Ben
> 


Julius Künzel
Volunteer KDE Developer, mainly hacking Kdenlive
KDE GitLab: https://my.kde.org/user/jlskuz/
Matrix: @jlskuz:kde.org




