[prev in list] [next in list] [prev in thread] [next in thread] 

List:       kwrite-devel
Subject:    Fwd: using google-diff-match-patch
From:       Dominik Haumann <dhaumann () kde ! org>
Date:       2022-06-15 12:18:53
Message-ID: CALi_srBABvSUGEXYDXYeEWnv3oXywqS__u4byHPfMesfz+Z-ag () mail ! gmail ! com
[Download RAW message or body]

Hi all,

This is a recap mail from 8 years ago for
https://invent.kde.org/utilities/kate/-/merge_requests/737

Just in case it's useful...

Best regards
Dominik


---------- Forwarded message ---------
Von: Michal Humpula <michal.humpula@seznam.cz>
Date: Fr., 17. Jan. 2014, 13:12
Subject: Re: using google-diff-match-patch
To: Dominik Haumann <dhaumann@kde.org>
Cc: <kwrite-devel@kde.org>


And here comes the update

With the little tweaks to cpp code, I was able to push down the execution
time
for my test example from 8s to 4s (diff binary is able to execute in 0.5s).

After this, it seems that most of the time is spent converting from
QByteArray
to QString and back and creating temporary QStrings. IMHO this could be
speed
up considerably by switching to const QByteArray altogether and just
working
with references.

The whole output of the diff process is escaped with
QByteArray::toPercentEncoding, which currently takes half of the execution
time. But I have no idea, why is that necessary.

For comparison, I've run the same test with python and lua version:

Python: 13s
Lua: 18s
Java: 23s

Seems like general problem with algorithm to me.

On Wednesday 15 of January 2014 13:34:07 Dominik Haumann wrote:
> On Wednesday, January 15, 2014 12:11:24 Michal Humpula wrote:
> > did a quick speed test. Compiled with -O2, Qt4.8.6 (no _FAST_ switches).
> > Testing just diff_main routine
> >
> > Two 3,6k lines cpp files (one kate source file) took 4ms per diff.
> > Two ~11MB cpp files (100 concatenation off above files) took 3,9s per
> > diff.
> >
> > The numbers are correct, I've checked that twice. Seems like it doesn't
> > scale in linear time. Just for compare, the "diff" program can generate
> > output for the second run in 400ms, so there is definitely a space for
> > improvement.
>
> Run it through `valgrind --tool=callgrind <app>` and analyze the output
file
> with kcachegrind.
>
> Or with perf record <app> and then perf report.
>
> Maybe there is some Qt class or similar ill used?
>
> Greetings,
> Dominik
>
> > On Tuesday 14 of January 2014 10:57:27 Neil Fraser wrote:
> > > I have absolutely no objections to any open source use of diff match
> > > patch.
> > >
> > > That said, before you invest too heavily in this library, please
> > > verify that the C++/Qt version of diff match patch meets your
> > > performance needs.  I've never been happy with this port, in my tests
> > > it has performed slower than the JavaScript version.  One of these
> > > days I'd like to replace it with a pure C version.
> > >
> > > On 14 January 2014 05:48, Dominik Haumann <dhaumann@kde.org> wrote:
> > > > Dear Neil,
> > > >
> > > > I am writing to you on behalf of the KDE/Kate project [1]. Kate is
an
> > > > advanced text editor under KDE with lots of features for
programmers.
> > > >
> > > > One feature is showing a diff. For instance, if a file changed on
> > > > disk,
> > > > we
> > > > currently launch the external `diff` process to get the diff of the
> > > > text
> > > > buffer and the file on disk, and then show it graphically.
> > > > However, this is not portable and to some degree slow.
> > > >
> > > > Therefore, we are looking for a C++/Qt solution we can use to
generate
> > > > the
> > > > diff. Since google-diff-match-patch has a C++/Qt interface, it looks
> > > > like
> > > > the perfect match. However, Kate is licensed under "LGPL version 2
or
> > > > later".
> > > >
> > > > Is there any chance we are allowed to use google-diff-match-patch?
> > > > As far as we know, the Apache 2 license is incompatible with LGPLv2.
> > > >
> > > > Would it be possible to put google-diff-match-patch into a library,
> > > > and
> > > > then dynamically link the text editor to this library? Or would that
> > > > still violate licensing?
> > > >
> > > > We're very much interested in finding a solution and clarifying
> > > > possible
> > > > licensing conflicts.
> > > >
> > > > CC for reference: kwrite-devel@kde.org
> > > >
> > > > Thanks in advance,
> > > > Dominik & Kate developers
> > > >
> > > > [1] http://kate-editor.org/
> >
> > _______________________________________________
> > KWrite-Devel mailing list
> > KWrite-Devel@kde.org
> > https://mail.kde.org/mailman/listinfo/kwrite-devel

[Attachment #3 (text/html)]

<div dir="auto"><div>Hi all,</div><div dir="auto"><br></div><div dir="auto">This is a \
recap mail from 8 years ago for<br><a \
href="https://invent.kde.org/utilities/kate/-/merge_requests/737">https://invent.kde.org/utilities/kate/-/merge_requests/737</a></div><div \
dir="auto"><br></div><div dir="auto">Just in case it&#39;s useful...</div><div \
dir="auto"><br></div><div dir="auto">Best regards</div><div \
dir="auto">Dominik</div><div dir="auto"><br></div><div dir="auto"><br><div \
class="gmail_quote" dir="auto"><div dir="ltr" class="gmail_attr">---------- Forwarded \
message ---------<br>Von: <strong class="gmail_sendername" dir="auto">Michal \
Humpula</strong> <span dir="auto">&lt;<a \
href="mailto:michal.humpula@seznam.cz">michal.humpula@seznam.cz</a>&gt;</span><br>Date: \
Fr., 17. Jan. 2014, 13:12<br>Subject: Re: using google-diff-match-patch<br>To: \
Dominik Haumann &lt;<a href="mailto:dhaumann@kde.org">dhaumann@kde.org</a>&gt;<br>Cc: \
&lt;<a href="mailto:kwrite-devel@kde.org">kwrite-devel@kde.org</a>&gt;<br></div><br><br>And \
here comes the update<br> <br>
With the little tweaks to cpp code, I was able to push down the execution time <br>
for my test example from 8s to 4s (diff binary is able to execute in 0.5s).<br>
<br>
After this, it seems that most of the time is spent converting from QByteArray <br>
to QString and back and creating temporary QStrings. IMHO this could be speed <br>
up considerably by switching to const QByteArray altogether and just working <br>
with references.<br>
<br>
The whole output of the diff process is escaped with <br>
QByteArray::toPercentEncoding, which currently takes half of the execution <br>
time. But I have no idea, why is that necessary.<br>
<br>
For comparison, I&#39;ve run the same test with python and lua version:<br>
<br>
Python: 13s<br>
Lua: 18s<br>
Java: 23s<br>
<br>
Seems like general problem with algorithm to me.<br>
<br>
On Wednesday 15 of January 2014 13:34:07 Dominik Haumann wrote:<br>
&gt; On Wednesday, January 15, 2014 12:11:24 Michal Humpula wrote:<br>
&gt; &gt; did a quick speed test. Compiled with -O2, Qt4.8.6 (no _FAST_ \
switches).<br> &gt; &gt; Testing just diff_main routine<br>
&gt; &gt; <br>
&gt; &gt; Two 3,6k lines cpp files (one kate source file) took 4ms per diff.<br>
&gt; &gt; Two ~11MB cpp files (100 concatenation off above files) took 3,9s per<br>
&gt; &gt; diff.<br>
&gt; &gt; <br>
&gt; &gt; The numbers are correct, I&#39;ve checked that twice. Seems like it \
doesn&#39;t<br> &gt; &gt; scale in linear time. Just for compare, the \
&quot;diff&quot; program can generate<br> &gt; &gt; output for the second run in \
400ms, so there is definitely a space for<br> &gt; &gt; improvement.<br>
&gt; <br>
&gt; Run it through `valgrind --tool=callgrind &lt;app&gt;` and analyze the output \
file<br> &gt; with kcachegrind.<br>
&gt; <br>
&gt; Or with perf record &lt;app&gt; and then perf report.<br>
&gt; <br>
&gt; Maybe there is some Qt class or similar ill used?<br>
&gt; <br>
&gt; Greetings,<br>
&gt; Dominik<br>
&gt; <br>
&gt; &gt; On Tuesday 14 of January 2014 10:57:27 Neil Fraser wrote:<br>
&gt; &gt; &gt; I have absolutely no objections to any open source use of diff \
match<br> &gt; &gt; &gt; patch.<br>
&gt; &gt; &gt; <br>
&gt; &gt; &gt; That said, before you invest too heavily in this library, please<br>
&gt; &gt; &gt; verify that the C++/Qt version of diff match patch meets your<br>
&gt; &gt; &gt; performance needs.   I&#39;ve never been happy with this port, in my \
tests<br> &gt; &gt; &gt; it has performed slower than the JavaScript version.   One \
of these<br> &gt; &gt; &gt; days I&#39;d like to replace it with a pure C \
version.<br> &gt; &gt; &gt; <br>
&gt; &gt; &gt; On 14 January 2014 05:48, Dominik Haumann &lt;<a \
href="mailto:dhaumann@kde.org" target="_blank" \
rel="noreferrer">dhaumann@kde.org</a>&gt; wrote:<br> &gt; &gt; &gt; &gt; Dear \
Neil,<br> &gt; &gt; &gt; &gt; <br>
&gt; &gt; &gt; &gt; I am writing to you on behalf of the KDE/Kate project [1]. Kate \
is an<br> &gt; &gt; &gt; &gt; advanced text editor under KDE with lots of features \
for programmers.<br> &gt; &gt; &gt; &gt; <br>
&gt; &gt; &gt; &gt; One feature is showing a diff. For instance, if a file changed \
on<br> &gt; &gt; &gt; &gt; disk,<br>
&gt; &gt; &gt; &gt; we<br>
&gt; &gt; &gt; &gt; currently launch the external `diff` process to get the diff of \
the<br> &gt; &gt; &gt; &gt; text<br>
&gt; &gt; &gt; &gt; buffer and the file on disk, and then show it graphically.<br>
&gt; &gt; &gt; &gt; However, this is not portable and to some degree slow.<br>
&gt; &gt; &gt; &gt; <br>
&gt; &gt; &gt; &gt; Therefore, we are looking for a C++/Qt solution we can use to \
generate<br> &gt; &gt; &gt; &gt; the<br>
&gt; &gt; &gt; &gt; diff. Since google-diff-match-patch has a C++/Qt interface, it \
looks<br> &gt; &gt; &gt; &gt; like<br>
&gt; &gt; &gt; &gt; the perfect match. However, Kate is licensed under &quot;LGPL \
version 2 or<br> &gt; &gt; &gt; &gt; later&quot;.<br>
&gt; &gt; &gt; &gt; <br>
&gt; &gt; &gt; &gt; Is there any chance we are allowed to use \
google-diff-match-patch?<br> &gt; &gt; &gt; &gt; As far as we know, the Apache 2 \
license is incompatible with LGPLv2.<br> &gt; &gt; &gt; &gt; <br>
&gt; &gt; &gt; &gt; Would it be possible to put google-diff-match-patch into a \
library,<br> &gt; &gt; &gt; &gt; and<br>
&gt; &gt; &gt; &gt; then dynamically link the text editor to this library? Or would \
that<br> &gt; &gt; &gt; &gt; still violate licensing?<br>
&gt; &gt; &gt; &gt; <br>
&gt; &gt; &gt; &gt; We&#39;re very much interested in finding a solution and \
clarifying<br> &gt; &gt; &gt; &gt; possible<br>
&gt; &gt; &gt; &gt; licensing conflicts.<br>
&gt; &gt; &gt; &gt; <br>
&gt; &gt; &gt; &gt; CC for reference: <a href="mailto:kwrite-devel@kde.org" \
target="_blank" rel="noreferrer">kwrite-devel@kde.org</a><br> &gt; &gt; &gt; &gt; \
<br> &gt; &gt; &gt; &gt; Thanks in advance,<br>
&gt; &gt; &gt; &gt; Dominik &amp; Kate developers<br>
&gt; &gt; &gt; &gt; <br>
&gt; &gt; &gt; &gt; [1] <a href="http://kate-editor.org/" rel="noreferrer noreferrer" \
target="_blank">http://kate-editor.org/</a><br> &gt; &gt; <br>
&gt; &gt; _______________________________________________<br>
&gt; &gt; KWrite-Devel mailing list<br>
&gt; &gt; <a href="mailto:KWrite-Devel@kde.org" target="_blank" \
rel="noreferrer">KWrite-Devel@kde.org</a><br> &gt; &gt; <a \
href="https://mail.kde.org/mailman/listinfo/kwrite-devel" rel="noreferrer noreferrer" \
target="_blank">https://mail.kde.org/mailman/listinfo/kwrite-devel</a><br> <br>
</div></div></div>



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic