List:       koffice-devel
Subject:    Re: Quick question
From:       Jos van den Oever <jos.van.den.oever () kogmbh ! com>
Date:       2010-08-24 14:30:21
Message-ID: 201008241630.21479.jos.van.den.oever () kogmbh ! com

On Tuesday, August 24, 2010 16:18:10 pm Dr. Robert Marmorstein wrote:
> > The overhead of sending complete ODF documents doing real-time editing is
> > often too large. So many implementations send 'patches' to and fro. The
> > format of the patches should be agreed upon. Operational transforms is a
> > way to create patches that are pretty good for merging on trees that have
> > changed in the meantime and also quite efficient in a wire protocol.
> > 
> > Cheers,
> > Jos
> 
> This is a problem I have been thinking about recently -- but for a
> different reason.  I am using "Unison" to synchronize files on various
> computers with my USB key.  Unfortunately, while unison can show a diff of
> two text files, it can't do anything with ODF documents.  I have a lot of
> old ODF documents floating around that I would like to be able to compare
> with each other.  There is an "xmldiff" program floating around, but it
> would take a lot of work to make it usable on ODF files.
> 
> This idea of using "patches" to handle collaboration might work really well
> with an "odfdiff" kind of utility that would make it easy to display the
> differences between two ODF documents.

Showing the differences between documents is not the same as creating and
applying patches. The former takes more work than the latter, because a patch
can assume that it is applied to a very similar working copy.

To display differences between XML files, you first have to canonicalize
them [1]. That means formatting them in a standard way, which includes, among
other things, ordering the attributes of each element and normalizing
whitespace.
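
As a rough illustration of that step, here is a minimal Python sketch. It
relies on xml.etree.ElementTree.canonicalize from the standard library
(Python 3.8 or later), which implements C14N 2.0 rather than the C14N 1.1
of [1], so treat it as an approximation. The helper name canonical_form and
the sample strings are made up for this example, and strip_text=True is an
optional extra that trims whitespace around text content.

```python
from xml.etree.ElementTree import canonicalize


def canonical_form(xml_text: str) -> str:
    """Return a canonical serialization of xml_text (hypothetical helper).

    Attributes come out in a fixed order and quoting is uniform, so two
    files that differ only in such details serialize identically.
    """
    # strip_text=True also trims leading/trailing whitespace in text nodes.
    return canonicalize(xml_text, strip_text=True)


# Same content, different attribute order, quoting, and whitespace.
version_a = '<p b="2" a="1">  hello </p>'
version_b = "<p a='1'   b='2'>hello</p>"
# A real change in the text.
version_c = "<p a='1' b='2'>goodbye</p>"

same = canonical_form(version_a) == canonical_form(version_b)
changed = canonical_form(version_a) != canonical_form(version_c)
```

Once both files are in this form, an ordinary line-based diff tool has far
fewer spurious differences to report.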

For ODF, there is more to do before you can compare two documents. You'd
have to put all elements whose order does not normally matter, e.g. style
definitions, into a standardized order. Cross-references such as the names
of automatic styles would also have to be normalized. If two automatic
styles P1 and P2 have their names swapped between two versions, but are
otherwise unaltered and applied to the same paragraphs, you do not want your
diff tool to mention them.
Then you need to compare embedded files, but that's not such a big problem.
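
One possible way to handle the automatic style names, sketched in Python
under several assumptions: each automatic style is renamed to a digest of
its own definition (ignoring the old name), references are rewritten to
match, and the style definitions are sorted by the new name. The function
name normalize_automatic_styles, the "auto-" prefix, and the tiny sample
document are invented for this example. Only text:style-name references
inside a content.xml-like document are handled; a real tool would need to
cover the other kinds of style reference as well.

```python
import copy
import hashlib
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
FO = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"

STYLE_NAME = f"{{{STYLE}}}name"
TEXT_STYLE_NAME = f"{{{TEXT}}}style-name"


def _canonical(element: ET.Element) -> str:
    # rewrite_prefixes=True makes the result independent of prefix choice.
    raw = ET.tostring(element, encoding="unicode")
    return ET.canonicalize(raw, strip_text=True, rewrite_prefixes=True)


def normalize_automatic_styles(content_xml: str) -> str:
    """Rename automatic styles by content and return canonical XML."""
    root = ET.fromstring(content_xml)
    auto = root.find(f"{{{OFFICE}}}automatic-styles")
    renames = {}
    if auto is not None:
        for style in auto.findall(f"{{{STYLE}}}style"):
            old = style.get(STYLE_NAME)
            if old is None:
                continue
            # Hash the definition without its (arbitrary) name.
            probe = copy.deepcopy(style)
            probe.attrib.pop(STYLE_NAME, None)
            probe.tail = None
            digest = hashlib.sha1(_canonical(probe).encode("utf-8")).hexdigest()
            new = "auto-" + digest[:8]
            renames[old] = new
            style.set(STYLE_NAME, new)
        # Order of style definitions does not matter, so sort them.
        # Note: identical styles end up with the same name in this sketch.
        auto[:] = sorted(auto, key=lambda s: s.get(STYLE_NAME, ""))
    # Rewrite references (assumption: only text:style-name is covered).
    for element in root.iter():
        ref = element.get(TEXT_STYLE_NAME)
        if ref in renames:
            element.set(TEXT_STYLE_NAME, renames[ref])
    return _canonical(root)


def sample(p1_props: str, p2_props: str, first: str, second: str) -> str:
    # Made-up miniature content.xml with two automatic paragraph styles.
    return (
        f'<office:document-content xmlns:office="{OFFICE}" '
        f'xmlns:style="{STYLE}" xmlns:text="{TEXT}" xmlns:fo="{FO}">'
        "<office:automatic-styles>"
        f'<style:style style:name="P1" style:family="paragraph">'
        f"<style:text-properties {p1_props}/></style:style>"
        f'<style:style style:name="P2" style:family="paragraph">'
        f"<style:text-properties {p2_props}/></style:style>"
        "</office:automatic-styles><office:body><office:text>"
        f'<text:p text:style-name="{first}">A</text:p>'
        f'<text:p text:style-name="{second}">B</text:p>'
        "</office:text></office:body></office:document-content>"
    )


BOLD = 'fo:font-weight="bold"'
ITALIC = 'fo:font-style="italic"'

original = sample(BOLD, ITALIC, "P1", "P2")
swapped = sample(ITALIC, BOLD, "P2", "P1")   # names swapped, same result
restyled = sample(BOLD, ITALIC, "P2", "P1")  # paragraphs really changed

names_swapped_equal = (
    normalize_automatic_styles(original) == normalize_automatic_styles(swapped)
)
real_change_differs = (
    normalize_automatic_styles(original) != normalize_automatic_styles(restyled)
)
```

Naming styles after their content is just one choice; numbering them in
order of first use in the document body would be another.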

A tool to do these comparisons would indeed be very useful. Not just for your
synchronization needs, but also to check the round-trip accuracy of KOffice.

[1] http://www.w3.org/TR/xml-c14n11/

-- 
Jos van den Oever, software architect
+49 391 25 19 15 53
http://kogmbh.com/legal/