On Tuesday, August 24, 2010 22:06:59 pm Dr. Robert Marmorstein wrote:
> > There is a difference between showing differences between documents and
> > creating and applying patches. For the former, more work must be done
> > than for the latter, since the latter assumes a very similar working
> > copy.
> >
> > A tool to do these comparisons would indeed be very useful. Not just for
> > your synchronization needs, but also to check roundtrip accuracy of
> > koffice.
>
> Exactly. It is not at ALL an easy problem. And yet, it would be very,
> very nice to have. I think implementing "patches" (for change-tracking)
> would be a good first step toward implementing such a tool. My naive idea
> was that you could parse both XML files using a DOM approach (storing all
> nodes in memory), then iterate through one on a node-by-node basis,
> searching for the equivalent node in the second file. That might get very
> expensive, though. I like your idea of assigning an (arbitrary) ordering
> to the nodes better.
>
> Another problem would be displaying the results -- how do you display
> things like format differences? Variables? You don't want a cluttered
> interface, but you would still want to highlight the major differences.
>
> I don't think there's any good software for comparing Micro$oft word
> documents in this manner, so having this could be a killer app for open
> source. I'll continue to think it through some.

I think a good initial step would be to formulate an ODF canonicalization:
a rule for ordering everything whose order is not significant. Automatic
styles would be ordered by first reference and named styles by name. I'm
sure there are more things that need ordering and that some rule can be
applied to each of them. Then the files in the zip should have an
agreed-upon order.

A question there is: should image filenames be retained or not? I would say
no, since this information is lost when going from the zip format to the
XML-only format anyway.
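To make the style ordering rule concrete, here is a rough sketch in Python
(standard library only). It is an illustration under assumptions, not a
proposal for the final tool: it assumes a flat, single-file ODF document
where office:styles, office:automatic-styles and office:body share one
root, it only follows text:style-name references (real documents also
reference styles through table:style-name, draw:style-name and others),
and putting unreferenced automatic styles last is my own choice. The
function names are made up.

```python
import xml.etree.ElementTree as ET

# ODF 1.x namespace URIs.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
STYLE_NAME = "{%s}name" % NS["style"]
TEXT_STYLE_NAME = "{%s}style-name" % NS["text"]


def sort_automatic_styles(root):
    """Order automatic styles by their first reference in the body."""
    auto = root.find("office:automatic-styles", NS)
    body = root.find("office:body", NS)
    if auto is None or body is None:
        return
    first_use = {}
    for elem in body.iter():  # iter() walks in document order
        name = elem.get(TEXT_STYLE_NAME)
        if name is not None and name not in first_use:
            first_use[name] = len(first_use)
    unused = len(first_use)  # assumption: unreferenced styles sort last
    auto[:] = sorted(
        auto,
        key=lambda s: (first_use.get(s.get(STYLE_NAME), unused),
                       s.get(STYLE_NAME) or ""),
    )


def sort_named_styles(root):
    """Order named styles alphabetically by style:name."""
    named = root.find("office:styles", NS)
    if named is not None:
        named[:] = sorted(named, key=lambda s: s.get(STYLE_NAME) or "")


def canonicalize(xml_text):
    """Return the document with both style sections reordered."""
    root = ET.fromstring(xml_text)
    sort_automatic_styles(root)
    sort_named_styles(root)
    return ET.tostring(root, encoding="unicode")
```

Only the order of the style elements changes; names, attributes and body
content are left alone, so two documents that differ only in style order
come out identical.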
Once you have canonicalized the two input files, which should not affect
their contents at all, you can run 'diff -r' or kdiff3 on the unzipped
files.

Cheers,
Jos

-- 
Jos van den Oever, software architect
+49 391 25 19 15 53
http://kogmbh.com/legal/