List:       koffice-devel
Subject:    Re: [RFC] Export filter architecture for MS Office
From:       Thomas Zander <zander () planescape ! com>
Date:       2001-09-12 17:07:18

On Tue, Sep 11, 2001 at 09:47:15PM -0400, shaheed wrote:
> This note documents some thoughts about the export filter architecture for
> MS Office to be used for the new common filter architecture from a
> discussion with Werner on IRC. It is also related to the string at:
> http://lists.kde.org/?l=koffice-devel&m=100016171520493&w=2
>
> titled "[RFC] filter redesign". Other resources are the MS Word exporter
> skeleton at:
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/wvware/wv/exporter/
>
> and the wv2 projects at the same site. This note presents some ideas and
> issues: it would be great to have some feedback on them!
> The first thing is that it seems to me that the director/builder pattern
> that Eva proposed is very similar to the KOffice olefilters architecture,
> since that fundamentally is all about embedding objects one within another.
> So, it is interesting that not only is there an existence proof that
> director/builder works nicely in this application, but that there are
> implications that:

I don't follow this sentence. The one thing the Builder pattern (which you
can find in your local copy of 'Design Patterns', ISBN 0201633612) does is
allow you to build an object structure via a default interface. In other words,
it has nothing to do with embedding at all.

One of the things inherent in the pattern is that it does not return anything to
you, as you can see in Eva's example (and my reply to that). All methods are 
implemented to return void.

This is because the pattern was created to (and I quote) "Separate the construction
of a complex object from its representation so that the same construction process
can create different representations."

Where this and your example come together is that the director (which knows only
the common builder interface) calls the methods and knows only the basic relations
between the objects created by the builder. So the director should not have to know
the direct parent of the object it creates. 
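
To make the "everything returns void" point concrete, here is a minimal
sketch in Python. The class and method names (DocumentBuilder, HtmlBuilder,
PlainTextDirector) are invented for this mail and are not taken from Eva's
proposal or from any KOffice code; treat it as an illustration only.

```python
from abc import ABC, abstractmethod
from html import escape


class DocumentBuilder(ABC):
    """Hypothetical common builder interface: no method returns anything."""

    @abstractmethod
    def begin_document(self) -> None: ...

    @abstractmethod
    def add_paragraph(self, text: str) -> None: ...

    @abstractmethod
    def end_document(self) -> None: ...


class HtmlBuilder(DocumentBuilder):
    """One possible representation; the result is asked from the builder."""

    def __init__(self) -> None:
        self._parts = []

    def begin_document(self) -> None:
        self._parts.append("<html><body>")

    def add_paragraph(self, text: str) -> None:
        self._parts.append("<p>" + escape(text) + "</p>")

    def end_document(self) -> None:
        self._parts.append("</body></html>")

    def result(self) -> str:
        return "".join(self._parts)


class PlainTextDirector:
    """Reads plain text and drives any DocumentBuilder.

    It only knows the common interface, never the objects the builder
    creates or who their parents are.
    """

    def __init__(self, builder: DocumentBuilder) -> None:
        self._builder = builder

    def convert(self, text: str) -> None:
        self._builder.begin_document()
        for block in text.split("\n\n"):
            if block.strip():
                self._builder.add_paragraph(block.strip())
        self._builder.end_document()
```

The director never sees a paragraph object; whoever set up the conversion
holds on to the concrete builder and fetches the result from it afterwards.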

Let me state an example.

I have a text document and want to convert that to KWord.
In KWord we use frames, and in a frame(set) we have paragraphs. This relation
is missing in the text document: the paragraphs sit directly beneath the
document 'node'.

If I now want to convert that same text document to OpenOffice, which does not
do frames, the paragraph would directly be a child of the document.

This example can only be implemented if we allow the builder to create the
missing frame(set) if that is needed for the output format. In other words, the
internal object structure of the builder for a KWord document would have a different
set of parent-child relations than the builder for an OpenOffice document, even when
calling the same series of methods to create a document.
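
A rough sketch of that frame(set) example, again in Python and again with
made-up names (KWordLikeBuilder and OpenOfficeLikeBuilder are stand-ins, not
the real file formats). The director makes the same calls both times; only
the tree each builder keeps internally differs.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    kind: str
    text: str = ""
    children: list[Node] = field(default_factory=list)


class KWordLikeBuilder:
    """Puts a frameset between the document and its paragraphs."""

    def __init__(self) -> None:
        self.root = Node("document")
        self._frameset: Optional[Node] = None

    def add_paragraph(self, text: str) -> None:
        if self._frameset is None:
            # The input never mentioned a frameset, but this output needs one,
            # so the builder creates it on its own.
            self._frameset = Node("frameset")
            self.root.children.append(self._frameset)
        self._frameset.children.append(Node("paragraph", text))


class OpenOfficeLikeBuilder:
    """No frames: paragraphs are direct children of the document."""

    def __init__(self) -> None:
        self.root = Node("document")

    def add_paragraph(self, text: str) -> None:
        self.root.children.append(Node("paragraph", text))


def direct(builder, paragraphs: list[str]) -> None:
    """The director: identical calls, whatever the builder is."""
    for text in paragraphs:
        builder.add_paragraph(text)


def outline(node: Node, depth: int = 0) -> list[str]:
    """Flatten a tree into indented lines so the structure can be compared."""
    lines = ["  " * depth + node.kind]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines
```

Running direct() on both builders with the same two paragraphs gives a
document/frameset/paragraph tree for the first and a document/paragraph tree
for the second.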

If you go back to an email I sent earlier, as an answer to Eva's email
 http://lists.kde.org/?l=koffice-devel&m=100014931211274&w=4
you should notice my example at the bottom, which starts with 'Consider the
following;'.

This tells you the solution provided by the pattern (again coming from the book),
which uses numbering per type. So if someone does not know about a type
there is no problem in the director/builder communication.
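
For those without the book at hand: its maze example has the director call
BuildRoom(1), BuildRoom(2), BuildDoor(1, 2), so objects are referred to by
number and the builder's default methods do nothing. The Python below is a
loose sketch of that idea applied to documents, not a copy of the example in
my earlier mail; the paragraph/footnote methods are assumptions made up here.

```python
from __future__ import annotations


class Builder:
    """Hypothetical base builder: every method is a no-op by default."""

    def add_paragraph(self, number: int, text: str) -> None:
        pass

    def add_footnote(self, number: int, paragraph: int, text: str) -> None:
        pass


class PlainTextBuilder(Builder):
    """Knows nothing about footnotes and simply does not override them."""

    def __init__(self) -> None:
        self.lines: list[str] = []

    def add_paragraph(self, number: int, text: str) -> None:
        self.lines.append(text)


class AnnotatedBuilder(Builder):
    """Keeps both types, each under its own numbering."""

    def __init__(self) -> None:
        self.paragraphs: dict[int, str] = {}
        self.footnotes: dict[int, tuple[int, str]] = {}

    def add_paragraph(self, number: int, text: str) -> None:
        self.paragraphs[number] = text

    def add_footnote(self, number: int, paragraph: int, text: str) -> None:
        self.footnotes[number] = (paragraph, text)

    def render(self) -> list[str]:
        out = []
        for n in sorted(self.paragraphs):
            marks = "".join(
                f"[{f}]" for f, (p, _) in sorted(self.footnotes.items()) if p == n
            )
            out.append(self.paragraphs[n] + marks)
        for f, (_, text) in sorted(self.footnotes.items()):
            out.append(f"[{f}] {text}")
        return out


def direct(builder: Builder) -> None:
    """Director: refers to objects by (type, number), never by reference."""
    builder.add_paragraph(1, "First paragraph.")
    builder.add_paragraph(2, "Second paragraph.")
    builder.add_footnote(1, 2, "A note on the second paragraph.")
```

Because the director only passes numbers around and gets nothing back, a
builder that ignores footnotes can be driven by exactly the same calls.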


> 1. the director might usefully be an abstract class 

Right, for every input type you have one implementing class.  
Similarly you have one builder implementation for each output type.

> 2. we might usefully have implementations for OLE-based documents and anything
> else that has a *regularised* notion of nesting

Nesting is done on a document basis, right? Where a document is anything from a 
picture to an excel sheet.

Then I don't understand your concern about nesting _inside_ the filter. On a
global scale a document is opened via a filter and will be converted to something
native to the office suite. 
Any embedded parts will be treated in the same way, in that they will be filtered
to a native format.

From a reference POV, pointing from the document to a nested document will have
to be done (and is currently done in KOffice) via filenames inside the archive for
the whole document.
I.e. a document points to a picture with an href, something like:
  <embedded type=picture href="images/picture1" width="123" height="123" />

There are 2 scenarios I can imagine when opening an input file with an embedded
part. One is that as soon as the import filter finds an embedded part, the
filter manager is told that it should convert it to filename 'tar:something/xyz'.
This can be done directly or after the main filter has finished (and thus the
converted document is not in memory anymore).
The second way is that this proves to be impossible for some reason and the
same filter just creates a new builder and does the conversion in the same
filter instance, again immediately or after the main document has been written.
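
The first scenario could look roughly like this. It is only a sketch in
Python: FilterManager, request_conversion and finish_main_document are names
I am making up here, not an existing KOffice API, and the "archive" is just a
dict standing in for the real tar store.

```python
class FilterManager:
    """Hypothetical manager that converts embedded parts for a filter."""

    def __init__(self, converters, defer=True):
        self._converters = converters  # part type -> callable(data) -> data
        self._defer = defer            # True: wait until the main filter is done
        self._pending = []
        self.archive = {}              # filename inside the archive -> data

    def request_conversion(self, part_type, data, target_name):
        if self._defer:
            self._pending.append((part_type, data, target_name))
        else:
            self._convert(part_type, data, target_name)

    def finish_main_document(self):
        """Called once the main document is written; runs deferred jobs."""
        pending, self._pending = self._pending, []
        for job in pending:
            self._convert(*job)

    def _convert(self, part_type, data, target_name):
        self.archive[target_name] = self._converters[part_type](data)


def import_main_document(parts, manager):
    """Import filter: parts is a list of ("text", str) or (part_type, data)."""
    body = []
    count = 0
    for kind, payload in parts:
        if kind == "text":
            body.append(payload)
        else:
            # Embedded part: hand it to the manager and keep only an href.
            count += 1
            name = f"images/picture{count}"
            manager.request_conversion(kind, payload, name)
            body.append(f'<embedded type={kind} href="{name}" />')
    return body
```

With defer=True the archive stays empty until finish_main_document() is
called; with defer=False each part is converted the moment it is found. The
second scenario would skip the manager and have the filter create a new
builder itself.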
 

> The point of all this is that if we do it right, I think we can make such
> filters usable across multiple OpenSource projects. We only implement the
> logic of each filter once: all projects get improved capabilities. Does this
> sound feasible? Any better ideas around?

The trick about this is to create an interface for the builder that every builder
should implement and every director can use. Therefore the interface has to be
so big that it can be used to build every type of output document we plan to
create. 
I certainly hope this can be done, but it will be very tricky ;) 

I have been talking about another approach to this problem: the approach is
to get all the open source editors to use one DTD (not as hard as you imagine).
If this is done, all filters will just have to output one DTD. I think you can
imagine the advantages of that.

Among other things it will allow all parties to join forces in creating a filter
for any format since it will work with all applications using the standard DTD.

Lots of the above ideas can still be used with the one-DTD solution, and I hope
that my lengthy email does not scare you off the notion of doing this at
all, since I believe it's a good idea (hence my lengthy email ;)


-- 
Thomas Zander                                            zander@earthling.net
The only thing worse than failure is the fear of trying something new



