
List:       koffice-devel
Subject:    Re: time to point me to xml filtering stuff...
From:       Robert JACOLIN <rjacolin () ifrance ! com>
Date:       2002-04-05 16:41:22

On Friday 5 April 2002 17:09, Oliver Sinnen wrote:
> Hi,
>
>[snip]
> Another advantage of using xslt is that you can write a standalone filter,
> without the need for kde/koffice. This is useful, for example, for a kword
> -> OpenOffice filter. An OpenOffice user can then read your files using
> this small filter application. Again, there is also a downside. Kde doesn't
> have its own xslt lib and thus you create a new dependence (for koffice).
> On the other hand, libxslt is already used for the documentation in kde,
> see Eva's mail. Anyway, sooner or later xslt will be supported in kde since
> it is foremost a browser technology, which, e.g., Mozilla already uses.
libxslt is in kdesupport, and there is already a dependency on it since an
xslt filter has been written :)
So you can test your "filter" with the generic xslt filter.

> Further, as far as I understand, Robert did a generic framework for xslt
> based filters in koffice. So, an OpenOffice filter would be a prominent
> example for this framework. BTW, I think the filter page on www.koffice.org
> is wrong in this aspect. It should say: "A generic filter for xml based
> file formats, using xslt". As it is now, it seems that the input/output is
> a xslt file. Also, "xml" is a much nicer buzz word than xslt. ;-)
Right. There is also a kword2xslfo somewhere (in filters/xsltfilter/export, I
think :) ).

> The store/package format of OpenOffice is similar to that of koffice. It
> consists of several files packed into a zip file. Currently, koffice uses
> tar.gz, but as far as I understand this will change in the near future/has
> already changed in cvs. After unpacking the openoffice file format (the zip
> file), you have plain xml files which can be processed using xslt. To use a
> xslt filter within the generic xslt filter framework, the OpenOffice format
> must be unpacked first. This, of course, becomes easier if the package
> format is the same as that of koffice.
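
A minimal sketch of the unpacking step described above, assuming a zip-based package whose main document is a member named content.xml. The helper names and the tiny demo package are hypothetical, built in memory only so the example runs on its own; a real filter would hand the extracted XML to an xslt processor afterwards.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_package_member(package_bytes, member="content.xml"):
    """Return the parsed XML root of one member of a zip-based package.

    The member name is an assumption for illustration.
    """
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        with package.open(member) as stream:
            return ET.parse(stream).getroot()


def build_demo_package():
    """Build a tiny in-memory zip package (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", "<doc><p>Hello</p></doc>")
    return buffer.getvalue()


# Unpack the demo package and collect the paragraph texts.
root = read_package_member(build_demo_package())
paragraphs = [p.text for p in root.iter("p")]
```

Once the plain XML is out of the package, nothing in the rest of the pipeline needs to know how it was stored, which is why a shared package format makes the generic filter simpler.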

> The openoffice file format is very well documented (see pdf on
> www.openoffice.org), but quite complex. Fortunately, they try to rely on
> standards like MathML, SVG, XSL-FO etc., but the majority of the used
> xml-tags was introduced for OpenOffice.
>
> As you might already have guessed, I started writing an xslt filter for the
> Writer application of OpenOffice some time ago. Due to lack of time and
> experience with xslt, I have not come very far. The filter only transfers
> the paragraphs with their text and text formatting (bold, underline etc.),
> excluding the global formatting, but also has only a few lines of xslt code.
> Holger, if you are interested I can send you this as a starting point for
> your filter. Or, you can also simply start on a filter for another
> application and we share experience. But I won't be able to spend much time
> on the kword filter. In a mail to this list, someone else also expressed
> interest in writing OpenOffice filters
> (http://lists.kde.org/?l=koffice-devel&m=101281407922691&w=2). Maybe you
> want to contact him, too.
>
> This brings me to kword's file format.
> I agree that the dtd should be reviewed, as it was discussed in a previous
> thread. If the file compatibility between koffice 1.1 and 1.2 is going to
> be broken anyway, the changes could even be extensive. A small example is
> that "0" and "1" should be true/false wherever they have this meaning. I
> will focus here on two rather large topics I have come across.
> (1) The <FORMAT> tag within a paragraph just following the <TEXT> part uses
> 2 attributes, "pos" and "len", to specify the part of the text to which it
> applies. Wouldn't it be better to enclose the corresponding text in between
> the format tags? This seems to me more intuitive and corresponds to what we
> know from html, latex etc. To separate the tag from the actual formatting,
> automatically created styles/formats could be used (see OpenOffice format).
> Due to the use of pos and len, it is also more complicated to write xslt
> filters, since one has to locate the respective text first. ;-)
Right, I agree with you.
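
To make the pos/len problem concrete, here is a rough sketch of the extra step a filter has to do: cut the paragraph text at the offsets given by each format run, and only then wrap the pieces. The tuple layout, the function names and the `<format style="...">` output element are hypothetical simplifications, not kword's actual format, and the runs are assumed not to overlap.

```python
from xml.sax.saxutils import escape


def split_by_formats(text, formats):
    """Split text into (style, chunk) segments.

    formats is a list of (pos, length, style_name) tuples,
    assumed non-overlapping (a simplification for this sketch).
    """
    segments = []
    cursor = 0
    for pos, length, style in sorted(formats):
        if pos > cursor:
            # Unformatted text between two format runs.
            segments.append((None, text[cursor:pos]))
        segments.append((style, text[pos:pos + length]))
        cursor = pos + length
    if cursor < len(text):
        segments.append((None, text[cursor:]))
    return segments


def to_inline_markup(segments):
    """Wrap each formatted chunk in a (hypothetical) enclosing tag."""
    parts = []
    for style, chunk in segments:
        escaped = escape(chunk)
        if style is None:
            parts.append(escaped)
        else:
            parts.append(f'<format style="{style}">{escaped}</format>')
    return "".join(parts)


segments = split_by_formats("Hello bold world", [(6, 4, "bold")])
markup = to_inline_markup(segments)
```

With enclosing tags in the file format itself, the first function would not be needed at all, and string slicing like this is exactly the part that is clumsy to express in xslt.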

> (2) For every paragraph the <LAYOUT></LAYOUT> part defines all of its
> formatting and layout attributes. As far as I understand, this is also done
> if the formatting and layout match the associated style. Thus,
> much of the information is redundant. The file format of kword could be
> significantly reduced if such information were only saved for attributes
> which differ from the associated style.
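
The idea in (2) can be sketched as a simple difference against the style: save only what differs, and merge the style back in on load. The attribute names and values below are made up for illustration and do not come from the kword dtd.

```python
def layout_overrides(paragraph_layout, style_layout):
    """Keep only the attributes whose value differs from the style."""
    return {
        name: value
        for name, value in paragraph_layout.items()
        if style_layout.get(name) != value
    }


def effective_layout(overrides, style_layout):
    """Rebuild the full layout: style defaults, then the overrides."""
    return {**style_layout, **overrides}


# Hypothetical attribute names, for illustration only.
style = {"align": "left", "indent": "0", "font": "helvetica"}
paragraph = {"align": "center", "indent": "0", "font": "helvetica"}

overrides = layout_overrides(paragraph, style)
restored = effective_layout(overrides, style)
```

Only the one differing attribute would need to be written to the file, and the full layout is still recoverable when reading it back.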

And since we are talking about the kword dtd, I will add my personal view.
While writing the latex filter and the xslt filter, I saw that list items are
stored like paragraphs, all on one level. Why? What was the reason? Why not
something like this:
<list>
	<item/>
	<item/>
</list>
like in html, xsl:fo, latex, ...?
As it is, it is very difficult to handle lists with xslt, for example to
convert kword lists into xsl:fo lists.
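
A rough sketch of the regrouping a filter has to do today, assuming each flat paragraph only carries a flag saying whether it is a list item. The (is_list_item, text) tuples and the output element names are hypothetical, not the kword format. Consecutive flagged paragraphs are collected into one `<list>`, which is the kind of sibling grouping that is awkward in xslt.

```python
from itertools import groupby
import xml.etree.ElementTree as ET


def nest_lists(paragraphs):
    """Group consecutive list-item paragraphs under a <list> element.

    paragraphs is a list of (is_list_item, text) tuples; this input
    shape is an assumption made for the sketch.
    """
    body = ET.Element("body")
    for is_item, run in groupby(paragraphs, key=lambda p: p[0]):
        if is_item:
            # One <list> per run of adjacent list items.
            parent = ET.SubElement(body, "list")
            tag = "item"
        else:
            parent = body
            tag = "p"
        for _, text in run:
            ET.SubElement(parent, tag).text = text
    return body


flat = [(False, "Intro"), (True, "one"), (True, "two"), (False, "Outro")]
xml_text = ET.tostring(nest_lists(flat), encoding="unicode")
```

If the file format already nested items inside a list element, this step would disappear and a filter could map the list one to one.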

Regards.

:)))))))))))))))))))
bobby