
List:       koffice-devel
Subject:    File filters status...
From:       Pierre <pinaraf () pinaraf ! info>
Date:       2009-01-19 1:54:08
Message-ID: 200901190254.20748.pinaraf () pinaraf ! info


Hi

Yesterday I tried every filter listed in KWord, looking for some quick and 
useful things to do.
I was quite surprised to see that no output filter worked except the 
OpenDocument one.
When I looked at how the filters are implemented, I was even more surprised 
to see that they expect a KWord 1 file as input, which they parse and then 
convert to the target file format.
I don't really see any benefit in this design: why don't the filters access 
the application's internals directly? They would probably still be broken if 
they had been written that way, but at least we would have noticed some of 
the problems at compile time.
What follows is the discussion we had about this topic on IRC yesterday.

Pinaraf> hi folks
slangkamp> Hi Pinaraf
Pinaraf> now that my internship is over, I can go back to more interesting 
activities like KOffice...
Pinaraf> I just did a small tests with kword : write an hello world
slangkamp> great
Pinaraf> and try to save it
Pinaraf> out of 11 filters, only 1 works :/
Pinaraf> it appears nobody took care of these pieces of code
SaroEngels> hi Pinaraf
slangkamp> that's a general problem, most filters don't work
Pinaraf> slangkamp: and what's the plan ? disable them before the release ? 
(try to) fix^W^W^W hack them ?
slangkamp> Pinaraf: It depends on the filter, if it can't be fixed for the 
release it will get disabled
* ingwa_ would like at least a rudimentary .doc filter.
Pinaraf> ingwa_: ho, so far, we can save to OpenDocument Text
slangkamp> ingwa_: the .doc filter should work, but needs a new libwv2 version
Pinaraf> so with some luck, everybody will instantly switch to MSOffice 2007 
SP2 when it's released, and they will be able to open ODT files
Pinaraf> slangkamp: I don't think it's supposed to work
Pinaraf> I don't know yet how the filters work
Pinaraf> but every filter failed with the same worrying message : "An error 
has occurred while parsing the KWord file."
ingwa_> slangkamp: that sounds encouraging
ingwa_> Pinaraf: well, that's not a filter, but the native save format.
ingwa_> Pinaraf: seems to me that that error message may have the same source 
for all filters. Maybe they can all be repaired the same way.
Pinaraf> ingwa_: I got this message when saving to .doc, .kwd, .html, .sxw....
slangkamp> as far as I know the problem with the .doc filter is that the 
result odt file isn't loaded correctly
Pinaraf> ingwa_: I only hope I mis-understood the error message
ingwa_> Pinaraf: well, since it doesn't give any details, there isn't much to 
misunderstand.
Pinaraf> well, I fear these filters work on the .kwd files directly, I hope 
I'm wrong.
slangkamp> you can look at the .desktop file of the filter, X-KDE-Export 
defines the output format
ingwa_> slangkamp: where are they?
slangkamp> ingwa_: in the folder of each filter
ingwa_> Pinaraf: are you talking about import or export filters?
Pinaraf> export filters
ingwa_> I'm afraid that you are right:
ingwa_> X-KDE-Export=application/vnd.sun.xml.writer
ingwa_> X-KDE-Import=application/x-kword
ingwa_> for example. application/x-kword is .kwd, right?
slangkamp> yes
slangkamp> why do we and oowriter export filter?
ingwa_> eh, what?
slangkamp> s/and/have
Pinaraf> slangkamp: the oowriter export filter is for the old OOo 1/StarOffice 
6 file format
Pinaraf> and there are some differences between that file format and 
OpenDocument...
slangkamp> Pinaraf: even OOo 1.x had an odf import filter at some point
Pinaraf> slangkamp: yes, beginning with 1.1.5 as far as I remember
ingwa_> there is a filter named msword and one named msword-odf
slangkamp> msword-odf creates odf files as output
Pinaraf> but that's weird
ingwa_> that's basically same as msword, but with odf instead of kword?
Pinaraf> why do our filters have to manipulate an input file format ?
Pinaraf> why don't we give them our document structure
Pinaraf> with a proper list of styles, shapes and so on
slangkamp> ingwa_: was part of a gsoc project and it has improvements on image 
support (with a new wv2)
Pinaraf> it'd be much more efficient, more future-proof
ingwa_> Pinaraf: good question
slangkamp> konverter maybe?
Pinaraf> slangkamp: what is it ?
slangkamp> Pinaraf: command line file converter
slangkamp> can be used like: konverter file1.doc file2.odt
slangkamp> sorry it's koconverter
Pinaraf> slangkamp: well, I don't see what it does exactly change
Pinaraf> you could still instantiate a full document structure, parse the 
input in this structure, and then use the right output filter
psn> they are designed to be chainable
ingwa_> would be nice if it would just load the right input and output filter 
as dynamic libraries instead of writing and reading full ODF files. That would 
be very much faster.
psn> but the svg filter uses karbon's structure and because of that doesn't 
work with any of the other applications
Pinaraf> why do you want to chain filters ?
ingwa_> Pinaraf: to convert from e.g. kword to doc
Pinaraf> ingwa_: what's the problem with using internal koffice structures in 
this situation ?
slangkamp> not every filter export a directly usable format for example 
ps->svg->karbon internal
ingwa_> Pinaraf: none at all
slangkamp> Pinaraf: not ever filter can directly produce our internal 
structures
slangkamp> every
Pinaraf> slangkamp: any example ?
slangkamp> Pinaraf: as I said ps
slangkamp> uses pstoedit to convert ps to svg
Pinaraf> I don't get it, if it produces something we just can't open, what's 
the use of the filter in koffice ?
psn> we can open svg... but only in karbon due to the filter using karbon's 
internal structures
Pinaraf> and where would we use that filter ?
slangkamp> psn: odf doesn't have all svg feature so we have to do that, it's 
simply a second native format
slangkamp> Pinaraf: import ps/eps files in karbon
Pinaraf> that's why I find it strange to try to have an input and an output 
file format in filters
Pinaraf> a filter is here to load a file in an application, or to save it
psn> slangkamp: I know the reason... still it only requires some changes to 
the libs to make it work for all apps.
slangkamp> Pinaraf: the problem is that the app might not be able to open the 
output format directly
Pinaraf> what output format ?
Pinaraf> the output of an export filter ?
Pinaraf> well, what's the problem with that ?
slangkamp> Pinaraf: import filter
Pinaraf> an input filter should feed in the application structure, not a file 
format
Pinaraf> that's a non-sense
slangkamp> Pinaraf: but it's not always possible
slangkamp> Pinaraf: filter can use external apps/libs that don't have to know 
our external structures
Pinaraf> and ?
Pinaraf> libwv isn't just a converter from word to odf, you can go in the 
document structure
slangkamp> e.g. pstoedit
Pinaraf> slangkamp: you still didn't give the real problem, the filter can 
just use the whatever output of the external tool it uses and feed it in the 
application
estan> how many of the import filters are based on external apps/libs that 
can't give access to the source's structure? it feels to me like it should be 
a minority no? because i kind of agree with Pinaraf.
psn> Pinaraf: well in the ps filter case that would mean duplicating the svg 
filter in the ps filter
estan> do all the current import filters work in this way? e.g. convert to 
file that is then opened?
Pinaraf> psn: well, can't you just have a dependency of the ps filter on the 
svg filter ?
Pinaraf> estan: yes, it looks like it's done this way
psn> estan: all except karbon's svg filter yes...
estan> ok.
slangkamp> if filters are not coded against internal structures we are more 
flexible
psn> Pinaraf: well as long as one doesn't end up with filters that only works 
in specific applications I'm not against using the internal structure 
directly.
psn> Pinaraf: but I think the risk is pretty big that you do.
slangkamp> for example other apps might use the filters
Pinaraf> I don't see how you would use, for instance, a word, kword or abiword 
filter in any application other than kword
estan> a command line converter like koconverter?
slangkamp> that's something we criticized OOo for, as they have the filters 
using internal structures and we can't use them
Pinaraf> moreover, most of the interesting structures, at least for text 
processing, are shared between applications
Pinaraf> slangkamp: and now look at the mess we have since we changed our 
default file format
slangkamp> Pinaraf: we have changed our internal structures too
Pinaraf> indeed, except that it's checked by the compiler
Pinaraf> it's not a runtime thing that you'll never be able to fully trust 
except with hundreds of testcases
slangkamp> the outcome would be more or less the same
slangkamp> at the moment you would only need a kwd -> odt filter
Pinaraf> not really, that issue wouldn't have been allowed to come that long 
in the development process
slangkamp> and the other direction
Pinaraf> much easier said than done...
slangkamp> all filter would break in both cases
Pinaraf> since it means poorly tested code (filters) will rely on even less 
tested code (kword file format support)
estan> i wonder if there's any exceptionally nice design of import/export 
functionality out there we could take a look at, which has solved all these 
problems and conflicting requirements in a nice way. i mean, these problems 
can't be new for sure ;)
* estan is not going to say more on the matter since i can understand both 
viewpoints.
estan> anyway, i don't think a change to another file format is exactly 
imminent, so maybe what needs to be done is just to bite the bullet and update 
the current filters to work with OpenDocument instead of the old format.
Pinaraf> it's just about rewriting the kword filters...
Pinaraf> when I see that a filter has to parse another file format to generate 
its output, it's quite insane... parsing properly an OpenDocument file is 
quite complex, doing this work in each filter is a waste of time
slangkamp> Pinaraf: you should propose that on the mailinglist
Pinaraf> slangkamp: yes, I'll do that, but tomorrow, my eyes start hurting... 
I need some rest
slangkamp> Pinaraf: cyrille is also collecting filter todos on the wiki
Pinaraf> ok
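
The X-KDE-Import and X-KDE-Export entries quoted in the log can be checked 
mechanically. Below is a minimal sketch, assuming each filter's .desktop file 
has a standard [Desktop Entry] section and that a key may hold a 
comma-separated list of MIME types. The function names and the sample text 
are hypothetical; only the two key names and the two MIME types come from the 
log.

```python
import configparser

# Sample content modelled on the two lines quoted in the log.
# The [Desktop Entry] header is an assumption about the file layout.
SAMPLE = """\
[Desktop Entry]
X-KDE-Export=application/vnd.sun.xml.writer
X-KDE-Import=application/x-kword
"""


def filter_formats(desktop_text):
    """Return (import_mimes, export_mimes) declared by a filter."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case; the default lowercases keys
    parser.read_string(desktop_text)
    entry = parser["Desktop Entry"]

    def mimes(key):
        # Assumption: several MIME types may be listed, separated by commas.
        return [m.strip() for m in entry.get(key, "").split(",") if m.strip()]

    return mimes("X-KDE-Import"), mimes("X-KDE-Export")


def needs_kword1_input(desktop_text):
    """True if the filter declares the old .kwd format as its input."""
    imports, _ = filter_formats(desktop_text)
    return "application/x-kword" in imports


if __name__ == "__main__":
    print(filter_formats(SAMPLE))
    print(needs_kword1_input(SAMPLE))
```

Running a check like this over every filter directory would give a list of 
the export filters that still expect a KWord 1 file as input.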


Any opinion on this topic would be really welcome. The most important 
question is: what do we do for the 2.0 release? No filters at all? Hacked 
filters that work, but with bugs? Rewrite as many filters as possible?

Pierre


_______________________________________________
koffice-devel mailing list
koffice-devel@kde.org
https://mail.kde.org/mailman/listinfo/koffice-devel

