'Re: Formatting large docs in Kword'


List:       koffice
Subject:    Re: Formatting large docs in Kword
From:       Nicolas Goutte <nicog () snafu ! de>
Date:       2002-12-13 13:15:17

On Friday 13 December 2002 06:40, Clarence Dang wrote:
> On Wed, 11 Dec 2002 12:59 am, Nicolas Goutte wrote:
> > On Tuesday 10 December 2002 07:25, Clarence Dang wrote:
(...)
> > > > BTW Kword now loads and edits these big docs pretty quickly.
> > >
> > > Really?  The filter reads one character at a time, last time I
> > > checked...
> >
> > The filter is fast.
>
> I agree that in the grand scheme of things it is pretty fast (practically
> and in terms of time complexity).
>
> But since I am an optimisation person (still programming 286s :)), I should
> point out that calling QTextStream::operator>> (QChar) in
> ASCIIImport::readLine 20 million times (for bug #45973, at least) is rather
> expensive.

I looked in QTextStream and I have to say that you are right.

> Qt did not inline the function, and there is way too much logic in the Qt
> libs for reading a single character, especially for those 20MB DNA
> sequences that we open every day :)
>
> QTextStream::readLine does support Mac newlines (now) and would be much
> faster....

In the meantime, it seems that it does. That was not the case when I needed 
it.

(Only the removal of form feeds remains, but that can be done after the line 
has been read.)
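
To illustrate the idea, here is a minimal sketch in plain standard C++. It 
deliberately does not use Qt and it is not the actual ASCIIImport code; the 
helper names splitLines and stripFormFeeds are hypothetical, and it assumes 
that the text is already in memory as a std::string. It scans for the next 
line terminator (LF, CRLF or a lone CR as on the Mac) instead of extracting 
one character per call, and it removes form feeds only once a line is 
complete.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not the real filter code): split an in-memory buffer
// into lines, accepting LF (Unix), CRLF (DOS) and lone CR (old Mac).
std::vector<std::string> splitLines(const std::string& text)
{
    std::vector<std::string> lines;
    const std::string::size_type size = text.size();
    std::string::size_type start = 0;
    while (start < size) {
        // Jump straight to the next terminator instead of reading
        // one character at a time.
        std::string::size_type end = text.find_first_of("\r\n", start);
        if (end == std::string::npos)
            end = size; // last line without a terminator
        lines.push_back(text.substr(start, end - start));
        // Treat CR immediately followed by LF as a single terminator.
        if (end + 1 < size && text[end] == '\r' && text[end + 1] == '\n')
            ++end;
        start = end + 1;
    }
    return lines;
}

// Hypothetical helper: drop form feeds after the whole line has been read.
std::string stripFormFeeds(std::string line)
{
    line.erase(std::remove(line.begin(), line.end(), '\f'), line.end());
    return line;
}
```

A caller would run stripFormFeeds on each element returned by splitLines. 
Whether a form feed should rather start a new page is a separate question 
that this sketch does not address.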

>
> But then, I'm not complaining :) so actually, don't change it to
> QTextStream::readLine yet because I actually have a use for
> ASCIIImport::readLine (since I'm going to be hacking that filter in a week
> or two)....

>
> I wonder if something like "partial importing" (only load part of the
> document, what vi and msword appear to do) could be implemented in the
> KOffice filter system....

Currently, you cannot, as KWord wants a complete KWord file.

The problem for the plain text import filter is that it stores the KWord 
document as QDom. But if the filter does not have enough memory to do that, 
KWord will not have enough memory to load the KWord document either.

>
> > It is loading in KWord that is not, especially the
> > needed paginating after load. The pagination may need even more time than
> > the load itself. (That is why I have made KDE Bug #50467
> > http://bugs.kde.org/show_bug.cgi?id=50467 )
>
> Yes KWord seems to be the bottleneck with mswritefilter too but maybe it's
> the amount of XML that gets passed to it...

I doubt it very much.

When the filter is much faster than loading in KWord, and pagination in 
KWord is even slower than loading, XML is not the problem. The problem 
lies more in KWord's or KoText's structures/classes.

You can check it yourself. You can use koconverter to measure the time needed 
for the filter itself. The loading (of the KWD document) in KWord is the time 
that the progress bar needs to go from 0% to 100% and the pagination is 
between 100% and the first draw of the document. (You can also follow 
pagination through the kdebug output.)
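
If you would rather instrument code than watch the progress bar, the same 
three-way split (filter, load, pagination) can be timed programmatically. 
The following is only a generic sketch in standard C++ (std::chrono, so a 
C++11 compiler is assumed). The names timePhaseMs and timePhases are 
hypothetical, nothing here is KWord or koconverter API, and the phases are 
whatever callables you pass in.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: run one phase and return its wall-clock time in ms.
double timePhaseMs(const std::function<void()>& phase)
{
    const auto begin = std::chrono::steady_clock::now();
    phase();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - begin).count();
}

// Hypothetical helper: time each named phase separately, in order, so that
// the slowest one (e.g. "paginate") can be told apart from the others.
std::vector<std::pair<std::string, double>> timePhases(
    const std::vector<std::pair<std::string, std::function<void()>>>& phases)
{
    std::vector<std::pair<std::string, double>> result;
    for (const auto& p : phases)
        result.emplace_back(p.first, timePhaseMs(p.second));
    return result;
}
```

Timing each phase on its own matters here, because a single total would not 
show whether the filter, the loading or the pagination is the slow part.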

>
> Clarence

Have a nice day/evening/night!

>
> ____________________________________
> koffice mailing list
> koffice@mail.kde.org
> To unsubscribe please visit:
> http://mail.kde.org/mailman/listinfo/koffice
