From koffice Fri Dec 13 14:18:37 2002
From: Nicolas Goutte
Date: Fri, 13 Dec 2002 14:18:37 +0000
To: koffice
Subject: Re: Formatting large docs in Kword
X-MARC-Message: https://marc.info/?l=koffice&m=103978920608461

On Friday 13 December 2002 06:54, Clarence Dang wrote:
> On Tue, 10 Dec 2002 07:04 pm, Jonathan Drews wrote:
> > On Tuesday 10 December 2002 12:25 am, Clarence Dang wrote:
> > > Hi Jonathan,
> > >
> > > Try loading your text file with the other two options in the "End of
> > > Paragraph" group i.e. "Sentence" and "Old method" (not "As is") in
> > > the "KWord's Plain Text Import Filter" dialog.  They are not perfect
> > > but I think they will do what you want.
> >
> > Splendid! Worked like a charm Clarence.
>
> :)
>
> I would be careful about those modes though (esp. "Old method") because
> they heuristically determine when it's time to end the paragraph (so
> sentences could actually be appended to headings).

Yes, sometimes "Old Method" is too smart. That is also why there is no real
name for this method. However it works well with paragraphs separated by
empty lines.

> > > > BTW Kword now loads and edits these big docs pretty quickly.
> > >
> > > Really?  The filter reads one character at a time, last time I
> > > checked...  You should try loading the 20MB DNA sequence associated
> > > with bug #45973 :-)  (actually the real problem with that bug is the
> > > amount of memory that KWord takes up but anyway I'm getting OT again
> >
> > Yes, but by using your advice it loaded Volume 1 of Decline and Fall of
> > The Roman Empire pretty quickly.
>
> How many gigahertz is your machine? :)
> On my Pentium II, it took 8 minutes CPU time before I stopped loading
> it....  I'm actually going to be working on this filter in a few weeks
> (colour text via ANSI escape codes, table support...).

Do you plan to do it for the import only or for the export too?

>
> > I then saved that as a Kword doc (with
> > the new full page format) and reopened it.
> > Memory consumption was negligible!
>
> I don't think KWord internally uses XML (haven't checked kwdoc.cc, but).
>
> > When I loaded the original text file using "As is: At the
> > end of the line", the memory consumption was enormous.
>
> Surely, "Sentence" mode would have taken a similar amount of memory
> (it would be slightly less but not significantly less)?

Not necessarily: ends of lines are not commonly ends of sentences, so you
will usually get far fewer paragraphs than with the "As is" mode.

That is why I made this mode, as it can make quite good paragraphs out of
monolithic texts (for example those converted from HTML with clueless
scripts).

>
> > It used ~500 Mb.
> > So the page formatting makes a big difference in the memory
> > consumed?
>
> Well there isn't really a difference in Page Formatting at all with those
> modes, just how often "paragraphs" get written.  Dunno about memory though
> but I would imagine fewer paragraphs (not "As is" mode) would be better.
>
> > It took about half an hour to paste the 722 additional pages, to make
> > the 1444 page *kwd.
>
> IMHO, that's much too long!  But, I'm too busy to fix that at the moment.

It is perhaps too long, but it is probably the time KWord currently needs to
paginate. (Do not forget: you are making tons of double-precision
floating-point operations; you cannot have them in no time!)

>
> Clarence

Have a nice day/evening/night!

>
>
> ____________________________________
> koffice mailing list
> koffice@mail.kde.org
> To unsubscribe please visit:
> http://mail.kde.org/mailman/listinfo/koffice
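
For readers following the thread, here is a rough sketch of why "As is" and a "Sentence"-like mode can produce very different paragraph counts. It is not the filter's actual code, which is not shown in this thread. The function names and the rule "a paragraph ends when a line ends with '.', '!' or '?'" are assumptions made purely for illustration.

```python
def paragraphs_as_is(text):
    """'As is' behaviour: every end of line ends a paragraph."""
    return text.split("\n")


def paragraphs_by_sentence(text, enders=".!?"):
    """Assumed 'Sentence'-like rule (hypothetical): a paragraph ends only
    when a line ends with sentence-ending punctuation; other lines are
    joined onto the current paragraph."""
    paragraphs = []
    current = []
    for line in text.split("\n"):
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are simply skipped in this sketch
        current.append(stripped)
        if stripped[-1] in enders:
            paragraphs.append(" ".join(current))
            current = []
    if current:  # trailing text without final punctuation
        paragraphs.append(" ".join(current))
    return paragraphs


sample = (
    "The filter reads the file\n"
    "line by line and decides\n"
    "where paragraphs end.\n"
    "Fewer paragraphs are\n"
    "produced this way."
)

as_is = paragraphs_as_is(sample)
by_sentence = paragraphs_by_sentence(sample)
print(len(as_is), len(by_sentence))

# The caveat about headings: a heading has no final punctuation,
# so under this assumed rule the next sentence is appended to it.
heading_case = paragraphs_by_sentence("Chapter 1\nIt was a dark night.")
print(heading_case)
```

On a hard-wrapped text the first function yields one paragraph per line, while the second yields one per run of lines ending in punctuation. That fits the observation above that fewer paragraphs get written outside "As is" mode.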