[prev in list] [next in list] [prev in thread] [next in thread]
List: koffice-devel
Subject: Re: design discussion: caching QTextFormat for faster text loading
From: Jos van den Oever <jos.van.den.oever () kogmbh ! com>
Date: 2010-08-16 9:59:21
Message-ID: 201008161517.22941.jos.van.den.oever () kogmbh ! com
[Download RAW message or body]
On Friday, August 13, 2010 17:00:53 pm zander@kde.org wrote:
> On Friday 13. August 2010 11.59.10 Jos van den Oever wrote:
> > As you may have noticed, I'm working on speeding up KOffice at the
> > moment. The biggest bottleneck for the loading of documents at the
> > moment is our styling system. In this mail, I want to explain why this
> > is the case and give a suggestion for improving the situation.
>
> The idea sounds intriguing :) I like it.
>
> There is something extra that you might be very happy to learn as this will
> make things a lot easier for you.
>
> In QTextDocument each unique QTextFormat gets put in a list and gets an
> index that will never change over the lifetime of the QTextDocument.
> See QTextDocument::allFormats().
QTextDocument::allFormats() returns a QVector<QTextFormat>.
The QTextDocument says:
Each document element is described by an associated format object. Each
format object is treated as a unique object by QTextDocuments, and can be
passed to objectForFormat() to obtain the document element that it is
applied to.
Each QTextFormat, which is basically a QMap<uint,QVariant>, contains an index
over which the object it applies to can be retrieved. This makes each
QTextFormat unique even though many paragraphs can have the same formatting.
So comparing QTextFormat for two paragraphs will return false if the formats
are associated with a different paragraph. If the the format is not associated
with a QTextDocument, the property ObjectIndex is not set and the formats
would be seen as equal.
Comparing QTextFormat objects is expensive. This is sped up somewhat by the
internal hash Qt uses for them, but as soon as one property is changed, the
hash is invalid. So taking two formats from a QTextDocument and comparing them
after removing the ObjectIndex property is always a full loop over all
properties in the two QTextFormat instances and hence always expensive.
The fact that the property ObjectIndex is part of the QTextFormat is a big
design problem in Qt, because now two styles on different paragraph always take
separate memory, even though the are the same (apart from the ObjectIndex
value).
> This would mean that KoParagraphStyle::applyStyle and
> KoCharacterStyle::applyStyle could get some extra code to detect the ideal
> case; which is almost always hit in an empty document. The ideal case is
> when we apply the style on a block without style.
I do not see how to check that. Can you tell me how I can check whether a
QTextCursor or QTextBlock has no QTextFormat associated with it?
The most relevant signature are these:
void KoCharacterStyle::applyStyle(QTextCursor *selection) const
void KoCharacterStyle::applyStyle(QTextBlock &block) const
void KoCharacterStyle::applyStyle(QTextCursor *selection) const
void KoParagraphStyle::applyStyle(QTextBlockFormat &format) const
void KoParagraphStyle::applyStyle(QTextBlock &block, bool applyListStyle)
const
In my opinion, any call passing a QTextCursor, QTextBlock or a QTextFormat
where ObjectIndex is set is problematic. The Ko*Style classes should convert
from a style to a QTextFormat which is then set on a part of the
QTextDocument. That way all the performance problems of QTextFormat are
avoided and the QTextFormat instances can be cached during loading.
> So a block has a format set to index (QTextBlock::blockFormatIndex) that is
> known to be 'empty' and we turn it into a know other format. The tricky
> part is that this is per QTextDocument instance.
> But I bet its easy to store a little structure inside the style keeping a
> QTextDocument pointer, an int of the 'before' format Id and the new format.
> Which you just apply.
The style should not have a pointer to QTextDocument in my opinion. The style
is a representation of the ODF style element that applies to the entire ODF
document and giving it private information pertaining to one specific text
shape breaks that.
In fact, to have a cleaner design and allow for better predictability and
caching, the Ko*Style classes should only work with QTextFormat objects and
not with any other of the QText* classes in their api and implementation.
Then the Ko*Style classes would give back QTextFormat objects that are applied
to the relevant block, instead of letting the styles merge the style they
represent with the style that is present at the particular position.
> That would be fast as all you need to do for knowing you can apply the
> format is to do one integer comparison.
I do not see how this would work. Can you explain it with some code?
Cheers,
Jos
--
Jos van den Oever, software architect
+49 391 25 19 15 53
http://kogmbh.com/legal/
_______________________________________________
koffice-devel mailing list
koffice-devel@kde.org
https://mail.kde.org/mailman/listinfo/koffice-devel
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic