'Re: [Kde-pim] Review Request 124967: Try to optimize KMime::CRLFtoLF() and KMime::LFtoCRLF() in the '

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       kde-pim
Subject:    Re: [Kde-pim] Review Request 124967: Try to optimize KMime::CRLFtoLF() and KMime::LFtoCRLF() in the 
From:       "Volker Krause" <vkrause () kde ! org>
Date:       2015-08-28 12:10:09
Message-ID: 20150828121009.11390.9142 () mimi ! kde ! org
[Download RAW message or body]

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/124967/#review84538
-----------------------------------------------------------

Ship it!

Nice :)

Unfortunately I don't think the assumption holds though, these functions are used to \
process arbitrary crappy input from the internet, if there is one thing you can be \
sure about it's not following any rules ;-(

I think the real solution is to avoid those methods altogether in the long run and \
make sure our parsers can handle both line endings.

- Volker Krause

On Aug. 28, 2015, 11:51 a.m., Daniel Vrátil wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/124967/
> -----------------------------------------------------------
> 
> (Updated Aug. 28, 2015, 11:51 a.m.)
> 
> 
> Review request for KDEPIM and Volker Krause.
> 
> 
> Repository: kmime
> 
> 
> Description
> -------
> 
> `QByteArray::indexOf()` is way cheaper than `QByteArray::replace()` and does not \
> cause detach. This hugely optimizes the ideal case when the data passed to \
> `KMime::CRLFtoLF()` already use `\n` instead of `\r\n` (basically all mails written \
> by KMime), because it does not trigger deep copy. 
> In the bad case when we actually need to do the replace the added cost of \
> `indexOf()` is negligable compared to `replace()`, especially because `indexOf()` \
> will return after scanning only single line of the data until running to the first \
> `\r\n` (in most cases, anyway, and that's what we are optimizing for). 
> 
> We could optimize the `CRLFtoLF()` case even further by assuming that the entire \
> document uses either `\n` or `\r\n` consistently. In that case we would be able to \
> avoid scanning the entire document for `\r\n` but instead only look for the first \
> `\n` and check if it's preceeded by `\r` or not. This basically means that in both \
> cases (good and bad) we would only ever need to scan the first line of the \
> document, and then return immediatelly or go for the `replace()`: 
> 
> const int pos = s.indexOf('\n');
> if (pos < 1 || s[pos - 1] != '\r') {
> return s;
> }
> ...
> 
> 
> Question is whether the assumption about consistent use of line ends is acceptable.
> 
> 
> Diffs
> -----
> 
> src/kmime_util.cpp 47ca087 
> 
> Diff: https://git.reviewboard.kde.org/r/124967/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Vrátil
> 
> 

_______________________________________________
KDE PIM mailing list kde-pim@kde.org
https://mail.kde.org/mailman/listinfo/kde-pim
KDE PIM home page at http://pim.kde.org/

[prev in list] [next in list] [prev in thread] [next in thread]