From koffice-devel  Thu Nov 19 10:03:16 2009
From: Ariya Hidayat <ariya () kde ! org>
Date: Thu, 19 Nov 2009 10:03:16 +0000
To: koffice-devel
Subject: Re: koffice/libs/flake
Message-Id: <ba035dd10911190203o5c99cccfh2d037e847e42ad3c () mail ! gmail ! com>
X-MARC-Message: https://marc.info/?l=koffice-devel&m=125862505823384

> I had another look at the code, and the part that Thomas thinks is expensive
> was already used in the old code for big images. The difference is that it is
> now also used for small images, where this should not be a problem. Also, it is
> better memory-wise to have no duplicate images. Therefore a small overhead
> when the image is created can be accepted, in my opinion.

I do not fully understand the problem or the proposed solution,
but may I offer a suggestion anyway?

Rather than relying only on the cryptographic hash of the whole image
(converted to PNG), how about having a complementary "quick hash" as
well? This hash value is calculated from, e.g., the first N RGBA (int32)
values of the image data, where N is a small number such as 32 or 64.
If the quick hashes differ, we can be sure that the images are not the
same, and we learn this at only a small cost in speed. If the quick
hashes are the same, more checks need to be done, and this is the
(hopefully rare) case where we take the "hit".

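To make the idea concrete, here is a rough sketch in Python (not KOffice
code). The names quick_hash, full_hash and same_image are made up for
illustration, CRC32 is just one possible cheap hash, and SHA-256 over the
raw pixel values stands in for the existing cryptographic hash of the PNG
data.

```python
import hashlib
import struct
import zlib


def quick_hash(pixels, n=64):
    """Cheap hash over the first n RGBA values (each a 32-bit unsigned int)."""
    head = pixels[:n]
    return zlib.crc32(struct.pack(f"<{len(head)}I", *head))


def full_hash(pixels):
    """Expensive hash over all pixel data (stand-in for the real full hash)."""
    data = struct.pack(f"<{len(pixels)}I", *pixels)
    return hashlib.sha256(data).hexdigest()


def same_image(a, b, n=64):
    """Compare quick hashes first; compute the full hash only if they match."""
    if quick_hash(a, n) != quick_hash(b, n):
        return False  # definitely different, decided cheaply
    return full_hash(a) == full_hash(b)  # the rarer, expensive path
```

In a real image collection the quick hash would presumably be stored next
to each image, so that a lookup only pays for the full hash when a quick
hash matches.
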
Of course, to avoid the (hopefully still rare) case where two images
are different yet their first few pixels are the same (e.g. a logo with
a large transparent border area), which gives a false quick-hash match,
you can change "first N RGBA values" to, e.g., "first N prime-number-indexed
RGBA values", i.e. calculated from the 2nd, 3rd, 5th, 7th pixels, and so on.
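
The prime-indexed variant could look roughly like this, again as an
illustrative Python sketch. The helper names first_primes and
prime_quick_hash are hypothetical, CRC32 is an arbitrary choice of cheap
hash, and positions are counted from 1 so that the 2nd, 3rd, 5th, 7th
pixels are sampled.

```python
import struct
import zlib


def first_primes(count):
    """Return the first `count` prime numbers by simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def prime_quick_hash(pixels, n=64):
    """Cheap hash over the pixels at the first n prime positions (1-based)."""
    positions = [p for p in first_primes(n) if p <= len(pixels)]
    sample = [pixels[p - 1] for p in positions]
    return zlib.crc32(struct.pack(f"<{len(sample)}I", *sample))
```

With N = 64 the sampled positions reach up to the 311th pixel, so the
sample spreads further into the image than the first 64 pixels would,
while still hashing only 64 values.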

My apologies if this idea is complete nonsense.


Regards,


-- 
Ariya Hidayat
http://www.linkedin.com/in/ariyahidayat
_______________________________________________
koffice-devel mailing list
koffice-devel@kde.org
https://mail.kde.org/mailman/listinfo/koffice-devel