List: kde-panel-devel
Subject: Fwd: Scrap baloo?
From: Christoph Cullmann <cullmann () absint ! com>
Date: 2016-09-14 22:13:56
Message-ID: 1619031179.21635.1473891236452.JavaMail.zimbra () absint ! com
FYI: if you care, please follow the discussion on kde-frameworks-devel;
I guess double posting only ends in pain.
Greetings
Christoph
----- Forwarded message -----
From: "cullmann" <cullmann@absint.com>
To: "kde-frameworks-devel" <kde-frameworks-devel@kde.org>
Sent: Wednesday, 14 September 2016 23:29:22
Subject: Scrap baloo?
Hi,
first, read this excerpt from my mail in the maintainer thread:
<snip>
Hi,
after looking a bit more at the code, I think there are ATM a lot of things that need fixing:
1) 32-bit systems: I see no fix; with > 1GB of index, baloo and all applications using baloo fail.
See bugs like https://bugs.kde.org/show_bug.cgi?id=356114, where we have the 5GB limit, which is now raised for 64-bit, but not for 32-bit.
2) Larger filesystems: unfortunately it was decided to ignore the upper 32 bits of the inodes:
    /**
     * Convert the QT_STATBUF into a 64 bit unique identifier for the file.
     * This identifier is combination of the device id and inode number.
     */
    inline quint64 statBufToId(const QT_STATBUF& stBuf)
    {
        // We're loosing 32 bits of info, so this could potentially break
        // on file systems with really large inode and device ids
        return devIdAndInodeToId(static_cast<quint32>(stBuf.st_dev),
                                 static_cast<quint32>(stBuf.st_ino));
    }
=> random breakage, e.g. on my NFS drive here, as the IDs clash and all invariants no longer hold (e.g. something can be a file but in addition a directory, ...).
3) No error handling of most lmdb faults (as already mentioned)
4) No error handling for any data corruption: e.g. many places will just loop endlessly or malloc, like DocumentUrlDB::get(quint64 docId) (we have bugs for that)
5) lmdb locking issues: crash one read-write process => all other things stall (or crash because of 3+4)
6) No resource management nor crash handling for the baloo_file_extractor, which either OOMs you or corrupts the database on crash, leading to 5)
CC'd Vishesh; perhaps I am wrong about these issues and misunderstand the code. Unfortunately, e.g. the database structure is not that well documented, unless I just failed to find the correct docs in the git repository.
</snip>
Now the executive summary, after one more day of looking at the code.
1) 32-bit systems: will never be usable, thanks to lmdb, at least not with non-trivial index sizes
2) Network file system homes: will never be usable, thanks to lmdb (ask its author, http://lmdb.tech/doc/: "Do not use LMDB databases on remote filesystems, even between processes on the same host. This breaks flock() on some OSes, possibly memory map sync, and certainly sync between programs on different hosts.")
3) Close to no error handling in the code => see the crash reports; I cleaned up a bit, but they are piling up:
https://bugs.kde.org/reports.cgi?product=frameworks-baloo&output=show_chart&datasets=CONFIRMED&datasets=ASSIGNED&datasets=REOPENED&datasets=UNCONFIRMED&datasets=RESOLVED&banner=1
4) Fundamental problems like: wrong data structure for the index (32-bit inodes in the 21st century?) and close to zero docs on what it does internally
Proposal:
Scrap baloo_file* and Co. and just reimplement the public API (modulo the settings for the then non-existing indexer daemon) to use tracker.
Benefits:
1) Tracker is maintained: https://github.com/GNOME/tracker/graphs/contributors
2) We share the index with GNOME/* and save double indexing on "many" Linux systems which are not plain KDE Plasma Desktop based
3) We can delete 99% of the code (the question is whether we can also remove the very buggy extractors from KFileMetaData at some point afterwards).
=> Opinions?
Greetings
Christoph
--
----------------------------- Dr.-Ing. Christoph Cullmann ---------
AbsInt Angewandte Informatik GmbH Email: cullmann@AbsInt.com
Science Park 1 Tel: +49-681-38360-22
66123 Saarbrücken Fax: +49-681-38360-20
GERMANY WWW: http://www.AbsInt.com
--------------------------------------------------------------------
Geschäftsführung: Dr.-Ing. Christian Ferdinand
Eingetragen im Handelsregister des Amtsgerichts Saarbrücken, HRB 11234