List: kde-devel
Subject: KFile plugins
From: Thomas Kadauke <tkadauke () gmx ! de>
Date: 2005-10-13 15:59:06
Message-ID: 200510131759.06106.tkadauke () gmx ! de
Hello list,
This is my first post to this list, so I want to introduce myself: My name is
Thomas Kadauke, I'm a compsci student in Tübingen, Germany (yeah, where
Matthias Ettrich studied :)
Recently, I uploaded several KFile plugins to kde-apps.org:
- http://www.kde-apps.org/content/show.php?content=30112
BibTeX-plugin: calculates total number of references, number of
book/article/other references
- http://www.kde-apps.org/content/show.php?content=30113
.kdevelop project file plugin: extracts author, e-mail, version, language
and some keywords
- http://www.kde-apps.org/content/show.php?content=30114
LaTeX-plugin: extracts title, author, date and calculates number of
chapters, sections, paragraphs, words, commands, footnotes and comments
- http://www.kde-apps.org/content/show.php?content=30115
M3U playlist plugin: calculates number of tracks, total length, number of
local files/streams
- http://www.kde-apps.org/content/show.php?content=30116
MIDI-plugin: extracts number of tracks, instruments and length
I got several requests to include these plugins in the main KDE
distribution. However, I'm new to KDE development and don't even have an SVN
account for that. So if you think these plugins are useful, feel free to
include them in KDE SVN.
While implementing these plugins, I encountered several problems/bugs:
- Documentation. To be honest, the documentation for KFilePlugin et al. sucks.
Most of the interesting methods are not documented at all. This wouldn't
really be a problem if there were a decent tutorial about KFile plugins. This
is *really* needed, because a KFile-plugin is *the* opportunity for
KDE/Qt-newbies to produce something useful quickly without digging too deep
into kdelibs. But since complaining isn't helping anyone, I volunteer to
update and complete the relevant documentation.
- Four of the five plugins deal with text files. I'm planning on writing even
more kfile-plugins, among them: kate-project, python source, quanta webprj,
icalendar, docbook, rtf, java, vcalendar (if not already there) and vcard (if
not already there). All these are actually (more or less) human-readable text
files. So these could benefit from the meta information of the generic
kfile_txt plugin that extracts line count, word count, etc. However, the
current KFile/KDE API does not permit "mimetype-specialization", which would
be needed to e.g. declare a text/x-latex file to be a specialization of
text/plain. This would also solve the file association problem when e.g. a
new text editor is installed and you want to update all text-based formats to
use this new editor.
- What happens if two mimetypes contain the same filename pattern? AFAICS,
this is handled on a first-come, first-served basis. This, however, is not
satisfactory, as e.g. the types text/x-tex and text/x-latex contain the same
pattern (*.tex), but are completely different in nature. I'm proposing to use
the filename pattern only as a hint, and determine the actual filetype based
on the content.
- Konqueror (in KDE 3.4.1) does not show any meta-information for files of a
text-based mimetype if there is no KFile plugin for that mimetype.
Specialization would help here.
- Some of the mimetypes have to be updated (if possible):
- text/x-tex includes the patterns *.tex (good) and *.ltx (bad). *.ltx
stands for LaTeX files, which generally have little in common with plain
TeX files. Also, I think the patterns *.sty and *.cls should get their own
mimetypes.
- text/x-c++hdr does not include the pattern "*.h", obviously because this
is reserved by text/x-chdr. This, too, would be solved by determining the
mimetype based on content.
- text/x-objchdr does not have any filename pattern.
- I guess Tenor in KDE4 will use KFile-plugins (or whatever there will be for
KDE4) to extract meta information from files. However, the current API is not
sufficient for that. Say I'm a Java programmer who uses Javadoc and I want
to use the KDE search tool to look for a certain Java method. I know that I
could just use the full-text search on text files, but that will most
probably return a lot of noise once the generated documentation is searched
as well. So here the kfile-plugins should be able to extract a list of Java
methods from a Java file (it would be cool to even extract the method
signatures :) and assign a high priority to that information, since it's more
relevant than the same words in the documentation. Right now, they are
restricted to extracting only a summary of a file's content (such as the
number of methods in a Java file, which is rather uninteresting).
- I haven't yet started to write KFile-plugins for programming language source
files such as java or python, because I think a full-blown parser (which is
needed for that) is too much for just extracting the method count and such.
It would be great to reuse existing parsers (maybe from kdevelop) for that
task and to extract all useful information from the source file. Right now,
at least the text-based kfile-plugins are QRegExp-based hacks. A parser that
really understands what it's reading would improve the accuracy of the
extracted meta-information.
- The MIDI-plugin links to the somewhat broken and long-unmaintained
libkmid in kdelibs. Is it true that this library will be thrown out for KDE4? If
not, is anyone going to fix it? (see the source of kfile_midi for a short
description of what is broken)
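To make the pattern-collision points above a bit more concrete (*.tex shared
by text/x-tex and text/x-latex, *.h shared by text/x-chdr and text/x-c++hdr),
here is a rough Python sketch of "pattern as a hint, content decides". It is
not KDE code: the CANDIDATES and SNIFFERS tables, the regular expressions and
the guess_mimetype() name are all assumptions made up for this illustration,
and a real detector would need much more careful checks.

```python
import re

# Hypothetical table: filename suffix -> mimetypes that claim it,
# most generic candidate first.
CANDIDATES = {
    ".tex": ["text/x-tex", "text/x-latex"],
    ".h": ["text/x-chdr", "text/x-c++hdr"],
}

# Hypothetical content checks (crude heuristics, for illustration only).
SNIFFERS = {
    "text/x-latex": lambda text: re.search(
        r"\\document(class|style)|\\begin\{document\}", text) is not None,
    "text/x-c++hdr": lambda text: re.search(
        r"^\s*(class|namespace|template)\b", text, re.MULTILINE) is not None,
}


def guess_mimetype(filename, text):
    """Use the filename suffix only as a hint; let the content decide."""
    for suffix, candidates in CANDIDATES.items():
        if filename.endswith(suffix):
            for mimetype in candidates:
                sniffer = SNIFFERS.get(mimetype)
                if sniffer is not None and sniffer(text):
                    return mimetype
            # No content check matched: fall back to the generic candidate.
            return candidates[0]
    # Unknown suffix: this sketch only deals with text files.
    return "text/plain"
```

With this, a *.tex file containing \documentclass would be reported as
text/x-latex, while a plain TeX file with the same suffix stays text/x-tex.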
So I'm proposing the following (for the upcoming KDE4, obviously):
- use the filename pattern as a hint for the mimetype, especially when no
context is given (e.g. in konqueror). When a context IS given (e.g. in krita,
you're most probably dealing with image files), but the pattern is unknown,
try a context-specific list of mimetypes based on the file contents.
- use a KFile-plugin to determine if the contents of a file match the
mimetype, regardless of the filename pattern.
- allow mimetypes to be a specialization of another mimetype (I think there is
no need to allow "multiple inheritance"). The benefit here will most
obviously be in text files. But it would also help to extract meta
information from all the XML-based file types out there. The KFile-plugin
for a more specialized mimetype must extract all information that the
KFile-plugin for the less specialized mimetype claims to extract.
- allow kfile-plugins to extract lists of meta-information (see above). This
could be generalized to full-text extraction: the kfile_txt plugin could
extract the meta-field "words", which contains the list of words in the
document. Besides, you get the word count for free.
- allow a priority to be assigned to meta information. The more specialized a
mimetype (and therefore a kfile-plugin) is, the more important the
extracted information tends to be.
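The last three points (specialization, list-valued fields, priorities) could
fit together roughly as in the following Python sketch. Again, this is only
an illustration under my own assumptions, not a proposed API: the PARENT
table, the extractor functions and the (priority, value) layout are all
hypothetical names and choices.

```python
import re

# Hypothetical single-inheritance table: mimetype -> parent (None at the root).
PARENT = {
    "text/plain": None,
    "text/x-latex": "text/plain",
}


def extract_plain(text):
    words = text.split()
    # "words" is a list-valued field; the word count comes for free.
    return {"lines": len(text.splitlines()),
            "words": words,
            "word_count": len(words)}


def extract_latex(text):
    # List-valued field: the titles of all \section{...} commands.
    return {"sections": re.findall(r"\\section\{([^}]*)\}", text)}


# Hypothetical "plugins", one per mimetype.
EXTRACTORS = {"text/plain": extract_plain, "text/x-latex": extract_latex}


def chain(mimetype):
    """Return the mimetype and its ancestors, most specialized first."""
    result = []
    while mimetype is not None:
        result.append(mimetype)
        mimetype = PARENT[mimetype]
    return result


def extract(mimetype, text):
    """Collect the fields of every type in the chain, each with a priority."""
    info = {}
    # Walk from generic to specialized so that a more specialized plugin
    # overrides a field of the same name from a more generic one.
    for depth, mt in enumerate(reversed(chain(mimetype))):
        for key, value in EXTRACTORS[mt](text).items():
            info[key] = (depth, value)  # higher depth = more specialized
    return info
```

Because the generic extractor always runs as part of the chain, the
specialized type automatically provides everything the generic one claims to
extract, and a search tool could rank hits in higher-priority fields first.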
OK, I guess that's it for now. Please tell me what you think. And thanks for
your patience :)
--Thomas
>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<