[prev in list] [next in list] [prev in thread] [next in thread]
List: kde-frameworks-devel
Subject: Re: Review Request 125762: External extractor plugin support for KFileMetaData
From: "Vishesh Handa" <me () vhanda ! in>
Date: 2015-12-06 12:02:10
Message-ID: 20151206120210.29696.65003 () mimi ! kde ! org
[Download RAW message or body]
--===============0936766810465421497==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/125762/#review89174
-----------------------------------------------------------
If you're planning on pushing this please push it to a testing branch. Untill we \
actually have some plugins using this we might not be sure of the API. My points \
below are quite negative and I was hoping on writing a more positive email about the \
different ways to go about this, but I seem to have been procrastinating and have not \
written it. Sorry.
My main objections with this -
1. One cannot choose between external plugins. Since all of them are in 1-plugin \
(called ExternalPlugin), and all the mimtypes are just combined. This matters less in \
the case of Baloo, but when someone else wants to choose specific extractors. It's \
impossible.
2. Extraction of plain text is simply not supported.
3. No way to differentiate in the implementation of a plugin. We may get plugins for \
the same mimetype using different technologies, perhaps we need a way to identify \
which one to choose. Maybe this will never be a problem, but I'm skeptical.
4. Contamination of the PATH, maybe we could put them in a different place which \
makes it obvious that users should never execute them.
5. Getting the list of extractors now means running a possibly large number of \
executables with possibly bad implementations. They could just get stuck, and all \
other plugins will suffer. Perhaps the list of mimetypes supported could be in a \
desktop file?
- Vishesh Handa
On Oct. 24, 2015, 12:19 p.m., Boudhayan Gupta wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/125762/
> -----------------------------------------------------------
>
> (Updated Oct. 24, 2015, 12:19 p.m.)
>
>
> Review request for Baloo, KDE Frameworks, Pinak Ahuja, and Vishesh Handa.
>
>
> Repository: kfilemetadata
>
>
> Description
> -------
>
> This patch introduces support for external metadata extractors in KFileMetaData
>
> The external extractors themselves can be written in any language, provided that it \
> can be executed as a standalone executable (compiled or script with a hashbang), \
> with command line arguments, and can output data to stdout.
> The extractors are executed like so:
>
> * `extractor --mimetypes` - outputs a list of mimetypes supported by the extractor, \
> one per line.
> * `extractor filename` - outputs a json document with the metadata. The keys are \
> such that they can be directly used with PropertyInfo::fromName().
> At the KFileMetaData end, an additional internal plugin (ExternalExtractor) is \
> provided that forms a conduit between external extractors and the internal API. \
> This plugin looks for executables called kfilemetadata_extractor_<something> in \
> /usr/bin to find external extractors, and executes them with the --mimetypes arg to \
> find the list of mimetypes each extractor supports. ExternalExtractor then claims \
> to support all of these mimetypes, and then delegates to the extractor executable \
> when doing the actual extraction.
>
> Diffs
> -----
>
> README.md 19b1a26
> src/extractors/CMakeLists.txt 5dd223e
> src/extractors/externalextractor.h PRE-CREATION
> src/extractors/externalextractor.cpp PRE-CREATION
>
> Diff: https://git.reviewboard.kde.org/r/125762/diff/
>
>
> Testing
> -------
>
> Tested with the sample executable file extractor (as attched, written in python) \
> with the dump manual test in KFileMetaData. Works.
>
> File Attachments
> ----------------
>
> kfilemetadata_extractor_executable
> https://git.reviewboard.kde.org/media/uploaded/files/2015/10/23/146b657f-31d9-4117-a82f-ef966a6339d4__kfilemetadata_extractor_executable
>
>
> Thanks,
>
> Boudhayan Gupta
>
>
--===============0936766810465421497==
MIME-Version: 1.0
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: 7bit
<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="12" style="border: 1px #c9c399 \
solid; border-radius: 6px; -moz-border-radius: 6px; -webkit-border-radius: 6px;"> \
<tr> <td>
This is an automatically generated e-mail. To reply, visit:
<a href="https://git.reviewboard.kde.org/r/125762/">https://git.reviewboard.kde.org/r/125762/</a>
</td>
</tr>
</table>
<br />
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;"><p style="padding: \
0;text-rendering: inherit;margin: 0;line-height: inherit;white-space: inherit;">If \
you're planning on pushing this please push it to a testing branch. Untill we \
actually have some plugins using this we might not be sure of the API. My points \
below are quite negative and I was hoping on writing a more positive email about the \
different ways to go about this, but I seem to have been procrastinating and have not \
written it. Sorry.</p> <p style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: inherit;">My main objections with this -</p> <ol \
style="padding: 0;text-rendering: inherit;margin: 0 0 0 2em;line-height: \
inherit;white-space: normal;"> <li style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: normal;"> <p style="padding: 0;text-rendering: \
inherit;margin: 0;line-height: inherit;white-space: inherit;">One cannot choose \
between external plugins. Since all of them are in 1-plugin (called ExternalPlugin), \
and all the mimtypes are just combined. This matters less in the case of Baloo, but \
when someone else wants to choose specific extractors. It's impossible.</p> </li>
<li style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: normal;"> <p style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: inherit;">Extraction of plain text is simply not \
supported.</p> </li>
<li style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: normal;"> <p style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: inherit;">No way to differentiate in the \
implementation of a plugin. We may get plugins for the same mimetype using different \
technologies, perhaps we need a way to identify which one to choose. Maybe this will \
never be a problem, but I'm skeptical.</p> </li>
<li style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: normal;"> <p style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: inherit;">Contamination of the PATH, maybe we \
could put them in a different place which makes it obvious that users should never \
execute them.</p> </li>
<li style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: normal;"> <p style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: inherit;">Getting the list of extractors now \
means running a possibly large number of executables with possibly bad \
implementations. They could just get stuck, and all other plugins will suffer. \
Perhaps the list of mimetypes supported could be in a desktop file?</p> </li>
</ol></pre>
<br />
<p>- Vishesh Handa</p>
<br />
<p>On October 24th, 2015, 12:19 p.m. UTC, Boudhayan Gupta wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="12" style="border: \
1px #888a85 solid; border-radius: 6px; -moz-border-radius: 6px; \
-webkit-border-radius: 6px;"> <tr>
<td>
<div>Review request for Baloo, KDE Frameworks, Pinak Ahuja, and Vishesh Handa.</div>
<div>By Boudhayan Gupta.</div>
<p style="color: grey;"><i>Updated Oct. 24, 2015, 12:19 p.m.</i></p>
<div style="margin-top: 1.5em;">
<b style="color: #575012; font-size: 10pt;">Repository: </b>
kfilemetadata
</div>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" \
style="border: 1px solid #b8b5a0"> <tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;"><p style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: inherit;">This patch introduces support for external metadata \
extractors in KFileMetaData</p> <p style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: inherit;">The external extractors themselves can \
be written in any language, provided that it can be executed as a standalone \
executable (compiled or script with a hashbang), with command line arguments, and can \
output data to stdout.</p> <p style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: inherit;">The extractors are executed like \
so:</p> <ul style="padding: 0;text-rendering: inherit;margin: 0 0 0 1em;line-height: \
inherit;white-space: normal;"> <li style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: normal;"><code style="text-rendering: \
inherit;color: #4444cc;padding: 0;white-space: normal;margin: 0;line-height: \
inherit;">extractor --mimetypes</code> - outputs a list of mimetypes supported by the \
extractor, one per line.</li> <li style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: normal;"><code style="text-rendering: \
inherit;color: #4444cc;padding: 0;white-space: normal;margin: 0;line-height: \
inherit;">extractor filename</code> - outputs a json document with the metadata. The \
keys are such that they can be directly used with PropertyInfo::fromName().</li> \
</ul> <p style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: inherit;">At the KFileMetaData end, an additional internal \
plugin (ExternalExtractor) is provided that forms a conduit between external \
extractors and the internal API. This plugin looks for executables called \
kfilemetadata_extractor_<something> in /usr/bin to find external extractors, \
and executes them with the --mimetypes arg to find the list of mimetypes each \
extractor supports. ExternalExtractor then claims to support all of these mimetypes, \
and then delegates to the extractor executable when doing the actual \
extraction.</p></pre> </td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: \
1px solid #b8b5a0"> <tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;"><p style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: inherit;">Tested with the sample executable file extractor (as \
attched, written in python) with the dump manual test in KFileMetaData. \
Works.</p></pre> </td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>README.md <span style="color: grey">(19b1a26)</span></li>
<li>src/extractors/CMakeLists.txt <span style="color: grey">(5dd223e)</span></li>
<li>src/extractors/externalextractor.h <span style="color: \
grey">(PRE-CREATION)</span></li>
<li>src/extractors/externalextractor.cpp <span style="color: \
grey">(PRE-CREATION)</span></li>
</ul>
<p><a href="https://git.reviewboard.kde.org/r/125762/diff/" style="margin-left: \
3em;">View Diff</a></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">File Attachments \
</h1>
<li><a href="https://git.reviewboard.kde.org/media/uploaded/files/2015/10/23/146b657f \
-31d9-4117-a82f-ef966a6339d4__kfilemetadata_extractor_executable">kfilemetadata_extractor_executable</a></li>
</ul>
</td>
</tr>
</table>
</div>
</body>
</html>
--===============0936766810465421497==--
[Attachment #3 (text/plain)]
_______________________________________________
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic