
List:       kde-devel
Subject:    Re: Storage implementation in KDE
From:       "Manuel Amador (Rudd-O)" <amadorm () usm ! edu ! ec>
Date:       2003-09-08 2:24:42

Hi, Luke!

Quoting Luke Chatburn <lchatburn@isset.org>:

> I don't weigh in too often on these threads, but I'd like to make a few
> points (feel free to flame :) ).
> 
> In general, metadata based filesystems are not terribly user-friendly,
> worse, adding a database into the mix can produce a disaster.

The user-friendliness depends upon the interface presented to the user.
Storage's user interface looked extremely user-friendly to me.

> 
> First and most important point:
> 
> - Metadata does not scale well. It's like having a desk in an office and
> filling it with pieces of paper.

While post-its might be metadata, they are only metadata when they are pasted to
other documents.  If they stand alone, they are data unto themselves.  Metadata
is "data about data": information that describes another piece of information.

> It works fine if you have a limited
> quantity of pieces of paper, but for business and for regular home use over
> a period of time, you do need organisations. In the real world, this is in
> the form Room->Filing Cabinet/Box->Folder->Document. It's a normal
> hierarchical file system. That's not to say that a relational look-up can't
> help with a few things, but it can't be used for everything.

The paper makes a valid point that hierarchical organization limits the user's
ability to manage files.  A hierarchical system depends on the user constantly
recalling from memory where each file was put.  That sucks.

You only need "organization" when you can no longer find what you're looking
for.  If there's a system that lets you get what you want, without the need for
organization, suddenly the need to organize disappears.

Don't let your habits keep you from new ways of thinking =)

> 
> Critical considerations:
> 
> - Metadata searching is very, very expensive in processing terms and worse
> for larger document sets. Moreover, you limit yourself by I/O abilities.

If you were to search in the document's data, I'd agree.  The whole point of
metadata is that it should be stored in a separate physical location, but bound
to the data file logically, so that searches are fast.  Indexes solve the speed
issue.
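
To make the index argument concrete, here is a minimal sketch in Python.  It is
not how Storage does it; the file names, attribute names and the search()
helper are all made up for illustration.  The point is only that a lookup
touches the index and never opens the files themselves.

```python
from collections import defaultdict

# Hypothetical metadata records, kept apart from the file contents.
METADATA = {
    "song1.mp3": {"artist": "Artist1", "album": "Album A"},
    "song2.mp3": {"artist": "Artist1", "album": "Album B"},
    "letter.txt": {"author": "Rudd-O", "kind": "letter"},
}


def build_index(metadata):
    """Map each (attribute, value) pair to the set of files carrying it."""
    index = defaultdict(set)
    for name, attrs in metadata.items():
        for key, value in attrs.items():
            index[(key, value)].add(name)
    return index


def search(index, **criteria):
    """Intersect the file sets of every requested attribute=value pair."""
    sets = [index.get(pair, set()) for pair in criteria.items()]
    return set.intersection(*sets) if sets else set()


INDEX = build_index(METADATA)
```

Calling search(INDEX, artist="Artist1", album="Album B") only does dictionary
lookups and a set intersection, so the cost depends on the number of matches
rather than on the size of the documents.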

> 
> - Using a database for metadata means that you have to have the service

The author's implementation of Storage uses a SQL database, but future
implementations may not need one.  Look at the work of Hans Reiser.

> completely fresh, light-weight DB for this, if you liked, but it is hardly
> ideal, and we'd spend two or three years bug and security-fixing the thing.

No need to.  ReiserFS is already being worked on to provide such facilities.

> 
> - Namespace collisions. Ever run across a problem where you had to work
> with documents with the same name, one for one purpose and one with a few
> changes for another, in another folder (clearly marked as the other purpose,
> hopefully!)?
> The system will have to display many results and offer a clear
> indication to the user where the file was contained.

That assumes that you still have a hierarchical file system.

> Moreover, often users
> will identify files based upon other files in the same location. Metadata
> can compensate for this, but it isn't easy.

Same deal.

> 
> - The user or applications must specify metadata for different files
> constantly and consistently. We can't get MIME types to always work
> effectively, so how do we get all applications to play nicely with this?

Evidently, if you want to keep a myriad of documents, you either put them into
folders or fill in metadata.

Putting them into folders has the disadvantages already discussed.  Filling in
metadata is much easier.  Just let the computer make up the "Folders".
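
A rough sketch of what "letting the computer make up the folders" could look
like, in Python.  The virtual_folders() function, the sample files and the
"(unknown)" label are my own assumptions, not anything taken from Storage.

```python
from collections import defaultdict

# Hypothetical files with whatever metadata happens to be filled in.
SAMPLE = {
    "song1.mp3": {"artist": "Artist1", "album": "Album A"},
    "song2.mp3": {"artist": "Artist2", "album": "Album A"},
    "song3.mp3": {"artist": "Artist1", "album": "Album B"},
    "notes.txt": {"author": "Rudd-O"},
}


def virtual_folders(files, attribute):
    """Group file names by the value of one metadata attribute.

    Files lacking the attribute land in an "(unknown)" folder.
    """
    folders = defaultdict(list)
    for name, attrs in files.items():
        folders[attrs.get(attribute, "(unknown)")].append(name)
    return {label: sorted(names) for label, names in folders.items()}
```

The same files regroup instantly under another attribute:
virtual_folders(SAMPLE, "album") and virtual_folders(SAMPLE, "artist") give two
different "folder" views without anything being moved on disk.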

> 
> - Users are used to thinking in a hierarchical manner. They don't especially
> want to Google for their data and then have to manually sift through the
> results for the right one.

No, it's not googling.  Google offers unstructured text search, which doesn't
leverage metadata much.  Storage DOES leverage metadata, to a degree I haven't
seen offered before.

Go to namesys.com and look for Reiser's paper.  It has great ideas on these
topics.

> 
> With all of that having been said (and there are other arguments), I do agree
> that metadata can be valuable for augmenting existing search facilities.

With current data volumes, it's not only valuable, it's about time!  How many
times have I tried to collect all the songs of a particular album?
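
With metadata in a database, the album case becomes a single indexed query.
Below is a small sketch using Python's sqlite3 module.  The "songs" table, its
columns and the sample paths are assumptions for illustration only; this is not
Storage's actual schema.

```python
import sqlite3


def songs_of_album(conn, album):
    """Return the paths of every song tagged with the given album."""
    rows = conn.execute(
        "SELECT path FROM songs WHERE album = ? ORDER BY path", (album,)
    )
    return [path for (path,) in rows]


# Hypothetical schema: one row of metadata per file, indexed by album.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE songs (path TEXT PRIMARY KEY, artist TEXT, album TEXT)")
conn.execute("CREATE INDEX songs_by_album ON songs (album)")
conn.executemany(
    "INSERT INTO songs VALUES (?, ?, ?)",
    [
        ("/home/me/downloads/track1.mp3", "Artist1", "Album A"),
        ("/mnt/backup/track2.mp3", "Artist1", "Album A"),
        ("/home/me/music/other.mp3", "Artist2", "Album B"),
    ],
)
```

songs_of_album(conn, "Album A") returns both tracks even though they live in
unrelated directories, which is exactly the case that is painful to handle by
folders alone.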

> In
> order to do this effectively and within a reasonable time, however, it needs
> to be done at the filesystem level, not by a high-level system embedded in
> KDE.

Agreed!

> ReiserFS 4 should allow us to do these things with relatively little
> system slowdown, which is why I look forward to it so avidly. Smarter
> filesystems are the key, really, but not SQL database-based FSs.
> 
> I have been thinking that someone might float an idea like this for a while
> now, and it shouldn't surprise me that the Gnome folks decided to blink
> first and give it a go. It was evident as soon as Microsoft mooted it as a
> feature in Longhorn, and it was also evident that the problem was
> non-trivial and that there were a large number of very critical issues.
> 
> The point most worth making, however, is that Windows is stuck at a point
> where they seemingly can't develop further in application or DE
> usability/functionality and they're going for frills. To be frank, though,
> most users don't have a difficult time finding their files (assuming they
> understand the concepts of directories, however).

They do have a difficult time!

> They spend 99% of their
> time sitting staring at an application trying to do work of some kind.

Most of my users don't save letters and the like, because they see no value in
keeping documents they won't be able to find again.  It's that critical!  They
spend time retyping letters.  At first I thought they were being stupid, but
then I understood why they did that.

> I
> believe that there is significant innovation taking place in other aspects
> of the KDE project and applications, and that efforts are better served
> focusing there. When Reiser 4 turns up, this stuff will be trivial to
> implement, but until then, it's duplication of that effort.
> 
> To be honest, there are a number of areas where genuine progress can be made
> to make a real difference, almost immediately. Merging of instant messaging
> with PIM components is a start. Some sort of development to the KDE Notes
> system to assign notes to any document in any application and when the same
> program opens the same document again, they get reopened would be a great
> improvement to workflow.

Yep!  I really wish for that to happen.  And being able to sync my notes with
my Palm would be great!

> 
> If you really want to try something radical, implement this idea I floated
> with the Slicker folks (they seem to be dead atm):
> 
> http://www.linuxcomment.com/slicker.htm
> 
> KDE is already very strong in network transparency, but adding in
> collections of documents, contacts, messages, services/web services and
> other data on a project basis would be a powerful productivity tool.
> 
> At the moment, metadata searching really isn't ready for primetime, and the
> benefits are dubious. Don't let me stop you, but beware! :)

You raised valid points =).  Good luck, Luke!

> 
> -Luke
> 
> ----- Original Message ----- 
> From: "Stefano Borini" <munehiro@ferrara.linux.it>
> To: <kde-devel@kde.org>
> Sent: Sunday, September 07, 2003 12:06 PM
> Subject: Re: Storage implementation in KDE
> 
> 
> > On Sat, Sep 06, 2003 at 12:41:13PM -0700, David wrote:
> > > Someone has to categorize these files.
> >
> > the user, of course, but also some metadata already present in the
> > inserted file, such as the id3 for mp3 or the text itself for documents
> > of whatever nature.
> >
> > > If the user isn't already categorizing their files, a system that
> > > expects them to do so every time a file is created or acquired isn't
> > > going to help.
> >
> > let the user insert keywords instead of filenames. Searching files in
> > this filesystem should become similar to searching with a search engine.
> > If there are too many matches, the ioslave should categorize them with
> > the other, unselected keywords, or by alphabetical index. i.e. i search
> > "love songs" and the system finds 600 files. Then it displays on the
> > screen "love songs (A)" "... (B)" etc... or better "love songs from
> > Artist1" "... from Artist2" etc...
> >
> > About the keyword, they should be left totally arbitrary, in the way
> > similar to ldap. The filename is not a name, but a bunch of metadata
> > informations.
> >
> > Problems arise to give the user the ability to understand that this
> > filesystem and the traditional filesystem (which is needed, at least at
> > the system level) are different and behave in a very different way.
> >
> > > I think a better way to approach this is mapping a filesystem API on top
> > > of a traditional database.A dbfs in other words. The latter has the
> > > advantage of categorizing your mp3 collection without being intrusive.
> > > It could also be compatible with storage, should that be necessary.
> >
> > i'm strongly for this approach, and since i have a deep study of
> > postgres features i propose myself as a contributor if help is needed.
> >
> > > (heck, what we really need are OS/2 HPFS metadata attributes).
> >
> > i have no infos about this, since i've never experienced OS2. can you
> > provide some link ?
> >
> >
> >
> >
> > >> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<
> 
>  
> 


    Good luck,

    Rudd-O

===========================================================
     UNIVERSIDAD TECNICA FEDERICO SANTA MARIA
                 CAMPUS GUAYAQUIL
        CENTRO DE SERVICIOS INFORMATICOS
Mail sent through IMP-USM: http://www.usm.edu.ec/imp
    We invite you to visit  http://www.usm.edu.ec
===========================================================
 