I don't weigh in too often on these threads, but I'd like to make a few points (feel free to flame :) ). In general, metadata-based filesystems are not terribly user-friendly; worse, adding a database into the mix can produce a disaster.

First and most important point:

- Metadata does not scale well. It's like having a desk in an office and filling it with pieces of paper. It works fine if you have a limited quantity of pieces of paper, but for business and for regular home use over a period of time, you do need organisation. In the real world, this is in the form Room->Filing Cabinet/Box->Folder->Document. It's a normal hierarchical file system. That's not to say that a relational look-up can't help with a few things, but it can't be used for everything.

Critical considerations:

- Metadata searching is very, very expensive in processing terms, and worse for larger document sets. Moreover, you are limited by I/O throughput.

- Using a database for metadata means that you have to have the service running constantly in the background. Security considerations should be inserted here. Moreover, it is taking up cycles and a very significant chunk of memory. The more files, the worse the footprint. SQL DBs are huge, really, especially when you add the interpretation stage for SQL on top of the myriad different functions they have to perform. You could write a completely fresh, light-weight DB for this, if you liked, but it is hardly ideal, and we'd spend two or three years bug- and security-fixing the thing.

- Namespace collisions. Ever run across a problem where you had to work with documents with the same name, one for one purpose and one with a few changes for another, in another folder (clearly marked as the other purpose, hopefully!)? The system will have to display many results and offer a clear indication to the user of where each file is stored. Moreover, users often identify files based upon other files in the same location.
Metadata can compensate for this, but it isn't easy.

- The user or applications must specify metadata for different files constantly and consistently. We can't get MIME types to always work effectively, so how do we get all applications to play nicely with this?

- Users are used to thinking in a hierarchical manner. They don't especially want to Google for their data and then have to manually sift through the results for the right one.

With all of that having been said (and there are other arguments), I do agree that metadata can be valuable for augmenting existing search facilities. In order to do this effectively and within a reasonable time, however, it needs to be done at the filesystem level, not by a high-level system embedded in KDE. ReiserFS 4 should allow us to do these things with relatively little system slowdown, which is why I look forward to it so avidly. Smarter filesystems are the key, really, but not SQL database-based FSs.

I have been thinking for a while now that someone might float an idea like this, and it shouldn't surprise me that the Gnome folks decided to blink first and give it a go. It was evident as soon as Microsoft mooted it as a feature in Longhorn, and it was also evident that the problem was non-trivial and that there were a large number of very critical issues. The point most worth making, however, is that Windows is stuck at a point where they seemingly can't develop further in application or DE usability/functionality and they're going for frills.

To be frank, though, most users don't have a difficult time finding their files (assuming they understand the concept of directories, however). They spend 99% of their time sitting staring at an application trying to do work of some kind. I believe that there is significant innovation taking place in other aspects of the KDE project and applications, and that efforts are better served focusing there.
When Reiser 4 turns up, this stuff will be trivial to implement, but until then, it's duplication of that effort.

To be honest, there are a number of areas where genuine progress can be made to make a real difference, almost immediately. Merging of instant messaging with PIM components is a start. Extending the KDE Notes system so that notes can be attached to any document in any application, and are reopened when the same program opens the same document again, would be a great improvement to workflow. If you really want to try something radical, implement this idea I floated with the Slicker folks (they seem to be dead atm):

http://www.linuxcomment.com/slicker.htm

KDE is already very strong in network transparency, but adding in collections of documents, contacts, messages, services/web services and other data on a project basis would be a powerful productivity tool.

At the moment, metadata searching really isn't ready for primetime, and the benefits are dubious. Don't let me stop you, but beware! :)

-Luke

----- Original Message -----
From: "Stefano Borini"
To:
Sent: Sunday, September 07, 2003 12:06 PM
Subject: Re: Storage implementation in KDE

> On Sat, Sep 06, 2003 at 12:41:13PM -0700, David wrote:
> > Someone has to categorize these files.
>
> the user, of course, but also some metadata already present in the
> inserted file, such as the id3 for mp3 or the text itself for documents
> of whatever nature.
>
> > If the user isn't already categorizing their files, a system that
> > expects them to do so every time a file is created or acquired isn't
> > going to help.
>
> let the user insert keywords instead of filenames. Searching files in
> this filesystem should become similar to searching with a search engine.
> If there are too many matches, the ioslave should categorize them with
> the other, unselected keywords, or by alphabetical index. i.e. i search
> "love songs" and the system finds 600 files. Then it displays on the
> screen "love songs (A)" "...
> (B)" etc... or better "love songs from
> Artist1" "... from Artist2" etc...
>
> About the keywords, they should be left totally arbitrary, in a way
> similar to ldap. The filename is not a name, but a bunch of metadata
> information.
>
> Problems arise in getting the user to understand that this
> filesystem and the traditional filesystem (which is needed, at least at
> the system level) are different and behave in a very different way.
>
> > I think a better way to approach this is mapping a filesystem API on top
> > of a traditional database. A dbfs in other words. The latter has the
> > advantage of categorizing your mp3 collection without being intrusive.
> > It could also be compatible with storage, should that be necessary.
>
> i'm strongly for this approach, and since i have a deep study of
> postgres features i propose myself as a contributor if help is needed.
>
> > (heck, what we really need are OS/2 HPFS metadata attributes).
>
> i have no info about this, since i've never experienced OS2. can you
> provide some link ?
>
>
>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<