
List:       kde-devel
Subject:    databases, xml, filesystems and search (was: Magellan)
From:       rupert thurner <rthurner () edu ! uni-klu ! ac ! at>
Date:       1999-04-06 9:45:35

Andreas Pour wrote:
> things a file system can never do.  I don't know where this bias against databases
> comes from . . . :-( .
using databases for simple things complicates the usage. the main
purpose of databases is to offer the acid properties for
transactions (atomicity, consistency, isolation, and
durability). and where the unit of locking is large, in a (nearly)
single user environment (a message, the preferences of a whole
application), this enhanced functionality is simply overkill. and
the databases mentioned in the thread offer no object
orientation. so storing structured documents, searching their
contents efficiently, and simply changing or adding contents
is not easy to achieve.
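
to make the "atomicity" part of acid concrete, here is a minimal
sketch using python's built-in sqlite3 module. the table, the
column names and the add_items helper are invented for
illustration; they are not taken from any product mentioned in
this thread.

```python
import sqlite3

# in-memory database; the "cart" table is a hypothetical example
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cart (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.execute("INSERT INTO cart VALUES ('book', 1)")
conn.commit()


def add_items(conn, items):
    """insert all items or none of them (atomicity)."""
    try:
        # the connection as a context manager commits on success
        # and rolls the open transaction back on an exception
        with conn:
            for item, qty in items:
                conn.execute("INSERT INTO cart VALUES (?, ?)", (item, qty))
        return True
    except sqlite3.IntegrityError:
        return False


# 'book' already exists, so the second insert violates the primary key
# and the first insert ('pen') is rolled back together with it
ok = add_items(conn, [("pen", 2), ("book", 5)])
rows = conn.execute("SELECT item, qty FROM cart ORDER BY item").fetchall()
print(ok, rows)
```

this all-or-nothing behaviour is what a transaction buys you. for
a single user editing his own preferences there is rarely a
second writer to protect against.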

take oracle as an example, and what they were and still are
trying all the time: storing everything in a database. and take
the microsoft registry as an example:

----------------------
1. oracle interoffice
some years ago (and i think this interoffice thing still exists)
oracle started to develop a messaging solution. everything should
be stored in the database. did you ever see or try to use the
resulting product? i did (our company is very oracle based, and
the bosses said: try to use interoffice, because everything is in
the database, accessible via sql, easy to export, you get the
database advantages for free, etc.; all the arguments that
came up here too).

what was the result? interoffice was:
- slow
- usability miles behind other messaging
  solutions (netscape, standard unix, microsoft, lotus)
- complicated to install (database, interfaces and application)
- and, search capabilities were a simple catastrophe
- and all this with one of the most advanced relational databases

why did this happen? the overhead of using the database crowded
out the customer orientation (the client was slow, large and
buggy, missing a whole bunch of functionality that was state of
the art in the competing products). the product gave the
impression that the developers were completely fixated on their
database. and the oracle full-text search capability did not
offer searching document contents (word, multimedia, etc).
searching only message headers and dates is not full search
support.

the only thing where a database was practical was addresses. and
oracle also implemented ldap so that standard mail clients could
access their directory.

------------------
2. oracle raw iron
they try to replace operating system overhead with a
lightweight thing with database functionality included. what
will happen? you can bet your head, the same thing as with
interoffice: too large, too complicated, and therefore too buggy
and too slow.

---------------------
3. microsoft registry
did you ever run a microsoft machine for a longer time, install
applications, and remove applications? you would have noticed
that the registry grows and grows and the machine slows down.
removing keys that are no longer needed is a horror. and all this
for the sake of a simple interface to access it? is editing
preferences a multiuser environment where you need transactions?
no, not at all.


----------------------------------------------------------------------
what's the solution of the web community and advanced operating
systems

Be tried to enhance the filesystem (but dropped it for the sake
of posix compatibility). i still think this would be the most
efficient way to go.

the web community (www.w3.org, www.xml.org) introduced and still
introduces standards like xml and surrounding things like xml
schema (for storing structured data), xml querying (for having a
query language to find things), the extensible linking language
(xll), the extensible stylesheet language (xsl), the document
object model (dom) etc., to give more structure and better
accessibility to contents.

----------------------------------
what could be the direction of kde

1. try text

2. if it's not sufficient, try structured content (xml and related
   things), which would be ideal for storing contents like those
   of magellan, or for storing preferences.

   you have full text search capabilities, enhanced with search
   capabilities for special tags (attributes in the relational
   world). you also have the possibility of indexing.

   but you can store your data where it belongs, especially
   preferences: in a directory where you can remove them together
   with your application, or in your home directory.

   and you can build a web page (manually, or through searching)
   which gives you access to your scattered contents in a single
   place, with linking.

   you also benefit from the fast development in this field, and
   can use xml editors or simple text editors to change your data,
   search tools for finding your data, and browsers for viewing
   it or even editing it (html forms, hopefully also xml forms).

   i think you can use an xml parser and dom, document types,
   and maybe xml schema to store data (where you use gnudb
   today), and you have a truly distributed data store (through
   the file system and the internet).

3. use database technology where you really need multiuser
   transaction processing (locking!), e.g. shopping carts in
   web stores. the ideal solution would be a database where you
   can store objects and therefore also xml contents, like poet,
   but i think odbc and the other databases mentioned here will
   also do a great job.
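
as a rough illustration of point 2, here is a small sketch of
preferences kept as xml and searched both by tag/attribute and by
full text. it uses python's standard xml.etree.ElementTree
parser. the document layout (preferences, account, server,
signature), the host names and the two helper functions are made
up for this example; this is not an existing kde or magellan
format.

```python
import xml.etree.ElementTree as ET

# hypothetical preferences file; all element and attribute names
# are assumptions made for this sketch
PREFS_XML = """\
<preferences application="magellan">
  <account name="work">
    <server protocol="imap">mail.example.org</server>
    <signature>regards, rupert.</signature>
  </account>
  <account name="private">
    <server protocol="pop3">pop.example.net</server>
    <signature>see you soon</signature>
  </account>
</preferences>
"""


def find_by_attribute(root, tag, attr, value):
    """structured search: elements named `tag` whose `attr` equals `value`."""
    return [el for el in root.iter(tag) if el.get(attr) == value]


def full_text_search(root, needle):
    """full-text search: tags of all elements whose own text contains `needle`."""
    needle = needle.lower()
    return [el.tag for el in root.iter()
            if el.text and needle in el.text.lower()]


root = ET.fromstring(PREFS_XML)

# "search capabilities for special tags": all imap servers
imap_servers = [el.text
                for el in find_by_attribute(root, "server", "protocol", "imap")]

# plain full-text search over the same file
hits = full_text_search(root, "rupert")

print(imap_servers)
print(hits)
```

the same file stays readable and editable with any text editor,
and removing it is a plain file delete.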

so, i do think that one should use databases, but only where the
usage pays off, and not just for the sake of using databases.

regards, rupert.
