List:       webappsec
Subject:    RE: Controlling access to pdf/doc files (db "better" than filesystem?)
From:       "Jannie Hanekom" <jannie.hanekom () opendev ! net>
Date:       2004-02-28 23:58:29
Message-ID: 4FD003E13C7E17458143D84512D7D5247B04F4 () exchange ! steeldesign ! co ! uk

I'm by no means an expert on the topic, but I'd like to point out that
in my Windows/SQL/IIS experience, database-stored files tend to be
significantly slower than their file system counterparts.  Databases
introduce significant overhead;  in addition to the layer of translation
between the database itself and the file system, there is also the layer
of processing that the client database driver introduces.  There's also
the fact that not the entire 8K page is used for data storage - some
overhead is introduced by the database system in use as well.
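
To put rough numbers on the page overhead, here is a back-of-the-envelope
sketch in Python.  The 8192-byte page size matches the 8K pages mentioned
above; the 96-byte per-page overhead and the function name are assumptions
for illustration only, and real database engines store large blobs in ways
this simple model does not capture.

```python
def blob_page_cost(blob_bytes, page_size=8192, page_overhead=96):
    """Estimate how many fixed-size pages a blob occupies.

    page_overhead is an assumed, illustrative figure for the bytes in
    each page that are not available for data.
    Returns (pages_needed, bytes_allocated_beyond_the_blob_itself).
    """
    usable = page_size - page_overhead
    if usable <= 0:
        raise ValueError("page_overhead must be smaller than page_size")
    # Ceiling division without floating point.
    pages = -(-blob_bytes // usable)
    return pages, pages * page_size - blob_bytes


if __name__ == "__main__":
    pages, extra = blob_page_cost(1_000_000)
    print(f"1 MB blob: {pages} pages, {extra} extra bytes allocated")
```

The point is only that the extra allocation grows with the page count; the
driver and network costs discussed here are not modeled at all.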

With regard to caching of files by the database instead of the file
system, it will introduce quite a bit of overhead, as the process
rendering the file has to traverse the database driver to retrieve the
file;  also, your dba may not appreciate you flushing out a couple of MB
of crucial indexes every time you retrieve a large blob from the
database.  (How do databases deal with this?  Do they cache blobs at
all?  I've not done any testing on that...)

Also keep in mind that in most larger implementations, the database is
stored on a server separate from the actual web server, introducing a
network layer with its inherent latency into the mix.  Add a firewall
between the database and web server, and you can guess where this is
going...

Lastly, this approach wreaks havoc with the performance of solutions
where web-server clustering is used to add scalability.  Granted, this
is less of an issue with the comparatively low volume of traffic that
pdf/doc files are likely to produce, but it's pretty effective at
bringing a site to a crawl if high-frequency files (such as images) are
stored in a database (yes, it's actually done more frequently than you
might guess...)

The transactional integrity argument is a pretty strong one, but I have
reservations about the argument that storing files in the database is
more secure than on the web server itself.  Lately, there have been
far more SQL injection vulnerabilities than web server directory
traversal vulnerabilities, so my money would be on dynamically creating
and deleting the files or placing them in a secure area outside of the
web root.  It's much easier to audit file permissions than it is to find
injection vulnerabilities in a large body of ASP/PHP code.
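
As a rough sketch of the "secure area outside of the web root" idea, the
Python below resolves a requested file name against a storage directory
and refuses anything that escapes it.  The function name and the directory
layout are hypothetical, and a real application would still need its own
per-user access check before handing the file back.

```python
from pathlib import Path


def resolve_document(storage_root, requested_name):
    """Map a requested name to a real file inside storage_root.

    storage_root is assumed to live outside the web root, so the web
    server never serves it directly.  Raises PermissionError if the
    resolved path falls outside storage_root (for example via "../"),
    and FileNotFoundError if no such regular file exists.
    """
    root = Path(storage_root).resolve()
    candidate = (root / requested_name).resolve()
    # After resolving ".." and symlinks, the file must sit under root.
    if root not in candidate.parents:
        raise PermissionError(f"{requested_name!r} is outside the storage area")
    if not candidate.is_file():
        raise FileNotFoundError(requested_name)
    return candidate
```

The application would call this after authenticating the user, then stream
the returned path itself rather than exposing a direct URL to the file.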

As always, each situation is unique and has its own merits.  From a
dba/server admin perspective, I'd prefer it if the dev team stored their
files on the file system and their data in the database, the way it was
intended.

Jannie

-----Original Message-----
From: Ido Rosen [mailto:ido@cs.uchicago.edu] 
Sent: 28 February 2004 20:55
To: David Wall @ Yozons, Inc.
Cc: webappsec@securityfocus.com
Subject: Re: Controlling access to pdf/doc files (db "better" than
filesystem?)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, 28 Feb 2004 11:13:21 -0800
"David Wall @ Yozons, Inc." <dwall@yozons.com> wrote:

> > that  in SQL Server is that all data in SQL Server is split over ~8k
> > pages. When you add a BLOB it needs to be split into 8k chunks. When
> > you
>
> But filesystems also store data into pages, often much smaller than 8k
> chunk.

I agree that storing files with their metadata for such a solution in a
database is a better solution than storing files.  It's also probably
more secure, since the web developer is less likely to botch some
permissions, security, or sanity checks and since most database systems
already have some sanity checks built in.  Your reasoning in that last
sentence is a bit off, though:  Database systems (such as MySQL, PgSQL,
ThinkSQL, and MSSQL) all must use the filesystem, so their 8k chunks may
not match, and the storage may be out of phase.  This is just a result
of overlaying one file storage paradigm over another, and shouldn't
cause too much trouble speed-wise.  By adding a layer on top of the
filesystem, you do increase the likelihood of inefficiency.

