'Re: Fwd: Re: kdelibs/kio'


List:       kde-core-devel
Subject:    Re: Fwd: Re: kdelibs/kio
From:       Andreas Pour <pour () mieterra ! com>
Date:       2000-05-27 14:55:07

Waldo Bastian wrote:

[ ... ]

> > > >
> > > > Modified Files:
> > > >       kshred.cpp kshred.h
> > > > Log Message:
> > > > Contributed by Andreas Pour <pour@mieterra.com> :
> > > > // UPDATED: this function now uses 35 passes based on the article
> > > > // Peter Gutmann, "Secure Deletion of Data from Magnetic and Solid-State
> > > > // Memory", first published in the Sixth USENIX Security Symposium
> > > > // Proceedings, San Jose, CA, July 22-25, 1996 (available online at
> > > > // http://rootprompt.org/article.php3?article=473)
> > >
> > > So it writes 35 times a lot of garbage to the Linux kernel... which uses
> > > write caching and only writes the last version to disk... repeat after me
> > > "faked security".
> >
> > It's not that bad, as a "flush()" is done after every write (this is
> > proven by the length of time it takes to do the shred -- even a 3K file
> > takes about 10 seconds).  However, there is still the problem of caching
> > in the disk cache, and I don't know if Linux flushes the drive cache (as
> > opposed to memory) on a flush.  I suggested to David that we do a
> > sleep(1) after each flush to give the drive a chance to flush its
> > buffers to disk.  I suppose one could improve the likelihood of that
> > happening by doing a bunch of reads on the device during the "sleep"
> > period.
> 
> Sure if you do enough stuff which you have no idea of what it does, you will
> surely get the desired result after some time. Monkeys.. typewriters...
> Shakespeare.
> 
> > In the end, all we can do is the best we can do.
> 
> And when that is not enough we shouldn't be doing it at all.

Whether or not it's enough depends on your standards.  There is no way
to achieve 100% security with off-the-shelf parts.  That's why people
who are really into security buy special hardware to achieve it.

But if we are grounded in reality the kshred class can accomplish useful
things that a normal 'rm' does not.

Now, here is some info to help make a more informed decision.  The
weakness in "failproof" shredding under Linux is the question of
whether disk caches can be flushed by the kernel when doing a write.  I
think in this case it is instructive to look at how transactional
databases work with Linux.  They obviously have to rely completely on
data being written to disk when they do a flush() rather than cached in
the OS or on the disk drive cache.  For example, Borland seems to
believe that Linux flushes the actual disk caches (see
http://community.borland.com/article/0%2C1410%2C16926%2C00.html).

On the other hand, here is a comment excerpt from linux/fs/buffer.c,
function sync_dev() (from linux 2.2.12):

        /*
         * FIXME(eric) we need to sync the physical devices here.
         * This is because some (scsi) controllers have huge amounts of
         * cache onboard (hundreds of Mb), and we need to instruct
         * them to commit all of the dirty memory to disk, and we should
         * not return until this has happened.
         *
         * This would need to get implemented by going through the assorted
         * layers so that each block major number can be synced, and this
         * would call down into the upper and mid-layer scsi.
         */

So I readily concede that until disk buffer flushing is fully
implemented, it is hard to guarantee that the data cannot be recovered
by someone using sophisticated technology.
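
For reference, the write-then-flush cycle discussed above could look
roughly like the sketch below.  This is only an illustration and not the
kshred code: the names overwrite_pass and shred_file are made up for the
example, and the default of 3 passes is arbitrary.  Note that os.fsync()
only asks the kernel to push the data to the device; whether the drive's
own cache is flushed is exactly the open question.

```python
import os

def overwrite_pass(path, pattern=None, chunk_size=64 * 1024):
    """Overwrite every byte of `path` once, in place, then flush.

    pattern=None writes random bytes; otherwise the given byte pattern is
    repeated (it restarts at each chunk boundary, which is fine for
    single-byte fills).  Hypothetical helper, not part of kshred.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:          # r+b: overwrite without truncating
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            if pattern is None:
                block = os.urandom(n)
            else:
                block = (pattern * (n // len(pattern) + 1))[:n]
            f.write(block)
            remaining -= n
        f.flush()                          # user-space buffer -> kernel
        os.fsync(f.fileno())               # kernel -> device (drive cache
                                           # may still be holding the data)
    return size

def shred_file(path, passes=3):
    """Run several random overwrite passes, flushing after each one."""
    for _ in range(passes):
        overwrite_pass(path)
```

Flushing after every pass rather than once at the end is what keeps the
kernel from collapsing all the passes into a single write.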

However, I don't think it is the intent of kshred to guarantee this
level of shredding.  After all, shredding is not burning, and even words
on burned papers can be reconstructed from the ashes if someone wants to
badly enough.  If the documentation only says that kshred will prevent
undeletes and normal raw device accesses from discovering the data, then
the current implementation is well up to the task.

I ran a few experiments to verify this.  First, I created a file on a
floppy containing the phrase "This is a test" repeated enough times to
fill the entire floppy.  Next I grep'd the device for the phrase, and
obviously found it.  Then I deleted the file using a regular delete,
unmounted the floppy and grep'd for the phrase again on the device.
Again a match.

After shredding the file, even after just the first pass (out of 35
passes), the sequence "This" did not appear on the device.  Plus,
shredding the 1.2MB file took over 25 minutes (about 40 sec. per pass,
whereas using 'dd' to write the *entire* disk -- 1.44MB -- takes about 48
seconds), during which entire time disk activity was quite obvious.
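
The grep-the-device check can be reproduced with something like the
following sketch.  It is an assumption-laden stand-in for grep, not what
was actually run: device_contains is a made-up name, and the path can be
a raw device (if you have read permission) or an ordinary file.

```python
def device_contains(path, phrase, chunk_size=1 << 20):
    """Return True if the byte string `phrase` occurs anywhere in `path`.

    Reads in chunks and carries the last len(phrase)-1 bytes over to the
    next chunk, so a match straddling a chunk boundary is not missed.
    """
    if not phrase:
        raise ValueError("phrase must not be empty")
    keep = len(phrase) - 1
    tail = b""
    with open(path, "rb") as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:
                return False
            buf = tail + chunk
            if phrase in buf:
                return True
            tail = buf[-keep:] if keep else b""
```

Run it once before deleting, once after a plain delete, and once after
the first shred pass, e.g. device_contains("/dev/fd0", b"This is a test").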

Next I tried it with an IDE hard drive partition.  Same results.  I also
timed the results.  Shredding a 10MB file with no other disk activity
took 3 1/2 minutes (and you could clearly see, from the disk activity
light, the 35 buffer/flush/write cycles).  On the other hand, doing a
simple 'dd' of 10MB followed by a 'sync' took about 3 seconds.
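
To put the timings above side by side, here is a small back-of-the-
envelope calculation.  It uses only the rounded figures quoted in this
mail and assumes 1 MB = 1000 KB; the function names are made up for the
example.

```python
def throughput_kb_per_s(megabytes, seconds):
    """Average write rate in KB/s, taking 1 MB = 1000 KB."""
    return megabytes * 1000 / seconds

def seconds_per_pass(total_seconds, passes=35):
    """Average wall-clock time of one overwrite pass."""
    return total_seconds / passes

# Floppy: one shred pass over the 1.2MB file vs. 'dd' of the whole 1.44MB disk.
floppy_shred = throughput_kb_per_s(1.2, 40)
floppy_dd = throughput_kb_per_s(1.44, 48)

# IDE partition: 35 passes over 10MB in 3 1/2 minutes vs. 'dd' + 'sync' in 3 s.
ide_pass = seconds_per_pass(3.5 * 60)
ide_shred = throughput_kb_per_s(10, ide_pass)
ide_dd = throughput_kb_per_s(10, 3)

print(round(floppy_shred), round(floppy_dd))
print(ide_pass, round(ide_shred), round(ide_dd))
```

On the floppy each shred pass runs at about the same rate as a raw 'dd',
and on the IDE drive each pass takes about twice as long as 'dd' plus
'sync', which is consistent with every pass really reaching the device.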

I think it's appropriate to say in the docs that if your kernel supports
disk cache flushing there can be reasonable assurances that the
data cannot be recovered using sophisticated data recovery tools (well,
as the software is licensed AS IS, there are no assurances at all, but
you get the point).  But whether or not disk buffer flushing is
supported, the data will be overwritten with random bits, so that
undeletes and normal raw device accesses cannot reconstruct the data.

> 
> > It's certainly not
> > "faked security", although it's not 100% guaranteed to be impossible to
> > reconstruct the file either.
> 
> I think it is 0% guaranteed. I think you can't even guarantee that grep
> /dev/hda doesn't give you your, oh so confidential, data back.

I don't think it's appropriate to make such assertions without doing
tests.  Try shredding a file containing a lot of repetitions of a phrase
(be sure to grep the device for the phrase before starting) and try
getting it back using raw device accesses (you can even try after the
first of the 35 passes) using the latest CVS version (well, as soon as
someone imports the latest version, I hope David can do that).  

The question to me is whether it is useful to have a feature that
prevents undeletes and raw device accesses, given that as a practical
matter those are a far bigger privacy threat facing users than someone
taking the disk to a laboratory.

If you prefer a name other than 'shred', that's fine too, though I think
the name is appropriate even w/out disk buffer flushing.

> 
> Luring people into a false sense of security is worse than having no security
> at all.

The sense of security is only "false" if you expect the impossible.  If
you have realistic expectations, the security is fine.  In particular, I
would advertise the feature only as preventing undeletes/raw device
accesses to the data; one might mention in a footnote that if the OS
supports disk cache flushing it *may* (I have not done research to
verify that the algorithm in the paper works) foil a more determined
retrieval effort as well.

Ciao,

Andreas
