
List:       lucene-dev
Subject:    my submission though it's no faster - RE: Converting a FSDirectory (on disk index) to a RAMDirectory
From:       "Spencer, Dave" <dave () lumos ! com>
Date:       2002-02-27 0:08:06

I've attached a modified version of RAMDirectory that has an additional
"copy" constructor for creating a RAMDirectory based on another, existing
Directory, presumably an FSDirectory (i.e. an on-disk directory).
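
For anyone who doesn't want to open the attachment, here is the general
shape of a copy constructor that relies only on generic directory
operations (list the files, read each one, write it into RAM). This is a
self-contained sketch, not the attached code: SimpleDirectory and RamStore
are hypothetical stand-ins invented for illustration and are not Lucene's
Directory or RAMDirectory API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical minimal directory abstraction; NOT Lucene's Directory. */
interface SimpleDirectory {
    String[] list();                       // names of all files
    byte[] read(String name);              // full contents of one file
    void write(String name, byte[] data);  // create or overwrite a file
}

/** Hypothetical in-memory store with a "copy" constructor. */
final class RamStore implements SimpleDirectory {
    private final Map<String, byte[]> files = new LinkedHashMap<>();

    RamStore() {
    }

    /** Copies every file of dirToCopy into memory using only generic calls. */
    RamStore(SimpleDirectory dirToCopy) {
        for (String name : dirToCopy.list()) {
            write(name, dirToCopy.read(name));
        }
    }

    @Override
    public String[] list() {
        return files.keySet().toArray(new String[0]);
    }

    @Override
    public byte[] read(String name) {
        byte[] data = files.get(name);
        if (data == null) {
            throw new IllegalArgumentException("no such file: " + name);
        }
        return data.clone(); // defensive copy so callers cannot alter the store
    }

    @Override
    public void write(String name, byte[] data) {
        files.put(name, data.clone()); // defensive copy of the caller's buffer
    }

    public static void main(String[] args) {
        RamStore onDisk = new RamStore();  // stands in for the on-disk index
        onDisk.write("segments", new byte[] {1, 2, 3});
        RamStore inRam = new RamStore(onDisk);
        System.out.println("copied files: " + inRam.list().length);
    }
}
```

Because the constructor touches only the interface, the source could be
any implementation, which is the point of taking a generic Directory
rather than an FSDirectory.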

Below Doug asked for 2 additional ctrs, but I didn't add them: the
appropriate ctrs in FSDirectory don't exist, I wasn't sure whether I
should add the 'boolean create' flag that the implicit
FSDirectory.getDirectory call would need, and I wanted to get this
"out the door".

I based this on the last nightly release (2-26) which I just loaded.
It compiles, and I ran the "ant test" target too, successfully.

I ran my own humble benchmark, and it seems that searches take the same
time with a RAMDirectory as with an FSDirectory, though the RAMDirectory
case takes up more RAM (as expected) and takes longer to initialize
(again, as expected).

The test runs against a database that takes 90MB on disk and has 140,000
entries. It applies my unpublished SubstringQuery to a couple of words and
ends up with a query of 100 or so terms w/ boosting that checks against
3 fields. The query returns approx 400 matches in both tests (so, as a
sanity check, both directories return the same # of matches).

Memory is measured twice, by calling freeMemory() and totalMemory() on
java.lang.Runtime:
(1) after the directory is created and
(2) just before the test finishes.
Before each measurement I call System.gc() to try to get rid of junk in
the system.
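
A minimal sketch of that measurement, since my real benchmark code isn't
posted. The class name MemorySnapshot and the 8MB ballast array are made up
for illustration; only the Runtime and System.gc() calls correspond to what
I described above. Note that System.gc() is just a request to the VM, so
the numbers are approximate.

```java
/** Hypothetical helper: one free/total heap reading taken after a GC request. */
final class MemorySnapshot {
    final long freeBytes;
    final long totalBytes;

    private MemorySnapshot(long freeBytes, long totalBytes) {
        this.freeBytes = freeBytes;
        this.totalBytes = totalBytes;
    }

    /** Asks the VM to collect garbage, then reads the heap figures. */
    static MemorySnapshot take() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a hint only; the VM may not collect everything
        return new MemorySnapshot(rt.freeMemory(), rt.totalMemory());
    }

    /** Formats the reading as free/total in whole megabytes. */
    @Override
    public String toString() {
        long mb = 1024L * 1024L;
        return (freeBytes / mb) + "/" + (totalBytes / mb) + "mb";
    }

    public static void main(String[] args) {
        // Stand-in for "directory created": hold some memory.
        byte[] ballast = new byte[8 * 1024 * 1024];
        MemorySnapshot afterCreate = MemorySnapshot.take();

        // ... the queries would run here ...

        MemorySnapshot beforeExit = MemorySnapshot.take();
        System.out.println("(1) after create: " + afterCreate);
        System.out.println("(2) before exit:  " + beforeExit);
        System.out.println("ballast bytes still held: " + ballast.length);
    }
}
```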

The VM is invoked like this:
	java -Xverify:none -ms32m -mx256m
The -Xverify:none flag tells the VM to skip verification of the class
files. The consequence is that it starts up faster, and it runs fine if
your tree is compiled.

The queries were run 25 times each in one run, and later 5 times each.
The results below are for the last, shorter run.

       startup   free/total (1)   free/total (2)   min/max/avg (ms)
fs     10ms      31/32mb          30/32mb          10064/10274/10114
ram    6889ms    65/159mb         67/164mb         10124/10384/10192
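
For reference, the min/max/avg column can be produced by a loop like the
one below. This is a sketch rather than my actual benchmark: QueryTimer and
its method are hypothetical names, the Runnable stands in for running the
query, and it uses System.nanoTime() where older VMs would have to use
System.currentTimeMillis().

```java
/** Hypothetical timing helper: runs a task several times and summarizes. */
final class QueryTimer {

    /** Returns {min, max, avg} wall-clock time in milliseconds over the runs. */
    static long[] time(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        long sum = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
            min = Math.min(min, elapsedMs);
            max = Math.max(max, elapsedMs);
            sum += elapsedMs;
        }
        return new long[] {min, max, sum / runs}; // avg uses integer division
    }

    public static void main(String[] args) {
        // Stand-in workload; a real test would execute the search here.
        Runnable fakeQuery = () -> {
            long x = 0;
            for (int i = 0; i < 5_000_000; i++) {
                x += i;
            }
            if (x < 0) {
                System.out.println("unreachable");
            }
        };
        long[] r = time(fakeQuery, 5);
        System.out.println(r[0] + "/" + r[1] + "/" + r[2]);
    }
}
```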

So all the numbers are more or less as expected except for the times at
the end: they're almost identical, which is kinda weird. I even tried
rerunning the ram test and deleting the database after it started, to
"prove" that it's reading out of ram, and I get the same numbers [note:
yes I mean 'delete', not 'rename', just in case something funny could be
happening].

At the moment I can't easily publish my benchmark code, but I will do so
later this week if it's needed.

I suggest that this version of RAMDirectory be added to the src base
since, after all, the ctr itself is reasonable. I'd like to know if anyone
has run w/ RAMDirectory and proven that it's faster.

My conclusion from the tests I've run is that FSDirectory must be doing
good buffering/reading, such that an in-memory directory has no benefit.

PS
  I really gotta hit send, but I have a feeling the benchmark is invalid:
  I reran the same query over and over again, and thus didn't stress the
  filesystem i/o, since we all know lucene is well implemented and
  probably doesn't do that much i/o. Maybe I need another pass where the
  benchmark cycles thru a number of diff queries, so that the FSDirectory
  has to hit the disk more...
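
  A rough sketch of what that extra pass might look like. Everything here
  is hypothetical (the RotatingBenchmark name, the query strings, and the
  Consumer standing in for the actual search call); the only idea taken
  from above is cycling through different queries instead of repeating
  one. Whether this really forces more disk reads would still depend on
  how much of the index the OS and Lucene keep buffered.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical benchmark pass that rotates through several queries. */
final class RotatingBenchmark {

    /**
     * Runs every query once per pass, in order, and returns the total
     * elapsed wall-clock time in milliseconds.
     */
    static long run(List<String> queries, int passes, Consumer<String> search) {
        long start = System.nanoTime();
        for (int pass = 0; pass < passes; pass++) {
            for (String query : queries) {
                search.accept(query); // a real test would search the index here
            }
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Made-up query strings, for illustration only.
        List<String> queries = Arrays.asList("alpha", "beta", "gamma", "delta");
        int[] executed = {0};
        long totalMs = run(queries, 5, q -> executed[0]++);
        System.out.println(executed[0] + " searches in " + totalMs + "ms");
    }
}
```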

-----Original Message-----
From: Doug Cutting [mailto:DCutting@grandcentral.com]
Sent: Thursday, February 21, 2002 1:33 PM
To: 'Lucene Developers List'
Subject: RE: Converting a FSDirectory (on disk index) to a RAMDirectory


> From: Spencer, Dave [mailto:dave@lumos.com]
> 
> Could anyone glance at this and verify that this code is correct.
> Goal is to convert an existing, on-disk, index to a 
> RAMDirectory, which presumably is purely in memory.

It looks right to me.  Did you test it?  Did it work?

> If the code is correct I'd suggest someone w/ CVS powers adding it to
> the source base - maybe a static method in  RAMDirectory itself.

How about a RAMDirectory constructor?  Since only generic Directory
methods are required it could just be:

  public RAMDirectory(Directory dirToCopy) { ... }

and, as conveniences:

  public RAMDirectory(File f)   { this(new FSDirectory(f)); }
  public RAMDirectory(String s) { this(new FSDirectory(s)); }

Doug

--
To unsubscribe, e-mail:
<mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:lucene-dev-help@jakarta.apache.org>


["RAMDirectory.java" (application/octet-stream)]
