
List:       ietf-nfsv4
Subject:    Re: duplicate request cache and locking (was: comments....)
From:       Eric Werme USG <werme () zk3 ! dec ! com>
Date:       2002-08-23 19:51:43
Message-ID: 200208231951.g7NJpi00001314114 () anw ! zk3 ! dec ! com

>> Still, in the past, we've relied on the duplicate request
>> cache to catch dangling writes, and operational experience
>> for the past 15 years or so says it works fine.
>> I think returning _OLD_STATEID on
>> an I/O ought to be a server implementation choice.
 
> I think the "works fine for 15 years" is primarily because customers
> have simply avoided locking in NFS. A better perspective is to
> look at the expectations in the Windows world where they
> actually regularly use locking and depend on the transport to
> maintain sequencing.

Our production system serves mail via NFS and also accesses things like
.forward files via NFS.  I haven't been dragged over to help with an Email
problem in ages, especially after changing the NFS client to use a single
source port so we don't exhaust the privileged port space.  And we still
have (shudder) rpc.lockd!

We _have_ seen some failures of the DRC in the QA stress tests.  Two
key things have changed in the last 15 years that make me fear the ice is
getting awfully thin:

1) We used to think that 50% usage of 10BaseX networks meant they were
   severely overloaded.  Today we can saturate 1000BaseT, at least with
   big I/Os and Jumbo frames.
2) Some non-idempotent ops take little time, e.g. unstable V3 writes or
   almost anything on systems with big battery backed caches.

NFS over TCP helps quite a bit thanks to guaranteed delivery, long NFS
retransmit delays, and servers that infrequently crash.  Of course, we
haven't kept TCP window sizes in step with media speeds.  How many of
us use windows that buffer a second's worth of 1000BaseT data?  I was
experimenting at Connectathon with a 500 KB window; I "just couldn't" bring
myself to try 125 MB!
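
As a rough check of the window arithmetic above, here is a small Python
sketch.  The function names are made up for illustration.  It assumes a
nominal 1000 Mbit/s line rate with no framing overhead, and it reads
"500 KB" as 500,000 bytes.

```python
# Nominal 1000BaseT line rate; real throughput is lower once framing
# and protocol overhead are counted (assumption for illustration).
LINK_BITS_PER_SEC = 1_000_000_000

def window_bytes_for(seconds, link_bps=LINK_BITS_PER_SEC):
    """Bytes of window needed to buffer `seconds` of data at line rate."""
    return link_bps * seconds // 8

def microseconds_buffered_by(window_bytes, link_bps=LINK_BITS_PER_SEC):
    """How many microseconds of line-rate data a window of this size holds."""
    return window_bytes * 8 * 1_000_000 // link_bps

if __name__ == "__main__":
    print(window_bytes_for(1))                # bytes for one second of data
    print(microseconds_buffered_by(500_000))  # time covered by a 500 KB window
```

Under those assumptions a one-second window comes to 125,000,000 bytes,
while a 500 KB window covers only about 4 milliseconds of line-rate data.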

How long does a DRC entry need to stick around?  At the very least a
couple seconds, or whatever a minimum UDP retransmit delay is.  If
several processes are trying to delete a couple million files and one
reply gets lost, the DRC needs to be big enough to hold the full-speed
delete stream from the other processes.  Call it 250 bytes per
request, that's 2,000 bits or 2 usec of clock time at 1000BaseT rates.
Two seconds of DRC time is a million requests.  Tru64's DRC ain't that
big....
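
The back-of-envelope sizing above can be written out as a short Python
sketch.  The function name and parameters are hypothetical, and it assumes
every request is the same size and that requests arrive back to back at
the nominal line rate, which is the worst case described above rather than
a measured workload.

```python
def drc_entries_needed(request_bytes, retention_seconds,
                       link_bps=1_000_000_000):
    """Estimate how many DRC entries must be retained so that an entry
    survives `retention_seconds` of requests arriving at line rate.

    Assumes fixed-size requests and a fully saturated link.
    """
    bits_per_request = request_bytes * 8           # 250 bytes -> 2,000 bits
    requests_per_sec = link_bps // bits_per_request
    return requests_per_sec * retention_seconds

if __name__ == "__main__":
    # 250-byte requests, 2 seconds of retention, 1000BaseT line rate
    print(drc_entries_needed(250, 2))  # prints 1000000
```

The same function shows how quickly the requirement grows: doubling the
retention time or halving the request size doubles the entry count.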

	-Ric Werme


