[prev in list] [next in list] [prev in thread] [next in thread] 

List:       netbsd-tech-kern
Subject:    Re: changing VOP_REMOVE(), was Re: ufs-ism in lookup(9)
From:       Bill Studenmund <wrstuden () NetBSD ! org>
Date:       2004-03-31 7:24:20
Message-ID: 20040331072420.GA20484 () netbsd ! org
[Download RAW message or body]


On Wed, Mar 31, 2004 at 01:16:01PM +0900, YAMAMOTO Takashi wrote:
> > I certainly don't want to get bitten by it. :-) I'm concerned it'd be a 
> > new way to make the kernel fall over.. So in that way, I want to "support" 
> > not dying.
> > 
> > Also, when exactly would we be doing this getcwd-like thing to get the 
> > mountpoint path? How will we know when to (and not to) do the parent 
> > directory lookup?
> 
> always when getfsstat is called?

We could do that. And we may want to be able to do something, since after 
all a directory farther up the chain could have been renamed or moved. 
However I'm not sure getfsstat() is the right call. I don't see the 
corelation between getfsstat() calls and directory re-arrangements. I 
think if we want a fix-up call for the paths, we should make some sort of 
explicit operation for it, and let the admin invoke it.

> > My point is that we are talking about (with the (dvp, component) cache)
> > doing mount-point detection using something that is fragile for a remote
> > file system, unless we can put some sort of watch (or lock or guard) on
> > each path component.  Otherwise either the server or another client can
> > come in and do a rename that will invalidate the usefullness of our cache.
> > And we wouldn't even know it.
> 
> anyway it's fragile to mount filesystem on the remote vnode. :-)

Agreed!

> i've been almost concerned that the (dvp, component) cache was not
> a good idea.  maybe it's better to do the mountpoint test in
> each dirop VOPs as you said.  i'm not sure if it's the best way, though.
> 
> > > > Sounds like the best thing to do for now is leave VOP_RENAME() alone.
> > > 
> > > the current "VOP_LOOKUP for dirop" method is wasteful even for ufs.
> > > ie. in-core inode is bloated unnecessarily to store a result of VOP_LOOKUP.
> > > i guess it's worse for more complicated directory implementations.
> > 
> > [sorry I mentioned VOP_RENAME(); we're talking about VOP_REMOVE()]
> > 
> > Huh?
> > 
> > For ufs, AFAICT we add two int32_t's to hold slot information. We also 
> > use these values for file creation. So it doesn't seem like a waste to me.
> 
> it's a waste because the members are only used during dirops.
> you can implement an internal version of ufs_lookup, which stores slot info
> into kernel stack or somewhere instead of the in-core inode, and
> call it from dirop VOPs.

I really think you're adding a lot of complexity for two ints.

> > Also, how exactly do we eliminate the lookup for NFS? Yes, we could do 
> > something like NOLEAF to the lookup, and we could push the parent and 
> > component into VOP_REMOVE(). But fundamentally, as part of the remove 
> > process, we have to know if we have an in-kernel vnode. How do we find 
> > that vnode without doing a lookup? "Lookup" after all is the way we map a 
> > name to an inode.
> 
> it needs lookup, but doesn't need VOP_LOOKUP, which always loads the vnode
> as you said below.
> 
> > And since we have to do the lookup to find the vnode, what's wrong with
> > what we're doing now, other than that we will always load the vnode into
> > core even if it's not in-core?
> 
> isn't it enough?

How much code are we talking about? Is it really worth making ourselves 
incompatable with all the other BSDs for this?

One other option is to keep the calling code pattern the same, and add a 
NOLOAD flag. Its semantics are we do the lookup, and we return the vnode 
if it is in core, but we don't load a vnode in if there isn't one now. My 
idea is that since the parent directory will be locked the whole time, no 
one can come in and create an entry behind our back. Thus this would be a 
case where it's valid to return NULL with no error from namei(). This 
change would mean continuing to cache slot info, but I think that'll be 
fine.

Take care,

Bill

[Attachment #3 (application/pgp-signature)]

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic