
List:       linux-nfsv4
Subject:    Re: system hangs, nfs4_cb: server not responding errors
From:       Jeff Layton <jlayton () redhat ! com>
Date:       2009-09-23 18:46:28
Message-ID: 20090923144628.776ae992 () tlielax ! poochiereds ! net

On Tue, 22 Sep 2009 13:54:32 -0400
Rob Henderson <robh@cs.indiana.edu> wrote:

> We are using nfsv4 for home directories and have been noticing
> infrequent hangs.  The symptom is that the home directory access will
> hang for about a minute (perhaps a little longer) and the server logs
> the following at the time of the hang:
> 
>     Sep 21 15:09:14 frog kernel: nfs4_cb: server 129.79.245.31 not responding, timed out
> 
> In this example, frog is the nfsv4 home directory server and
> 129.79.245.31 is the IP address of the client system.
> 
> It almost always seems to happen when doing something with firefox or
> thunderbird, and my feeling is that it happens once shortly after you
> log in and then not again for that login session.  It may even happen
> only once after a system is rebooted and then not again until the next
> reboot, but I'm not sure about that.  It happens infrequently enough
> that it is hard to quantify.  But here is the scenario that I seem to see:
> 
>     1)  The system boots up.
>     2)  You log in and things seem to work fine.
>     3)  After being logged in for a while (maybe a minute, maybe an
>         hour) the system will hang while using thunderbird or firefox.
>         I suspect that firefox and thunderbird trigger the problem
>         because they are doing lots of file locking (just a WAG).
>     4)  About 1-2 minutes later, the system comes back to life and no
>         further hangs are observed for that login session.
> 
> Any idea what's causing this and whether there is anything I can do to
> prevent it?  I do see earlier discussions of the nfs4_cb error I'm
> seeing, but they all seem to center on eliminating the annoying error
> messages rather than on any hanging behavior like I'm seeing.
> 
> BTW, I'm running  the 2.6.18 RHEL5 kernel.  Perhaps this is RHEL
> specific and I should just post to bugzilla.redhat.com?
> 

The problem is probably client-side. The server message generally means
that the client wasn't responding to delegation callback requests for
some reason. When that happens, the server can't respond immediately to
some requests from other clients and has to wait until the delegation's
lease times out.

I'd probably recommend a RH support case instead...bugzilla is fine
when you know that it's a bug and don't need much help troubleshooting
the problem (even better when you have a proposed patch). Chasing down
something like this will probably mean some data collection and
troubleshooting work.
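
As a rough starting point for that data collection, here is a minimal
sketch (not part of any RHEL or NFS tool) that pulls the callback
timeout messages out of the server's syslog so they can be lined up
against the times the hangs were noticed. The function names are
hypothetical, and the pattern assumes the log lines look exactly like
the one quoted above; adjust it if your syslog format differs.

```python
import re
from collections import Counter

# Matches server-side lines of the form quoted in this thread, e.g.:
#   Sep 21 15:09:14 frog kernel: nfs4_cb: server 129.79.245.31 not responding, timed out
# Assumption: classic syslog timestamp followed by hostname and "kernel:".
CB_TIMEOUT = re.compile(
    r"(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+"
    r"nfs4_cb: server (?P<client>\d{1,3}(?:\.\d{1,3}){3}) not responding, timed out"
)

def find_cb_timeouts(lines):
    """Return (timestamp, client_ip) pairs, one per callback timeout line."""
    hits = []
    for line in lines:
        m = CB_TIMEOUT.search(line)
        if m:
            hits.append((m.group("stamp"), m.group("client")))
    return hits

def count_by_client(hits):
    """Count how many callback timeouts were logged for each client IP."""
    return Counter(client for _, client in hits)

# Example use against a log file (the path is an assumption):
#   with open("/var/log/messages") as f:
#       hits = find_cb_timeouts(f)
#   for stamp, client in hits:
#       print(stamp, client)
```

If the same client address shows up each time, and the timestamps match
the reported hangs, that supports looking at why that client is not
answering callbacks.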

-- 
Jeff Layton <jlayton@redhat.com>
_______________________________________________
NFSv4 mailing list
NFSv4@linux-nfs.org
http://linux-nfs.org/cgi-bin/mailman/listinfo/nfsv4