
List:       gpfsug-discuss
Subject:    Re: [gpfsug-discuss] CNFS issue after upgrading from 4.2.3.11 to 5.0.4.2
From:       Bryan Hill <bhill@physics.ucsd.edu>
Date:       2020-06-01 15:32:09
Message-ID: CADL=sMJuwqjG_dufuta567AnOgXpKVQN=vijsUa44xf5Bkivmw@mail.gmail.com

Hi:

Just a note on this:  the pidof fix was accepted upstream but has not
made its way into RHEL 8.2 yet.
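
For anyone who needs to confirm that nfsd is up on a node where "pidof nfsd"
returns nothing, here is a minimal sketch of a check that does not go through
pidof.  It assumes a Linux /proc layout and relies on /proc/<pid>/comm, which
holds the task name for kernel threads as well as user processes.  The function
name pids_by_comm and the proc_root parameter are made up for illustration;
this is not what mmnfsmonitor does, just one way to cross-check by hand.

```python
import os


def pids_by_comm(name, proc_root="/proc"):
    """Return the sorted PIDs whose <proc_root>/<pid>/comm equals `name`.

    Hypothetical helper for illustration only; it is not part of GPFS/CNFS.
    """
    pids = []
    for entry in os.listdir(proc_root):
        # Only numeric directory names correspond to tasks.
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            # The task exited (or is unreadable) between listdir and open.
            continue
        if comm == name:
            pids.append(int(entry))
    return sorted(pids)
```

Calling pids_by_comm("nfsd") on the CNFS node should return one PID per nfsd
thread if the server is running, and an empty list otherwise.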


Thanks,
Bryan

---
Bryan Hill
Lead System Administrator
UCSD Physics Computing Facility

9500 Gilman Dr.  # 0319
La Jolla, CA 92093
+1-858-534-5538
bhill@ucsd.edu

On Mon, Feb 17, 2020 at 12:02 AM Malahal R Naineni <mnaineni@in.ibm.com> wrote:
> 
> I filed a defect here; let us see what Red Hat says. Yes, it doesn't work for any
> kernel threads. It does work for user-level threads/processes.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1803640
> 
> Regards, Malahal.
> 
> 
> ----- Original message -----
> From: Bryan Hill <bhill@physics.ucsd.edu>
> Sent by: gpfsug-discuss-bounces@spectrumscale.org
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Cc:
> Subject: [EXTERNAL] Re: [gpfsug-discuss] CNFS issue after upgrading from 4.2.3.11 to 5.0.4.2
> Date: Mon, Feb 17, 2020 8:26 AM
> 
> Ah wait, I see what you might mean.  pidof works but not specifically for processes
> like nfsd.  That is odd.
>
> Thanks,
> Bryan
> 
> 
> 
> On Sun, Feb 16, 2020 at 10:19 AM Bryan Hill <bhill@physics.ucsd.edu> wrote:
> 
> Hi Malahal:
> 
> Just to clarify, are you saying that on your VM pidof is missing?  Or that it is
> there and not working as it did prior to RHEL/CentOS 8?  pidof is returning pid
> numbers on my system.  I've been looking at the mmnfsmonitor script and trying to
> see where the check for nfsd might be failing, but I've not been able to figure it
> out yet.
> 
> 
> Thanks,
> Bryan
> 
> ---
> Bryan Hill
> Lead System Administrator
> UCSD Physics Computing Facility
> 
> 9500 Gilman Dr.  # 0319
> La Jolla, CA 92093
> +1-858-534-5538
> bhill@ucsd.edu
> 
> On Sat, Feb 15, 2020 at 2:03 AM Malahal R Naineni <mnaineni@in.ibm.com> wrote:
> 
> I am not familiar with CNFS, but looking at the git source seems to indicate that it
> uses 'pidof' to check if a program is running or not. "pidof nfsd" works on RHEL 7.x
> but it fails on the CentOS 8.1 VM I just created. So either we need to make sure pidof
> works on kernel threads or fix the CNFS scripts.
>
> Regards, Malahal.
> 
> 
> ----- Original message -----
> From: Bryan Hill <bhill@physics.ucsd.edu>
> Sent by: gpfsug-discuss-bounces@spectrumscale.org
> To: gpfsug-discuss@spectrumscale.org
> Cc:
> Subject: [EXTERNAL] [gpfsug-discuss] CNFS issue after upgrading from 4.2.3.11 to 5.0.4.2
> Date: Fri, Feb 14, 2020 11:40 PM
> 
> Hi All:
> 
> I'm performing a rolling upgrade of one of our GPFS clusters.  This particular
> cluster has 2 CNFS servers for some of our NFS clients.  I wiped one of the nodes
> and installed RHEL 8.1 and GPFS 5.0.4.2.  The filesystem mounts fine on the node
> when I disable CNFS on the node, but with it enabled it's a no go.  It appears
> mmnfsmonitor doesn't recognize that nfsd has started, so it assumes the worst and
> shuts down the file system (I currently have reboot on failure disabled to debug
> this).  The thing is, it actually does start nfsd processes when running mmstartup
> on the node.  Doing a "ps" shows 32 nfsd threads are running.
>
> Below is the CNFS-specific output from an attempt to start the node:
> 
> CNFS[27243]: Restarting lockd to start grace
> CNFS[27588]: Enabling 172.16.69.76
> CNFS[27694]: Restarting lockd to start grace
> CNFS[27699]: Starting NFS services
> CNFS[27764]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[27910]: Monitor has started pid=27787
> CNFS[28702]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28705]: Starting NFS services
> CNFS[28730]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[28755]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28758]: Starting NFS services
> CNFS[28789]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[28813]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28816]: Starting NFS services
> CNFS[28844]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[28867]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28874]: Monitoring detected NFSD is inactive.
> mmnfsmonitor: NFS server is not running or responding. Node failure initiated as configured.
> CNFS[28924]: Unexporting all GPFS filesystems
>
> Any thoughts?  My other CNFS node is handling everything for the time being,
> thankfully!
>
> Thanks,
> Bryan
> 
> ---
> Bryan Hill
> Lead System Administrator
> UCSD Physics Computing Facility
> 
> 9500 Gilman Dr.  # 0319
> La Jolla, CA 92093
> +1-858-534-5538
> bhill@ucsd.edu
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> 
> 
> 

