List: gpfsug-discuss
Subject: Re: [gpfsug-discuss] CNFS issue after upgrading from 4.2.3.11 to 5.0.4.2
From: Bryan Hill <bhill () physics ! ucsd ! edu>
Date: 2020-06-01 15:32:09
Message-ID: CADL=sMJuwqjG_dufuta567AnOgXpKVQN=vijsUa44xf5Bkivmw () mail ! gmail ! com
Hi:
Just a note on this: the pidof fix was accepted upstream, but it has not
made its way into RHEL 8.2 yet.
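
Until a fixed pidof is available, a check that does not rely on it could look
something like the sketch below. This is only an illustration, not what
mmnfsmonitor does and not a supported fix. The function names and the
proc_root parameter are made up for the example. It matches on
/proc/<pid>/comm, which kernel threads populate even though their cmdline is
empty, and it also reads /proc/fs/nfsd/threads, assuming the nfsd filesystem
is mounted there.

```python
import os
from typing import List


def pids_by_comm(name: str, proc_root: str = "/proc") -> List[int]:
    """Return PIDs whose /proc/<pid>/comm equals `name` (hypothetical helper).

    Kernel threads such as nfsd have an empty cmdline but still expose
    their name in comm, so they are matched here too. Note that comm is
    truncated by the kernel to 15 characters.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            # The process exited between listdir() and open(); skip it.
            continue
        if comm == name:
            pids.append(int(entry))
    return sorted(pids)


def nfsd_thread_count(proc_root: str = "/proc") -> int:
    """Return the nfsd thread count reported by the kernel, or 0 if unavailable."""
    try:
        with open(os.path.join(proc_root, "fs", "nfsd", "threads")) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return 0
```

On a node where nfsd is up, pids_by_comm("nfsd") should return a non-empty
list and nfsd_thread_count() should be greater than zero. The proc_root
argument exists only so the functions can be pointed at a test directory.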
Thanks,
Bryan
---
Bryan Hill
Lead System Administrator
UCSD Physics Computing Facility
9500 Gilman Dr. # 0319
La Jolla, CA 92093
+1-858-534-5538
bhill@ucsd.edu
On Mon, Feb 17, 2020 at 12:02 AM Malahal R Naineni <mnaineni@in.ibm.com> wrote:
>
> I filed a defect here; let us see what Red Hat says. Yes, it doesn't work for any
> kernel threads. It does work for user-level threads/processes.
> https://bugzilla.redhat.com/show_bug.cgi?id=1803640
>
> Regards, Malahal.
>
>
> ----- Original message -----
> From: Bryan Hill <bhill@physics.ucsd.edu>
> Sent by: gpfsug-discuss-bounces@spectrumscale.org
> To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
> Cc:
> Subject: [EXTERNAL] Re: [gpfsug-discuss] CNFS issue after upgrading from 4.2.3.11 to 5.0.4.2
> Date: Mon, Feb 17, 2020 8:26 AM
>
> Ah wait, I see what you might mean. pidof works, but not specifically for processes
> like nfsd. That is odd.
> Thanks,
> Bryan
>
>
>
> On Sun, Feb 16, 2020 at 10:19 AM Bryan Hill <bhill@physics.ucsd.edu> wrote:
>
> Hi Malahal:
>
> Just to clarify, are you saying that on your VM pidof is missing? Or that it is
> there and not working as it did prior to RHEL/CentOS 8? pidof is returning pid
> numbers on my system. I've been looking at the mmnfsmonitor script and trying to
> see where the check for nfsd might be failing, but I've not been able to figure it
> out yet.
>
>
> Thanks,
> Bryan
>
>
> On Sat, Feb 15, 2020 at 2:03 AM Malahal R Naineni <mnaineni@in.ibm.com> wrote:
>
> I am not familiar with CNFS, but looking at the git source seems to indicate that it
> uses 'pidof' to check whether a program is running. "pidof nfsd" works on RHEL 7.x,
> but it fails on the CentOS 8.1 VM I just created. So either we need to make sure pidof
> works on kernel threads or fix the CNFS scripts.
> Regards, Malahal.
>
>
> ----- Original message -----
> From: Bryan Hill <bhill@physics.ucsd.edu>
> Sent by: gpfsug-discuss-bounces@spectrumscale.org
> To: gpfsug-discuss@spectrumscale.org
> Cc:
> Subject: [EXTERNAL] [gpfsug-discuss] CNFS issue after upgrading from 4.2.3.11 to 5.0.4.2
> Date: Fri, Feb 14, 2020 11:40 PM
>
> Hi All:
>
> I'm performing a rolling upgrade of one of our GPFS clusters. This particular
> cluster has 2 CNFS servers for some of our NFS clients. I wiped one of the nodes
> and installed RHEL 8.1 and GPFS 5.0.4.2. The filesystem mounts fine on the node
> when I disable CNFS on the node, but with it enabled it's a no go. It appears
> mmnfsmonitor doesn't recognize that nfsd has started, so it assumes the worst and
> shuts down the file system (I currently have reboot on failure disabled to debug
> this). The thing is, it actually does start nfsd processes when running mmstartup
> on the node. Doing a "ps" shows 32 nfsd threads are running.
> Below is the CNFS-specific output from an attempt to start the node:
>
> CNFS[27243]: Restarting lockd to start grace
> CNFS[27588]: Enabling 172.16.69.76
> CNFS[27694]: Restarting lockd to start grace
> CNFS[27699]: Starting NFS services
> CNFS[27764]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[27910]: Monitor has started pid=27787
> CNFS[28702]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28705]: Starting NFS services
> CNFS[28730]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[28755]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28758]: Starting NFS services
> CNFS[28789]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[28813]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28816]: Starting NFS services
> CNFS[28844]: NFS clients of node 172.16.69.122 notified to reclaim NLM locks
> CNFS[28867]: Monitor detected nfsd was not running, will attempt to start it
> CNFS[28874]: Monitoring detected NFSD is inactive.
> mmnfsmonitor: NFS server is not running or responding. Node failure initiated as configured.
> CNFS[28924]: Unexporting all GPFS filesystems
> Any thoughts? My other CNFS node is handling everything for the time being,
> thankfully!
> Thanks,
> Bryan
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss