List: lustre-discuss
Subject: [lustre-discuss] Appropriate Umount Ordering
From: Ellis Wilson via lustre-discuss <lustre-discuss () lists ! lustre ! org>
Date: 2022-02-17 16:07:56
Message-ID: MN2PR21MB143988BA5FAB6DF96907CBA5BE369 () MN2PR21MB1439 ! namprd21 ! prod ! outlook ! com
Hi all,
Two (hopefully) simple questions this time around. This is for Lustre 2.14.0, and
my cluster is set up with no failover for MDTs or OSTs. OBD timeouts have not been
altered from the defaults.
Question 1:
I read on the Lustre Wiki that the appropriate ordering to umount the various
components of a Lustre filesystem is:
1. Clients
2. MDT(s)
3. OSTs
4. MGS
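To make that ordering concrete, here is a minimal sketch that sorts a set of mounts into the sequence listed above. The role names and mount points are hypothetical placeholders, not anything from my cluster or from Lustre itself, and the script only prints the umount commands rather than running them.

```python
# Order from the list above: clients, then MDT(s), then OSTs, then MGS.
UMOUNT_ORDER = ("client", "mdt", "ost", "mgs")


def umount_plan(mounts):
    """Return (role, mount_point) pairs sorted into the listed umount order.

    `mounts` is a list of (role, mount_point) tuples. sorted() is stable,
    so mounts that share a role keep their input order.
    """
    rank = {role: i for i, role in enumerate(UMOUNT_ORDER)}
    return sorted(mounts, key=lambda m: rank[m[0]])


if __name__ == "__main__":
    # Hypothetical mount points, for illustration only.
    example = [
        ("mgs", "/mnt/mgs"),
        ("ost", "/mnt/ost0"),
        ("client", "/mnt/lustre"),
        ("mdt", "/mnt/mdt0"),
        ("ost", "/mnt/ost1"),
    ]
    for role, mount_point in umount_plan(example):
        print(f"umount {mount_point}  # {role}")
```

Swapping "mdt" and "ost" in UMOUNT_ORDER gives the alternative ordering, which makes it easy to compare the two sequences side by side.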
However, if I do it this way, the OST umounts always hang for 4 minutes and 25
seconds before completing. Dmesg reports:
[88944.272233] Lustre: 30178:0:(client.c:2282:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1645111309/real 1645111309] req@00000000cc9c1aeb x1724931853622016/t0(0) o39->lustrefs-MDT0000-lwp-OST0000@10.1.98.8@tcp:12/10 lens 224/224 e 0 to 1 dl 1645111574 ref 2 fl Rpc:XNQr/0/ffffffff rc 0/-1 job:''
[88944.275884] Lustre: Failing over lustrefs-OST0000
[88944.429622] Lustre: server umount lustrefs-OST0000 complete
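As a sanity check on the numbers, the gap between the "sent" and "dl" fields in the timeout message can be computed directly. This is a rough sketch that assumes "dl" is the request deadline in epoch seconds; the helper names are my own, and the sample string is an excerpt of the dmesg line above.

```python
import re


def request_wait_seconds(line):
    """Return (dl - sent) in seconds, parsed from a ptlrpc timeout message.

    Assumption: 'sent' and 'dl' are both epoch-second timestamps.
    """
    sent = int(re.search(r"\[sent (\d+)/", line).group(1))
    deadline = int(re.search(r"\bdl (\d+)\b", line).group(1))
    return deadline - sent


def as_min_sec(seconds):
    """Format a duration in seconds as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


# Excerpt of the dmesg line from the OST umount.
LINE = (
    "Request sent has timed out for slow reply: "
    "[sent 1645111309/real 1645111309] req@00000000cc9c1aeb "
    "x1724931853622016/t0(0) o39->lustrefs-MDT0000-lwp-OST0000@10.1.98.8@tcp:12/10 "
    "lens 224/224 e 0 to 1 dl 1645111574 ref 2"
)

if __name__ == "__main__":
    wait = request_wait_seconds(LINE)
    print(wait, as_min_sec(wait))          # 265 04:25
    print(2 * wait, as_min_sec(2 * wait))  # 530 08:50
```

The 265-second gap matches the observed OST hang, and twice that value is exactly 8 minutes and 50 seconds. Whether that doubling reflects two back-to-back request timeouts is only a guess on my part.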
For reference, if I swap the OSTs and the MDT (umount the OSTs first, then the
MDT), all of the OST umounts are fast, but the MDT takes a whopping 8 minutes and
50 seconds to umount.
Why does the canonical shutdown ordering stall for so long (and for such a specific duration) for me?
Question 2:
In all cases (OSTs or MDTs) of umount, whether they are fast or not, I see messages
like the following in dmesg:
[88944.275884] Lustre: Failing over lustrefs-OST0000
or
[78406.007678] Lustre: Failing over lustrefs-MDT0000
There is no failover configured in my setup. The MGS is up the entire time in all
cases. What is Lustre doing here? How do I explicitly disable this failover
attempt, since it seems to be at best misleading and at worst directly related to
the lengthy delays? FWIW, I have tried umount with '-f' to cause the MDT to go into
failout rather than failover, to no avail.
Thanks for any help folks can offer on this in advance,
ellis
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org