
List:       lustre-discuss
Subject:    [lustre-discuss] Re: Cannot add new OST after upgrade from 2.5.3 to 2.10.6
From:       wanglu <wanglu () ihep ! ac ! cn>
Date:       2018-12-29 3:06:55
Message-ID: 201812291106558575205 () ihep ! ac ! cn

Hi,

The new OSTs were formatted with e2fsprogs-1.44.3.wc1-0.el7.x86_64, while the MGS
and the older OSTs were formatted last year with e2fsprogs-1.42.12.wc1 and are now
mounted with e2fsprogs-1.44.3.wc1-0.el7.x86_64 installed. Do we need to run
writeconf on all the devices, following this process?
https://lustre-discuss.lustre.narkive.com/Z5s6LU8B/lustre-2-5-2-unable-to-mount-ost
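
For context, here is a rough sketch of the order in which a writeconf is
usually done, as I understand the procedure from the Lustre manual: stop all
targets, run "tunefs.lustre --writeconf" on the MDT (with the co-located MGS)
first and then on every OST, and remount in that same order. The device paths
and mount points are hypothetical placeholders, not our real layout, and the
script only prints the commands; it does not run anything.

```python
from typing import List, Tuple

# (block device, mount point); every value passed in below is a hypothetical
# placeholder, not a device from the real system.
Target = Tuple[str, str]


def writeconf_plan(mdt: Target, osts: List[Target]) -> List[str]:
    """Return the shell commands, in order, for regenerating the config logs.

    Assumes all clients have already been unmounted and that the MGS is
    co-located with the MDT, as on the system described in this thread.
    """
    targets = [mdt] + osts
    plan = []
    # 1. Stop the servers: the MDT first, then every OST.
    for _dev, mnt in targets:
        plan.append(f"umount {mnt}")
    # 2. Erase the configuration logs, MDT/MGS first, then each OST.
    for dev, _mnt in targets:
        plan.append(f"tunefs.lustre --writeconf {dev}")
    # 3. Remount in the same order, so the MGS is up before the OSTs register.
    for dev, mnt in targets:
        plan.append(f"mount -t lustre {dev} {mnt}")
    return plan


if __name__ == "__main__":
    for cmd in writeconf_plan(("/dev/mdt_dev", "/mnt/mdt"),
                              [("/dev/ost0_dev", "/mnt/ost0"),
                               ("/dev/ost1_dev", "/mnt/ost1")]):
        print(cmd)
```

Printing the plan first makes it easy to review the order on every server
before any configuration log is actually erased.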

Thanks,
Lu

====================================================================
Computing center,the Institute of High Energy Physics, CAS, China
Wang Lu                                        Tel: (+86) 10 8823 6087
P.O. Box 918-7                               Fax: (+86) 10 8823 6839
Beijing 100049  P.R. China            Email: Lu.Wang@ihep.ac.cn
===================================================================
 
From: wanglu
Date: 2018-12-28 10:45
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] Cannot add new OST after upgrade from 2.5.3 to 2.10.6
Hi,

For hardware compatibility reasons, we just upgraded a 2.5.3 instance to 2.10.6.
After that, when we tried to mount a newly formatted OST on 2.10.6, the mount
failed on the OSS. Here are the symptoms:

1. The OST mount operation hangs for about 10 minutes, and then we get
   "Is the MGS running?..." on the terminal.
2. In syslog, we found:
   LustreError: 166-1: MGC192.168.50.63@tcp: Connection to MGS (at 192.168.50.63@tcp) was lost; in progress operations using this service will fail
   LustreError: 105461:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1545962328, 300s ago), entering recovery for MGS@MGC192.168.50.63@tcp_0 ns: MGC192.168.50.63@tcp lock: ffff9ae9283b8200/0xa4c148c2f2e256b9 lrc: 4/1,0 mode: -/CR res: [0x73666361:0x0:0x0].0x0 rrc: 3 type: PLN flags: 0x1000000000000 nid: local remote: 0x38d3cf901311c189 expref: -99 pid: 105461 timeout: 0 lvb_type: 0
3. While the mount is hung, we can see ll_OST_XX and lazyldiskfsinit running on
   the new OSS, but the obdfilter directory cannot be found under
   /proc/fs/lustre.
4. On the MDS+MGS node, we got
   "166-1: MGC192.168.50.63@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail".
5. After that, other new clients cannot mount the file system.
6. It seemed that the OST mount operation had caused problems on the MGS, so we
   unmounted the MDT, ran e2fsck, and remounted it.
7. After that, client mounts are possible, and "lfs df" shows a deactivated OST.
8. When we tried to mount the OST on the new OSS again, the symptoms repeated.
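
In case it helps with reading the lock timeout message in item 2, here is a
small sketch that pulls two fields out of it. It converts the "enqueued at"
epoch value to a UTC time, and it unpacks the first word of "res:" as a
little-endian ASCII string. That second step rests on my assumption that the
MGC names its config lock resource after the file system name; I have not
confirmed this against the source. The helper name decode_lock_timeout is made
up for this example.

```python
import re
from datetime import datetime, timezone

# Excerpt copied from the syslog message in item 2 above.
LOG = ("### lock timed out (enqueued at 1545962328, 300s ago), entering "
       "recovery for MGS@MGC192.168.50.63@tcp_0 ns: MGC192.168.50.63@tcp "
       "lock: ffff9ae9283b8200/0xa4c148c2f2e256b9 lrc: 4/1,0 mode: -/CR "
       "res: [0x73666361:0x0:0x0].0x0 rrc: 3 type: PLN")


def decode_lock_timeout(line: str):
    """Return (enqueue time in UTC, seconds waited, decoded resource name)."""
    enq = re.search(r"enqueued at (\d+), (\d+)s ago", line)
    res = re.search(r"res: \[(0x[0-9a-fA-F]+):", line)
    if enq is None or res is None:
        raise ValueError("not a lock timeout message")
    when = datetime.fromtimestamp(int(enq.group(1)), tz=timezone.utc)
    waited = int(enq.group(2))
    # Assumption: the first resource word is a short name packed into a
    # 64-bit little-endian integer, padded with NUL bytes.
    raw = int(res.group(1), 16).to_bytes(8, "little").rstrip(b"\0")
    return when, waited, raw.decode("ascii")


if __name__ == "__main__":
    when, waited, name = decode_lock_timeout(LOG)
    print(when.isoformat(), waited, name)
    # prints: 2018-12-28T01:58:48+00:00 300 acfs
```

Under that assumption, the lock that timed out after 300 seconds would be the
configuration lock for a file system named "acfs", enqueued at 09:58:48 Beijing
time on 2018-12-28.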

Does anyone have a hint on this problem?

Cheers,
Lu



_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

