List: lustre-discuss
Subject: [Lustre-discuss] help
From: Colin_Faber () xyratex ! com (Colin Faber)
Date: 2011-09-30 14:46:48
Message-ID: CA68FDCAE785124C81F07EC64FA6302F01C67485 () XYUS-EX21 ! xyus ! xyratex ! com
Hi,
This looks like a connection timeout, likely temporary, as the client
appears to have reconnected and recovered without any problems.
What other issues are you experiencing?
-cf
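
To check whether every lost connection in a log like the one quoted below was later restored, a rough sketch such as the following may help. The patterns are modelled only on the messages in this thread and assume the raw syslog form of the NID (10.148.0.106@o2ib); the function names are illustrative and not part of Lustre. Because the kernel rate-limits repeated messages ("Skipped N previous similar messages"), a target with no matching "restored" line is only a hint, not proof that it is still down.

```python
import re

# Patterns modelled on the messages quoted in this thread; other Lustre
# versions may word them differently (assumption).
LOST = re.compile(r"Connection to service (\S+) via nid \S+ was lost")
RESTORED = re.compile(r"Connection restored to service (\S+) using nid")
EVICTED = re.compile(r"This client was evicted by ([^;\s]+)")


def summarize(log_text):
    """Return (lost, restored, evicted) sets of target names."""
    # Collapse line wraps so a message split across lines still matches.
    text = " ".join(log_text.split())
    lost = set(LOST.findall(text))
    restored = set(RESTORED.findall(text))
    evicted = set(EVICTED.findall(text))
    return lost, restored, evicted


def still_down(log_text):
    """Targets reported lost with no later 'restored' message in the log."""
    lost, restored, _ = summarize(log_text)
    return sorted(lost - restored)


if __name__ == "__main__":
    import sys

    lost, restored, evicted = summarize(sys.stdin.read())
    print("lost:    ", sorted(lost))
    print("restored:", sorted(restored))
    print("evicted: ", sorted(evicted))
    print("no restore seen for:", sorted(lost - restored))
```

Saved under a name of your choice, it can be fed the kernel log on standard input, for example the output of dmesg or the relevant part of /var/log/messages.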
On 09/29/2011 10:39 PM, Ashok nulguda wrote:
> Dear All,
>
> I am getting Lustre errors on my HPC cluster, as shown below. Can anyone
> help me resolve this problem?
> Thanks in advance.
> Sep 30 08:40:23 service0 kernel: [343138.837222] Lustre:
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) Skipped 1 previous
> similar message
> Sep 30 08:40:23 service0 kernel: [343138.837233] Lustre:
> lustre-OST0008-osc-ffff880b272cf800: Connection to service
> lustre-OST0008 via nid 10.148.0.106@o2ib was lost; in progress
> operations using this service will wait for recovery to complete.
> Sep 30 08:40:24 service0 kernel: [343139.837260] Lustre:
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
> x1380984193067288 sent from lustre-OST0006-osc-ffff880b272cf800 to NID
> 10.148.0.106@o2ib 7s ago has timed out (7s prior to deadline).
> Sep 30 08:40:24 service0 kernel: [343139.837263]
> req@ffff880a5f800c00 x1380984193067288/t0
> o3->lustre-OST0006_UUID@10.148.0.106@o2ib:6/4 lens 448/592 e 0 to 1 dl
> 1317352224 ref 2 fl Rpc:/0/0 rc 0/0
> Sep 30 08:40:24 service0 kernel: [343139.837269] Lustre:
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) Skipped 38 previous
> similar messages
> Sep 30 08:40:24 service0 kernel: [343140.129284] LustreError:
> 9983:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -11 from
> cancel RPC: canceling anyway
> Sep 30 08:40:24 service0 kernel: [343140.129290] LustreError:
> 9983:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Skipped 1 previous
> similar message
> Sep 30 08:40:24 service0 kernel: [343140.129295] LustreError:
> 9983:0:(ldlm_request.c:1587:ldlm_cli_cancel_list())
> ldlm_cli_cancel_list: -11
> Sep 30 08:40:24 service0 kernel: [343140.129299] LustreError:
> 9983:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) Skipped 1 previous
> similar message
> Sep 30 08:40:25 service0 kernel: [343140.837308] Lustre:
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
> x1380984193067299 sent from lustre-OST0010-osc-ffff880b272cf800 to NID
> 10.148.0.106@o2ib 7s ago has timed out (7s prior to deadline).
> Sep 30 08:40:25 service0 kernel: [343140.837311]
> req@ffff880a557c4400 x1380984193067299/t0
> o3->lustre-OST0010_UUID@10.148.0.106@o2ib:6/4 lens 448/592 e 0 to 1 dl
> 1317352225 ref 2 fl Rpc:/0/0 rc 0/0
> Sep 30 08:40:25 service0 kernel: [343140.837316] Lustre:
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) Skipped 4 previous
> similar messages
> Sep 30 08:40:26 service0 kernel: [343141.245365] LustreError:
> 30978:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -11 from
> cancel RPC: canceling anyway
> Sep 30 08:40:26 service0 kernel: [343141.245371] LustreError:
> 22729:0:(ldlm_request.c:1587:ldlm_cli_cancel_list())
> ldlm_cli_cancel_list: -11
> Sep 30 08:40:26 service0 kernel: [343141.245378] LustreError:
> 30978:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Skipped 1 previous
> similar message
> Sep 30 08:40:33 service0 kernel: [343148.245683] Lustre:
> 22725:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
> x1380984193067302 sent from lustre-OST0004-osc-ffff880b272cf800 to NID
> 10.148.0.106@o2ib 14s ago has timed out (14s prior to deadline).
> Sep 30 08:40:33 service0 kernel: [343148.245686]
> req@ffff8805c879e800 x1380984193067302/t0
> o103->lustre-OST0004_UUID@10.148.0.106@o2ib:17/18 lens 296/384 e 0 to
> 1 dl 1317352233 ref 1 fl Rpc:N/0/0 rc 0/0
> Sep 30 08:40:33 service0 kernel: [343148.245692] Lustre:
> 22725:0:(client.c:1476:ptlrpc_expire_one_request()) Skipped 2 previous
> similar messages
> Sep 30 08:40:33 service0 kernel: [343148.245708] LustreError:
> 22725:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -11 from
> cancel RPC: canceling anyway
> Sep 30 08:40:33 service0 kernel: [343148.245714] LustreError:
> 22725:0:(ldlm_request.c:1587:ldlm_cli_cancel_list())
> ldlm_cli_cancel_list: -11
> Sep 30 08:40:33 service0 kernel: [343148.245717] LustreError:
> 22725:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) Skipped 1
> previous similar message
> Sep 30 08:40:36 service0 kernel: [343151.548005] LustreError: 11-0: an
> error occurred while communicating with 10.148.0.106@o2ib. The
> ost_connect operation failed with -16
> Sep 30 08:40:36 service0 kernel: [343151.548008] LustreError: Skipped
> 1 previous similar message
> Sep 30 08:40:36 service0 kernel: [343151.548024] LustreError: 167-0:
> This client was evicted by lustre-OST000b; in progress operations
> using this service will fail.
> Sep 30 08:40:36 service0 kernel: [343151.548250] LustreError:
> 30452:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5
> Sep 30 08:40:36 service0 kernel: [343151.550210] LustreError:
> 8300:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID
> req@ffff88049528c400 x1380984193067406/t0
> o3->lustre-OST000b_UUID@10.148.0.106@o2ib:6/4 lens 448/592 e 0 to 1 dl
> 0 ref 2 fl Rpc:/0/0 rc 0/0
> Sep 30 08:40:36 service0 kernel: [343151.594742] Lustre:
> lustre-OST0000-osc-ffff880b272cf800: Connection restored to service
> lustre-OST0000 using nid 10.148.0.106@o2ib.
> Sep 30 08:40:36 service0 kernel: [343151.837203] Lustre:
> lustre-OST0006-osc-ffff880b272cf800: Connection restored to service
> lustre-OST0006 using nid 10.148.0.106@o2ib.
> Sep 30 08:40:37 service0 kernel: [343152.842631] Lustre:
> lustre-OST0003-osc-ffff880b272cf800: Connection restored to service
> lustre-OST0003 using nid 10.148.0.106@o2ib.
> Sep 30 08:40:37 service0 kernel: [343152.842636] Lustre: Skipped 3
> previous similar messages
>
>
> Thanks and Regards
> Ashok
>
> --
> *Ashok Nulguda
> *
> *TATA ELXSI LTD*
> *Mb : +91 9689945767
> *
> *Email :ashokn at tataelxsi.co.in <mailto:tshrikant at tataelxsi.co.in>*