List: lustre-discuss
Subject: Re: [lustre-discuss] 2.15.4 o2iblnd on RoCEv2?
From: Andreas Dilger via lustre-discuss <lustre-discuss () lists ! lustre ! org>
Date: 2024-01-10 21:55:57
Message-ID: 03B63E33-2F62-427D-95E6-4BAC13B15D44 () ddn ! com
Granted that I'm not an LNet expert, but "errno: -1 descr: cannot parse net
'<255:65535>'" doesn't immediately lead me to the same conclusion as "unknown
interface 'ib0'" would if it were printed as the error message. Also, "errno: -1"
reads as "-EPERM = Operation not permitted", and doesn't give the same information
as "-ENXIO = No such device or address" or even "-EINVAL = Invalid argument" would.
That said, I can't even offer a patch for this myself, since that exact error
message is used in a few different places, though I suspect it is coming from
lustre_lnet_config_ni().
Looking further into this, now that I've found where (I think) the error message
is generated, it seems that "errno: -1" is not "-EPERM" but rather
"LUSTRE_CFG_RC_BAD_PARAM". IMHO it is a travesty to use different error numbers
(and then print them after "errno:") instead of existing POSIX error codes that
could fill the same role (with some creative mapping):
   #define LUSTRE_CFG_RC_NO_ERR                 0  => fine
   #define LUSTRE_CFG_RC_BAD_PARAM             -1  => -EINVAL
   #define LUSTRE_CFG_RC_MISSING_PARAM         -2  => -EFAULT
   #define LUSTRE_CFG_RC_OUT_OF_RANGE_PARAM    -3  => -ERANGE
   #define LUSTRE_CFG_RC_OUT_OF_MEM            -4  => -ENOMEM
   #define LUSTRE_CFG_RC_GENERIC_ERR           -5  => -ENODATA
   #define LUSTRE_CFG_RC_NO_MATCH              -6  => -ENOMSG
   #define LUSTRE_CFG_RC_MATCH                 -7  => -EXFULL
   #define LUSTRE_CFG_RC_SKIP                  -8  => -EBADSLT
   #define LUSTRE_CFG_RC_LAST_ELEM             -9  => -ECHRNG
   #define LUSTRE_CFG_RC_MARSHAL_FAIL         -10  => -ENOSTR
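As a rough sketch of that mapping, assuming Linux errno names (EXFULL, EBADSLT and ECHRNG are Linux-specific): the helper name cfg_rc_to_errno() is made up for illustration and is not something in the Lustre tree, and the #define values are simply copied from the list above.

```c
#include <errno.h>

/* Values copied from the list above. */
#define LUSTRE_CFG_RC_NO_ERR			0
#define LUSTRE_CFG_RC_BAD_PARAM			(-1)
#define LUSTRE_CFG_RC_MISSING_PARAM		(-2)
#define LUSTRE_CFG_RC_OUT_OF_RANGE_PARAM	(-3)
#define LUSTRE_CFG_RC_OUT_OF_MEM		(-4)
#define LUSTRE_CFG_RC_GENERIC_ERR		(-5)
#define LUSTRE_CFG_RC_NO_MATCH			(-6)
#define LUSTRE_CFG_RC_MATCH			(-7)
#define LUSTRE_CFG_RC_SKIP			(-8)
#define LUSTRE_CFG_RC_LAST_ELEM			(-9)
#define LUSTRE_CFG_RC_MARSHAL_FAIL		(-10)

/*
 * Hypothetical helper (not part of Lustre): translate a LUSTRE_CFG_RC_*
 * value into the negated errno suggested in the list above.
 * Anything outside the known set is passed through unchanged.
 */
static int cfg_rc_to_errno(int rc)
{
	switch (rc) {
	case LUSTRE_CFG_RC_NO_ERR:		return 0;
	case LUSTRE_CFG_RC_BAD_PARAM:		return -EINVAL;
	case LUSTRE_CFG_RC_MISSING_PARAM:	return -EFAULT;
	case LUSTRE_CFG_RC_OUT_OF_RANGE_PARAM:	return -ERANGE;
	case LUSTRE_CFG_RC_OUT_OF_MEM:		return -ENOMEM;
	case LUSTRE_CFG_RC_GENERIC_ERR:		return -ENODATA;
	case LUSTRE_CFG_RC_NO_MATCH:		return -ENOMSG;
	case LUSTRE_CFG_RC_MATCH:		return -EXFULL;
	case LUSTRE_CFG_RC_SKIP:		return -EBADSLT;
	case LUSTRE_CFG_RC_LAST_ELEM:		return -ECHRNG;
	case LUSTRE_CFG_RC_MARSHAL_FAIL:	return -ENOSTR;
	default:				return rc;
	}
}
```

With something like this, the value printed after "errno:" could be passed to strerror() and come out as text that matches the actual failure.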
I don't think "overloading" the POSIX error codes to mean something similar is
worse than using random numbers to report errors. Also, in some cases (even in
lustre_lnet_config_ni()) it is using "rc = -errno" so the LUSTRE_CFG_RC_* errors
are *already* conflicting with POSIX error numbers, and it is impossible to
distinguish between them...
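To make the conflict concrete, here is a small sketch, assuming Linux errno numbering (where EPERM is 1). The two helper names are hypothetical and only exist for this illustration; the two #define values are the ones quoted earlier in this message.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* First and last error values from the LUSTRE_CFG_RC_* list. */
#define LUSTRE_CFG_RC_BAD_PARAM		(-1)
#define LUSTRE_CFG_RC_MARSHAL_FAIL	(-10)

/*
 * On Linux EPERM is 1, so a path doing "rc = -errno" after an EPERM
 * failure returns the very same integer as LUSTRE_CFG_RC_BAD_PARAM.
 */
static_assert(-EPERM == LUSTRE_CFG_RC_BAD_PARAM,
	      "EPERM is expected to collide with BAD_PARAM");

/*
 * Hypothetical helper: true when rc lies in -10..-1, the range where a
 * LUSTRE_CFG_RC_* value and a negated errno cannot be told apart.
 */
static int cfg_rc_is_ambiguous(int rc)
{
	return rc <= LUSTRE_CFG_RC_BAD_PARAM &&
	       rc >= LUSTRE_CFG_RC_MARSHAL_FAIL;
}

/*
 * Hypothetical helper: the text a reader would infer from "errno: N"
 * by treating N as a negated POSIX errno.
 */
static const char *cfg_rc_as_posix_text(int rc)
{
	return strerror(-rc);
}
```

A caller that sees rc == -1 has no way to know which of the two meanings was intended, which is why the message ends up reading as "Operation not permitted".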
The main question is whether changing these numbers will break a user->kernel
interface, or whether these definitions are only in userspace. It looks like
lnetctl.c is only ever checking "!= LUSTRE_CFG_RC_NO_ERR", so maybe it is fine?
None of the current values overlap with the proposed errno values, so it would be
possible to start accepting either value for the return in the user tools, and
then at some point in the future start actually returning the errno values...
Something for the LNet folks to figure out.
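A minimal sketch of what "accepting either value" might look like in a user tool, again assuming Linux errno numbering. All three helper names are hypothetical, and only two of the codes are shown.

```c
#include <assert.h>
#include <errno.h>

/* Values copied from the LUSTRE_CFG_RC_* list earlier in this message. */
#define LUSTRE_CFG_RC_NO_ERR		0
#define LUSTRE_CFG_RC_BAD_PARAM		(-1)
#define LUSTRE_CFG_RC_OUT_OF_MEM	(-4)

/*
 * The transition only works if the proposed errno values stay outside
 * the old -1..-10 range, so check that at compile time.
 */
static_assert(EINVAL > 10 && ENOMEM > 10,
	      "proposed errno values must not overlap the old range");

/* Hypothetical helper: match rc against the old value or the new errno. */
static int cfg_rc_is(int rc, int legacy, int posix_errno)
{
	return rc == legacy || rc == -posix_errno;
}

/* Hypothetical convenience wrappers for two of the codes. */
static int cfg_rc_is_bad_param(int rc)
{
	return cfg_rc_is(rc, LUSTRE_CFG_RC_BAD_PARAM, EINVAL);
}

static int cfg_rc_is_out_of_mem(int rc)
{
	return cfg_rc_is(rc, LUSTRE_CFG_RC_OUT_OF_MEM, ENOMEM);
}
```

Callers that only test "!= LUSTRE_CFG_RC_NO_ERR" would not need to change at all, since zero keeps meaning success under either scheme.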
Cheers, Andreas
On Jan 10, 2024, at 13:29, Jeff Johnson <jeff.johnson@aeoncomputing.com> wrote:
A LU ticket and patch for lnetctl or for me being an under-caffeinated
idiot? ;-)
On Wed, Jan 10, 2024 at 12:06 PM Andreas Dilger <adilger@whamcloud.com> wrote:
It would seem that the error message could be improved in this case? Could you
file an LU ticket for that with the reproducer below, and ideally along with a
patch?
Cheers, Andreas
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud