List: grid-engine-dev
Subject: question about behaviour of utilbin/<arch>/gethostbyaddr
From: Chris Dagdigian <dag () sonsorol ! org>
Date: 2004-06-21 20:57:49
Message-ID: 40D74BCD.9030107 () sonsorol ! org
The vast majority of SGE installation problems I see are related to
hostname resolution issues. Unless everything resolves perfectly, both
forwards and backwards, a smooth install is unlikely.
I got bitten by this today (again) with SGE 6.
The problem was with utilbin/<arch>/gethostbyaddr.
It does not cleanly handle reverse lookups on IPs whose PTR record
returns only a hostname instead of a FQDN. It fails in a way that would
confuse and frustrate someone attempting to install SGE for the first
time (assuming they can even trace the install failure to a hostname
resolution issue...)
Every other lookup-related tool, such as dig, host and nslookup, handles
this situation just fine. It is only the gethostbyaddr in SGE's utilbin
dir that fails, and this will often be the root cause of a failure to
successfully install a qmaster host.
My basic home lab setup uses a private internal-only DNS zone called
"private.sonsorol.net". I've got forward and reverse DNS enabled, but my
zone maps for the reverse range just return a hostname pointer instead
of a FQDN. This has never been a problem with any OS or any query tool,
and I think it is how many zonefiles are set up by default.
An example would be:
> $ host bladebox
> bladebox.private.sonsorol.net has address 192.168.0.205
And the reverse...
> $ host 192.168.0.205
> 205.0.168.192.in-addr.arpa domain name pointer bladebox.
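The same forward-then-reverse check can be scripted. Below is a minimal
sketch in Python using only the standard socket module. The helper names
is_short_name and check_host are made up for illustration, and this is
not a description of how SGE's own gethostbyaddr performs its lookup.

```python
import socket


def is_short_name(name):
    """Return True if name has a single label, e.g. 'bladebox' or 'bladebox.'."""
    return "." not in name.rstrip(".")


def check_host(name):
    """Resolve name forwards, then reverse each address it maps to.

    Returns a list of (address, ptr_name, ptr_is_short) tuples.
    Needs a working resolver; a failed forward lookup raises
    socket.gaierror and a failed reverse lookup raises socket.herror.
    """
    _, _, addresses = socket.gethostbyname_ex(name)
    results = []
    for addr in addresses:
        ptr_name, _, _ = socket.gethostbyaddr(addr)
        results.append((addr, ptr_name, is_short_name(ptr_name)))
    return results
```

Any tuple whose last field is True marks an address whose reverse record
hands back a bare hostname rather than a FQDN, which is the situation
described in this message.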
The problem with SGE's utilbin/gethostbyaddr is that I see this behavior
with N1GE 6 on OS X and Linux:
$ ./utilbin/lx24-x86/gethostbyaddr -all 192.168.0.205
error resolving ip "192.168.0.205": can't resolve ip (Success)
This sort of resolving error is all that you need to have a failed SGE
install. It is only by habit and experience that I know to start running
the utilbin programs when things go awry :) I suspect this could
frustrate some new users.
The problem goes away if I change my reverse DNS zonefile to use a FQDN:
$ ./utilbin/lx24-x86/gethostbyaddr -all 192.168.0.205
Hostname: bladebox.private.sonsorol.net
SGE name: bladebox.private.sonsorol.net
Aliases:
Host Address(es): 192.168.0.205
My basic question is this:
Is there a reason why gethostbyaddr fails to properly perform a reverse
IP lookup when non-FQDNs are involved? Does it not honor search paths
in resolv.conf or something?
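For reference, here is a rough sketch of what honoring the search path
would mean for a short name. It assumes the usual resolv.conf convention
that the last "search" line wins and that a name ending in a dot is
treated as absolute, so the search list is skipped. The function names
are hypothetical, the ndots ordering rules are ignored, and whether
SGE's gethostbyaddr does anything like this is exactly the open question.

```python
def search_domains(resolv_conf_text):
    """Return the domains from the last 'search' line of resolv.conf-style text."""
    domains = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if fields[0] == "search":
            domains = fields[1:]
    return domains


def candidate_names(name, domains):
    """List the names a search-list lookup might try for name (simplified)."""
    if name.endswith("."):
        # Assumption: a trailing dot marks the name as absolute, no search list.
        return [name]
    return [f"{name}.{domain}" for domain in domains] + [name]
```

With a search list containing private.sonsorol.net, the short name
"bladebox" expands to a candidate FQDN, while "bladebox." (with the
trailing dot, as the PTR record above returns it) does not.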
If there is a valid reason :) can gethostbyaddr perhaps throw a more
informative error message that points the user to the root cause?
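As an illustration of the kind of message meant here, the sketch below
builds a diagnostic from the IP and whatever name the PTR record
returned. The function name and the message wording are hypothetical;
nothing here reflects the actual SGE source.

```python
def describe_reverse_failure(ip, ptr_name):
    """Build a diagnostic for a failed reverse lookup (hypothetical wording).

    ptr_name is the name returned by the PTR record, or None if there was none.
    """
    prefix = f'error resolving ip "{ip}": '
    if ptr_name is None:
        return prefix + "no PTR record found; check the reverse DNS zone"
    if "." not in ptr_name.rstrip("."):
        # A single-label name such as "bladebox." is not a FQDN.
        return (
            prefix
            + f'PTR record returns short name "{ptr_name}"; '
            + "expected a fully qualified domain name"
        )
    return prefix + f'could not resolve PTR name "{ptr_name}" back to an address'
```

A message along these lines would point a new user at the reverse zone
rather than leaving them with "can't resolve ip (Success)".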
Regards,
Chris