List:       hadoop-user
Subject:    RE: Why I cannot see live nodes in a LAN-based cluster setup?
From:       <Jeff.Schmitz@shell.com>
Date:       2011-06-28 14:30:48
Message-ID: C4F27D8A27AA5748B100E6CC5EF36CB40284234B@houic-s-03344.americas.shell.com

You may also try removing the hadoop-<yourname> directory from /tmp and
reformatting HDFS - it may be corrupted.
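
Roughly this sequence (a sketch; it assumes the default hadoop.tmp.dir
under /tmp, and note the format step wipes all HDFS data):

  stop-all.sh
  rm -rf /tmp/hadoop-<yourname>
  hadoop namenode -format
  start-dfs.sh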

-----Original Message-----
From: GOEKE, MATTHEW (AG/1000) [mailto:matthew.goeke@monsanto.com] 
Sent: Monday, June 27, 2011 10:57 PM
To: common-user@hadoop.apache.org
Subject: RE: Why I cannot see live nodes in a LAN-based cluster setup?

At this point, if that is the correct IP, then I would see if you can
actually ssh from the DN to the NN to make sure it can connect to the other
box. If you can successfully connect through ssh, then it's just a matter
of figuring out why that port is having issues (netstat is your friend in
this case). If you see it listening on 54310, then just power cycle the box
and try again.
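
Something along these lines ("namenode-host" is a placeholder for your
real NN hostname):

  ssh user@namenode-host          # from the DN; proves basic connectivity
  netstat -tlnp | grep 54310      # on the NN; which address is 54310 bound to?
  telnet namenode-host 54310      # from the DN; is the port reachable at all?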

Matt

-----Original Message-----
From: Jingwei Lu [mailto:jlu@ucsd.edu] 
Sent: Monday, June 27, 2011 5:38 PM
To: common-user@hadoop.apache.org
Subject: Re: Why I cannot see live nodes in a LAN-based cluster setup?

Hi Matt and Jeff:

Thanks a lot for your instructions. I corrected the mistakes in the conf
files on the DN, and now the DN log shows:

2011-06-27 15:32:36,025 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 0 time(s).
2011-06-27 15:32:37,028 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 1 time(s).
2011-06-27 15:32:38,031 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 2 time(s).
2011-06-27 15:32:39,034 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 3 time(s).
2011-06-27 15:32:40,037 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 4 time(s).
2011-06-27 15:32:41,040 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 5 time(s).
2011-06-27 15:32:42,043 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 6 time(s).
2011-06-27 15:32:43,046 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 7 time(s).
2011-06-27 15:32:44,049 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 8 time(s).
2011-06-27 15:32:45,052 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: clock.ucsd.edu/132.239.95.91:54310. Already tried 9 time(s).
2011-06-27 15:32:45,053 INFO org.apache.hadoop.ipc.RPC: Server at
clock.ucsd.edu/132.239.95.91:54310 not available yet, Zzzzz...

It seems the DN keeps trying to connect to the NN but always fails...
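
Next I will check from the DN whether the port is reachable at all, and
on the NN which address 54310 is actually bound to, e.g.:

  telnet clock.ucsd.edu 54310    (from the DN)
  netstat -an | grep 54310       (on the NN)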



Best Regards
Yours Sincerely

Jingwei Lu



On Mon, Jun 27, 2011 at 2:22 PM, GOEKE, MATTHEW (AG/1000) <
matthew.goeke@monsanto.com> wrote:

> As a follow-up to what Jeff posted: go ahead and ignore the message you got
> on the NN for now.
> 
> If you look at the address in the DN log, it is 127.0.0.1, and the
> ip:port it is trying to connect to for the NN is 127.0.0.1:54310 ---> it
> is trying to connect to itself as if it were still in single-machine
> mode. Make sure that you have correctly pushed the URI for the NN into
> the config files on both machines and then bounce DFS.
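> 
> A minimal sketch of what conf/core-site.xml should carry on *both*
> machines ("namenode-host" here is a placeholder for the NN's real
> hostname or LAN IP, never localhost):
> 
>   <configuration>
>     <property>
>       <name>fs.default.name</name>
>       <value>hdfs://namenode-host:54310</value>
>     </property>
>   </configuration>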
> 
> Matt
> 
> -----Original Message-----
> From: Jeff.Schmitz@shell.com [mailto:Jeff.Schmitz@shell.com]
> Sent: Monday, June 27, 2011 4:08 PM
> To: common-user@hadoop.apache.org
> Subject: RE: Why I cannot see live nodes in a LAN-based cluster setup?
> 
> http://www.mentby.com/tim-robertson/error-register-getprotocolversion.html
> 
> 
> 
> -----Original Message-----
> From: Jingwei Lu [mailto:jlu@ucsd.edu]
> Sent: Monday, June 27, 2011 3:58 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Why I cannot see live nodes in a LAN-based cluster setup?
> 
> Hi,
> 
> I just manually modified the masters & slaves files on both machines.
> 
> I found something wrong in the log files, as shown below:
> 
> -- Master :
> namenode.log:
> 
> ****************************************
> 2011-06-27 13:44:47,055 INFO org.mortbay.log: jetty-6.1.14
> 2011-06-27 13:44:47,394 INFO org.mortbay.log: Started
> SelectChannelConnector@0.0.0.0:50070
> 2011-06-27 13:44:47,395 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at:
> 0.0.0.0:50070
> 2011-06-27 13:44:47,395 INFO org.apache.hadoop.ipc.Server: IPC Server
> Responder: starting
> 2011-06-27 13:44:47,395 INFO org.apache.hadoop.ipc.Server: IPC Server
> listener on 54310: starting
> 2011-06-27 13:44:47,396 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 0 on 54310: starting
> 2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 54310: starting
> 2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 54310: starting
> 2011-06-27 13:44:47,397 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 3 on 54310: starting
> 2011-06-27 13:44:47,402 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 4 on 54310: starting
> 2011-06-27 13:44:47,404 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 5 on 54310: starting
> 2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 6 on 54310: starting
> 2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 7 on 54310: starting
> 2011-06-27 13:44:47,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 8 on 54310: starting
> 2011-06-27 13:44:47,408 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 9 on 54310: starting
> 2011-06-27 13:44:47,500 INFO org.apache.hadoop.ipc.Server: Error register
> getProtocolVersion
> java.lang.IllegalArgumentException: Duplicate metricsName:getProtocolVersion
>   at org.apache.hadoop.metrics.util.MetricsRegistry.add(MetricsRegistry.java:53)
>   at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:89)
>   at org.apache.hadoop.metrics.util.MetricsTimeVaryingRate.<init>(MetricsTimeVaryingRate.java:99)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:416)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
> 2011-06-27 13:45:02,572 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.registerDatanode: node registration from 127.0.0.1:50010
> storage DS-87816363-127.0.0.1-50010-1309207502566
> ****************************************
> 
> 
> -- slave:
> datanode.log:
> 
> ****************************************
> 2011-06-27 13:45:00,335 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = hdl.ucsd.edu/127.0.0.1
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r
> 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> ************************************************************/
> 2011-06-27 13:45:02,476 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 0 time(s).
> 2011-06-27 13:45:03,549 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 1 time(s).
> 2011-06-27 13:45:04,552 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 2 time(s).
> 2011-06-27 13:45:05,609 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 3 time(s).
> 2011-06-27 13:45:06,640 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 4 time(s).
> 2011-06-27 13:45:07,643 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 5 time(s).
> 2011-06-27 13:45:08,646 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 6 time(s).
> 2011-06-27 13:45:09,661 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 7 time(s).
> 2011-06-27 13:45:10,664 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 8 time(s).
> 2011-06-27 13:45:11,678 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hdl.ucsd.edu/127.0.0.1:54310. Already tried 9 time(s).
> 2011-06-27 13:45:11,679 INFO org.apache.hadoop.ipc.RPC: Server at
> hdl.ucsd.edu/127.0.0.1:54310 not available yet, Zzzzz...
> ****************************************
> 
> (Just a guess: is this due to some port problem?)
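> 
> One more thing I notice: the DN startup prints "host =
> hdl.ucsd.edu/127.0.0.1", i.e. the hostname resolves to loopback. I will
> check /etc/hosts on both machines, e.g.:
> 
>   grep -E 'hdl|127' /etc/hosts
> 
> to see whether the hostname is mapped to 127.0.0.1 instead of the real
> LAN address.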
> 
> Any comments will be greatly appreciated!
> 
> Best Regards
> Yours Sincerely
> 
> Jingwei Lu
> 
> 
> 
> On Mon, Jun 27, 2011 at 1:28 PM, GOEKE, MATTHEW (AG/1000) <
> matthew.goeke@monsanto.com> wrote:
> 
> > Did you make sure to define the datanode/tasktracker in the slaves file
> > in your conf directory and push that to both machines? Also have you
> > checked the logs on either to see if there are any errors?
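> > 
> > For example (placeholder hostnames; use your real ones), conf/slaves
> > on both boxes would be:
> > 
> >   datanode-host
> >   namenode-host   <- only if the NN box should also run a DN/TT
> > 
> > Note that conf/masters, despite the name, lists the SecondaryNameNode
> > host, not the NN.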
> > 
> > Matt
> > 
> > -----Original Message-----
> > From: Jingwei Lu [mailto:jlu@ucsd.edu]
> > Sent: Monday, June 27, 2011 3:24 PM
> > To: HADOOP MLIST
> > Subject: Why I cannot see live nodes in a LAN-based cluster setup?
> > 
> > Hi Everyone:
> > 
> > I am quite new to Hadoop here. I am attempting to set up Hadoop
> > locally on two machines, connected by LAN. Both of them pass the
> > single-node test. However, I failed in the two-node cluster setup, as
> > shown in the two cases below:
> > 
> > 1) set one as dedicated namenode and the other as dedicated datanode
> > 2) set one as both name- and data-node, and the other as just datanode
> > 
> > I launch *start-dfs.sh* on the namenode. Since I have all the *ssh*
> > issues cleared, I can always observe the daemons start up on every
> > datanode. However, the web page at *http://(URI of namenode):50070*
> > shows only 0 live nodes for (1) and 1 live node for (2), which matches
> > the output of the command-line *hadoop dfsadmin -report*.
> > 
> > In general, it appears that from the namenode you cannot see the remote
> > datanode as alive, let alone run a normal cross-node MapReduce job.
> > 
> > Could anyone give some hints / instructions at this point? I really
> > appreciate it!
> > 
> > Thanks.
> > 
> > Best Regards
> > Yours Sincerely
> > 
> > Jingwei Lu
> 

