List:       hadoop-user
Subject:    Re: extremely imbalance in the hdfs cluster
From:       茅旭峰 (Mao Xu-Feng) <m9suns@gmail.com>
Date:       2011-06-30 4:22:06
Message-ID: BANLkTimBMkO+VXh7TVcty6_ZB4FJk1-F7g@mail.gmail.com

Thanks Edward! It seems we will just have to live with this issue.

On Wed, Jun 29, 2011 at 11:24 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> We have run into this issue as well. Since Hadoop writes round-robin, disks
> of different sizes really screw things up royally, especially if you are
> running at high capacity. We have found that decommissioning hosts for
> stretches of time is more effective than the balancer in extreme situations.
> Another hokey trick: the node a job is launched from is always used as the
> first replica for what it writes, so you can leverage that by launching jobs
> from your bigger machines, which makes data more likely to be saved there.
> A super hokey solution is moving blocks around with rsync! (Block reports
> eventually happen and reconcile this; I do not suggest it.)
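>
> A minimal sketch of the decommission trick, assuming hdfs-site.xml already
> points dfs.hosts.exclude at an excludes file (the path below is only an
> example; 10.150.161.88 is one of the over-full nodes from the report
> further down):
>
>   # add the over-full node to the excludes file the NameNode reads
>   echo "10.150.161.88" >> /etc/hadoop/conf/dfs.exclude
>   hadoop dfsadmin -refreshNodes    # NameNode starts re-replicating its blocks elsewhere
>
>   # watch progress; the node shows "Decommission in progress" in the report
>   hadoop dfsadmin -report | grep -A 2 10.150.161.88
>
>   # once enough data has drained, take the node out of the excludes file
>   # and refresh again so it rejoins as a writable DataNode
>   sed -i '/10.150.161.88/d' /etc/hadoop/conf/dfs.exclude
>   hadoop dfsadmin -refreshNodes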
>
> Hadoop really does need something more intelligent than round-robin writing
> for heterogeneous systems; there might be a JIRA open on this somewhere.
> But if you are on 0.20.X you have to work with it.
>
> Edward
>
> On Wed, Jun 29, 2011 at 9:06 AM, 茅旭峰 <m9suns@gmail.com> wrote:
>
> > Hi,
> >
> > I'm running a 37-DataNode HDFS cluster. 12 of the nodes have 20 TB of
> > capacity each, and the other 25 nodes have 24 TB each. Unfortunately,
> > several nodes hold much more data than the others, and I can still see
> > their data growing like crazy. 'dstat' shows:
> >
> > dstat -ta 2
> > -----time----- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> >   date/time   |usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
> > 24-06 00:42:43|  1   1  95   2   0   0|  25M   62M|   0     0 |   0   0.1|3532  5644
> > 24-06 00:42:45|  7   1  91   0   0   0|  16k  176k|8346B 1447k|   0     0|1201   365
> > 24-06 00:42:47|  7   1  91   0   0   0|  12k  172k|9577B 1493k|   0     0|1223   334
> > 24-06 00:42:49| 11   3  83   1   0   1|  26M   11M|  78M   66M|   0     0| 12k   18k
> > 24-06 00:42:51|  4   3  90   1   0   2|  17M  181M| 117M   53M|   0     0| 15k   26k
> > 24-06 00:42:53|  4   3  87   4   0   2|  15M  375M| 117M   55M|   0     0| 16k   26k
> > 24-06 00:42:55|  3   2  94   1   0   1|  15M   37M|  80M   17M|   0     0| 10k   15k
> > 24-06 00:42:57|  0   0  98   1   0   0|  18M   23M|7259k 5988k|   0     0|1932  1066
> > 24-06 00:42:59|  0   0  98   1   0   0|  16M  132M| 708k  106k|   0     0|1484   491
> > 24-06 00:43:01|  4   2  91   2   0   1|  23M   64M|  76M   41M|   0     0|8441   13k
> > 24-06 00:43:03|  4   3  88   3   0   1|  17M  207M|  91M   48M|   0     0| 11k   16k
> >
> > From the dstat output, we can see that the write throughput is much higher
> > than the read throughput. I've started a balancer process, with
> > dfs.balance.bandwidthPerSec set (the value is in bytes). From the balancer
> > log, I can see the balancer works well, but the balancing cannot catch up
> > with the writes.
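> >
> > For reference, a sketch of how this is typically set up on 0.20.x (the
> > bandwidth and threshold values below are only examples):
> >
> >   # dfs.balance.bandwidthPerSec is read by each DataNode at startup, so it
> >   # goes into hdfs-site.xml on every DataNode and needs a DataNode restart,
> >   # e.g. 104857600 for 100 MB/s:
> >   #   <property>
> >   #     <name>dfs.balance.bandwidthPerSec</name>
> >   #     <value>104857600</value>
> >   #   </property>
> >
> >   # run the balancer; -threshold is the allowed deviation (in percentage
> >   # points) from the cluster-average DFS Used%, default 10
> >   start-balancer.sh -threshold 5
> >   tail -f $HADOOP_HOME/logs/hadoop-*-balancer-*.log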
> >
> > Right now the only way I can stop the mad growth in data size is to stop
> > the DataNode, set dfs.datanode.du.reserved to 300 GB, and then start the
> > DataNode again; the growth only stops once the node fills up to the 300 GB
> > reservation line.
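> >
> > For anyone trying the same workaround, a sketch of that setting (note the
> > value is in bytes and applies per data volume; 300 GB = 322122547200):
> >
> >   #   <property>
> >   #     <name>dfs.datanode.du.reserved</name>
> >   #     <value>322122547200</value>
> >   #   </property>
> >   # the DataNode only picks this up on restart:
> >   hadoop-daemon.sh stop datanode
> >   hadoop-daemon.sh start datanode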
> >
> > For the crazy nodes, the output of 'hadoop dfsadmin -report' shows:
> >
> > Name: 10.150.161.88:50010
> > Decommission Status : Normal
> > Configured Capacity: 20027709382656 (18.22 TB)
> > DFS Used: 14515387866480 (13.2 TB)
> > Non DFS Used: 0 (0 KB)
> > DFS Remaining: 5512321516176(5.01 TB)
> > DFS Used%: 72.48%
> > DFS Remaining%: 27.52%
> > Last contact: Wed Jun 29 21:03:01 CST 2011
> >
> >
> > Name: 10.150.161.76:50010
> > Decommission Status : Normal
> > Configured Capacity: 20027709382656 (18.22 TB)
> > DFS Used: 16554450730194 (15.06 TB)
> > Non DFS Used: 0 (0 KB)
> > DFS Remaining: 3473258652462(3.16 TB)
> > DFS Used%: 82.66%
> > DFS Remaining%: 17.34%
> > Last contact: Wed Jun 29 21:03:02 CST 2011
> >
> > while the other, normal DataNodes look like this:
> >
> > Name: 10.150.161.65:50010
> > Decommission Status : Normal
> > Configured Capacity: 23627709382656 (21.49 TB)
> > DFS Used: 5953984552236 (5.42 TB)
> > Non DFS Used: 1200643810004 (1.09 TB)
> > DFS Remaining: 16473081020416(14.98 TB)
> > DFS Used%: 25.2%
> > DFS Remaining%: 69.72%
> > Last contact: Wed Jun 29 21:03:01 CST 2011
> >
> >
> > Name: 10.150.161.80:50010
> > Decommission Status : Normal
> > Configured Capacity: 23627709382656 (21.49 TB)
> > DFS Used: 5982565373592 (5.44 TB)
> > Non DFS Used: 1202701691240 (1.09 TB)
> > DFS Remaining: 16442442317824(14.95 TB)
> > DFS Used%: 25.32%
> > DFS Remaining%: 69.59%
> > Last contact: Wed Jun 29 21:03:02 CST 2011
> >
> > Any hints on this issue? We are using 0.20.2-cdh3u0.
> >
> > Thanks and regards,
> >
> > Mao Xu-Feng
> >
>

