'Re: problem of lost name-node'


List:       hadoop-dev
Subject:    Re: problem of lost name-node
From:       Ravi Prakash <ravihadoop () gmail ! com>
Date:       2011-09-29 14:13:31
Message-ID: CAMs9kViovQ8bOM9t3skVqYU4EFp49ENKZW0gy2UhSMFW4z50Qw () mail ! gmail ! com


Hi,

@Mirko: Please file a JIRA. This seems an appropriate time.

@Steve: If we store the absolute filenames (i.e. the whole path), would we
still have the problem you outlined in the 2nd para? I do agree updating
would have to be pushed out and that might be cumbersome, but hey, we are
processing heartbeats from the datanodes every 3 seconds anyway. Maybe we
can piggyback those updates? I'm sure there are better solutions as well, and
I don't think these problems are show-stoppers. If this solution helps to
decrease the FUD, then I think it might be worth it (apart from its merit).

Just my $.02
Ravi




On Wed, Sep 28, 2011 at 9:06 AM, Steve Loughran <stevel@apache.org> wrote:

>
> One of the issues here is keeping that list up to date. You don't want
> filename operations on the NN to push out changes to datanodes (which may
> not be there, after all), and you don't necessarily want every block
> creation operation on a DN to force an update on what effectively becomes a
> mini-db of (filename, block) mappings. Yes, it could just be a text file,
> but you still need to push out atomic updates which don't lose the previous
> version on a power failure. That update would have to be thread safe, and
> you would have to decide whether to make it save-immediately vs lazy-write.
>
> In the situation in which your NN loses the entire image -and all its
> backups- you are going to lose the directory tree. All the per-DN metadata
> would do is leave you with some useful filenames
> (2011_09_22_EMEA_paying_customers.csv.lzo) and lots that aren't
> (mapout0043.something). Someone is still going to have to try and recreate
> what appears to be a functional directory tree from it. Then once you add
> layers on top like HBase, life is even more complicated as the filenames
> will stop bearing any relationship to the content.
>
> I'd go for a process that makes checkpointing NN state more reliable. That
> could include making it easier for the secondary namenode to push out
> updates to worker nodes in the system that can store timestamped/version
> stamped copies of the state; it could be improving recovery of state, and it
> could be better code to make sure that the secondary Namenode is actually
> working. Because you will need a secondary namenode on any cluster of
> moderate size, and you will need to make sure it is working -and test it-.
>
>
> On 28/09/11 14:27, Ravi Prakash wrote:
>
>> Hi Mirko,
>>
>> It seems like a great idea to me!! The architects and senior developers
>> might have some more insight on this though.
>>
>> I think part of the reason why the community might be lazy about
>> implementing this is because the Namenode being a single point of failure
>> is usually regarded as FUD. There are simple tricks (like writing the
>> fsimage and editslog to NFS) which can guard against some failure
>> scenarios, and I think most users of hadoop are satisfied with that.
>>
>> I wouldn't be too surprised if there is already a JIRA for this. But if
>> you could come up with a patch, I'm hopeful the community would be
>> interested in it.
>>
>> Cheers
>> Ravi
>>
>> 2011/9/27 Mirko Kämpf <mirko.kaempf@googlemail.com>
>>
>>> Hi,
>>> during the Cloudera Developer Training in Berlin I came up with an idea
>>> regarding a lost name-node, as in this case all data blocks are lost.
>>> The solution could be to have a table on each datanode which relates
>>> filenames and block_ids on that node, and which can be scanned after a
>>> name-node is lost. Or every block could carry a kind of backlink to the
>>> filename, with the total number of blocks and/or a total hashsum
>>> attached. This would make it easy to recover with minimal overhead.
>>>
>>> Now I would like to ask the developer community whether there is any
>>> good reason not to do this, before I start to figure out where to begin
>>> implementing such a feature.
>>>
>>> Thanks,
>>> Mirko
>>>
>>>
>>
>
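
Mirko's proposal (a per-block backlink to the filename plus the total number of blocks) could be scanned after a name-node loss roughly as follows. This is only a sketch under stated assumptions: the record layout `(filename, block_index, total_blocks, block_id)` and the function name `rebuild_file_index` are hypothetical and are not part of HDFS. As Steve points out, this recovers filenames and block order only, not a meaningful directory tree.

```python
from collections import defaultdict


def rebuild_file_index(datanode_reports):
    """Merge per-datanode backlink records into a filename -> block list map.

    datanode_reports: iterable of lists, one list per datanode, holding
    hypothetical records (filename, block_index, total_blocks, block_id).
    Returns (complete, incomplete):
      complete   maps filename -> block ids in file order
      incomplete maps filename -> sorted list of missing block indexes
    """
    blocks_by_file = defaultdict(dict)
    totals = {}
    for report in datanode_reports:
        for filename, block_index, total_blocks, block_id in report:
            # Replicas of the same block on several datanodes collapse
            # into one entry, since they share the same index.
            blocks_by_file[filename][block_index] = block_id
            totals[filename] = total_blocks

    complete, incomplete = {}, {}
    for filename, blocks in blocks_by_file.items():
        total = totals[filename]
        missing = [i for i in range(total) if i not in blocks]
        if missing:
            incomplete[filename] = missing
        else:
            complete[filename] = [blocks[i] for i in range(total)]
    return complete, incomplete
```

Files whose blocks were all found on surviving datanodes end up in `complete`; anything in `incomplete` would still need manual attention.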

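Steve's concern about atomic updates that do not lose the previous version on a power failure is commonly handled with a write-to-temp-file, fsync, then rename pattern. The sketch below illustrates that pattern for a plain text file of (filename, block) mappings. The function name `atomic_write_mapping`, the tab-separated layout, and the single process-wide lock are assumptions made for illustration, not anything the datanode actually does. It corresponds to the "save-immediately" choice; a lazy-write variant would batch updates before calling it.

```python
import os
import tempfile
import threading

# One lock per process keeps concurrent writers from interleaving (assumption:
# a single writer process owns the mapping file).
_write_lock = threading.Lock()


def atomic_write_mapping(path, mapping):
    """Replace `path` with the given (filename, block_id) pairs.

    A crash at any point leaves either the complete old file or the
    complete new file on disk, never a half-written one.
    Simplification: filenames must not contain tabs or newlines.
    """
    directory = os.path.dirname(os.path.abspath(path))
    with _write_lock:
        # The temp file lives in the same directory so the rename stays
        # on one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".blockmap-")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                for filename, block_id in sorted(mapping):
                    tmp.write(f"{block_id}\t{filename}\n")
                tmp.flush()
                os.fsync(tmp.fileno())  # force data to disk before renaming
            os.replace(tmp_path, path)  # swap the new version into place
        except BaseException:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise
        if os.name == "posix":
            # Persist the rename itself by syncing the directory entry.
            dir_fd = os.open(directory, os.O_RDONLY)
            try:
                os.fsync(dir_fd)
            finally:
                os.close(dir_fd)
```

The cost is one fsync per update, which is exactly the overhead Steve warns about if every block creation on a DN triggers a write.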
