
List:       hadoop-user
Subject:    Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?
From:       James Cipar <jcipar () andrew ! cmu ! edu>
Date:       2009-08-28 19:32:16
Message-ID: D7AE9147-B145-4870-B26A-B9CEC0FECFCB () andrew ! cmu ! edu

Sorry about that last one; I replied to the wrong message.



On Aug 28, 2009, at 3:04 PM, Steve Gao wrote:

> Thanks, Brian. Could you tell me which file that code snippet is
> from?
>
> --- On Fri, 8/28/09, Brian Bockelman <bbockelm@cse.unl.edu> wrote:
>
> From: Brian Bockelman <bbockelm@cse.unl.edu>
> Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?
> To: common-user@hadoop.apache.org
> Date: Friday, August 28, 2009, 2:37 PM
>
> Actually, poking through the code, it seems that the streaming
> package does set this value:
>
>     String tmp = jobConf_.get("stream.tmpdir"); //, "/tmp/${user.name}/"
>
> Try setting stream.tmpdir to a different directory maybe?
>
> Brian
>
> On Aug 28, 2009, at 1:31 PM, Steve Gao wrote:
>
>> Thanks a lot, Brian. It seems to be a design flaw in Hadoop that it
>> cannot manage (or pass in) the temp directory used by
>> "java.util.zip". Can we create a JIRA ticket for this?
>>
>> --- On Fri, 8/28/09, Brian Bockelman <bbockelm@cse.unl.edu> wrote:
>>
>> From: Brian Bockelman <bbockelm@cse.unl.edu>
>> Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?
>> To:
>> Cc: common-user@hadoop.apache.org
>> Date: Friday, August 28, 2009, 2:27 PM
>>
>> Hey Steve,
>>
>> Correct, java.util.zip.* does not necessarily respect hadoop  
>> settings.
>>
>> Try setting TMPDIR in the environment to your large local disk  
>> space.  It might respect that, if Java decides to act like a unix  
>> utility.
>>
>> http://en.wikipedia.org/wiki/TMPDIR
>>
>> Brian
>>
>> On Aug 28, 2009, at 1:19 PM, Steve Gao wrote:
>>
>>> Would someone give us a hint? Thanks.
>>> Why does "java.util.zip.ZipOutputStream" need to use /tmp?
>>>
>>> The Hadoop version is 0.18.3. Recently we hit an "out of space"
>>> error coming from "java.util.zip.ZipOutputStream".
>>> We found that /tmp was full, and cleaning /tmp solved the problem.
>>>
>>> However, why does Hadoop need to use /tmp? We had already pointed
>>> the Hadoop temp directory at a local disk in hadoop-site.xml:
>>>
>>> <property>
>>>     <name>hadoop.tmp.dir</name>
>>>     <value> ... some large local disk ... </value>
>>> </property>
>>>
>>>
>>> Could it be that java.util.zip.ZipOutputStream uses /tmp even
>>> though we configured hadoop.tmp.dir to a large local disk?
>>>
>>> The error log is here FYI:
>>>
>>> java.io.IOException: No space left on device
>>>     at java.io.FileOutputStream.write(Native Method)
>>>     at java.util.zip.ZipOutputStream.writeInt(ZipOutputStream.java:445)
>>>     at java.util.zip.ZipOutputStream.writeEXT(ZipOutputStream.java:362)
>>>     at java.util.zip.ZipOutputStream.closeEntry(ZipOutputStream.java:220)
>>>     at java.util.zip.ZipOutputStream.finish(ZipOutputStream.java:301)
>>>     at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:146)
>>>     at java.util.zip.ZipOutputStream.close(ZipOutputStream.java:321)
>>>     at org.apache.hadoop.streaming.JarBuilder.merge(JarBuilder.java:79)
>>>     at org.apache.hadoop.streaming.StreamJob.packageJobJar(StreamJob.java:628)
>>>     at org.apache.hadoop.streaming.StreamJob.setJobConf(StreamJob.java:843)
>>>     at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:110)
>>>     at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
>>>     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
>>> Executing Hadoop job failure
>
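
For anyone reading this thread later, here is a minimal Java sketch of the behavior being discussed. It is not the Hadoop streaming code; the class name TmpDirSketch, the helper writeTempZip, and the "streamjob" file prefix are made up for illustration. The point it tries to show is that ZipOutputStream has no temp directory of its own: it writes to whatever stream it wraps. In this sketch the file location comes from File.createTempFile, which falls back to the java.io.tmpdir system property when the directory argument is null. On Linux JVMs that property typically defaults to /tmp, and as far as I know it is not derived from the TMPDIR environment variable.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of Hadoop.
class TmpDirSketch {

    /**
     * Writes a one-entry zip into a freshly created temp file.
     * A null dir tells File.createTempFile to use java.io.tmpdir.
     */
    static File writeTempZip(File dir) throws IOException {
        File jar = File.createTempFile("streamjob", ".jar", dir);
        // ZipOutputStream only wraps the stream; the disk that fills up
        // is whichever one holds the file opened here.
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new ZipEntry("hello.txt"));
            out.write("hello".getBytes("UTF-8"));
            out.closeEntry();
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
        System.out.println("TMPDIR env     = " + System.getenv("TMPDIR"));

        // No directory given: the file lands under java.io.tmpdir.
        File inDefault = writeTempZip(null);
        System.out.println("default dir  -> " + inDefault.getParent());

        // Explicit directory (first argument, or the current directory).
        File explicitDir = new File(args.length > 0 ? args[0] : ".");
        File inExplicit = writeTempZip(explicitDir);
        System.out.println("explicit dir -> " + inExplicit.getParent());

        inDefault.delete();
        inExplicit.delete();
    }
}
```

Running it as "java -Djava.io.tmpdir=/path/to/large/disk TmpDirSketch" should move the default location. Assuming bin/hadoop passes HADOOP_OPTS through to the client JVM, the same -D flag could be tried there, alongside the stream.tmpdir setting mentioned above.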
