List: hadoop-user
Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?
From: James Cipar <jcipar () andrew ! cmu ! edu>
Date: 2009-08-28 19:32:16
Message-ID: D7AE9147-B145-4870-B26A-B9CEC0FECFCB () andrew ! cmu ! edu
Sorry about that last one; I replied to the wrong message.
On Aug 28, 2009, at 3:04 PM, Steve Gao wrote:
> Thanks, Brian. Could you tell me which file that code snippet
> is in?
>
> --- On Fri, 8/28/09, Brian Bockelman <bbockelm@cse.unl.edu> wrote:
>
> From: Brian Bockelman <bbockelm@cse.unl.edu>
> Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?
> To: common-user@hadoop.apache.org
> Date: Friday, August 28, 2009, 2:37 PM
>
> Actually, poking the code, it seems that the streaming package does
> set this value:
>
> String tmp = jobConf_.get("stream.tmpdir"); //, "/tmp/${user.name}/"
>
> Try setting stream.tmpdir to a different directory maybe?
>
> Brian
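
A note on why an unset stream.tmpdir would end up in /tmp. The trace suggests the zip stream is simply writing through a FileOutputStream to a file on the full device, so the question is where that file gets created. If the streaming code hands the configured value to java.io.File.createTempFile (an assumption here, not checked against the 0.18.3 source), a null directory makes Java fall back to the java.io.tmpdir system property. The sketch below is not the Hadoop code: StreamTmpDirSketch, createScratchJar, and the "streamjob" prefix are made-up names, and configuredTmp stands in for the value of stream.tmpdir.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

public class StreamTmpDirSketch {

    /**
     * Create an empty scratch file the way a job-jar builder might.
     * configuredTmp is a hypothetical stand-in for stream.tmpdir;
     * null means the property was never set.
     */
    public static File createScratchJar(String configuredTmp) {
        File dir = (configuredTmp == null) ? null : new File(configuredTmp);
        try {
            // When dir is null, createTempFile uses the directory named by
            // the java.io.tmpdir system property.
            File jar = File.createTempFile("streamjob", ".jar", dir);
            jar.deleteOnExit();
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Optional first argument: the directory to use instead of the default.
        String configured = (args.length > 0) ? args[0] : null;
        File jar = createScratchJar(configured);
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
        System.out.println("scratch jar    = " + jar.getAbsolutePath());
    }
}
```

Running it with no argument and then with a directory on the large disk shows where the scratch file lands in each case.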
>
> On Aug 28, 2009, at 1:31 PM, Steve Gao wrote:
>
>> Thanks a lot, Brian. It seems to be a design flaw in Hadoop that it
>> cannot manage (or pass in) the temp directory used by "java.util.zip".
>> Can we create a JIRA ticket for this?
>>
>> --- On Fri, 8/28/09, Brian Bockelman <bbockelm@cse.unl.edu> wrote:
>>
>> From: Brian Bockelman <bbockelm@cse.unl.edu>
>> Subject: Re: [Help] Why "java.util.zip.ZipOutputStream" need to use /tmp?
>> To:
>> Cc: common-user@hadoop.apache.org
>> Date: Friday, August 28, 2009, 2:27 PM
>>
>> Hey Steve,
>>
>> Correct, java.util.zip.* does not necessarily respect hadoop
>> settings.
>>
>> Try setting TMPDIR in the environment to your large local disk
>> space. It might respect that, if Java decides to act like a unix
>> utility.
>>
>> http://en.wikipedia.org/wiki/TMPDIR
>>
>> Brian
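
On the TMPDIR suggestion: as far as I know, the JVM on Linux does not consult the TMPDIR environment variable. The java.io.tmpdir system property defaults to /tmp there and is normally changed by passing -Djava.io.tmpdir=<dir> on the java command line. A quick way to see what a given JVM will actually use, and how much room is left there, is a small check like the one below. TmpDirCheck and describe are names invented for this sketch.

```java
import java.io.File;

public class TmpDirCheck {

    /** Format a label, the directory it points at, and the usable space there. */
    public static String describe(String label, String path) {
        if (path == null) {
            return label + ": (not set)";
        }
        File dir = new File(path);
        // getUsableSpace returns 0 if the path does not name a partition.
        long usableMb = dir.getUsableSpace() / (1024L * 1024L);
        return label + ": " + path + " (" + usableMb + " MB usable)";
    }

    public static void main(String[] args) {
        // What the JVM will use for File.createTempFile and friends.
        System.out.println(describe("java.io.tmpdir", System.getProperty("java.io.tmpdir")));
        // What the shell environment says; shown only for comparison.
        System.out.println(describe("TMPDIR", System.getenv("TMPDIR")));
    }
}
```

If the two lines disagree, the first one is the directory that temp files created through the standard Java API will go to.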
>>
>> On Aug 28, 2009, at 1:19 PM, Steve Gao wrote:
>>
>>> Would someone give us a hint? Thanks.
>>> Why does "java.util.zip.ZipOutputStream" need to use /tmp?
>>>
>>> The Hadoop version is 0.18.3. Recently we got an "out of space"
>>> error. It comes from "java.util.zip.ZipOutputStream".
>>> We found that /tmp was full, and after cleaning /tmp the problem was
>>> solved.
>>>
>>> However, why does Hadoop need to use /tmp? We had already configured
>>> the Hadoop tmp directory to a local disk in hadoop-site.xml:
>>>
>>> <property>
>>> <name>hadoop.tmp.dir</name>
>>> <value> ... some large local disk ... </value>
>>> </property>
>>>
>>>
>>> Could it be that java.util.zip.ZipOutputStream uses /tmp even though
>>> we configured hadoop.tmp.dir to point to a large local disk?
>>>
>>> The error log is here FYI:
>>>
>>> java.io.IOException: No space left on device
>>>     at java.io.FileOutputStream.write(Native Method)
>>>     at java.util.zip.ZipOutputStream.writeInt(ZipOutputStream.java:445)
>>>     at java.util.zip.ZipOutputStream.writeEXT(ZipOutputStream.java:362)
>>>     at java.util.zip.ZipOutputStream.closeEntry(ZipOutputStream.java:220)
>>>     at java.util.zip.ZipOutputStream.finish(ZipOutputStream.java:301)
>>>     at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:146)
>>>     at java.util.zip.ZipOutputStream.close(ZipOutputStream.java:321)
>>>     at org.apache.hadoop.streaming.JarBuilder.merge(JarBuilder.java:79)
>>>     at org.apache.hadoop.streaming.StreamJob.packageJobJar(StreamJob.java:628)
>>>     at org.apache.hadoop.streaming.StreamJob.setJobConf(StreamJob.java:843)
>>>     at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:110)
>>>     at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
>>>     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
>>> Executing Hadoop job failure