List:       hadoop-dev
Subject:    [jira] [Resolved] (HADOOP-11281) Add flag to fs.shell to skip _COPYING_ file
From:       "Chris Nauroth (JIRA)" <jira () apache ! org>
Date:       2015-01-30 19:06:37
Message-ID: JIRA.12753788.1415389733000.217271.1422644797705 () Atlassian ! JIRA


     [ https://issues.apache.org/jira/browse/HADOOP-11281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth resolved HADOOP-11281.
------------------------------------
    Resolution: Duplicate

> Add flag to fs.shell to skip _COPYING_ file
> -------------------------------------------
> 
> Key: HADOOP-11281
> URL: https://issues.apache.org/jira/browse/HADOOP-11281
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs, fs/s3
> Environment: Hadoop 2.2 (but the behavior is present in all versions),
> AWS EMR 3.0.4
> Reporter: Corby Wilson
> Priority: Critical
> 
> Amazon S3 does not have a rename feature.
> When you use the hadoop shell or distcp, Hadoop first uploads the file with the
> ._COPYING_ suffix, then renames it to the final output name. The code is in
> org/apache/hadoop/fs/shell/CommandWithDestination.java:
> PathData tempTarget = target.suffix("._COPYING_");
> targetFs.setWriteChecksum(writeChecksum);
> targetFs.writeStreamToFile(in, tempTarget, lazyPersist);
> targetFs.rename(tempTarget, target);
> The problem is that on rename, we actually have to download the file again (through
> an InputStream), then upload it again. For very large files (>= 5GB) we have to use
> multipart upload. So if we are processing several TB of multi-GB files, we are
> actually writing each file to S3 twice and reading it from S3 once. It would be nice
> to have a flag or core-site.xml setting that tells Hadoop to skip the temporary
> copy and write the file once.
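
The request above can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop API: ObjectStore, copyToTarget, and the directWrite flag are hypothetical names invented for illustration, and the store emulates rename as read plus write plus delete, which is the behavior the report describes rather than a confirmed account of any particular S3 connector.

```java
import java.util.HashMap;
import java.util.Map;

public class DirectWriteSketch {
    /** Hypothetical in-memory stand-in for an object store without native rename. */
    static class ObjectStore {
        final Map<String, byte[]> objects = new HashMap<>();
        long bytesWritten = 0;
        long bytesRead = 0;

        void put(String key, byte[] data) {
            objects.put(key, data.clone());
            bytesWritten += data.length;
        }

        // Rename emulated as read + write + delete (assumption taken from the report).
        void rename(String src, String dst) {
            byte[] data = objects.get(src);
            if (data == null) {
                throw new IllegalStateException("no such object: " + src);
            }
            bytesRead += data.length;
            put(dst, data);
            objects.remove(src);
        }
    }

    /**
     * Copies data to target. With directWrite (a hypothetical flag) the data is
     * written once to the final name; otherwise it goes through the
     * "._COPYING_" temporary name and is renamed afterwards.
     */
    static void copyToTarget(ObjectStore store, byte[] data, String target,
                             boolean directWrite) {
        if (directWrite) {
            store.put(target, data);
            return;
        }
        String temp = target + "._COPYING_";
        store.put(temp, data);
        store.rename(temp, target);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1024];

        ObjectStore viaTemp = new ObjectStore();
        copyToTarget(viaTemp, payload, "out/part-0", false);

        ObjectStore direct = new ObjectStore();
        copyToTarget(direct, payload, "out/part-0", true);

        System.out.println("temp+rename: written=" + viaTemp.bytesWritten
                + " read=" + viaTemp.bytesRead);
        System.out.println("direct:      written=" + direct.bytesWritten
                + " read=" + direct.bytesRead);
    }
}
```

One trade-off to keep in mind: writing through a temporary name keeps a partially written file from appearing under the final name, so a direct-write option would give up that property in exchange for the saved transfer.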



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

