
List:       hadoop-dev
Subject:    [jira] [Created] (HADOOP-17611) Distcp parallel file copy breaks the modification time
From:       "Adam Maroti (Jira)" <jira () apache ! org>
Date:       2021-03-29 15:12:00
Message-ID: JIRA.13368401.1617030686000.158858.1617030720140 () Atlassian ! JIRA

Adam Maroti created HADOOP-17611:
------------------------------------

             Summary: Distcp parallel file copy breaks the modification time
                 Key: HADOOP-17611
                 URL: https://issues.apache.org/jira/browse/HADOOP-17611
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Adam Maroti


The commit HADOOP-11794. Enable distcp to copy blocks in parallel.
(bf3fb585aaf2b179836e139c041fc87920a3c886) broke the modification time of
large files.

In CopyCommitter.java, concatFileChunks calls FileSystem.concat, which
changes the modification time. As a result, the modification times of files
copied by distcp do not match those of the source files. This only occurs
for files large enough that distcp splits them into chunks for copying.

Proposed fix: in concatFileChunks, read the modification time before calling
concat and apply it to the resulting concatenated file after the concat
(probably best after the rename()).
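
The idea can be illustrated with a minimal local-filesystem sketch. It uses
java.nio rather than the Hadoop API, and the class and method names
(MtimePreservingConcat, concatPreservingMtime) are hypothetical, not taken
from CopyCommitter. It assumes the first chunk already carries the desired
modification time. In Hadoop the analogous calls would presumably be
FileSystem.getFileStatus(path).getModificationTime() before the concat and
FileSystem.setTimes(path, mtime, atime) afterwards.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.nio.file.attribute.FileTime;
import java.util.List;

// Hypothetical illustration only: a local-filesystem analogue of the fix,
// not the actual CopyCommitter code.
final class MtimePreservingConcat {

    // Appends every chunk to target, deletes the chunks, then restores the
    // modification time that target had before the concatenation.
    static void concatPreservingMtime(Path target, List<Path> chunks)
            throws IOException {
        // 1. Remember the modification time before it gets clobbered.
        FileTime original = Files.getLastModifiedTime(target);

        // 2. Concatenate: appending to target updates its modification time.
        try (OutputStream out =
                 Files.newOutputStream(target, StandardOpenOption.APPEND)) {
            for (Path chunk : chunks) {
                Files.copy(chunk, out);
            }
        }
        for (Path chunk : chunks) {
            Files.delete(chunk);
        }

        // 3. Re-apply the saved modification time to the combined file.
        Files.setLastModifiedTime(target, original);
    }
}
```

If the file is renamed after the concat, the restore step would move to
after the rename, as suggested above.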



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-dev-help@hadoop.apache.org
