
List:       hadoop-dev
Subject:    [jira] [Reopened] (HADOOP-18706) Improve S3ABlockOutputStream recovery
From:       "Steve Loughran (Jira)" <jira () apache ! org>
Date:       2023-05-24 18:25:00
Message-ID: JIRA.13532928.1681753552000.22903.1684952700045 () Atlassian ! JIRA


     [ https://issues.apache.org/jira/browse/HADOOP-18706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran reopened HADOOP-18706:
-------------------------------------

Bad news Chris, I had to revert this.
Can you do a new PR with very short filenames (ideally the span id and some minimal
info for users), kept short enough that we never run out of filename length? Thanks.
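As an illustration of the kind of compact naming being asked for here, a minimal sketch of a buffer-file name scheme built from a short span id plus the block index. All class, method, and pattern names below are hypothetical, not the actual patch:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical compact buffer-file naming: a short span id plus a
// zero-padded block index, so the name length is bounded no matter
// how long the destination S3 key is.
final class BufferFileNames {

    private static final Pattern NAME =
        Pattern.compile("s3ablock-([0-9a-f]+)-(\\d{4})\\.tmp");

    // e.g. name("a1b2c3d4", 7) -> "s3ablock-a1b2c3d4-0007.tmp"
    static String name(String spanId, int blockIndex) {
        return String.format("s3ablock-%s-%04d.tmp", spanId, blockIndex);
    }

    // Recover {spanId, blockIndex} from a buffer file name, or null
    // if the name does not follow the scheme.
    static String[] parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        return m.matches() ? new String[] { m.group(1), m.group(2) } : null;
    }
}
```

A mapping from span id back to the full destination key would still have to live somewhere else (e.g. an index file in the buffer directory), since the name itself no longer carries the key.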

> Improve S3ABlockOutputStream recovery
> -------------------------------------
> 
> Key: HADOOP-18706
> URL: https://issues.apache.org/jira/browse/HADOOP-18706
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Reporter: Chris Bevard
> Assignee: Chris Bevard
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0
> 
> 
> If an application crashes during an S3ABlockOutputStream upload, it's possible to
> complete the upload if fast.upload.buffer is set to disk, by uploading the s3ablock
> file with putObject as the final part of the multipart upload. If the application
> has multiple uploads running in parallel, though, and they're on the same part
> number when the application fails, then there is no way to determine which file
> belongs to which object, and recovery of either upload is impossible. If the
> temporary file name for disk buffering included the s3 key, then every partial
> upload would be recoverable.
>
> h3. Important disclaimer
> This change does not directly add Syncable semantics, where applications that
> require {{Syncable.hsync()}} expect it to return only after all pending data has
> been durably written to the destination path. S3 is not a filesystem and this
> change does not make it so. What it does do is assist anyone trying to implement
> a post-crash recovery process which
> # interrogates s3 to identify pending uploads to a specific path and gets a list
> of uploaded blocks yet to be committed
> # scans the local fs.s3a.buffer dir directories to identify in-progress-write
> blocks for the same target destination: those which were being uploaded, those
> queued for upload, and the single "new data being written to" block for an
> output stream
> # uploads all those pending blocks
> # generates a new POST to complete a multipart upload with all the blocks in the
> correct order
> All this patch does is ensure the buffered block filenames include the final
> path and block ID, to aid in identifying which blocks need to be uploaded and
> in what order.
>
> h2. warning
> causes HADOOP-18744 - always include the relevant fix when backporting
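The local-scan and ordering steps of the recovery process quoted above could be sketched as follows. This is an illustrative outline only, not the actual tooling: the file-name pattern and all class and method names are assumptions, and the real steps would also involve S3 ListMultipartUploads/CompleteMultipartUpload calls that are omitted here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of the "scan the buffer dir" and "complete in the
// correct order" steps: filter discovered file names for one destination
// down to buffered blocks, then sort them by their embedded block number,
// giving the part order a CompleteMultipartUpload POST would need.
final class BlockRecovery {

    // Hypothetical buffered-block name shape: "<anything>-block-<n>.tmp"
    private static final Pattern BLOCK =
        Pattern.compile(".*-block-(\\d+)\\.tmp");

    // Returns the buffered-block file names sorted by block number;
    // non-block files (logs, index files, etc.) are ignored.
    static List<String> orderBlocks(Collection<String> bufferFiles) {
        List<String> blocks = new ArrayList<>();
        for (String f : bufferFiles) {
            if (BLOCK.matcher(f).matches()) {
                blocks.add(f);
            }
        }
        blocks.sort(Comparator.comparingInt(BlockRecovery::blockNumber));
        return blocks;
    }

    static int blockNumber(String fileName) {
        Matcher m = BLOCK.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a block file: " + fileName);
        }
        return Integer.parseInt(m.group(1));
    }
}
```

The point of the patch is precisely that this ordering is only possible when the block number (and destination) are recoverable from the file name; without them, parallel uploads on the same part number are indistinguishable.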



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-dev-help@hadoop.apache.org


