
List:       hadoop-dev
Subject:    [jira] [Created] (HADOOP-17434) Improve S3A upload statistics collection from ProgressEvent callback
From:       "Steve Loughran (Jira)" <jira () apache ! org>
Date:       2020-12-15 17:20:00
Message-ID: JIRA.13346160.1608052751000.329448.1608052800492 () Atlassian ! JIRA

Steve Loughran created HADOOP-17434:
---------------------------------------

             Summary: Improve S3A upload statistics collection from ProgressEvent callbacks
                 Key: HADOOP-17434
                 URL: https://issues.apache.org/jira/browse/HADOOP-17434
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.4.0
            Reporter: Steve Loughran


Collection of S3A upload stats from ProgressEvent callbacks can be improved.

There are two similar but different listener implementations:
* org.apache.hadoop.fs.s3a.S3ABlockOutputStream.BlockUploadProgress
* org.apache.hadoop.fs.s3a.ProgressableProgressListener, used on simple PUT calls.

Both call back into S3A FS to incrementWriteOperations; BlockUploadProgress also
updates S3AInstrumentation/IOStatistics.

* I'm not 100% confident that BlockUploadProgress is updating things (especially gauges of pending bytes) at the right time,
* or that completion is being handled.
* The other listener doesn't update S3AInstrumentation; numbers are lost.
* There's no incremental updating during {{CommitOperations.uploadFileToPendingCommit()}}, which calls Progressable.progress() only once per block,
* or in MultipartUploader.

Proposed: 
* A single progress listener which updates BlockOutputStreamStatistics, used by all interfaces.
* WriteOperations to help set this up for callers,
* and its uploadPart API to take a Progressable (or the progress listener to use for uploading that part).
* Multipart upload API to also take a Progressable; this would help distcp-like applications.
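
A rough sketch of what such a single listener could look like follows. It is not the S3A code: UploadEventType, UploadStats and UnifiedUploadListener are hypothetical names, the enum only stands in for the SDK's progress event types, and a Runnable stands in for Progressable.progress(). The handling of failed parts (rolling back the part's counts so a retry starts clean) is an assumption, not something the issue specifies.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the SDK's progress event types (not the real AWS enum). */
enum UploadEventType {
  PART_STARTED, BYTES_TRANSFERRED, PART_COMPLETED, PART_FAILED
}

/** Hypothetical statistics sink; the real target would be BlockOutputStreamStatistics. */
final class UploadStats {
  final AtomicLong bytesPending = new AtomicLong();
  final AtomicLong bytesUploaded = new AtomicLong();
}

/** One listener per part (or per simple PUT), shared by every upload path. */
final class UnifiedUploadListener {
  private final UploadStats stats;
  private final Runnable progress;   // stands in for Progressable.progress()
  private final long partSize;
  private long transferred;          // bytes reported so far for this part

  UnifiedUploadListener(UploadStats stats, Runnable progress, long partSize) {
    this.stats = stats;
    this.progress = progress;
    this.partSize = partSize;
  }

  synchronized void onEvent(UploadEventType type, long bytes) {
    switch (type) {
      case PART_STARTED:
        stats.bytesPending.addAndGet(partSize);
        break;
      case BYTES_TRANSFERRED:
        move(bytes);
        break;
      case PART_COMPLETED:
        // Settle any bytes that were never reported incrementally.
        move(partSize - transferred);
        break;
      case PART_FAILED:
        // Assumption: undo this part's contribution so that a retry starts clean.
        stats.bytesUploaded.addAndGet(-transferred);
        stats.bytesPending.addAndGet(transferred - partSize);
        transferred = 0;
        break;
    }
    // Every event is forwarded as a progress tick, not just block completion.
    progress.run();
  }

  /** Moves bytes from the pending gauge to the uploaded counter. */
  private void move(long bytes) {
    transferred += bytes;
    stats.bytesPending.addAndGet(-bytes);
    stats.bytesUploaded.addAndGet(bytes);
  }
}
```

Because the pending gauge is raised once at part start and drained as bytes are reported, it should return to zero on both the completion and the failure path in this sketch.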

Plus ITests to verify that the gauges come out right: at the end of each operation, the
number of bytes pending upload == 0 and the number of bytes uploaded == the original size.
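
The end-of-operation invariant could be expressed as a small check like the one below. This is only an illustration: UploadGaugeChecks and assertUploadSettled are hypothetical names, and a real ITest would read the two values from the stream's statistics rather than take them as arguments.

```java
/** Hypothetical helper for an integration test; the names are illustrative only. */
final class UploadGaugeChecks {
  private UploadGaugeChecks() {
  }

  /** Fails unless nothing is pending and everything written was counted as uploaded. */
  static void assertUploadSettled(long bytesPending, long bytesUploaded, long originalSize) {
    if (bytesPending != 0) {
      throw new AssertionError("bytes pending upload: expected 0, got " + bytesPending);
    }
    if (bytesUploaded != originalSize) {
      throw new AssertionError(
          "bytes uploaded: expected " + originalSize + ", got " + bytesUploaded);
    }
  }
}
```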





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-dev-help@hadoop.apache.org

