
List:       flume-dev
Subject:    [jira] [Commented] (FLUME-2922) HDFSSequenceFile Should Sync Writer
From:       "Kevin Conaway (JIRA)" <jira () apache ! org>
Date:       2016-06-29 21:21:10
Message-ID: JIRA.12977310.1465494638000.159.1467235270695 () Atlassian ! JIRA


    [ https://issues.apache.org/jira/browse/FLUME-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15355802#comment-15355802 ]

Kevin Conaway commented on FLUME-2922:
--------------------------------------

[~hshreedharan] did you get a chance to review this one?

> HDFSSequenceFile Should Sync Writer
> -----------------------------------
> 
> Key: FLUME-2922
> URL: https://issues.apache.org/jira/browse/FLUME-2922
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.6.0
> Reporter: Kevin Conaway
> Priority: Critical
> Attachments: FLUME-2922.patch
> 
> 
> There is a possibility of losing data with the current HDFS sequence file writer.
> Internally, the `SequenceFile.Writer` buffers data and periodically syncs it to the
> underlying output stream. The mechanism depends on whether compression is in use,
> but in both scenarios the keys/values are appended to an internal buffer and only
> flushed to disk after the buffer reaches a certain size. Thus it is quite possible
> for Flume to lose messages if the agent crashes, or is stopped, before the internal
> buffer is flushed to disk. The correct action is to force the writer to sync its
> internal buffers to the underlying `FSDataOutputStream` before calling hflush/sync.
> Additionally, I believe we should be calling hsync instead of hflush. It's my
> understanding that writes with hsync are more durable, which I believe are the
> semantics we want here.
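
The ordering described in the issue (drain the writer's internal buffer first, then sync the underlying stream) can be illustrated with a small plain-Java analogy. This is only a sketch and does not use the Hadoop API or reproduce the attached patch: `BufferedOutputStream` stands in for the internal buffer of `SequenceFile.Writer`, `FileOutputStream` stands in for `FSDataOutputStream`, and `FileDescriptor.sync()` plays the role of hsync. The class and method names are hypothetical.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class BufferedSyncSketch {

    /**
     * Writes one record through a buffered stream and returns the file size
     * observed before and after the buffer is flushed and synced.
     * Analogy only: local file streams stand in for the HDFS classes.
     */
    static long[] writeThenSync(File file, byte[] record) throws IOException {
        try (FileOutputStream raw = new FileOutputStream(file);
             BufferedOutputStream buffered = new BufferedOutputStream(raw, 8192)) {
            // The record is smaller than the buffer, so it stays in memory.
            buffered.write(record);
            long before = file.length();

            // Step 1: push the buffered bytes down to the underlying stream.
            buffered.flush();
            // Step 2: ask the OS to persist them to the storage device.
            raw.getFD().sync();
            long after = file.length();

            return new long[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("flume-sketch", ".bin");
        file.deleteOnExit();
        byte[] record = "event-1".getBytes(StandardCharsets.UTF_8);
        long[] sizes = writeThenSync(file, record);
        System.out.println("before flush: " + sizes[0] + " bytes, after: " + sizes[1] + " bytes");
    }
}
```

If the process died right after `buffered.write(record)`, the file would still be empty, which is the data-loss window the issue describes. Syncing the underlying stream without first draining the buffer would not close that window.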



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

