List: flume-dev
Subject: [jira] [Commented] (FLUME-2922) HDFSSequenceFile Should Sync Writer
From: "Kevin Conaway (JIRA)" <jira () apache ! org>
Date: 2016-06-29 21:21:10
Message-ID: JIRA.12977310.1465494638000.159.1467235270695 () Atlassian ! JIRA
[ https://issues.apache.org/jira/browse/FLUME-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15355802#comment-15355802 ]
Kevin Conaway commented on FLUME-2922:
--------------------------------------
[~hshreedharan] did you get a chance to review this one?
> HDFSSequenceFile Should Sync Writer
> -----------------------------------
>
> Key: FLUME-2922
> URL: https://issues.apache.org/jira/browse/FLUME-2922
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.6.0
> Reporter: Kevin Conaway
> Priority: Critical
> Attachments: FLUME-2922.patch
>
>
> There is a possibility of losing data with the current HDFS sequence file writer.
> Internally, the `SequenceFile.Writer` buffers data and only periodically syncs it
> to the underlying output stream. The mechanism depends on whether compression is
> in use, but in both cases the key/value pairs are appended to an internal buffer
> and flushed to disk only after the buffer reaches a certain size. Flume can
> therefore lose messages if the agent crashes, or is stopped, before the internal
> buffer is flushed to disk. The correct action is to force the writer to sync its
> internal buffers to the underlying `FSDataOutputStream` before calling
> hflush/sync. Additionally, I believe we should be calling hsync instead of
> hflush. It's my understanding that writes with hsync are more durable, which I
> believe are the semantics we want here.
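
The ordering problem described above can be illustrated with a minimal, self-contained sketch. It does not use Hadoop: `BufferingRecordWriter` and `SyncSketch` are hypothetical stand-ins invented for illustration, a `ByteArrayOutputStream` plays the role of the underlying `FSDataOutputStream`, and `stream.flush()` stands in for hflush/hsync. It is not the attached patch, only a model of why flushing the stream alone is not enough.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a writer that buffers records internally. Not a Hadoop class. */
class BufferingRecordWriter {
    private final OutputStream out; // plays the role of the underlying output stream
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private final int flushThreshold;

    BufferingRecordWriter(OutputStream out, int flushThreshold) {
        this.out = out;
        this.flushThreshold = flushThreshold;
    }

    /** Appends a record to the internal buffer; drains it only once the threshold is reached. */
    void append(String record) throws IOException {
        pending.write(record.getBytes(StandardCharsets.UTF_8));
        if (pending.size() >= flushThreshold) {
            sync();
        }
    }

    /** Drains the internal buffer into the underlying stream. */
    void sync() throws IOException {
        pending.writeTo(out);
        pending.reset();
    }
}

class SyncSketch {
    /** Returns how many bytes reached the stream when only the stream is flushed. */
    static int flushStreamOnly(String record) throws IOException {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        BufferingRecordWriter writer = new BufferingRecordWriter(stream, 1024);
        writer.append(record);
        stream.flush(); // stand-in for hflush/hsync; the record is still in the writer's buffer
        return stream.size();
    }

    /** Returns how many bytes reached the stream when the writer is synced first. */
    static int syncWriterThenFlush(String record) throws IOException {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        BufferingRecordWriter writer = new BufferingRecordWriter(stream, 1024);
        writer.append(record);
        writer.sync();  // push the writer's internal buffer down to the stream
        stream.flush(); // then flush the stream itself
        return stream.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("stream flush only: " + flushStreamOnly("event-1") + " bytes");
        System.out.println("writer sync first: " + syncWriterThenFlush("event-1") + " bytes");
    }
}
```

In this model a record below the threshold never reaches the stream unless the writer is synced first, which mirrors the window in which a crash would lose the buffered events.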
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)