List: avro-user
Subject: Re: AvroKeyValueInputFormat/AvroKeyValueOutputFormat vs AvroSequenceFileInputFormat/AvroSequenceFile
From: Martin Kleppmann <mkleppmann () linkedin ! com>
Date: 2014-05-23 8:24:07
Message-ID: 206685CE-B5E8-4D46-A9DE-6833028E49BE () linkedin ! com
In general, you're probably better off with
AvroKeyValueInputFormat/AvroKeyValueOutputFormat, since that generates Avro data
files which you can read from other applications and other languages. Hadoop sequence
files aren't really supported by anything other than Hadoop.
If your data remains entirely within Hadoop, there are cases where you might want to
use sequence files. For example, they might be used for the transient files generated
during the shuffle (the output of mappers being fed into reducers).
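To make the choice concrete, here is a minimal sketch (not part of the original
message) of how a job's output format might be selected with the
org.apache.avro.mapreduce classes. The class name OutputFormatChoice, the
`portable` flag, the job name, and the string/long schemas are hypothetical and
chosen only for illustration. It assumes avro-mapred and Hadoop 2.x are on the
classpath.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.avro.mapreduce.AvroKeyValueOutputFormat;
import org.apache.avro.mapreduce.AvroSequenceFileOutputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical helper: picks the job's output format based on who reads the output.
public class OutputFormatChoice {

    static Job configure(boolean portable) throws IOException {
        Job job = Job.getInstance(new Configuration(), "output-format-choice");

        // Both formats work with AvroKey/AvroValue, so the schemas are set the same way.
        // The string key and long value schemas are illustrative assumptions.
        AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING));
        AvroJob.setOutputValueSchema(job, Schema.create(Schema.Type.LONG));

        if (portable) {
            // Writes Avro data files, readable by other applications and languages.
            job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
        } else {
            // Writes Hadoop sequence files, for data that stays inside Hadoop.
            job.setOutputFormatClass(AvroSequenceFileOutputFormat.class);
        }
        return job;
    }
}
```

Only the output format class differs between the two branches; the mapper,
reducer, and schema setup would stay the same under these assumptions.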
Martin
On 20 May 2014, at 16:34, Jim Donofrio <donofrio111@gmail.com> wrote:
> What are the pros and cons of AvroKeyValueInputFormat/AvroKeyValueOutputFormat vs
> AvroSequenceFileInputFormat/AvroSequenceFileOutputFormat? Which is more commonly
> used?
> They both use AvroKey, AvroValue. The only difference seems to be that one serializes
> into Avro data files and the other into Hadoop sequence files.
> Thanks