'Re: Map Reduce support'

List:       cassandra-user
Subject:    Re: Map Reduce support
From:       Drew Dahlke <drew.dahlke () bronto ! com>
Date:       2010-06-28 13:32:40
Message-ID: AANLkTikeuUD4zNjs5ZqXgDLvWmNPCQeAwp4N_qRXXaH4 () mail ! gmail ! com

I'm afraid I didn't hold on to it. Sorry, folks.

On Mon, Jun 28, 2010 at 8:58 AM, Carlos Sanchez
<carlos.sanchez@riskmetrics.com> wrote:
> Drew,
> 
> I was wondering if you would care to share your map-reduce code.
> 
> Thanks
> 
> Carlos
> ________________________________________
> From: Drew Dahlke [drew.dahlke@bronto.com]
> Sent: Monday, June 28, 2010 7:17 AM
> To: user@cassandra.apache.org
> Subject: Re: Map Reduce support
> 
> The difference is noticeable but small. I did a test just reading data
> in from Cassandra on our cluster & dumping it to a csv file. Pure map
> reduce was going at ~17k records/sec versus ~15k from Pig. There is
> overhead to using Pig, but it'll reduce your development time & make
> for more readable code if it suits your needs.
> 
> On Sun, Jun 27, 2010 at 9:53 AM, Atul Gosain <atul.gosain@gmail.com> wrote:
> > Thanks for the information Drew and Jonathan.
> > Is there any difference in performance when using Pig compared to MapReduce
> > directly on the data store?
> > I will do the experiments with both of them though in some time.
> > 
> > On Fri, Jun 25, 2010 at 5:46 PM, Drew Dahlke <drew.dahlke@bronto.com> wrote:
> > > 
> > > The Cassandra column family input format will go over an entire
> > > column family, sending a slice of a row into a mapper at a time. From
> > > there, there's a lot you can do. As far as how you aggregate data
> > > together, I'd suggest experimenting with the latest version of Pig,
> > > which thankfully supports the new input format. It gives you a
> > > SQL-esque syntax for manipulating the data and is probably the easiest
> > > way to experiment.
> > > 
> > > On Thu, Jun 24, 2010 at 11:01 AM, Atul Gosain <atul.gosain@gmail.com>
> > > wrote:
> > > > Hi
> > > > What kind of Map Reduce support is provided for Cassandra?
> > > > Can I get some columns from different rows and then aggregate them
> > > > together? It's basically aggregation of statistics for various devices
> > > > connected to a network manager. Is this the right kind of use case to be
> > > > supported by MR?
> > > > Thanks
> > > > Atul
> > 
> > 
> 
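
The flow described in the thread (an input format walks a column family and hands one row slice at a time to a mapper, after which the per-device values are aggregated) can be sketched roughly as below. This is a pure-Python simulation and not Cassandra, Hadoop, or Pig code: the in-memory dictionary, the row keys, the column names, and every function name are hypothetical and exist only to illustrate the shape of the job.

```python
from collections import defaultdict

# Hypothetical stand-in for a column family: row key -> {column name: value}.
# Row keys and column names are invented for illustration only.
DEVICE_STATS = {
    "router-1:2010-06-24": {"device": "router-1", "bytes_in": 120, "bytes_out": 80},
    "router-1:2010-06-25": {"device": "router-1", "bytes_in": 200, "bytes_out": 50},
    "switch-7:2010-06-24": {"device": "switch-7", "bytes_in": 30, "bytes_out": 10},
}


def read_slices(column_family, wanted_columns):
    """Yield (row_key, slice) for every row, the way an input format feeds mappers."""
    for row_key, columns in column_family.items():
        yield row_key, {name: columns[name] for name in wanted_columns if name in columns}


def map_row(row_key, columns):
    """Map step: emit (device, bytes_in) for a single row slice."""
    if "device" in columns and "bytes_in" in columns:
        yield columns["device"], columns["bytes_in"]


def reduce_values(device, values):
    """Reduce step: aggregate every value emitted for one device."""
    return device, sum(values)


def run_job(column_family, wanted_columns):
    grouped = defaultdict(list)  # shuffle phase: group mapper output by key
    for row_key, columns in read_slices(column_family, wanted_columns):
        for key, value in map_row(row_key, columns):
            grouped[key].append(value)
    return dict(reduce_values(key, values) for key, values in grouped.items())


totals = run_job(DEVICE_STATS, ["device", "bytes_in"])
print(totals)
```

In a real job the slice would come from the cluster rather than a dictionary, and the grouping step would be handled by the framework; only the map and reduce functions would be user code.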

