List:       cassandra-user
Subject:    Issues with MapReduce Job (using Brisk)
From:       Silvère Lestang <silvere.lestang () gmail ! com>
Date:       2011-05-31 11:00:47
Message-ID: BANLkTimPLpvd+O8-RsPBmr7x_kqDePBS_Q () mail ! gmail ! com

Hi,
I am trying to create a MapReduce job that calculates the average of values
stored in Cassandra and writes the result back to Cassandra (using
ColumnFamilyOutputFormat and ColumnFamilyInputFormat). I use the Brisk
distribution of Hadoop, but I don't know whether that is related.
My code is here: http://pastebin.com/8gd21VuP
As I understand from the WordCount example, the first parameter of the map
method is the row key and the second one is a map of <column.name, column>.
I found confirmation of this in the class ColumnFamilyRecordReader.
But my code doesn't work. I dumped the row key and the column name using a
logger and saw that both seem to be get_range_slices objects, which is
very unexpected.
Here is an example log (the CF contains 5 rows named row[0-4], each with 10
columns named columnKey[0-9] and holding the value 42, which is '*' in ASCII):

*MAP rowkey*:
get_range_slicesrow1columnKey0*`columnKey1*acolumnKey2*bcolumnKey3*ccolumnKey4*dcolumnKey5*ecolumnKey6*fcolumnKey7*gcolumnKey8*hcolumnKey9*Hrow2columnKey0*columnKey1*columnKey2*columnKey3*columnKey4*columnKey5*columnKey6*columnKey7*columnKey8*columnKey9*row4columnKey0*pcolumnKey1*qcolumnKey2*rcolumnKey3*XcolumnKey4*YcolumnKey5*ZcolumnKey6*[columnKey7*@columnKey8*AcolumnKey9*Brow3columnKey0*columnKey1*columnKey2*columnKey3*columnKey4*columnKey5*columnKey6*columnKey7*columnKey8*columnKey9*row0columnKey0*columnKey1*:columnKey2*:!columnKey3*:"columnKey4*:#columnKey5*>columnKey6*>columnKey7*>columnKey8*>columnKey9*>

*MAP columnKey*:
get_range_slicesrow1columnKey0*`columnKey1*acolumnKey2*bcolumnKey3*ccolumnKey4*dcolumnKey5*ecolumnKey6*fcolumnKey7*gcolumnKey8*hcolumnKey9*Hrow2columnKey0*columnKey1*columnKey2*columnKey3*columnKey4*columnKey5*columnKey6*columnKey7*columnKey8*columnKey9*row4columnKey0*pcolumnKey1*qcolumnKey2*rcolumnKey3*XcolumnKey4*YcolumnKey5*ZcolumnKey6*[columnKey7*@columnKey8*AcolumnKey9*Brow3columnKey0*columnKey1*columnKey2*columnKey3*columnKey4*columnKey5*columnKey6*columnKey7*columnKey8*columnKey9*row0columnKey0*columnKey1*:columnKey2*:!columnKey3*:"columnKey4*:#columnKey5*>columnKey6*>columnKey7*>columnKey8*>columnKey9*>


I can't understand why I get this instead of the row key and the column key.
Does anybody have an idea?
Silvère
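
A note for readers who hit similar output: one possible cause (an assumption,
since the pastebin code is not reproduced in this message) is that the key and
column names arrive as java.nio.ByteBuffer views into a larger response buffer,
and that the logging code decodes the whole backing array with array() instead
of only the bytes between position and limit. The sketch below uses plain
java.nio only. The class name, the helper names and the sample bytes are
hypothetical and merely imitate the shape of the log above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferViewDemo {

    // Decodes the entire backing array, ignoring position and limit.
    // For a view into a bigger buffer this prints everything in that buffer.
    static String decodeWholeArray(ByteBuffer buf) {
        return new String(buf.array(), StandardCharsets.UTF_8);
    }

    // Decodes only the bytes between position and limit.
    // duplicate() keeps the caller's position untouched.
    static String decodeView(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    // Hypothetical stand-in for a response buffer: a method name followed by
    // a row key and a column. The returned buffer is a view over "row1" only
    // (offset 16, length 4), but it still shares the full backing array.
    static ByteBuffer sampleKey() {
        byte[] response =
            "get_range_slicesrow1columnKey0*".getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.wrap(response, 16, 4);
    }

    public static void main(String[] args) {
        ByteBuffer key = sampleKey();
        System.out.println(decodeWholeArray(key)); // prints "get_range_slicesrow1columnKey0*"
        System.out.println(decodeView(key));       // prints "row1"
    }
}
```

If this is indeed the cause, decoding only the view should give the expected
row key and column names. Cassandra also ships a helper for this,
org.apache.cassandra.utils.ByteBufferUtil.string(ByteBuffer), which as far as
I know decodes only the readable part of the buffer.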
