
List:       cassandra-user
Subject:    Re: Cassandra Collections performance issue
From:       "Agrawal, Pratik" <paagrawa () amazon ! com>
Date:       2016-02-24 22:10:40
Message-ID: D2F33C6E.1A897%paagrawa () amazon ! com

Hi Daemeon,

We tried changing the behavior from "we overwrite every value" to updating only one
element in the map, and we still saw the same performance degradation.

Thanks,
Pratik

From: daemeon reiydelle <daemeonr@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, February 9, 2016 at 11:39 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: "Peddi, Praveen" <peddi@amazon.com>
Subject: Re: Cassandra Collections performance issue

I think the key to your problem might be "we overwrite every value". You are
creating a large number of tombstones, which forces reads to scan past them to pull
the current results. You would do well to rethink why you have to overwrite values
all the time under the same key. You would be better off figuring out how to add
values under a key and then age off the old values. I would say that (at least at
scale) you have a classic anti-pattern in play.
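
To make the tombstone concern above concrete, here is a toy Python model; it is not
Cassandra code. It assumes that assigning a whole non-frozen map first writes a
tombstone covering the previous entries and then one cell per new entry, while
setting a single key writes one cell and no tombstone. The class and method names
are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MapColumnLog:
    """Toy append-only write log for one map column in one row (illustrative only)."""
    writes: list = field(default_factory=list)

    def overwrite(self, new_map: dict) -> None:
        # Assumption: a full assignment first records a tombstone covering
        # all earlier entries, then one cell per new entry.
        self.writes.append(("tombstone", None, None))
        for key, value in new_map.items():
            self.writes.append(("cell", key, value))

    def set_entry(self, key: str, value: str) -> None:
        # Assumption: a single-key update records one cell and no tombstone.
        self.writes.append(("cell", key, value))

    def tombstones(self) -> int:
        return sum(1 for kind, _, _ in self.writes if kind == "tombstone")

    def read(self) -> dict:
        # Replay the log: a tombstone discards everything written before it.
        current = {}
        for kind, key, value in self.writes:
            if kind == "tombstone":
                current.clear()
            else:
                current[key] = value
        return current


full = MapColumnLog()
single = MapColumnLog()
for i in range(100):
    full.overwrite({"a": str(i), "b": str(i), "c": str(i)})
    single.set_entry("a", str(i))

print(full.tombstones(), len(full.writes))
print(single.tombstones(), len(single.writes))
```

In this model both logs read back the latest values, but the full-overwrite log
accumulates one tombstone per write, which a read has to step over until the old
entries are purged. How closely this matches a given Cassandra version should be
checked against that version's behavior.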


.......

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Mon, Feb 8, 2016 at 5:23 PM, Robert Coli <rcoli@eventbrite.com> wrote:

On Mon, Feb 8, 2016 at 2:10 PM, Agrawal, Pratik <paagrawa@amazon.com> wrote:

Recently we added one of the table fields as a Map<text, text> in Cassandra 2.1.11.
Currently we read every field from the map and overwrite the map values. The map is
of size 3. We saw that writes are 30-40% slower while reads are 70-80% slower.
Please find below some metrics that can help.

My question is: are there any known issues with Cassandra map performance? As I
understand it, each CQL3 map entry maps to a column in Cassandra. With that
assumption, we are just creating 3 columns, right? Any insight on this issue would
be helpful.

I have previously heard reports along similar lines, but in the other direction.

e.g. "I moved from a collection to a TEXT column with JSON in it, and my reads and
writes both became much faster!"
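
A minimal sketch of the workaround described in that report, assuming the map is
small and is always read and written as a whole: the client serializes it to a JSON
string and stores it in a single TEXT column. The helper names are hypothetical and
no driver calls are shown.

```python
import json


def encode_attrs(attrs: dict) -> str:
    # sort_keys gives a stable string for identical maps
    return json.dumps(attrs, sort_keys=True, separators=(",", ":"))


def decode_attrs(text: str) -> dict:
    # Treat a missing or empty column value as an empty map
    return json.loads(text) if text else {}


def update_attr(text: str, key: str, value: str) -> str:
    # Read-modify-write on the client: the whole value is rewritten each time
    attrs = decode_attrs(text)
    attrs[key] = value
    return encode_attrs(attrs)


stored = encode_attrs({"color": "red", "size": "M", "state": "new"})
stored = update_attr(stored, "state", "shipped")
print(stored)
```

The trade-off is that individual keys can no longer be updated or addressed in CQL,
and concurrent writers can overwrite each other's changes to different keys.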

I'm not sure whether the issue has been raised as an Apache Cassandra Jira, i.e.
whether it is a known and expected limitation as opposed to just a performance issue.

If I were you, I would consider filing a repro case as a Jira ticket, and responding
to this thread with its URL. :D

=Rob




