List: cassandra-user
Subject: Re: Cassandra Collections performance issue
From: "Agrawal, Pratik" <paagrawa () amazon ! com>
Date: 2016-02-24 22:10:40
Message-ID: D2F33C6E.1A897%paagrawa () amazon ! com
Hi Daemeon,
We tried changing the behavior from "we overwrite every value" to updating only 1 element in the map, and we still saw the same performance degradation.
Thanks,
Pratik
From: daemeon reiydelle <daemeonr@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, February 9, 2016 at 11:39 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: "Peddi, Praveen" <peddi@amazon.com>
Subject: Re: Cassandra Collections performance issue
I think the key to your problem might be around "we overwrite every value". You are creating a large number of tombstones, forcing many reads to pull current results. You would do well to rethink why you have to overwrite values all the time under the same key. You would be better off figuring out how to add values under a key and then age off the old values. I would say that (at least at scale) you have a classic anti-pattern in play.
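The tombstone argument above can be sketched with a toy model. It assumes the usual behavior for non-frozen collections: assigning a whole map first writes a range tombstone covering the old entries, while assigning a single entry writes only one cell. The function names and the `attrs` column are hypothetical, and this is not Cassandra's storage engine. Note that the reply at the top of this thread reports the same degradation with single-entry updates, so tombstones may not be the whole story.

```python
# Toy model only: counts the markers a write would leave behind.
# It is NOT Cassandra's storage engine; names here are hypothetical.

def overwrite_whole_map(log, column, new_map):
    """Model `SET column = {...}`: previous contents are deleted first."""
    log.append(("range_tombstone", column))  # covers all earlier entries
    for key, value in new_map.items():
        log.append(("cell", column, key, value))

def update_one_entry(log, column, key, value):
    """Model `SET column[key] = value`: one new cell, no tombstone."""
    log.append(("cell", column, key, value))

def count_tombstones(log):
    return sum(1 for entry in log if entry[0] == "range_tombstone")

whole, single = [], []
for i in range(1000):
    overwrite_whole_map(whole, "attrs", {"a": str(i), "b": str(i), "c": str(i)})
    update_one_entry(single, "attrs", "a", str(i))

print(count_tombstones(whole), count_tombstones(single))  # prints: 1000 0
```

In this model, every whole-map overwrite leaves one tombstone that later reads must reconcile until compaction removes it, which is the cost the paragraph above describes.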
.......
Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872
On Mon, Feb 8, 2016 at 5:23 PM, Robert Coli <rcoli@eventbrite.com> wrote:
On Mon, Feb 8, 2016 at 2:10 PM, Agrawal, Pratik <paagrawa@amazon.com> wrote:
Recently we added one of the table fields as a Map<text, text> in Cassandra 2.1.11. Currently we read every entry from the map and overwrite the map values. The map has 3 entries. We saw that writes are 30-40% slower while reads are 70-80% slower. Please find below some metrics that can help.
My question is: are there any known issues with Cassandra map performance? As I understand it, each CQL3 map entry maps to a column in Cassandra; with that assumption we are just creating 3 columns, right? Any insight on this issue would be helpful.
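The assumption in the question can be pictured with a small sketch. It is a simplified view of the pre-3.0 layout, in which each map entry is stored as its own cell whose name combines the column name and the map key. The helper `row_to_cells` and the column names are hypothetical, and row markers, timestamps, and tombstones are left out.

```python
def row_to_cells(clustering, columns):
    """Flatten one CQL row into (cell name -> value) pairs.

    Simplified illustration: a map column contributes one cell per entry,
    a regular column contributes exactly one cell.
    """
    cells = {}
    for name, value in columns.items():
        if isinstance(value, dict):
            for map_key, map_value in value.items():
                cells[clustering + (name, map_key)] = map_value
        else:
            cells[clustering + (name,)] = value
    return cells

# Hypothetical row with one regular column and a 3-entry map.
cells = row_to_cells((), {"status": "ok", "attrs": {"a": "1", "b": "2", "c": "3"}})
for cell_name, value in sorted(cells.items()):
    print(cell_name, value)
```

Under this picture a 3-entry map does add three cells, but a whole-map overwrite also has to remove whatever entries were there before, which plain columns do not need.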
I have previously heard reports along similar lines, but in the other direction.
e.g. "I moved from a collection to a TEXT column with JSON in it, and my reads and writes both became much faster!"
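A minimal sketch of the JSON-in-TEXT alternative mentioned in that report, assuming the application serializes the whole map into one string on the client side. The variable names are hypothetical, and the statement that would write the string to the table is omitted.

```python
import json

# Hypothetical 3-entry map that would otherwise live in a Map<text, text>.
attrs = {"a": "1", "b": "2", "c": "3"}

# Encode the whole map into a single string for one TEXT column.
encoded = json.dumps(attrs, sort_keys=True)

# Changing one entry becomes a client-side read-modify-write of the string.
decoded = json.loads(encoded)
decoded["a"] = "10"
re_encoded = json.dumps(decoded, sort_keys=True)

print(encoded)
print(re_encoded)
```

The trade-off is that the database sees a single opaque value, so individual entries can no longer be updated server-side and concurrent writers overwrite each other's changes wholesale.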
I'm not sure whether the issue has been raised as an Apache Cassandra Jira, in other words whether it is a known and expected limitation as opposed to just a performance issue.
If I were you, I would consider filing a repro case as a Jira ticket and replying to this thread with its URL. :D
=Rob