List:       cassandra-user
Subject:    Re: Read Performance
From:       James Golick <jamesgolick@gmail.com>
Date:       2010-03-31 23:21:27
Message-ID: <t2z1ab2da821003311621te1f2738er58b8b1989747ffb5@mail.gmail.com>

Keyspace: ActivityFeed
        Read Count: 699443
        Read Latency: 16.11017477192566 ms.
        Write Count: 69264920
        Write Latency: 0.020393242755495856 ms.
        Pending Tasks: 0
...snip....

                Column Family: Events
                SSTable count: 5
                Space used (live): 680625289
                Space used (total): 680625289
                Memtable Columns Count: 65974
                Memtable Data Size: 6901772
                Memtable Switch Count: 121
                Read Count: 232378
                Read Latency: 0.396 ms.
                Write Count: 919233
                Write Latency: 0.055 ms.
                Pending Tasks: 0
                Key cache capacity: 47
                Key cache size: 0
                Key cache hit rate: NaN
                Row cache capacity: 500000
                Row cache size: 62768
                Row cache hit rate: 0.007716049382716049
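
For reference, the lifetime hit rate Jonathan suggests below (getHits / getRequests rather than the sampled getRecentHitRate) can be read straight off the cache MBean over JMX. A minimal sketch: the JMX port and the MBean ObjectName here are assumptions for a 0.6-era node, so confirm both in jconsole before relying on them.

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class RowCacheHitRate {
    public static void main(String[] args) throws Exception {
        // 8080 is the 0.6-era default JMX port; adjust host/port for your node.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        MBeanServerConnection mbs = connector.getMBeanServerConnection();

        // Assumed ObjectName; check the exact name under
        // org.apache.cassandra.db in jconsole.
        ObjectName rowCache = new ObjectName(
                "org.apache.cassandra.db:type=Caches,keyspace=ActivityFeed,cache=EventsRowCache");

        // Lifetime counters, i.e. getHits / getRequests from the thread below.
        long hits = (Long) mbs.getAttribute(rowCache, "Hits");
        long requests = (Long) mbs.getAttribute(rowCache, "Requests");
        System.out.printf("lifetime row cache hit rate: %.4f%n",
                requests == 0 ? Double.NaN : (double) hits / requests);

        connector.close();
    }
}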


On Wed, Mar 31, 2010 at 4:15 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> What does the CFS mbean think read latencies are?  Possibly something
> else is introducing latency after the read.
>
> On Wed, Mar 31, 2010 at 5:37 PM, James Golick <jamesgolick@gmail.com>
> wrote:
> > Standard CF. 10 columns per row. Between about 800 bytes and 2k total per
> > row.
> > On Wed, Mar 31, 2010 at 3:06 PM, Chris Goffinet <goffinet@digg.com>
> wrote:
> >>
> >> How many columns in each row?
> >> -Chris
> >> On Mar 31, 2010, at 2:54 PM, James Golick wrote:
> >>
> >> I just tried running the same multi_get against cassandra 1000 times,
> >> assuming that that'd force it into cache.
> >> I'm definitely seeing a 5-10ms improvement, but it's still looking like
> >> 20-30ms on average. Would you expect it to be faster than that?
> >> - James
> >>
> >> On Wed, Mar 31, 2010 at 11:44 AM, Jonathan Ellis <jbellis@gmail.com>
> >> wrote:
> >>>
> >>> But then you'd still be caching the same things memcached is, so
> >>> unless you have a lot more ram you'll presumably miss the same rows
> >>> too.
> >>>
> >>> The only 2-layer approach that makes sense to me would be to have
> >>> cassandra keys cache at 100% behind memcached for the actual rows,
> >>> which will actually reduce the penalty for a memcache miss by
> >>> half-ish.
> >>>
> >>> On Wed, Mar 31, 2010 at 1:32 PM, David Strauss <david@fourkitchens.com>
> >>> wrote:
> >>> > Or, if faking memcached misses is too high a price to pay, queue some
> >>> > proportion of the reads to replay asynchronously against Cassandra.
> >>> >
> >>> > On Wed, 2010-03-31 at 11:04 -0500, Jonathan Ellis wrote:
> >>> >> Can you redirect some of the reads from memcache to cassandra?  Sounds
> >>> >> like the cache isn't getting warmed up.
> >>> >>
> >>> >> On Wed, Mar 31, 2010 at 11:01 AM, James Golick <jamesgolick@gmail.com>
> >>> >> wrote:
> >>> >> > I'm testing on the live cluster, but most of the production reads
> >>> >> > are being
> >>> >> > served by the cache. It's definitely the right CF.
> >>> >> >
> >>> >> > On Wed, Mar 31, 2010 at 8:30 AM, Jonathan Ellis <jbellis@gmail.com>
> >>> >> > wrote:
> >>> >> >>
> >>> >> >> On Wed, Mar 31, 2010 at 12:01 AM, James Golick
> >>> >> >> <jamesgolick@gmail.com>
> >>> >> >> wrote:
> >>> >> >> > Okay, so now my row cache hit rate jumps between 1.0, 99.5, 95.6,
> >>> >> >> > and NaN.
> >>> >> >> > Seems like that stat is a little broken.
> >>> >> >>
> >>> >> >> Sounds like you aren't getting enough requests for the
> >>> >> >> getRecentHitRate to make sense.  Use getHits / getRequests.
> >>> >> >>
> >>> >> >> But if you aren't getting enough requests for getRecentHitRate, are
> >>> >> >> you sure you're tuning the cache on the right CF for your 35ms
> >>> >> >> test?
> >>> >> >> Are you testing live?  If not, what's your methodology here?
> >>> >> >>
> >>> >> >> -Jonathan
> >>> >> >
> >>> >> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>
> >>
> >
> >
>
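
For reference, the warm-up James describes above (re-running the same multi_get until the rows should be cached) looks roughly like the following against the 0.6-era Thrift interface. The host, keys, and column count are placeholders, and the generated method signatures vary between releases, so treat it as a sketch rather than a drop-in.

import java.util.Arrays;
import java.util.List;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class WarmRowCache {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TSocket("localhost", 9160);
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();

        // Placeholder keys; use the keys the slow multi_get actually touches.
        List<String> keys = Arrays.asList("key1", "key2", "key3");

        // Grab every column of each row (the rows in question hold ~10 columns).
        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(new SliceRange(new byte[0], new byte[0], false, 100));
        ColumnParent parent = new ColumnParent("Events");

        // Issue the same multiget repeatedly so the rows land in the row cache.
        for (int i = 0; i < 1000; i++) {
            client.multiget_slice("ActivityFeed", keys, parent, predicate, ConsistencyLevel.ONE);
        }

        transport.close();
    }
}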
