List:       cassandra-user
Subject:    Re: Node configuration and capacity
From:       Elliott Sims <elliott@backblaze.com>
Date:       2021-01-13 21:26:30
Message-ID: <CAARvq2PtbzVDJzQsrd0RUKvEgYSTRk4jEmcJW6GHrWRGmB4G5A@mail.gmail.com>

1% packet loss can definitely lead to drops. At higher speeds, that's
enough to limit TCP throughput to the point that cross-node communication
can't keep up. TCP BBR maintains high throughput despite single-digit
packet loss better than other congestion-control algorithms, but you'll
also want to track down the actual cause.
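
If you want to experiment with BBR, on a Linux kernel of 4.9 or newer it
can be enabled with something along these lines (a sketch; module names
and defaults vary by distro, so verify what your kernel actually ships):

# Load the BBR module if it isn't built into the kernel
modprobe tcp_bbr
# The fq packet scheduler is recommended alongside BBR
sysctl -w net.core.default_qdisc=fq
sysctl -w net.ipv4.tcp_congestion_control=bbr
# Confirm the change took effect
sysctl net.ipv4.tcp_congestion_control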

I'd be a bit hesitant to tune the transport threads any further until
you've solved the packet loss problem.
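
For quantifying the loss itself, the interface and TCP counters are a
good starting point, along with a per-hop probe between nodes (a sketch;
exact tool availability varies by distro, and <other-node-ip> is a
placeholder):

# Per-interface drop and error counters
ip -s link show
# TCP retransmission statistics
netstat -s | grep -i retrans
# Per-hop packet loss to a peer node over 100 cycles
mtr --report --report-cycles 100 <other-node-ip>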

On Wed, Jan 13, 2021 at 8:53 AM MyWorld <timeplus.1111@gmail.com> wrote:

> Hi,
>
> We are currently using Apache Cassandra 3.11.6 in our production
> environment with a single DC of 4 nodes.
>
> 2 nodes have this configuration: SSD, 24 cores, 64 GB RAM, 20 GB heap
>
> The other 2 nodes have: SSD, 32 cores, 64 GB RAM, 20 GB heap
>
> I have several questions around this.
>
> 1. Do nodes with different configurations (core counts) in a single DC
> have any impact?
>
> 2. Can we have different heap sizes on different nodes in a single DC?
>
> 3. Which is better: a single disk partition or multiple disk partitions?
>
> 4. Currently we have 200 writes and around 5000 reads per second per node
> (in a 4-node cluster). How do we determine maximum node capacity?
>
> 5. We are getting read/write operation timeouts intermittently. There is
> no GC issue; however, we have observed 1% packet loss between nodes. Can
> this be the cause of the timeout issue?
>
> 6. Currently we see 1100 established connections from the client side.
> Should we increase native_transport_max_threads to 1000+? We have already
> increased it from the default of 128 to 512 after finding pending NTR
> requests during the timeout issue.
>
> 7. We found the hardware production recommendations below on the DSE
> site. How helpful are they for Apache Cassandra?
>
> net.ipv4.tcp_keepalive_time=60
> net.ipv4.tcp_keepalive_probes=3
> net.ipv4.tcp_keepalive_intvl=10
> net.core.rmem_max=16777216
> net.core.wmem_max=16777216
> net.core.rmem_default=16777216
> net.core.wmem_default=16777216
> net.core.optmem_max=40960
> net.ipv4.tcp_rmem=4096 87380 16777216
> net.ipv4.tcp_wmem=4096 65536 16777216
>
>
