List: cassandra-user
Subject: OversizedMessageException in AntiEntropyStage
From: Sébastien_Rebecchi <srebecchi () kameleoon ! com>
Date: 2021-11-10 8:51:51
Message-ID: CA+ts6np8eo50UKaoVeuDXFiHnKgMF4mJ6G17CGJfoA7HM7cwzQ () mail ! gmail ! com
Hi,

While digging through the Cassandra logs, I saw this exception occur several
times (see the trace below).

What can cause such large messages, and could this error make a repair
session fail? My repairs currently take a very long time, and sometimes I
stop them because they appear to hang and never return. For information, I
run repairs one (node, table) pair at a time, sequentially, and I stop a
repair once it has been stuck on the same (node, table) pair for 5 days. The
repair session is still alive and its pid is still there, there is no repair
failure message, yet nothing happens and the repair percentage does not
increase.

Is there something I can do to avoid this error? If it can only be fixed by
changing the configuration, which setting should I increase? I guess it is
commitlog_segment_size_in_mb, but I have not been able to find definitive
confirmation on the web.

Here is some information about the installation:
- Apache Cassandra 4.0 GA release
- Output of cqlsh: cqlsh 6.0.0 | Cassandra 4.0.1 | CQL spec 3.4.5 | Native protocol v5
- OS: CentOS Linux release 8.4.2105
- Output of nodetool status (the address column is hidden because it is
sensitive). As you can see, the cluster is still imbalanced 2 weeks after I
added the last 2 nodes (the ones with the lowest load). Before that, it was
perfectly balanced with the same data model, in case this information helps:
--  Load        Tokens  Owns (effective)  Host ID                               Rack
UN  712.72 GiB  8       32.6%             00f8bb86-5283-4b01-9819-fe4d59337680  rack1
UN  830.91 GiB  8       35.5%             b9490ee5-44ba-4898-add7-159a7eeb06d9  rack1
UN  763.69 GiB  8       34.7%             931ccba9-9aef-4d79-9fa0-53e4654554f5  rack1
UN  720.29 GiB  8       33.0%             4b5445ad-fb3c-419e-98b9-80599014e2b4  rack1
UN  273.6 GiB   8       22.8%             34db97d3-6c6d-4544-b656-9d4ae2a82dca  rack1
UN  616.92 GiB  8       41.3%             631d7b77-de74-4fe3-86a2-8bd3beec191e  rack1
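
To put a number on the imbalance, the load and ownership figures from the nodetool status output can be compared directly. The following is only a minimal Python sketch: the "GiB per 1% owned" ratio and the helper name gib_per_percent are illustrative assumptions, not something nodetool reports.

```python
# (load in GiB, effective ownership in %) copied from the nodetool status rows.
nodes = [
    (712.72, 32.6),
    (830.91, 35.5),
    (763.69, 34.7),
    (720.29, 33.0),
    (273.60, 22.8),
    (616.92, 41.3),
]


def gib_per_percent(load_gib, owns_pct):
    """Illustrative metric: data held per percentage point of ownership."""
    return load_gib / owns_pct


total_load = sum(load for load, _ in nodes)
mean_load = total_load / len(nodes)

for load, owns in nodes:
    ratio = gib_per_percent(load, owns)
    relative = load / mean_load  # 1.00 would mean exactly the average load
    print(f"{load:8.2f} GiB  {owns:5.1f}%  {ratio:6.2f} GiB per 1%  {relative:5.2f}x mean")
```

If data were spread in proportion to ownership, the ratio would be roughly the same on every node, so a node whose ratio is far below the others holds less data than its token ownership suggests.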
Thank you for your help!
Sébastien.
--
/var/log/cassandra/system.log:ERROR [AntiEntropyStage:1] 2021-11-05 00:48:28,074 CassandraDaemon.java:579 - Exception in thread Thread[AntiEntropyStage:1,5,main]
/var/log/cassandra/system.log-org.apache.cassandra.net.Message$OversizedMessageException: Message of size 140580605 bytes exceeds allowed maximum of 134217728 bytes
/var/log/cassandra/system.log- at org.apache.cassandra.net.OutboundConnection.enqueue(OutboundConnection.java:328)
/var/log/cassandra/system.log- at org.apache.cassandra.net.OutboundConnections.enqueue(OutboundConnections.java:84)
/var/log/cassandra/system.log- at org.apache.cassandra.net.MessagingService.doSend(MessagingService.java:338)
/var/log/cassandra/system.log- at org.apache.cassandra.net.OutboundSink.accept(OutboundSink.java:70)
/var/log/cassandra/system.log- at org.apache.cassandra.net.MessagingService.send(MessagingService.java:327)
/var/log/cassandra/system.log- at org.apache.cassandra.net.MessagingService.send(MessagingService.java:314)
/var/log/cassandra/system.log- at org.apache.cassandra.repair.Validator.respond(Validator.java:269)
/var/log/cassandra/system.log- at org.apache.cassandra.repair.Validator.run(Validator.java:257)
/var/log/cassandra/system.log- at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
/var/log/cassandra/system.log- at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
/var/log/cassandra/system.log- at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
/var/log/cassandra/system.log- at java.lang.Thread.run(Thread.java:748)
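
As a quick sanity check on the two sizes in the exception message, the limit works out to exactly 128 MiB and the rejected message is only a few percent larger. This is just a minimal Python sketch of that arithmetic; the variable and function names are illustrative and nothing here comes from Cassandra itself.

```python
# Sizes copied from the OversizedMessageException line in the trace above.
message_size = 140_580_605  # bytes, size of the rejected message
allowed_max = 134_217_728   # bytes, maximum reported in the exception

MIB = 1024 * 1024


def overshoot(size, limit):
    """Return (excess in bytes, excess as a percentage of the limit)."""
    excess = size - limit
    return excess, 100.0 * excess / limit


excess_bytes, excess_pct = overshoot(message_size, allowed_max)

print(f"limit   = {allowed_max / MIB:.0f} MiB")
print(f"message = {message_size / MIB:.2f} MiB")
print(f"excess  = {excess_bytes} bytes ({excess_pct:.1f}% over the limit)")
```

The stack trace shows the message being sent from Validator.respond, so the size in question is that of a repair validation response, not of a client write.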