'OversizedMessageException in AntiEntropyStage'


List:       cassandra-user
Subject:    OversizedMessageException in AntiEntropyStage
From:       Sébastien_Rebecchi <srebecchi () kameleoon ! com>
Date:       2021-11-10 8:51:51
Message-ID: CA+ts6np8eo50UKaoVeuDXFiHnKgMF4mJ6G17CGJfoA7HM7cwzQ () mail ! gmail ! com

Hi,

Digging through the Cassandra logs, I saw this exception occur several
times; see the trace below.

I was wondering what can cause such big messages, and whether this error
could make a repair session fail. My repairs currently take a very long
time, and sometimes I even stop them because they seem to hang and never
return. For information, I run repairs one (node+table) at a time,
sequentially, and I stop a repair once it has been stuck on the same
(node+table) for 5 days. I can see that the repair session is still alive
(the pid is still there) and there is no repair failure message; nothing
happens and the repair % does not increase.

Is there something I can do to avoid that error? If it can be fixed just by
changing the configuration, which setting should I increase? I guess it is
commitlog_segment_size_in_mb, but I have not been able to find definite
confirmation on the web.

Here is some information about the installation:
- Apache Cassandra 4.0 GA release
- Output of cqlsh: cqlsh 6.0.0 | Cassandra 4.0.1 | CQL spec 3.4.5 | Native
  protocol v5
- OS: CentOS Linux release 8.4.2105
- Output of nodetool status (the address column is hidden because it is
  sensitive). As you can see, the cluster remains imbalanced 2 weeks after
  I joined the last 2 nodes (the ones with the lowest load). Before that it
  was perfectly balanced with the same data model, in case this information
  helps:
--  Load        Tokens  Owns (effective)  Host ID                               Rack
UN  712.72 GiB  8       32.6%             00f8bb86-5283-4b01-9819-fe4d59337680  rack1
UN  830.91 GiB  8       35.5%             b9490ee5-44ba-4898-add7-159a7eeb06d9  rack1
UN  763.69 GiB  8       34.7%             931ccba9-9aef-4d79-9fa0-53e4654554f5  rack1
UN  720.29 GiB  8       33.0%             4b5445ad-fb3c-419e-98b9-80599014e2b4  rack1
UN  273.6 GiB   8       22.8%             34db97d3-6c6d-4544-b656-9d4ae2a82dca  rack1
UN  616.92 GiB  8       41.3%             631d7b77-de74-4fe3-86a2-8bd3beec191e  rack1
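
To put rough numbers on the imbalance, here is a minimal sketch in Python. It only uses the Load and Owns (effective) values copied from the nodetool status output above. The function name load_summary and the "GiB per owned percent" ratio are illustrative assumptions, not anything Cassandra reports itself, and the ratio is only a coarse indicator since load also depends on compaction state and data not yet cleaned up.

```python
# (load in GiB, effective ownership in %) per node, copied from the
# nodetool status output above, in the same row order.
NODES = [
    (712.72, 32.6),
    (830.91, 35.5),
    (763.69, 34.7),
    (720.29, 33.0),
    (273.6, 22.8),
    (616.92, 41.3),
]


def load_summary(nodes):
    """Summarise how far each node's load is from the cluster mean.

    Hypothetical helper for illustration only.
    """
    loads = [load for load, _ in nodes]
    mean = sum(loads) / len(loads)
    return {
        "mean_gib": mean,
        # Smallest and largest node load relative to the mean load.
        "min_over_mean": min(loads) / mean,
        "max_over_mean": max(loads) / mean,
        # Load divided by effective ownership: similar values across
        # nodes would suggest load follows token ownership.
        "gib_per_owned_pct": [load / owns for load, owns in nodes],
    }


if __name__ == "__main__":
    summary = load_summary(NODES)
    print(f"mean load: {summary['mean_gib']:.2f} GiB")
    print(f"min/mean:  {summary['min_over_mean']:.2f}")
    print(f"max/mean:  {summary['max_over_mean']:.2f}")
    for ratio in summary["gib_per_owned_pct"]:
        print(f"GiB per owned %: {ratio:.2f}")
```

Comparing the per-node ratios makes it easier to see whether the two most recently joined nodes differ from the others only in ownership or also in the amount of data held per owned percent.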

Thank you for your help!

Sébastien.

--

/var/log/cassandra/system.log:ERROR [AntiEntropyStage:1] 2021-11-05 00:48:28,074 CassandraDaemon.java:579 - Exception in thread Thread[AntiEntropyStage:1,5,main]
/var/log/cassandra/system.log-org.apache.cassandra.net.Message$OversizedMessageException: Message of size 140580605 bytes exceeds allowed maximum of 134217728 bytes
/var/log/cassandra/system.log-	at org.apache.cassandra.net.OutboundConnection.enqueue(OutboundConnection.java:328)
/var/log/cassandra/system.log-	at org.apache.cassandra.net.OutboundConnections.enqueue(OutboundConnections.java:84)
/var/log/cassandra/system.log-	at org.apache.cassandra.net.MessagingService.doSend(MessagingService.java:338)
/var/log/cassandra/system.log-	at org.apache.cassandra.net.OutboundSink.accept(OutboundSink.java:70)
/var/log/cassandra/system.log-	at org.apache.cassandra.net.MessagingService.send(MessagingService.java:327)
/var/log/cassandra/system.log-	at org.apache.cassandra.net.MessagingService.send(MessagingService.java:314)
/var/log/cassandra/system.log-	at org.apache.cassandra.repair.Validator.respond(Validator.java:269)
/var/log/cassandra/system.log-	at org.apache.cassandra.repair.Validator.run(Validator.java:257)
/var/log/cassandra/system.log-	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
/var/log/cassandra/system.log-	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
/var/log/cassandra/system.log-	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
/var/log/cassandra/system.log-	at java.lang.Thread.run(Thread.java:748)
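
For reference, the two sizes in the exception message can be compared with a short Python sketch. It pulls both numbers out of the log line with a regular expression and reports how far the message was over the limit. The function name parse_oversized is a hypothetical helper, and the pattern is an assumption based only on the single log line quoted above, so other Cassandra versions may word the message differently.

```python
import re

# Pattern built from the exception text quoted in the trace above.
OVERSIZED_RE = re.compile(
    r"Message of size (\d+) bytes exceeds allowed maximum of (\d+) bytes"
)

MIB = 1024 * 1024


def parse_oversized(line):
    """Return (message_size, allowed_maximum) in bytes, or None if no match.

    Hypothetical helper for illustration only.
    """
    match = OVERSIZED_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    line = (
        "org.apache.cassandra.net.Message$OversizedMessageException: "
        "Message of size 140580605 bytes exceeds allowed maximum of "
        "134217728 bytes"
    )
    size, limit = parse_oversized(line)
    print(f"message: {size / MIB:.2f} MiB")
    print(f"limit:   {limit / MIB:.2f} MiB")
    print(f"over by: {(size - limit) / MIB:.2f} MiB ({size - limit} bytes)")
```

The limit of 134217728 bytes is exactly 128 MiB, and the rejected message exceeds it by 6362877 bytes. Judging by the stack trace, the message was being sent from Validator.respond, that is, while the node was replying to a repair validation request.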
