List:       cyrus-devel
Subject:    Debugging Deadlocks
From:       Дилян_ =?UTF-8?Q?=D0=9F=D0=B0=D0=BB=D0=B0=D
Date:       2019-11-19 20:56:49
Message-ID: 92cca1d7baac62ef2b3cbe3f59a771796aba19dd.camel () aegee ! org

Hello,

I run Cyrus IMAP 3.0.x with some private changes.

Sometimes when I stop the master process, it uses 100% of one CPU core
for 5 minutes.  After the fifth minute, systemd enforces kill -9.  When I
attach to the master process, I see that some janitor does some work, but
I have not checked the details.  Has anybody experienced this?

I have very few users, but one of the users (me) uses many clients
simultaneously: say, two IMAP clients making 4-6 connections in parallel
and three CalDAV clients making an estimated 3-6 connections in parallel.
The httpd process is behind a proxy, and most of the time the proxy server
manages to serialize the requests, so in fact a single httpd process
handles them.  At least, under normal circumstances no second running
httpd process is visible.  Under normal circumstances I also see a single
lmtpd process and many imapd processes.

On some days I observe that the IMAP client cannot obtain the list of new
messages; it just times out.  This could be because of deadlocks, but it
could also be because on that particular day the IO is extremely slow, and
thus the problem is not within Cyrus.  Sometimes I observe afterwards that
the INBOX index is being rebuilt.  Sometimes, after the INBOX index is
rebuilt, things start working.

So on such days I suspect that there is some deadlock.  Let's say, if
there are two or more long-running lmtpd processes, then I suspect a
deadlock.  What approach can I use to find where the deadlock is, and how
can I get rid of it?

I can attach to a process with strace, get the current backtrace and
variable values with gdb, and I can see (e.g. with lsof) which files are
opened in which mode.  But I do not know what to look for.  Or rather,
when I try investigating, I almost always see a process rebuilding my
INBOX index and, after waiting, waiting, waiting, eventually the index is
rebuilt.  How can I find out why the index was broken?

Greetings
  Дилян

