List:       fedora-directory-users
Subject:    [389-users] Re: Crash with SEGV after compacting
From:       Mark Reynolds <mareynol () redhat ! com>
Date:       2022-08-03 18:20:13
Message-ID: 3ddaa48b-0a88-aa76-4ac3-297903725796 () redhat ! com


On 8/3/22 1:11 PM, Niklas Schmatloch wrote:
> Hi
> 
> My organisation is using a replicated 389-dirsrv. Lately, it has been crashing
> each time after compacting.
> 
> It is replicable on our instances by lowering the compactdb-interval to
> trigger the compacting:
> 
> dsconf -D "cn=Directory Manager" ldap://127.0.0.1 -w 'PASSWORD_HERE' backend config \
> set --compactdb-interval 300

Tip - you can use the server instance name in place of the credentials 
and URL.  It will use LDAPI as long as you run it as root:

     dsconf slapd-INSTANCE backend config set --compactdb-interval 300

or even shorter (without the "slapd-" if the instance name does not 
match an argument in dsconf):

    dsconf INSTANCE backend config set --compactdb-interval 300

Makes it easier to use the new tools IMHO.


> 
> This is the log:
> 
> [03/Aug/2022:16:06:38.552781605 +0200] - NOTICE - checkpoint_threadmain - Compacting DB start: userRoot
> [03/Aug/2022:16:06:38.752592692 +0200] - NOTICE - bdb_db_compact_one_db - compactdb: compact userRoot - 8 pages freed
> [03/Aug/2022:16:06:44.172233009 +0200] - NOTICE - bdb_db_compact_one_db - compactdb: compact userRoot - 888 pages freed
> [03/Aug/2022:16:06:44.179315345 +0200] - NOTICE - checkpoint_threadmain - Compacting DB start: changelog
> [03/Aug/2022:16:13:18.020881527 +0200] - NOTICE - bdb_db_compact_one_db - compactdb: compact changelog - 458 pages freed
> dirsrv@auth-alpha.service: Main process exited, code=killed, status=11/SEGV
> dirsrv@auth-alpha.service: Failed with result 'signal'.
> dirsrv@auth-alpha.service: Consumed 2d 6h 22min 1.122s CPU time.
> 
> The first steps finish very quickly, but the step before the 458 pages of the
> retro-changelog are freed takes several minutes. During this time the dirsrv writes
> more than 10 G and reads more than 7 G (according to iotop).
> 
> After this line is printed, the dirsrv crashes within seconds.
> I also noticed that, even though it reports freeing a lot of pages, the
> retro-changelog does not seem to change in size.
> The file `/var/lib/dirsrv/slapd-auth-alpha/db/changelog/id2entry.db` is 7.2 G
> before and after the compacting.
> 
> 
> Debian 11.4
> 389-ds-base/stable,now 1.4.4.11-2 amd64
> 
> Does someone have an idea how to debug / fix this?

Definitely need a good stacktrace from the crash.  Unfortunately I think 
this doc is slightly outdated but it's mostly accurate (the core file 
location is probably wrong): 
https://www.port389.org/docs/389ds/FAQ/faq.html#sts=Debugging%C2%A0Crashes

You could also live debug it by attaching gdb to the ns-slapd process 
(after installing the devel and debuginfo packages) and waiting for the 
compaction to occur.  Then, when it crashes, get the stack of the 
crashing thread, or of all threads:

    (gdb) thread apply all bt full
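
For reference, the attach-and-wait approach could be scripted roughly as follows. This is only a sketch: it assumes a single ns-slapd process on the host, the file names under /tmp and the helper name attach_and_wait are arbitrary, and the SIGPIPE handling is a precaution so gdb does not stop on a signal the server normally ignores.

```shell
#!/bin/sh
# Sketch only: attach gdb to a running ns-slapd and dump the stacks of all
# threads once the process stops on a fatal signal such as SIGSEGV.
# Assumptions: one ns-slapd process on this host; file names are arbitrary.

GDB_CMDS=/tmp/ns-slapd-crash.gdb
OUT=/tmp/ns-slapd-stacktrace.txt

# gdb runs these commands in order; "continue" returns when the process stops.
cat > "$GDB_CMDS" <<'EOF'
set confirm off
set pagination off
handle SIGPIPE nostop noprint pass
continue
thread apply all bt full
quit
EOF

attach_and_wait() {
    # Run as root, or as the user that owns the ns-slapd process.
    gdb -batch -x "$GDB_CMDS" -p "$(pidof ns-slapd)" > "$OUT" 2>&1
}

# Usage (blocks until the server stops, e.g. after the next compaction):
#   attach_and_wait
```

Lower the compactdb-interval as described earlier in the thread, call attach_and_wait, and the stack traces should end up in the output file once the crash happens.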

Question, is there trimming set up on the retrocl?  How aggressive are 
the trimming settings?  Not sure if trimming more entries before the 
next compaction would help or hurt.
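
One way to check the age-based trimming setting is to read the retro changelog plugin entry directly. The sketch below assumes the standard plugin DN and the nsslapd-changelogmaxage attribute; the function name is made up for illustration, and the bind DN and URL are taken from the original command.

```shell
#!/bin/sh
# Sketch only: show the retro changelog trimming age, if one is configured.
# Assumption: the plugin entry lives at the standard DN below.

RETROCL_DN="cn=Retro Changelog Plugin,cn=plugins,cn=config"

show_retrocl_trimming() {
    # $1: LDAP URL of the server, e.g. ldap://127.0.0.1
    # -W prompts for the Directory Manager password.
    ldapsearch -x -H "$1" -D "cn=Directory Manager" -W \
        -b "$RETROCL_DN" -s base nsslapd-changelogmaxage
}

# Usage:
#   show_retrocl_trimming ldap://127.0.0.1
```

If the attribute is not returned, it is not set explicitly on that entry.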

Anyway the server should never crash, so please provide the requested 
information and we will take a look at it.

Thanks,

Mark

> 
> Thanks
> _______________________________________________
> 389-users mailing list -- 389-users@lists.fedoraproject.org
> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org
> Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue

-- 
Directory Server Development Team

