List:       sssd-users
Subject:    [SSSD-users] Re: Weird SSSD shutdown
From:       Lachlan Musicman <datakid@gmail.com>
Date:       2017-11-02 22:31:55
Message-ID: CAGBeqiMC=xFyoWev6ZOvFQVLdQS2P3OHQTzL7XV9ADi8yT9A3Q@mail.gmail.com

On 3 November 2017 at 09:02, Lukas Slebodnik <lslebodn@redhat.com> wrote:

> On (03/11/17 08:53), Lachlan Musicman wrote:
> >On 3 November 2017 at 08:19, Lukas Slebodnik <lslebodn@redhat.com> wrote:
> >
> >> On (02/11/17 08:20), Lachlan Musicman wrote:
> >> >Last night sssd shutdown on one of my servers.
> >> >
> >> >I had updated the IPA server earlier in the day - but only patches to
> >> >4.5.0, nothing major.
> >> >
> >> >The error I saw this AM was:
> >> >
> >> >
> >> >(Wed Nov  1 17:08:22 2017) [sssd[be[unix.domain.com]]]
> [orderly_shutdown]
> >> >(0x0010): SIGTERM: killing
> >> >children
> >> >(Wed Nov  1 17:08:50 2017) [sssd[be[unix.domain.com]]]
> >> >[sysdb_domain_cache_connect] (0x0010): DB version too old [0.18],
> expected
> >> >[0.19] for domain unix.domain.com!
> >>
> >> sysdb version 0.19 is only in sssd-1.16.0 which is not in el7.4 by
> default.
> >>
> >
> >
> >Ah!
> >
> >And we are using the SSSD 1.16.0 from COPR.
> >
> >Hmm. What should we do? All of our servers are using  sssd from the COPR
> >repo and our IPA server is using the CentOS repos for ipa-*.
> >
>
> The sssd cache should be upgraded on restart. I have no idea how it is
> possible that the new binaries are in use while the sssd cache is still old.
>
> One explanation is that sssd itself was not restarted but the backend
> (sssd_be) was, so the new binary ran against the old cache.
>
> Another explanation is that the upgrade failed for some reason.
> But in that case I would expect that sssd would not be running at all.
>
> It would be good if you could provide more details or even
> reproducer :-)
>

No doubt! I'm sorry I can't be more helpful - it was hard to pin down the
exact problem because of various symptoms that turned out to be unrelated.
We were in the middle of diagnosing why the replica installation was
failing (ipa-replica-conncheck kept failing despite our being able to log
in via ssh in both directions) when it all happened.

The client in question is relatively busy - it's the login node for the
cluster - so the important thing (for my manager) was to make it work,
diagnostics be damned. Because the upgrade had worked in my dev
environment, I didn't think to start collecting logs or to worry.

Of course, I now have the unenviable problem that my manager is update-shy
and doesn't want to upgrade the IPA server again. The plan is to clone the
problematic client, boot the clone into the dev domain, and test there. I
can report back on reproducibility then. Debug logging has been turned
back on in prod.
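
When testing on the clone, the installed sssd version and the sysdb
version recorded in the on-disk cache can be compared directly. A minimal
sketch, assuming the default cache path and the domain from the log above
(ldbsearch comes from the ldb-tools package; only the pure comparison
helper runs without a live cache):

```shell
# All paths and the domain name below are assumptions -- adjust them.
expected="0.19"   # sysdb version shipped with sssd-1.16.0 (per the thread)
cache="/var/lib/sss/db/cache_unix.domain.com.ldb"

# Read the "version:" attribute from the sysdb entry (requires ldb-tools).
get_cache_version() {
    ldbsearch -H "$1" -b cn=sysdb version 2>/dev/null \
        | awk '/^version:/ {print $2}'
}

# Pure comparison helper, so the check itself needs no live cache.
check_version() {
    if [ "$1" = "$2" ]; then echo match; else echo mismatch; fi
}

# On a real client you would run:
#   check_version "$(get_cache_version "$cache")" "$expected"
```

A mismatch here would reproduce the "DB version too old" condition from
the log without having to wait for sssd to hit it.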

Also, I don't remember clearing the cache on the client after the update -
I probably didn't, given the size of our domain and the heavy usage of
that server. So it may be that the problem has since disappeared.
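
If it comes to clearing the cache, a sketch of the two usual options
(default paths assumed; the destructive steps are wrapped in a function
and not executed here):

```shell
# Cache file for a given sssd domain (default on-disk layout assumed).
cache_path() {
    echo "/var/lib/sss/db/cache_$1.ldb"
}

# Destructive: stop sssd, delete the per-domain cache, restart.
# Every entry must then be re-fetched from IPA -- expensive on a
# large, busy domain like a cluster login node.
wipe_cache() {
    systemctl stop sssd
    rm -f "$(cache_path "$1")"
    systemctl start sssd
}

# Gentler alternative: mark all cached entries as expired in place,
# so they are refreshed on next lookup instead of the whole database
# being deleted and rebuilt.
#   sss_cache -E
```

The `sss_cache -E` route avoids the re-fetch storm on a busy login node,
though it only invalidates entries rather than recreating the database.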


Cheers
L.


------
"The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic civics
is the insistence that we cannot ignore the truth, nor should we panic
about it. It is a shared consciousness that our institutions have failed
and our ecosystem is collapsing, yet we are still here — and we are
creative agents who can shape our destinies. Apocalyptic civics is the
conviction that the only way out is through, and the only way through is
together. "

*Greg Bloom* @greggish
https://twitter.com/greggish/status/873177525903609857



_______________________________________________
sssd-users mailing list -- sssd-users@lists.fedorahosted.org
To unsubscribe send an email to sssd-users-leave@lists.fedorahosted.org

