List: shibboleth-users
Subject: RE: Weird IdP Hanging Issue
From: "Royder, Kyle D" <kroyder () austin ! utexas ! edu>
Date: 2013-03-26 20:30:17
Message-ID: FCB0A8ABF711AA479956F6AC25A01D4D4BA378E3 () EXMBX03 ! austin ! utexas ! edu
I just wanted to follow up on our solution for this. Still being new to the admin role, I hadn't yet performed a thread dump of tomcat using kill -3 on the process. After doing so and looking at catalina.out, I was able to determine that the mechanism implemented to send emails whenever errors were logged in logging.conf was failing, causing SMTPAppender threads to be created and never cleared, which caused tomcat to max out on open threads.
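As a rough illustration of the diagnosis (the dump fragment below is fabricated for the example; real thread names and counts will vary), counting leaked appender threads in a dump looks something like:

```shell
# Request a thread dump; the JVM appends it to catalina.out without stopping:
#   kill -3 <tomcat-pid>

# Fabricated dump fragment standing in for catalina.out, for illustration only.
cat > /tmp/catalina_dump.txt <<'EOF'
"SMTPAppender-1" daemon prio=10 tid=0x1 waiting on condition
"SMTPAppender-2" daemon prio=10 tid=0x2 waiting on condition
"http-8443-exec-1" daemon prio=10 tid=0x3 runnable
EOF

# Count the stuck appender threads; a number that grows between dumps
# points at the kind of leak described above.
grep -c '^"SMTPAppender' /tmp/catalina_dump.txt
```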
-Kyle
-----Original Message-----
From: users-bounces@shibboleth.net [mailto:users-bounces@shibboleth.net] On Behalf Of Russell Beall
Sent: Friday, March 22, 2013 1:35 PM
To: Shib Users
Subject: Re: Weird IdP Hanging Issue
Yep, looks like you have more than enough memory.
In days long past, when our LDAP servers were occasionally overloaded, the shibboleth nodes would start to backlog requests, and this would make it look like they were hanging. Are your data servers overloaded?
I haven't played with the connection pool. The only setting we have for that is in the data connector directly, and it just sets the pool limit to 100.
Regards,
Russ.
On Mar 22, 2013, at 10:58 AM, "Royder, Kyle D" <kroyder@austin.utexas.edu> wrote:
> Thanks Russ.
>
> We're currently running with "-Xms512m -Xmx2048m -XX:+DisableExplicitGC -XX:MaxPermSize=1024m". We have plenty of free memory, so I could try doubling this, but I'll check the garbage collection with debugging on first. Thanks for the tip there. It would be nice to see something wrong show up so I know what I'm dealing with.
> One change I have made is adding connection pool options as described on \
> https://wiki.shibboleth.net/confluence/display/SHIB2/ResolverLDAPDataConnector.
> The thing that has me concerned is the mention of the default options:
> blockWhenEmpty: Whether to wait for an available connection when the entire pool is in use; the default is true. If set to false, the number of connections can grow beyond maxPoolSize.
> blockWaitTime: Length of time to wait on the pool, given in XML duration notation, if blockWhenEmpty is true and the pool is empty. The default is to wait indefinitely.
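(For reference, a sketch of what an explicit pool definition with these options might look like; the attribute names follow the wiki page above, and the values are illustrative, not recommendations:)

```xml
<!-- Illustrative <ConnectionPool> for the LDAP data connector; attribute
     names are from the ResolverLDAPDataConnector wiki page, values are
     examples only. Durations use XML duration notation. -->
<ConnectionPool
    minPoolSize="5"
    maxPoolSize="20"
    blockWhenEmpty="true"
    blockWaitTime="PT5S"
    validatePeriodically="true"
    validateTimerPeriod="PT30M"
    expirationTime="PT10M" />
```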
> I'm wondering if there are connections that get tied up and then everything gets held up waiting on one of the 3 default connections to become available. I've made a change to test it out by setting blockWhenEmpty to false. There's probably a better way to set up the connection pool options, but I'm wondering if this has anything to do with it. It's only been a couple of hours since I made this change, but so far so good. That doesn't really mean anything though. :)
> Does anyone know if the connection pool defaults are all in place even if you are explicitly using the <ConnectionPool /> definition?
> Thanks,
> Kyle
>
> -----Original Message-----
> From: users-bounces@shibboleth.net [mailto:users-bounces@shibboleth.net] On Behalf Of Russell Beall
> Sent: Friday, March 22, 2013 12:29 PM
> To: Shib Users
> Subject: Re: Weird IdP Hanging Issue
>
> I'm just wondering if you provided any extra memory to the tomcat process. You might be hitting your memory limits, causing tomcat to go into regular Full GC cycles. This is usually what I have seen cause the behavior you described. It would be useful to add debug logging to the tomcat process, which will print garbage collection details. For instance, I use these in my JAVA_OPTS:
> -verbose:gc
> -XX:+PrintGCTimeStamps
> -XX:-TraceClassUnloading
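(A sketch of how such flags might be wired into the Tomcat environment; the -Xloggc flag and log path are additions for the example, assuming a HotSpot JVM of that era, and are not from the mail above:)

```shell
# Illustrative only: append GC logging flags to JAVA_OPTS. The -verbose:gc
# and -XX:+PrintGCTimeStamps flags are the ones quoted above; -Xloggc and
# its path are assumptions for the example.
JAVA_OPTS="$JAVA_OPTS -verbose:gc -XX:+PrintGCTimeStamps -Xloggc:/var/log/tomcat/gc.log"
echo "$JAVA_OPTS"
```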
>
> Also, regarding the resource reloading polling frequency, I have those set at one minute, but the reload is never invoked unless there is a change. It should be safe to maintain a short interval on that reload check. If you remove the reload interval, then I believe it won't ever check, and you would have to restart the IdP to get it to load the change. That might be fine for the relying-party.xml file, which Chad said shouldn't really be reloaded on the fly, but others that get changed more frequently and are safe should be reloadable, such as the attribute-filter.xml that was discussed already.
> Regards,
> Russ.
>
> On Mar 21, 2013, at 11:11 AM, "Royder, Kyle D" <kroyder@austin.utexas.edu> wrote:
>
> > Thanks for the help! I'll turn up our LDAP logging and check the LDAP logs as well, and turn off configuration reloading. We have a pool of IdPs and, as a safety precaution, like to reboot them one at a time anyway, so if there is an issue we can bring the one with changes back down and keep the service up. Given that change policy, I don't think configuration reloading will gain us anything the way we are currently doing things.
> > Thanks,
> > Kyle
> >
> > -----Original Message-----
> > From: users-bounces@shibboleth.net [mailto:users-bounces@shibboleth.net] On Behalf Of Cantor, Scott
> > Sent: Thursday, March 21, 2013 1:04 PM
> > To: Shib Users
> > Subject: RE: Weird IdP Hanging Issue
> >
> > > I haven't had to mess with this before, but reading about it on the wiki, I'm
> > > assuming you're referring to the configurationResourcePollingFrequency
> > > attribute added to one of the four reloadable services in service.xml?
> >
> > Yes.
> >
> > > If so, it does look like this was set up to reload all four every minute?
> >
> > Yikes.
> >
> > > I don't want this turned on so I'm going to remove the
> > > configurationResourcePollingFrequency attribute from all four of these.
> >
> > You may well want the filter policy reloading, but probably not every minute.
> >
> > > Hopefully this will resolve this issue. The weird thing is that none of these
> > > configs have been changed in a couple of weeks and this just started over the
> > > past couple of days.
> >
> > Agreed, I didn't necessarily think it was the cause, but there's an explicit bug in relying-party reloading, though I think that actually hangs hard.
> > I would have to think your data connectors are the underlying cause here. If it's LDAP, I'd probably suggest logging more there.
> > -- Scott
> >
> >
> > --
> > To unsubscribe from this list send an email to users-unsubscribe@shibboleth.net
> >
>
>
> --
> To unsubscribe from this list send an email to users-unsubscribe@shibboleth.net
>
--
To unsubscribe from this list send an email to users-unsubscribe@shibboleth.net