List: gluster-devel
Subject: Re: [Gluster-devel] fail-over taking too long when a node reboots
From: Niels de Vos <ndevos () redhat ! com>
Date: 2016-07-27 11:49:15
Message-ID: 20160727114915.GD16998 () ndevos-x240 ! usersys ! redhat ! com
On Wed, Jul 27, 2016 at 12:40:58PM +0530, Pranith Kumar Karampuri wrote:
> Hi,
> Does anyone have a complete understanding of the keepalive timeout vs. the
> TCP User Timeout (UTO) option? For both afr and EC, when a server reboots it
> takes 42 seconds for the fops to fail with ENOTCONN
> (saved_frames_unwind()). I am wondering whether there is any way to reduce
> this time by tuning these two options. As per our earlier research on this
> (I think it was kp who did that), keepalive was not getting triggered while
> fops were in progress, and he saw quite a few game-dev forums discussing
> this problem too. There seems to be a newer option, TCP User Timeout, which
> addresses this. Does anyone have experience with it, and can you suggest
> more meaningful defaults for these timeouts? I think the default at the
> moment is 42 seconds.
http://review.gluster.org/8065 might be related? More details are in
https://www.gluster.org/pipermail/gluster-devel/2014-May/040755.html and
https://bugzilla.redhat.com/show_bug.cgi?id=1129787
HTH,
Niels
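
For readers comparing the two options discussed above, here is a minimal
sketch (not GlusterFS code) of how keepalive and TCP User Timeout can be set
on a Linux socket. Keepalive probes are only sent on an idle connection, so
they do not help while sent data is still waiting for an ACK; TCP_USER_TIMEOUT
(RFC 5482) bounds how long such data may stay unacknowledged before the kernel
drops the connection. The function names and the values 10/2/5 are
hypothetical, chosen only for illustration, and are not suggested defaults.

```python
import socket

# Linux option number for TCP_USER_TIMEOUT; the fallback value 18 is the
# Linux constant, used only if this Python build does not export the name.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def keepalive_detection_seconds(idle, interval, count):
    """Approximate time for keepalive to declare an IDLE peer dead:
    the idle period, then `count` unanswered probes `interval` seconds apart."""
    return idle + interval * count


def tune_socket(sock, idle=10, interval=2, count=5, user_timeout_ms=None):
    """Enable keepalive and TCP_USER_TIMEOUT on a TCP socket (Linux only).

    The defaults are illustrative assumptions, not recommended settings.
    Returns the user timeout that was applied, in milliseconds.
    """
    if user_timeout_ms is None:
        # Assumption: make both mechanisms give up after the same time.
        user_timeout_ms = keepalive_detection_seconds(idle, interval, count) * 1000

    # Keepalive: covers a connection with no traffic in flight.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)

    # User timeout: covers a connection with unacknowledged data in flight.
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, user_timeout_ms)
    return user_timeout_ms


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        applied = tune_socket(s)
        print("user timeout (ms):", applied)
    finally:
        s.close()
```

With both options set, a peer that disappears should be noticed within roughly
the same window whether or not requests are outstanding; whether that window
can safely be much shorter than 42 seconds is the open question in this thread.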