Re: PSQLException: An I/O error occurred while sending to the backend.


List:       postgresql-general
Subject:    Re: PSQLException: An I/O error occurred while sending to the backend.
From:       Argha Deep Ghoshal <ghoshal.arghadeep () gmail ! com>
Date:       2020-07-30 16:11:26
Message-ID: CAPJci=NzXgovCzgJWQNDnqNf-SbWnkf3zSLwicF_MSeK88xqWw () mail ! gmail ! com

Hi Tom,

Appreciate your inputs. Please find my comments inline below.


> > We are using PostgreSQL 11 wherein intermittently the below exception is
> > popping up, causing our application to lose connection with the database.
> > It isn't reconnecting until the application is restarted.
> >
> >     org.postgresql.util.PSQLException: An I/O error occurred while sending
> >     to the backend.
>
> That certainly looks like loss of network connection.  Had the connection
> been sitting idle for awhile before this query attempt?
>
- We are sending requests continuously using JMeter, and the exceptions are
interspersed. Out of 100 requests, roughly 8-9 hit this exception, and there
is no lag between requests. I think the connections are being kept open after
a test run finishes, but in that case shouldn't the error appear on the first
request when we start the next run? Instead, the exceptions appear after
10-15 requests.


> > We have checked the PostgreSQL logs in detail, however we are unable to
> > find any significant errors related to this issue.
>
> I'd expect that the backend would eventually notice the dead connection.
> But the timeout before it does so might be completely different from the
> time at which the client notices the dead connection, so the relationship
> might not be very obvious.
>

- Initially I was seeing connection termination errors in the logs.
Currently, however, this exception is not breaking connectivity, so no
errors are being logged in the database.

>
> > All the servers are present in the same region and building.
>
> Doesn't mean there's not routers or firewalls between them.  I'd start
> by looking for network timeouts, and possibly configuring the server
> to send TCP keepalives more aggressively.  (In this case it might be
> HAProxy that needs to be sending keepalives ... don't know what options
> it has for that.)
>
>
- I have made the below changes in our HAProxy server.

net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 20

Currently we are testing to see whether this did the trick.
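
As a rough sanity check on the three sysctl values above, the sketch below works out how long an idle connection would sit before the kernel gives up on it. It assumes the usual Linux behaviour: the first probe goes out after tcp_keepalive_time seconds of idleness, further probes follow every tcp_keepalive_intvl seconds, and the connection is dropped after tcp_keepalive_probes unanswered probes. These sysctls only apply to sockets that have SO_KEEPALIVE enabled. The function name is made up for illustration.

```python
def keepalive_detection_seconds(idle_time: int, interval: int, probes: int) -> int:
    """Approximate seconds of silence before an idle, dead peer is detected.

    idle_time -> net.ipv4.tcp_keepalive_time
    interval  -> net.ipv4.tcp_keepalive_intvl
    probes    -> net.ipv4.tcp_keepalive_probes
    """
    return idle_time + interval * probes


# Values taken from the sysctl settings quoted above.
total = keepalive_detection_seconds(idle_time=600, interval=60, probes=20)
print(f"first probe after 600 s idle; dead peer detected after about {total} s")
```

With these values the first probe is sent only after 10 minutes of idleness, and a dead peer would be noticed after roughly 30 minutes. If some device on the path drops idle connections sooner than 600 seconds (an assumption, nothing in the thread confirms it), the first probe would arrive too late to keep the connection alive.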



>                         regards, tom lane
>



