
List:       postgresql-admin
Subject:    Re: [ADMIN] D.R. Site Failover (Streaming Replication) - user access / network options
From:       Fernando Hevia <fhevia () gmail ! com>
Date:       2016-03-08 18:00:41
Message-ID: CAGYT1XQ0fJX2tEW=VNewf8ayUnF9OfeNG_1d_R_g+YLGtWwq_g () mail ! gmail ! com

On Tue, Mar 8, 2016 at 1:48 PM, CS DBA <cs_dba@consistentstate.com> wrote:

>
> I do however have a few questions related to this, I'm interested to find
> out what others have done here, in particular how do you go about moving
> end users (assuming a web app is the end user entry point) to point
> seamlessly to the secondary site?  Also how have you all dealt with the
> possible split brain issue (i.e. we fail over, then the primary site comes
> back up and existing/old connections to the old site then write to the old
> master)
>

While not seamless, you can achieve a pretty good failover time by using
DNS records with a short TTL (under 2 min). On failure, have your monitoring
tool fire the failover scripts (promote the postgres server, enable the app
server, etc.) and then change the app's DNS record to the secondary site's IP
address. In a very short time you should have your users working on the
secondary site.
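
As a rough illustration of the ordering above, here is a minimal Python sketch
of a failover runner. The step names and the placeholder actions are
hypothetical; in practice each action would wrap your own script (for example
a call to `pg_ctl promote` for the standby). The point it shows is that the
DNS change comes last and only happens if the earlier steps succeeded.

```python
from typing import Callable, List, Tuple

# A step is a (name, action) pair; the action returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_failover(steps: List[Step]) -> List[str]:
    """Run the steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, action in steps:
        if not action():
            # Do not continue: e.g. never repoint DNS at a standby
            # that failed to promote.
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder actions (assumptions); replace with real scripts.
    steps = [
        ("promote postgres standby", lambda: True),
        ("enable app server", lambda: True),
        ("switch DNS record to secondary site", lambda: True),
    ]
    print(run_failover(steps))
```

If the returned list is shorter than the list of steps, the failover stopped
early and needs a human to look at it.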

Cloudflare or Amazon's Route 53 can provide the DNS capability. It is
simple, reliable and cheap.
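
For Route 53, the record switch could look roughly like the sketch below. It
only builds the change request; the record name, IP address and TTL are
made-up example values, and the hosted zone ID is left for you to fill in.
The commented-out call uses boto3's `change_resource_record_sets`, which needs
boto3 installed and AWS credentials configured.

```python
def build_a_record_upsert(name: str, ip: str, ttl: int = 60) -> dict:
    """Build a Route 53 ChangeBatch that points an A record at `ip`."""
    return {
        "Comment": "failover to secondary site",
        "Changes": [
            {
                # UPSERT creates the record or overwrites the existing one.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": name,
                    "Type": "A",
                    "TTL": ttl,  # keep it short so clients re-resolve quickly
                    "ResourceRecords": [{"Value": ip}],
                },
            }
        ],
    }


if __name__ == "__main__":
    # Example values only (assumptions): hypothetical host, documentation IP.
    batch = build_a_record_upsert("app.example.com.", "203.0.113.10", ttl=60)
    print(batch)
    # To apply it (not run here):
    #   import boto3
    #   boto3.client("route53").change_resource_record_sets(
    #       HostedZoneId="<your hosted zone id>", ChangeBatch=batch)
```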

Once the primary site is back, split brain shouldn't be a problem since
your DNS record will keep pointing to your secondary site until you
intervene to switch back.

Or... you can go with BGP and let the network team do the dirty work at the
routing level. With BGP you should also expect somewhere between 10 and 120
seconds of downtime until the route changes propagate.
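
To compare the two options, a back-of-the-envelope estimate can help. The
sketch below simply adds up detection time, promotion time and redirect time.
The 120 second TTL and the 10 to 120 second BGP range come from the figures
above; the detection and promotion times are placeholder assumptions you
should replace with your own measurements. Clients or resolvers that ignore
the TTL can take longer than this.

```python
def outage_seconds(detect_s: int, promote_s: int, redirect_s: int) -> int:
    """Rough outage estimate: time to notice the failure, promote the
    standby, and get users redirected to the secondary site."""
    return detect_s + promote_s + redirect_s


if __name__ == "__main__":
    DETECT_S = 30    # assumption: monitoring check interval
    PROMOTE_S = 10   # assumption: time for the failover scripts to finish

    # DNS: worst case a client cached the old record just before the change.
    print("DNS worst case:", outage_seconds(DETECT_S, PROMOTE_S, 120))
    # BGP: route propagation somewhere between 10 and 120 seconds.
    print("BGP best case: ", outage_seconds(DETECT_S, PROMOTE_S, 10))
    print("BGP worst case:", outage_seconds(DETECT_S, PROMOTE_S, 120))
```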


Cheers,
Fernando.
