List:       postgresql-general
Subject:    Trigger file behavior with the standby
From:       Keiko Oda <keiko713 () gmail ! com>
Date:       2018-02-23 18:58:41
Message-ID: CAFVgSmZM70c8hrHrHN9SjC_RVmL2noP6vEA4vZwCFOv1T5Bsyg () mail ! gmail ! com

Hello,

I'm seeing the following behavior with a trigger file, which is very
confusing to me. I'd like some advice on what the expected behavior of
the trigger file is on a standby.

1. Set up replication, with the standby having the following
recovery.conf:

  # we use wal-e
  restore_command = 'wal-e wal-fetch  "%f" "%p"'
  standby_mode = 'true'
  trigger_file = '/my/path/to/trigger-file/STANDBY_OFF'
  recovery_target_timeline = 'latest'
  primary_conninfo = 'host=myhost port=5432 user=foo password=verysecurepassword'

2. Create a trigger file while the standby is lagging (at this point
replication is not streaming, but using file-based log shipping).
3. Postgres doesn't seem to recognize the trigger file at all; the standby
keeps replaying/recovering WALs.
  * I tried to see if Postgres is doing anything at the DEBUG5 log level,
but it doesn't log anything about a trigger file.
  * I also tried restarting Postgres, sending SIGUSR1, etc. to see if it
helps, but it just keeps replaying WALs.
4. Once the standby has "caught up" with the leader (replayed all WALs and
is about to switch, or has switched, to streaming replication), Postgres
finally notices that there is a trigger file and does the failover.
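
Steps 2 to 4 can be sketched as the following shell helpers. This is only an
illustration, not the exact commands used: the function names are
hypothetical, and psql is assumed to connect to the standby with its default
connection settings. pg_is_in_recovery() returns 'f' once the standby has
been promoted.

```shell
#!/bin/sh
# Hypothetical helpers illustrating steps 2-4; the names are assumptions.

# Step 2: create the trigger file.
# The path must match the trigger_file setting in recovery.conf.
create_trigger() {
    touch "$1"
}

# Steps 3-4: poll the standby until it leaves recovery.
# Assumes psql reaches the standby via the default PG* environment settings.
wait_for_promotion() {
    interval="${1:-10}"
    until [ "$(psql -Atc 'SELECT pg_is_in_recovery()')" = "f" ]; do
        echo "$(date) still in recovery"
        sleep "$interval"
    done
    echo "$(date) promoted"
}

# Usage:
#   create_trigger /my/path/to/trigger-file/STANDBY_OFF
#   wait_for_promotion 10
```

The timestamps printed by the loop show how long the standby stays in
recovery after the trigger file is created.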

The doc (
https://www.postgresql.org/docs/current/static/warm-standby-failover.html)
says:

> To trigger failover of a log-shipping standby server, run pg_ctl promote
> or create a trigger file with the file name and path specified by the
> trigger_file setting in recovery.conf.

So, I'd expect the standby to trigger a failover as soon as we create the
trigger file at step 2. However, the failover doesn't happen until step 4
above, and the time between step 2 and step 4 can sometimes be many hours.

I've reproduced this with Postgres 9.4 and 9.5, and I'm currently trying to
reproduce it with 10.
Please let me know if there is any other information I could provide.

Thanks!
Keiko Oda
