'Re: [Slony1-general] Slave can't catch up, postgres error 'stack depth limit exceeded''


List:       slony1-general
Subject:    Re: [Slony1-general] Slave can't catch up, postgres error 'stack depth limit exceeded'
From:       Cédric_Villemain <cedric.villemain.debian () gmail ! com>
Date:       2012-01-28 21:19:45
Message-ID: CAF6yO=3_ZeZudxFXjDDLnUk-VRT4YUbzBeZfvr=PXOLjTBOJWQ () mail ! gmail ! com

On 28 January 2012 00:29, Jan Wieck <JanWieck@yahoo.com> wrote:
> On 1/24/2012 4:57 AM, Cédric Villemain wrote:
>>
>> On 22 January 2012 17:16, Steve Singer <steve@ssinger.info> wrote:
>>>
>>> On Sun, 22 Jan 2012, Brian Fehrle wrote:
>>>
>>>> Hi all,
>>>>
>>>> PostgreSQL 9.1.2
>>>> Slony 2.1.0
>>>
>>>
>>> Set max_stack_depth in your postgresql.conf to something higher.
>>>
>>> Setting sync_group_maxsize in your slon.conf to something low MIGHT help
>>> (i.e. 1 or 2), but I think the default in 2.1 is pretty low anyway
>>> (like 20).
>>
>>
>> The immediate workaround is in fact to increase max_stack_depth (but cap
>> it at "ulimit -s" minus 1MB).
>>
>> But ... isn't it Slony that should not use more than the default stack
>> size? Couldn't there be an underlying bug?
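
To make the "ulimit -s minus 1MB" rule concrete, here is a minimal sketch of
the arithmetic. The function name and the 8192 kB example limit are
assumptions for illustration only; check the real limit of the account that
runs the postmaster before changing postgresql.conf.

```python
def safe_max_stack_depth_kb(ulimit_stack_kb, margin_kb=1024):
    """Return a max_stack_depth value in kB, keeping margin_kb of headroom.

    ulimit_stack_kb is what "ulimit -s" prints (kB) for the postgres user.
    The 1024 kB default margin mirrors the "minus 1MB" rule quoted above.
    """
    if ulimit_stack_kb <= margin_kb:
        raise ValueError("stack limit too small to leave the requested margin")
    return ulimit_stack_kb - margin_kb


# Assumed example: "ulimit -s" reports 8192 kB.
print(f"max_stack_depth = {safe_max_stack_depth_kb(8192)}kB")
```

The printed line has the form expected in postgresql.conf.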
>
>
> Not a bug per se. Maybe something to improve in a future release.
>
> That list of log_actionseq <> clauses only ever occurs on the first sync of
> a new SUBSCRIBE SET directly from the origin. The subscriber copies all the
> tables in some state in between two SYNC events. During the first SYNC after
> that, it needs to filter out all the log rows that are already incorporated
> in that data, so it collects them and saves them in that set's sl_setsync
> row. These are all the actions that happened and committed in between the
> last SYNC event created on the origin and the copy_set operation starting.
>
> If those are many thousands, then this is a mighty busy database or the slon
> on the origin maybe wasn't running for a while.
>
> The improvement for a future release would be to have the remote worker get
> the log_actionseq list at the beginning of copy_set. If that list is longer
> than a configurable maximum, it would abort the subscribe and retry in a few
> seconds. It may take a couple of retries, but it should eventually hit a
> moment where a SYNC event was created recently enough so that there are only
> a few hundred log rows to ignore.
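
The abort-and-retry idea described above could be sketched roughly as
follows. This is only an illustration of the control flow, not Slony code:
subscribe_with_retry, fetch_pending_actionseqs, copy_set (as a Python
callable), max_pending and the delay values are all hypothetical names and
numbers chosen for the example.

```python
import time


def subscribe_with_retry(fetch_pending_actionseqs, copy_set, max_pending=500,
                         retry_delay_s=5.0, max_attempts=10, sleep=time.sleep):
    """Start the copy only when the list of log rows to ignore is short.

    fetch_pending_actionseqs() stands in for reading the log_actionseq values
    committed since the last SYNC; copy_set(pending) stands in for the copy.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        pending = fetch_pending_actionseqs()
        if len(pending) <= max_pending:
            copy_set(pending)          # short enough: proceed with the copy
            return attempt
        if attempt < max_attempts:
            sleep(retry_delay_s)       # too long: back off, hope for a fresh SYNC
    raise RuntimeError("log_actionseq list stayed above the configured maximum")


# Simulated run: the list shrinks once a recent SYNC event exists.
sizes = iter([5000, 1200, 300])
attempts = subscribe_with_retry(lambda: list(range(next(sizes))),
                                lambda pending: None,
                                sleep=lambda seconds: None)
print(attempts)
```

Capping the number of attempts is a choice made for the sketch; the
description above only says that a couple of retries should be enough.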

Not a bug; I was misusing the word, sorry.
Good explanations, thanks to you and Steve for that.

By the way, it is nice that an improvement can be made in 2.2. As noted
upthread, the issue here is limited to 9.1: it may be related to SSI taking
more stack space, or to some other change in the PostgreSQL code. Slony's
ability to work around changes in PostgreSQL internals and to manage
different versions of PostgreSQL is a very nice combination, and I am very
happy to see the Slony dev team maintaining it.

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Development, Expertise and Training
_______________________________________________
Slony1-general mailing list
Slony1-general@lists.slony.info
http://lists.slony.info/mailman/listinfo/slony1-general