List:       slony1-general
Subject:    Re: [Slony1-general] sl_confirm aging issue?
From:       Richard Yen <dba () richyen ! com>
Date:       2007-03-30 21:38:30
Message-ID: 164F7A99-345B-48FC-951F-15DDF594DF48 () richyen ! com

Hi,

As a follow-up to my previous post about sl_confirm getting aged, I  
*did* do a move_set from node 4 to node 1 about 6 days ago.  Any  
reason why the slon cleanup cycle didn't pick up these confirmations  
and delete them?  Perhaps it is a bug of some sort?

In any case, I deleted the rows in sl_confirm, so the  
test_slony_state-dbi.pl script doesn't list these anomalies anymore.   
Has anyone else encountered this, or does anyone have an explanation for it?
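
For anyone who wants to spot rows like these without reading the whole report by eye, here is a minimal sketch in Python. It is not part of Slony-I or test_slony_state-dbi.pl: the helper names (parse_age, stale_confirms) and the one-hour threshold are assumptions made up for illustration, and the sample rows are copied from the sl_confirm aging summary in the quoted report.

```python
import re
from datetime import timedelta

# Hypothetical helpers for illustration; not part of Slony-I.
# Matches intervals as printed in the report, e.g. "00:19:00" or "6 days 01:52:00".
AGE_RE = re.compile(r"(?:(\d+) days? )?(\d+):(\d+):(\d+)")


def parse_age(text):
    """Convert an interval string from the report into a timedelta."""
    m = AGE_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"unrecognized interval: {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def stale_confirms(rows, threshold=timedelta(hours=1)):
    """Return (origin, receiver) pairs whose latest confirmation is older
    than the threshold. The one-hour default is an arbitrary assumption."""
    return [(o, r) for o, r, latest in rows if parse_age(latest) > threshold]


# (origin, receiver, age of latest SYNC), taken from the quoted report.
rows = [
    (1, 2, "00:00:00"),
    (4, 1, "00:00:00"),
    (4, 2, "6 days 01:52:00"),
    (4, 5, "6 days 01:52:00"),
]

print(stale_confirms(rows))
```

With the sample rows above, only the 4->2 and 4->5 pairs exceed the threshold, which matches the two entries the script flagged.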

--Richard




On Mar 30, 2007, at 12:17 PM, Richard Yen wrote:

> Hi all,
>
> I've recently been experiencing climbing lags, followed by a sudden
> drop, at random times during the day.  I understand that for some
> people a ~40 event lag isn't much, but it's quite unusual for my
> cluster.
>
> I run a 4-node cluster (1 provider, 3 subscribers), and it appears
> that at random times, the event lag climbs up to ~40, and then
> suddenly drops to 0.  Load on all nodes is < 1.0 during these times,
> so I don't suspect that it's hardware or configuration.  That leaves
> me with no explanation of what's happening that causes these "lag
> spikes."
>
> Tried running test_slony_state-dbi.pl, and found the following output:
>
> ===BEGIN LOG===
> Tests for node 1 - DSN = dbname=tii host=tii-db1.oaktown.iparadigms.com user=slony password=3l3phant
> ========================================
> pg_listener info:
> Pages: 9
> Tuples: 1
>
> Size Tests
> ================================================
>         sl_log_1      1918 26082.000000
>         sl_log_2         0  0.000000
>        sl_seqlog        20 1543.000000
>
> Listen Path Analysis
> ===================================================
> No problems found with sl_listen
>
> --------------------------------------------------------------------------------
> Summary of event info
> Origin  Min SYNC  Max SYNC Min SYNC Age Max SYNC Age
> ================================================================================
>        2   2277006   2277401     00:00:00     00:19:00    0
>        1   2999671   3001970     00:00:00     00:19:00    0
>        5    516048    516088     00:00:00     00:20:00    0
>        4    173746    174140     00:00:00     00:19:00    0
>
>
> ----------------------------------------------------------------------------------------
> Summary of sl_confirm aging
>     Origin   Receiver   Min SYNC   Max SYNC  Age of latest SYNC  Age of eldest SYNC
> ========================================================================================
>          1          2    2999672    3001969            00:00:00            00:19:00    0
>          1          4    2999678    3001969            00:00:00            00:19:00    0
>          1          5    2999671    3001962            00:00:00            00:19:00    0
>          2          1    2277006    2277401            00:00:00            00:19:00    0
>          2          4    2277006    2277401            00:00:00            00:19:00    0
>          2          5    2277006    2277400            00:00:00            00:19:00    0
>          4          1     173746     174140            00:00:00            00:19:00    0
>          4          2    6030310    6030310     6 days 01:52:00     6 days 01:52:00    1
>          4          5    6030307    6030307     6 days 01:52:00     6 days 01:52:00    1
>          5          1     516048     516088            00:00:00            00:20:00    0
>          5          2     516048     516088            00:00:00            00:20:00    0
>          5          4     516048     516088            00:00:00            00:20:00    0
>
>
> ------------------------------------------------------------------------------
>
> Listing of old open connections
>         Database             PID            User    Query Age                Query
> ================================================================================
> ===END OF LOG===
>
> If you notice, the lines for Origin->Receiver on 4->2 and 4->5 have
> some old SYNCs.  These nodes (2 and 5) are the ones I experience the
> "lag spikes" on.  The other subscriber, node 4, doesn't experience
> lag spikes at all.  This report is similar for every node in the
> test_slony_state-dbi.pl script, so I'm kind of perplexed.
>
> Wondering if anyone would be able to interpret this for me and
> provide any help/advice.
>
> Thanks a lot!
> --Richard
> _______________________________________________
> Slony1-general mailing list
> Slony1-general@gborg.postgresql.org
> http://gborg.postgresql.org/mailman/listinfo/slony1-general
