[prev in list] [next in list] [prev in thread] [next in thread] 

List:       pgsql-performance
Subject:    Re: [PERFORM] Trying to understand Stats/Query planner issue
From:       Tom Lane <tgl () sss ! pgh ! pa ! us>
Date:       2011-11-13 16:17:22
Message-ID: 6469.1321201042 () sss ! pgh ! pa ! us
[Download RAW message or body]

"Strange, John W" <john.w.strange@jpmorgan.com> writes:
> I have a question on how the analyzer works in this type of scenario.
> We calculate some results and COPY INTO some partitioned tables, which we use some \
> selects to aggregate the data back out into reports.  Everyone once in a while the \
> aggregation step picks a bad plan due to stats on the tables that were just \
> populated.   Updating the stats and rerunning the query seems to solve the problem, \
> this only happens if we enable nested loop query plans.

Well, even if auto-analyze launches instantly after you commit the
insertions (which it won't), it's going to take time to scan the table
and then commit the updates to pg_statistic.  So there is always going
to be some window where queries will get planned with obsolete
information.  If you're inserting enough data to materially change the
statistics of a table, and you need to query that table right away,
doing a manual ANALYZE rather than waiting for auto-analyze is
recommended.

> The other option is just to analyze each table involved in the query after the \
> insert, but that seems a bit counterproductive.

Why would you think that?  This type of scenario is exactly why ANALYZE
isn't deprecated as a user command.

			regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic