[prev in list] [next in list] [prev in thread] [next in thread] 

List:       postgresql-general
Subject:    Re: postgres=# VACUUM FULL pg_statistic => ERROR: missing chunk number 0 for toast value .. in pg_to
From:       Justin Pryzby <pryzby () telsasoft ! com>
Date:       2019-08-30 14:26:55
Message-ID: 20190830142655.GA14011 () telsasoft ! com
[Download RAW message or body]

Moving this old thread to -hackers
https://www.postgresql.org/message-id/flat/20180519142603.GA30060%40telsasoft.com

I wanted to mention that this seems to still be an issue, now running pg11.5.

log_time               | 2019-08-30 23:20:00.118+10
user_name              | postgres
database               | ts
session_id             | 5d69227e.235
session_line           | 1
command_tag            | CLUSTER
session_start_time     | 2019-08-30 23:19:58+10
error_severity         | ERROR
sql_state_code         | XX000
message                | unexpected chunk number 1 (expected 0) for toast value \
2369261203 in pg_toast_2619 query                  | CLUSTER pg_statistic USING \
pg_statistic_relid_att_inh_index application_name       | psql

Note that my original report was for "missing" chunk during "VACUUM FULL", and
the current error is "unexpected chunk" during CLUSTER.  I imagine that's
related issue.  I haven't seen this in awhile (but stopped trying to reproduce
it long ago).  A recently-deployed update to this maintenance script is
probably why it's now doing CLUSTER.

On Fri, May 25, 2018 at 08:49:50AM -0500, Justin Pryzby wrote:
> On Sat, May 19, 2018 at 02:39:26PM -0400, Tom Lane wrote:
> > Justin Pryzby <pryzby@telsasoft.com> writes:
> > > [pryzbyj@database ~]$ while :; do for db in `psql postgres -Atc "SELECT datname \
> > > FROM pg_database WHERE datallowconn"`; do for t in pg_statistic pg_attrdef \
> > > pg_constraint; do echo "$db.$t..."; PGOPTIONS=-cstatement_timeout='9s' psql $db \
> > > -qc "VACUUM FULL $t"; done; done; done
> > 
> > > ...
> > > postgres.pg_statistic...
> > > postgres.pg_attrdef...
> > > postgres.pg_constraint...
> > > template1.pg_statistic...
> > > template1.pg_attrdef...
> > > template1.pg_constraint...
> > > ts.pg_statistic...
> > > ERROR:  canceling statement due to statement timeout
> > > ts.pg_attrdef...
> > > ts.pg_constraint...
> > > postgres.pg_statistic...
> > > ERROR:  missing chunk number 0 for toast value 3372855171 in pg_toast_2619
> > 
> > Hm, so was the timeout error happening every time through on that table,
> > or just occasionally, or did you provoke it somehow?  I'm wondering how
> > your 9s timeout relates to the expected completion time.
> 
> Actually statement_timeout isn't essential for this (maybe it helps to triggers
> it more often - not sure).
> 
> Could you try:
> time sh -ec 'while :; do time psql postgres -c "VACUUM FULL VERBOSE \
> pg_toast.pg_toast_2619"; psql postgres -c "VACUUM FULL VERBOSE pg_statistic"; \
> done'; date 
> Three servers experienced error within 30min, but one server didn't fail until
> 12h later, and a handful others still haven't failed..
> 
> Does this help at all ?
> 2018-05-24 21:57:49.98-03  | 5b075f8d.1ad1 | LOG            | pryzbyj   | postgres \
> | statement: VACUUM FULL VERBOSE pg_toast.pg_toast_2619 2018-05-24 21:57:50.067-03 \
> | 5b075f8d.1ad1 | INFO           | pryzbyj   | postgres | vacuuming \
> "pg_toast.pg_toast_2619" 2018-05-24 21:57:50.09-03  | 5b075f8d.1ad1 | INFO          \
> | pryzbyj   | postgres | "pg_toast_2619": found 0 removable, 408 nonremovable row \
> versions in 99 pages 2018-05-24 21:57:50.12-03  | 5b075f8e.1ada | LOG            | \
> pryzbyj   | postgres | statement: VACUUM FULL VERBOSE pg_statistic 2018-05-24 \
> 21:57:50.129-03 | 5b075f8e.1ada | INFO           | pryzbyj   | postgres | vacuuming \
> "pg_catalog.pg_statistic" 2018-05-24 21:57:50.185-03 | 5b075f8e.1ada | ERROR        \
> | pryzbyj   | postgres | missing chunk number 0 for toast value 3382957233 in \
> pg_toast_2619 
> Some thing; this server has autovacuum logging, although it's not clear to me
> if that's an essential component of the problem, either:
> 2018-05-24 21:16:39.856-06 | LOG   | 5b078017.7b99 | pryzbyj   | postgres | \
> statement: VACUUM FULL VERBOSE pg_toast.pg_toast_2619 2018-05-24 21:16:39.876-06 | \
> LOG   | 5b078010.7968 |           |          | automatic vacuum of table \
> "postgres.pg_toast.pg_toast_2619": index scans: 1                        + |       \
> |               |           |          | pages: 0 removed, 117 r 2018-05-24 \
> 21:16:39.909-06 | INFO  | 5b078017.7b99 | pryzbyj   | postgres | vacuuming \
> "pg_toast.pg_toast_2619" 2018-05-24 21:16:39.962-06 | INFO  | 5b078017.7b99 | \
> pryzbyj   | postgres | "pg_toast_2619": found 0 removable, 492 nonremovable row \
> versions in 117 pages 2018-05-24 21:16:40.025-06 | LOG   | 5b078018.7b9b | pryzbyj  \
> | postgres | statement: VACUUM FULL VERBOSE pg_statistic 2018-05-24 21:16:40.064-06 \
> | INFO  | 5b078018.7b9b | pryzbyj   | postgres | vacuuming \
> "pg_catalog.pg_statistic" 2018-05-24 21:16:40.145-06 | ERROR | 5b078018.7b9b | \
> pryzbyj   | postgres | missing chunk number 0 for toast value 765874692 in \
> pg_toast_2619 
> Or this one?
> 
> postgres=# SELECT log_time, database, user_name, session_id, left(message,999) FROM \
> postgres_log WHERE (log_time>='2018-05-24 19:56' AND log_time<'2018-05-24 19:58') \
> AND (database='postgres' OR database IS NULL OR user_name IS NULL OR \
> user_name='pryzbyj') AND message NOT LIKE 'statement:%' ORDER BY 1; 
> log_time   | 2018-05-24 19:56:35.396-04
> database   | 
> user_name  | 
> session_id | 5b075131.3ec0
> left       | skipping vacuum of "pg_toast_2619" --- lock not available
> 
> ...
> 
> log_time   | 2019-05-24 19:57:35.78-04
> database   | 
> user_name  | 
> session_id | 5b07516d.445e
> left       | automatic vacuum of table "postgres.pg_toast.pg_toast_2619": index \
> scans: 1 : pages: 0 removed, 85 remain, 0 skipped due to pins, 0 skipped frozen
> > tuples: 1 removed, 348 remain, 0 are dead but not yet removable, oldest xmin: \
> > 63803106
> > buffer usage: 179 hits, 4 misses, 87 dirtied
> > avg read rate: 1.450 MB/s, avg write rate: 31.531 MB/s
> > system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.02 s
> 
> log_time   | 2018-05-24 19:57:35.879-04
> database   | postgres
> user_name  | pryzbyj
> session_id | 5b07516f.447f
> left       | missing chunk number 0 for toast value 624341680 in pg_toast_2619
> 
> log_time   | 2018-05-24 19:57:44.332-04
> database   |
> user_name  |
> session_id | 5af9fda3.70d5
> left       | checkpoint starting: time


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic