'Re: [PERFORM] Postgres Replaying WAL slowly'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       pgsql-performance
Subject:    Re: [PERFORM] Postgres Replaying WAL slowly
From:       Simon Riggs <simon () 2ndquadrant ! com>
Date:       2014-09-17 17:15:32
Message-ID: CA+U5nMLW3GxPryXA_B56eghr8M42TvretOA=mfjDyvJfTU1Mcw () mail ! gmail ! com
[Download RAW message or body]

On 1 July 2014 20:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jeff Frost <jeff@pgexperts.com> writes:
> > > On Jun 30, 2014, at 4:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > Did you check whether the locks were all on temp tables of the
> > > > ON COMMIT DROP persuasion?
> 
> > And indeed it did catch up overnight and the lag increased shortly after a \
> > correlating spike in AccessExclusiveLocks that were generated by temp table \
> > creation with on commit drop.
> 
> OK, so we have a pretty clear idea of where the problem is now.
> 
> It seems like there are three, not mutually exclusive, ways we might
> address this:
> 
> 1. Local revisions inside StandbyReleaseLocks to make it perform better in
> the presence of many locks.  This would only be likely to improve matters
> much if there's a fixable O(N^2) algorithmic issue; but there might well
> be one.
> 
> 2. Avoid WAL-logging AccessExclusiveLocks associated with temp tables, on
> the grounds that no standby should be touching them.  I'm not entirely
> sure that that argument is bulletproof though; in particular, even though
> a standby couldn't access the table's data, it's possible that it would be
> interested in seeing consistent catalog entries.
> 
> 3. Avoid WAL-logging AccessExclusiveLocks associated with
> new-in-transaction tables, temp or not, on the grounds that no standby
> could even see such tables until they're committed.  We could go a bit
> further and not take out any locks on a new-in-transaction table in the
> first place, on the grounds that other transactions on the master can't
> see 'em either.
> 
> It sounded like Andres had taken a preliminary look at #1 and found a
> possible avenue for improvement, which I'd encourage him to pursue.
> 
> For both #2 and the conservative version of #3, the main implementation
> problem would be whether the lock WAL-logging code has cheap access to
> the necessary information.  I suspect it doesn't.
> 
> The radical version of #3 might be pretty easy to do, at least to the
> extent of removing locks taken out during CREATE TABLE.  I suspect there
> are some assertions or other consistency checks that would get unhappy if
> we manipulate relations without locks, though, so those would have to be
> taught about the exception.  Also, we sometimes forget new-in-transaction
> status during relcache flush events; it's not clear if that would be a
> problem for this.
> 
> I don't plan to work on this myself, but perhaps someone with more
> motivation will want to run with these ideas.

Patch implements option 2 in the above.

Skipping the locks entirely seems like it opens a can of worms.

Skipping the lock for temp tables is valid since locks don't need to
exist on the standby. Any catalog entries for them will exist, but the
rows will show them as temp and nobody would expect them to be valid
outside of the original session.

Patch implements a special case that takes the lock normally, but
skips WAL logging the lock info.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


["temp_tables_skip_logging_locks.v1.patch" (application/octet-stream)]

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index ae15c0b..cda51cc 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1014,6 +1014,8 @@ fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc,
  *
  *		NB: a "relation" is anything with a pg_class entry.  The caller is
  *		expected to check whether the relkind is something it can handle.
+ *
+ *		relation_open_temp() is a special case that avoids logging locks.
  * ----------------
  */
 Relation
@@ -1042,6 +1044,31 @@ relation_open(Oid relationId, LOCKMODE lockmode)
 	return r;
 }
 
+Relation
+relation_open_temp(Oid relationId, LOCKMODE lockmode)
+{
+	Relation	r;
+
+	Assert(lockmode >= NoLock && lockmode < MAX_LOCKMODES);
+
+	/* Get the lock before trying to open the relcache entry */
+	if (lockmode != NoLock)
+		LockRelationOidTemp(relationId, lockmode);
+
+	/* The relcache does all the real work... */
+	r = RelationIdGetRelation(relationId);
+
+	if (!RelationIsValid(r))
+		elog(ERROR, "could not open relation with OID %u", relationId);
+
+	/* Make note that we've accessed a temporary relation */
+	MyXactAccessedTempRel = true;
+
+	pgstat_initstats(r);
+
+	return r;
+}
+
 /* ----------------
  *		try_relation_open - open any relation by relation OID
  *
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7bc579b..f525a47 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -669,8 +669,13 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId)
 	 * really necessary for locking out other backends (since they can't see
 	 * the new rel anyway until we commit), but it keeps the lock manager from
 	 * complaining about deadlock risks.
+	 *
+	 * Temp tables follow a special case here to avoid logging locks.
 	 */
-	rel = relation_open(relationId, AccessExclusiveLock);
+	if (stmt->relation->relpersistence == RELPERSISTENCE_TEMP)
+		rel = relation_open_temp(relationId, AccessExclusiveLock);
+	else
+		rel = relation_open(relationId, AccessExclusiveLock);
 
 	/*
 	 * Now add any newly specified column default values and CHECK constraints
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 1c327fd..2ed545a 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -365,7 +365,7 @@ ResolveRecoveryConflictWithLock(Oid dbOid, Oid relOid)
 		ResolveRecoveryConflictWithVirtualXIDs(backends,
 											 PROCSIG_RECOVERY_CONFLICT_LOCK);
 
-		if (LockAcquireExtended(&locktag, AccessExclusiveLock, true, true, false)
+		if (LockAcquireExtended(&locktag, AccessExclusiveLock, true, true, false, false)
 			!= LOCKACQUIRE_NOT_AVAIL)
 			lock_acquired = true;
 	}
@@ -591,7 +591,7 @@ StandbyAcquireAccessExclusiveLock(TransactionId xid, Oid dbOid, Oid relOid)
 	 */
 	SET_LOCKTAG_RELATION(locktag, newlock->dbOid, newlock->relOid);
 
-	if (LockAcquireExtended(&locktag, AccessExclusiveLock, true, true, false)
+	if (LockAcquireExtended(&locktag, AccessExclusiveLock, true, true, false, false)
 		== LOCKACQUIRE_NOT_AVAIL)
 		ResolveRecoveryConflictWithLock(newlock->dbOid, newlock->relOid);
 }
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index 6cc4d26..747f0ac 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -82,6 +82,7 @@ SetLocktagRelationOid(LOCKTAG *tag, Oid relid)
  *
  * Lock a relation given only its OID.  This should generally be used
  * before attempting to open the relation's relcache entry.
+ * Any changes here should also change LockRelationOidTemp below.
  */
 void
 LockRelationOid(Oid relid, LOCKMODE lockmode)
@@ -108,6 +109,38 @@ LockRelationOid(Oid relid, LOCKMODE lockmode)
 }
 
 /*
+ *		LockRelationOidTemp
+ *
+ * Lock a relation given only its OID.  This should generally be used
+ * before attempting to open the relation's relcache entry.
+ *
+ * Special case version of LockRelationOid that avoids logging locks.
+ */
+void
+LockRelationOidTemp(Oid relid, LOCKMODE lockmode)
+{
+	LOCKTAG		tag;
+	LockAcquireResult res;
+
+	SetLocktagRelationOid(&tag, relid);
+
+	res = LockAcquireExtended(&tag, lockmode, false, false, true, false);
+
+	/*
+	 * Now that we have the lock, check for invalidation messages, so that we
+	 * will update or flush any stale relcache entry before we try to use it.
+	 * RangeVarGetRelid() specifically relies on us for this.  We can skip
+	 * this in the not-uncommon case that we already had the same type of lock
+	 * being requested, since then no one else could have modified the
+	 * relcache entry in an undesirable way.  (In the case where our own xact
+	 * modifies the rel, the relcache update happens via
+	 * CommandCounterIncrement, not here.)
+	 */
+	if (res != LOCKACQUIRE_ALREADY_HELD)
+		AcceptInvalidationMessages();
+}
+
+/*
  *		ConditionalLockRelationOid
  *
  * As above, but only lock if we can get the lock without blocking.
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 723051e..e281870 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -674,7 +674,7 @@ LockAcquire(const LOCKTAG *locktag,
 			bool sessionLock,
 			bool dontWait)
 {
-	return LockAcquireExtended(locktag, lockmode, sessionLock, dontWait, true);
+	return LockAcquireExtended(locktag, lockmode, sessionLock, dontWait, true, true);
 }
 
 /*
@@ -691,7 +691,8 @@ LockAcquireExtended(const LOCKTAG *locktag,
 					LOCKMODE lockmode,
 					bool sessionLock,
 					bool dontWait,
-					bool reportMemoryError)
+					bool reportMemoryError,
+					bool log_lock)
 {
 	LOCKMETHODID lockmethodid = locktag->locktag_lockmethodid;
 	LockMethod	lockMethodTable;
@@ -704,7 +705,7 @@ LockAcquireExtended(const LOCKTAG *locktag,
 	uint32		hashcode;
 	LWLock	   *partitionLock;
 	int			status;
-	bool		log_lock = false;
+	bool		lock_logged = false;
 
 	if (lockmethodid <= 0 || lockmethodid >= lengthof(LockMethods))
 		elog(ERROR, "unrecognized lock method: %d", lockmethodid);
@@ -801,11 +802,11 @@ LockAcquireExtended(const LOCKTAG *locktag,
 	 */
 	if (lockmode >= AccessExclusiveLock &&
 		locktag->locktag_type == LOCKTAG_RELATION &&
-		!RecoveryInProgress() &&
+		log_lock &&
 		XLogStandbyInfoActive())
 	{
 		LogAccessExclusiveLockPrepare();
-		log_lock = true;
+		lock_logged = true;
 	}
 
 	/*
@@ -1028,7 +1029,7 @@ LockAcquireExtended(const LOCKTAG *locktag,
 	 * Emit a WAL record if acquisition of this lock needs to be replayed in a
 	 * standby server.
 	 */
-	if (log_lock)
+	if (lock_logged)
 	{
 		/*
 		 * Decode the locktag back to the original values, to avoid sending
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 493839f..97b3502 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -81,6 +81,7 @@ typedef struct HeapUpdateFailureData
 
 /* in heap/heapam.c */
 extern Relation relation_open(Oid relationId, LOCKMODE lockmode);
+extern Relation relation_open_temp(Oid relationId, LOCKMODE lockmode);
 extern Relation try_relation_open(Oid relationId, LOCKMODE lockmode);
 extern Relation relation_openrv(const RangeVar *relation, LOCKMODE lockmode);
 extern Relation relation_openrv_extended(const RangeVar *relation,
diff --git a/src/include/storage/lmgr.h b/src/include/storage/lmgr.h
index e9447b7..52bcbe1 100644
--- a/src/include/storage/lmgr.h
+++ b/src/include/storage/lmgr.h
@@ -38,6 +38,7 @@ extern void RelationInitLockInfo(Relation relation);
 
 /* Lock a relation */
 extern void LockRelationOid(Oid relid, LOCKMODE lockmode);
+extern void LockRelationOidTemp(Oid relid, LOCKMODE lockmode);
 extern bool ConditionalLockRelationOid(Oid relid, LOCKMODE lockmode);
 extern void UnlockRelationId(LockRelId *relid, LOCKMODE lockmode);
 extern void UnlockRelationOid(Oid relid, LOCKMODE lockmode);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 4c49e3c..75e7797 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -503,7 +503,8 @@ extern LockAcquireResult LockAcquireExtended(const LOCKTAG *locktag,
 					LOCKMODE lockmode,
 					bool sessionLock,
 					bool dontWait,
-					bool report_memory_error);
+					bool report_memory_error,
+					bool log_lock);
 extern void AbortStrongLockAcquire(void);
 extern bool LockRelease(const LOCKTAG *locktag,
 			LOCKMODE lockmode, bool sessionLock);


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic