[prev in list] [next in list] [prev in thread] [next in thread]
List: postgresql-hackers
Subject: Re: [HACKERS] CLUSTER command progress monitor
From: Masahiko Sawada <sawada.mshk () gmail ! com>
Date: 2019-08-30 10:45:57
Message-ID: CAD21AoCHnNRR671CbOBBXbLjZMSZk7thcO2riHpcHd7i-kCArQ () mail ! gmail ! com
[Download RAW message or body]
On Thu, Aug 15, 2019 at 12:48 PM Tatsuro Yamada
<tatsuro.yamada.tf@nttcom.co.jp> wrote:
>
> Hi Michael, Alvaro and Robert!
>
> On 2019/08/14 11:52, Michael Paquier wrote:
> > On Wed, Aug 14, 2019 at 11:38:01AM +0900, Tatsuro Yamada wrote:
> >> On 2019/08/13 14:40, Tatsuro Yamada wrote:
> >>> On 2019/08/02 3:43, Alvaro Herrera wrote:
> >>>> Hmm, I'm trying this out now and I don't see the index_rebuild_count
> >>>> ever go up. I think it's because the indexes are built using parallel
> >>>> index build ... or maybe it was the table AM changes that moved things
> >>>> around, not sure. There's a period at the end when the CLUSTER command
> >>>> keeps working, but it's gone from pg_stat_progress_cluster.
> >>>
> >>> Thanks for your report.
> >>> I'll investigate it. :)
> >>
> >> I did "git bisect" and found the commit:
> >>
> >> 03f9e5cba0ee1633af4abe734504df50af46fbd8
> >> Report progress of REINDEX operations
> >
> > I am adding an open item for this one.
> > --
> > Michael
>
>
> Okay, I checked it on the wiki.
>
> https://wiki.postgresql.org/wiki/PostgreSQL_12_Open_Items
> - index_rebuild_count in CLUSTER reporting never increments
>
> To be clear, 03f9e5cb broke CLUSTER progress reporting, but
> I investigated little more and share my ideas to fix the problem.
>
> * Call stack
> ========================================
> cluster_rel
> pgstat_progress_start_command(CLUSTER) *A1
> rebuild_relation
> finish_heap_swap
> reindex_relation
> reindex_index
> pgstat_progress_start_command(CREATE_INDEX) *B1
> pgstat_progress_end_command() *B2
> pgstat_progress_update_param(INDEX_REBUILD_COUNT, i) <- failed :(
> pgstat_progress_end_command() *A2
>
> Note
> These are sets:
> A1 and A2,
> B1 and B2
> ========================================
>
>
> * Ideas to fix
> There are Three options, I guess.
> ========================================
> 1. Call pgstat_progress_start_command(CLUSTER) again
> before pgstat_progress_update_param(INDEX_REBUILD_COUNT, i).
>
> 2. Add "save and restore" functions for the following two
> variables of MyBeentry in pgstat.c.
> - st_progress_command
> - st_progress_command_target
>
> 3. Use Hash or List to store multiple values for the two
> variables in pgstat.c.
> ========================================
>
>
> I tried 1. and it shown index_rebuild_count, but it also shown
> "initializing" phase again and other columns were empty. So, it is
> not suitable to fix the problem. :(
> I'm going to try 2. and 3., but, unfortunately, I can't get enough
> time to do that after PGConf.Asia 2019.
> If we selected 3., it affects following these progress reporting
> features: VACUUM, CLUSTER, CREATE_INDEX and ANALYZE. But it's okay,
> I suppose. Any comments welcome! :)
I looked at this open item. I prefer #3 but I think it would be enough
to have a small stack using a fixed length array to track nested
progress information and the current executing command (maybe 4 or 8
would be enough for maximum nested level for now?). That way, we don't
need any change the interfaces. AFAICS there is no case where we call
only either pgstat_progress_start_command or
pgstat_progress_end_command without each other (although
pgstat_progress_update_param is called without start). So I think that
having a small stack for tracking multiple reports would work.
Attached the draft patch that fixes this issue. Please review it.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
["track_nested_command_progress.patch" (text/x-patch)]
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index d362e7f7d7..99e4844e6c 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3016,8 +3016,10 @@ pgstat_bestart(void)
#endif
lbeentry.st_state = STATE_UNDEFINED;
- lbeentry.st_progress_command = PROGRESS_COMMAND_INVALID;
- lbeentry.st_progress_command_target = InvalidOid;
+ MemSet(&(lbeentry.st_progress_cmds), 0,
+ sizeof(PgBackendProgressInfo) * PGSTAT_MAX_PROGRESS_INFO);
+ /* Set invalid command index */
+ lbeentry.st_current_cmd = -1;
/*
* we don't zero st_progress_param here to save cycles; nobody should
@@ -3203,10 +3205,22 @@ pgstat_progress_start_command(ProgressCommandType cmdtype, Oid relid)
if (!beentry || !pgstat_track_activities)
return;
+ Assert(beentry->st_current_cmd >= -1);
+
+ /* The given command is already started */
+ if (beentry->st_progress_cmds[beentry->st_current_cmd].command == cmdtype)
+ return;
+
+ /* The progress information queue is full */
+ if (beentry->st_current_cmd >= PGSTAT_MAX_PROGRESS_INFO - 1)
+ elog(ERROR, "progress information per backends is full");
+
PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
- beentry->st_progress_command = cmdtype;
- beentry->st_progress_command_target = relid;
- MemSet(&beentry->st_progress_param, 0, sizeof(beentry->st_progress_param));
+ beentry->st_current_cmd++;
+ MemSet(&(beentry->st_progress_cmds[beentry->st_current_cmd]),
+ 0, sizeof(PgBackendProgressInfo));
+ beentry->st_progress_cmds[beentry->st_current_cmd].command = cmdtype;
+ beentry->st_progress_cmds[beentry->st_current_cmd].target = relid;
PGSTAT_END_WRITE_ACTIVITY(beentry);
}
@@ -3223,11 +3237,11 @@ pgstat_progress_update_param(int index, int64 val)
Assert(index >= 0 && index < PGSTAT_NUM_PROGRESS_PARAM);
- if (!beentry || !pgstat_track_activities)
+ if (!beentry || !pgstat_track_activities || beentry->st_current_cmd < 0)
return;
PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
- beentry->st_progress_param[index] = val;
+ beentry->st_progress_cmds[beentry->st_current_cmd].params[index] = val;
PGSTAT_END_WRITE_ACTIVITY(beentry);
}
@@ -3245,7 +3259,8 @@ pgstat_progress_update_multi_param(int nparam, const int *index,
volatile PgBackendStatus *beentry = MyBEEntry;
int i;
- if (!beentry || !pgstat_track_activities || nparam == 0)
+ if (!beentry || !pgstat_track_activities || nparam == 0 ||
+ beentry->st_current_cmd < 0)
return;
PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
@@ -3254,7 +3269,8 @@ pgstat_progress_update_multi_param(int nparam, const int *index,
{
Assert(index[i] >= 0 && index[i] < PGSTAT_NUM_PROGRESS_PARAM);
- beentry->st_progress_param[index[i]] = val[i];
+ beentry->st_progress_cmds[beentry->st_current_cmd].params[index[i]] =
+ val[i];
}
PGSTAT_END_WRITE_ACTIVITY(beentry);
@@ -3274,13 +3290,18 @@ pgstat_progress_end_command(void)
if (!beentry)
return;
- if (!pgstat_track_activities
- && beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
+
+ if (!pgstat_track_activities || beentry->st_current_cmd < 0 ||
+ beentry->st_progress_cmds[beentry->st_current_cmd].command ==
+ PROGRESS_COMMAND_INVALID)
return;
PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
- beentry->st_progress_command = PROGRESS_COMMAND_INVALID;
- beentry->st_progress_command_target = InvalidOid;
+ beentry->st_progress_cmds[beentry->st_current_cmd].command =
+ PROGRESS_COMMAND_INVALID;
+ beentry->st_progress_cmds[beentry->st_current_cmd].target =
+ InvalidOid;
+ beentry->st_current_cmd--;
PGSTAT_END_WRITE_ACTIVITY(beentry);
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 05240bfd14..76ae2e0115 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -495,6 +495,7 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
Datum values[PG_STAT_GET_PROGRESS_COLS];
bool nulls[PG_STAT_GET_PROGRESS_COLS];
int i;
+ int cmdidx;
MemSet(values, 0, sizeof(values));
MemSet(nulls, 0, sizeof(nulls));
@@ -510,7 +511,18 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
* Report values for only those backends which are running the given
* command.
*/
- if (!beentry || beentry->st_progress_command != cmdtype)
+ if (!beentry)
+ continue;
+
+ /* Look up the progress information of the given command */
+ for (cmdidx = 0; cmdidx <= beentry->st_current_cmd; cmdidx++)
+ {
+ if (beentry->st_progress_cmds[cmdidx].command == cmdtype)
+ break;
+ }
+
+ /* Skip if the command is not running */
+ if (cmdidx > beentry->st_current_cmd)
continue;
/* Value available to all callers */
@@ -520,9 +532,9 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
/* show rest of the values including relid only to role members */
if (has_privs_of_role(GetUserId(), beentry->st_userid))
{
- values[2] = ObjectIdGetDatum(beentry->st_progress_command_target);
+ values[2] = ObjectIdGetDatum(beentry->st_progress_cmds[cmdidx].target);
for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
- values[i + 3] = Int64GetDatum(beentry->st_progress_param[i]);
+ values[i + 3] = Int64GetDatum(beentry->st_progress_cmds[cmdidx].params[i]);
}
else
{
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index fe076d823d..b87f253777 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -953,13 +953,14 @@ typedef enum
*/
typedef enum ProgressCommandType
{
- PROGRESS_COMMAND_INVALID,
+ PROGRESS_COMMAND_INVALID = 0,
PROGRESS_COMMAND_VACUUM,
PROGRESS_COMMAND_CLUSTER,
PROGRESS_COMMAND_CREATE_INDEX
} ProgressCommandType;
#define PGSTAT_NUM_PROGRESS_PARAM 20
+#define PGSTAT_MAX_PROGRESS_INFO 4
/* ----------
* Shared-memory data structures
@@ -1010,6 +1011,18 @@ typedef struct PgBackendGSSStatus
} PgBackendGSSStatus;
+/*
+ * Struct for command progress information of one command. The 'target'
+ * should be the OID of the relation which the command targets (we assume
+ * there's just one, as this is meant for utility commands), but the
+ * meaning of each element in the params array is command-specific.
+ */
+typedef struct PgBackendProgressInfo
+{
+ ProgressCommandType command;
+ Oid target;
+ int64 params[PGSTAT_NUM_PROGRESS_PARAM];
+} PgBackendProgressInfo;
/* ----------
* PgBackendStatus
@@ -1085,17 +1098,16 @@ typedef struct PgBackendStatus
char *st_activity_raw;
/*
- * Command progress reporting. Any command which wishes can advertise
- * that it is running by setting st_progress_command,
- * st_progress_command_target, and st_progress_param[].
- * st_progress_command_target should be the OID of the relation which the
- * command targets (we assume there's just one, as this is meant for
- * utility commands), but the meaning of each element in the
- * st_progress_param array is command-specific.
+ * Command progress reporting. Since it's possible to nested call
+ * multiple commands that support progress report, e.g. CLUSTER
+ * executes REINDEX inside, we track progress of multiple commands.
+ * st_current_cmd is the index of st_progress_cmds of current executing
+ * command, -1 for invalid. Any command which wishes can advertise
+ * that is is running by increasing st_current_cmd and setting
+ * the corresponding elements of st_progress_cmd.
*/
- ProgressCommandType st_progress_command;
- Oid st_progress_command_target;
- int64 st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];
+ int st_current_cmd;
+ PgBackendProgressInfo st_progress_cmds[PGSTAT_MAX_PROGRESS_INFO];
} PgBackendStatus;
/*
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic