List:       postgresql-admin
Subject:    Re: OOM killer and rocky linux
From:       kaido vaikla <kaido.vaikla () gmail ! com>
Date:       2023-11-09 14:59:41
Message-ID: CA+427g_0RV43B9ftVCrLZkpZr0++MrV+DHXaU58_Dqf+5y_MzQ () mail ! gmail ! com

Yep, I know it. But my question was more:
why (during OOM) does
pg on o/s Red Hat Enterprise Linux Server release 7.9 (Maipo)
end up with "Instance is reinitializing",
while
pg on o/s Rocky Linux release 9.2 (Blue Onyx)
gets "Instance received fast shutdown request."?

br
Kaido
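
For reference, the overcommit setting that the quoted reply recommends changing can be inspected from procfs. This is a minimal sketch, not something posted in the thread: the helper names `describe_overcommit` and `read_overcommit` are made up for illustration, and the mode labels follow the Linux kernel's overcommit-accounting documentation (the PostgreSQL docs linked below suggest mode 2).

```python
from pathlib import Path

# Meanings of vm.overcommit_memory per the Linux kernel documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw: str) -> str:
    """Turn the text of /proc/sys/vm/overcommit_memory into a readable label."""
    mode = int(raw.strip())
    return OVERCOMMIT_MODES.get(mode, f"unknown mode {mode}")


def read_overcommit(path: str = "/proc/sys/vm/overcommit_memory") -> str:
    """Read the current setting from procfs (Linux only)."""
    return describe_overcommit(Path(path).read_text())
```

On a Linux host, `print(read_overcommit())` shows which mode is active. It only reports the setting and does not explain the difference between the two hosts.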

On Thu, 9 Nov 2023 at 15:04, Matti Linnanvuori <
matti.linnanvuori@portalify.com> wrote:

> 19.4. Managing Kernel Resources
> <https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT>
> 
> Hello!
> 
> Disabling overcommit is recommended.
> 
> Best regards
> 
> kaido vaikla <kaido.vaikla@gmail.com> wrote on 9.11.2023 at 14.57:
> 
> Hi,
> 
> The problem is probably somewhere between pg and Linux.
> It seems the OOM killer on Rocky Linux behaves differently
> than on RHEL.
> 
> To invoke the OOM killer I used:
> set work_mem=50GB;
> select * from generate_series(1,100000000) order by random();
> 
> pg runs as a systemd service.
> 
> Example 1:
> ==========
> pg 15.2
> o/s Red Hat Enterprise Linux Server release 7.9 (Maipo)
> 
> pg log:
> 
> 2023-07-05 12:56:28.848 EEST [:::119279:postmaster] LOG:  server process
> (PID 120401) was terminated by signal 9: Killed
> 2023-07-05 12:56:28.848 EEST [:::119279:postmaster] LOG:  terminating any
> other active server processes
> 2023-07-05 12:56:28.850 EEST [:::119279:postmaster] LOG:  all server
> processes terminated; reinitializing
> 2023-07-05 12:56:28.885 EEST [:::120421:startup] LOG:  database system was
> interrupted; last known up at 2023-07-05 12:55:34 EEST
> 2023-07-05 12:56:28.899 EEST [:::120421:startup] LOG:  database system was
> not properly shut down; automatic recovery in progress
> 2023-07-05 12:56:28.904 EEST [:::120421:startup] LOG:  redo starts at
> 1/D0002E8
> 2023-07-05 12:56:28.904 EEST [:::120421:startup] LOG:  invalid record
> length at 1/D000320: wanted 24, got 0
> 2023-07-05 12:56:28.904 EEST [:::120421:startup] LOG:  redo done at
> 1/D0002E8 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> 2023-07-05 12:56:28.912 EEST [:::120422:checkpointer] LOG:  checkpoint
> starting: end-of-recovery immediate wait
> 2023-07-05 12:56:28.918 EEST [:::120422:checkpointer] LOG:  checkpoint
> complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0
> recycled; write=0.002 s, sync=0.002 s, total=0.013 s; sync files=2,
> longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB
> 
> 
> Instance is reinitializing
> 
> Example 2:
> ==========
> pg 15.2
> o/s Rocky Linux release 9.2 (Blue Onyx)
> 
> pg log and systemd log in chronological order (juuni == June, i.e. month 06):
> 
> juuni 30 14:19:11.491833 pgdb-forecast kernel:
> oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/postgresql-15.service,task=postmaster,pid=2575533,uid=26
> juuni 30 14:19:11.491839 pgdb-forecast kernel: Out of memory: Killed
> process 2575533 (postmaster) total-vm:9716200kB, anon-rss:5113080kB,
> file-rss:0kB, shmem-rss:16828kB, UID:26 pgtables:10256kB oom_score_adj:0
> juuni 30 14:19:11.491846 pgdb-forecast kernel: oom_reaper: reaped process
> 2575533 (postmaster), now anon-rss:0kB, file-rss:0kB, shmem-rss:16828kB
> juuni 30 14:19:11.478410 pgdb-forecast systemd[1]: postgresql-15.service:
> A process of this unit has been killed by the OOM killer.
> 2023-06-30 14:19:11.505 EEST [:::2517202:postmaster] LOG:  received fast
> shutdown request
> 2023-06-30 14:19:11.518 EEST [:::2517202:postmaster] LOG:  aborting any
> active transactions
> 2023-06-30 14:19:11.564 EEST [:::2517202:postmaster] LOG:  server process
> (PID 2575533) was terminated by signal 9: Killed
> 2023-06-30 14:19:11.564 EEST [:::2517202:postmaster] LOG:  terminating any
> other active server processes
> 2023-06-30 14:19:11.633 EEST [:::2517202:postmaster] LOG:  abnormal
> database system shutdown
> 2023-06-30 14:19:11.939 EEST [:::2517202:postmaster] LOG:  database system
> is shut down
> juuni 30 14:19:11.940216 pgdb-forecast systemd[1]: postgresql-15.service:
> Main process exited, code=exited, status=1/FAILURE
> juuni 30 14:19:11.940255 pgdb-forecast systemd[1]: postgresql-15.service:
> Killing process 2517203 (postmaster) with signal SIGKILL.
> juuni 30 14:19:11.940700 pgdb-forecast systemd[1]: postgresql-15.service:
> Failed with result 'oom-kill'.
> juuni 30 14:19:11.940884 pgdb-forecast systemd[1]: postgresql-15.service:
> Consumed 46min 16.612s CPU time.
> 
> 
> Instance received fast shutdown request.
> And a kill is sent twice: to process 2575533 (by the kernel OOM killer)
> and to process 2517203 (SIGKILL from systemd).
> 
> Googling this problem turns up nothing, and I'm not sure
> whether it is caused by pg or by the Linux kernel.
> 
> br
> Kaido
> 
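
One thing that may be worth checking, as an assumption not confirmed anywhere in this thread: newer systemd releases have a per-unit `OOMPolicy=` setting (see systemd.service(5)), and the systemd lines "A process of this unit has been killed by the OOM killer" and "Failed with result 'oom-kill'" in Example 2 look like that mechanism at work. The systemd shipped with RHEL 7 predates the setting. The sketch below only reads the property through `systemctl show`. The unit name is taken from the log above, and the helper names `parse_property` and `unit_oom_policy` are made up for illustration.

```python
import subprocess
from typing import Optional


def parse_property(show_output: str, name: str) -> Optional[str]:
    """Extract NAME from `systemctl show` output, which is KEY=VALUE per line."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == name:
            return value
    return None


def unit_oom_policy(unit: str = "postgresql-15.service") -> Optional[str]:
    """Ask systemd for the unit's OOMPolicy; None if the property is absent."""
    result = subprocess.run(
        ["systemctl", "show", "--property=OOMPolicy", unit],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_property(result.stdout, "OOMPolicy")
```

If `unit_oom_policy()` returns `stop` on the Rocky host and `None` on the RHEL 7 host, that would be consistent with systemd, not the kernel, requesting the shutdown after the backend was killed. Whether a drop-in with `OOMPolicy=continue` restores the RHEL 7 behaviour is untested here.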

