List: ceph-users
Subject: [ceph-users] "outed" 10+ OSDs, recovery was fast (300+Mbps) until it wasn't (<1Mbps)
From: David Young <davidy@funkypenguin.co.nz>
Date: 2022-05-31 10:16:31
Message-ID: CAHsx7_MpV4=a-a1v8CTp0aGwOvhh+W1VKsvHAA2zx1PDsK-4zA@mail.gmail.com
Hey guys!
I've got a cluster with 90 OSDs spread across 5 hosts, most of which are
HDD-based. After some real-world testing, performance was not up to
expectations, and as I started researching, I realized that I _should_ have
used my locally attached NVMes as bluestore db devices.
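For reference, my rough plan for rebuilding each OSD is something like the
below (the device paths are placeholders, not my actual disks):
---
# Recreate an HDD OSD with its bluestore db on a local NVMe (example paths)
ceph-volume lvm create --bluestore --data /dev/sdX --block.db /dev/nvme0n1pY
---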
So, I decided to "out" all the OSDs on one node, wait for recovery, and
then delete and recreate those OSDs using a separate metadata device. The
recovery process was relatively fast (>300Mbps) until the end, at which
point it dropped to <1Mbps. Interestingly, the number of misplaced
objects is gradually *growing*...
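The "out" step itself was nothing fancy, along these lines (OSD IDs per the
tree below), with noout set first:
---
ceph osd set noout
# Mark every OSD on stg05 out so its PGs backfill onto the other hosts
ceph osd out 9 15 20 26 31 38 44 53 57 62 66 73
---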
Here's what "ceph -s" shows me:
---
  cluster:
    id:     4f4d6b12-7036-42d2-9366-8c99e4897391
    health: HEALTH_WARN
            insufficient standby MDS daemons available
            noout flag(s) set
            131 pgs not deep-scrubbed in time
            87 pgs not scrubbed in time
            3 daemons have recently crashed

  services:
    mon: 3 daemons, quorum b,d,e (age 20h)
    mgr: a(active, since 20h)
    mds: 4/4 daemons up, 2 hot standby
    osd: 77 osds: 77 up (since 8h), 56 in (since 5d); 33 remapped pgs
         flags noout
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 2/2 healthy
    pools:   15 pools, 401 pgs
    objects: 43.43M objects, 91 TiB
    usage:   122 TiB used, 536 TiB / 659 TiB avail
    pgs:     942074/154910213 objects misplaced (0.608%)
             359 active+clean
             32  active+clean+remapped
             9   active+clean+scrubbing+deep
             1   active+remapped+backfilling

  io:
    client:   120 MiB/s rd, 17 MiB/s wr, 151 op/s rd, 319 op/s wr
    recovery: 1.7 MiB/s, 0 objects/s

  progress:
    Global Recovery Event (0s)
      [............................]
---
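To see which PG is the straggler, I've been querying something like:
---
# List the PGs that are still remapped / backfilling
ceph pg ls remapped
ceph pg ls backfilling
---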
And here's "ceph osd tree" (I outed all the SSD OSDs on some of my
hyperconverged hosts, and all disks on stg05):
---
ID   CLASS  WEIGHT     TYPE NAME        STATUS  REWEIGHT  PRI-AFF
 -1         840.67651  root default
 -3           0.93149      host node01
  0    ssd    0.93149          osd.0        up         0  1.00000
-11           0.93149      host node03
  4    ssd    0.93149          osd.4        up         0  1.00000
 -5           0.93149      host node04
  1    ssd    0.93149          osd.1        up         0  1.00000
 -7           0.93149      host node05
  2    ssd    0.93149          osd.2        up         0  1.00000
 -9           0.93149      host node06
  3    ssd    0.93149          osd.3        up         0  1.00000
-13           0.93149      host node07
  5    ssd    0.93149          osd.5        up         0  1.00000
-15           0.93149      host node08
  6    ssd    0.93149          osd.6        up         0  1.00000
-25         131.90070      host stg01
  7    hdd   10.91409          osd.7        up   1.00000  1.00000
 13    hdd   10.91409          osd.13       up   1.00000  1.00000
 14    hdd   10.91409          osd.14       up   1.00000  1.00000
 19    hdd   10.91409          osd.19       up   1.00000  1.00000
 23    hdd   10.91409          osd.23       up   1.00000  1.00000
 25    hdd   10.91409          osd.25       up   1.00000  1.00000
 30    hdd   10.91409          osd.30       up   1.00000  1.00000
 36    hdd   10.91409          osd.36       up   1.00000  1.00000
 39    hdd   10.91409          osd.39       up   1.00000  1.00000
 43    hdd   10.91409          osd.43       up   1.00000  1.00000
 48    hdd   10.91409          osd.48       up   1.00000  1.00000
 50    hdd   10.91409          osd.50       up   1.00000  1.00000
 34    ssd    0.46579          osd.34       up   1.00000  1.00000
 55    ssd    0.46579          osd.55       up   1.00000  1.00000
-31         175.56384      host stg02
 12    hdd   14.55269          osd.12       up   1.00000  1.00000
 18    hdd   14.55269          osd.18       up   1.00000  1.00000
 24    hdd   14.55269          osd.24       up   1.00000  1.00000
 29    hdd   14.55269          osd.29       up   1.00000  1.00000
 35    hdd   14.55269          osd.35       up   1.00000  1.00000
 41    hdd   14.55269          osd.41       up   1.00000  1.00000
 46    hdd   14.55269          osd.46       up   1.00000  1.00000
 52    hdd   14.55269          osd.52       up   1.00000  1.00000
 60    hdd   14.55269          osd.60       up   1.00000  1.00000
 64    hdd   14.55269          osd.64       up   1.00000  1.00000
 68    hdd   14.55269          osd.68       up   1.00000  1.00000
 72    hdd   14.55269          osd.72       up   1.00000  1.00000
  8    ssd    0.46579          osd.8        up   1.00000  1.00000
 58    ssd    0.46579          osd.58       up   1.00000  1.00000
-37         175.56384      host stg03
 11    hdd   14.55269          osd.11       up   1.00000  1.00000
 17    hdd   14.55269          osd.17       up   1.00000  1.00000
 21    hdd   14.55269          osd.21       up   1.00000  1.00000
 28    hdd   14.55269          osd.28       up   1.00000  1.00000
 32    hdd   14.55269          osd.32       up   1.00000  1.00000
 40    hdd   14.55269          osd.40       up   1.00000  1.00000
 45    hdd   14.55269          osd.45       up   1.00000  1.00000
 51    hdd   14.55269          osd.51       up   1.00000  1.00000
 56    hdd   14.55269          osd.56       up   1.00000  1.00000
 61    hdd   14.55269          osd.61       up   1.00000  1.00000
 65    hdd   14.55269          osd.65       up   1.00000  1.00000
 69    hdd   14.55269          osd.69       up   1.00000  1.00000
 74    ssd    0.46579          osd.74       up   1.00000  1.00000
 76    ssd    0.46579          osd.76       up   1.00000  1.00000
-34         175.56384      host stg04
 10    hdd   14.55269          osd.10       up   1.00000  1.00000
 16    hdd   14.55269          osd.16       up   1.00000  1.00000
 22    hdd   14.55269          osd.22       up   1.00000  1.00000
 27    hdd   14.55269          osd.27       up   1.00000  1.00000
 37    hdd   14.55269          osd.37       up   1.00000  1.00000
 42    hdd   14.55269          osd.42       up   1.00000  1.00000
 47    hdd   14.55269          osd.47       up   1.00000  1.00000
 54    hdd   14.55269          osd.54       up   1.00000  1.00000
 59    hdd   14.55269          osd.59       up   1.00000  1.00000
 63    hdd   14.55269          osd.63       up   1.00000  1.00000
 67    hdd   14.55269          osd.67       up   1.00000  1.00000
 71    hdd   14.55269          osd.71       up   1.00000  1.00000
 33    ssd    0.46579          osd.33       up   1.00000  1.00000
 75    ssd    0.46579          osd.75       up   1.00000  1.00000
-28         175.56384      host stg05
  9    hdd   14.55269          osd.9        up         0  1.00000
 15    hdd   14.55269          osd.15       up         0  1.00000
 20    hdd   14.55269          osd.20       up         0  1.00000
 26    hdd   14.55269          osd.26       up         0  1.00000
 31    hdd   14.55269          osd.31       up         0  1.00000
 38    hdd   14.55269          osd.38       up         0  1.00000
 44    hdd   14.55269          osd.44       up         0  1.00000
 53    hdd   14.55269          osd.53       up         0  1.00000
 57    hdd   14.55269          osd.57       up         0  1.00000
 62    hdd   14.55269          osd.62       up         0  1.00000
 66    hdd   14.55269          osd.66       up         0  1.00000
 73    hdd   14.55269          osd.73       up         0  1.00000
 49    ssd    0.46579          osd.49       up         0  1.00000
 70    ssd    0.46579          osd.70       up         0  1.00000
---
How can I speed up / fix the recovery of this final PG?
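Would bumping the usual recovery/backfill knobs even make a difference here,
e.g. something like the below (values purely illustrative), or is the
bottleneck elsewhere?
---
# Illustrative only: allow more concurrent backfills and recovery ops per OSD
ceph config set osd osd_max_backfills 4
ceph config set osd osd_recovery_max_active 8
---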
Thanks! :)
D
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io