
List:       ceph-users
Subject:    [ceph-users] Re: PG repair leaving cluster unavailable
From:       Gesiel Galvão Bernardes <gesiel.bernardes@gmail.com>
Date:       2021-04-28 17:42:26
Message-ID: <CADAE32ODstEgxOCP3YSBcYii83Y0_Ye8qvXWgWjOnhrGNgse+A@mail.gmail.com>

Complementing the information: I'm running mimic (13.2) on this cluster. I
noticed that during the PG repair the entire cluster was extremely slow, yet
there was no excess load on the OSD nodes: their load average, normally
between 10.00 and 20.00 in production, stayed below 5. When the repair
finished (after about 4 hours), the cluster returned to normal.

Is this result expected?
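
In case it is useful to others hitting the same thing: as a rough sketch
(assuming mimic defaults; the values below are placeholders, not tested
recommendations), scrub/repair work can be throttled at runtime so it
competes less with client I/O, for example:

ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'   # add a pause between scrub/repair chunks
ceph tell osd.* injectargs '--osd_max_scrubs 1'      # at most one scrub per OSD (already the default)

I have not yet verified whether this would have avoided the slowdown in my case.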


On Tue, Apr 27, 2021 at 2:16 PM Gesiel Galvão Bernardes <
gesiel.bernardes@gmail.com> wrote:

> Hi,
>
> I have 3 pools, which I use exclusively for RBD images. Two of them are
> replicated and one is erasure coded. Today I received a warning that a PG
> in the erasure-coded pool was inconsistent, so I ran "ceph pg repair <pg>".
> After that, the entire cluster became extremely slow, to the point that no
> VM works.
>
>
> This is the output of "ceph -s":
> # ceph -s
>   cluster:
>     id: 4ea72929-6f9e-453a-8cd5-bb0712f6b874
>     health: HEALTH_ERR
>             1 scrub errors
>             Possible data damage: 1 pg inconsistent, 1 pg repair
>
>   services:
>     mon: 2 daemons, quorum cmonitor,cmonitor2
>     mgr: cmonitor (active), standbys: cmonitor2
>     osd: 87 osds: 87 up, 87 in
>     tcmu-runner: 10 active daemons
>
>   data:
>     pools: 7 pools, 3072 pgs
>     objects: 30.00 M objects, 113 TiB
>     usage: 304 TiB used, 218 TiB / 523 TiB avail
>     pgs: 3063 active+clean
>              8 active+clean+scrubbing+deep
>              1 active+clean+scrubbing+deep+inconsistent+repair
>
>   io:
>     client: 24 MiB/s rd, 23 MiB/s wr, 629 op/s rd, 519 op/s wr
>     cache: 5.9 MiB/s flush, 35 MiB/s evict, 9 op/s promote
>
> Does anyone have any idea how to make it available again?
>
> Regards,
> Gesiel
>
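
PS: for anyone who hits the same HEALTH_ERR, a sketch of how the inconsistent
PG can be located and inspected before deciding to repair (the <pgid> below is
a placeholder for the PG id reported by the cluster):

ceph health detail                                       # reports which PG is inconsistent
rados list-inconsistent-obj <pgid> --format=json-pretty  # shows the objects/shards that failed scrub
ceph pg repair <pgid>                                    # the repair command mentioned above
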
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io
