List:       ceph-users
Subject:    [ceph-users] dealing with unfound pg in 4:2 ec pool
From:       "Szabo, Istvan (Agoda)" <Istvan.Szabo () agoda ! com>
Date:       2021-09-30 20:01:16
Message-ID: HK0PR01MB26113C82BCDA5EEABEF8786785AA9 () HK0PR01MB2611 ! apcprd01 ! prod ! exchangelabs ! com

Hi,

If I set the pool's min_size to 4, will this PG be recovered? If not, how can I get
the cluster out of a health error like this? Marking the objects as lost seems risky
based on what I have read on the mailing list: even after marking them lost, people
still had issues. So I am curious what the right way is to get the cluster out of
this state and let it recover.

Example problematic pg:
dumped pgs_brief
PG_STAT  STATE                                                 UP                 UP_PRIMARY  ACTING                              ACTING_PRIMARY
28.5b    active+recovery_unfound+undersized+degraded+remapped  [18,33,10,0,48,1]  18          [2147483647,2147483647,29,21,4,47]  29
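
To make the ACTING column above easier to read: 2147483647 (0x7fffffff) is the placeholder Ceph prints for a shard slot that currently has no OSD. The following is only a rough Python sketch for counting those slots; the helper names are made up for illustration, and k=4, m=2 are assumed from the "4:2" in the subject line. It counts shard slots only and says nothing about whether each individual object is present on the remaining shards, which is exactly what "unfound" is about.

```python
# Placeholder Ceph prints for a shard slot with no OSD assigned (0x7fffffff).
NO_OSD = 0x7FFFFFFF


def parse_osd_set(text):
    """Parse an OSD set such as '[2147483647,29,21]'; empty slots become None."""
    ids = [int(tok) for tok in text.strip().strip("[]").split(",") if tok.strip()]
    return [None if osd == NO_OSD else osd for osd in ids]


def shard_summary(acting_text, k, m):
    """Count present and missing shard slots for an EC pool with k data + m coding shards.

    k and m are assumptions supplied by the caller; they are not read from the cluster.
    """
    acting = parse_osd_set(acting_text)
    present = sum(1 for osd in acting if osd is not None)
    return {
        "present": present,
        "missing": (k + m) - present,
        "has_k_shards": present >= k,  # slot count only, not per-object availability
    }


# ACTING set of pg 28.5b as pasted above, assuming a 4+2 profile.
summary = shard_summary("[2147483647,2147483647,29,21,4,47]", k=4, m=2)
print(summary)
```

For pg 28.5b this reports 4 slots present and 2 missing, i.e. the acting set is down to exactly k shards with no redundancy left.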

Cluster state:
  cluster:
    id:     5a07ec50-4eee-4336-aa11-46ca76edcc24
    health: HEALTH_ERR
            10 OSD(s) experiencing BlueFS spillover
            4/1055070542 objects unfound (0.000%)
            noout flag(s) set
            Possible data damage: 2 pgs recovery_unfound
            Degraded data redundancy: 64150765/6329079237 objects degraded (1.014%), 10 pgs degraded, 26 pgs undersized
            4 pgs not deep-scrubbed in time

  services:
    mon: 3 daemons, quorum mon-2s01,mon-2s02,mon-2s03 (age 2M)
    mgr: mon-2s01(active, since 2M), standbys: mon-2s03, mon-2s02
    osd: 49 osds: 49 up (since 36m), 49 in (since 4d); 28 remapped pgs
         flags noout
    rgw: 3 daemons active (mon-2s01.rgw0, mon-2s02.rgw0, mon-2s03.rgw0)

  task status:

  data:
    pools:   9 pools, 425 pgs
    objects: 1.06G objects, 66 TiB
    usage:   158 TiB used, 465 TiB / 623 TiB avail
    pgs:     64150765/6329079237 objects degraded (1.014%)
             38922319/6329079237 objects misplaced (0.615%)
             4/1055070542 objects unfound (0.000%)
             393 active+clean
             13  active+undersized+remapped+backfill_wait
             8   active+undersized+degraded+remapped+backfill_wait
             3   active+clean+scrubbing
             3   active+undersized+remapped+backfilling
             2   active+recovery_unfound+undersized+degraded+remapped
             2   active+remapped+backfill_wait
             1   active+clean+scrubbing+deep

  io:
    client:   181 MiB/s rd, 9.4 MiB/s wr, 5.38k op/s rd, 2.42k op/s wr
    recovery: 23 MiB/s, 389 objects/s


Thank you.