List: ceph-users
Subject: [ceph-users] Re: monitor sst files continue growing
From: Wido den Hollander <wido@42on.com>
Date: 2020-10-30 8:39:31
Message-ID: 3b08c6b1-e993-e127-260d-dfc2ba153317@42on.com
On 29/10/2020 19:29, Zhenshi Zhou wrote:
> Hi Alex,
>
> We found that there were a huge number of keys under the "logm" and "osdmap"
> prefixes while inspecting the store with ceph-monstore-tool. I think that
> could be the root cause.
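>
> (For reference, counting keys per prefix can be done along these lines,
> assuming the default mon data path and a stopped mon, or a copy of the
> store made with "store-copy"; the mon name is an example:)
>
>   # count keys per prefix in the mon store
>   ceph-monstore-tool /var/lib/ceph/mon/ceph-mon1 dump-keys | \
>       awk '{print $1}' | sort | uniq -c | sort -rn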
>
But that is exactly how Ceph works. The cluster might need that very old
OSDMap to get all the PGs clean again: an OSD which has been gone for a very
long time needs those old maps to catch up before its PGs can become clean.
If not all PGs are active+clean you can expect the MON databases to grow
rapidly, because the mons will not trim the old maps.
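If you want to see how far behind the trimming is and how big the store
has become, something like this should do (field names assuming a
reasonably recent release):

  # range of osdmap epochs the mons are still holding on to;
  # a very large gap means the old maps are not being trimmed
  ceph report 2>/dev/null | grep -E '"osdmap_(first|last)_committed"'

  # size of the mon store on each monitor host
  du -sh /var/lib/ceph/mon/*/store.db
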
Therefore I always deploy 1TB SSDs in all Monitors. They are not expensive
anymore and they give you breathing room.
I also always deploy dedicated physical machines for the Monitors, just to
prevent cases like this.
Wido
> Well, some pages also say that disabling the 'insights' module can resolve
> this issue, but I checked our cluster and we didn't enable that module. See
> this page <https://tracker.ceph.com/issues/39955>.
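>
> (If it were enabled, disabling it would be something along the lines of
> "ceph mgr module disable insights".)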
>
> Anyway, our cluster is unhealthy at the moment; it just needs time to keep
> recovering the data :)
>
> Thanks
>
> Alex Gracie <alexandergracie17@gmail.com> wrote on Thursday, October 29,
> 2020 at 10:57 PM:
>
>> We hit this issue over the weekend on our HDD-backed EC Nautilus cluster
>> while removing a single OSD. We also did not have any luck using
>> compaction. The mon logs filled up the entire root disk on the mon servers,
>> and we were running on a single monitor for hours while we tried to finish
>> recovery and reclaim space. The past couple of weeks we also noticed "pg not
>> scrubbed in time" warnings, but we are unsure if they are related. I'm still
>> unsure of the exact cause of this (other than the general misplaced/degraded
>> objects) and what kind of growth is acceptable for these store.db files.
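>>
>> (For clarity, by compaction I mean the usual approaches, with the mon name
>> as an example:)
>>
>>   # ask a running mon to compact its store
>>   ceph tell mon.mon1 compact
>>
>>   # or compact the store at startup, via ceph.conf on the mon host
>>   [mon]
>>   mon_compact_on_start = true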
>>
>> In order to get our downed mons restarted, we ended up backing up and
>> copying the /var/lib/ceph/mon/* contents to a remote host, setting up an
>> sshfs mount to that new host with large NVMe and SSD drives, ensuring the
>> mount paths were owned by ceph, and then clearing up enough space on the
>> monitor host to start the service. This allowed our store.db directory to
>> grow freely until the misplaced/degraded objects could recover, and the
>> monitors all eventually rejoined.
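>>
>> (Roughly, the relocation looked like the following; the mon name, remote
>> host and overflow path are placeholders:)
>>
>>   systemctl stop ceph-mon@mon1
>>   # copy the mon data to a host with plenty of fast disk
>>   rsync -a /var/lib/ceph/mon/ceph-mon1/ nvme-host:/srv/mon-overflow/
>>   # after verifying the copy, free the local space and mount the copy
>>   # in place, mapping ownership to the ceph user
>>   rm -rf /var/lib/ceph/mon/ceph-mon1/*
>>   sshfs -o allow_other,uid=$(id -u ceph),gid=$(id -g ceph) \
>>       nvme-host:/srv/mon-overflow /var/lib/ceph/mon/ceph-mon1
>>   systemctl start ceph-mon@mon1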
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io