
List:       ceph-users
Subject:    [ceph-users] Re: Re: How to get RBD volume to PG mapping?
From:       megov () yuterra ! ru (Megov Igor Alexandrovich)
Date:       2015-09-28 8:19:45
Message-ID: b4042cdb979a426d9d9f7aa0f71cf6de () ex-1 ! yuterra ! ru

Hi!

Ilya Dryomov wrote:
>Internally there is a way to list objects within a specific PG
>(actually more than one way IIRC), but I don't think anything like that
>is exposed in a CLI (it might be exposed in librados though).  Grabbing
>an osdmap and iterating with osdmaptool --test-map-object over
>rbd_data.<prefix>.* is probably the fastest way for you to get what you
>want.

Yes, I dumped the osdmap, did 'rados ls' to dump all object names into a file, and started a simple
shell script that reads the object list and runs osdmaptool. It is surprisingly slow -
it has been running since Friday afternoon and has processed only 5,000,000 objects
out of more than 11,000,000. So maybe I'll try to dig deeper into the librados
headers and write some homemade tool.
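
For reference, the loop in question looks roughly like this (the pool name
'rbd', pool id 0 and the rbd_data <prefix> are placeholders):

    ceph osd getmap -o /tmp/osdmap            # grab the osdmap once
    rados -p rbd ls > /tmp/objects.txt        # list every object in the pool
    grep '^rbd_data\.<prefix>\.' /tmp/objects.txt | while read -r obj; do
        # prints the object -> PG -> [OSDs] mapping, computed locally from the saved map
        osdmaptool /tmp/osdmap --test-map-object "$obj" --pool 0
    done > /tmp/obj-to-pg.txt

Each iteration forks a new osdmaptool process and re-reads the saved osdmap,
which adds up quickly over ~11,000,000 objects.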

David Burley wrote:
>So figure out which OSDs are active for the PG, and run the find in the subdir 
>for the placement group on one of those. It should run really fast unless you
>have tons of tiny objects in the PG.
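
Concretely, that suggestion amounts to something like the following on one of
the acting OSD hosts (the OSD id, PG id and rbd_data <prefix> are placeholders,
and the path assumes a default filestore layout):

    # filestore keeps each PG's objects under <pgid>_head in the OSD's data dir
    find /var/lib/ceph/osd/ceph-<id>/current/<pgid>_head -name '*<prefix>*'

Matching on the prefix alone is the safer pattern, since filestore escapes some
characters (such as underscores) in its on-disk file names.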

I think finding objects in the directory structure is a good way, but only for a healthy
cluster, where object placement is not changing. In my case, for some strange reason,
I can't figure out all three OSDs for this one PG. After a node crash I have this one PG
in a degraded state: it has only two replicas, while the pool min_size=3.

Even stranger, I can't force it to repair - neither 'ceph pg repair' nor an OSD
restart helped me recover the PG. In health detail I can see only two OSDs for this PG.
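
For reference, the checks involved here look roughly like this (<pgid> is a
placeholder for the degraded PG's id):

    ceph health detail | grep <pgid>   # state of the degraded PG
    ceph pg map <pgid>                 # up/acting OSD sets the PG maps to
    ceph pg <pgid> query               # detailed per-PG state, including peering info
    ceph pg repair <pgid>              # the repair attempt mentioned above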




Megov Igor
CIO, Yuterra



________________________________________
From: Ilya Dryomov <idryomov at gmail.com>
Sent: 25 September 2015 18:21
To: Megov Igor Alexandrovich
Cc: David Burley; Jan Schermer; ceph-users
Subject: Re: [ceph-users] Re: How to get RBD volume to PG mapping?

On Fri, Sep 25, 2015 at 5:53 PM, Megov Igor Alexandrovich
<megov at yuterra.ru> wrote:
> Hi!
>
> Thanks!
>
> I have some suggestions for the 1st method:
>
>>You could get the name prefix for each RBD from rbd info,
> Yes, I did that already in steps 1 and 2. I forgot to mention that I grabbed
> the rbd prefix from the 'rbd info' command.
>
>
>>then list all objects (run find on the osds?) and then you just need to
>> grep the OSDs for each prefix.
> So, you advise running find over ssh on all the OSD hosts to traverse the OSD
> filesystems and find the files (objects) named with the rbd prefix? Am I right?
> If so, I have two thoughts: (1) it may not be that fast either, because even
> when limiting find to the rbd prefix and pool index, it has to recursively walk
> the whole OSD filesystem hierarchy; and (2) find will put additional load on
> the OSD drives.
>
>
> The second method is more attractive and I will try it soon. Since we have the
> object name, and can get the crushmap in some usable form to inspect ourselves,
> or indirectly through a library/API call, finding the object-to-PG-to-OSD chain
> is a local computational task, and it can be done without remote calls
> (accessing OSD hosts, running find, etc.).
>
> Also, the slowness of looping through 'ceph osd map <pool> <object>' can be
> explained: for every object we have to spawn a process, connect to the cluster
> (with auth), receive the maps on the client, calculate the placement, and ...
> finally throw it all away when the process exits. I think this overhead is the
> main reason for the slowness.

Internally there is a way to list objects within a specific PG
(actually more than one way IIRC), but I don't think anything like that
is exposed in a CLI (it might be exposed in librados though).  Grabbing
an osdmap and iterating with osdmaptool --test-map-object over
rbd_data.<prefix>.* is probably the fastest way for you to get what you
want.

Thanks,

                Ilya

