
List:       ceph-users
Subject:    Re: [ceph-users] How to add 100 new OSDs...
From:       Paul Mezzanini <pfmeec () rit ! edu>
Date:       2019-07-28 13:16:06
Message-ID: 1564319765504.56978 () rit ! edu

I'll throw my $.02 in from when I was growing our cluster.

My method ended up being to script up the LVM creation so the LVM names reflect the OSD/journal drive serial numbers for easy physical location later, "ceph-volume prepare" the whole node to get it ready for insertion, followed by "ceph-volume activate".  I typically see more of an impact on performance from peering than from rebalancing.
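
A rough sketch of that kind of wrapper (the device glob, the VG/LV naming scheme, and the bluestore-only layout are illustrative assumptions, not the exact script):

    #!/bin/bash
    # Sketch: one LV per data disk, named after the drive serial so the
    # physical disk is easy to find later, then "prepare" the OSDs without
    # activating them yet.
    set -euo pipefail

    for dev in /dev/sd{b..z}; do
        [ -b "$dev" ] || continue
        serial=$(lsblk -dno SERIAL "$dev")
        vgcreate "ceph-${serial}" "$dev"
        lvcreate -l 100%FREE -n "osd-${serial}" "ceph-${serial}"
        ceph-volume lvm prepare --bluestore --data "ceph-${serial}/osd-${serial}"
    done

    # Later, when the node is ready to take data:
    # ceph-volume lvm activate --all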


If I'm doing a whole node, I make sure the node's weight is set to 0 and slowly walk it up in chunks.  If it's anything less, I just let it fly as-is.
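
Roughly, the walk-up looks like this (OSD ids, target weight, and step size are placeholders; it assumes the new OSDs were created at crush weight 0, e.g. via osd_crush_initial_weight = 0):

    #!/bin/bash
    # Sketch: raise new OSDs from crush weight 0 to their final weight in
    # small steps, waiting for the cluster to settle between steps.
    set -euo pipefail

    OSDS="100 101 102 103"   # OSDs on the new node
    TARGET=7.3               # final crush weight per OSD
    STEP=0.5

    for w in $(seq "$STEP" "$STEP" "$TARGET") "$TARGET"; do
        for id in $OSDS; do
            ceph osd crush reweight "osd.${id}" "$w"
        done
        # Block until backfill from this step has finished.
        until ceph health | grep -q HEALTH_OK; do
            sleep 60
        done
    done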


My workloads didn't seem to mind the increased latency during a huge rebalance, but another admin hosts some latency-sensitive VMs, and by moving the weight up slowly I could easily wait for things to settle if he saw the numbers get too high.  It's a simple knob twist that keeps another admin happy when doing storage changes, so I do it.


--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
o:(585) 475-3245 | pfmeec@rit.edu

CONFIDENTIALITY NOTE: The information transmitted, including attachments, is
intended only for the person(s) or entity to which it is addressed and may
contain confidential and/or privileged material. Any review, retransmission,
dissemination or other use of, or taking of any action in reliance upon this
information by persons or entities other than the intended recipient is
prohibited. If you received this in error, please contact the sender and
destroy any copies of this information.
------------------------

________________________________________
From: ceph-users <ceph-users-bounces@lists.ceph.com> on behalf of Anthony D'Atri <aad@dreamsnake.net>
Sent: Sunday, July 28, 2019 4:09 AM
To: ceph-users
Subject: Re: [ceph-users] How to add 100 new OSDs...

Paul Emmerich wrote:

> +1 on adding them all at the same time.
>
> All these methods that gradually increase the weight aren't really
> necessary in newer releases of Ceph.

Because the default backfill/recovery values are lower than they were in, say, Dumpling?
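
For reference, those throttles can be inspected and adjusted at run time; the option names below are the standard ones, the values are just examples:

    # Current values on a running OSD, via its admin socket:
    ceph daemon osd.0 config get osd_max_backfills
    ceph daemon osd.0 config get osd_recovery_max_active

    # Loosen or tighten them cluster-wide while a big change is in flight:
    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'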

Doubling (or more) the size of a cluster in one swoop still means a lot of peering and a lot of recovery I/O; I've seen a cluster's data rate go to or near 0 for a brief but nonzero length of time.  If something goes wrong with the network (cough cough, subtle jumbo frame lossage, cough), if one has fat-fingered something along the way, etc., going in increments means that a ^C lets the cluster stabilize before very long.  Then you get to troubleshoot with HEALTH_OK instead of HEALTH_WARN or HEALTH_ERR.
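
As an aside, the jumbo-frame failure mode is quick to rule out before (or between) increments; one sanity check, assuming a 9000-byte MTU on the cluster network:

    # 8972 = 9000 - 20 (IP header) - 8 (ICMP header); -M do forbids fragmentation,
    # so the ping only succeeds if the whole path honors the jumbo MTU.
    ping -M do -s 8972 -c 3 <peer-osd-host>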

Having experienced a cluster be DoS'd for hours when its size was tripled in one go, I'm once bitten, twice shy.  Yes, that was Dumpling, but even with SSDs on Jewel and Luminous I've seen significant client performance impact from en-masse topology changes.

-- aad

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com