
List:       ceph-users
Subject:    [ceph-users] Re: Bluestore with so many small files
From:       <LJshoot () hotmail ! com>
Date:       2019-04-28 7:56:39
Message-ID: HK0PR01MB2946EB17304B5A33AEC4AE19A6380 () HK0PR01MB2946 ! apcprd01 ! prod ! exchangelabs ! com

OK. Thanks.

I had thought that restarting the OSD would be enough to make it work.

________________________________
From: Frédéric Nass <frederic.nass@univ-lorraine.fr>
Sent: 23 April 2019 14:05
To: 刘 俊
Cc: ceph-users
Subject: Re: [ceph-users] Bluestore with so many small files

Hi,

You probably forgot to recreate the OSD after changing bluestore_min_alloc_size.

Regards,
Frédéric.

----- On 22 Apr 2019, at 5:41, 刘 俊 <LJshoot@hotmail.com> wrote:

Hi All,

I still see this issue with the latest Ceph Luminous releases, 12.2.11 and 12.2.12.

I set bluestore_min_alloc_size = 4096 before the test.

When I write 100000 small objects of up to 64KB each through rgw, the RAW USED shown in
"ceph df" looks incorrect.

For example, I ran the test three times, cleaning up the rgw data pool each time. The
object size was 4KB the first time, 32KB the second time, and 64KB the third time.

The RAW USED shown in "ceph df" was the same (18GB) every time, which looks like it is
always equal to 64KB*100000/1024*3, i.e. about 18750MB (the replication factor is 3 here).

Any thoughts?

Jamie
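
The 18GB figure in the message above is consistent with every object occupying one full 64KB allocation unit per replica. The following is only a rough sketch of that arithmetic, assuming each object is rounded up to a whole number of allocation units and ignoring RGW metadata and RocksDB overhead; the function name and parameters are illustrative and not part of Ceph.

```python
KIB = 1024
GIB = 1024 ** 3


def raw_used_bytes(object_size, object_count, replicas, min_alloc_size):
    """Bytes consumed when each object is rounded up to whole allocation units."""
    units = -(-object_size // min_alloc_size)  # ceiling division
    return units * min_alloc_size * object_count * replicas


# Compare the three test sizes under a 64 KiB and a 4 KiB allocation unit.
for size_kib in (4, 32, 64):
    for alloc_kib in (64, 4):
        used = raw_used_bytes(size_kib * KIB, 100_000, 3, alloc_kib * KIB)
        print(f"object={size_kib:>2} KiB  min_alloc={alloc_kib:>2} KiB  "
              f"raw={used / GIB:.2f} GiB")
```

With a 64 KiB unit all three object sizes give the same result (about 18.3 GiB), whereas with a 4 KiB unit the result would scale with object size. That matches the explanation that the OSDs in the test were still using the 64K allocation size they were created with.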

_______________________________________________

Hi Behnam,

On 2/12/2018 4:06 PM, Behnam Loghmani wrote:
> Hi there,
> 
> I am using ceph Luminous 12.2.2 with:
> 
> 3 osds (each osd is 100G) - no WAL/DB separation.
> 3 mons
> 1 rgw
> cluster size 3
> 
> I stored lots of thumbnails with very small size on ceph with radosgw.
> 
> Actual size of files is something about 32G but it filled 70G of each osd.
> 
> what's the reason of this high disk usage?
Most probably the major reason is BlueStore allocation granularity. E.g.
an object 1K bytes long needs 64K of disk space if the default
bluestore_min_alloc_size_hdd (=64K) is applied.
An additional inconsistency in space reporting might also appear, since
BlueStore adds the DB volume space when accounting for total store space,
while free space is taken from the Block device only. As a result, the
reported "Used" space always contains the total DB space ( i.e.
Used = Total(Block+DB) - Free(Block) ). That correlates with other
comments in this thread about RocksDB space usage.
There is a pending PR to fix that:
https://github.com/ceph/ceph/pull/19454/commits/144fb9663778f833782bdcb16acd707c3ed62a86
You may also look for "Bluestore: inaccurate disk usage statistics problem"
on this mailing list for previous discussion.
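
As a rough illustration of the two effects described above, the sketch below rounds an object up to the allocation unit and applies the Used = Total(Block+DB) - Free(Block) formula. The device sizes are hypothetical numbers chosen for the example, and the function names are illustrative, not BlueStore APIs.

```python
KIB = 1024
GIB = 1024 ** 3


def allocated_size(object_size, min_alloc_size):
    """Round an object's size up to a whole number of allocation units."""
    units = -(-object_size // min_alloc_size)  # ceiling division
    return units * min_alloc_size


def reported_used(block_total, db_total, block_free):
    """Used = Total(Block+DB) - Free(Block), as stated in the reply above."""
    return (block_total + db_total) - block_free


# Hypothetical OSD: 100 GiB block device, 10 GiB DB volume,
# 32 GiB actually allocated on the block device.
block_total = 100 * GIB
db_total = 10 * GIB
data_on_block = 32 * GIB
block_free = block_total - data_on_block

print(allocated_size(1 * KIB, 64 * KIB) // KIB)                # 64
print(reported_used(block_total, db_total, block_free) / GIB)  # 42.0
```

In this example the reported figure exceeds the space allocated on the block device by exactly the size of the DB volume, regardless of how full the DB actually is.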

> should I change "bluestore_min_alloc_size_hdd"? and If I change it and
> set it to smaller size, does it impact on performance?
Unfortunately I haven't benchmarked "small writes over hdd" cases much,
hence I don't have an exact answer here. Indeed, the 'min_alloc_size'
family of parameters might impact performance quite significantly.
> 
> what is the best practice for storing small files on bluestore?
> 
> Best regards,
> Behnam Loghmani


> 
> On Mon, Feb 12, 2018 at 5:06 PM, David Turner <drakonstein at gmail.com> wrote:
> Some of your overhead is the WAL and RocksDB that are on the OSDs.
> The WAL is pretty static in size, but RocksDB grows with the number
> of objects you have. You also have copies of the osdmap on each OSD.
> There's just overhead that adds up. The biggest is going to be
> RocksDB, given how many objects you have.
> 
> 
> On Mon, Feb 12, 2018, 8:06 AM Behnam Loghmani
> <behnam.loghmani at gmail.com> wrote:
> Hi there,
> 
> I am using ceph Luminous 12.2.2 with:
> 
> 3 osds (each osd is 100G) - no WAL/DB separation.
> 3 mons
> 1 rgw
> cluster size 3
> 
> I stored lots of thumbnails with very small size on ceph with
> radosgw.
> 
> Actual size of files is something about 32G but it filled 70G of
> each osd.
> 
> what's the reason of this high disk usage?
> should I change "bluestore_min_alloc_size_hdd"? and If I change
> it and set it to smaller size, does it impact on performance?
> 
> what is the best practice for storing small files on bluestore?
> 
> Best regards,
> Behnam Loghmani
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

