List:       gluster-users
Subject:    [Gluster-users] striping - only for big files and how to tune
From:       p.gotwalt@uci.ru.nl (P. Gotwalt)
Date:       2011-01-21 11:42:53
Message-ID: 002701cbb960$57474a80$05d5df80$ () gotwalt () uci ! ru ! nl

Hi,

Using glusterfs 3.1.1 with a 4-node striped volume:
# gluster volume info
 
Volume Name: testvol
Type: Stripe
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: node20.storage.xx.nl:/data1
Brick2: node30.storage.xx.nl:/data1
Brick3: node40.storage.xx.nl:/data1
Brick4: node50.storage.xx.nl:/data1
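
For reference, this is roughly how such a volume is created and mounted with
the standard CLI (a sketch, the exact commands are from memory; mounting via
node20 is just an assumption):

gluster volume create testvol stripe 4 transport tcp \
    node20.storage.xx.nl:/data1 node30.storage.xx.nl:/data1 \
    node40.storage.xx.nl:/data1 node50.storage.xx.nl:/data1
gluster volume start testvol

# on the client, mount point /gluster:
mount -t glusterfs node20.storage.xx.nl:/testvol /gluster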

To do a quick performance test, I copied /usr to the gluster volume:
[root@drbd10.storage ~]# time rsync -avzx --quiet /usr /gluster
real    5m54.453s
user    2m1.026s
sys     0m9.979s
[root@drbd10.storage ~]#
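
Since striping is only expected to pay off for large files, I also want to
compare this with a single big sequential write, along these lines (a sketch,
not run yet; /gluster/bigfile is just an example name):

dd if=/dev/zero of=/gluster/bigfile bs=1M count=4096
# check how the 4 GB ends up on the bricks (blocks used vs. apparent size):
mpssh -f s2345.txt 'ls -ls /data1/bigfile'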

To see whether this operation was successful, I checked the number of files and
the number of used blocks on the storage bricks. I expected these to be the same
on all bricks, because I use a striped configuration. The results are:

Number of files seen on the client:
[root@drbd10.storage ~]# find /gluster/usr -ls | wc -l
57517

Number of files seen on the storage bricks:
# mpssh -f s2345.txt 'find /data1/usr -ls | wc -l'                                 
  [*] read (4) hosts from the list
  [*] executing "find /data1/usr -ls | wc -l" as user "root" on each
  [*] spawning 4 parallel ssh sessions
 
node20 -> 57517
node30 -> 55875
node40 -> 55875
node50 -> 55875

Why does node20 have all the files, while the others seem to be missing quite a lot?
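
To find out which files the other bricks are missing, I was going to diff the
per-brick file lists, roughly like this (the /tmp/files.* names are just for
illustration):

# on each brick:
find /data1/usr -type f | sort > /tmp/files.$(hostname -s)
# after copying the lists to one host, e.g. files on node20 but not on node30:
comm -23 /tmp/files.node20 /tmp/files.node30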

The same comparison, but now for the actually used storage blocks:
On the client:
[root@drbd10.storage ~]# du -sk /gluster/usr
1229448 /gluster/usr

On the storage bricks:
# mpssh -f s2345.txt 'du -sk /data1/usr'                                           
  [*] read (4) hosts from the list
  [*] executing "du -sk /data1/usr" as user "root" on each
  [*] spawning 4 parallel ssh sessions
 
node20 -> 1067784       /data1/usr
node30 -> 535124        /data1/usr
node40 -> 437896        /data1/usr
node50 -> 405920        /data1/usr

In total: 2446724 KB on the bricks, versus 1229448 KB seen on the client.
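
To check whether part of this difference is sparse-file overhead (holes in the
per-brick files), I intend to compare apparent size with real disk usage per
brick; with GNU du that would look something like:

mpssh -f s2345.txt 'du -sk --apparent-size /data1/usr; du -sk /data1/usr'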

My conclusions:
- All data is written to the first brick. If a file is smaller than the chunk
size, there is nothing to stripe, so the first storage brick fills up with all
the small files. Question: does the filesystem stop working when the first
brick is full?

- When using striping, the overhead seems to be almost 50%, and this can get
worse as the first node fills up. Question: what is the size of the stripe
chunk, and can it be tuned to match the average file size? A sketch of what I
have in mind follows below.
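
As far as I can tell from the documentation, the chunk size is a per-volume
option (default 128KB, if I read it correctly). I have not tried changing it,
so the key and value format below are an assumption on my side:

gluster volume set testvol cluster.stripe-block-size 2MB
gluster volume info testvol
# and to keep an eye on the free space per brick:
mpssh -f s2345.txt 'df -h /data1'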

All in all, glusterfs seems to be better suited for "big" files. Is there an
average file size above which glusterfs becomes the better choice?

Greetings

Peter Gotwalt

