'[Gluster-users] Gluster's read performance'

List:       gluster-users
Subject:    [Gluster-users] Gluster's read performance
From:       joe () julianfamily ! org (Joe Julian)
Date:       2012-09-21 6:52:16
Message-ID: 9kvckjydbscs9t8h462j6fdc.1348208653801 () email ! android ! com

"Small files" is sort of a misconception. Initial file ops include a small amount of
overhead. With a lookup, the filename is hashed, the dht subvolume is selected, and
the request is sent to that subvolume. If it's a replica, the request is sent to each
replica in that subvolume set (usually 2), and all the replicas have to respond. If
one or more have pending flags or there's an attribute mismatch, either some
self-heal action has to take place or a split-brain is determined. If the file
doesn't exist on that subvolume, the same lookup must be sent to all the subvolumes.
If the file is found, a link file is made on the expected dht subvolume pointing to
the place we found the file. This will make finding it faster the next time. Once
the file is found and is determined to be clean, the file system can move on to the
next file operation.
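
As a rough illustration of the lookup path described above, here is a toy Python
model. It is not Gluster's code: the Subvolume class, the crc32-modulo hash and the
"queried" counter are all made up for the example, and replication, pending flags
and self-heal are left out entirely. It only shows why a miss on the hashed
subvolume costs more than a hit, and why the link file makes the next lookup cheaper.

```python
import zlib


class Subvolume:
    """Toy stand-in for one DHT subvolume (hypothetical, not a Gluster type)."""

    def __init__(self, name):
        self.name = name
        self.files = set()    # names of files actually stored here
        self.linkfiles = {}   # name -> subvolume that really holds the file


def hashed_subvolume(filename, subvolumes):
    # Assumption: crc32 modulo the subvolume count stands in for the real hash.
    return subvolumes[zlib.crc32(filename.encode()) % len(subvolumes)]


def lookup(filename, subvolumes):
    """Return (subvolume holding the file or None, subvolumes queried)."""
    expected = hashed_subvolume(filename, subvolumes)
    queried = 1  # the hashed subvolume is always asked first

    if filename in expected.files:
        return expected, queried

    if filename in expected.linkfiles:
        # A link file left by an earlier lookup points at the real location.
        return expected.linkfiles[filename], queried + 1

    # Miss on the hashed subvolume: the lookup goes to every other subvolume.
    others = [sv for sv in subvolumes if sv is not expected]
    queried += len(others)
    found = next((sv for sv in others if filename in sv.files), None)
    if found is not None:
        # Record a link file so the next lookup can skip the broadcast.
        expected.linkfiles[filename] = found
    return found, queried
```

With four subvolumes in this model, a file sitting on the "wrong" subvolume costs
four queries the first time and two afterwards, while a file that doesn't exist
anywhere costs four queries every single time.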

PHP applications, specifically, normally have a lot of small files that are opened
for every page request, so per page that overhead adds up. PHP also queries a lot of
files that just don't exist. Your single page might query 200 files that just aren't
there. They're in a different portion of the search path, or they're a plugin that's
not used, etc.

NFS mitigates that effect by using FScache in the kernel. It stores directories and
stats, preventing the call to the actual filesystem. This also means, of course, that
the image that was just uploaded through a different server isn't going to exist on
this one until the cache times out. Stale data in a multi-client system has to be
expected with a caching client.

Jeff Darcy created a test translator that caches negative lookups, which he said also
mitigated the PHP problem pretty nicely.
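
To show the general idea of caching negative lookups (this is not Jeff's translator,
just a minimal Python sketch), the wrapper below remembers "not found" answers for a
short time. The class name, the ttl parameter and the backend_lookup callable are
all assumptions for the example.

```python
import time


class NegativeLookupCache:
    """Remember paths that were recently not found (hypothetical sketch)."""

    def __init__(self, backend_lookup, ttl=1.0, clock=time.monotonic):
        self._lookup = backend_lookup  # callable: path -> result, or None if missing
        self._ttl = ttl                # seconds a "not found" answer stays valid
        self._clock = clock            # injectable so the cache can be tested
        self._missing = {}             # path -> time at which the entry expires

    def lookup(self, path):
        now = self._clock()
        expiry = self._missing.get(path)
        if expiry is not None and now < expiry:
            # Known-missing and still fresh: skip the expensive backend call.
            return None
        result = self._lookup(path)
        if result is None:
            self._missing[path] = now + self._ttl
        else:
            self._missing.pop(path, None)
        return result

    def invalidate(self, path):
        # Call this when a file is created so it becomes visible right away.
        self._missing.pop(path, None)
```

The trade-off is the same one as with the NFS client cache: a file created through
another client stays invisible here until the entry expires or is invalidated.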

If you have control over your app, things like absolute pathing for PHP or leaving
file descriptors open can also avoid overhead. Also, optimizing the number of times
you open a file or the number of files to open can help.

So "small files" refers to the percentage of total file op time that's spent on
overhead vs actual data retrieval.
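
To put that ratio in concrete terms, here is a small Python helper. The function
name and any numbers you feed it are illustrative assumptions, not measurements from
a real Gluster volume.

```python
def overhead_fraction(overhead_ms, size_bytes, throughput_bytes_per_s):
    """Fraction of total file op time spent on per-file overhead.

    overhead_ms: fixed per-file cost (lookup, replica checks, ...), assumed.
    size_bytes: size of the file being read.
    throughput_bytes_per_s: assumed sustained transfer rate.
    """
    transfer_ms = size_bytes / throughput_bytes_per_s * 1000.0
    return overhead_ms / (overhead_ms + transfer_ms)
```

Assuming, purely for illustration, 1 ms of overhead and 100 MB/s of throughput, a
100 KB file spends about half its time on overhead, while a 5 MB file spends about
2% there. By that measure the 400KB to 5MB files in the question are not very
"small".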

Chandan Kumar <chandank.kumar at gmail.com> wrote:

> Hello All,
> 
> I am new to gluster and evaluating it for my production environment. After
> reading some blogs and googling I learned that NFS mount at clients give
> better read performance for small files and the glusterfs/FUSE mount gives
> better for large write operations.
> 
> Now my questions are
> 
> 1) What do we mean by small files? 1KB/1MB/1GB?
> 2) If I am using NFS mount at the client I am most likely losing the high
> availability feature of gluster, unlike fuse mount where if primary goes
> down I don't need to worry about availability.
> 
> Basically my production environment will mostly have read operations of
> files ranging from 400KB to 5MB and they will be concurrently read by
> different threads.
> 
> Thanks,
> Chandan
> 