
List:       lustre-discuss
Subject:    [Lustre-discuss] Web serving with Lustre
From:       adilger () clusterfs ! com (Andreas Dilger)
Date:       2006-12-29 1:00:43
Message-ID: 20061229080040.GI5937 () schatzie ! adilger ! int

On Dec 28, 2006  16:18 -0500, Ivanov, Zlatin wrote:
> - When we crawl the Lustre file system, say for indexing or
> regular-expression substitution purposes, processing about 1MM files
> takes ~50 min over NFS, and ~2 hrs on Lustre, for an identical set of
> files.

What is your average file size?  Lustre isn't really tuned yet for
workloads with millions of small files.

> ls -1 /proc/fs/lustre/ldlm/namespaces/*/lru_size
> /proc/fs/lustre/ldlm/namespaces/OSC_lustre1_ost1p_mds-prod/lru_size
> /proc/fs/lustre/ldlm/namespaces/OSC_lustre1_ost2p_mds-prod/lru_size
> cat /proc/fs/lustre/ldlm/namespaces/*/lru_size
> 1000
> 1000

lru_size has no meaning on the MDS node.

> On the clients:
> 
> ls -1 /proc/fs/lustre/ldlm/namespaces/*/lru_size
> /proc/fs/lustre/ldlm/namespaces/MDC_lustre1_mds-prod_MNT_client-prod-f7f23600/lru_size
> /proc/fs/lustre/ldlm/namespaces/OSC_lustre1_ost1p_MNT_client-prod-f7f23600/lru_size
> /proc/fs/lustre/ldlm/namespaces/OSC_lustre1_ost2p_MNT_client-prod-f7f23600/lru_size
> cat /proc/fs/lustre/ldlm/namespaces/*/lru_size
> 400
> 400
> 400

If you are re-using many files, I would suggest increasing these to at
least 5000 for the MDC and 2500 for each OSC, if not 2-5x that.  Since
you have a very small cluster, the number of locks held by the clients
isn't going to hurt the servers, and it will allow you to cache much more
content on the clients.
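
A minimal sketch of applying those values on a client, assuming the
/proc layout shown in the listing above.  The helper name set_lru_sizes
and its optional third argument (an alternate namespace directory, handy
for a dry run) are hypothetical, not part of Lustre:

```shell
#!/bin/sh
# Hypothetical helper: write one lru_size for the MDC namespace(s) and
# another for the OSC namespaces.  Run as root on each client.
# Usage: set_lru_sizes MDC_SIZE OSC_SIZE [NAMESPACE_DIR]
set_lru_sizes() {
    mdc_size=$1
    osc_size=$2
    # Assumed default: the path shown in the listing above.
    ns_dir=${3:-/proc/fs/lustre/ldlm/namespaces}

    for ns in "$ns_dir"/*; do
        # Skip anything that does not expose an lru_size file.
        [ -e "$ns/lru_size" ] || continue
        case "$(basename "$ns")" in
            MDC_*) echo "$mdc_size" > "$ns/lru_size" ;;
            OSC_*) echo "$osc_size" > "$ns/lru_size" ;;
        esac
    done
}
```

For example, "set_lru_sizes 5000 2500" would apply the minimums suggested
above.  Values written through /proc are likely to be lost on remount, so
this would probably need to be rerun after each mount.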

> /proc/fs/lustre/llite/fs0/max_read_ahead_mb
> 4
> 
> /proc/fs/lustre/llite/fs0/max_read_ahead_whole_mb
> 1

> 1) I am not sure about max_read_ahead_whole_mb - it's unclear to me
> whether 0 or 1 should be preferred here.

This means "files smaller than 1MB will be read in their entirety on the
first read".  There isn't really any point in making this smaller.
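
To check how many of your files fall under that 1MB threshold, and to
answer the average file size question above, something like the sketch
below may help.  It assumes GNU find (for -printf); the function name
file_size_stats and its optional byte-limit argument are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: report file count, average size in bytes, and how
# many files are smaller than LIMIT_BYTES (default 1 MiB = 1048576).
# Usage: file_size_stats DIR [LIMIT_BYTES]
file_size_stats() {
    dir=$1
    limit=${2:-1048576}
    # GNU find prints one size in bytes per regular file.
    find "$dir" -type f -printf '%s\n' |
        awk -v limit="$limit" '
            { n++; total += $1; if ($1 < limit) small++ }
            END {
                if (n == 0) { print "files=0"; exit }
                printf "files=%d avg_bytes=%d below_limit=%d\n",
                       n, total / n, small
            }'
}
```

Note that on a tree with about a million files this walk is itself a
metadata-heavy crawl, so it may take a while on Lustre.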

> 2) Should I prefer 2500 for
> /proc/fs/lustre/ldlm/namespaces/MDC*/lru_size and 1000 or 1500 for
> /proc/fs/lustre/ldlm/namespaces/OSC*/lru_size?

At least, yes.

> 3) If yes, should I be setting this across the board, on the clients
> only, or on the MDS/OSSs only?

On the clients only.

> 4) In general, should I still stick to avoiding mounting with flock
> unless explicitly required?

Well, there are occasional problems with the flock code, but if you
do not enable it consistently across your cluster, some nodes may not
cooperate in the locking correctly.  I'd enable it across the board;
if flock is unused on some nodes, then no harm done.
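
One way to check that consistency on each node is sketched below.  It
assumes the client reports "flock" among the mount options in
/proc/mounts, which may not hold for every Lustre version; the function
name lustre_mounts_without_flock is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of every lustre-type mount
# whose option list lacks "flock".  Empty output means all mounts have it.
# Usage: lustre_mounts_without_flock [MOUNTS_FILE]
lustre_mounts_without_flock() {
    mounts_file=${1:-/proc/mounts}
    # Fields: device, mount point, fs type, options, dump, pass.
    awk '$3 == "lustre" {
        n = split($4, opts, ",")
        found = 0
        for (i = 1; i <= n; i++)
            if (opts[i] == "flock") found = 1
        if (!found) print $2
    }' "$mounts_file"
}
```

Running this on every client and confirming that the output is empty
everywhere would show that flock is enabled across the board.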

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
