'Re: [Lustre-discuss] proc filesystem variables on MDS'

List:       lustre-announce
Subject:    Re: [Lustre-discuss] proc filesystem variables on MDS
From:       Andreas Dilger <adilger () clusterfs ! com>
Date:       2006-01-10 5:31:00
Message-ID: 20060110053100.GJ3682 () schatzie ! adilger ! int

On Jan 09, 2006  18:45 -0700, Kumaran Rajaram wrote:
> On an MDS node, what are the definitions of the following /proc fs variables:
> 
> i) filestotal
> ii) filesfree 
> iii) kbytesavail 
> iv) kbytesfree 
> v) kbytestotal 

They are basically what you would assume, with some caveats below.

> 1). I assumed filestotal is the grand total number of inodes that can be
> created and filesfree is the remaining/available inodes (such that
> filesfree < filestotal).
>
> On an empty Lustre FS:
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/filestotal 
> 214586 
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/filesfree 
> 214552 
> 
> When mounting the MDS via ext3, I see 47 files, while according to procfs
> there are a total of 34 files (214586-214552):
> n1:/mnt/scratch # ls -1aR /mnt/scratch/ | wc -l 
> 47 

The difference between filestotal and filesfree should in fact reflect the
number of files currently in use in the filesystem.  One complicating factor
is that ext3 reserves some inodes for internal use, and Lustre also uses
some inodes internally.

> I created 100 files, and surprisingly filestotal seems to increase while
> filesfree remains constant:
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/filestotal
> 214686 
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/filesfree 
> 214552 

filestotal is increasing while filesfree remains constant because filesfree
is the "pessimistic" (minimum) number of files that can be created in the
MDS filesystem.  The pessimistic estimate assumes that each file needs to
store an EA in an external block, so it is limited by the number of free
blocks in the filesystem: kbytesfree / 4 (the blocksize is 4096 bytes),
which in your case is 858208 / 4 = 214552.
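
As a rough sketch of that arithmetic (the function name is made up for
illustration, and this is not the actual MDS code), assuming 4096-byte
blocks and one external EA block per new file:

```python
BLOCK_SIZE_KB = 4  # 4096-byte blocks, as stated above


def pessimistic_files_free(kbytes_free: int, block_size_kb: int = BLOCK_SIZE_KB) -> int:
    """Worst case: every new file needs one external EA block,
    so the number of creatable files is the number of free blocks."""
    return kbytes_free // block_size_kb


# kbytesfree value taken from the /proc output quoted in this thread
print(pessimistic_files_free(858208))  # 214552
```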

Because the number of inodes is actually slightly more than this (it will
be kbytestotal / 4 with 4096-byte blocks, so 874872 / 4 = 218718), there is
a period when the filesystem is first being used during which there are
more free inodes than free blocks.  Once you are past this threshold,
filestotal and filesfree will be the "real" numbers.
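
To make this concrete, here is the same reasoning applied to the values
you posted.  This is only an illustration: the helper name is hypothetical,
and it assumes (as described above) that filestotal minus filesfree is the
number of inodes in use.

```python
def files_in_use(files_total: int, files_free: int) -> int:
    """Inodes in use, taken as filestotal minus filesfree."""
    return files_total - files_free


# Values from the two /proc snapshots quoted in this thread
before = files_in_use(214586, 214552)  # empty filesystem
after = files_in_use(214686, 214552)   # after creating 100 files

print(before, after, after - before)   # 34 134 100

# Inode estimate from kbytestotal with 4 KB blocks, versus the
# block-limited filesfree reported while free blocks are the constraint
inode_estimate = 874872 // 4
print(inode_estimate, inode_estimate - 214552)  # 218718 4166
```

So filestotal grows by exactly the 100 files created, while filesfree stays
pinned at the block-based estimate.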

> 2). With respect to kbytesfree and kbytesavail, I saw previous discussions.
> I guess that is related to Lustre as a whole:
> 
> https://lists.clusterfs.com/pipermail/lustre-discuss/2005-July/000818.html
> How about the same parameter with respect to MDS ? The variables seem to
> have same values before and after the creation of 100 files. 
> ---------- 
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/kbytestotal 
> 874872 
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/kbytesfree 
> 858208 
> n1:~ # cat /proc/fs/lustre/mds/mds-scratch/kbytesavail 
> 808208 

Because ext3 is formatted so that "default" striped files (as defined by
--stripe_cnt at Lustre format time) fit into the "fast" EA space of a
large inode, they do not normally take up any filesystem space.  What
DOES take up space is wide-striped files and directories, other EAs (in
the upcoming 1.4.6 it is possible to store arbitrary EAs on inodes), and
Lustre internal files, logs, etc.

> n1:~ # dumpe2fs /datadir/metadata_scratch | grep ^Inode | grep size 
> dumpe2fs 1.34 (25-Jul-2003) 
> Inode size:               512 

If you had done "dumpe2fs /datadir/metadata_scratch | grep -i inode" you
would have gotten the "real" numbers for filestotal and filesfree
("Inode count:" and "Free inodes:").
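
If you want to pull those two fields out in a script, something along these
lines should work.  It is a minimal sketch: the function name is
hypothetical, and the sample text below uses made-up numbers, not output
from your filesystem.

```python
import re


def parse_inode_counts(dumpe2fs_output: str) -> dict:
    """Extract the "Inode count:" and "Free inodes:" superblock fields."""
    fields = {}
    for line in dumpe2fs_output.splitlines():
        m = re.match(r"^(Inode count|Free inodes):\s+(\d+)\s*$", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


# Made-up sample in the same layout as the dumpe2fs superblock listing
SAMPLE = """\
Inode count:              1000
Free inodes:              966
Inode size:               512
"""

counts = parse_inode_counts(SAMPLE)
print(counts["Inode count"] - counts["Free inodes"])  # 34
```

On a real system you would feed it the text printed by
"dumpe2fs -h /datadir/metadata_scratch" (the -h option limits the output
to the superblock information).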

> I assumed since the default inode size is 512bytes, creating 100
> files would decrease the kbytesavail and kbytesfree on the MDS (since
> 512 bytes is consumed per inode with default striping).

Ext3 preallocates all of the inodes at mkfs time, so creating a file does
not consume any additional space for the inode itself.  This is one reason
that ext3 has very good fsck support: the inodes (and most other fs
metadata, except directories) are in a fixed location, so the kernel knows
it should never overwrite the metadata and e2fsck knows where to find it
regardless of other filesystem corruption.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.clusterfs.com
https://lists.clusterfs.com/mailman/listinfo/lustre-discuss
