'Re: [Lustre-discuss] Performance Measurement'

List:       lustre-announce
Subject:    Re: [Lustre-discuss] Performance Measurement
From:       "Iozone" <capps () iozone ! org>
Date:       2006-01-16 19:11:02
Message-ID: 147701c61ad0$9807de10$1500000a () americas ! hpqcorp ! net

Felix,

    Corrections and comments below:

----- Original Message ----- 
>From: "Felix, Evan J" <Evan.Felix@pnl.gov>
>To: "Don Capps" <Don.Capps@hp.com>; "Ressa" <ressa4299@yahoo.com>; 
><lustre-discuss@clusterfs.com>
>Sent: Monday, January 16, 2006 10:40 AM
>Subject: RE: [Lustre-discuss] Performance Measurement
>

>Ok, I can accept that on small IO jobs 'dd' may not be the best
>benchmark, but if you use a large enough dataset, it should still give
>you reliable numbers.   You can pretty much ignore the 'overhead' of
>starting and stopping the command.  From your point below:
>
>        a. Reading from /dev/zero
>Dev zero is really fast.  REALLY fast.  On my little laptop here I can
>copy 35.1 GB/s.  When we get to the point that reading from /dev/zero is
>slower than network speed, I may buy this argument.

    Perhaps OK if using large files; not so OK if there are lots of
    opens and closes.

>        b. Page faults caused  by executing "dd" and the page faults
>            of the internal buffers inside of dd.
>If your memory load on the benchmark client is so bad that we start page
>faulting the DD, is any benchmark going to work well?

    The dd binary itself must be paged in from disk. This, again,
    is not part of the I/O you were attempting to measure.
    Iozone's page-ins are handled outside of the measurement.

>        c. The CPU time consumed by the command dd for all of its
>            internal operations.
>And Benchmarks like IOZone don't do anything internally?
>They all probably do.

    Iozone is designed to do I/O and to use as little CPU as
    possible in doing so.  dd is not.

>        d. The loading of dd, and process tear down of the command dd.
>Same with any benchmark

    No... Iozone measures the I/O while the benchmark is running;
    it does not include any process startup or teardown in the
    measurement. /bin/time dd will include startup and teardown times
    in the measurement.
    Note: Some systems have very long startup/teardown times. I've
        seen some that are in seconds.
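
    To illustrate the difference, here is a minimal Python sketch that
    starts the clock only after the process is up, the buffer is built,
    and the file is open. It is not Iozone's code; the function name
    timed_write, the block size, and the choice to count fsync() inside
    the timed region are all assumptions made for the example.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, block_size=1 << 20):
    """Return (bytes_written, seconds) for the write loop only."""
    buf = b"\0" * block_size  # buffer is built before the clock starts
    # The open/create happens outside the timed region.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    written = 0
    try:
        start = time.perf_counter()
        while written < total_bytes:
            n = min(block_size, total_bytes - written)
            written += os.write(fd, buf[:n])
        os.fsync(fd)  # assumption: the flush to storage is counted
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)  # close and process exit are not timed
    return written, elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        nbytes, secs = timed_write(os.path.join(d, "probe.dat"), 8 << 20)
        print(f"{nbytes / secs / 1e6:.1f} MB/s over {secs:.4f} s")
```

    Timing the same work from outside, as /bin/time does, would add
    interpreter or binary load time, the open, the close, and the exit
    to the reported figure.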

>        e. The fork, and exit time of a process.
>Overhead. Inconsequential if your dataset is sufficiently
>large

    Agreed, provided the dataset is large enough to amortize the
    overhead away and the system doesn't have abnormally
    high fork/exit times (as some systems do).
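
    The amortization argument is simple arithmetic. The sketch below
    assumes a fixed overhead (startup, open, teardown) added to the
    transfer time; the 100 MB/s rate and 2 second overhead are
    hypothetical numbers chosen for illustration, not measurements.

```python
def apparent_throughput(data_mb, true_mb_per_s, overhead_s):
    """Throughput an outside timer reports when fixed overhead is included."""
    return data_mb / (data_mb / true_mb_per_s + overhead_s)


if __name__ == "__main__":
    # Hypothetical: the storage sustains 100 MB/s, fixed overhead is 2 s.
    for size_mb in (100, 1_000, 10_000, 100_000):
        rate = apparent_throughput(size_mb, 100.0, 2.0)
        print(f"{size_mb:>7} MB -> {rate:6.1f} MB/s reported")
```

    With these numbers a 100 MB run reports about a third of the real
    rate, while a 100,000 MB run reports within one percent of it.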

>        f. The open/create is also in the measurement.
>Overhead.

    Yes, and the open() system call is one of the most
    expensive system calls that exist.

>        g. The test only produces results for an initial write, which
> includes metadata overhead.
>Overhead.

    Initial writes may not be what your application does. So it
    makes sense to measure both initial writes (with meta-data overhead)
    and re-writes (without the meta-data overhead).
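
    A rough Python sketch of measuring the two cases separately: the
    first pass creates the file, the second reopens it without
    truncating and writes over the existing data. The helper names
    write_pass and initial_and_rewrite are made up for this example,
    and counting fsync() in the timing is an assumption.

```python
import os
import tempfile
import time


def write_pass(path, total_bytes, flags, block_size=1 << 20):
    """Time one sequential write pass over the file opened with flags."""
    buf = b"\0" * block_size
    fd = os.open(path, flags, 0o644)
    try:
        start = time.perf_counter()
        written = 0
        while written < total_bytes:
            n = min(block_size, total_bytes - written)
            written += os.write(fd, buf[:n])
        os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def initial_and_rewrite(path, total_bytes):
    """Return (initial_write_seconds, rewrite_seconds)."""
    # Pass 1: the file must not exist yet, so space has to be allocated.
    initial = write_pass(path, total_bytes, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    # Pass 2: same file, no O_TRUNC, so the file is not extended again.
    rewrite = write_pass(path, total_bytes, os.O_WRONLY)
    return initial, rewrite


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        first, second = initial_and_rewrite(os.path.join(d, "f.dat"), 8 << 20)
        print(f"initial write {first:.4f} s, rewrite {second:.4f} s")
```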

>        h. The open time for opening /dev/zero.
>Again Overhead.

    Again, if one is trying to measure file I/O, one would
    not desire to have un-intended events inside the measurement.

>        i.  The time it took for dd to write its output to the screen is
>also in the measurement.
>Overhead..

    Again, if one is trying to measure file I/O, one would
    not desire to have tty I/O in the measurement.

>Sure this takes a little time, but it's very small, and
>is not necessarily going to a screen...

    True, it could be going to a 300 baud line printer, or
    a socket on a dirty network :-)

>
>
>Anyhow, after all that, I get better benchmarks using synchronized
>parallel DD than I do with Iozone, or IOR.  And it's easy to run, most
>systems have it, etc...  Now I need large data sets and time to run, but
>I get better results.

    Whether this measurement achieves what one thinks it achieves
    depends on how "synchronized" is implemented.
    Iozone uses a barrier sync mechanism to guarantee that
    the throughput of the parallel region is measured only
    while the test is actually running in parallel
    (removing straggler effects from the measurement).
    The definition of the word "better" is what I question :-)
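
    As a rough illustration of the idea (not Iozone's implementation),
    the Python sketch below holds every writer at a barrier, starts the
    clock at the moment the barrier releases, and stops all writers as
    soon as the first one finishes, so only the fully parallel window
    is counted. The function name, thread-based workers, and sizes are
    assumptions made for the example.

```python
import os
import tempfile
import threading
import time


def parallel_write(directory, workers=4, bytes_per_worker=4 << 20,
                   block_size=1 << 16):
    """Return (total_bytes, seconds) for the window in which all workers ran."""
    buf = b"\0" * block_size
    times = {}
    written = [0] * workers
    stop = threading.Event()   # set by the first worker to finish
    lock = threading.Lock()

    def mark_start():
        # Runs once, in one thread, just before the barrier releases everyone.
        times["start"] = time.perf_counter()

    barrier = threading.Barrier(workers, action=mark_start)

    def worker(i):
        path = os.path.join(directory, f"worker{i}.dat")
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        except OSError:
            barrier.abort()    # do not leave the other workers waiting
            raise
        try:
            barrier.wait()     # nobody writes until everyone is ready
            while written[i] < bytes_per_worker and not stop.is_set():
                n = min(block_size, bytes_per_worker - written[i])
                written[i] += os.write(fd, buf[:n])
            with lock:
                if "stop" not in times:
                    times["stop"] = time.perf_counter()
                    stop.set() # stragglers quit after their current block
        finally:
            os.close(fd)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(written), times["stop"] - times["start"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        total, secs = parallel_write(d)
        print(f"{total / secs / 1e6:.1f} MB/s aggregate over {secs:.4f} s")
```

    A block already in flight when the stop flag is set still gets
    counted, so the total can be high by at most one block per worker.
    Parallel dd runs launched from a shell loop have neither the common
    start nor the common stop unless something external provides them.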

>
>I also think that dd represents better what a 'user' will use, I've
>copied files with DD, but never with Iozone.

    I would think that one would use "cp" instead of dd.  :-) But I
    would agree with the statement that the end user's real
    application is the best tool for measuring performance, be that
    cp, dd, Oracle, Sybase, or perl. :-)

Enjoy,
Don Capps

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.clusterfs.com
https://lists.clusterfs.com/mailman/listinfo/lustre-discuss