List: lustre-discuss
Subject: [Lustre-discuss] [HPDD-discuss] Same performance Infiniband and Ethernet
From: alfonso.pardo@ciemat.es (Pardo Diaz, Alfonso)
Date: 2014-05-21 6:32:54
Message-ID: 1CC41228-9D18-4B45-85BB-FE6C6C95A9ED@ciemat.es
Thanks Richard, I appreciate your advice.
I was able to saturate the channel using xdd with 10 threads writing to 10 OSTs, each
OST on a different OSS. These are the results (Time in seconds, Rate in MB/s):
ETHERNET

                 T   Q  Bytes        Ops     Time     Rate     IOPS     Latency  %CPU
TARGET Average   0   1  2147483648   65536   140.156  15.322   467.59   0.0021   39.16
TARGET Average   1   1  2147483648   65536   140.785  15.254   465.50   0.0021   39.11
TARGET Average   2   1  2147483648   65536   140.559  15.278   466.25   0.0021   39.14
TARGET Average   3   1  2147483648   65536   176.141  12.192   372.07   0.0027   38.02
TARGET Average   4   1  2147483648   65536   168.234  12.765   389.55   0.0026   38.54
TARGET Average   5   1  2147483648   65536   140.823  15.250   465.38   0.0021   39.11
TARGET Average   6   1  2147483648   65536   140.183  15.319   467.50   0.0021   39.16
TARGET Average   8   1  2147483648   65536   176.432  12.172   371.45   0.0027   38.02
TARGET Average   9   1  2147483648   65536   167.944  12.787   390.23   0.0026   38.57
Combined        10  10  21474836480  655360  180.000  119.305  3640.89  0.0003   387.99
INFINIBAND

                 T   Q  Bytes        Ops     Time     Rate      IOPS      Latency  %CPU
TARGET Average   0   1  2147483648   65536   9.369    229.217   6995.16   0.0001   480.40
TARGET Average   1   1  2147483648   65536   9.540    225.110   6869.80   0.0001   474.25
TARGET Average   2   1  2147483648   65536   8.963    239.582   7311.45   0.0001   479.85
TARGET Average   3   1  2147483648   65536   9.480    226.521   6912.86   0.0001   478.21
TARGET Average   4   1  2147483648   65536   9.109    235.748   7194.47   0.0001   480.83
TARGET Average   5   1  2147483648   65536   9.284    231.299   7058.69   0.0001   479.04
TARGET Average   6   1  2147483648   65536   8.839    242.947   7414.15   0.0001   480.55
TARGET Average   7   1  2147483648   65536   9.210    233.166   7115.65   0.0001   480.17
TARGET Average   8   1  2147483648   65536   9.373    229.125   6992.33   0.0001   475.13
TARGET Average   9   1  2147483648   65536   9.184    233.828   7135.86   0.0001   480.25
Combined        10  10  21474836480  655360  9.540    2251.097  68698.03  0.0000   4788.69
My estimate is roughly 0.6 Gbit/s over Ethernet (link max 1 Gbit/s) and 16 Gbit/s over
InfiniBand (link max 40 Gbit/s).
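
For the record, the invocation was along these lines (a from-memory sketch: the mount
point and file names are from my setup, and the flag spellings/units should be checked
against "xdd -help" for your xdd version):

  # one write thread per target file (files precreated one per OST),
  # fixed run time so stragglers do not skew the aggregate
  xdd -op write -targets 10 \
      /mnt/lustre/xdd/file0 /mnt/lustre/xdd/file1 /mnt/lustre/xdd/file2 \
      /mnt/lustre/xdd/file3 /mnt/lustre/xdd/file4 /mnt/lustre/xdd/file5 \
      /mnt/lustre/xdd/file6 /mnt/lustre/xdd/file7 /mnt/lustre/xdd/file8 \
      /mnt/lustre/xdd/file9 \
      -blocksize 1024 -reqsize 32 -timelimit 180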
REGARDS!
On 19/05/2014, at 17:37, Mohr Jr, Richard Frank (Rick Mohr) <rmohr at utk.edu> wrote:
> Alfonso,
>
> Based on my attempts to benchmark single-client Lustre performance, here are some
> comments and advice. (YMMV)
> 1) On the IB client, I recommend disabling checksums (lctl set_param
> osc.*.checksums=0); see the sketch after this list. Having checksums enabled
> sometimes results in a significant performance hit.
> 2) Single-threaded tests (like dd) will usually bottleneck before you can max out
> the total client performance. You need to use a multi-threaded tool (like xdd) and
> have several threads perform I/O at the same time in order to measure aggregate
> single-client performance.
> 3) When using a tool like xdd, set up the test to run for a fixed amount of time
> rather than having each thread write a fixed amount of data. If all threads write
> a fixed amount of data (say 1 GB), and any of the threads run slower than the
> others, you might get skewed results for the aggregate throughput because of the
> stragglers.
> 4) In order to avoid contention at the OST level among the multiple threads on a
> single client, precreate the output files with stripe_count=1 and statically assign
> them evenly to the different OSTs (see the sketch after this list). Have each
> thread write to a different file so that no two processes write to the same OST.
> If you don't have enough OSTs to saturate the client, you can always have two
> files per OST. Going beyond that will likely hurt more than help, at least for an
> ldiskfs backend.
> 5) In my testing, I seem to get worse results using direct I/O for write tests, so
> I usually just use buffered I/O. Based on my understanding, the max_dirty_mb
> parameter on the client (which defaults to 32 MB) limits the amount of dirty
> written data that can be cached for each OST. Unless you have increased this to a
> very large number, that parameter will likely mitigate any effects of client
> caching on the test results. (NOTE: This reasoning only applies to write tests.
> Any written data can still be cached by the client, and a subsequent read test
> might very well pull data from cache unless you have taken steps to flush the
> cached data.)
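>
> A minimal sketch pulling points 1, 4, and 5 together (untested; the mount point
> /mnt/lustre/bench and the file names are placeholders):
>
> # (1) disable client-side data checksums for the benchmark
> lctl set_param osc.*.checksums=0
>
> # (4) precreate one file per OST: stripe_count=1 (-c 1) plus an explicit
> #     starting OST index (-i), so no two threads land on the same OST
> for i in $(seq 0 9); do
>     lfs setstripe -c 1 -i $i /mnt/lustre/bench/file$i
> done
>
> # (5) check the per-OSC dirty-cache limit (defaults to 32 MB)
> lctl get_param osc.*.max_dirty_mb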
> If you have 10 OSS nodes and 20 OSTs in your file system, I would start by running
> a test with 10 threads and have each thread write to a single OST on a different
> server. You can increase/decrease the number of threads as needed to see if the
> aggregate performance gets better/worse. On my clients with QDR IB, I typically
> see aggregate write speeds in the range of 2.5-3.0 GB/s.
> You are probably already aware of this, but just in case, make sure that the IB
> clients you use for testing don't also have Ethernet connections to your OSS
> servers. If the client has an Ethernet and an IB path to the same server, it will
> choose one of the paths to use. It could end up choosing Ethernet instead of IB
> and mess up your results.
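>
> One way to confirm which path is actually in use (the NID below is made up):
>
> lctl list_nids                  # NIDs configured on this client
> lctl ping 192.168.10.1@o2ib     # verify the OSS is reachable over IB
> lctl get_param osc.*.import     # "current_connection" shows the NID in use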
> --
> Rick Mohr
> Senior HPC System Administrator
> National Institute for Computational Sciences
> http://www.nics.tennessee.edu
>
>
> On May 19, 2014, at 6:33 AM, "Pardo Diaz, Alfonso" <alfonso.pardo at ciemat.es>
> wrote:
>
> > Hi,
> >
> > I have migrated my Lustre 2.2 to 2.5.1 and I have equipped my OSS/MDS and clients
> > with InfiniBand QDR interfaces. I have compiled Lustre with OFED 3.2 and I have
> > configured the lnet module with:
> > options lnet networks="o2ib(ib0),tcp(eth0)"
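> >
> > (After loading the modules, a quick sanity check that both NIDs exist:)
> >
> > lctl list_nids   # expect one NID ending in @o2ib and one in @tcp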
> >
> >
> > But when I compare the Lustre performance across InfiniBand (o2ib), I get the
> > same performance as across Ethernet (tcp):
> > INFINIBAND TEST:
> > dd if=/dev/zero of=test.dat bs=1M count=1000
> > 1000+0 records in
> > 1000+0 records out
> > 1048576000 bytes (1,0 GB) copied, 5,88433 s, 178 MB/s
> >
> > ETHERNET TEST:
> > dd if=/dev/zero of=test.dat bs=1M count=1000
> > 1000+0 records in
> > 1000+0 records out
> > 1048576000 bytes (1,0 GB) copied, 5,97423 s, 154 MB/s
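> >
> > (Note: both dd tests are buffered writes, so part of what they measure is the
> > client cache; conv=fsync makes dd include the final flush in the timing:)
> >
> > dd if=/dev/zero of=test.dat bs=1M count=1000 conv=fsync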
> >
> >
> > And this is my scenario:
> >
> > - 1 MDS with SSD RAID10 MDT
> > - 10 OSS with 2 OSTs per OSS
> > - InfiniBand interfaces in connected mode
> > - CentOS 6.5
> > - Lustre 2.5.1
> > - Striped filesystem: "lfs setstripe -s 1M -c 10"
> >
> >
> > I know my InfiniBand is working correctly, because if I use iperf3 between client
> > and servers I get 40 Gb/s over InfiniBand and 1 Gb/s over Ethernet.
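> >
> > (The iperf3 runs were along these lines; the addresses are placeholders:)
> >
> > iperf3 -s                 # on the server
> > iperf3 -c 10.10.0.1       # on the client, against the server's IPoIB address
> > iperf3 -c 192.168.0.1     # on the client, against the server's Ethernet address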
> >
> >
> > Could you help me?
> >
> >
> > Regards,
> >
> >
> >
> >
> >
> > Alfonso Pardo Diaz
> > System Administrator / Researcher
> > c/ Sola nº 1; 10200 Trujillo, ESPAÑA
> > Tel: +34 927 65 93 17 Fax: +34 927 32 32 37
> >
> >
> >
> >
> > ----------------------------
> > Disclaimer:
> > This message and its attached files are intended exclusively for their recipients
> > and may contain confidential information. If you received this e-mail in error,
> > you are hereby notified that any dissemination, copying, or disclosure of this
> > communication is strictly prohibited and may be unlawful. In this case, please
> > notify us by reply and delete this email and its contents immediately.
> > ----------------------------
> >
> > _______________________________________________
> > HPDD-discuss mailing list
> > HPDD-discuss at lists.01.org
> > https://lists.01.org/mailman/listinfo/hpdd-discuss
>
>
>