
List:       opensuse-factory
Subject:    [opensuse-factory] Re: [opensuse-science] openMPI mixup in Tumbleweed/Leap 15.x
From:       Nicolas Morey-Chaisemartin <nmoreychaisemartin () suse ! de>
Date:       2018-12-14 10:16:41
Message-ID: 27973e15-4403-54ef-842e-72eccaefdeed () suse ! de


On 12/9/18 4:18 PM, Todd Rme wrote:
> On Sat, Dec 8, 2018 at 2:53 PM Stefan Brüns
> <stefan.bruens@rwth-aachen.de> wrote:
> > Hi,
> > 
> > I went through a few packages which have an openMPI dependency or support, and
> > found it quite mixed up:
> > 
> > Currently, we have openmpi(1), openmpi2 and openmpi3 in Leap and TW. While
> > openmpi3 is currently unused, openmpi1 and openmpi2 are both used, with
> > similar frequency:
> > 
> > https://build.opensuse.org/package/binary/openSUSE:Factory/openmpi2:standard/
> > standard/x86_64/openmpi2-libs-2.1.5-2.1.x86_64.rpm
> > https://build.opensuse.org/package/binary/openSUSE:Factory/openmpi:standard/
> > standard/x86_64/openmpi-libs-1.10.7-21.1.x86_64.rpm
> > 
> > Several programs will end up with implicitly linking to both versions, as
> > libnetcdf and hdf5 use openmpi1 and boost_mpi uses openmpi2. One example is
> > vtk.
> > 
> > As both libraries (libmpi.so.12 and libmpi.so.20) export the same symbols for
> > large parts, this is mayhem waiting to happen.
> > 
> > For SLE, different MPI versions/implementations are supported using the HPC
> > modules, but for Leap/TW, we should obviously stick with *one* single
> > canonical version.
> > 
> > Question now, which version to choose?
> > 
> > Apparently, openmpi2 does not work on all architectures (PPC, PPC64BE) [1],
> > and is not supported by some software packages [2].
> > 
> > Are there any drawbacks for using openmpi1 everywhere in TW/Leap 15.x?
> > 
> > I have opened a bug report: https://bugzilla.opensuse.org/show_bug.cgi?
> > id=1118861
> > 
> > Kind regards,
> > 
> > Stefan
> > 
> > 
> > [1] "Stay with openmpi(1) also on PPC", boost, 2018-10-01, https://
> > build.opensuse.org/request/show/639401
> > [2] "Cntk packages do not support OpenMPI 2+", https://github.com/Microsoft/
> > CNTK/issues/3197
> > 
> > --
> > Stefan Brüns  /  Bergstraße 21  /  52062 Aachen
> > home: +49 241 53809034     mobile: +49 151 50412019
> No matter what we pick, I think it would be a good idea to do what we
> do with, say, gcc and llvm/clang, where we have separate "openmpi1",
> "openmpi2", and "openmpi3" packages, and have the "openmpi" package
> refer to the default version.  This would make it easy to change
> default versions in the future, or set default versions on a
> per-architecture basis.
> 
> As for openmpi 1 vs openmpi 2, the problem with openmpi 1 is that it
> is unmaintained [1].  The current version of openmpi is actually
> version 4.  So using openmpi 1 as the default comes with all the
> problems associated with unmaintained software, especially
> network-oriented software.  openmpi 2 also adds support for MPI 3.x
> features.
> 
> openmpi 2 is supposed to support PPC.  If it doesn't that is probably
> a bug that should be reported upstream.  Unfortunately the linked
> request doesn't explain what the problem is.

It was disabled for ppc64be in v2.1.2 but reenabled in v2.1.4.
See: https://github.com/open-mpi/ompi/issues/4349#issuecomment-374970982

My two cents on the MPI version pick:
- openmpi1 has been unmaintained for over a year now. It is also deprecated in
  SLES/Leap 15, although still available. We know there are some issues,
  especially regarding the latest RDMA hardware. IMHO it should be dropped
  completely from Factory.
- openmpi2 is the new "default" for SLES15. It seems to work well and is old
  enough to be stable.
- openmpi3 was not picked for SLES15 because it had only just been released at
  the time and was still pretty unstable, even when running the testsuite it
  came with, so we decided not to ship it. It might be mature enough by now to
  be a good candidate.
- openmpi4 is just barely out. I haven't got around to testing it yet, but my
  best guess is that it will be similar to openmpi3 when it came out: working,
  but with lots of instability and issues (usually on some non-x86_64 archs).
  I think it is too early to use it, although it should be packaged and
  available in Factory.

TL;DR: I think openmpi2 and 3 are good candidates. openmpi2 has my preference
because it means we can stay more in sync with SLES and Leap 15 (which do not
have openmpi3).

Regarding the rest of the discussion, I've replied in BZ#1118861.
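
On the double-linking problem Stefan describes (libmpi.so.12 and libmpi.so.20
ending up in the same process), here is a minimal sketch of how affected
binaries could be spotted. It only parses ldd-style text; the function names
and the sample paths are made up for illustration and are not taken from any
real package.

```python
import re

# Matches the soname at the start of an ldd-style line, e.g.
# "    libmpi.so.12 => /some/path/libmpi.so.12 (0x...)".
_LIBMPI = re.compile(r"^\s*(libmpi\.so\.\d+)\b")


def libmpi_sonames(ldd_output):
    """Return the set of distinct libmpi sonames named in ldd-style output."""
    found = set()
    for line in ldd_output.splitlines():
        match = _LIBMPI.match(line)
        if match:
            found.add(match.group(1))
    return found


def has_mpi_mixup(ldd_output):
    """True if more than one libmpi soname would be loaded."""
    return len(libmpi_sonames(ldd_output)) > 1


# Hypothetical sample for illustration only, not real output of any binary.
SAMPLE = (
    "\tlibhdf5.so.101 => /hypothetical/openmpi1/lib64/libhdf5.so.101 (0x0)\n"
    "\tlibmpi.so.12 => /hypothetical/openmpi1/lib64/libmpi.so.12 (0x0)\n"
    "\tlibboost_mpi.so.1 => /hypothetical/lib64/libboost_mpi.so.1 (0x0)\n"
    "\tlibmpi.so.20 => /hypothetical/openmpi2/lib64/libmpi.so.20 (0x0)\n"
)

if __name__ == "__main__":
    print(sorted(libmpi_sonames(SAMPLE)), has_mpi_mixup(SAMPLE))
```

To check a real binary, the text could come from something like
subprocess.run(["ldd", path], capture_output=True, text=True).stdout. ldd
resolves transitive dependencies, which is what matters here since the two
versions are pulled in indirectly.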

Nicolas

