openmpi 4.1.5-5: mpirun fails when this version is installed on the remote host
Task Info (Flyspray)
Opened By | Francisco J. Vazquez (Fran) |
Task ID | 79670 |
Type | Bug Report |
Project | Arch Linux |
Category | Packages: Extra |
Version | None |
OS | x86_64 |
Opened | 2023-09-12 15:52:36 UTC |
Status | Assigned |
Assignee | David Runge (dvzrv) |
Assignee | Levente Polyak (anthraxx) |
Assignee | Christian Heusel (gromit) |
I have two fully updated Arch systems, host1 and host2, both with openmpi 4.1.5-5 installed. Running:
$ mpirun -v -n 2 --hostfile hosts.txt bash -c 'echo $HOSTNAME'
on host1, where hosts.txt is:
host1 slots=1
host2 slots=1
fails with:
ORTE was unable to reliably start one or more daemons. This usually is caused by:
* not finding the required libraries and/or binaries on one or more nodes. Please check your PATH and LD_LIBRARY_PATH settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes. Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base). Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required (e.g., on Cray). Please check your configure cmd line and consider using one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a lack of common network interfaces and/or no route found between them. Please check network connectivity (including firewalls and network routing requirements).
Downgrading the remote host to openmpi 4.1.5-4 solves the problem:
$ mpirun -v -n 2 --hostfile hosts.txt bash -c 'echo $HOSTNAME'
host2
host1
The openmpi version installed on the local host (host1) does not seem to influence the result.
The same failure occurs with -n 1, even though in that case the program is launched only on the local host.
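
Since the outcome depends on which openmpi package release the remote host has, it may help to record the installed version on every host in the hostfile when reproducing this. The following is only a rough sketch, assuming passwordless ssh to each host and that hosts.txt has one host per line. The function names and the REMOTE_SHELL variable are hypothetical helpers introduced here for illustration and are not part of Open MPI or pacman.

```shell
#!/bin/sh
# Sketch: report the installed openmpi package version on each host
# listed in an Open MPI hostfile (one "host slots=N" entry per line).

# Print the first field (the host name) of every line that is neither
# blank nor a comment starting with '#'.
hostfile_hosts() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# For each host, print "<host>: " followed by the output of
# `pacman -Q openmpi` run on that host.
# REMOTE_SHELL is a hypothetical override (default: ssh) so the loop
# can be dry-run without network access, e.g. REMOTE_SHELL=echo.
openmpi_versions() {
    for host in $(hostfile_hosts "$1"); do
        printf '%s: ' "$host"
        ${REMOTE_SHELL:-ssh} "$host" pacman -Q openmpi </dev/null
    done
}
```

After sourcing the file, `openmpi_versions hosts.txt` would print one line per host. Setting `REMOTE_SHELL=echo` first only shows the commands that would be run.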