High-Performance Computing

Open MPI between a PS3 (PPC64) and x86-32: No route to host

April 23, 2010

If the program does not call any MPI communication functions (for example, it only calls MPI_Init and MPI_Finalize, or does purely local work), the two machines interconnect fine. But as soon as an MPI communication operation is added, the following error appears:
[node2][[44798,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect] connect() to 10.10.4.70 failed: No route to host (113)
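A hypothetical test program like the sketch below is enough to trigger this (the host names in the comment are placeholders): MPI_Init and MPI_Finalize alone involve no inter-node traffic, but the MPI_Send/MPI_Recv pair forces Open MPI's TCP BTL to open sockets between the nodes, and that is exactly the connection the firewall rejects.

/* mpi_token.c -- minimal sketch; compile with mpicc,
   run with e.g. mpirun -np 2 --host node1,node2 ./mpi_token */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size, token = 0;

    MPI_Init(&argc, &argv);                 /* no inter-node TCP traffic yet */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size >= 2) {
        if (rank == 0) {
            token = 42;
            /* this send is what opens the TCP connection that fails */
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank %d received token %d\n", rank, token);
        }
    }

    MPI_Finalize();
    return 0;
}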

The answer here is spot on:
http://stackoverflow.com/questions/2495198/unable-to-run-openmpi-across-more-than-two-machines

The answer turned out to be simple: open mpi authenticated via ssh and then opened up tcp/ip sockets between the nodes. The firewalls on the compute nodes were set up to only accept ssh connections from each other, not arbitrary connections. So, after updating iptables, hello world runs like a champ across all of the nodes.

Edit: It should be pointed out that the fileserver’s firewall allowed arbitrary connections, so that was why an mpi program run on it would behave differently than just running on the compute nodes.

In fact, this situation means that the SSH connection succeeds but the MPI connection fails. The cause is that the firewall on a freshly installed Yellow Dog Linux blocks TCP connections by default, so MPI cannot connect. The fix:

Check the firewall status: service iptables status

Temporarily stop the firewall: service iptables stop

Or prevent it from starting at boot: chkconfig --level 2345 iptables off
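If you would rather not shut the firewall off entirely, a rough alternative is to keep iptables running and simply accept TCP traffic from the other cluster nodes. The subnet below is an assumption based on the 10.10.4.70 address in the error message; adjust it to your own network:

iptables -I INPUT -p tcp -s 10.10.4.0/24 -j ACCEPT
service iptables save

SSH access is unaffected either way, since it was already permitted; only the extra MPI connections need the new rule.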
