I installed Open MPI on my Ubuntu machine and compiled the following simple program:
#include "mpi.h"
#include <stdio.h>

int main(int argc, char **argv)
{
    int myid, numprocs;
    int namelen;
    int type_size;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Get_processor_name(processor_name, &namelen);
    fprintf(stderr, "Hello World! Process %d of %d on %s\n", myid, numprocs, processor_name);

    /* Compare each MPI datatype size with the matching C type.
       sizeof yields a size_t, so it is printed with %zu rather than %d. */
    MPI_Type_size(MPI_INT, &type_size);
    fprintf(stderr, "%s MPI_INT size:%d %zu\n", processor_name, type_size, sizeof(int));
    MPI_Type_size(MPI_LONG, &type_size);
    fprintf(stderr, "%s MPI_LONG size:%d %zu\n", processor_name, type_size, sizeof(long));
    MPI_Type_size(MPI_FLOAT, &type_size);
    fprintf(stderr, "%s MPI_FLOAT size:%d %zu\n", processor_name, type_size, sizeof(float));
    MPI_Type_size(MPI_DOUBLE, &type_size);
    fprintf(stderr, "%s MPI_DOUBLE size:%d %zu\n", processor_name, type_size, sizeof(double));

    /* Rank 0 broadcasts 0; every other rank starts at -1 and should receive 0. */
    long send = -1;
    if (myid == 0)
        send = 0;
    MPI_Bcast(&send, 1, MPI_LONG, 0, MPI_COMM_WORLD);
    printf("rank:%d %ld\n", myid, send);

    MPI_Finalize();
    return 0;
}
Compiling with mpicc mpi.c and then running ./a.out directly, or via mpirun -n 2 ./a.out,
produces the following error:
[node1:09705] *** Process received signal ***
[node1:09705] Signal: Segmentation fault (11)
[node1:09705] Signal code: Address not mapped (1)
[node1:09705] Failing at address: 0x44000074
[node1:09705] [ 0] [0xb7f4e440]
[node1:09705] [ 1] ./a.out(main+0x51) [0x8048875]
[node1:09705] [ 2] /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0) [0xb7b04450]
[node1:09705] [ 3] ./a.out [0x80487c1]
[node1:09705] *** End of error message ***
Segmentation fault
Compiling with mpicc mpi.c -static and then running ./a.out, or mpirun -n 2 ./a.out,
produces this error instead:
duanple@node1:~/project/test$ ./a.out
[node1:09772] *** Process received signal ***
[node1:09772] Signal: Segmentation fault (11)
[node1:09772] Signal code: Address not mapped (1)
[node1:09772] Failing at address: 0x44000074
[node1:09772] [ 0] [0xb7f0b440]
[node1:09772] [ 1] [0x8048241]
[node1:09772] [ 2] [0x81a718b]
[node1:09772] [ 3] [0x8048151]
[node1:09772] *** End of error message ***
Segmentation fault
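The difference is that the dynamically linked build produces a symbolized backtrace, which includes a frame in /lib/tls/i686/cmov/libc.so.6. Note that this frame is __libc_start_main, the caller of main. The innermost frame belonging to the program is main+0x51, which suggests the fault is triggered from main rather than from inside libc.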
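To make that reading of the backtrace concrete, here is a minimal sketch in C that pulls the symbol name and offset out of one frame line. The helper name parse_frame is hypothetical, and the frame layout it assumes ("path(symbol+0xoffset) [address]") is simply the one visible in the output above; other layouts are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the symbol and offset from one backtrace
 * frame such as "./a.out(main+0x51) [0x8048875]".
 * Returns 1 on success and 0 when the frame carries no symbol, as in the
 * static build's output (for example "[0x8048241]"). */
int parse_frame(const char *frame, char *symbol, size_t cap,
                unsigned long *offset)
{
    assert(frame != NULL && symbol != NULL && offset != NULL && cap > 0);

    const char *open = strchr(frame, '(');
    if (open == NULL)
        return 0;

    const char *plus = strchr(open, '+');
    const char *close = strchr(open, ')');
    if (plus == NULL || close == NULL || plus > close)
        return 0;

    /* Copy the text between '(' and '+' as the symbol name. */
    size_t len = (size_t)(plus - open - 1);
    if (len == 0 || len >= cap)
        return 0;
    memcpy(symbol, open + 1, len);
    symbol[len] = '\0';

    /* %lx accepts an optional 0x prefix in front of the hex digits. */
    return sscanf(plus + 1, "%lx", offset) == 1;
}
```

Applied to frames [1] and [2] of the dynamic build, this yields main with offset 0x51 and __libc_start_main with offset 0xe0. Every frame of the static build is rejected because it carries only a raw address.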
We can first inspect the binary with ldd a.out:
duanple@node1:~/project/test$ ldd a.out
linux-gate.so.1 => (0xb7fab000)
libmpi.so.0 => /home/duanple/mpi/openmpi/lib/libmpi.so.0 (0xb7ded000)
libopen-rte.so.0 => /home/duanple/mpi/openmpi/lib/libopen-rte.so.0 (0xb7d74000)
libopen-pal.so.0 => /home/duanple/mpi/openmpi/lib/libopen-pal.so.0 (0xb7d06000)
libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7cf4000)
libnsl.so.1 => /lib/tls/i686/cmov/libnsl.so.1 (0xb7cdc000)
libutil.so.1 => /lib/tls/i686/cmov/libutil.so.1 (0xb7cd8000)
libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb7cb2000)
libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7c9a000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7b4b000)
/lib/ld-linux.so.2 (0xb7fac000)
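This shows which shared libraries, and which versions of them, the binary links against.
On another machine, however, an a.out compiled on CentOS does run on Ubuntu, so let us look at the CentOS-built a.out: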
duanple@node1:~/project/test$ ldd a.out
linux-gate.so.1 => (0xb7fb1000)
libmpi.so.0 => /home/duanple/mpi/openmpi/lib/libmpi.so.0 (0xb7df3000)
libopen-rte.so.0 => /home/duanple/mpi/openmpi/lib/libopen-rte.so.0 (0xb7d7a000)
libopen-pal.so.0 => /home/duanple/mpi/openmpi/lib/libopen-pal.so.0 (0xb7d0c000)
libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7cfa000)
libnsl.so.1 => /lib/tls/i686/cmov/libnsl.so.1 (0xb7ce2000)
libutil.so.1 => /lib/tls/i686/cmov/libutil.so.1 (0xb7cde000)
libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb7cb8000)
libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7ca0000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7b51000)
/lib/ld-linux.so.2 (0xb7fb2000)
Both binaries resolve the same libraries to the same paths; the two listings differ only in the load addresses.
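Load addresses printed by ldd can change between runs and between binaries, so on their own they are probably not a meaningful difference. As a rough sketch, the comparison of the two listings can be done while ignoring the trailing "(0x...)" field. The helper names ldd_key_len and same_ldd_entry are hypothetical, and the code assumes the line layout shown in the ldd output above.

```c
#include <assert.h>
#include <string.h>

/* Length of an ldd line up to, but not including, the trailing
 * "(0x...)" load address and the whitespace before it. If the line has
 * no such suffix, the whole line counts. */
static size_t ldd_key_len(const char *line)
{
    const char *addr = strstr(line, "(0x");
    size_t len = addr ? (size_t)(addr - line) : strlen(line);

    while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
        len--;
    return len;
}

/* Hypothetical helper: returns 1 when two ldd lines name the same
 * library resolved to the same path, ignoring the load address. */
int same_ldd_entry(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t la = ldd_key_len(a);
    size_t lb = ldd_key_len(b);
    return la == lb && strncmp(a, b, la) == 0;
}
```

Called on matching lines from the two listings, such as the two libc.so.6 entries, same_ldd_entry returns 1, which is consistent with the address being the only field that changes.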