High-Performance Computing

The "Address not mapped" error after installing Open MPI on Ubuntu

April 8, 2010

I installed Open MPI on my Ubuntu machine and compiled the following simple program:
#include "mpi.h"
#include <stdio.h>

int main(int argc, char **argv)
{
    int myid, numprocs;
    int namelen;
    int type_size;
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Get_processor_name(processor_name, &namelen);
    fprintf(stderr, "Hello World! Process %d of %d on %s\n", myid, numprocs, processor_name);

    /* Print each MPI datatype's size next to the matching C type.
       sizeof yields a size_t, so it is printed with %zu. */
    MPI_Type_size(MPI_INT, &type_size);
    fprintf(stderr, "%s MPI_INT size:%d %zu\n", processor_name, type_size, sizeof(int));
    MPI_Type_size(MPI_LONG, &type_size);
    fprintf(stderr, "%s MPI_LONG size:%d %zu\n", processor_name, type_size, sizeof(long));
    MPI_Type_size(MPI_FLOAT, &type_size);
    fprintf(stderr, "%s MPI_FLOAT size:%d %zu\n", processor_name, type_size, sizeof(float));
    MPI_Type_size(MPI_DOUBLE, &type_size);
    fprintf(stderr, "%s MPI_DOUBLE size:%d %zu\n", processor_name, type_size, sizeof(double));

    /* Rank 0 broadcasts its value (0); every other rank starts with -1. */
    long send = -1;
    if (myid == 0) send = 0;
    MPI_Bcast(&send, 1, MPI_LONG, 0, MPI_COMM_WORLD);

    printf("rank:%d %ld\n", myid, send);
    MPI_Finalize();
    return 0;
}

If I compile with mpicc mpi.c and then run ./a.out directly, or run mpirun -n 2 ./a.out,
the following error is reported:
[node1:09705] *** Process received signal ***
[node1:09705] Signal: Segmentation fault (11)
[node1:09705] Signal code: Address not mapped (1)
[node1:09705] Failing at address: 0x44000074
[node1:09705] [ 0] [0xb7f4e440]
[node1:09705] [ 1] ./a.out(main+0x51) [0x8048875]
[node1:09705] [ 2] /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe0) [0xb7b04450]
[node1:09705] [ 3] ./a.out [0x80487c1]
[node1:09705] *** End of error message ***
Segmentation fault

If I instead compile with mpicc mpi.c -static and then run ./a.out, or mpirun -n 2 ./a.out,
the following error is reported:
duanple@node1:~/project/test$ ./a.out
[node1:09772] *** Process received signal ***
[node1:09772] Signal: Segmentation fault (11)
[node1:09772] Signal code: Address not mapped (1)
[node1:09772] Failing at address: 0x44000074
[node1:09772] [ 0] [0xb7f0b440]
[node1:09772] [ 1] [0x8048241]
[node1:09772] [ 2] [0x81a718b]
[node1:09772] [ 3] [0x8048151]
[node1:09772] *** End of error message ***
Segmentation fault

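The difference comes from dynamic linking: the dynamically linked binary still has symbol names, so its backtrace names main+0x51 in ./a.out as the frame nearest the fault. The /lib/tls/i686/cmov/libc.so.6 frame is only __libc_start_main, the routine that calls main, so it is not where the error occurs.
We can first check the binary with ldd a.out: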
duanple@node1:~/project/test$ ldd a.out
    linux-gate.so.1 =>  (0xb7fab000)
    libmpi.so.0 => /home/duanple/mpi/openmpi/lib/libmpi.so.0 (0xb7ded000)
    libopen-rte.so.0 => /home/duanple/mpi/openmpi/lib/libopen-rte.so.0 (0xb7d74000)
    libopen-pal.so.0 => /home/duanple/mpi/openmpi/lib/libopen-pal.so.0 (0xb7d06000)
    libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7cf4000)
    libnsl.so.1 => /lib/tls/i686/cmov/libnsl.so.1 (0xb7cdc000)
    libutil.so.1 => /lib/tls/i686/cmov/libutil.so.1 (0xb7cd8000)
    libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb7cb2000)
    libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7c9a000)
    libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7b4b000)
    /lib/ld-linux.so.2 (0xb7fac000)
This shows which shared libraries, and which versions of them, the binary links against.

On another machine, however, an a.out compiled on CentOS does run on Ubuntu. Here is the ldd output for the a.out built on CentOS:
duanple@node1:~/project/test$ ldd a.out
    linux-gate.so.1 =>  (0xb7fb1000)
    libmpi.so.0 => /home/duanple/mpi/openmpi/lib/libmpi.so.0 (0xb7df3000)
    libopen-rte.so.0 => /home/duanple/mpi/openmpi/lib/libopen-rte.so.0 (0xb7d7a000)
    libopen-pal.so.0 => /home/duanple/mpi/openmpi/lib/libopen-pal.so.0 (0xb7d0c000)
    libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7cfa000)
    libnsl.so.1 => /lib/tls/i686/cmov/libnsl.so.1 (0xb7ce2000)
    libutil.so.1 => /lib/tls/i686/cmov/libutil.so.1 (0xb7cde000)
    libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb7cb8000)
    libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7ca0000)
    libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7b51000)
    /lib/ld-linux.so.2 (0xb7fb2000)

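The two listings resolve the same libraries at the same paths; only the load addresses differ, each one shifted by 0x6000, so this difference alone does not obviously explain the crash.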
