http://www.lam-mpi.org/faq/category11.php3
Table of contents:
- What exactly is considered a "heterogeneous" cluster?
- Does LAM/MPI work on heterogeneous clusters?
- Do different versions of LAM/MPI constitute heterogeneous clusters?
- How do I install LAM on a heterogeneous cluster?
- How do I `lamboot` across a heterogeneous cluster?
- How do I execute the right binary on each node for each architecture in a heterogeneous system?
- Can I mix 32 and 64 bit executables in a single parallel MPI job?
1. What exactly is considered a "heterogeneous" cluster?
A homogeneous cluster is one where all the nodes have the same:
- Architecture
- Operating system (to include the same OS version)
- Key component libraries, such as `libc` or `glibc` on Linux and freeware BSD operating systems (as above, including the same library versions)
For example, the following cluster is considered homogeneous:
The following are some example clusters that are not homogeneous — they are heterogeneous:
- 16 Pentium III nodes running Red Hat 7.1, 16 Pentium III nodes running Red Hat 7.0. Yes, even a minor difference in operating system constitutes being "different enough" to be heterogeneous.
- 16 Pentium III nodes running Red Hat 7.1, 16 Pentium III nodes running Mandrake 8.0. This one is questionable, since Mandrake professes to be compatible with Red Hat. So to be safe, call it heterogeneous.
- 16 Pentium III nodes running Red Hat 7.1, 16 Pentium III nodes running SuSE 7.2. This is most likely heterogeneous since the Linux distributions are different; the Linux kernel versions may differ, different versions of the GNU compilers may be installed, and/or different versions of `glibc` may be used, etc.
- 16 Pentium III nodes running Red Hat 7.1, 16 Pentium III nodes running OpenBSD 2.9. These are clearly two different operating systems.
- 16 Pentium II nodes and 16 Pentium III nodes all running Red Hat 7.1. You could play some tricks and treat this as a homogeneous cluster, but it is probably safer (and more efficient) to treat this as a heterogeneous cluster.
- 16 SunBlade 1000 nodes running Solaris 8, 16 SunBlade nodes running Solaris 9. The operating system difference makes this heterogeneous.
- 16 SunBlade 1000 nodes running Solaris 8, 16 Pentium III nodes running Red Hat 7.1. The architecture difference makes this heterogeneous.
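The criteria above can be checked roughly from the shell. The following is a minimal sketch, not part of LAM/MPI: the function names `node_fingerprint` and `classify_cluster` are hypothetical, and the check is coarse, since `uname` reports the architecture and kernel release but says nothing about the distribution or library versions.

```shell
#!/bin/sh
# Print a one-line "fingerprint" of the local node: machine architecture,
# OS name, and OS (kernel) release, separated by "|".
node_fingerprint() {
    printf '%s|%s|%s\n' "$(uname -m)" "$(uname -s)" "$(uname -r)"
}

# Read fingerprints (one per line) on stdin. Print "homogeneous" if every
# non-empty line is identical, "heterogeneous" otherwise.
classify_cluster() {
    count=$(sort -u | awk 'NF { n++ } END { print n + 0 }')
    if [ "$count" -le 1 ]; then
        echo homogeneous
    else
        echo heterogeneous
    fi
}

# Example: a single node compared with itself is trivially homogeneous.
node_fingerprint | classify_cluster
```

In practice the fingerprints would be collected from every node (for example over `rsh` or `ssh`) and piped into `classify_cluster` together. Identical fingerprints are necessary but not sufficient for a homogeneous cluster.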
2. Does LAM/MPI work on heterogeneous clusters?
Yes — that’s one of the reasons that LAM/MPI exists.
LAM will transparently do any data conversion necessary.
3. Do different versions of LAM/MPI constitute heterogeneous clusters?
So a better answer is really: Yes, but don’t ever, ever do this.
4. How do I install LAM on a heterogeneous cluster?
- Install LAM on one node, and make the directory tree that LAM was installed to available to all nodes via a networked filesystem (such as NFS)
- Physically install LAM on each node in the cluster
5. How do I `lamboot` across a heterogeneous cluster?
- All nodes booted by `lamboot` must be using the same version of LAM (this is always a requirement; it is stated here only to clarify that "heterogeneous" does not mean "different versions of LAM/MPI").
- Each user’s `$PATH` must be set up properly to find the right version of LAM/MPI on each node. That is, if multiple installations of LAM are available on each node, the user’s `$PATH` must be set to find the appropriate installation for that node. For example, if LAM is installed on a networked filesystem for different platforms in:
    /home/lam/sparc-sun-solaris2.8
    /home/lam/linux-redhat7.1
    /home/lam/linux-suse7.2
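One way to satisfy the `$PATH` requirement is to choose the installation directory in a shell startup file. The sketch below is only an illustration under stated assumptions: the helper names `lam_prefix_for` and `detect_distro` are hypothetical, the release files used to tell the Linux distributions apart are assumed to exist at the paths shown, any `SunOS` node is mapped to the Solaris 2.8 tree, and the LAM executables are assumed to live in a `bin` subdirectory of each installation.

```shell
#!/bin/sh
# Map an OS name (as printed by `uname -s`) and a distribution tag to one
# of the three installation directories. The tags "redhat" and "suse" are
# illustrative names, not something LAM defines.
lam_prefix_for() {
    case "$1:$2" in
        SunOS:*)      echo /home/lam/sparc-sun-solaris2.8 ;;
        Linux:redhat) echo /home/lam/linux-redhat7.1 ;;
        Linux:suse)   echo /home/lam/linux-suse7.2 ;;
        *)            return 1 ;;   # no known installation for this node
    esac
}

# Guess the distribution tag from release files (assumed locations).
detect_distro() {
    if [ -f /etc/redhat-release ]; then
        echo redhat
    elif [ -f /etc/SuSE-release ]; then
        echo suse
    else
        echo unknown
    fi
}

# In a shell startup file: prepend the matching bin directory, if any.
if prefix=$(lam_prefix_for "$(uname -s)" "$(detect_distro)"); then
    PATH="$prefix/bin:$PATH"
    export PATH
fi
```

Because the same startup file is read on every node, each node ends up with its own installation first in `$PATH`. For this to help, the logic has to sit in a file that is also read by non-interactive remote shells.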
6. How do I execute the right binary on each node for each architecture in a heterogeneous system?
- If the right binaries are in the current working directory, and the current working directory is available on all nodes, `mpiexec` can execute them directly. For example:
    shell$ mpiexec -arch linux my_mpi_program.linux : \
        -arch solaris my_mpi_program.solaris
- If the `$PATH` variable is set correctly for each node that LAM uses (i.e., separate directories exist containing MPI binaries for each architecture, and the correct directory for each architecture is inserted into the `$PATH` on each node), `mpirun C foo` will automatically find the `foo` for the right architecture.
- However, most users do not set their `$PATH` variable in this fashion. If `mpiexec` is not suitable, you will more than likely need to use an application schema ("app schema") file for this case. In the app schema, it is usually easiest to specify the absolute pathname of the program for each node. For example, using the following boot schema file:
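As a rough illustration of the app schema approach, the sketch below generates schema lines that pair a node range with an absolute program path. Everything specific in it is an assumption made for the example: the layout (nodes 0-1 running Solaris, nodes 2-3 running Linux), the paths under `/home/user/mpi`, and the helper names `schema_line` and `make_app_schema`. The `n<first>-<last> <program>` line format follows the author's understanding of LAM's app schema syntax and should be checked against the `appschema` manual page.

```shell
#!/bin/sh
# Emit one app schema line: a node range followed by the absolute path of
# the binary to run on those nodes.
#   $1 = first node index, $2 = last node index, $3 = absolute program path
schema_line() {
    case "$3" in
        /*) ;;   # absolute path: accepted
        *)  echo "not an absolute path: $3" >&2; return 1 ;;
    esac
    if [ "$1" -eq "$2" ]; then
        printf 'n%s %s\n' "$1" "$3"
    else
        printf 'n%s-%s %s\n' "$1" "$2" "$3"
    fi
}

# Hypothetical layout: nodes 0-1 are Solaris, nodes 2-3 are Linux.
make_app_schema() {
    echo "# Solaris nodes"
    schema_line 0 1 /home/user/mpi/solaris/foo
    echo "# Linux nodes"
    schema_line 2 3 /home/user/mpi/linux/foo
}

make_app_schema
# To use the result (not run here):
#   make_app_schema > my_appschema && mpirun my_appschema
```

Absolute paths mean each node starts the binary built for its own architecture regardless of how `$PATH` is set there.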
7. Can I mix 32 and 64 bit executables in a single parallel MPI job?
By definition, a mixture of 32 and 64 bit machines is a heterogeneous cluster.
LAM/MPI allows two possibilities for mixing 32 and 64 bit machines in a single parallel job:
- Most 64 bit operating systems have the capacity to generate 32 bit executables. By doing so, one can make the cluster "homogeneous" (at least in terms of bit size). Once all the executables (including relevant libraries) are 32 bits, one can run MPI jobs as if it were a homogeneous cluster. Note that LAM/MPI libraries and executables should also be built as 32 bit libraries/executables.
- The differences in datatype sizes between 32 and 64 bit machines are likely to create problems. Consider the scenario where a 64 bit process sends a message containing `MPI_LONG` data to a 32 bit process. What is the size of the datatype? On the 64 bit machine, each `MPI_LONG` is likely to be 64 bits, but on the 32 bit machine, it is likely to be 32 bits. So what should the 32 bit process do when it receives the data?
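Before mixing such nodes, it can help to see how wide a C `long` is on each of them. This is a minimal sketch with hypothetical helper names (`long_bits`, `compare_long_bits`). It relies on `getconf LONG_BIT`, which reports the width for the system's default compilation environment, so an executable deliberately built as 32 bit on a 64 bit node will not match what it prints.

```shell
#!/bin/sh
# Report the width of a C long on this node, or "unknown" if getconf
# cannot answer.
long_bits() {
    getconf LONG_BIT 2>/dev/null || echo unknown
}

# Compare the long widths of a sending node and a receiving node.
#   $1 = sender width in bits, $2 = receiver width in bits
compare_long_bits() {
    if [ "$1" = "$2" ]; then
        echo "match: long is $1 bits on both sides"
    else
        echo "mismatch: sender long is $1 bits, receiver long is $2 bits"
    fi
}

# Example: compare the local node with itself.
compare_long_bits "$(long_bits)" "$(long_bits)"
```

A mismatch between two nodes is a sign that `MPI_LONG` data exchanged between them needs the care described above.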