Publications
A fault-tolerant strategy for virtualized HPC clusters
Abstract
Virtualization is a common strategy for improving the utilization of existing computing resources, particularly within data centers. However, its use for high performance computing (HPC) applications is currently limited, despite its potential both to improve resource utilization and to provide resource guarantees to users. In this article, we systematically evaluate three major virtual machine implementations for computationally intensive HPC applications using various standard benchmarks. Using VMware Server, Xen, and OpenVZ, we examine the suitability of full virtualization (VMware), paravirtualization (Xen), and operating system-level virtualization (OpenVZ) in terms of network utilization, SMP performance, file system performance, and MPI scalability. We show that the operating system-level virtualization provided by OpenVZ offers the best overall performance, particularly for MPI scalability …
- Date
- 2009
- Authors
- John Paul Walters, Vipin Chaudhary
- Journal
- The Journal of Supercomputing
- Volume
- 50
- Pages
- 209-239
- Publisher
- Springer US