Tuesday, December 01, 2009

Dynamic Provisioning of Virtual Clusters

Here I will present the details of the demonstration that we (SALSA team) presented at the Super Computing 09 conference in Portland.

Deploying virtual/bare-system clusters on demand is an emerging requirement in many HPC centers. The tools such as xCAT and MOAB can be used to provide these capabilities on top of a physical hardware infrastructure.

In this demonstration we coupled the idea of provisioning clusters with parallel runtimes. In other words, we showed that people can switch a given set of hardware nodes into different operating systems (either virtual or non-virtual) and run different applications written using various cloud technologies. Specifically we showed that it is possible to provision clusters with Hadoop on both Linux bare-system and virtual machines and Microsoft DryadLINQ on Windows Server 2008.

The following diagram shows the operating system and the software stacks we used in our demonstrations. We were able to get all configurations demonstrated except for Windows HPC on XEN VMs which does not work well due to the non-existence of para-virtualized drivers.

We setup our demonstration in 4 clusters each with 64 CPU cores (8 nodes with 8 CPU cores each). The first 3 clusters were configured with RedHat Linux bare-system, RedHat Linux on XEN, and Windows HPC 2008 operating systems respectively. The last cluster is configured to dynamically switch between any of the above configurations. In Linux (both bare-system and XEN) clusters we ran a Smith Waterman dissimilarity calculation using Hadoop as our demo application while on Windows we ran a DryadLINQ implementation of the same application.

We developed a performance monitoring infrastructure based on pub-sub messaging to collect and summarize CUP and memory utilization of individual clusters and a performance visualization GUI (Thanks Saliya for the nice GUI) as our front end demonstration component. Following two diagrams show the monitoring architecture and the GUI of our demo.
With the 3 static clusters we were able to demonstrate and compare the performance of Hadoop (both on Linux bare-system and XEN) and DryadLINQ to the people. It also provides a way to show the overhead of virtualization as well. The dynamic cluster demonstrated the applicability of the dynamically provisionable virtual/physical clusters with parallel runtimes for scientific research.

It was a wonderful team work involving all the members of the SALSA team and the members of IU UITS. What is more amazing is that we go this very successful demonstration built from the scratch in less than one month.
We will upload a video of the demonstration soon.

Here is a photo of most of the members of our group.
Front row (left to right): Jaliya Ekanakaye, Dr. Judy Qiu, Thilina Gunarathne, Scott Beason, Jong Choi, Saliya Ekanayake, Li Hui.
Second row (left to right) Prof. Geoffrey Fox, Joe Rinkovsky and Jenett Tillotson.