Monday, September 07, 2009

DryadLINQ for Scientific Analyses

I spent the last 3 months at Microsoft Research as an intern doing research on DryadLINQ. Our goal (mine and that of another intern, Atilla Soner Balkir) was to evaluate the usability of DryadLINQ for scientific applications.

We selected a series of scientific applications, developed DryadLINQ programs for them, and evaluated their performance. We compared the performance of the DryadLINQ applications against Hadoop and, in some cases, MPI versions of the same applications.

We identified several possible improvements to DryadLINQ and its software stack, found workarounds for the inefficiencies we encountered, and were able to run most applications with 100% CPU utilization.

We compiled a paper with our findings regarding DryadLINQ and submitted it to the eScience09 conference. You can find a draft of this technical paper here.

Hope this will be useful to those of you who are developing applications using DryadLINQ.
