Thursday, March 18, 2010

What is the concept of "static data" in Twister?

Many iterative applications we analyzed show a common characteristic of operating on two types of data products called static and variable data. Static data (most of the time the largest of the two) is used in each iteration and remain fixed throughout the computation whereas the variable data is the computed results in each iteration and typically consumed in the next iteration in many expectation maximization (EM) type algorithms.

If a set of data is read by the application but do not get changed, then this set of data can be considered "static" in Twister. Matrix blocks, points in clustering, and web graph in pagerank are all examples of static data.