Distributed Computing

Distributed computing is the field of computer science that studies the design and behavior of systems that involve many loosely coupled components. The components of such distributed systems may be multiple threads in a single program, multiple processes on a single machine, or multiple processors connected through a shared memory or a network. Distributed systems are unusually vulnerable to nondeterminism, where the behavior of the system as a whole or of individual components is hard to predict. Such unpredictability requires a wide range of new techniques beyond those used in traditional computing.

Like other areas in computer science, distributed computing spans a wide range of subjects from the applied to the very theoretical. On the theory side, distributed computing is a rich source of mathematically interesting problems in which an algorithm is pitted against an adversary representing the unpredictable elements of the system. Analysis of distributed algorithms often has a strong game-theoretic flavor, because executions involve a complex interaction between the algorithm’s behavior and the system’s responses.

Michael Fischer is one of the pioneering researchers in the theory of distributed computing. His work on using adversary arguments to prove lower bounds and impossibility results has shaped much of the research in the area. He is actively involved in the study of security issues in distributed systems, including cryptographic tools and trust management.

James Aspnes’ research emphasizes the use of randomization for solving fundamental problems in distributed computing. Many problems that turn out to be difficult or impossible to solve using a deterministic algorithm can be solved if processes can flip coins. Analyzing the resulting algorithms often requires using non-trivial techniques from probability theory.
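As a toy illustration of why coin flips help, consider two identical, anonymous processes that must agree on which of them is the leader. The sketch below is not taken from the research described above; the synchronous rounds, the two-process setting, and the names `elect_leader`, `deterministic_process`, and `randomized_process` are all assumptions made for the example. Two processes running the same deterministic rule always choose the same bit, so symmetry is never broken, while independent coin flips break it with probability 1/2 in each round.

```python
import random


def elect_leader(procs, max_rounds=1000):
    """Run synchronous rounds until the two processes choose different bits.

    Each element of `procs` is a callable taking the round number and
    returning 0 or 1. Returns (leader_index, rounds_used), or
    (None, max_rounds) if the two bits never differ.
    """
    for round_no in range(1, max_rounds + 1):
        bits = [p(round_no) for p in procs]
        if bits[0] != bits[1]:
            # The process that chose 1 becomes the leader.
            return bits.index(1), round_no
    return None, max_rounds


def deterministic_process():
    # Hypothetical deterministic rule: it depends only on the round number,
    # so two copies of it always produce the same bit.
    return lambda round_no: round_no % 2


def randomized_process(rng):
    # Each call flips a fresh fair coin.
    return lambda round_no: rng.getrandbits(1)


if __name__ == "__main__":
    print(elect_leader([deterministic_process(), deterministic_process()]))
    rng = random.Random(0)
    print(elect_leader([randomized_process(rng), randomized_process(rng)]))
```

In the randomized run the expected number of rounds is 2, since each round succeeds independently with probability 1/2. Real randomized distributed algorithms must also cope with asynchrony and failures, which this sketch ignores.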

Distributed systems research at Yale includes work both on programming-language support for distributed computing and on the use of distributed systems techniques to support parallel programming. Such work is designed to lift some of the burden of understanding complex distributed systems from the shoulders of system designers by letting the compiler or run-time libraries handle issues of scheduling and communication.

Zhong Shao’s FLINT project focuses on developing a new mobile-code architecture to support efficient data and program migration on distributed and heterogeneous computing platforms. FLINT uses a common typed intermediate language to support safe execution of code written in multiple programming languages such as Java, C, and ML.

David Gelernter’s work on developing the Linda coordination language and related tools is an example of using distributed systems techniques to support parallel programming. Linda provides a virtual “tuple space” through which processes can communicate without regard to location. His current Lifestreams project similarly simplifies information-management tasks by freeing the user from many of the clerical duties imposed by traditional filesystems.

Avi Silberschatz specializes in transaction management techniques as they relate to both distributed database systems and multidatabase systems.
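The tuple-space idea behind Linda can be sketched in a few lines. This is a minimal single-machine approximation using threads, not Linda itself: the class name `TupleSpace`, the method name `in_` (Python reserves `in`), and the use of `None` as a wildcard in place of Linda's typed formal parameters are all assumptions made for the example. The operations correspond to Linda's `out` (add a tuple), `rd` (read a matching tuple), and `in` (remove a matching tuple), where the last two block until a match exists.

```python
import threading


class TupleSpace:
    """A toy in-memory tuple space shared by threads."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        """Add a tuple to the space and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, pattern):
        # A pattern matches a tuple of the same length whose fields are
        # equal wherever the pattern field is not the None wildcard.
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return tup
        return None

    def rd(self, *pattern):
        """Block until a tuple matches, then return it without removing it."""
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            return tup

    def in_(self, *pattern):
        """Block until a tuple matches, then remove and return it."""
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            self._tuples.remove(tup)
            return tup


if __name__ == "__main__":
    space = TupleSpace()

    def worker():
        # The worker does not know who produced the task or where.
        _, n = space.in_("task", None)
        space.out("result", n * n)

    t = threading.Thread(target=worker)
    t.start()
    space.out("task", 7)
    print(space.in_("result", None))
    t.join()
```

The producer and the worker never refer to each other directly; they communicate only through the contents of the shared space, which is the location independence described above.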