Modeling and Verification of Randomized Distributed Real-Time Systems


Author: Roberto Segala

Appears: Ph.D. thesis, Department of Electrical Engineering and Computer Schence, Laboratory for Computer Science, Massachusetts Institute of Technology. Available as Technical Report MIT/LCS/TR-676, 1995.

Work carried out in the Theory of Distributed Systems group under the supervision of Nancy Lynch.

Abstract: Randomization has proved to be an exceptional tool for the design of distributed algorithms, sometimes yielding solutions to problems that are inherently complex, or even unsolvable, in the setting of deterministic algorithms. However, this tool has a price: even simple randomized algorithms can be extremely hard to verify and analyze.

We address the problem of verification of randomized distributed algorithms both from the theoretical and the practical point of view. From the theoretical point of view we build a new mathematical model of randomized distributed computation; from the practical point of view we develop techniques to be used for the actual verification of randomized systems. Our analysis involves both untimed and timed systems, so that real-time properties can be investigated.

The model is a conservative extension of labeled transition systems. A probabilistic automaton is a state machine with transitions, where a transition from a state leads to a discrete probability distribution over pairs consisting of a label and a state. A probabilistic automaton contains pure nondeterministic behavior in that from each state there can be several transitions, and probabilistic behavior in that once a transition is chosen the label that occurs and the state that is reached are determined by a probability distribution. The resolution of pure nondeterminism leads to probabilistic executions, which are Markov chain like structures. Once the pure nondeterminism is resolved, the probabilistic behavior of a probabilistic automaton can be studied.

The properties of a randomized algorithm are stated in terms of satisfying some other property with a minimal or maximal probability no matter how the nondeterminism is resolved. In stating the properties of an algorithm we account also for the possibility to impose restrictions on the ways to resolve the nondeterminism (e.g., fair scheduling, oblivious scheduling,...). We develop techniques to prove the validity of some property by reducing the problem to the verification of properties of non-randomized systems. One technique is based on coin lemmas, which state lower bounds to the probability of some chosen random draws to give some chosen outcomes no matter how the nondeterminism is resolved. We identify a collection of progress statements which can be used to prove upper bounds to the expected running time of an algorithm. The methods are applied to prove that the randomized dining philosophers algorithm of Lehmann and Rabin guarantees progress in expected constant time and that the randomized algorithm for agreement of Ben-Or guarantees agreement in expected exponential time.

We extend some of the common semantics for labeled transition systems to the probabilistic framework. We define a compositional trace semantics where a trace is replaced by a probability distribution over traces, called a trace distribution, and we extend the classical bisimulation and simulation relations, both in their strong and weak version. Furthermore, we define probabilistic forward simulations, where a state is related to a probability distribution over states. All the simulation relation are shown to be sound for the trace distribution precongruence.

In summary, we obtain a framework that accounts for the classical theoretical results for concurrent systems and that at the same time proves to be suitable for the actual verification of randomized distributed real-time systems. This double feature should lead eventually to the easy extension of several verification techniques that are currently available for distributed systems, thus rendering the analysis of randomized systems easier and more reliable.

Download:

Download the table of contents.

Download the thesis.


homepage