Inter-procedural dataflow analysis of parameterized concurrent software

A system and method for computing dataflow in concurrent programs of a computer system, includes, given a family of threads (U1, . . . , Um) and a Linear Temporal Logic (LTL) property, f, for a concurrent program, computing a cutoff for the LTL property, f, where c is called the cutoff if for all n greater than or equal to c, Un satisfies f if Uc satisfies f. The cutoff is computed using weighted multi-automata for internal transitions of the threads. Model checking a cutoff number of processes is performed to verify race freedom in the concurrent program.

Description
RELATED APPLICATION INFORMATION

This application claims priority to provisional application Ser. No. 60/884,048 filed on Jan. 9, 2007, incorporated herein by reference. This application also claims priority to provisional application Ser. No. 60/828,246 filed on Oct. 5, 2006, incorporated herein by reference.

The present application is related to U.S. application Ser. No. 11/867,160, filed concurrently herewith, entitled “MODEL CHECKING PARAMETERIZED THREADS FOR SAFETY” and incorporated herein by reference.

BACKGROUND

1. Technical Field

The present invention relates to computer system verification and more particularly to verification of concurrent programs, which exploit parameterized qualities of computer systems comprised of many copies of the same hardware or software component.

2. Description of the Related Art

Computer verification is needed to ensure that a computer system operates properly and that the results obtained are trustworthy. One form of verification is testing. In testing, the actual behavior of a system is examined on a set of inputs and matched against an expected result. Due to a large or infinite number of possible inputs, it becomes impossible to confirm that a system behaves correctly in all circumstances.

Verification tries to address these issues. Verification provides a mathematical or model basis for simulating the system behavior. A model and its intended behavior are defined. A machine is usually modeled as a system whose state evolves over time; the model includes a specification of the state space and how the system can traverse it. Temporal logic has been shown to be useful in expressing behavior of reactive systems. The model-checking approach to verification includes taking the mathematical model of the system under consideration and checking the validity of a temporal logic formula within the model.

A primary problem faced by all methods is known as state explosion. State explosion means that the state space of the system under consideration grows rapidly (e.g., exponentially) with the amount of memory used (e.g., registers, program variables, pointers, etc.). This limits the verification methods.

Multi-threading is a standard way of exploiting parallelism among different components of a computer system. As a result, the use of concurrent multi-threaded programs is becoming pervasive. Examples include operating systems, databases, embedded systems (cell phones, multimedia consumer products), etc. Since verification typically does not scale for large scale concurrent programs, there is a deep interest in leveraging static analysis techniques like inter-procedural dataflow analysis for debugging multi-threaded programs. While inter-procedural dataflow analysis has been shown to be a very effective technique for finding bugs in sequential programs, there has been very little work on extending such dataflow analyses to the concurrent domain.

SUMMARY

Concurrent programs with many replicated threads, e.g., threads running the same piece of code, occur in many important applications. These include implementations of protocols for networking, cache coherence, synchronization and multi-core architectures running multi-threaded software, among others. As a concrete example, we consider Linux™ device drivers. Driver code is supposed to work correctly irrespective of the number of threads executing it. For such applications, the goal is to establish correctness of programs of the form $U_1^{n_1} \| \cdots \| U_m^{n_m}$ irrespective of the program size as measured by the number $n_i$ of threads executing the code for driver $U_i$. In the art, this is often referred to as the Parameterized Model Checking Problem (PMCP). Clearly, this is important as correctness of a system with a fixed number of threads does not, in general, establish correctness for an arbitrary number.

In practice, however, deciding the PMCP is considered a hard problem. Therefore, the approach that is typically followed is to first tackle the seemingly simpler problem of trying to establish correctness for programs with a fixed number (typically 2) of replicated threads. However, we obtain the somewhat surprising result that establishing correctness for a fixed number (even two) of replicated components is, in many important cases, provably less tractable than establishing parameterized correctness.

This has at least two implications. First, when reasoning about parameterized recursive programs, it is important to try to reason directly about parameterized correctness rather than attempt to establish correctness for a special case comprising a small fixed number of replicated threads and successively increasing the number of copies. To illustrate the second, and more important, conclusion of practical interest, we consider the scenario where our end goal is not parameterized reasoning but establishing correctness of a program with a fixed number of, possibly distinct, threads.

Suppose that we want to establish the absence of data races in a program $U_1 \| U_2$ comprised of threads $U_1$ and $U_2$ running two possibly distinct device drivers. Then, if we establish the absence of a data race in the parameterized system $U_1^n \| U_2^m$, comprised of arbitrarily many copies of $U_1$ and $U_2$, it automatically establishes data race freedom for $U_1 \| U_2$. One key point is that, in many cases of interest, reasoning about $U_1 \| U_2$ is undecidable whereas the PMCP is efficiently decidable.

We consider the PMCP for concurrent programs of the form $U_1^{n_1} \| \cdots \| U_m^{n_m}$ comprised of an arbitrary number $n_i$ of copies of a template thread $U_i$ interacting with each other using standard synchronization primitives like pairwise and asynchronous rendezvous, locks, broadcasts, and disjunctive guards. We model threads as Pushdown Systems (PDS), which have emerged as a natural and powerful framework for analyzing recursive programs. Correctness properties are expressed using multi-indexed LTL\X. Note that absence of the “next-time” operator X makes the logic stuttering insensitive, which is usual when reasoning about parameterized systems. For ease of exposition, we formulate our results for systems with a single template PDS and for double-indexed LTL\X properties. Extension to systems with multiple templates and k-index properties, where k>2, will be understood by those skilled in the art.

Our new results show that decidability of the PMCP hinges on the set of temporal operators allowed in the correctness property, thereby providing a natural way to characterize fragments of double-indexed LTL for which the PMCP is decidable. We use $L(Op_1, \ldots, Op_k)$, where $Op_i \in \{F, U, G\}$, to denote the fragment comprised of formulae of the form Ef, where f is a double-indexed LTL\X formula in positive normal form (PNF), viz., only atomic propositions are negated, built using the temporal operators $Op_1, \ldots, Op_k$ and the Boolean connectives $\wedge$ and $\vee$. Here F (“sometimes”), U (“until”) and G (“always”) denote the standard temporal operators and E is the “existential path quantifier”. L(U,G) is the full-blown double-indexed LTL\X.

In this disclosure, we delineate precisely the decidability/undecidability boundary of the PMCP for double-indexed LTL\X for each of the standard synchronization primitives. Specifically, we show the following.

(a) The PMCP for L(F,G) and L(U) is, in general, undecidable even for systems wherein the PDSs do not interact at all with each other. These results imply that to get decidability of the PMCP for PDSs, interacting or not, we have to restrict ourselves to either the sub-logic L(F) or the sub-logic L(G). For these sub-logics, decidability of the PMCP depends on the synchronization primitive used by the PDSs.

(b) For the sub-logic L(F), we show that the PMCP is efficiently decidable for PDSs interacting via pairwise or asynchronous rendezvous, disjunctive guards and nested locks but remains undecidable for broadcasts and non-nested locks. The decidability for pairwise rendezvous (and indeed for asynchronous rendezvous and disjunctive guards) is surprising given the undecidability of model checking systems comprised of two PDSs (even when they are isomorphic to each other) interacting via pairwise rendezvous for reachability, a cornerstone undecidability result for model checking interacting PDSs. Our new results show that the PMCP for PDSs interacting via pairwise rendezvous is not only decidable but efficiently so. This is especially interesting as it illustrates that for pairwise rendezvous (and for asynchronous rendezvous and disjunctive guards) switching to the parameterized version of the problem makes it more tractable.

(c) For the fragment L(G), we show that the PMCP is decidable for pairwise and asynchronous rendezvous, disjunctive guards and locks (even non-nested ones). This settles the PMCP for all the standard synchronization primitives.

Let $\{U^n\}$ be the parameterized family of systems defined by the template PDS U interacting via pairwise rendezvous. To get decidability for L(F), we start by formulating a new efficient procedure to compute the set of control states of U which are parameterized reachable, i.e., reachable in $U^n$ for some n. This is accomplished via a fixpoint computation which starts with the set $R_0$ containing the initial state of U, and in the ith iteration constructs the set $R_{i+1}$ of control states that become parameterized reachable assuming that all states in $R_i$ are parameterized reachable. The crucial point is that in adding a new control state c to $R_i$, we have to not only ensure that synchronization constraints arising out of rendezvous are met but also that the newly added states are context-free reachable from existing parameterized reachable states. The checking of the two constraints is dovetailed, i.e., carried out in an interleaved fashion until a fixpoint is reached in that no new states are discovered. We next show, via a flooding argument, that the PMCP for a formula f of L(F) reduces to standard model checking for a system with two non-interacting copies of the PDS $U_R$, where $U_R$ is the template that we get from U by retaining only the parameterized reachable control states of U and converting all pairwise rendezvous between such states to internal transitions. The last problem is known to be efficiently decidable, giving us the decidability result. Decidability for PDSs with asynchronous rendezvous and disjunctive guards follows via similar procedures.

To get decidability for L(G), we first show cutoff results. We say that c is a cutoff for formula f if for $m \geq c$, $U^m \models f$ if $U^c \models f$. By leveraging the use of Weighted Multi-Automata, we give new procedures to compute cutoffs for L(F) and L(G) formulae for PDSs interacting via pairwise and asynchronous rendezvous. For PDSs interacting via locks, this cutoff is known to be k for k-index properties. The existence of cutoffs reduces the PMCP to model checking systems with finitely many PDSs, which we show to be decidable for disjunctive guards and (non-nested) locks and which is already known to be decidable for PDSs interacting via pairwise and asynchronous rendezvous. For PDSs interacting via disjunctive guards, we show, via a flooding argument, that the PMCP for a formula f of L(G) reduces to standard model checking for a system with two (non-interacting) copies of a simplified PDS $U_R$. The last problem is known to be efficiently decidable, giving us the decidability result.

A system and method for computing dataflow in concurrent programs of a computer system, includes, given a family of threads (U1, . . . , Um) and a Linear Temporal Logic (LTL) property, f, for a concurrent program, computing a cutoff for the LTL property, f, where c is called the cutoff if for all n greater than or equal to c, Un satisfies f if Uc satisfies f. The cutoff is computed using weighted multi-automata for internal transitions of the threads. Model checking a cutoff number of processes is performed to verify race freedom in the concurrent program.

These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.

BRIEF DESCRIPTION OF DRAWINGS

The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:

FIG. 1 is a block/flow diagram showing a system/method for solving a dataflow computation in concurrent programs in accordance with an illustrative embodiment;

FIG. 2 is a diagram showing a template process U with control states c and transition designations with ! and ? for demonstrating operation in accordance with the present principles; and

FIG. 3 is a diagram showing a fixpoint computation in accordance with the present principles showing progression through several iterations.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The present embodiments relate to computer system verification and more particularly to verification of concurrent programs, such as, e.g., device drivers used for controlling hardware components like disk drives, audio speakers, etc. In particularly useful embodiments, parameterized qualities of computer systems are exploited in that the concurrent programs are comprised of many copies of the same hardware or software component. In addition, the present embodiments are applicable to other applications, such as, e.g., embedded software used in mobile devices like cell phones, personal digital assistants (PDAs), database software, SQL servers, application level software, like web browsers (Firefox™, Explorer™) or any application using concurrency.

Model checking of interacting pushdown systems is a core problem underlying dataflow analysis for concurrent programs. However, it is decidable only for very restricted temporal logic fragments. The fundamental obstacle is the undecidability of checking non-emptiness of an intersection of two context-free languages. It is not difficult to couple two different pushdown systems (PDSs) either by making the synchronization primitive expressive enough or the property being model-checked strong enough to accept precisely the intersection of the context-free languages accepted by these PDSs. This results in the undecidability of the model checking problem. However, the present principles exploit the fact that many important classes of concurrent systems are parameterized, i.e., the classes are comprised of many replicated copies of a few basic types of components.

In accordance with the present principles, the general difficult problem need not be solved. We exploit the fact that many concurrent systems are parameterized, i.e., composed of many replicated copies of the same basic component. Indeed, for most distributed protocols for networking, cache coherence, and synchronization, the exact same piece of code implementing the protocol is run on different machines, thus making the system parameterized. The Internet can be thought of as a network of computers, each running the TCP/IP protocol. Other examples include multi-core architectures with multi-threading. Indeed, a device driver is supposed to run correctly irrespective of the number of threads executing it.

A new and efficient inter-procedural dataflow analysis system and method are provided for parameterized multi-threaded programs. The problem reduces to the problem of model checking interacting PDSs wherein all the PDSs are copies of each other. Prior work on analyzing parameterized programs has been restricted to models where there is no effective communication between the threads (PDSs) and is thus of little practical value. In the present disclosure, we have considered more powerful and realistic models wherein PDSs can interact via locks, rendezvous (e.g., Wait/Notify from Java™) or broadcasts (e.g., Wait/NotifyAll from Java™). Thus, inter-procedural analysis is extended to the parameterized concurrent domain for realistic models of communication.

We consider the model checking problem for concurrent programs comprised of finitely, but arbitrarily, many copies of a fixed set of threads, often referred to as the Parameterized Model Checking Problem (PMCP). Modeling each thread as a PDS, we delineate the decidability boundary of the PMCP for Indexed Linear Temporal Logic (LTL) for each of the standard synchronization primitives. Our results lead to the surprising conclusion that in many cases of interest, the PMCP, even though a seemingly harder problem, is more tractable than the problem of model checking a fixed number of PDSs. For example, for PDSs interacting via pairwise rendezvous, the PMCP for reachability (presence of a data race) is efficiently decidable whereas model checking a system with two such (even isomorphic) PDSs is undecidable. Deciding the PMCP efficiently is of great importance for parameterized applications like, for instance, Linux™ device drivers. However, the broader practical implication of our results is that even if we are not interested in parameterized reasoning but only in model checking a system $U_1 \| \cdots \| U_m$ with a fixed number of possibly distinct threads $U_1, \ldots, U_m$, then in many cases it is more useful to consider the PMCP for the corresponding parameterized system $U_1^{n_1} \| \cdots \| U_m^{n_m}$ with arbitrarily many copies of $U_1, \ldots, U_m$.

Practical applications in accordance with the present principles include debugging concurrent multithreaded software, for which it is more tractable to consider the parameterized version of the problem. For example, if we want to detect data races in a concurrent program $T_1 \| T_2$ with two Linux™ device drivers $T_1$ and $T_2$, then it is more efficient and tractable to consider the same problem for a system $T_1^n \| T_2^m$ with an arbitrary number of copies of $T_1$ and $T_2$. This is surprising since it is seemingly a harder problem but in reality is much more tractable.

It should be understood that the elements shown in the FIGS. may be implemented in various forms of hardware, software or combinations thereof. Preferably, these elements may be stored on computer media and are implemented in software, on one or more appropriately programmed general-purpose digital computers having a processor and memory and input/output interfaces. Software may include but is not limited to firmware, resident software, microcode, etc.

Embodiments of the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment including both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.

Embodiments can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. A computer-usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. For example, the medium may include a computer-readable medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.

Referring now to the drawings in which like numerals represent the same or similar elements and initially to FIG. 1, a block/flow diagram showing a system/method for verifying concurrent programs is illustratively depicted. Consider a family of systems of the form $U_1^{n_1} \| \cdots \| U_m^{n_m}$ comprised of an arbitrary number $n_i$ of copies of a template thread $U_i$. Each template $U_i$ may be modeled as a Pushdown System (PDS). A PDS has a finite control part and a stack which models recursion. Formally, a PDS is a five-tuple $\mathcal{P} = (P, Act, \Gamma, c_0, \Delta)$, where P is a finite set of control states, Act is a finite set of actions containing the empty action $\epsilon$, $\Gamma$ is a finite stack alphabet, and $\Delta \subseteq (P \times \Gamma) \times Act \times (P \times \Gamma^*)$ is a finite set of transition rules. If $((p, \gamma), a, (p', w)) \in \Delta$ then we write $(p, \gamma) \overset{a}{\hookrightarrow} (p', w)$. A configuration of $\mathcal{P}$ is a pair (p, w), where $p \in P$ denotes the control location and $w \in \Gamma^*$ the stack content. We call $c_0$ the initial configuration of $\mathcal{P}$. The set of all configurations of $\mathcal{P}$ is denoted by C. For each action a, we define a relation $\xrightarrow{a} \subseteq C \times C$ as follows: if $(q, \gamma) \overset{a}{\hookrightarrow} (q', w)$, then $(q, \gamma v) \xrightarrow{a} (q', wv)$ for every $v \in \Gamma^*$.
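The one-step transition relation defined above can be illustrated with a minimal Python sketch. The tuple encoding of rules, the function name successors, and the example rule set are assumptions made for illustration only; they do not appear in the disclosure.

```python
# Hypothetical encoding (an assumption of this sketch):
#   rule   = (p, gamma, action, p_next, w)  with w a tuple of stack symbols
#   config = (control_state, stack)         with stack a tuple, top symbol first

def successors(rules, config):
    """Return all (action, config') pairs reachable from config in one step."""
    control, stack = config
    if not stack:
        # Every rule inspects the top stack symbol, so an empty stack is stuck.
        return []
    top, rest = stack[0], stack[1:]
    # A rule (p, gamma) -a-> (p', w) rewrites the top symbol gamma to the word w
    # and leaves the remainder v of the stack untouched.
    return [(act, (q, tuple(w) + rest))
            for (p, g, act, q, w) in rules
            if p == control and g == top]

# Example: a call pushes a return symbol, a return pops it.
rules = [
    ("p", "g", "call", "q", ("h", "g")),  # push h on top of g
    ("q", "h", "ret", "p", ()),           # pop h
]
print(successors(rules, ("p", ("g",))))
print(successors(rules, ("q", ("h", "g"))))
```

Because a rule only inspects the control state and the top stack symbol, the same rule applies no matter what lies deeper in the stack, which is the "for every v" clause of the definition.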

We use $\{U_1, \ldots, U_m\}$ to denote the family of concurrent programs (or threads), $U_1^{n_1} \| \cdots \| U_m^{n_m}$, formed by the interleaved parallel composition of $n_i$ copies of template $U_i$. The jth copy of $U_i$, denoted by $U_i[j]$, communicates with the other threads via the standard synchronization primitives: locks, pairwise or asynchronous rendezvous, broadcasts and disjunctive Boolean guards. Pairwise rendezvous are inspired by the calculus of communicating systems (CCS) (e.g., a language for writing concurrent programs) whereas asynchronous rendezvous and broadcasts are used to model the Wait/Notify and Wait/NotifyAll primitives of Java™. The semantics of $U_1^{n_1} \| \cdots \| U_m^{n_m}$ is defined in the usual way and is therefore omitted. For the sake of simplicity, we formulate our results for parameterized systems with a single template and for double-index properties. Given a global computation x of $U^n$, we use x[i,j] to denote the sequence resulting from projecting x onto the local computation sequences of threads U[i] and U[j].

Correctness Properties. Given template U, we consider double-index properties of the form $\bigwedge_{i,j} E g(i,j)$, where g(i,j) is an LTL\X property interpreted over the local control states of copies U[i] and U[j]. Note that due to symmetry $U^n \models \bigwedge_{i,j} E g(i,j)$ if $U^n \models E g(1,2)$. We thus restrict ourselves to properties of the form Eg(1,2). For this logic, we follow the classification $L(Op_1, \ldots, Op_k)$ based on the temporal operators allowed in the correctness property as previously formulated. We observe that double-index LTL\X is a very rich logic which can encode many properties of interest. For instance, the presence of a data race can be formulated as the double-index formula $EF(c_1 \wedge d_2)$.

Given the family of PDSs for a concurrent program comprised of many similar components, e.g., device drivers which run hardware like disk drives, audio speakers, etc., and a temporal logic property f (as described above), a cutoff for f is computed in block 12. Here, c is called a cutoff if for all $n \geq c$, $U^n$ satisfies f if $U^c$ satisfies f. This reduces the problem to debugging a system with up to a cutoff number of processes.

In block 14, to compute these cutoffs, employ weighted multi-automata. Computation of these cutoffs reduces to pre*-closure computations of weighted automata which can be carried out efficiently in polynomial time in the size of source code. Once cutoffs have been computed, model check the resulting systems of cutoff size in block 16. The techniques used depend on the communication primitives used by the threads and the techniques may be known.

The present embodiments extend inter-procedural dataflow analysis to the parameterized concurrent domain for realistic models of communication among threads. All the standard Java™ communication primitives may be modeled. The present embodiments are more scalable, i.e., can potentially verify larger programs than existing techniques, and accomplish this by avoiding construction of the global state space of the given program thus bypassing the state explosion problem. The analysis is reduced from a concurrent multithreaded program to its individual threads.

The methods are both sound and complete, thus avoiding bogus error traces that could be generated by less precise techniques. This is important from a commercial standpoint as most of the resources spent in real life verification go into detecting/avoiding bogus error traces.

Undecidability Barriers: We start by showing two undecidability results for the PMCP for systems comprised of PDSs that do not even interact with each other.

The decidability of the PMCP hinges on the set of temporal operators allowed in the correctness property, thereby providing a natural way to characterize fragments of double-indexed LTL for which the PMCP is decidable. In one example, we use $L(Op_1, \ldots, Op_k)$, where $Op_i \in \{F, U, G\}$, to denote the fragment comprised of formulae of the form Ef, where f is a double-indexed LTL\X formula in positive normal form (PNF), viz., only atomic propositions are negated, built using the temporal operators $Op_1, \ldots, Op_k$ and the Boolean connectives $\wedge$ and $\vee$. Here F (“sometimes”), U (“until”) and G (“always”) denote the standard temporal operators and E is the “existential path quantifier”. L(U,G) is the full-blown double-indexed LTL\X.

The PMCPs for L(U) and L(G,F) are undecidable for systems comprised of non-interacting PDSs. An important consequence of this is that for more expressive systems wherein PDSs interact using some synchronization mechanism, we need to focus only on the remaining fragments, e.g., L(F) and L(G).

Pairwise and Asynchronous Rendezvous: Let $\{U^n\}$ be the parameterized family defined by a template process U modeled as a PDS synchronizing via pairwise rendezvous. Here, $\Sigma$, the set of action symbols of U, is comprised of the set $\Sigma_{in}$ of internal transition labels, and the sets $\Sigma_{pr} \times \{!\}$ and $\Sigma_{pr} \times \{?\}$ of send and receive pairwise rendezvous transitions, respectively. We assume that synchronizing transitions, i.e., those labeled by actions in $\Sigma_{pr} \times \{!\} \cup \Sigma_{pr} \times \{?\}$, do not modify the stack of the PDS executing the transition. For action $l \in \Sigma_{pr}$, a pair of transitions labeled with l! and l? are called matching. We recall that for $r \in \Sigma_{pr} \times \{!\} \cup \Sigma_{pr} \times \{?\}$, transition $tr_1: a \xrightarrow{r} b$ of a process U[i] of $U^n$ is enabled in global state s if there exists a process U[j] of $U^n$, where $j \neq i$, in local state c such that there exists a matching transition of the form $tr_2: c \xrightarrow{r'} d$ in $\Delta$. To execute the rendezvous, both the pairwise send and receive transitions $tr_1$ and $tr_2$ must be fired synchronously in one atomic step.
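The enabling condition for a pairwise rendezvous can be sketched in Python as follows. Since synchronizing transitions do not modify the stack, this illustration tracks only the local control state of each process. The triple encoding of transitions and the function name rendezvous_successors are assumptions of the sketch, not part of the disclosure.

```python
def rendezvous_successors(transitions, state):
    """Global successors of `state` under one atomic pairwise rendezvous.

    transitions: list of (src, (label, kind), dst), kind is "!" (send) or "?" (receive)
    state: tuple of local control states, one entry per process
    """
    succs = set()
    for i, a in enumerate(state):
        for (src1, (lab1, kind1), dst1) in transitions:
            if src1 != a or kind1 != "!":
                continue
            # The send of process i is enabled only if a *different* process j
            # currently sits in a state offering the matching receive.
            for j, c in enumerate(state):
                if j == i:
                    continue
                for (src2, (lab2, kind2), dst2) in transitions:
                    if src2 == c and lab2 == lab1 and kind2 == "?":
                        nxt = list(state)
                        nxt[i], nxt[j] = dst1, dst2  # both moves fire in one step
                        succs.add(tuple(nxt))
    return succs

transitions = [("a", ("l", "!"), "b"), ("c", ("l", "?"), "d")]
print(rendezvous_successors(transitions, ("a", "c")))
```

A single process, or two processes that both sit in the sending state, produce no successor, which reflects the requirement that a matching partner with a different index must exist.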

We present decision procedures for the PMCP for L(F) and L(G) for PDSs interacting via pairwise and asynchronous rendezvous. We start by presenting a provably efficient procedure for computing the set of all parameterized reachable control locations of U. This is needed for not only formulating the procedures for L(F) and L(G) but is also of independent interest as detecting the presence of data races can be reduced to deciding parameterized reachability.

Parameterized Reachability: We say that a control state c (configuration c) of template process U is parameterized reachable if there exists a reachable global state s of $U^n$, for some n, with a process in control state c (configuration c).

It can be shown that if a configuration c of U is parameterized reachable, then given l, for some k, there exists a reachable global state s of $U^k$ with at least l copies of c. In other words, we can pump up the multiplicity of each parameterized reachable configuration of U to any arbitrarily large value. This relieves us of the burden of tracking the multiplicity of each configuration of U.

Unbounded Multiplicity: Let R be the set of all parameterized reachable configurations of U and let R′ be a finite subset of R. Then given l, for some m, there exists a finite computation of $U^m$ leading to a global state s with at least l copies of each configuration in R′.

The above result reduces the PMCP for $EF(c_1 \wedge \cdots \wedge c_k)$, i.e., the presence of a data race, to the PMCP for EFc, where c is a control state of U. We have: $\exists n,\ U^n \models EF(c_1 \wedge \cdots \wedge c_k)$ if for each $i \in [1 \ldots k]$, $c_i$ is parameterized reachable.

While computing parameterized reachable control states for the case where U is a finite state labeled transition system can be accomplished via a simple fixpoint computation, for PDSs it is complicated by the requirement to simultaneously satisfy constraints arising both out of synchronization primitives and context-free reachability introduced by the stack.

Referring to FIG. 2, an example template process is shown for determining reachability. Consider the template process U. Suppose that we want to decide whether for some n, $U^n \models EF c_1$. We start with the set $R_0 = \{c_0\}$ containing only the initial state $c_0$ of U. We then construct a series of sets $R_0, \ldots, R_m$, where $R_{i+1}$ is obtained from $R_i$ by adding new control states that become parameterized reachable assuming that all states in $R_i$ are parameterized reachable. In constructing $R_{i+1}$ from $R_i$, we need to make sure that both the constraints, i.e., those imposed by (i) the synchronization primitives, and (ii) context-free reachability, are satisfied. We accomplish this in a dovetailed fashion.

First, to satisfy the synchronization constraints, we convert all transitions of the form $a \xrightarrow{p} b$ such that there exists a transition of the form $c \xrightarrow{p'} d$, where p and p′ are matching send and receive rendezvous actions with $c \in R_i$, to an internal transition of the form $a \xrightarrow{\tau} b$, where $\tau$ is a newly introduced special internal action symbol in $\Sigma_{in}$. This is motivated by the fact that since c is parameterized reachable, we can ensure that if a becomes parameterized reachable (now or in some future iteration), then, for some m, there exists a reachable global state of $U^m$ with a process each in local states a and c. In other words, if a becomes reachable, the rendezvous transition $a \xrightarrow{p} b$ can always be enabled and executed. Thus, it can be treated like an internal transition. In this way, by flooding all the control states of $R_i$, we can remove all the synchronization constraints arising out of pairwise send or receive transitions emanating from control states in $R_i$. This will enable every rendezvous transition with a matching send/receive starting at a control state in $R_i$. Such transitions can therefore be replaced by internal transitions. Motivated by this, we define $U_{i+1}$ to be the template that we get from the original template U by replacing the appropriate pairwise rendezvous send/receive transitions as described above with internal transitions and removing the remaining rendezvous send and receive transitions.

To check that the second constraint, i.e., context-free reachability, is satisfied, we can now use any procedure for model checking a single PDS to determine the set $R^c_i$ of those control states of U that are reachable in the individual PDS $U_i$. This gives us the set $R^c_i$ of all the context-free reachable states in $U_i$. If new control states become reachable via removal of some synchronization constraints in the previous step, they are added to $R_{i+1}$; otherwise, we have reached a fixpoint and the procedure terminates.

Referring to FIG. 3, in the example, $R_0$ is initialized to $\{c_0\}$. This enables both the transitions $c_0 \to c_9$ and $c_0 \to c_8$, and hence both of them can be converted to internal transitions, resulting in the template $U_1$. In a second iteration ($U_2$), we note that $c_5$, $c_6$, $c_8$ and $c_9$ are all reachable control states of template $U_1$ and so $R_1 = \{c_0, c_5, c_6, c_8, c_9\}$. Now, since both $c_0$ and $c_5$ are in $R_1$, the rendezvous transitions $c_5 \to c_2$ and $c_0 \to c_7$ become enabled and can be converted to internal transitions, resulting in the template $U_2$. In $U_2$, control states $c_2$, $c_4$ and $c_7$ now become reachable and are therefore added to $R_1$, resulting in $R_2 = \{c_0, c_2, c_4, c_5, c_6, c_7, c_8, c_9\}$. Finally, since both the control states $c_4$ and $c_6$ are in $R_2$, the rendezvous transitions $c_6 \to c_3$ and $c_4 \to c_1$ are converted to internal transitions, resulting in the template $U_3$. Since $c_1$ and $c_3$ are reachable control locations of $U_3$, these control locations are now included in $R_3$, thereby reaching a fixpoint and leading to termination of the procedure. Since $c_1 \in R_3$, we conclude that $c_1$ is parameterized reachable. A formal description of a method A is given below. The method A returns the set of parameterized reachable control states of U.

METHOD A: Initialize i=0 and $R_0$={c0}, where c0 is the initial state of U. Next, set i=i+1. Construct PDS $U_i$ by replacing each pairwise send (receive) transition of template U of the form $a \to b$, such that there exists a matching receive (send) transition of the form $c \to d$ where $c \in R_{i-1}$, by the internal transition $a \xrightarrow{\tau} b$ and removing the remaining pairwise send or receive rendezvous transitions. Compute the set $R^c_i$ of context-free reachable control locations of $U_i$ using a procedure for model checking a single PDS. Set $R_i = R_{i-1} \cup R^c_i$. Except for the initialization step, repeat these steps until $R_i \subseteq R_{i-1}$, i.e., until no new control state is added. Return $R_i$.
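The fixpoint loop of method A can be sketched as follows. This is only an illustration under a strong simplifying assumption: the stack is ignored, so plain graph reachability stands in for the single-PDS model checking procedure that the method actually calls. The function names, the tuple encoding of transitions and the rendezvous labels are hypothetical.

```python
from collections import deque

def reachable(start, edges):
    """Plain graph reachability; a stand-in (assumption) for a single-PDS model checker."""
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        for (a, b) in edges:
            if a == s and b not in seen:
                seen.add(b)
                queue.append(b)
    return seen

def method_a(c0, internal, sends, receives):
    """internal: set of (a, b); sends/receives: sets of (a, label, b)."""
    r = {c0}                                   # R_0 = {c0}
    while True:
        edges = set(internal)                  # transitions of U_i
        # A send becomes internal if a matching receive starts in R_{i-1}.
        for (a, label, b) in sends:
            if any(l == label and c in r for (c, l, _d) in receives):
                edges.add((a, b))
        # A receive becomes internal if a matching send starts in R_{i-1}.
        for (a, label, b) in receives:
            if any(l == label and c in r for (c, l, _d) in sends):
                edges.add((a, b))
        new_r = r | reachable(c0, edges)       # R_i = R_{i-1} U R^c_i
        if new_r == r:                         # fixpoint: nothing new was added
            return r
        r = new_r

# Hypothetical template: c5 and c6 can never be reached.
sends = {("c0", "a", "c1"), ("c1", "b", "c3"), ("c5", "d", "c6")}
receives = {("c0", "a", "c2"), ("c2", "b", "c4"), ("c6", "d", "c5")}
print(sorted(method_a("c0", set(), sends, receives)))
```

In this toy template the first iteration enables the `a` rendezvous (both endpoints start at c0), the second enables the `b` rendezvous, and the third detects the fixpoint.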

Complexity Analysis: We start by noting that in each iteration of the method A except the last, we add at least one new control state to $R_i$. Thus, the method terminates after at most |Q| iterations, where Q is the set of control states of U. During the ith iteration we need to decide, for each control state in $Q \setminus R_i$, whether it is context-free reachable in $U_{i+1}$, which, by using a model checking procedure for PDSs, can be accomplished in $O(|U|^3)$ time, where |U| is the size of U. Each iteration therefore takes at most $O(|U|^4)$ time. Thus, the entire method runs in $O(|U|^5)$ time. The Parameterized Model Checking Problem for control state reachability, and hence for EF(c1 ∧ . . . ∧ ck) (data race), for systems composed from a template PDS U interacting via pairwise rendezvous can be decided in $O(|U|^5)$ time, where |U| is the size of U.

Asynchronous Rendezvous: The procedure for deciding the PMCP for PDSs interacting via asynchronous rendezvous, which are more expressive than pairwise rendezvous, is essentially the same as the method A. A minor modification is needed to account for the slightly different semantics of an asynchronous rendezvous. The only difference is that an asynchronous send transition $a \to b$ can be executed irrespective of whether a matching receive $c \to d$ is present or not. A receive transition, on the other hand, does require a matching send to be currently enabled, with both the send and receive transitions then being fired atomically. Now, the construction of PDS $U_i$ in method A is modified as follows: We replace each asynchronous send transition of template U of the form $a \to b$ with the internal transition $a \xrightarrow{\tau} b$. On the other hand, to replace a receive transition of the form $a \to b$ with the internal transition $a \xrightarrow{\tau} b$, we need to test whether there exists a matching send transition of the form $c \to d$ with $c \in R_{i-1}$. The remaining receive asynchronous rendezvous transitions are removed. The time complexity of the method remains the same.
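The modified construction step can be sketched as a small function that builds the internal transitions of one iterate from the previously reachable set. The function name and the tuple encoding of transitions are hypothetical, chosen only for illustration.

```python
def build_ui_async(r_prev, internal, sends, receives):
    """Return the internal edges of U_i for asynchronous rendezvous.

    r_prev: control states in R_{i-1}
    internal: set of (a, b); sends/receives: sets of (a, label, b)
    """
    edges = set(internal)
    # Every asynchronous send is converted unconditionally.
    edges |= {(a, b) for (a, _label, b) in sends}
    # A receive is converted only if a matching send starts in R_{i-1}.
    for (a, label, b) in receives:
        if any(l == label and c in r_prev for (c, l, _d) in sends):
            edges.add((a, b))
    return edges

sends = {("c0", "a", "c1")}
receives = {("c2", "a", "c3"), ("c2", "b", "c4")}
print(sorted(build_ui_async({"c0"}, set(), sends, receives)))
```

The receive labeled `b` is dropped because no send with that label exists, which mirrors the removal of the remaining receive transitions.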

Extension to Multiple Templates: To start with, $R_0$ contains the initial control state of each of the templates $U_1, \ldots, U_m$. The set $R_i$ now tracks the union of parameterized reachable control states detected up to the ith iteration in any of the templates. Finally, in method A, for each 1 ≤ j ≤ m we construct PDS $U^j_i$ by replacing each rendezvous send/receive transition $a \to b$ in template $U_j$ having an enabled matching receive/send transition of the form $c \to d$ in any of the templates, where $c \in R_{i-1}$, with the internal transition $a \xrightarrow{\tau} b$.

Model Checking Procedure for L(F): From the given template U=(P,Act,Γ,c0,Δ), we define the new template $U_R$=($P_R$,Act,Γ,c0,$\Delta_R$), where $P_R$ is the set of parameterized reachable control states of U and $\Delta_R$ is the set of transitions of U between states of $P_R$ with each pairwise rendezvous send or receive transition converted to an internal transition. Let f be a formula of the form Eg(1,2), where g(1,2) is a double-indexed LTL\X formula with atomic propositions over U[1] and U[2]. Then, if we restrict reasoning about f to finite computation paths, for some n, $U^n \models E_{fin}\,g$ if $U_R^2 \models E_{fin}\,g$, where $E_{fin}$ quantifies only over finite paths.

The intuition behind the reduction of the PMCP to a 2-process instance is a flooding argument resulting from the unbounded multiplicity result. If f has a finite computation x of length l, say, as a model, then at most l pairwise send or receive transitions are fired along x. By the unbounded multiplicity lemma, for some m, there exists a computation y leading to a reachable state of $U^m$ with at least l copies of each control state of $U_R$. In a system with m+2 processes, we first let processes $U_3, \ldots, U_{m+2}$ execute y to flood all control states of $U_R$ with multiplicity at least l. Then, we are guaranteed that in any computation x of U[1,2] of length not more than l, each rendezvous transition can always be fired via synchronization with one of the processes $U_3, \ldots, U_{m+2}$ and can therefore be treated as an internal transition.

Thus we have: (Binary Reduction Result). For any finite computation x of Un, where n≧2, there exists a finite computation y of UR2 such that y is stuttering equivalent to x[1,2]. As an immediate corollary, it follows that if f has a model which is a finite computation of Um, for some m, then for some k, Uk|=f if UR2|=f. In particular:

Corollary: For any formula f of L(F), for some m, $U^m$ |= f if $U_R^2$ |= f.

Note that the above result reduces the PMCP for L(F) for PDSs interacting via pairwise or asynchronous rendezvous to (standard) model checking of systems comprised of only two non-interacting PDSs which is known to be efficiently decidable. As a corollary, we have that the PMCP for L(F) is decidable in polynomial time in the size of U.

Computing Cutoffs: We say that cut is a cutoff for a temporal logic formula f and a parameterized family defined by a template U if for all m ≥ cut, $U^m$ |= f if $U^{cut}$ |= f. The existence of a cutoff for a formula f is useful as it reduces the PMCP for f to a finite number of standard model checking problems for systems with up to the cutoff number of copies of U. Let B(F) be the set of branching time formulae built using the temporal operator AF, the boolean operators ∧ and ∨, and atomic propositions. We show how to compute cutoffs for L(F) formulae and then extend this to handle B(F) formulae. One motivation for computing cutoffs is that it is a step in the decision procedure for the PMCP for L(G) formulae. One can, of course, use the cutoff approach to model check L(F) formulae also.

Cutoffs for L(F) formulae: We start by observing that the cutoff cut for a formula f of L(F) is related to the number of rendezvous transitions fired along finite computations satisfying f. Let x be a finite computation of $U^n$, for some n, satisfying f. For each rendezvous transition tr of U, let $n_{tr}$ be the number of times tr is fired along x[1,2]. We assume, without loss of generality, that each rendezvous send/receive transition tr has a unique matching receive/send transition, denoted by $\overline{tr}$, in U. For each control state c, let $Tr_c$ be the set of pairwise rendezvous send or receive transitions tr of the form c → d such that $\overline{tr}$ is fired along x[1,2]. Also, for each control state c of U, let $n_c = \sum_{tr \in Tr_c} n_{\overline{tr}}$. Then, one can give a cutoff for f in terms of the values of $n_c$.

As a first step in that direction, we show that if cut is such that there exists a reachable global state of $U^{cut}$ with at least $n_c$ copies of each control state c, then, using a flooding argument, cut + 2 is a cutoff for f. Next, we estimate an upper bound for cut from $n_c$. We denote by $i_c$ the iteration of the method A in which control state c of U was first added to $R_i$. Then, we have: $U^m$ |= EFc where $m = 2^{i_c}$. For each control state c of U let $N_c$ be a cutoff for EFc. Then $cut \le \sum_{c \in R} n_c N_c$.
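As a small worked illustration of the bound just stated, the following sketch evaluates the sum of n_c times N_c over the control states and adds the two observed processes. The function name and the sample values of n_c and N_c are hypothetical; in the described procedure they would come from the weighted automaton analysis and from method A.

```python
def cutoff_from_counts(n, N):
    """n[c]: copies of control state c needed as rendezvous partners.
    N[c]: a cutoff for EF c, i.e. processes needed to put one copy in c.
    Returns the bound (sum over c of n[c] * N[c]) + 2 on the cutoff for f.
    """
    cut = sum(n[c] * N[c] for c in n)   # processes used for flooding
    return cut + 2                      # plus the two observed processes

# Hypothetical values: two copies of c1 are needed, one copy of c2.
print(cutoff_from_counts({"c1": 2, "c2": 1}, {"c1": 1, "c2": 3}))  # 2*1 + 1*3 + 2 = 7
```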

The problem of computing cut thus reduces to computing bounds for $n_{tr}$ and $N_c$. We start with $n_{tr}$, the number of pairwise rendezvous transitions fired along a computation of $U^m$, for some m, satisfying the given L(F) formula. We first consider the case where an L(F) formula is single-index, i.e., atomic propositions are interpreted only over one process. For this, we assume, without loss of generality, that each control state of U is parameterized reachable; otherwise we simply remove unreachable states and the associated transitions. Furthermore, using the same flooding argument, we have that each control state of U can be flooded with arbitrary multiplicity. Thus, when reasoning about finite computations, we can treat each rendezvous transition as an internal transition. This eases analysis as instead of reasoning about the parameterized family {$U^n$}, it suffices to reason only about the single template U.

Computation of these bounds for PDSs is complicated by context-free reachability introduced by the stack. To handle that, we leverage the notion of a Weighted Multi-Automaton (WMA), which is a Multi-Automaton (MA) with each of its transitions labeled with a non-negative integer. WMAs have been used before for dataflow analysis. However, they are employed here for a different purpose, namely, for estimating a bound on the number of pairwise rendezvous transitions fired in transiting between two control states. Intuitively, the weight labeling a transition s → t of a WMA indicates an upper bound on the number of rendezvous transitions that need to be fired in order to get from s to t.

A Weighted Multi-Automaton (WMA) may be defined as follows. Given a PDS P=(P,Γ,c0,Δ), a weighted multi-automaton is a tuple M=(Γ,Q,δ,w,I,F), where M′=(Γ,Q,δ,I,F) is a multi-automaton and $w: \delta \to \mathbb{Z}_{\ge 0}$ is a function mapping each transition of M to a non-negative integer. The weight of a finite path x of M is defined to be the sum of the weights of all the transitions appearing along x. Given states s and t of M, we use $s \xrightarrow[b]{u} t$ to denote the fact that there is a path in M from s to t labeled with u and having weight b. To estimate a bound for the number of rendezvous transitions fired along a computation satisfying f, we proceed by constructing a WMA $M_f$ for f which captures the (regular) set of all configurations of U which satisfy f. Then, if b is the weight of an accepting path for (c0, ⊥) in $M_f$, we show that there exists a path of U along which at most b pairwise rendezvous transitions are fired.
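The notion of path weight can be illustrated with a short sketch that reads a stack word through a WMA and keeps, for every state reached, the smallest total weight seen so far. The dictionary encoding of the transition function and the function name are assumptions made for this example only.

```python
def min_accepting_weight(transitions, start, finals, word):
    """transitions: dict mapping (state, symbol) to a list of (next_state, weight).
    Returns the minimum weight of a path from `start` to a final state
    labeled with `word`, or None if no such path exists.
    """
    best = {start: 0}                          # state -> least weight so far
    for sym in word:
        nxt = {}
        for q, w in best.items():
            for (q2, w2) in transitions.get((q, sym), []):
                if q2 not in nxt or w + w2 < nxt[q2]:
                    nxt[q2] = w + w2           # path weight is the sum of weights
        best = nxt
    weights = [w for q, w in best.items() if q in finals]
    return min(weights) if weights else None

wma = {
    ("s0", "a"): [("q1", 1), ("q2", 0)],
    ("q1", "b"): [("f", 0)],
    ("q2", "b"): [("f", 2)],
}
print(min_accepting_weight(wma, "s0", {"f"}, "ab"))  # paths weigh 1 and 2
```

Taking the minimum over accepting paths gives the tightest bound that this automaton offers for the word.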

Since an L(F) formula is built using the operators F, ∧ and ∨, in order to construct $M_f$ it suffices to show how to construct WMAs for Fg, g∧h and g∨h, given WMAs for g and h. Then, given an L(F) formula f, repeated applications of these constructions inside out, starting with the WMAs for the atomic propositions of f, gives us $M_f$.

DEFINITIONS

Multi-Automata: Let P=(P,Act,Γ,c0,Δ) be a pushdown system where P={p1, . . . , pm}. A P-multi-automaton (P-MA for short) is a tuple A=(Γ,Q,δ,I,F) where Q is a finite set of states, δ ⊆ Q×Γ×Q is a set of transitions, I={s1, . . . , sm} ⊆ Q is a set of initial states and F ⊆ Q is a set of final states. Each initial state si corresponds to the control state pi of P.

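We define the transition relation → ⊆ Q×Γ*×Q as the smallest relation satisfying the following:

    • if (q,γ,q′) ∈ δ then $q \xrightarrow{\gamma} q'$,
    • $q \xrightarrow{\varepsilon} q$ for every q ∈ Q, and
    • if $q \xrightarrow{w} q''$ and $q'' \xrightarrow{\gamma} q'$ then $q \xrightarrow{w\gamma} q'$.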

A multi-automaton can be thought of as a data structure that is used to succinctly represent (potentially infinite) regular sets of configurations of a given PDS. Towards that end, we say that multi-automaton A accepts a configuration (pi,w) if $s_i \xrightarrow{w} q$ for some q ∈ F. The set of configurations recognized by A is denoted by Conf(A). A set of configurations is regular if it is recognized by some MA.
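The acceptance condition can be sketched in a few lines: start at the initial state that corresponds to the control state and read the stack contents from the top. The set-of-triples encoding and the names below are hypothetical.

```python
def ma_accepts(delta, initial_of, finals, control, stack):
    """delta: set of (q, gamma, q2) transitions.
    initial_of: maps each control state p_i to its initial state s_i.
    stack: sequence of stack symbols, top of stack first.
    """
    current = {initial_of[control]}
    for gamma in stack:
        # Follow every transition labeled with the next stack symbol.
        current = {q2 for (q, g, q2) in delta if q in current and g == gamma}
    return bool(current & finals)

# A finite automaton for the infinite set {(p1, a^n b) : n >= 0}.
delta = {("s1", "a", "s1"), ("s1", "b", "f")}
print(ma_accepts(delta, {"p1": "s1"}, {"f"}, "p1", ["a", "a", "b"]))  # True
print(ma_accepts(delta, {"p1": "s1"}, {"f"}, "p1", ["b", "a"]))       # False
```

The example shows how two transitions suffice to represent infinitely many configurations.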

Alternating Pushdown Systems: Let P=(P,Act,Γ,c0,Δ) be a pushdown system. An alternating pushdown system (APDS) is a tuple P=(P,Γ,Δ), where P is a finite set of control locations, Γ is a finite stack alphabet, and Δ ⊆ (P×Γ)×$2^{P\times\Gamma^*}$ is a finite set of transition rules. For (p,γ,S) ∈ Δ, each successor set {(p1,w1), . . . , (pn,wn)} ∈ S denotes a transition of P and is written $(p,\gamma) \hookrightarrow \{(p_1,w_1), \ldots, (p_n,w_n)\}$. Due to non-determinism there may be multiple successor sets for each pair of control state p and stack symbol γ, all of which are captured by the set S. A configuration of P is a pair (p,w), where p ∈ P denotes the control location and w ∈ Γ* the stack content. The set of all configurations of P is denoted by C. If $(p,\gamma) \hookrightarrow \{(p_1,w_1), \ldots, (p_n,w_n)\}$, then for every w ∈ Γ* the configuration (p,γw) is an immediate predecessor of the set {(p1,w1w), . . . , (pn,wnw)}, this set being called an immediate successor of (p,γw). We use → to denote the immediate successor relation. Note that firing the transition $(p,\gamma) \hookrightarrow \{(p_1,w_1), \ldots, (p_n,w_n)\}$ from configuration (p,γw) causes the APDS to branch into the configurations (p1,w1w), . . . , (pn,wnw).

A run of P for an initial configuration c is a tree of configurations with root c, such that the children of a node c′ are the configurations that belong to one of its immediate successors. We define the reachability relation ⇒ ⊆ (P×Γ*)×$2^{P\times\Gamma^*}$ between configurations and sets of configurations. Informally, c ⇒ C if and only if C is a finite frontier of a run of P starting from c. Formally, ⇒ is the smallest subset of (P×Γ*)×$2^{P\times\Gamma^*}$ such that

    • c ⇒ {c} for every c ∈ P×Γ*,
    • if c is an immediate predecessor of C, then c ⇒ C,
    • if c ⇒ {c1, . . . , cn} and ci ⇒ Ci for each 1 ≤ i ≤ n, then c ⇒ (C1 ∪ . . . ∪ Cn).

Alternating Multi-Automata: Let P=(P,Γ,Δ) be an APDS where P={p1, . . . , pm}. An alternating P-multi-automaton (P-AMA for short) is a tuple A=(Γ,Q,δ,I,F) where Q is a finite set of states, δ ⊆ Q×Γ×$2^Q$ is a set of transitions, I={s1, . . . , sm} ⊆ Q is a set of initial states and F ⊆ Q is a set of final states. We define the transition relation → ⊆ Q×Γ*×$2^Q$ as the smallest relation satisfying the following:

    • if (q,γ,Q′) ∈ δ then $q \xrightarrow{\gamma} Q'$,
    • $q \xrightarrow{\varepsilon} \{q\}$ for every q ∈ Q, and
    • if $q \xrightarrow{w} \{q_1, \ldots, q_n\}$ and $q_i \xrightarrow{\gamma} Q_i$ for each 1 ≤ i ≤ n, then $q \xrightarrow{w\gamma} (Q_1 \cup \ldots \cup Q_n)$.

AMA A accepts a configuration (pi,w) if $s_i \xrightarrow{w} Q$ for some Q ⊆ F. The set of configurations recognized by A is denoted by Conf(A). Given a finite sequence w ∈ Γ* and a state q ∈ Q, a run of A over w starting from q is a finite tree whose nodes are labeled by states in Q and whose edges are labeled by symbols in Γ such that the root is labeled by q and the labeling of the other nodes is consistent with δ. Observe that in such a tree each sequence of edges going from the root to the leaves is labeled with w. A set of configurations is regular if it is recognized by some AMA.
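Acceptance by an alternating multi-automaton can be sketched recursively: a transition is chosen existentially, and every state in its successor set must accept the rest of the word. The encoding of transitions as triples with a frozenset of successors is an assumption for this example.

```python
def ama_accepts(delta, initial_of, finals, control, stack):
    """delta: set of (q, gamma, successors) with successors a frozenset of states.
    stack: sequence of stack symbols, top of stack first.
    """
    def run(q, i):
        if i == len(stack):
            return q in finals                      # leaf of the run tree
        return any(
            all(run(q2, i + 1) for q2 in succ)      # all branches must accept
            for (q1, g, succ) in delta
            if q1 == q and g == stack[i]            # some transition is chosen
        )
    return run(initial_of[control], 0)

delta = {
    ("s1", "a", frozenset({"q1", "q2"})),
    ("q1", "b", frozenset({"f"})),
    ("q2", "b", frozenset({"f"})),
    ("q2", "c", frozenset({"f"})),
}
print(ama_accepts(delta, {"p1": "s1"}, {"f"}, "p1", "ab"))  # True
print(ama_accepts(delta, {"p1": "s1"}, {"f"}, "p1", "ac"))  # False: q1 cannot read c
```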

Weighted Automaton for ∨: Let M1=(Γ,Q1,δ1,w1,I1,F1) and M2=(Γ,Q2,δ2,w2,I2,F2) be two WMAs. Then, we can construct a WMA M accepting the union of configurations accepted by M1 and M2 by first renaming each initial state s of M1 as s′ and each initial state s of M2 as s″. Then we define a WMA M = M1 ∨ M2 via the standard union construction M=(Γ, Q1∪Q2, δ1∪δ2∪δ12, w, I, F1∪F2), where w(tr)=wi(tr) for each transition tr ∈ δi and w(tr)=0 for each transition tr ∈ δ12; I is the set of newly introduced initial states s1, . . . , sm corresponding to control states p1, . . . , pm of the template U, and δ12 is the set of zero weight transitions $\bigcup_{l} \{ s_l \to s'_l,\ s_l \to s''_l \}$.

Weighted Automaton for ∧: Let M1=(Γ,Q1,δ1,w1,I1,F1) and M2=(Γ,Q2,δ2,w2,I2,F2) be two WMAs. Then, we can construct a WMA M accepting the intersection of M1 and M2 via the standard product construction M=(Γ,Q1×Q2,δ,w,I1×I2,F1×F2), where $(s_1,s_2) \xrightarrow{a} (s_3,s_4) \in \delta$ if $s_1 \xrightarrow{a} s_3 \in \delta_1$ and $s_2 \xrightarrow{a} s_4 \in \delta_2$, and the weight w of such a transition is the maximum of the weights w1 and w2 of the two component transitions. The state (si,si) is renamed as si in order to ensure that for each control state pi of U there is an initial state of M.
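The product construction can be sketched as follows, pairing transitions that read the same stack symbol and taking the larger of the two weights. Only the transition relation and weight function are built; the renaming of paired initial states is omitted, and the data layout is a hypothetical choice for illustration.

```python
def wma_product(delta1, w1, delta2, w2):
    """delta1, delta2: sets of (q, symbol, q2); w1, w2: dicts from transition to weight.
    Returns (delta, w) for the product automaton.
    """
    delta, w = set(), {}
    for (s1, a, s3) in delta1:
        for (s2, b, s4) in delta2:
            if a == b:                                   # both components read a
                tr = ((s1, s2), a, (s3, s4))
                delta.add(tr)
                w[tr] = max(w1[(s1, a, s3)], w2[(s2, b, s4)])
    return delta, w

d1 = {("s", "a", "f")}
d2 = {("s", "a", "g"), ("s", "b", "g")}
delta, w = wma_product(d1, {("s", "a", "f"): 2}, d2,
                       {("s", "a", "g"): 1, ("s", "b", "g"): 5})
print(delta, w)
```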

Weighted Multi-Automaton for Fg: Let M0 be a given WMA accepting the set of regular configurations of U satisfying g. Starting at M0, we construct a series of WMAs M0, . . . , Mm resulting in the WMA Mm. We recall from the definition of an MA that for each control state pi of U there is an initial state si of M0. We denote by $\to_k$ the transition relation of $M_k$. Then, for every k ≥ 0, $M_{k+1}$ is obtained from $M_k$ by conserving the set of states and adding new transitions as follows: (i) For each internal transition pi → pj, we add the transition si → sj with weight 0. (ii) For each pairwise rendezvous send or receive transition pi → pj, we add the transition si → sj with weight 1. (iii) For each stack transition $(p_i,\gamma) \hookrightarrow (p_j,u)$ of U, if there exists a path x in $M_k$ from state sj to t labeled with u, we add the transition $s_i \xrightarrow{\gamma} t$ with weight $w_u$, where $w_u$ is the sum of the weights of the transitions occurring along x. Note that if there exists more than one such path we may take $w_u$ to be the minimum weight over all such paths.
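The saturation can be sketched in the style of a standard pre* computation. This is a simplified illustration rather than the exact construction above: every PDS rule is written uniformly as (p, γ) → (p′, u) with a cost of 0 for an internal step and 1 for a rendezvous step, and the new transition receives that cost plus the minimum weight of a matching path. Using the control state name as its own initial automaton state is also an assumption of this sketch.

```python
def path_weights(trans, start, word):
    """Minimum weight, per reached state, of paths from `start` labeled `word`."""
    best = {start: 0}
    for sym in word:
        nxt = {}
        for (q, g, q2), w in trans.items():
            if g == sym and q in best and best[q] + w < nxt.get(q2, float("inf")):
                nxt[q2] = best[q] + w
        best = nxt
    return best

def weighted_pre_star(rules, trans):
    """rules: iterable of (p, gamma, p2, u, cost); u is a tuple of stack symbols.
    trans: dict (q, gamma, q2) -> weight of the initial automaton M0.
    Adds transitions until no weight can be lowered any further.
    """
    trans = dict(trans)
    changed = True
    while changed:
        changed = False
        for (p, gamma, p2, u, cost) in rules:
            for t, w_u in path_weights(trans, p2, u).items():
                key = (p, gamma, t)
                if cost + w_u < trans.get(key, float("inf")):
                    trans[key] = cost + w_u     # keep the minimum weight
                    changed = True
    return trans

rules = [
    ("p0", "a", "p1", ("a",), 1),   # rendezvous step that keeps the stack
    ("p1", "a", "p1", (), 0),       # internal pop
]
# M0 accepts only (p1, empty stack): state "p1" is final and M0 has no transitions.
result = weighted_pre_star(rules, {})
print(result)
```

In the example, the configuration (p0, a) is accepted with weight 1, matching the single rendezvous step needed to reach (p1, ε). Termination follows because weights are non-negative integers that only decrease.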

For configurations s and t of U, let $s \le_b t$ denote the fact that there is a path from s to t along which at most b pairwise rendezvous transitions are fired. Then, we have: If $s_j \xrightarrow[b_1]{w}_i q$, then $(p_j,w) \le_b (p_k,v)$ for some $p_k$ and v such that $s_k \xrightarrow[b_2]{v}_0 q$, where b=b1+b2. Moreover, if q is the initial state $s_l$, then $p_k = p_l$ and v=ε. The constructions of WMAs for f∨g and f∧g are similar to the standard union and intersection constructions for automata.

Given an L(F) formula f, we first construct a WMA for each atomic proposition of f by constructing an MA for the atomic proposition and setting the weights of all its transitions to 0. Next, we perform the above operations by traversing the formula f inside out starting from the atomic propositions. Let $M_f$ be the resulting WMA. Using the above result, let configuration (q,u) of U be accepted by $M_f$ and let b be the weight of an accepting path of $M_f$ starting from q and labeled with u. Then there exists a finite path of U starting from (q,u) and satisfying f such that at most b pairwise rendezvous transitions are fired along it.

Doubly-indexed Formulae: We reduce the problem of computing cutoffs for double-indexed L(F) formulae f to single-index ones. Without loss of generality, each atomic proposition of f can be assumed to be of the form c or ¬c, where c is a control location of U. Rewriting ¬c as the disjunction of all the (finitely many) control states of U other than c, we can remove all negations from f. Let f=Eg. Then, by driving up the ∨ operator in g as far as possible we can write g=g1 ∨ . . . ∨ gk, where for each i, gi does not contain the ∨ operator. Then, the minimum of the cutoffs for Eg1, . . . , Egk is a cutoff for Eg. Thus, it suffices to compute cutoffs for Eg(1,2), where g(1,2) is a formula built using F and ∧, but without ∨, and atomic propositions that are control states of U[1] and U[2] in $U^n$.

Note that with g(1,2), we can associate a set Seq of finite sequences of ordered pairs of the form (ci,dj), where ci (respectively, dj) is either true or a control state of U[1] (respectively, U[2]) occurring in g(1,2), which capture all possible orders in which global states satisfying ci ∧ dj can appear along computation paths satisfying g(1,2). For example, with the formula c01 ∧ F(c10 ∧ c21) ∧ Fc32, where cij is true if U[j] is currently in local control state ci, we can associate the sequences (c01,true),(true,c32),(c10,c21) and (c01,true),(c10,c21),(true,c32). Thus $U^n$ |= Eg(1,2) if there exists a sequence π: (c1,d1), . . . , (ck,dk) in Seq and a computation path x along which there exist global states satisfying c1 ∧ d1, . . . , ck ∧ dk in the order listed, viz., x |= gπ, where gπ = c1 ∧ d1 ∧ F(c2 ∧ d2 ∧ F( . . . )). Then the minimum of the cutoff bounds for fπ, where π ∈ Seq, gives us the desired cutoff. Finally, the computation of cutoff bounds for fπ can be reduced to those of single-index L(G) formulae via the following result. Given a formula f = c1 ∧ d1 ∧ F(c2 ∧ d2 ∧ F( . . . )), the sum of the cutoffs for f1 = c1 ∧ F(c2 ∧ F( . . . )) and f2 = d1 ∧ F(d2 ∧ F( . . . )) is a cutoff bound for f.

Computing Nc: Given a control state c of U we now show how to compute Nc, viz., a cutoff for EFc. Let c be first included in Ri in the ith iteration of the method A. The computation of Nc is by induction on i. If i=0, viz., c is the initial state of U, then Nc=1. Now, assume that Nc is known for each c ∈ Ri, where i>0. Let c ∈ $R_{i+1} \setminus R_i$. Then, there is a path of $U_{i+1}$ comprised of states of Ri leading to c. Using a WMA, we can compute, for each rendezvous transition tr, a bound on $n_{tr}$, the number of times tr is fired along a path of $U_{i+1}$ satisfying EFc. Also, since by the induction hypothesis we know the value of Nc for each c of Ri, a cutoff for EFc can be determined, thus completing the induction step.

Example for computing cutoffs: cut+2 is a cutoff for f. Once all the control states c are flooded using processes U3, . . . , Uk+2, processes U1 and U2 can execute x[1,2] wherein each rendezvous transition fired by U1 or U2 synchronizes with one of the processes U3, . . . , Uk+2. For control state c of U, let Nc be the cutoff for EFc; then $cut \le \sum_{c \in R} n_c N_c$. Since Nc is a cutoff of EFc, there exists a computation xc of $U^{N_c}$ leading to a global state with a process in local state c. To get a global state with at least nc copies of c, we let processes U[1], . . . , U[Nc] of $U^{n_c N_c}$ execute xc to reach a global state s1 with at least one copy of c. Next, starting at s1, we let processes U[Nc+1], . . . , U[2Nc] execute xc to reach a global state s2 with at least two copies of c. Repeating this process nc times results in a global state $s_{n_c}$ with at least nc copies of c. Repeating this process for each control state c then gives us the desired result.

Cutoffs for B(F): For generating cutoffs for B(F) formulae, we start by recalling the standard procedure for model checking PDSs for mu-calculus formulae. We first take the product of the given PDS U with an alternating automaton/tableau for the given formula f. Such products can be modeled as Alternating Pushdown Systems (APDSs). Then, model checking for f reduces to a pre*-closure computation for regular sets of configurations of the resulting APDS. These regular sets can be modeled as Alternating Multi-Automata (AMAs).

The procedure for computing cutoffs for B(F) formulae is similar to that for L(F), the only difference being that we use a Weighted Alternating Multi-Automaton (WAMA) to capture the branching nature of the formulae, where a state can now have a set of successors instead of just one. Thus, in a standard AMA each transition is a member of the set (Q×Γ)×$2^Q$. Note that since f is a branching time property, the model for f is a computation tree of U. Thus, while performing the pre*-closure computation, we need to keep track of the number of pairwise rendezvous fired along each branch of the computation trees encountered thus far. However, the number of pairwise rendezvous fired along different branches of a computation tree might, in general, be different, and hence each successor state of an outgoing transition needs to be assigned its own weight. Thus, each transition is now a member of the set (Q×Γ)×$2^{Q\times\mathbb{Z}}$.

Weighted Alternating Multi-Automaton (WAMA): Given a PDS P=(P,Γ,c0,Δ), a WAMA is a tuple M=(Γ,Q,δ,I,F), where δ ⊆ (Q×Γ)×$2^{Q\times\mathbb{Z}}$ and M′=(Γ,Q,δ′,I,F) is an AMA where
$\delta' = \{ (s, \{t_1, \ldots, t_m\}) \mid (s, \{(t_1,w_1), \ldots, (t_m,w_m)\}) \in \delta \}$.

Updating the weights during the pre*-closure to compute the WAMA for AFg given the WAMA for g can be carried out in a similar fashion as for L(F) formulae, the only difference being that the weights need to be updated for each successor. Let M0 be a given WAMA accepting a set of regular configurations. Starting at M0, we construct a series of WAMAs M0, . . . , Mp resulting in the WAMA Mp. We denote by $\to_k$ the transition relation of $M_k$. Then for every k ≥ 0, $M_{k+1}$ is obtained from $M_k$ by conserving the set of states and adding new transitions as follows: (i) For each internal transition pi → pj, we add the transition si → sj with weight 0. (ii) For each pairwise rendezvous send or receive transition pi → pj, we add the transition si → sj with weight 1. (iii) For each transition $(p_j,\gamma) \hookrightarrow \{(p_{k_1},w_1), \ldots, (p_{k_m},w_m)\}$ and every collection of sets $s_{k_1} \xrightarrow{w_1}_i \{(p_{11},b_{11}), \ldots, (p_{1i_1},b_{1i_1})\}, \ldots, s_{k_m} \xrightarrow{w_m}_i \{(p_{m1},b_{m1}), \ldots, (p_{mi_m},b_{mi_m})\}$, we add a new transition $s_j \xrightarrow{\gamma}_{i+1} \{(q_1,b_1), \ldots, (q_l,b_l)\}$, where for each j, $b_j$ is the maximum of all $b_{rs}$ with $p_{rs} = q_j$.

The Model Checking Procedure for L(G): Reasoning about a double-indexed LTL formula with infinite models is in general harder than reasoning about one with finite models. This is because one has to now ensure that there exists an infinite computation of $U^m$, for some m, along which rendezvous transitions can not only be recycled infinitely often but can be done so while maintaining context-free reachability. However, we exploit the fact that the dual of an L(G) formula is of the form Ag, where g is built using the temporal operator F and the boolean connectives ∧ and ∨. Such formulae have finite tree-like models. However, note that ∃n: $U^n$ |= f if ¬∀n: $U^n$ |= ¬f. Thus, if we resort to the dual of f, the resulting problem is no longer a PMCP. A method for the PMCP for L(G) is then the following: 1. Given an L(G) formula f, construct a B(F) formula g equivalent to f, viz., $U^n$ |= f if $U^m$ |= g. 2. Compute the cutoff cut for g. 3. For each m ≤ cut, check if $U^m$ |= g.

The procedure for computing cutoffs for B(F) formulae was given above. For step 3, it suffices to check whether for each m≦cut, Um|=g, where f=g is an L(G) formula. But the model checking problem for L(G) formula for systems with a finite number of PDSs interacting via pairwise or asynchronous rendezvous is already known to be decidable. Thus, it can be shown how to construct a B(F) formula g equivalent to f.

Disjunctive Guards: We consider the PMCP for PDSs interacting via disjunctive guards. Here transitions of U are labeled with guards that are boolean expressions of the form (c1 ∨ . . . ∨ ck) with c1, . . . , ck being control states of U. In copy U[i] of template U in $U^n$, a transition $a \xrightarrow{c_1 \vee \ldots \vee c_k} b$ of U is rewritten as a transition of the form $a \xrightarrow{\bigvee_{j \ne i} (c_1[j] \vee \ldots \vee c_k[j])} b$. In $U^n$ such a transition of U[i] is enabled in a global state s of $U^n$ if there exists a process U[j] other than U[i] in at least one of the local states c1, . . . , ck in s. Concurrent systems with processes communicating via boolean guards are motivated by Dijkstra's guarded command model. The PMCP for finite state processes communicating via disjunctive guards was shown to be efficiently decidable. As for pairwise rendezvous, the unbounded multiplicity result holds. Then, as before, the set of parameterized reachable control states can be computed efficiently. The procedure is similar to the one for PDSs interacting via pairwise rendezvous except that when constructing $R_{i+1}$ from $R_i$, in order to handle the synchronization constraints, we convert all transitions of the form $a \xrightarrow{c_1 \vee \ldots \vee c_k} b$, where for some j ∈ [1 . . . k], cj ∈ $R_i$, to an internal transition of the form $a \xrightarrow{\tau} b$. This is motivated by the fact that since cj is parameterized reachable, the transition $a \xrightarrow{c_1 \vee \ldots \vee c_k} b$ can always be enabled by ensuring, via the unbounded multiplicity result and a flooding argument, that there exists a process in local state cj. We get: the Parameterized Model Checking Problem for control state reachability, for systems composed from a template PDS U interacting via disjunctive guards, can be decided in $O(|U|^5)$ time, where |U| is the size of U.
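The fixpoint for disjunctive guards can be sketched in the same way as for pairwise rendezvous. As a simplifying assumption the stack is ignored, so graph reachability replaces the single-PDS model checking step; the function name and the encoding of a guard as a frozenset of control states are hypothetical.

```python
def reachable_with_guards(c0, internal, guarded):
    """internal: set of (a, b); guarded: set of (a, guard, b) with guard a frozenset."""
    r = {c0}
    while True:
        # A guarded transition becomes internal once some disjunct is in R_i.
        edges = set(internal) | {(a, b) for (a, guard, b) in guarded if guard & r}
        seen, stack = {c0}, [c0]
        while stack:                     # plain graph search (assumption: no stack)
            s = stack.pop()
            for (a, b) in edges:
                if a == s and b not in seen:
                    seen.add(b)
                    stack.append(b)
        if seen <= r:                    # fixpoint reached
            return r
        r |= seen

internal = {("c0", "c1")}
guarded = {("c0", frozenset({"c1"}), "c2"), ("c2", frozenset({"c3"}), "c4")}
print(sorted(reachable_with_guards("c0", internal, guarded)))
```

Here c4 stays unreachable because its guard mentions only c3, which is never reached.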

PMCP for Linear Time Formulae: Let $U_R$ be the PDS that we get from U by retaining only the parameterized reachable control states and all transitions between them that are either internal or labeled with a disjunctive guard which has at least one of the parameterized reachable control states as a disjunct. However, we replace each such transition labeled with a disjunctive guard with an internal transition, making $U_R$ non-interacting. We first show via a flooding argument that for any double-indexed LTL\X formula g, for some n, $U^n$ |= Eg if $U_R^2$ |= Eg. By the unbounded multiplicity property, for some m, there exists a computation y leading to a global state of $U^m$ with at least one copy of each parameterized reachable control state of U. In a system with m+2 processes, we first let processes $U_3, \ldots, U_{m+2}$ execute y to flood all control states of U with multiplicity at least one. Then we are guaranteed that in any computation x (finite or infinite) of U[1,2], the transitions labeled with disjunctive guards can always be fired as there exists a process among $U_3, \ldots, U_{m+2}$ in each of the reachable control states. All such transitions can therefore be treated as internal transitions.

Binary Reduction Result: For any computation x of Un, where n≧2, there exists a computation y of UR2 such that y is stuttering equivalent to x[1,2]. Note that the above result reduces the PMCP for any doubly-indexed LTL\X formula f to model checking for a system with two non-interacting PDSs for f. It follows that we need to consider only the fragments L(F) and L(G). For these fragments the problem of model checking a system with non-interacting PDSs is already known to be efficiently decidable. Thus, the PMCP for PDSs interacting via disjunctive guards is efficiently decidable for the fragments L(F) and L(G) and undecidable for the fragments L(U) and L(F,G).

Locks: We consider the PMCP for PDSs interacting via locks. Leveraging the cutoff result determined above, we have that for n≧2, Un|=f if U2|=f, where f is a doubly-indexed LTL\X formula. This reduces the problem of deciding the PMCP for f to a (standard) model checking problem for a system comprised of two PDSs interacting via locks.

We now distinguish between nested and non-nested locks. A PDS accesses locks in a nested fashion if it can only release the last lock it acquired and that has not been released. For a system with two PDSs interacting via nested locks, the model checking problem is known to be efficiently decidable for both fragments of interest, viz., L(F) and L(G). So, the PMCP for L(F) and L(G) for PDSs interacting via nested locks is decidable in polynomial time in the number of control states of the given template U and exponential time in the number of locks of U.

For the case of non-nested locks, we show that the PMCP is decidable for L(G) but undecidable for L(F). For L(F) the undecidability result follows by reduction from the problem of model checking a system comprised of two PDSs P1 and P2 interacting via non-nested locks for the formula EF(c1 ∧ c2), which is known to be undecidable.

The PMCP for EF(c1 ∧ c2), and hence L(F), is undecidable for PDSs interacting via non-nested locks. For L(G), it can be shown that the problem of model checking a system with PDSs interacting via locks for an L(G) formula f can be reduced to model checking an alternate formula f′ for two non-interacting PDSs. Given a template U interacting via the locks l1, . . . , lk, we construct a new template V with control states of the form (c,m1, . . . , mk). The idea is to store whether a copy of U is currently in possession of lock li in bit mi, which is set to 1 or 0 according to whether that copy is in possession of li or not, respectively. Then, we can convert V into a non-interacting PDS by removing all locks from V and instead letting each transition of V acquiring/releasing li set mi to 1/0. However, removing locks makes control states which were mutually exclusive in $U^2$ simultaneously reachable in $V^2$. In order to restore the lock semantics, while model checking for an L(G) property of the form Eg, we instead check for the modified L(G) property E(g ∧ g′), where $g' = G\left(\bigwedge_i \neg(m_i^1 \wedge m_i^2)\right)$, with atomic proposition $m_i^j$ evaluating to true in global state s if, in the local control state $(c, m_1^j, \ldots, m_k^j)$ of process V[j] in s, $m_i^j = 1$. Note that g′ ensures that in the control states of V[1] and V[2], for each i, the mi-entry corresponding to lock li cannot simultaneously be 1 for both V[1] and V[2], viz., U[1] and U[2] cannot both hold the same lock li. Then, the problem reduces to model checking two non-interacting PDSs for L(G) formulae, which is known to be decidable. This gives that the PMCP for L(G) is efficiently decidable for PDSs interacting via non-nested locks.
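The lock-bit encoding and the invariant that restores mutual exclusion can be illustrated with a short sketch. It checks the invariant only on a finite prefix of global states, whereas the property in the text ranges over whole computations; the state layout (a control location paired with a tuple of bits) and the function names are hypothetical.

```python
def holds_disjoint_locks(state1, state2):
    """Each local state is (c, bits), with bits[i] == 1 iff lock l_i is held.
    True when no lock is recorded as held by both processes at once.
    """
    _, bits1 = state1
    _, bits2 = state2
    return all(not (m1 and m2) for m1, m2 in zip(bits1, bits2))

def satisfies_lock_invariant(path):
    """path: finite list of global states (state1, state2) of two copies of V."""
    return all(holds_disjoint_locks(s1, s2) for (s1, s2) in path)

ok_path = [(("a", (0, 0)), ("x", (0, 0))),
           (("b", (1, 0)), ("x", (0, 1)))]      # different locks held
bad_path = ok_path + [(("b", (1, 0)), ("y", (1, 1)))]  # both record lock l_1
print(satisfies_lock_invariant(ok_path), satisfies_lock_invariant(bad_path))
```

Paths of the lock-free system that fail this check correspond to interleavings that the original locks would have forbidden.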

Broadcasts: We consider the PMCP for PDSs communicating via broadcasts. Here, Σ, the set of action symbols of U, is comprised of the set Σin of internal transition labels, and the sets Σpr×{!!} and Σpr×{??} of send and receive broadcast transitions, respectively. Like asynchronous rendezvous, a broadcast send transition that is enabled can always be fired. A broadcast receive transition can only be fired if there exists an enabled matching broadcast send transition. Broadcasts differ from asynchronous rendezvous in that executing a broadcast send transition forces not merely one but all other processes with matching receives to fire. It can be shown that for PDSs interacting via broadcasts, the PMCP for pairwise reachability, viz., EF(c1 ∧ c2), is undecidable. The undecidability result for L(F) then follows as an immediate corollary: the PMCP for L(F) is undecidable for PDSs interacting via broadcasts.

We have considered the PMCP for PDSs interacting via each of the standard synchronization primitives for a broad class of temporal properties. Specifically, we have delineated the decidability boundary of the PMCP for PDSs interacting via each of the standard synchronization primitives for doubly-indexed LTL. We have also demonstrated that in many important cases of interest the PMCP is more tractable than the standard model checking problem. The practical implication of these results is that in many applications, such as Linux™ device drivers, it may be more useful to consider the PMCP than the standard model checking problem.

Having described preferred embodiments of a system and method for inter-procedural dataflow analysis of parameterized concurrent software (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments disclosed which are within the scope and spirit of the invention as outlined by the appended claims. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.

Claims

1. A method implemented by a computer for computing dataflow in concurrent programs of a computer system, comprising:

given a family of threads (U1,..., Um) and a Linear Temporal Logic (LTL) property, f, for a concurrent program, computing a cutoff for the LTL property, f, where c is called the cutoff if for all n greater than or equal to c, Un satisfies f if Uc satisfies f, the cutoff being computed using weighted multi-automata for internal transitions of the threads; and
model checking a cutoff number of processes to verify data race freedom in the concurrent program.

2. The method as recited in claim 1, wherein the step of model checking includes establishing data race freedom in a concurrent program with at least two distinct drivers, each running respective threads, by establishing data race freedom in a parameterized system comprised of a plurality of copies of the respective threads.

3. The method as recited in claim 1, wherein the step of model checking includes establishing data race freedom in an undecidable concurrent program by establishing data race freedom in a parameterized system including the undecidable concurrent program.

4. The method as recited in claim 1, wherein the threads are modeled as push down systems (PDSs).

5. The method as recited in claim 1, wherein the threads interact with each other using synchronization primitives.

6. The method as recited in claim 1, wherein the synchronization primitives include at least one of pairwise rendezvous, asynchronous rendezvous, disjunctive guards, broadcasts, nested locks and non-nested locks.

7. The method as recited in claim 1, wherein f is a double-indexed LTL formula.

8. The method as recited in claim 1, wherein using weighted multi-automata for internal transitions of the threads includes estimating a bound on a number of transitions fired in transit between two control states.

9. A method implemented by a computer for computing dataflow in a computer program of a computer system, comprising:

given a family of threads modeled as pushdown systems (U1,..., Um) which interact by synchronization primitives and a Linear Temporal Logic (LTL) property, f, for a concurrent program, computing a cutoff for the LTL property, f, by computing bounds on a number of transitions fired along a computation of a thread between reachable control states of the concurrent program where the bounds are computed using weighted multi-automata on internal transitions of the threads; and
model checking a cutoff number of processes by parameterized model checking to verify data race freedom in the concurrent program.

10. The method as recited in claim 9, wherein the step of model checking includes establishing data race freedom in a concurrent program with at least two distinct drivers, each running respective threads, by establishing data race freedom in a parameterized system comprised of a plurality of copies of the respective threads.

11. The method as recited in claim 9, wherein the step of model checking includes establishing data race freedom in an undecidable concurrent program by establishing data race freedom in a parameterized system including the undecidable concurrent program.

12. The method as recited in claim 9, wherein the synchronization primitives include at least one of pairwise rendezvous, asynchronous rendezvous, disjunctive guards, broadcasts, nested locks and non-nested locks.

Patent History
Patent number: 8380483
Type: Grant
Filed: Oct 4, 2007
Date of Patent: Feb 19, 2013
Patent Publication Number: 20080086723
Assignee: NEC Laboratories America, Inc. (Princeton, NJ)
Inventor: Vineet Kahlon (Princeton, NJ)
Primary Examiner: Suzanne Lo
Application Number: 11/867,178
Classifications
Current U.S. Class: Software Program (i.e., Performance Prediction) (703/22); Including Analysis Of Program Execution (717/131)
International Classification: G06F 9/45 (20060101);