Inter-procedural dataflow analysis of parameterized concurrent software

Info

Patent number: 8380483
Type: Grant
Filed: Oct 4, 2007
Date of Patent: Feb 19, 2013
Patent Publication Number: 20080086723
Assignee: NEC Laboratories America, Inc. (Princeton, NJ)
Inventor: Vineet Kahlon (Princeton, NJ)
Primary Examiner: Suzanne Lo
Application Number: 11/867,178

Abstract

A system and method for computing dataflow in concurrent programs of a computer system includes, given a family of threads (U^1, . . . , U^m) and a Linear Temporal Logic (LTL) property, f, for a concurrent program, computing a cutoff for the LTL property, f, where c is called the cutoff if for all n greater than or equal to c, U^n satisfies f if U^c satisfies f. The cutoff is computed using weighted multi-automata for internal transitions of the threads. Model checking a cutoff number of processes is performed to verify race freedom in the concurrent program.

Description


RELATED APPLICATION INFORMATION

This application claims priority to provisional application Ser. No. 60/884,048 filed on Jan. 9, 2007, incorporated herein by reference. This application also claims priority to provisional application Ser. No. 60/828,246 filed on Oct. 5, 2006, incorporated herein by reference.

The present application is related to U.S. application Ser. No. 11/867,160, filed concurrently herewith, entitled “MODEL CHECKING PARAMETERIZED THREADS FOR SAFETY” and incorporated herein by reference.

BACKGROUND

1. Technical Field

The present invention relates to computer system verification and more particularly to verification of concurrent programs, which exploit parameterized qualities of computer systems comprised of many copies of the same hardware or software component.

2. Description of the Related Art

Computer verification is needed to ensure that a computer system operates properly and that the results obtained are trustworthy. One form of verification is testing. In testing, the actual behavior of a system is examined on a set of inputs and matched against an expected result. Due to a large or infinite number of possible inputs, it becomes impossible to confirm by testing alone that a system behaves correctly in all circumstances.

Verification tries to address these issues. Verification provides a mathematical or model basis for simulating the system behavior. A model and its intended behavior are defined. A machine is usually modeled as a system whose state evolves over time; the model includes a specification of the state space and how the system can traverse it. Temporal logic has been shown to be useful in expressing behavior of reactive systems. The model-checking approach to verification includes taking the mathematical model of the system under consideration and checking the validity of a temporal logic formula within the model.

A primary problem faced by all methods is known as state explosion. State explosion means that the state space of the system under consideration grows rapidly (e.g., exponentially) with the amount of memory used (e.g., registers, program variables, pointers, etc.). This limits the verification methods.

Multi-threading is a standard way of exploiting parallelism among different components of a computer system. As a result, the use of concurrent multi-threaded programs is becoming pervasive. Examples include operating systems, databases, and embedded systems (cell phones, multimedia consumer products), etc. Since verification typically does not scale for large scale concurrent programs, there is a deep interest in leveraging static analysis techniques like inter-procedural dataflow analysis for debugging multi-threaded programs. While inter-procedural dataflow analysis has been shown to be a very effective technique for finding bugs in sequential programs, there has been very little work on extending such dataflow analyses to the concurrent domain.

SUMMARY

Concurrent programs with many replicated threads, e.g., running the same piece of code, occur in many important applications. These include implementations of protocols for networking, cache coherence, synchronization and multi-core architectures running multi-threaded software, among others. As a concrete example, we consider Linux™ device drivers. Driver code is supposed to work correctly irrespective of the number of threads executing it. For such applications, the goal is to establish correctness of programs of the form U_1^{n_1} ∥ . . . ∥ U_m^{n_m} irrespective of the program size as measured by the number n_i of threads executing the code for driver U_i. In the art, this is often referred to as the Parameterized Model Checking Problem (PMCP). Clearly, this is important as correctness of a system with a fixed number of threads does not, in general, establish correctness for an arbitrary number.

In practice, however, deciding the PMCP is considered a hard problem. Therefore, the approach that is typically followed is to first tackle the seemingly simpler problem of trying to establish correctness for programs with a fixed number (typically 2) of replicated threads. However, we obtain the somewhat surprising result that establishing correctness for a fixed number (even two) of replicated components is, in many important cases, provably less tractable than establishing parameterized correctness.

This has at least two implications. First, when reasoning about parameterized recursive programs, it is important to try to reason directly about parameterized correctness rather than attempt to establish correctness for a special case comprising a small fixed number of replicated threads and successively increasing the number of copies. To illustrate the second, and more important, conclusion of practical interest, we consider the scenario where our end goal is not parameterized reasoning but establishing correctness of a program with a fixed number of, possibly distinct, threads.

Suppose that we want to establish the absence of data races in a program U_1 ∥ U_2 comprised of threads U_1 and U_2 running two possibly distinct device drivers. Then, if we establish the absence of a data race in the parameterized system U_1^n ∥ U_2^m, comprised of arbitrarily many copies of U_1 and U_2, it automatically establishes data race freedom for U_1 ∥ U_2. One key point is that, in many cases of interest, reasoning about U_1 ∥ U_2 is undecidable whereas the PMCP is efficiently decidable.

We consider the PMCP for concurrent programs of the form U_1^{n_1} ∥ . . . ∥ U_m^{n_m} comprised of an arbitrary number n_i of copies of a template thread U_i interacting with each other using standard synchronization primitives like pairwise and asynchronous rendezvous, locks, broadcasts, and disjunctive guards. We model threads as Pushdown Systems (PDS), which have emerged as a natural and powerful framework for analyzing recursive programs. Correctness properties are expressed using multi-indexed LTL\X. Note that absence of the “next-time” operator X makes the logic stuttering insensitive, which is usual when reasoning about parameterized systems. For ease of exposition, we formulate our results for systems with a single template PDS and for double-indexed LTL\X properties. Extension to systems with multiple templates and k-index properties, where k>2, will be understood by those skilled in the art.

Our new results show that decidability of the PMCP hinges on the set of temporal operators allowed in the correctness property, thereby providing a natural way to characterize fragments of double-indexed LTL for which the PMCP is decidable. We use L(Op_1, . . . , Op_k), where Op_i ∈ {F, U, G}, to denote the fragment comprised of formulae of the form Ef, where f is a double-indexed LTL\X formula in positive normal form (PNF), viz., only atomic propositions are negated, built using the temporal operators Op_1, . . . , Op_k and the Boolean connectives ∧ and ∨. Here F (“sometimes”), U (“until”) and G (“always”) denote the standard temporal operators and E is the “existential path quantifier”. L(U,G) is the full-blown double-indexed LTL\X.

In this disclosure, we delineate precisely the decidability/undecidability boundary of the PMCP for double-indexed LTL\X for each of the standard synchronization primitives. Specifically, we show the following.

(a) The PMCP for L(F,G) and L(U) is, in general, undecidable even for systems wherein the PDSs do not interact at all with each other. The above results imply that to get decidability of the PMCP for PDSs, interacting or not, we have to restrict ourselves to either the sub-logic L(F) or the sub-logic L(G). For these sub-logics, decidability of the PMCP depends on the synchronization primitive used by the PDSs.

(b) For the sub-logic L(F), we show that the PMCP is efficiently decidable for PDSs interacting via pairwise or asynchronous rendezvous, disjunctive guards and nested locks but remains undecidable for broadcasts and non-nested locks. The decidability for pairwise rendezvous (and indeed for asynchronous rendezvous and disjunctive guards) is surprising given the undecidability of model checking systems comprised of two PDSs (even when they are isomorphic to each other) interacting via pairwise rendezvous for reachability, a cornerstone undecidability result for model checking interacting PDSs. Our new results show that the PMCP for PDSs interacting via pairwise rendezvous is not only decidable but efficiently so. This is especially interesting as it illustrates that for pairwise rendezvous (and for asynchronous rendezvous and disjunctive guards) switching to the parameterized version of the problem makes it more tractable.

(c) For the fragment L(G), we show that the PMCP is decidable for pairwise and asynchronous rendezvous, disjunctive guards and locks (even non-nested ones). This settles the PMCP for all the standard synchronization primitives.

Let {U^n} be the parameterized family of systems defined by the template PDS U interacting via pairwise rendezvous. To get decidability for L(F), we start by formulating a new efficient procedure to compute the set of control states of U which are parameterized reachable, i.e., reachable in U^n for some n. This is accomplished via a fixpoint computation which starts with the set R_0 containing the initial state of U, and in the ith iteration constructs the set R_{i+1} of control states that become parameterized reachable assuming that all states in R_i are parameterized reachable. The crucial point is that in adding a new control state c to R_i, we have to ensure not only that synchronization constraints arising out of rendezvous are met but also that the newly added states are context-free reachable from existing parameterized reachable states. The checking of the two constraints is dovetailed, i.e., carried out in an interleaved fashion until a fixpoint is reached in that no new states are discovered. We next show, via a flooding argument, that the PMCP for a formula f of L(F) reduces to standard model checking for a system with two non-interacting copies of the PDS U_R, where U_R is the template that we get from U by retaining only the parameterized reachable control states of U and converting all pairwise rendezvous between such states to internal transitions. The last problem is known to be efficiently decidable, giving us the decidability result. Decidability for PDSs with asynchronous rendezvous and disjunctive guards follows via similar procedures.

To get decidability for L(G), we first show cutoff results. We say that c is a cutoff for formula f if for all m ≥ c, U^m ⊨ f if U^c ⊨ f. By leveraging the use of Weighted Multi-Automata, we give new procedures to compute cutoffs for L(F) and L(G) formulae for PDSs interacting via pairwise and asynchronous rendezvous. For PDSs interacting via locks, this cutoff is known to be k for k-index properties. The existence of cutoffs reduces the PMCP to model checking systems with finitely many PDSs, which we show to be decidable for disjunctive guards and (non-nested) locks and which is already known to be decidable for PDSs interacting via pairwise and asynchronous rendezvous. For PDSs interacting via disjunctive guards, we show, via a flooding argument, that the PMCP for a formula f of L(G) reduces to standard model checking for a system with two (non-interacting) copies of a simplified PDS U_R. The last problem is known to be efficiently decidable, giving us the decidability result.

A system and method for computing dataflow in concurrent programs of a computer system includes, given a family of threads (U^1, . . . , U^m) and a Linear Temporal Logic (LTL) property, f, for a concurrent program, computing a cutoff for the LTL property, f, where c is called the cutoff if for all n greater than or equal to c, U^n satisfies f if U^c satisfies f. The cutoff is computed using weighted multi-automata for internal transitions of the threads. Model checking a cutoff number of processes is performed to verify race freedom in the concurrent program.

These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.

BRIEF DESCRIPTION OF DRAWINGS

The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:

FIG. 1 is a block/flow diagram showing a system/method for solving a dataflow computation in concurrent programs in accordance with an illustrative embodiment;

FIG. 2 is a diagram showing a template process U with control states c and transition designations with ! and ? for demonstrating operation in accordance with the present principles; and

FIG. 3 is a diagram showing a fixpoint computation in accordance with the present principles showing progression through several iterations.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The present embodiments relate to computer system verification and more particularly to verification of concurrent programs, such as, e.g., device drivers used for controlling hardware components like disk drives, audio speakers, etc. In particularly useful embodiments, parameterized qualities of computer systems are exploited in that the concurrent programs are comprised of many copies of the same hardware or software component. In addition, the present embodiments are applicable to other applications, such as, e.g., embedded software used in mobile devices like cell phones, personal digital assistants (PDAs), database software, SQL servers, application level software, like web browsers (Firefox™, Explorer™) or any application using concurrency.

Model checking of interacting pushdown systems is a core problem underlying dataflow analysis for concurrent programs. However, it is decidable only for very restricted temporal logic fragments. The fundamental obstacle is the undecidability of checking non-emptiness of an intersection of two context-free languages. It is not difficult to couple two different pushdown systems (PDSs) either by making the synchronization primitive expressive enough or the property being model-checked strong enough to accept precisely the intersection of the context-free languages accepted by these PDSs. This results in the undecidability of the model checking problem. However, the present principles exploit the fact that many important classes of concurrent systems are parameterized, i.e., comprised of many replicated copies of a few basic types of components.

In accordance with the present principles, the general difficult problem need not be solved. We exploit the fact that many concurrent systems are parameterized, i.e., composed of many replicated copies of the same basic component. Indeed, for most distributed protocols for networking, cache coherence, and synchronization, the same exact piece of code implementing the protocol is run on different machines, thus making it parameterized. The Internet can be thought of as a network of computers, each running the TCP/IP protocol. Other examples include multi-core architectures with multi-threading. Indeed, a device driver is supposed to run correctly irrespective of the number of threads executing it.

A new and efficient inter-procedural dataflow analysis system and method are provided for parameterized multi-threaded programs. The problem reduces to the problem of model checking interacting PDSs wherein all the PDSs are copies of each other. The prior work on analyzing parameterized programs has been restricted to models where there is no effective communication between the threads (PDSs) and is thus of little practical value. In the present disclosure, we have considered more powerful and realistic models wherein PDSs can interact via locks, rendezvous (e.g., Wait/Notify from Java™) or broadcasts (e.g., Wait/NotifyAll from Java™). Thus, inter-procedural analysis is extended to the parameterized concurrent domain for realistic models of communication.

We consider the model checking problem for concurrent programs comprised of finitely, but arbitrarily, many copies of a fixed set of threads, often referred to as the Parameterized Model Checking Problem (PMCP). Modeling each thread as a PDS, we delineate the decidability boundary of the PMCP for Indexed Linear Temporal Logic (LTL) for each of the standard synchronization primitives. Our results lead to the surprising conclusion that in many cases of interest, the PMCP, even though a seemingly harder problem, is more tractable than the problem of model checking a fixed number of PDSs. For example, for PDSs interacting via pairwise rendezvous, the PMCP for reachability (presence of a data race) is efficiently decidable whereas model checking a system with two such (even isomorphic) PDSs is undecidable. Deciding the PMCP efficiently is of great importance for parameterized applications like, for instance, Linux™ device drivers. However, the broader practical implication of our results is that even if we are not interested in parameterized reasoning but only in model checking a system U_1 ∥ . . . ∥ U_m with a fixed number of possibly distinct threads U_1, . . . , U_m, then in many cases it is more useful to consider the PMCP for the corresponding parameterized system U_1^{n_1} ∥ . . . ∥ U_m^{n_m} with arbitrarily many copies of U_1, . . . , U_m.

Practical applications in accordance with the present principles include that for debugging concurrent multithreaded software, it is more tractable to consider the parameterized version of the problem. For example, if we want to detect data races in a concurrent program T_1 ∥ T_2 with two Linux™ device drivers T_1 and T_2, then it is more efficient and tractable to consider the same problem for a system T_1^n ∥ T_2^m with an arbitrary number of copies of T_1 and T_2. This is surprising since it is seemingly a harder problem but in reality is much more tractable.

It should be understood that the elements shown in the FIGS. may be implemented in various forms of hardware, software or combinations thereof. Preferably, these elements may be stored on computer media and are implemented in software, on one or more appropriately programmed general-purpose digital computers having a processor and memory and input/output interfaces. Software may include but is not limited to firmware, resident software, microcode, etc.

Embodiments of the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment including both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.

Embodiments can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. A computer-usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. For example, the medium may include a computer-readable medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.

Referring now to the drawings in which like numerals represent the same or similar elements and initially to FIG. 1, a block/flow diagram showing a system/method for verifying concurrent programs is illustratively depicted. Consider a family of systems of the form U_1^{n_1} ∥ . . . ∥ U_m^{n_m} comprised of an arbitrary number n_i of copies of a template thread U_i. Each template U_i may be modeled as a Pushdown System (PDS). A PDS has a finite control part and a stack which models recursion. Formally, a PDS is a five-tuple 𝒫 = (P, Act, Γ, c_0, Δ), where P is a finite set of control states, Act is a finite set of actions containing the empty action e, Γ is a finite stack alphabet, and Δ ⊆ (P × Γ) × Act × (P × Γ*) is a finite set of transition rules. If ((p, γ), a, (p′, w)) ∈ Δ then we write (p, γ) →^a (p′, w). A configuration of 𝒫 is a pair (p, w), where p ∈ P denotes the control location and w ∈ Γ* the stack content. We call c_0 the initial configuration of 𝒫. The set of all configurations of 𝒫 is denoted by C. For each action a, we define a relation →^a ⊆ C × C as follows: if (q, γ) →^a (q′, w) is a transition rule, then (q, γv) →^a (q′, wv) for every v ∈ Γ*.
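By way of a non-limiting illustration, the PDS definition above may be sketched in Python as follows. The names Rule, successors, and the two example rules are hypothetical and chosen only for this sketch; the stack is represented as a tuple whose first element is the top, so that a rule (p, γ) →^a (p′, w) rewrites a configuration (p, γv) into (p′, wv).

```python
from dataclasses import dataclass
from typing import Tuple

# A configuration is (control state, stack); index 0 of the stack is the top.
Config = Tuple[str, Tuple[str, ...]]

@dataclass(frozen=True)
class Rule:
    p: str               # source control state
    gamma: str           # stack symbol that must be on top
    action: str          # action label a
    q: str               # target control state p'
    w: Tuple[str, ...]   # word replacing gamma (index 0 becomes the new top)

def successors(config: Config, rules):
    """Return every (action, config') such that config --action--> config'."""
    p, stack = config
    if not stack:        # no rule applies to an empty stack
        return []
    top, rest = stack[0], stack[1:]
    return [(r.action, (r.q, r.w + rest))
            for r in rules if r.p == p and r.gamma == top]

# Hypothetical two-rule system: one procedure call followed by a return.
rules = [
    Rule("p", "main", "call", "p", ("f", "ret")),  # push: enter procedure f
    Rule("p", "f", "return", "q", ()),             # pop: leave procedure f
]
start = ("p", ("main",))
step1 = successors(start, rules)
step2 = successors(step1[0][1], rules)
```

In this sketch a call pushes a return symbol beneath the callee's symbol and a return pops the top symbol, which is how the stack of a PDS models recursion.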

We use {U_1, . . . , U_m} to denote the family of concurrent programs (or threads), U_1^{n_1} ∥ . . . ∥ U_m^{n_m}, formed by the interleaved parallel composition of n_i copies of template U_i. The jth copy of U_i, denoted by U_i[j], communicates with the other threads via the standard synchronization primitives: locks, pairwise or asynchronous rendezvous, broadcasts and disjunctive boolean guards. Pairwise rendezvous are inspired by the calculus of communicating systems (CCS) (e.g., a language for writing concurrent programs) whereas asynchronous rendezvous and broadcasts are used to model the Wait/Notify and Wait/NotifyAll primitives of Java™. The semantics of U_1^{n_1} ∥ . . . ∥ U_m^{n_m} is defined in the usual way and is therefore omitted. For the sake of simplicity, we formulate our results for parameterized systems with a single template and for double-index properties. Given a global computation x of U^n, we use x[i,j] to denote the sequence resulting from projecting x onto the local computation sequences of threads U[i] and U[j].

Correctness Properties. Given template U, we consider double-index properties of the form ⋀_{i,j} Eg(i,j), where g(i,j) is an LTL\X property interpreted over the local control states of copies U[i] and U[j]. Note that due to symmetry U^n ⊨ ⋀_{i,j} Eg(i,j) if U^n ⊨ Eg(1,2). We thus restrict ourselves to properties of the form Eg(1,2). For this logic, we follow the classification L(Op_1, . . . , Op_k) based on the temporal operators allowed in the correctness property as previously formulated. We observe that double-index LTL\X is a very rich logic which can encode many properties of interest. For instance, the presence of a data race can be formulated as the double-index formula EF(c_1 ∧ d_2).

Given the family of PDSs, for a concurrent program comprised of many similar components, e.g., device drivers which run hardware like disk drives, audio speakers, etc., and a temporal logic property f (as described above), compute a cutoff for f in block 12. Here, c is called a cutoff if for all n ≥ c, U^n satisfies f if U^c satisfies f. This reduces the problem to debugging a system with up to a cutoff number of processes.

In block 14, to compute these cutoffs, employ weighted multi-automata. Computation of these cutoffs reduces to pre*-closure computations of weighted automata which can be carried out efficiently in polynomial time in the size of source code. Once cutoffs have been computed, model check the resulting systems of cutoff size in block 16. The techniques used depend on the communication primitives used by the threads and the techniques may be known.

The present embodiments extend inter-procedural dataflow analysis to the parameterized concurrent domain for realistic models of communication among threads. All the standard Java™ communication primitives may be modeled. The present embodiments are more scalable, i.e., can potentially verify larger programs than existing techniques, and accomplish this by avoiding construction of the global state space of the given program thus bypassing the state explosion problem. The analysis is reduced from a concurrent multithreaded program to its individual threads.

The methods are both sound and complete, thus avoiding bogus error traces that could be generated by less precise techniques. This is important from a commercial standpoint as most of the resources spent in real life verification go into detecting/avoiding bogus error traces.

Undecidability Barriers: We start by showing two undecidability results for the PMCP for systems comprised of PDSs that do not even interact with each other.

The decidability of the PMCP hinges on the set of temporal operators allowed in the correctness property, thereby providing a natural way to characterize fragments of double-indexed LTL for which the PMCP is decidable. In one example, we use L(Op_1, . . . , Op_k), where Op_i ∈ {F, U, G}, to denote the fragment comprised of formulae of the form Ef, where f is a double-indexed LTL\X formula in positive normal form (PNF), viz., only atomic propositions are negated, built using the temporal operators Op_1, . . . , Op_k and the Boolean connectives ∧ and ∨. Here F (“sometimes”), U (“until”) and G (“always”) denote the standard temporal operators and E is the “existential path quantifier”. L(U,G) is the full-blown double-indexed LTL\X.

The PMCPs for L(U) and L(G,F) are undecidable for systems comprised of non-interacting PDSs. An important consequence of this is that for more expressive systems wherein PDSs interact using some synchronization mechanism, we need to focus only on the remaining fragments, e.g., L(F) and L(G).

Pairwise and Asynchronous Rendezvous: Let {U^n} be the parameterized family defined by a template process U modeled as a PDS synchronizing via pairwise rendezvous. Here, Σ, the set of action symbols of U, is comprised of the set Σ_in of internal transition labels, and the sets Σ_pr × {!} and Σ_pr × {?} of send and receive pairwise rendezvous transitions, respectively. We assume that synchronizing transitions, i.e., those labeled by actions in Σ_pr × {!} ∪ Σ_pr × {?}, do not modify the stack of the PDS executing the transition. For action l ∈ Σ_pr, a pair of transitions labeled with l! and l? are called matching. We recall that for r ∈ Σ_pr × {!} ∪ Σ_pr × {?}, transition tr_1: a →^r b of a process U[i] of U^n is enabled in global state s if there exists a process U[j] of U^n, where j ≠ i, in local state c such that there exists a matching transition of the form tr_2: c →^{r′} d in Δ, where r′ is the action matching r. To execute the rendezvous, both the pairwise send and receive transitions tr_1 and tr_2 must be fired synchronously in one atomic step.

We present decision procedures for the PMCP for L(F) and L(G) for PDSs interacting via pairwise and asynchronous rendezvous. We start by presenting a provably efficient procedure for computing the set of all parameterized reachable control locations of U. This is needed not only for formulating the procedures for L(F) and L(G) but is also of independent interest, as detecting the presence of data races can be reduced to deciding parameterized reachability.
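As a hedged illustration of the enabling condition above, the following Python sketch enumerates the global states reachable by firing one pairwise rendezvous. Because synchronizing transitions are assumed not to modify the stack, a global state is represented here simply as a tuple of local control states. The function name rendezvous_successors, the tuple encoding (source, label, kind, target), and the example transitions are assumptions made only for this sketch.

```python
def rendezvous_successors(state, transitions):
    """Fire every enabled matching send/receive pair atomically.

    state:       tuple of local control states, one entry per process
    transitions: list of (src, label, kind, dst) with kind "!" (send) or "?" (receive)
    """
    result = []
    for i, a in enumerate(state):
        for j, c in enumerate(state):
            if i == j:                    # sender and receiver must be distinct processes
                continue
            for (src1, l1, k1, b) in transitions:
                if src1 != a or k1 != "!":
                    continue
                for (src2, l2, k2, d) in transitions:
                    if src2 == c and l2 == l1 and k2 == "?":
                        nxt = list(state)
                        nxt[i], nxt[j] = b, d   # both moves happen in one step
                        result.append(tuple(nxt))
    return result

# Hypothetical template fragment: a send and a matching receive, both from c0.
transitions = [("c0", "a", "!", "c9"), ("c0", "a", "?", "c8")]
two_copies = rendezvous_successors(("c0", "c0"), transitions)
one_copy = rendezvous_successors(("c0",), transitions)
```

With two copies in c0 either copy may act as the sender, whereas a single copy has no partner and therefore no enabled rendezvous.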

Parameterized Reachability: We say that a control state c (configuration c) of template process U is parameterized reachable if there exists a reachable global state s of Uⁿ, for some n, with a process in control state c (configuration c).

It can be shown that if a configuration c of U is parameterized reachable, then given l, for some k, there exists a reachable global state s of U^k with at least l copies of c. In other words, we can pump up the multiplicity of each parameterized reachable configuration of U to any arbitrarily large value. This relieves us of the burden of tracking the multiplicity of each configuration of U.

Unbounded Multiplicity: Let R be the set of all parameterized reachable configurations of U and let R′ be a finite subset of R. Then given l, for some m, there exists a finite computation of U^m leading to a global state s with at least l copies of each configuration in R′.

The above result reduces the PMCP for EF(c_1 ∧ . . . ∧ c_k), i.e., the presence of a data race, to the PMCP for EFc, where c is a control state of U. We have: ∃n, U^n ⊨ EF(c_1 ∧ . . . ∧ c_k) if for each i ∈ [1 . . . k], c_i is parameterized reachable.
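As a minimal sketch of how this reduction might be used, assume the set of parameterized reachable control states has already been computed by some procedure. The data race query then becomes a membership test per control state. The function name and the example sets below are hypothetical.

```python
def data_race_possible(param_reachable, race_states):
    """True if every control state named in EF(c_1 ∧ ... ∧ c_k) is parameterized reachable."""
    return all(c in param_reachable for c in race_states)

# Hypothetical input: states assumed to have been found parameterized reachable.
param_reachable = {"c0", "c1", "c4", "d"}
race_found = data_race_possible(param_reachable, ["c1", "d"])
no_race = data_race_possible(param_reachable, ["c1", "c7"])
```

No multiplicities need to be tracked in this check, which reflects the unbounded multiplicity result stated above.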

While computing parameterized reachable control states for the case where U is a finite state labeled transition system can be accomplished via a simple fixpoint computation, for PDSs it is complicated by the requirement to simultaneously satisfy constraints arising both out of synchronization primitives and context-free reachability introduced by the stack.

Referring to FIG. 2, an example template process is shown for determining reachability. Consider the template process U. Suppose that we want to decide whether for some n, U^n ⊨ EF c_1. We start with the set R_0 = {c_0} containing only the initial state c_0 of U. We then construct a series of sets R_0, . . . , R_m, where R_{i+1} is obtained from R_i by adding new control states that become parameterized reachable assuming that all states in R_i are parameterized reachable. In constructing R_{i+1} from R_i, we need to make sure that both the constraints, i.e., those imposed by (i) the synchronization primitives, and (ii) context-free reachability, are satisfied. We accomplish this in a dovetailed fashion.

First, to satisfy the synchronization constraints, we convert all transitions of the form a →^p b such that there exists a transition of the form c →^{p′} d, where p and p′ are matching send and receive rendezvous actions with c ∈ R_i, to an internal transition of the form a →^τ b, where τ is a newly introduced special internal action symbol in Σ_in. This is motivated by the fact that since c is parameterized reachable, we can ensure that if a becomes parameterized reachable (now or in some future iteration), then, for some m, there exists a reachable global state of U^m with a process each in local states a and c. In other words, if a becomes reachable, the rendezvous transition a →^p b can always be enabled and executed. Thus, it can be treated like an internal transition. In this way, by flooding all the control states of R_i, we can remove all the synchronization constraints arising out of pairwise send or receive transitions emanating from control states in R_i. This will enable every rendezvous transition with a matching send/receive starting at a control state in R_i. Such transitions can therefore be replaced by internal transitions. Motivated by this, we define U_{i+1} to be the template that we get from the original template U by replacing the appropriate pairwise rendezvous send/receive transitions as described above with internal transitions and removing the remaining rendezvous send and receive transitions.

To check that the second constraint, i.e., context-free reachability, is satisfied, we can now use any procedure for model checking a single PDS to determine the set R_c^i of those control states of U that are reachable in the individual PDS U_i. This gives us the set R_c^i of all the context-free reachable states in U_i. If new control states become reachable via removal of some synchronization constraints in the previous step, they are added to R_{i+1}; otherwise, we have reached a fixpoint and the procedure terminates.

Referring to FIG. 3, in the example, R_0 is initialized to {c_0}. This enables both the transitions c_0 → c_9 and c_0 → c_8, and hence both of them can be converted to internal transitions, resulting in the template U_1. In the second iteration, we note that c_5, c_6, c_8 and c_9 are all reachable control states of template U_1 and so R_1 = {c_0, c_5, c_6, c_8, c_9}. Now, since both c_0 and c_5 are in R_1, the rendezvous transitions c_5 → c_2 and c_0 → c_7 become enabled and can be converted to internal transitions, resulting in the template U_2. In U_2, control states c_2, c_4 and c_7 now become reachable and are therefore added to R_1, resulting in R_2 = {c_0, c_2, c_4, c_5, c_6, c_7, c_8, c_9}. Finally, since both the control states c_4 and c_6 are in R_2, the rendezvous transitions c_6 → c_3 and c_4 → c_1 are converted to internal transitions, resulting in the template U_3. Since c_1 and c_3 are reachable control locations of U_3, these control locations are now included in R_3, thereby reaching a fixpoint and leading to termination of the procedure. Since c_1 ∈ R_3, we conclude that c_1 is parameterized reachable. A formal description of a method A is given below. The method A returns the set of parameterized reachable control states of U.

METHOD A: Initialize i=0 and R₀={c₀}, where c₀ is the initial state of U. Next, set i=i+1. Construct PDS U_i by replacing each pairwise send (receive) transition of template U of the form a → b, such that there exists a matching receive (send) transition of the form c → d where c∈R_i−1, by the internal transition a -τ→ b, and removing the remaining pairwise send or receive rendezvous transitions. Compute the set R_cⁱ of context-free reachable control locations of U_i using a procedure for model checking a single PDS. Set R_i=R_i−1∪R_cⁱ. Except for the initialization step, repeat these steps until R_i⊆R_i−1, i.e., until no new control state is added. Return R_i.
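By way of illustration only, the fixpoint of method A may be sketched as follows. The sketch makes a simplifying assumption that is not part of the method: the template is finite-state (no stack), so ordinary graph reachability stands in for the single-PDS model checking step. The function name `parameterized_reachable` and the tuple encoding of transitions are hypothetical.

```python
from collections import deque

def parameterized_reachable(initial, internal, rendezvous):
    """Fixpoint of method A for a finite-state template (assumption: no stack).

    internal:   set of (a, b) internal transitions
    rendezvous: set of (a, label, kind, b), kind '!' for send, '?' for receive
    """
    def partner(kind):
        return '?' if kind == '!' else '!'

    reached = {initial}                       # R_0
    while True:
        # Build U_i: keep internal edges, convert enabled rendezvous edges,
        # and drop the rendezvous edges that have no enabled partner.
        edges = set(internal)
        for (a, label, kind, b) in rendezvous:
            if any(c in reached and l == label and k == partner(kind)
                   for (c, l, k, _d) in rendezvous):
                edges.add((a, b))
        # Graph reachability stands in for the PDS reachability procedure.
        seen, queue = {initial}, deque([initial])
        while queue:
            s = queue.popleft()
            for (a, b) in edges:
                if a == s and b not in seen:
                    seen.add(b)
                    queue.append(b)
        new_reached = reached | seen          # R_i = R_{i-1} union R_c^i
        if new_reached <= reached:            # fixpoint: nothing new was added
            return reached
        reached = new_reached
```

In a hypothetical template with transitions c0 -a!→ c1, c0 -a?→ c2, c1 -b!→ c3 and c4 -b?→ c5, the send on b is never enabled because its only partner leaves c4, which is never reached, so c3 stays outside the result.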

Complexity Analysis: We start by noting that in each iteration of the method A except the last, we add at least one new control state to R_i. Thus, the method terminates after at most |Q| iterations, where Q is the set of control states of U. During the ith iteration we need to decide, for each control state in Q\R_i, whether it is context-free reachable in U_i+1, which, by using a model checking procedure for PDSs, can be accomplished in O(|U|³) time, where |U| is the size of U. Each iteration therefore takes at most O(|U|⁴) time. Thus, the entire method runs in O(|U|⁵) time. The Parameterized Model Checking Problem for control state reachability, and hence for EF(c₁∧ . . . ∧c_k) (data race), for systems composed from a template PDS U interacting via pairwise rendezvous can be decided in O(|U|⁵) time, where |U| is the size of U.
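The bound may be read as a product of three factors, assuming (as the analysis does implicitly) that the number of control states satisfies |Q| ≦ |U|:

```latex
\underbrace{|Q|}_{\text{iterations}}
\times
\underbrace{|Q|}_{\text{control states tested per iteration}}
\times
\underbrace{O(|U|^{3})}_{\text{one PDS reachability test}}
\;=\; O(|U|^{5}),
\qquad \text{since } |Q| \le |U|.
```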

Asynchronous Rendezvous: The procedure for deciding the PMCP for PDSs interacting via asynchronous rendezvous, which are more expressive than pairwise rendezvous, is essentially the same as the method A. A minor modification is needed to account for the slightly different semantics of an asynchronous rendezvous. The only difference is that an asynchronous send transition a → b can be executed irrespective of whether a matching receive c → d is present or not. A receive transition, on the other hand, does require a matching send to be currently enabled, with both the send and receive transitions then being fired atomically. The construction of PDS U_i in method A is therefore modified as follows: we replace each asynchronous send transition of template U of the form a → b with the internal transition a -τ→ b. On the other hand, to replace a receive transition of the form a → b with the internal transition a -τ→ b, we need to test whether there exists a matching send transition of the form c → d with c∈R_i−1. The remaining asynchronous receive transitions are removed. The time complexity of the method remains the same.
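The modified construction of U_i can be sketched as below. This is a minimal illustration, again assuming a finite-state template; the function name `build_async_template` and the (source, label, target) encoding are hypothetical.

```python
def build_async_template(reached, internal, sends, receives):
    """Edges of U_i under asynchronous rendezvous (finite-state assumption).

    reached:  the set R_{i-1}
    internal: set of (a, b)
    sends, receives: sets of (a, label, b)
    """
    edges = set(internal)
    # Every asynchronous send may fire on its own, so it becomes internal.
    edges |= {(a, b) for (a, _label, b) in sends}
    # A receive needs a matching send leaving a state already in R_{i-1}.
    enabled = {label for (c, label, _d) in sends if c in reached}
    edges |= {(a, b) for (a, label, b) in receives if label in enabled}
    return edges
```

Receives whose label is not in `enabled` are simply left out, which corresponds to removing the remaining receive transitions.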

Extension to Multiple Templates: To start with, R₀ contains the initial control state of each of the templates U₁, . . . , U_m. The set R_i now tracks the union of parameterized reachable control states detected up to the ith iteration in any of the templates. Finally, in method A, for each 1≦j≦m we construct PDS U_jⁱ by replacing each rendezvous send/receive transition a → b in template U_j that has an enabled matching receive/send transition of the form c → d in any of the templates, where c∈R_i−1, with the internal transition a -τ→ b.

Model Checking Procedure for L(F): From the given template U=(P,Act,Γ,c₀,Δ), we define the new template U_R=(P_R,Act,Γ,c₀,Δ_R), where P_R is the set of parameterized reachable control states of U and Δ_R is the set of transitions of U between states of P_R, with each pairwise rendezvous send or receive transition converted to an internal transition. Let f be a formula of the form Eg(1,2), where g(1,2) is a double-indexed LTL\X formula with atomic propositions over U[1] and U[2]. Then, if we restrict reasoning about f to finite computation paths, for some n, Uⁿ|=E_fin g if U_R²|=E_fin g, where E_fin quantifies only over finite paths.

The intuition behind the reduction of the PMCP to a 2-process instance is a flooding argument resulting from the unbounded multiplicity result. If f has a finite computation x of length l, say, as a model, then at most l pairwise send or receive transitions are fired along x. By the unbounded multiplicity lemma, for some m, there exists a computation y leading to a reachable state of U^m with at least l copies of each control state of U_R. In the system U^m+2 with m+2 processes, we first let processes U₃, . . . , U_m+2 execute y to flood all control states of U_R with multiplicity at least l. Then, we are guaranteed that in any computation x of U[1,2] of length not more than l, each rendezvous transition can always be fired via synchronization with one of the processes U₃, . . . , U_m+2, and can therefore be treated as an internal transition.

Thus we have: (Binary Reduction Result). For any finite computation x of Uⁿ, where n≧2, there exists a finite computation y of U_R² such that y is stuttering equivalent to x[1,2]. As an immediate corollary, it follows that if f has a model which is a finite computation of U^m, for some m, then for some k, U^k|=f if U_R²|=f. In particular:

Corollary: For any formula f of L(F), for some m, U^m|=f if U_R²|=f.

Note that the above result reduces the PMCP for L(F) for PDSs interacting via pairwise or asynchronous rendezvous to (standard) model checking of systems comprised of only two non-interacting PDSs which is known to be efficiently decidable. As a corollary, we have that the PMCP for L(F) is decidable in polynomial time in the size of U.

Computing Cutoffs: We say that cut is a cutoff for a temporal logic formula f and a parameterized family defined by a template U if for m≧cut, U^m|=f if U^cut|=f. The existence of a cutoff for a formula f is useful as it reduces the PMCP for f to a finite number of standard model checking problems for systems with up to the cutoff number of copies of U. Let B(F) be the set of branching time formulae built using the temporal operator AF, the boolean operators ∨ and ∧, and atomic propositions. We show how to compute cutoffs for L(F) formulae and then extend this to handle B(F) formulae. One motivation for computing cutoffs is that it is a step in the decision procedure for the PMCP for L(G) formulae. One can, of course, use the cutoff approach to model check L(F) formulae also.

Cutoffs for L(F) formulae: We start by observing that the cutoff cut for a formula f of L(F) is related to the number of rendezvous transitions fired along finite computations satisfying f. Let x be a finite computation of Uⁿ, for some n, satisfying f. For each rendezvous transition tr of U, let n_tr be the number of times tr is fired along x[1,2]. We assume, without loss of generality, that each rendezvous send/receive transition tr has a unique matching receive/send transition in U, denoted by tr̄ (tr with an overbar). For each control state c, let Tr_c be the set of pairwise rendezvous send or receive transitions tr of the form c → d such that the matching transition tr̄ is fired along x[1,2]. Also, for each control state c of U, let n_c be the sum, over all tr∈Tr_c, of n_tr̄. Then, one can give a cutoff for f in terms of the values of n_c.

As a first step in that direction, we show that if cut′ is such that there exists a reachable global state of U^cut′ with at least n_c copies of each control state c, then, using a flooding argument, cut′+2 is a cutoff for f. Next, we estimate an upper bound for cut′ from n_c. We denote by i_c the first iteration of the method A in which control state c of U was added to R_i. Then, we have: U^m|=EFc, where m=2^(i_c). For each control state c of U, let N_c be a cutoff for EFc. Then cut′≦Σ_{c∈R} n_c·N_c.

The problem of computing cut thus reduces to computing bounds for n_tr and N_c. We start with n_tr, the number of pairwise rendezvous transitions fired along a computation of U^m, for some m, satisfying the given L(F) formula. We first consider the case where an L(F) formula is single-index, i.e., atomic propositions are interpreted only over one process. For this, we assume, without loss of generality, that each control state of U is parameterized reachable; otherwise we simply remove the unreachable states and the associated transitions. Furthermore, using the same flooding argument, we have that each control state of U can be flooded with arbitrary multiplicity. Thus, when reasoning about finite computations, we can treat each rendezvous transition as an internal transition. This eases the analysis since, instead of reasoning about the parameterized family {Uⁿ}, it suffices to reason only about the single template U.

Computation of these bounds for PDSs is complicated by context-free reachability introduced by the stack. To handle that, we leverage the notion of a Weighted Multi-Automaton (WMA), which is a Multi-Automaton (MA) with each of its transitions labeled with a non-negative integer. WMAs have been used before for dataflow analysis. However, they are employed here for a different purpose, namely, for estimating a bound on the number of pairwise rendezvous transitions fired in transiting between two control states. Intuitively, the weight labeling a transition s → t of a WMA indicates an upper bound on the number of rendezvous transitions that need be fired in order to get from s to t.

A Weighted Multi-Automaton (WMA) may be defined as follows. Given a PDS P=(P,Γ,c₀,Δ), a weighted multi-automaton is a tuple M=(Γ,Q,δ,w,I,F), where M′=(Γ,Q,δ,I,F) is a multi-automaton and w:δ→Z is a function mapping each transition of M to a non-negative integer. The weight of a finite path x of M is defined to be the sum of the weights of all the transitions appearing along x. Given states s and t of M, we use s -(u,b)→ t to denote the fact that there is a path in M from s to t labeled with u and having weight b. To estimate a bound for the number of rendezvous transitions fired along a computation satisfying f, we proceed by constructing a WMA M_f for f which captures the (regular) set of all configurations of U which satisfy f. Then, if b is the weight of an accepting path for (c₀, ⊥) in M_f, we show that there exists a path of U along which at most b pairwise rendezvous transitions are fired.

Since an L(F) formula is built using the operators F, ∧ and ∨, in order to construct M_f it suffices to show how to construct WMAs for Fg, g∧h and g∨h, given WMAs for g and h. Then, given an L(F) formula f, repeated application of these constructions inside out, starting with the WMAs for the atomic propositions of f, gives us M_f.

DEFINITIONS

Multi-Automata: Let P=(P,Act,Γ,c₀,Δ) be a pushdown system where P={p₁, . . . , p_m}. A P-multi-automaton (P-MA for short) is a tuple A=(Γ,Q,δ,I,F) where Q is a finite set of states, δ⊂Q×Γ×Q is a set of transitions, I={s₁, . . . , s_m}⊂Q is a set of initial states and F⊂Q is a set of final states. Each initial state s_i corresponds to the control state p_i of P.

We define the transition relation →⊂Q×Γ*×Q as the smallest relation satisfying the following:

- if (q,γ,q′)∈δ then q -γ→ q′,
- q -ε→ q for every q∈Q, where ε denotes the empty word, and
- if q -w→ q″ and q″ -γ→ q′ then q -wγ→ q′.

A multi-automaton can be thought of as a data structure that is used to succinctly represent (potentially infinite) regular sets of configurations of a given PDS. Towards that end, we say that multi-automaton A accepts a configuration (p_i,w) if s_i -w→ q for some q∈F. The set of configurations recognized by A is denoted by Conf(A). A set of configurations is regular if it is recognized by some MA.
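The acceptance condition can be sketched as a simple subset simulation over the stack word. The encoding below (a set of triples for δ and a dictionary from control states to initial states) is an assumption made for illustration only.

```python
def ma_accepts(delta, initial_of, finals, control, stack_word):
    """Return True if the multi-automaton accepts (control, stack_word).

    delta:      set of (q, gamma, q2) transitions
    initial_of: dict mapping each control state p_i to its initial state s_i
    finals:     set of final states
    """
    current = {initial_of[control]}
    for gamma in stack_word:
        # Follow every transition labeled with the next stack symbol.
        current = {q2 for (q, g, q2) in delta if q in current and g == gamma}
    return bool(current & finals)
```

With the two transitions s_p -g→ f and f -g→ f and final state f, this small automaton represents the infinite set of configurations of control state p whose stack consists of one or more copies of g.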

Alternating Pushdown Systems: Let P=(P,Act,Γ,c₀,Δ) be a pushdown system. An alternating pushdown system (APDS) is a tuple P=(P,Γ,Δ), where P is a finite set of control locations, Γ is a finite stack alphabet, and Δ⊂(P×Γ)×2^(P×Γ*) is a finite set of transition rules. For (p,γ,S)∈Δ, each successor set of the form {(p₁,w₁), . . . , (p_n,w_n)}∈S denotes a transition of P and is written (p,γ) ↪ {(p₁,w₁), . . . , (p_n,w_n)}. Due to non-determinism there may be multiple successor sets for each pair of control state p and stack symbol γ, all of which are captured by the set S. A configuration of P is a pair (p,w), where p∈P denotes the control location and w∈Γ* the stack content. The set of all configurations of P is denoted by C. If (p,γ) ↪ {(p₁,w₁), . . . , (p_n,w_n)}, then for every w∈Γ* the configuration (p,γw) is an immediate predecessor of the set {(p₁,w₁w), . . . , (p_n,w_nw)}, this set being called an immediate successor of (p,γw). We use → to denote the immediate successor relation. Note that firing the transition (p,γ) ↪ {(p₁,w₁), . . . , (p_n,w_n)} from configuration (p,γw) causes the APDS to branch into the configurations (p₁,w₁w), . . . , (p_n,w_nw).

A run of P for an initial configuration c is a tree of configurations with root c, such that the children of a node c′ are the configurations that belong to one of its immediate successors. We define the reachability relation ⇒ ⊂ (P×Γ*)×2^(P×Γ*) between configurations and sets of configurations. Informally, c ⇒ C if and only if C is a finite frontier of a run of P starting from c. Formally, ⇒ is the smallest subset of (P×Γ*)×2^(P×Γ*) such that

- c ⇒ {c} for every c∈P×Γ*,
- if c is an immediate predecessor of C, then c ⇒ C, and
- if c ⇒ {c₁, . . . , c_n} and c_i ⇒ C_i for each 1≦i≦n, then c ⇒ (C₁∪ . . . ∪C_n).

Alternating Multi-Automata: Let P=(P,Γ,Δ) be an APDS where P={p₁, . . . , p_m}. An alternating P-multi-automaton (P-AMA for short) is a tuple A=(Γ,Q,δ,I,F) where Q is a finite set of states, δ⊂Q×Γ×2^Q is a set of transitions, I={s₁, . . . , s_m}⊂Q is a set of initial states and F⊂Q is a set of final states. We define the transition relation →⊂Q×Γ*×2^Q as the smallest relation satisfying the following:

- if (q,γ,Q′)∈δ then q -γ→ Q′,
- q -ε→ {q} for every q∈Q, and
- if q -w→ {q₁, . . . , q_n} and q_i -γ→ Q_i for each 1≦i≦n, then q -wγ→ (Q₁∪ . . . ∪Q_n).

AMA A accepts a configuration (p_i,w) if s_i -w→ Q for some Q⊂F. The set of configurations recognized by A is denoted by Conf(A). Given a finite sequence w∈Γ* and a state q∈Q, a run of A over w starting from q is a finite tree whose nodes are labeled by states in Q and whose edges are labeled by symbols in Γ such that the root is labeled by q and the labeling of the other nodes is consistent with δ. Observe that in such a tree each sequence of edges going from the root to the leaves is labeled with w. A set of configurations is regular if it is recognized by some AMA.
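Acceptance by an AMA differs from the non-alternating case in that every state of the chosen successor set must accept the remainder of the word. A minimal recursive sketch, with a hypothetical encoding of δ as triples whose third component is a frozenset of successor states, is:

```python
def ama_accepts_from(delta, finals, state, word):
    """True if some run over `word` from `state` has all its leaves in `finals`.

    delta: set of (q, gamma, successors) with successors a frozenset of states
    word:  sequence of stack symbols
    """
    if not word:
        return state in finals
    head, rest = word[0], word[1:]
    # Existential choice of a transition, universal check of its successors.
    return any(
        all(ama_accepts_from(delta, finals, q, rest) for q in successors)
        for (q0, gamma, successors) in delta
        if q0 == state and gamma == head
    )
```

A configuration (p_i, w) is then accepted when `ama_accepts_from(delta, finals, s_i, w)` holds for the initial state s_i of p_i.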

Weighted Automaton for ∨: Let M₁=(Γ,Q₁,δ₁,w₁,I₁,F₁) and M₂=(Γ,Q₂,δ₂,w₂,I₂,F₂) be two WMAs. Then, we can construct a WMA M accepting the union of the configurations accepted by M₁ and M₂ by first renaming each initial state s_l of M₁ as s′_l and each initial state s_l of M₂ as s″_l. Then we define a WMA M=M₁∨M₂ via the standard union construction M=(Γ,Q₁∪Q₂,δ₁∪δ₂∪δ₁₂,w,I,F₁∪F₂), where for each transition tr∈δ_i, w(tr)=w_i(tr), and w(tr)=0 for each tr∈δ₁₂; I is the set of newly introduced initial states s₁, . . . , s_m corresponding to control states p₁, . . . , p_m of the template U; and δ₁₂ is the set of zero-weight transitions ∪_l{s_l → s′_l, s_l → s″_l}.

Weighted Automaton for ∧: Let M₁=(Γ,Q₁,δ₁,w₁,I₁,F₁) and M₂=(Γ,Q₂,δ₂,w₂,I₂,F₂) be two WMAs. Then, we can construct a WMA M accepting the intersection of the sets of configurations accepted by M₁ and M₂ via the standard product construction M=(Γ,Q₁×Q₂,δ,w,I₁×I₂,F₁×F₂), where (s₁,s₂) -γ→ (s₃,s₄)∈δ if s₁ -γ→ s₃ is a transition of δ₁ and s₂ -γ→ s₄ is a transition of δ₂, and the weight w of this transition is the maximum of the weights w₁ and w₂ of the two component transitions. The state (s_i,s_i) is renamed as s_i in order to ensure that for each control state p_i of U there is an initial state of M.
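The product construction with maximum weights can be sketched as follows. The dictionary encoding of weighted transitions and the name `intersect_wma` are assumptions for illustration, and the final renaming of (s_i,s_i) to s_i is omitted for brevity.

```python
def intersect_wma(delta1, finals1, delta2, finals2):
    """Product of two weighted multi-automata.

    delta1, delta2: dicts mapping (state, symbol, state) to a non-negative weight
    Returns (delta, finals) of the product automaton.
    """
    delta = {}
    for (s1, g1, t1), w1 in delta1.items():
        for (s2, g2, t2), w2 in delta2.items():
            if g1 == g2:
                # Both components read the same stack symbol; keep the larger weight.
                delta[((s1, s2), g1, (t1, t2))] = max(w1, w2)
    finals = {(f1, f2) for f1 in finals1 for f2 in finals2}
    return delta, finals
```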

Weighted Multi-Automaton for Fg: Let M₀ be a given WMA accepting the regular set of configurations of U satisfying g. Starting at M₀, we construct a series of WMAs M₀, . . . , M_m, resulting in the WMA M_m. We recall from the definition of an MA that for each control state p_i of U there is an initial state s_i of M₀. We denote by →_k the transition relation of M_k. Then, for every k≧0, M_k+1 is obtained from M_k by conserving the set of states and adding new transitions as follows: (i) For each internal transition p_i → p_j, we add the transition s_i → s_j with weight 0. (ii) For each pairwise rendezvous send or receive transition p_i → p_j, we add the transition s_i → s_j with weight 1. (iii) For each stack transition (p_i,γ) → (p_j,u) of U, if there exists a path x in M_k from state s_j to t labeled with u, we add the transition s_i -γ→ t with weight w_u, where w_u is the sum of the weights of the transitions occurring along x. Note that if there exists more than one such path, we may take w_u to be the minimum weight over all such paths.
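One possible reading of this saturation is sketched below. It is an illustrative simplification, not the construction itself: rules (i) to (iii) are folded into a single rule format (p, γ, p′, u, cost), where cost is assumed to be 1 for a rendezvous transition and 0 otherwise, the automaton state for control state p is named p itself, and the weight of an added transition is taken to be cost plus the minimum weight of a matching path. All names are hypothetical.

```python
def min_weight_paths(delta, start, word):
    """Minimum weight of a path labeled `word` from `start` to each end state."""
    dist = {start: 0}
    for sym in word:
        step = {}
        for (q, g, q2), w in delta.items():
            if q in dist and g == sym:
                cand = dist[q] + w
                if cand < step.get(q2, float("inf")):
                    step[q2] = cand
        dist = step
    return dist

def weighted_pre_star(delta0, rules):
    """Saturate a weighted multi-automaton under the given PDS rules.

    delta0: dict mapping (state, symbol, state) to a weight
    rules:  list of (p, gamma, p2, word, cost) meaning (p, gamma) -> (p2, word)
    """
    delta = dict(delta0)
    changed = True
    while changed:
        changed = False
        for (p, gamma, p2, word, cost) in rules:
            for t, w in min_weight_paths(delta, p2, word).items():
                key = (p, gamma, t)
                # Add the transition, or lower its weight if a cheaper path exists.
                if cost + w < delta.get(key, float("inf")):
                    delta[key] = cost + w
                    changed = True
    return delta
```

Because each update either adds a transition or strictly lowers a non-negative integer weight, the loop terminates. The weight of an accepting path in the result then bounds the number of rendezvous transitions needed to reach the target set.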

For configurations s and t of U, let s ⇒_≦b t denote the fact that there is a path from s to t along which at most b pairwise rendezvous transitions are fired. Then, we have: if s_j -w→_i q along a path of weight b₁, then (p_j,w) ⇒_≦b (p_k,v) for some p_k and v such that s_k -v→₀ q along a path of weight b₂, where b=b₁+b₂. Moreover, if q is an initial state s_l, then p_k=p_l and v=ε. The constructions of WMAs for f∨g and f∧g are similar to the standard union and intersection constructions for automata.

Given an L(F) formula f, we first construct a WMA for each atomic proposition of f by constructing an MA for the atomic proposition and setting the weights of all its transitions to 0. Next, we perform the above operations by traversing the formula f inside out, starting from the atomic propositions. Let M_f be the resulting WMA. Using the above result, let configuration (q,u) of U be accepted by M_f and let b be the weight of an accepting path of M_f starting from q and labeled with u. Then there exists a finite path of U starting from (q,u) and satisfying f such that at most b pairwise rendezvous transitions are fired along it.

Doubly-indexed Formulae: We reduce the problem of computing cutoffs for double-indexed L(F) formulae f to that for single-index ones. Without loss of generality, each atomic proposition of f can be assumed to be of the form c or ¬c, where c is a control location of U. Rewriting ¬c as the disjunction of all the (finitely many) control states of U other than c, we can remove all negations from f. Let f=Eg. Then, by driving up the ∨ operator in g as far as possible, we can write g=g₁∨ . . . ∨g_k, where for each i, g_i does not contain the ∨ operator. Then, the minimum of the cutoffs for Eg₁, . . . , Eg_k is a cutoff for Eg. Thus, it suffices to compute cutoffs for Eg(1,2), where g(1,2) is a formula built using F and ∧, but without ∨, and atomic propositions that are control states of U[1] and U[2] in Uⁿ.

Note that with g(1,2), we can associate a set Seq of finite sequences of ordered pairs of the form (c_i,d_i), where c_i (respectively, d_i) is either true or a control state of U[1] (respectively, U[2]) occurring in g(1,2), which capture all possible orders in which global states satisfying c_i∧d_i can appear along computation paths satisfying g(1,2). For example, with the formula c₀¹∧F(c₁¹∧c₂²)∧Fc₃², where c_i^j is true if U[j] is currently in local control state c_i, we can associate the sequences (c₀¹,true),(true,c₃²),(c₁¹,c₂²) and (c₀¹,true),(c₁¹,c₂²),(true,c₃²). Thus Uⁿ|=Eg(1,2) if there exists a sequence π:(c₁,d₁), . . . , (c_k,d_k) in Seq and a computation path x along which there exist global states satisfying c₁∧d₁, . . . , c_k∧d_k in the order listed, viz., x|=g_π, where g_π=c₁∧d₁∧F(c₂∧d₂∧F( . . . )). Then the minimum of the cutoff bounds for Eg_π, where π∈Seq, gives us the desired cutoff. Finally, the computation of cutoff bounds for Eg_π can be reduced to that for single-index L(F) formulae via the following result. Given a formula f=c₁∧d₁∧F(c₂∧d₂∧F( . . . )), the sum of the cutoffs for f₁=c₁∧F(c₂∧F( . . . )) and f₂=d₁∧F(d₂∧F( . . . )) is a cutoff bound for f.

Computing N_c: Given a control state c of U, we now show how to compute N_c, viz., a cutoff for EFc. Let c be first included in R_i in the ith iteration of the method A. The computation of N_c is by induction on i. If i=0, viz., c is the initial state of U, then N_c=1. Now, assume that N_c is known for each c∈R_i, where i>0. Let c∈R_i+1\R_i. Then, there is a path of U_i+1 comprised of states of R_i leading to c. Using WMAs, we can compute, for each rendezvous transition tr, a bound on n_tr, the number of times tr is fired along a path of U_i+1 satisfying EFc. Also, since by the induction hypothesis we know the value of N_c′ for each c′ of R_i, a cutoff for EFc can be determined, thus completing the induction step.

Example for computing cutoffs: cut′+2 is a cutoff for f. Once all the control states c are flooded using processes U₃, . . . , U_cut′+2, processes U₁ and U₂ can execute x[1,2], wherein each rendezvous transition fired by U₁ or U₂ synchronizes with one of the processes U₃, . . . , U_cut′+2. For each control state c of U, let N_c be the cutoff for EFc; then cut′≦Σ_{c∈R} n_c·N_c. Since N_c is a cutoff for EFc, there exists a computation x_c of U^(N_c) leading to a global state with a process in local state c. To get a global state with at least n_c copies of c, we let processes U[1], . . . , U[N_c] of U^(n_c·N_c) execute x_c to reach a global state s₁ with at least one copy of c. Next, starting at s₁, we let processes U[N_c+1], . . . , U[2N_c] execute x_c to reach a global state s₂ with at least two copies of c. Repeating this process n_c times results in a global state s_n_c with at least n_c copies of c. Repeating this process for each control state c then gives us the desired result.
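The arithmetic of the flooding bound can be illustrated with a short sketch. The function name and the dictionaries `n` and `N` (holding n_c and N_c per control state) are hypothetical, and the numbers in the usage note are made up for illustration.

```python
def flooding_cutoff(n, N):
    """Bound on the flooding processes and the resulting cutoff.

    n: dict mapping control state c to n_c (copies of c that are needed)
    N: dict mapping control state c to N_c (a cutoff for EF c)
    Returns (bound on cut', cutoff for f).
    """
    bound = sum(n[c] * N[c] for c in n)   # n_c groups of N_c processes per state
    return bound, bound + 2               # plus the two observed processes
```

For instance, if two copies of c1 are needed with N_c1=3 and one copy of c2 with N_c2=4, then 2·3+1·4=10 flooding processes suffice, giving a cutoff of 12.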

Cutoffs for B(F): For generating cutoffs for B(F) formulae, we start by recalling the standard procedure for model checking PDSs for mu-calculus formulae. We first take the product of the given PDS U with an alternating automaton/tableau for the given formula f. Such products can be modeled as Alternating Pushdown Systems (APDSs). Then, model checking for f reduces to a pre*-closure computation for regular sets of configurations of the resulting APDS. These regular sets can be modeled as Alternating Multi-Automata (AMAs).

The procedure for computing cutoffs for B(F) formulae is similar to that for L(F), the only difference being that we use a Weighted Alternating Multi-Automaton (WAMA) to capture the branching nature of the formulae, where a state can now have a set of successors instead of just one. Thus, in a standard AMA each transition is a member of the set (Q×Γ)×2^Q. Note that since f is a branching time property, the model for f is a computation tree of U. Thus, while performing the pre*-closure computation, we need to keep track of the number of pairwise rendezvous fired along each branch of the computation trees encountered thus far. However, the number of pairwise rendezvous fired along different branches of a computation tree might, in general, be different, and hence each successor of an outgoing transition of a state needs to be assigned its own weight. Thus, each transition is now a member of the set (Q×Γ)×2^(Q×Z).

Weighted Alternating Multi-Automaton (WAMA): Given a PDS P=(P,Γ,c₀,Δ), a WAMA is a tuple M=(Γ,Q,δ,I,F), where δ⊂(Q×Γ)×2^(Q×Z) and M′=(Γ,Q,δ′,I,F) is an AMA, where
δ′={(s,{t₁, . . . , t_m}) | (s,{(t₁,w₁), . . . , (t_m,w_m)})∈δ}.

Updating the weights during the pre*-closure to compute the WAMA for AFg given the WAMA for g can be carried out in a similar fashion as for L(F) formulae, the only difference being that the weights need to be updated for each successor. Let M₀ be a given WAMA accepting a regular set of configurations. Starting at M₀, we construct a series of WAMAs M₀, . . . , M_p, resulting in the WAMA M_p. We denote by →_k the transition relation of M_k. Then, for every k≧0, M_k+1 is obtained from M_k by conserving the set of states and adding new transitions as follows: (i) For each internal transition p_i → p_j, we add the transition s_i → s_j with weight 0. (ii) For each pairwise rendezvous send or receive transition p_i → p_j, we add the transition s_i → s_j with weight 1. (iii) For each transition (p_j,γ) → {(p_k₁,w₁), . . . , (p_k_m,w_m)} and every set of paths s_k₁ -w₁→_k {(p₁₁,b₁₁), . . . , (p_1i₁,b_1i₁)}, . . . , s_k_m -w_m→_k {(p_m1,b_m1), . . . , (p_mi_m,b_mi_m)}, we add a new transition s_j -γ→_k+1 {(q₁,b₁), . . . , (q_l,b_l)}, where {q₁, . . . , q_l} is the set of all the target states p_rs and, for each j, b_j is the maximum of all b_rs with p_rs=q_j.

The Model Checking Procedure for L(G): Reasoning about a double-indexed LTL formula with infinite models is in general harder than reasoning about one with finite models. This is because one now has to ensure that there exists an infinite computation of U^m, for some m, along which rendezvous transitions can not only be recycled infinitely often, but can be recycled while maintaining context-free reachability. However, we exploit the fact that the dual of an L(G) formula is of the form Ag, where g is built using the temporal operator F and the boolean connectives ∧ and ∨. Such formulae have finite tree-like models. However, note that ∃n:Uⁿ|=f holds if and only if ∀n:Uⁿ|=¬f does not hold. Thus, if we resort to the dual of f, the resulting problem is no longer a PMCP. A method for the PMCP for L(G) is then the following: 1. Given an L(G) formula f, construct a B(F) formula g equivalent to ¬f, viz., Uⁿ|=f if and only if Uⁿ does not satisfy g. 2. Compute the cutoff cut for g. 3. For each m≦cut, check if U^m|=¬g.

The procedure for computing cutoffs for B(F) formulae was given above. For step 3, it suffices to check, for each m≦cut, whether U^m|=¬g, where f=¬g is an L(G) formula. The model checking problem for L(G) formulae for systems with a finite number of PDSs interacting via pairwise or asynchronous rendezvous is already known to be decidable. It can also be shown how to construct a B(F) formula g equivalent to ¬f.

Disjunctive Guards: We consider the PMCP for PDSs interacting via disjunctive guards. Here transitions of U are labeled with guards that are boolean expressions of the form (c₁∨ . . . ∨c_k), with c₁, . . . , c_k being control states of U. In copy U[i] of template U in Uⁿ, a transition a → b of U with guard (c₁∨ . . . ∨c_k) is rewritten as a transition a → b whose guard ranges over the other processes, i.e., the disjunction over all j≠i of (c₁[j]∨ . . . ∨c_k[j]). In Uⁿ, such a transition of U[i] is enabled in a global state s of Uⁿ if there exists a process U[j] other than U[i] in at least one of the local states c₁, . . . , c_k in s. Concurrent systems with processes communicating via boolean guards are motivated by Dijkstra's guarded command model. The PMCP for finite state processes communicating via disjunctive guards was shown to be efficiently decidable. As for pairwise rendezvous, the unbounded multiplicity result holds. Then, as before, the set of parameterized reachable control states can be computed efficiently. The procedure is similar to the one for PDSs interacting via pairwise rendezvous, except that when constructing R_i+1 from R_i, in order to handle the synchronization constraints, we convert all transitions of the form a → b with guard (c₁∨ . . . ∨c_k), where for some j∈[1 . . . k], c_j∈R_i, to an internal transition of the form a -τ→ b. This is motivated by the fact that, since c_j is parameterized reachable, the transition a → b can always be enabled by ensuring, via the unbounded multiplicity result and a flooding argument, that there exists a process in local state c_j. We get: the Parameterized Model Checking Problem for control state reachability, for systems composed from a template PDS U interacting via disjunctive guards, can be decided in O(|U|⁵) time, where |U| is the size of U.
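The reachability fixpoint for disjunctive guards can be sketched in the same style. As before, this is a minimal illustration that assumes a finite-state template, so graph search replaces the PDS reachability step; the function name and the encoding of a guarded transition as (a, guard, b), with guard a frozenset of control states, are hypothetical.

```python
def reachable_with_disjunctive_guards(initial, internal, guarded):
    """Parameterized reachable states under disjunctive guards (finite-state).

    internal: set of (a, b)
    guarded:  set of (a, guard, b), guard being a frozenset of control states
    """
    reached = {initial}
    while True:
        # A guarded transition becomes internal once some disjunct is reachable.
        edges = set(internal) | {(a, b) for (a, guard, b) in guarded
                                 if guard & reached}
        seen, stack = {initial}, [initial]
        while stack:
            s = stack.pop()
            for (a, b) in edges:
                if a == s and b not in seen:
                    seen.add(b)
                    stack.append(b)
        if seen <= reached:               # fixpoint reached
            return reached
        reached |= seen
```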

PMCP for Linear Time Formulae: Let U_R be the PDS that we get from U by retaining only the parameterized reachable control states and all transitions between them that are either internal or labeled with a disjunctive guard which has at least one of the parameterized reachable control states as a disjunct. However, we replace each such transition labeled with a disjunctive guard with an internal transition, making U_R non-interacting. We first show via a flooding argument that for any double-indexed LTL\X formula g, for some n, Uⁿ|=Eg if U_R²|=Eg. By the unbounded multiplicity property, for some m, there exists a computation y leading to a global state of U^m with at least one copy of each parameterized reachable control state of U. In the system U^m+2 with m+2 processes, we first let processes U₃, . . . , U_m+2 execute y to flood all control states of U with multiplicity at least one. Then we are guaranteed that in any computation x (finite or infinite) of U[1,2], the transitions labeled with disjunctive guards can always be fired, as there exists a process among U₃, . . . , U_m+2 in each of the reachable control states. All such transitions can therefore be treated as internal transitions.

Binary Reduction Result: For any computation x of Uⁿ, where n≧2, there exists a computation y of U_R² such that y is stuttering equivalent to x[1,2]. Note that the above result reduces the PMCP for any doubly-indexed LTL\X formula f to model checking for a system with two non-interacting PDSs for f. It follows that we need to consider only the fragments L(F) and L(G). For these fragments the problem of model checking a system with non-interacting PDSs is already known to be efficiently decidable. Thus, the PMCP for PDSs interacting via disjunctive guards is efficiently decidable for the fragments L(F) and L(G) and undecidable for the fragments L(U) and L(F,G).

Locks: We consider the PMCP for PDSs interacting via locks. Leveraging the cutoff result determined above, we have that for n≧2, Uⁿ|=f if U²|=f, where f is a doubly-indexed LTL\X formula. This reduces the problem of deciding the PMCP for f to a (standard) model checking problem for a system comprised of two PDSs interacting via locks.

We now distinguish between nested and non-nested locks. A PDS accesses locks in a nested fashion if it can only release the last lock it acquired and that has not been released. For a system with two PDSs interacting via nested locks, the model checking problem is known to be efficiently decidable for both fragments of interest, viz., L(F) and L(G). So, the PMCP for L(F) and L(G) for PDSs interacting via nested locks is decidable in polynomial time in the number of control states of the given template U and exponential time in the number of locks of U.

For the case of non-nested locks, we show that the PMCP is decidable for L(G) but undecidable for L(F). For L(F) the undecidability result follows by reduction from the problem of model checking a system comprised of two PDSs P₁ and P₂ interacting via non-nested locks for the formula EF(c₁∧c₂), which is known to be undecidable.

The PMCP for EF(c₁∧c₂), and hence for L(F), is undecidable for PDSs interacting via non-nested locks. For L(G), it can be shown that the problem of model checking a system with PDSs interacting via locks for an L(G) formula f can be reduced to model checking an alternate formula f′ for two non-interacting PDSs. Given a template U interacting via the locks l₁, . . . , l_k, we construct a new template V with control states of the form (c,m₁, . . . , m_k). The idea is to store whether a copy of U is currently in possession of lock l_i in bit m_i, which is set to 1 or 0 according to whether the copy is in possession of l_i or not, respectively. Then, we can convert V into a non-interacting PDS by removing all locks from V and instead letting each transition of V acquiring/releasing l_i set m_i to 1/0. However, removing locks makes control states which were mutually exclusive in U² simultaneously reachable in V². In order to restore the lock semantics, while model checking for an L(G) property of the form Eg, we instead check for the modified L(G) property E(g∧g′), where g′=G(∧_i ¬(m_i¹∧m_i²)), with atomic proposition m_i^j evaluating to true in global state s if, in the local control state (c,m₁^j, . . . , m_k^j) of process V[j] in s, m_i^j=1. Note that g′ ensures that in the control states of V[1] and V[2], for each i, the m_i-entry corresponding to lock l_i cannot simultaneously be 1 for both V[1] and V[2], viz., U[1] and U[2] cannot both hold the same lock l_i. Then, the problem reduces to model checking two non-interacting PDSs for L(G) formulae, which is known to be decidable. This gives that the PMCP for L(G) is efficiently decidable for PDSs interacting via non-nested locks.
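The bit-vector encoding of lock ownership and the invariant enforced by g′ can be illustrated with a small sketch. The representation of a control state of V as a pair (c, bits) and the helper names `step` and `respects_locks` are assumptions made for illustration; the sketch checks the invariant on a single global state rather than performing any model checking.

```python
def step(state, target, action):
    """Fire a lock transition of V: move to `target` and flip one ownership bit.

    state:  (c, bits) with bits[i] == 1 iff this copy holds lock l_i
    action: ('acquire', i) or ('release', i)
    """
    _c, bits = state
    kind, i = action
    new_bits = list(bits)
    new_bits[i] = 1 if kind == "acquire" else 0
    return (target, tuple(new_bits))

def respects_locks(state1, state2):
    """The state predicate inside g': no lock bit is 1 in both local states."""
    (_c1, bits1), (_c2, bits2) = state1, state2
    return all(not (m1 and m2) for m1, m2 in zip(bits1, bits2))
```

Since V no longer blocks on locks, both copies may set the same bit; a computation passing through such a global state violates g′ and is therefore excluded when checking E(g∧g′).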

Broadcasts: We consider the PMCP for PDSs communicating via broadcasts. Here, Σ, the set of action symbols of U, is comprised of the set Σ_in of internal transition labels, and the sets Σ_pr×{!!} and Σ_pr×{??} of send and receive broadcast transitions, respectively. Like asynchronous rendezvous, a broadcast send transition that is enabled can always be fired. A broadcast receive transition can only be fired if there exists an enabled matching broadcast send transition. Broadcasts differ from asynchronous rendezvous in that executing a broadcast send transition forces not merely one but all other processes with matching receives to fire. It can be shown that for PDSs interacting via broadcasts, the PMCP for pairwise reachability, viz., EF(c₁∧c₂), is undecidable. The undecidability result for L(F) then follows as an immediate corollary: the PMCP for L(F) is undecidable for PDSs interacting via broadcasts.

We have considered the PMCP for PDSs interacting via each of the standard synchronization primitives for a broad class of temporal properties. Specifically, we have delineated the decidability boundary of the PMCP for PDSs interacting via each of the standard synchronization primitives for doubly-indexed LTL. We have also demonstrated that in many important cases of interest the PMCP is more tractable than the standard model checking problem. The practical implication of the new results is that in many applications, such as Linux™ device drivers, it may be more useful to consider the PMCP than the standard model checking problem.

Having described preferred embodiments of a system and method for inter-procedural dataflow analysis of parameterized concurrent software (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments disclosed which are within the scope and spirit of the invention as outlined by the appended claims. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.

Claims

1. A method implemented by a computer for computing dataflow in concurrent programs of a computer system, comprising:

given a family of threads (U1,..., Um) and a Linear Temporal Logic (LTL) property, f, for a concurrent program, computing a cutoff for the LTL property, f, where c is called the cutoff if for all n greater than or equal to c, Un satisfies f if Uc satisfies f, the cutoff being computed using weighted multi-automata for internal transitions of the threads; and

model checking a cutoff number of processes to verify data race freedom in the concurrent program.

2. The method as recited in claim 1, wherein the step of model checking includes establishing data race freedom in a concurrent program with at least two distinct drivers, each running respective threads, by establishing data race freedom in a parameterized system comprised of a plurality of copies of the respective threads.

3. The method as recited in claim 1, wherein the step of model checking includes establishing data race freedom in an undecidable concurrent program by establishing data race freedom in a parameterized system including the undecidable concurrent program.

4. The method as recited in claim 1, wherein the threads are modeled as push down systems (PDSs).

5. The method as recited in claim 1, wherein the threads interact with each other using synchronization primitives.

6. The method as recited in claim 1, wherein the synchronization primitives include at least one of pairwise rendezvous, asynchronous rendezvous, disjunctive guards, broadcasts, nested locks and non-nested locks.

7. The method as recited in claim 1, wherein f is a double-indexed LTL formula.

8. The method as recited in claim 1, wherein using weighted multi-automata for internal transitions of the threads includes estimating a bound on a number of transitions fired in transit between two control states.

9. A method implemented by a computer for computing dataflow in a computer program of a computer system, comprising:

given a family of threads modeled as pushdown systems (U1,..., Um) which interact by synchronization primitives and a Linear Temporal Logic (LTL) property, f, for a concurrent program, computing a cutoff for the LTL property, f, by computing bounds on a number of transitions fired along a computation of a thread between reachable control states of the concurrent program where the bounds are computed using weighted multi-automata on internal transitions of the threads; and

model checking a cutoff number of processes by parameterized model checking to verify data race freedom in the concurrent program.

10. The method as recited in claim 9, wherein the step of model checking includes establishing data race freedom in a concurrent program with at least two distinct drivers, each running respective threads, by establishing data race freedom in a parameterized system comprised of a plurality of copies of the respective threads.

11. The method as recited in claim 9, wherein the step of model checking includes establishing data race freedom in an undecidable concurrent program by establishing data race freedom in a parameterized system including the undecidable concurrent program.

12. The method as recited in claim 9, wherein the synchronization primitives include at least one of pairwise rendezvous, asynchronous rendezvous, disjunctive guards, broadcasts, nested locks and non-nested locks.