Workflow auto generation from user constraints and hierarchical dependence graphs for workflows
A system and method of modeling and evaluating workflows that provides workflow auto generation and Hierarchical Dependence Graphs for workflows. Modeling and evaluation of workflows is accomplished by accessing a knowledge database 2 containing service descriptions, generating valid workflow models 4, simulating workflows 6 and obtaining customer requirements through a Graphical User Interface 8. This system and method generate and display workflows that satisfy a user's requirements. In addition, Hierarchical Dependence Graphs provide abstract views that enable additional analysis and control of workflows.
This disclosure relates to Workflow Auto Generation and Workflow Analysis. It finds particular application in conjunction with workflow as related to printing jobs, and will be described with particular reference thereto. However, it is to be appreciated that the embodiments illustrated herein are also amenable to other like applications.
Workflow-based businesses rely heavily on their ability to effectively compete in and control existing and emerging workflows. Given the heterogeneity of the space, integration of these heterogeneous distributed systems is a considerable challenge and is fast becoming a critical factor of success in the business. In addition to the multiplicity of systems, customers are demanding customization and flexibility for their workflows. As a result, automating the integration and deployment of workflows today confers a considerable competitive advantage. Effective modeling is a key part of an overall workflow automation strategy.
Current workflow modeling technologies and tools enable clever visualization and some analysis capability. However, their effectiveness relies heavily upon the idiosyncratic knowledge and expertise of the person doing the modeling. That is, it is a highly manual and cumbersome effort and yields results only as good as the intuition and skill of the particular modeler.
Another aspect of this disclosure relates to Hierarchical Dependence Graphs for Dynamic JDF workflows. JDF is a Job Definition Format proposed by the industry consortium CIP4, which affects every aspect involved in the creation and production of printing, from pre-press and press to post-press. JDF provides a common language for describing a print job across enterprises, departments, software and systems. It also provides a basis for workflow automation that incorporates humans, machines and computers. But JDF itself is not an explicit workflow specification language. Instead, the JDF workflow is implicitly described as a job description that contains a collection of process nodes. The execution sequence of the process nodes of a job description is implicitly defined by the resource dependences across process nodes. JDF leaves the issue of how to drive the sequence of the process flow unspecified and completely up to the implementation of the MIS or Controller components in a JDF-based system. However, in most existing implementations, either a JDF workflow is hard-coded within the implementation, or only a limited set of static JDF workflows is supported. In order to facilitate a fully dynamic JDF workflow, the dependences among process nodes and resources should be expressed and tracked explicitly, and should also be decoupled completely from the implementations.
The Hierarchical Dependence Graph (HDG) of this disclosure extends the theory of the directed acyclic graph (DAG) by allowing hierarchical representation of workflows. It can be used to explicitly express the dependences across JDF (process) nodes and resources derived from any JDF job description. It defines a flexible and semantic-rich model to represent a JDF workflow as a set of DAGs at different abstractions: intent level, process group levels and process execution level. Explicitly representing JDF workflows in the HDG not only separates the workflow itself from MIS or Controller implementations, which supports fully dynamic JDF workflows, but also provides a theoretical basis for formal analysis of JDF workflows.
Furthermore, this disclosure introduces the concept of Connectivity Matrices and their transformations to allow two views derived from a single model: a process-centric view and a resource-centric view. By exploiting the fact that each of these views is a DAG with a hierarchical structure, it is possible to apply various analytical properties defined for DAGs and to recursively analyze JDF workflows, particularly from the following perspectives:
- Validating that the JDF workflow is a valid workflow without any cyclic dependence, missing resources, or dangling resources or nodes.
- Identifying the JDF nodes or resources impacted by changes in availability and workflow status.
- Intelligently handling failures or exceptions by considering the root causes of failures or exceptions rather than the static dependence pre-defined in a given workflow model.
The key innovations are primarily two-fold: (1) extending DAG (directed acyclic graph) with a hierarchical structure which results in a novel graph structure HDG (hierarchical dependence graph); and (2) using multiple orthogonal HDGs to explicitly describe the dependencies between workflow components, which eventually enables dynamic workflows, such as JDF.
BRIEF DESCRIPTION
In accordance with one embodiment of the disclosure, a workflow auto generation system is disclosed. The workflow auto generation system comprising a knowledge database containing service descriptions; a workflow modeling inference engine that generates valid workflow models by matching connectivity between various services in the knowledge base; a simulator performing a simulation of each workflow; and a Graphical User Interface to obtain customer requirements and display views of the workflows.
In accordance with another embodiment of the disclosure, a method of auto generating workflow is disclosed. The method of auto-generating workflow comprising accessing a knowledge database containing service descriptions; generating a workflow model using a workflow modeling simulation engine to match connectivity between various services in the knowledge base; simulating each workflow; obtaining customer requirements through a Graphical User Interface; and displaying views of the workflow through said Graphical User Interface.
In accordance with another embodiment of the disclosure, a workflow auto generation system is disclosed. The workflow auto generating system comprising means for accessing a knowledge database containing service descriptions; means for generating a workflow model using a workflow modeling simulation engine to match connectivity between various services in the knowledge base; means for simulating each workflow; means for obtaining customer requirements through a Graphical User Interface; and means for displaying views of the workflow through the graphical user interface.
In accordance with another embodiment of the disclosure, a workflow analysis and control system is disclosed. The workflow analysis and control system comprising a workflow client service, providing a description of various print jobs to be executed; a workflow analysis service, performing a Hierarchical Dependence Graph representation and analysis of a workflow, including process and resource dependences; and a workflow orchestrator, controlling the execution of said print jobs, wherein the workflow client service provides input to the workflow analysis service and the workflow analysis service provides input to the workflow orchestrator.
BRIEF DESCRIPTION OF THE DRAWINGS
The present disclosure provides a formal way of modeling and evaluating workflows, which obviates the current intuitive, trial-and-error approach. It is a technique for dynamically auto-generating all valid workflow models from a given set of functional requirements and determining optimal workflows based upon varying sets of user-specified parameters. Workflow service descriptions containing functional attributes, which describe control and data interfaces, and non-functional attributes, which describe service features and performance metrics, are stored in a logical database. Logically valid workflows are then generated by using a formal mechanism called Petri Nets. The valid workflows are evaluated against user-defined metrics to determine optimal workflows. The user-defined metrics are obtained by a question-generation mechanism. Possible workflows are visualized in various views using auto-graph layout techniques to bridge the gap between user functional requirements and vendor product offerings.
In order to dynamically generate workflows, the detailed service or process descriptions specifying their capability, recognition and interfaces are stored in the logical database. This information is used to create valid connectivity between various services. These descriptions are obtained from XML-based service interface descriptions such as the Job Definition Format (JDF). They contain information such as service control and data interactions and device associations. The control interaction describes the communication mechanism used by a service for control. The data interaction describes the communication mechanism essential for data exchange, such as the type of data structures, data types and data sizes. The control, data and other parameters that specify functionality can be categorized as functional attributes. Parameters specifying the device metrics, such as cost, QOS, availability and throughput can be categorized as non-functional attributes. The valid workflows generated contain all the services, which meet both the user's functional and non-functional requirements.
Petri Nets are used in this disclosure to represent distributed asynchronous systems operating concurrently. When a workflow is mapped to a Petri Net, certain properties dealing with the correctness of workflows, such as deadlock freedom, liveness, and boundedness, can be verified using graph analysis. The performance of workflows can also be simulated by colored and timed Petri Nets. Colored Petri Nets enable consideration of various job types and resource availability. Timed Petri Nets can be used to model workflows in which various services are dependent on time. In addition, hierarchical Petri Nets can also be used for modular and top-down representations of systems.
The primary requirement of a dynamically generated workflow model is correctness. Though functional programming languages are good at structuring programs and making them run efficiently, they are less suited to checking logical or syntactical correctness. Logic programming, due to its built-in support for non-determinism and unification, can be used to explicitly and thoroughly check the correctness of generated workflow models. Since the results are generated dynamically, inclusion of new components does not impact the original program.
Generated workflows are in the form of a formal edge vertex notation, and they require some graph auto-layout techniques to visualize the workflows. The generated workflows must have no edge crossings and must be symmetrical and evenly spread across a given area.
With reference to
Service Descriptions in the Knowledge Base.
The knowledge base 2 contains descriptions of the available services, with detailed descriptions of the service parameters. The service parameters are obtained from XML-based JDF and other capability description formats. The service structure is shown below.
The refID is a unique ID representing a service. The list of input and output constraints consists of the control inputs and outputs that a service accepts or could connect to. For example, (control_port,tcp_ip,2) for InboundConstraints implies that a service accepts two TCP/IP connections at a time. The number 2 is the cardinality specifying the number of services that the service can handle. Similarly, (data_format,pdf,1) for DataInputConstraints implies that a service can accept one PDF document at a time. The attributes contain a list of service-centric parameters, such as service delay, and may include additional service-specific constraints. Service_Details contains additional service-specific information such as name, version, etc. The prodID refers to the product the service maps to. The product contains a unique prodID and device-centric parameters such as the manufacturer, version, cost, etc. The service has an n-to-n relationship with products.
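The service description above can be sketched as a small data structure. This is only an illustrative Python sketch, not the Prolog structure of the disclosure: the fields `outbound_constraints` and `data_output_constraints` and the helper `can_connect` are assumptions introduced here, and the matching rule (same kind and value, cardinality of at least 1 on both sides) is a simplification.

```python
from dataclasses import dataclass, field

# A constraint triple as in the text: (kind, value, cardinality),
# e.g. ("control_port", "tcp_ip", 2) or ("data_format", "pdf", 1).
Constraint = tuple[str, str, int]


@dataclass
class Service:
    ref_id: str                                  # refID: unique service ID
    inbound_constraints: list[Constraint]        # InboundConstraints
    outbound_constraints: list[Constraint]       # assumed counterpart, not named in the text
    data_input_constraints: list[Constraint]     # DataInputConstraints
    data_output_constraints: list[Constraint]    # assumed counterpart, not named in the text
    attributes: dict[str, float] = field(default_factory=dict)      # e.g. service delay
    service_details: dict[str, str] = field(default_factory=dict)   # name, version, ...
    prod_ids: list[str] = field(default_factory=list)               # n-to-n link to products


def can_connect(upstream: Service, downstream: Service) -> bool:
    """Hypothetical matching rule: control and data outputs must meet matching inputs."""
    def matches(outs, ins):
        return any(o[:2] == i[:2] and o[2] >= 1 and i[2] >= 1
                   for o in outs for i in ins)

    return (matches(upstream.outbound_constraints, downstream.inbound_constraints)
            and matches(upstream.data_output_constraints, downstream.data_input_constraints))


rip = Service("s1",
              [("control_port", "tcp_ip", 2)], [("control_port", "tcp_ip", 1)],
              [("data_format", "pdf", 1)], [("data_format", "tiff", 1)],
              attributes={"delay": 4.0}, prod_ids=["prodA"])
printer = Service("s2",
                  [("control_port", "tcp_ip", 2)], [],
                  [("data_format", "tiff", 1)], [],
                  prod_ids=["prodB"])
```

In this sketch `can_connect(rip, printer)` holds because the control port and the data format line up, while the reverse direction does not.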
With reference to
The inference engine initially generates a permutation of valid paths by matching valid service and user requirements, and then generates all the combinations of all the valid paths which would generate possible workflow structures. With reference to
- WF=A ([ ], [B, C]), B ([A], [D]), C ([A], [D]), D ([B, C], [E]), E([D], [ ])
The above structure specifies that there are five services, namely A, B, C, D, E. The initial list in each tuple specifies the input services and the second one specifies the list of output services. This can be visually depicted as illustrated in
Here A has an output cardinality of 2, indicating that A can connect to two services, B and C.
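As an illustration, the WF structure above can be written as a Python mapping from each service to its input and output lists. The helper names `edges`, `is_consistent` and `output_cardinality` are hypothetical and only show how such a structure might be checked.

```python
# WF from the text: service -> (input services, output services)
WF = {
    "A": ([], ["B", "C"]),
    "B": (["A"], ["D"]),
    "C": (["A"], ["D"]),
    "D": (["B", "C"], ["E"]),
    "E": (["D"], []),
}


def edges(wf):
    """All directed links implied by the output lists, in sorted order."""
    return sorted((s, t) for s, (_, outs) in wf.items() for t in outs)


def is_consistent(wf):
    """Every link listed as an output of s must also be listed as an input of t."""
    forward = {(s, t) for s, (_, outs) in wf.items() for t in outs}
    backward = {(s, t) for t, (ins, _) in wf.items() for s in ins}
    return forward == backward


def output_cardinality(wf, service):
    """Number of services that `service` connects to."""
    return len(wf[service][1])
```

For the structure above, `output_cardinality(WF, "A")` is 2, matching the description of A connecting to B and C.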
The above representation could be extended to have tuples for each service represent branching conditions and iteration. For example, for service D, D(j,b,i) could represent the joining condition, branching condition and the number of iterations. Using the previous representation D(OR,AND,0) would mean that D has an OR join, an AND branch and no loops allowed.
The above workflow structure illustrated in
As represented in
Token/Job Representation in Prolog:
Marking gives the state of the current Petri Net; it is a vector of all the places in the Petri Net.
Example of an initial marking indicating the state of the Petri Net with four places and two jobs:
- marking([(p1,[(job1,0,(0,0)),(job2,0,(0,0))]),(p2,[ ]),(p3,[ ]),(p4,[ ])])
Here transitions are associated with a certain delay, and there is a global clock through which the delays of the various active transitions are updated, so that each transition knows when to fire. A transition fires when its tokens are available, and the tokens wait in their previous place for a time equal to the delay of the transition. When there is a branch with more than one transition and one token becomes available, the transition that becomes ready to fire first, after its delay has elapsed, fires first. When the delays of two transitions are equal, both transitions fire simultaneously.
The sum of delay of the two transitions is equal to the service delay. Each job contains the global time, time spent in the net and time spent at each transition to fire. Based on the number of jobs in the initial place and the total and average times taken by the Petri Net-based workflow, the total throughput is obtained. The cost function is a summation of the cost of each product. The cost could be extended to be a function of the resource utilization.
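The timing and throughput bookkeeping described above can be illustrated with a deliberately simplified Python sketch. It makes several assumptions that are not in the disclosure: each token carries only its own elapsed time instead of the full token tuple, there is no shared global clock or resource contention, the net is acyclic, and the product costs are made-up numbers.

```python
def simulate(marking, transitions):
    """Move every token to the end of an acyclic net.

    marking: dict place -> list of (job, elapsed_time)
    transitions: list of (input_place, output_place, delay)
    Among the enabled transitions, the one with the smallest delay fires first.
    """
    marking = {place: list(tokens) for place, tokens in marking.items()}
    while True:
        enabled = [(delay, src, dst) for src, dst, delay in transitions if marking[src]]
        if not enabled:
            return marking
        delay, src, dst = min(enabled)
        job, elapsed = marking[src].pop(0)
        marking[dst].append((job, elapsed + delay))


def throughput(finished):
    """Jobs completed per time unit, taking the latest finish time as the total time."""
    total_time = max(elapsed for _, elapsed in finished)
    return len(finished) / total_time


# Four places and two jobs, as in the initial marking example.
marking = {"p1": [("job1", 0), ("job2", 0)], "p2": [], "p3": [], "p4": []}
transitions = [("p1", "p2", 2), ("p2", "p3", 3), ("p3", "p4", 1)]
final = simulate(marking, transitions)

# Cost as a plain sum over the products used (hypothetical prices).
product_cost = {"prodA": 100.0, "prodB": 250.0}
total_cost = sum(product_cost.values())
```

Both jobs end in `p4` after 6 time units, so the sketch reports a throughput of 2/6 jobs per time unit. A faithful simulator would add the shared clock and resource sharing that the text mentions.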
In order to perform real-time workflow simulation, there might be various types of resources needed in order to process a single job, resources that have to be shared, and also various types of jobs that have to be processed. The timed Petri Nets could be extended to implement resource sharing. The colored Petri Nets allow modeling of systems when there are different types of jobs and different types of tokens. The process could also contain a queue of jobs.
Many available Petri Net tools could also do Petri Net simulation by generating a Petri Net Markup Language (PNML), which is a Work Flow Management Coalition Standard adopted by many analysis and simulation engines.
In order to gather the workflow functionality requirements from the user, required attributes of services are selected directly on the GUI, or the user can respond to questions generated by an automated question-generation module. The questions eventually narrow down the set of workflows.
The automated question-generation module, represented in
The user can also directly select the service constraints in the user interface. Service constraints are grouped based on their constraint type. All valid workflows containing the required specifications are obtained.
As illustrated in
The workflow structure obtained from the workflow modeler in Prolog could be converted to a nested list structure to indicate branching and joining.
Any two-dimensional acyclic and planar workflow can be represented as a nesting of lists. A list is an ordered sequence of elements of any length separated by commas altogether enclosed in square brackets. The elements can be a single service or another list of services. A service can connect to a number of services if their functional attributes and cardinality (number of services it can connect to) match.
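One possible reading of the nested list notation is sketched below in Python. The encoding is an assumption made for illustration: the outer list is a sequence of stages, and an inner list holds services that run as parallel branches. The function `to_edges` is hypothetical and handles only one level of nesting.

```python
def to_edges(stages):
    """Expand a staged list into directed links between consecutive stages.

    stages: list whose items are a service name or a list of parallel service names.
    """
    links = []
    previous = []
    for stage in stages:
        current = stage if isinstance(stage, list) else [stage]
        # Connect every service of the previous stage to every service of this one.
        links += [(p, c) for p in previous for c in current]
        previous = current
    return links


# A branches to B and C, which join again at D, followed by E.
nested = ["A", ["B", "C"], "D", "E"]
links = to_edges(nested)
```

Under this encoding the five-service workflow with a branch at A and a join at D expands to the links A to B, A to C, B to D, C to D and D to E.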
For example, the workflow illustrated in
With reference to
JDF workflow is specified through a hierarchical job tree structure, which describes, in XML, all the production processes and material types likely to be encountered. It contains two basic elements: JDF nodes and resources, which are strung together through resource input/output linking to meet the requirements of each workflow job. Depending on various needs, a JDF node can be any of the following types: Product node (intent-level), Process Group node, and Process node (execution-level). Typically, the MIS or Controller needs to map any Product node into Process node(s), which could then ultimately be routed to a destination for execution. A process is an operation performed on digital data, such as a PDF file or an imposition design, or on a physical item, such as a lift of paper. A resource is the digital data or physical item itself. The output of one process becomes the input of the following process(es), and a process doesn't begin until its input resources are available. JDF defines details on how to use these building blocks to describe concurrent processes, spawned processes, merged processes and dynamic processes.
A directed acyclic graph is a directed graph in which no path starts and ends at the same vertex [1]. It is a very useful graphical structure for representing the syntactic structure of arithmetic expressions, and for representing task graphs and precedence relations in many scheduling applications. The Hierarchical Dependence Graph (HDG) extends the directed acyclic graph (DAG) with hierarchical structures. One aspect of this disclosure can use two types of HDG: a job-centric (or process-centric) HDG, J-HDG for short, and a resource-centric HDG, R-HDG for short. The formal definitions of these graphical structures are as follows:
Definition 1: HDG is a graph $G = \langle V, E \rangle$ with no cycles, where $V = \{v_i \mid i = 1, \ldots, |V|\}$ is a set of vertices and $E = \{e_k \mid k = 1, \ldots, |E|\}$ is a set of directed edges, within which $e_k$ is an ordered pair of vertices with a label. Namely, $e_k = (v_i, v_j, \lambda_k)$, where $v_i, v_j \in V$ are the in-vertex and out-vertex of edge $e_k$ respectively, and $\lambda_k$ is a symbolic label of $e_k$. Certain vertices in the HDG, $V' \subset V$, may contain DAGs within themselves.
In J-HDG, JDF nodes are vertices, their incoming edges are labeled with input resources and their outgoing edges are labeled with output resources. Depending on which JDF node type it belongs to, each vertex in J-HDG can be either an atomic element (i.e. a JDF Process node) or be further decomposed into a DAG itself (i.e. a JDF Product node or Process Group node). J-HDG not only retains the flexible JDF hierarchical structure, but also explicitly represents the control sequence among JDF nodes. Incorporating the J-HDG structure in an MIS or Controller design avoids any hard-coded workflow control sequence in the implementation, so that fully dynamic workflows can be supported. With an explicit job-centric dependence representation, J-HDG is also an intermediate step between the JDF job structure and emerging explicit workflow description standards (e.g. BPEL, BPML). By properly mapping JDF to and from BPEL/BPML, it ultimately enables the workflow engine to seamlessly orchestrate a JDF workflow through a standard workflow description.
Definition 2: J-HDG is a HDG $G = \langle V, E \rangle$, where $V = \{v_i \mid i = 1, \ldots, |V|\}$ is a set of vertices and $E = \{e_k \mid k = 1, \ldots, |E|\}$ is a set of directed edges. $N$ represents a set of JDF nodes and $R$ represents a set of JDF resources (which can be directly linked with JDF nodes, including their partitioned resources). The source and target vertices that are external to any given JDF job description are generally denoted as $\alpha$ and $\beta$, respectively. Therefore, $V = N \cup \{\alpha, \beta\}$; for any $e_k \in E$, $e_k = (v_i, v_j, \lambda_k)$, where $v_i, v_j \in V$ are the in-vertex and out-vertex of edge $e_k$ respectively, and $\lambda_k \in R$.
In R-HDG, however, JDF resources are vertices, their incoming edges are JDF nodes that produced them and outgoing edges are JDF nodes that consumed them. Since all JDF resources are partitionable, for each JDF resource with partitioned resource parts, the precedence relations among partitioned parts can be described in a DAG. Hence, each resource vertex in R-HDG potentially contains a DAG itself.
Definition 3: R-HDG is a HDG $G = \langle V, E \rangle$, where $V = \{v_i \mid i = 1, \ldots, |V|\}$ is a set of vertices and $E = \{e_k \mid k = 1, \ldots, |E|\}$ is a set of directed edges. $N$ represents a set of JDF nodes, $R$ represents a set of JDF resources (which can be directly linked with JDF nodes) and $\delta$ represents a set of resource precedence relations between partitioned resources. Therefore, $V = R$; for any $e_k \in E$, $e_k = (v_i, v_j, \lambda_k)$, where $v_i, v_j \in V$ are the in-vertex and out-vertex of edge $e_k$ respectively, and $\lambda_k \in N \cup \delta$.
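A minimal Python sketch of the structure in Definitions 1 to 3 follows. It assumes that a labeled edge (vi, vj, label) points from vi to vj, and it stores the DAG nested inside a vertex under a hypothetical "children" key. The acyclicity check uses Kahn's topological ordering, which is a standard technique chosen here for illustration and not one named in the disclosure.

```python
from collections import defaultdict, deque


def is_acyclic(vertices, edges):
    """Kahn's algorithm: the graph is acyclic if every vertex can be ordered."""
    indegree = {v: 0 for v in vertices}
    successors = defaultdict(list)
    for vi, vj, _label in edges:
        successors[vi].append(vj)
        indegree[vj] += 1
    queue = deque(v for v in vertices if indegree[v] == 0)
    ordered = 0
    while queue:
        v = queue.popleft()
        ordered += 1
        for w in successors[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    return ordered == len(vertices)


def is_valid_hdg(hdg):
    """An HDG is valid if it is acyclic at this level and in every nested DAG."""
    if not is_acyclic(hdg["vertices"], hdg["edges"]):
        return False
    return all(is_valid_hdg(child) for child in hdg.get("children", {}).values())


# Hypothetical J-HDG: N1 is a process group that contains its own small DAG.
inner = {"vertices": ["P1", "P2"], "edges": [("P1", "P2", "plate")], "children": {}}
j_hdg = {
    "vertices": ["alpha", "N1", "N2", "beta"],
    "edges": [("alpha", "N1", "r1"), ("N1", "N2", "r2"), ("N2", "beta", "r3")],
    "children": {"N1": inner},
}
```

The same representation serves an R-HDG by using resources as vertices and JDF nodes (or precedence relations) as edge labels.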
As a linear graph, the structure of HDG can be represented by an incidence matrix and its operations (e.g. addition, multiplication, transposing, etc.). This section provides a definition of a HDG Incidence Matrix to further define a HDG Connectivity Matrix. From the Connectivity Matrix, transformations of J-HDG and R-HDG are produced.
Definition 4: The Incidence Matrix of a HDG $G$ with $|V|$ vertices and $|E|$ edges is a matrix $M = [m_{ij}]$ of order $|V| \times |E|$, where: $m_{ij} = 1$ if edge $j$ is incident at vertex $i$ and is oriented away from vertex $i$; $m_{ij} = -1$ if edge $j$ is incident at vertex $i$ and is oriented toward vertex $i$; $m_{ij} = 0$ otherwise.
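Definition 4 can be illustrated with a short Python sketch. The three-edge graph and the function name `incidence_matrix` are assumptions made for the example; edges are taken to point from the first vertex of the triple to the second.

```python
def incidence_matrix(vertices, edges):
    """Rows are vertices, columns are edges: +1 leaving a vertex, -1 entering it."""
    row = {v: i for i, v in enumerate(vertices)}
    m = [[0] * len(edges) for _ in vertices]
    for j, (src, dst, _label) in enumerate(edges):
        m[row[src]][j] = 1     # edge j is oriented away from src
        m[row[dst]][j] = -1    # edge j is oriented toward dst
    return m


vertices = ["alpha", "N1", "N2", "beta"]
edges = [("alpha", "N1", "r1"), ("N1", "N2", "r2"), ("N2", "beta", "r3")]
M = incidence_matrix(vertices, edges)
# M == [[ 1,  0,  0],
#       [-1,  1,  0],
#       [ 0, -1,  1],
#       [ 0,  0, -1]]
```

Each column contains exactly one +1 and one -1, so every column sums to zero.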
For example, as illustrated in
Definition 5: The Connectivity Matrix of a HDG represents the connectivity between JDF nodes and resources of a given J-HDG or R-HDG, where $N$ is the set of JDF nodes and $R$ is the set of JDF resources. The Connectivity Matrix of J-HDG is a matrix $C_{J\text{-}HDG} = [c_{ij}]$ of order $|N| \times |R|$, where each column $c_j$ is the sum of all columns of the incidence matrix of J-HDG that carry the same resource label (excluding the rows of $\alpha$ and $\beta$).
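The construction in Definition 5 can be sketched in Python as follows. Instead of building the incidence matrix first, this sketch accumulates the +1 and -1 entries per resource label directly, which gives the same sums. The example graph, in which resource r2 feeds two nodes, and the name `connectivity_matrix` are assumptions for illustration.

```python
def connectivity_matrix(nodes, resources, edges):
    """Rows are JDF nodes, columns are resources; alpha and beta get no row."""
    row = {n: i for i, n in enumerate(nodes)}
    col = {r: j for j, r in enumerate(resources)}
    c = [[0] * len(resources) for _ in nodes]
    for src, dst, label in edges:
        if src in row:                       # edge leaves src: contributes +1
            c[row[src]][col[label]] += 1
        if dst in row:                       # edge enters dst: contributes -1
            c[row[dst]][col[label]] -= 1
    return c


nodes = ["N1", "N2", "N3"]
resources = ["r1", "r2", "r3", "r4"]
edges = [
    ("alpha", "N1", "r1"),
    ("N1", "N2", "r2"),
    ("N1", "N3", "r2"),    # same label as the edge above, so the columns add up
    ("N2", "beta", "r3"),
    ("N3", "beta", "r4"),
]
C = connectivity_matrix(nodes, resources, edges)
# C == [[-1,  2, 0, 0],
#       [ 0, -1, 1, 0],
#       [ 0, -1, 0, 1]]
```

The entry 2 for N1 and r2 reflects the two edges labeled r2 that leave N1.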
For example, the connectivity matrix of a J-HDG derived from above
Definition 6: A Matrix Roll-up Procedure is a process to construct the next-level-up connectivity matrix from a given connectivity matrix. There are two steps involved: (1) remove the columns that represent hidden edges in the next-level-up HDG; and (2) merge the rows that collapse into one single node in the next-level-up HDG, by adding all relevant rows together. The resulting connectivity matrix keeps the same semantics as the original one.
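The two roll-up steps can be sketched in Python as below. The function name `roll_up`, its parameters, and the example in which nodes N2 and N3 collapse into a hypothetical group node G are assumptions introduced for illustration.

```python
def roll_up(matrix, rows, cols, hidden_cols, groups):
    """Build the next-level-up connectivity matrix.

    hidden_cols: column labels that become internal (hidden) edges.
    groups: dict new_row_name -> list of existing row names merged into it.
    Returns (new_matrix, new_rows, new_cols).
    """
    # Step 1: drop the columns of hidden edges.
    keep = [j for j, c in enumerate(cols) if c not in hidden_cols]
    new_cols = [cols[j] for j in keep]
    merged = {name for members in groups.values() for name in members}
    new_rows, new_matrix = [], []
    for i, r in enumerate(rows):
        if r not in merged:
            new_rows.append(r)
            new_matrix.append([matrix[i][j] for j in keep])
    # Step 2: merge grouped rows by adding them together.
    for name, members in groups.items():
        idx = [rows.index(m) for m in members]
        new_rows.append(name)
        new_matrix.append([sum(matrix[i][j] for i in idx) for j in keep])
    return new_matrix, new_rows, new_cols


# Chain N1 -r2-> N2 -r3-> N3, with r1 entering N1 and r4 leaving N3.
C = [[-1, 1, 0, 0],
     [0, -1, 1, 0],
     [0, 0, -1, 1]]
rolled, rows, cols = roll_up(C, ["N1", "N2", "N3"], ["r1", "r2", "r3", "r4"],
                             hidden_cols={"r3"}, groups={"G": ["N2", "N3"]})
# rolled == [[-1, 1, 0], [0, -1, 1]] with rows ["N1", "G"] and cols ["r1", "r2", "r4"]
```

After the roll-up, G consumes r2 and produces r4, so the rolled-up matrix describes the shorter chain N1 to G with the same sign convention as the original.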
J-HDG and R-HDG are intuitively dual forms of HDG, where J-HDG provides a job-centric view and R-HDG provides a resource-centric view. Construction of a dual HDG (e.g. R-HDG) can be accomplished by transposing the connectivity matrix of the original HDG (e.g. J-HDG), and vice versa. In other words, $C_{J\text{-}HDG} = \mathrm{transpose}(C_{R\text{-}HDG})$ or $C_{R\text{-}HDG} = \mathrm{transpose}(C_{J\text{-}HDG})$.
Definition 7: The J-HDG -> R-HDG transformation procedure has the following steps: (1) construct the connectivity matrix of the original J-HDG; (2) transpose the original connectivity matrix by switching the rows and columns; and (3) interpret each row as a node in R-HDG and each column as a label on a directed edge in R-HDG, where a negative number represents an incoming edge and a positive number represents an outgoing edge. The number itself represents the weight of the edge, which is the number of resource instances involved.
Definition 8: Similar to Definition 7, the R-HDG -> J-HDG transformation procedure has the following steps: (1) construct the connectivity matrix of the original R-HDG; (2) transpose the original connectivity matrix by simply switching the rows and columns; (3) interpret each row as a node in J-HDG and each column as a label on a directed edge in J-HDG, where a negative number represents an incoming edge, a positive number represents an outgoing edge, and the number itself represents the weight of the edge; and (4) add the external source node α and target node β to complete the graph.
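The transposition at the heart of Definitions 7 and 8 can be sketched in Python. The edge derivation in `r_hdg_edges` is an interpretation, not the procedure of the disclosure: it reads each sign with its J-HDG meaning (positive means the node produces the resource, negative means it consumes it), so that producing nodes become incoming edges of a resource vertex. The function names and the chain example are assumptions.

```python
def transpose(matrix):
    """Switch rows and columns."""
    return [list(column) for column in zip(*matrix)]


def r_hdg_edges(c_j, nodes, resources):
    """Derive labeled R-HDG edges (consumed resource, produced resource, node)."""
    c_r = transpose(c_j)   # rows: resources, columns: nodes
    edges = []
    for j, node in enumerate(nodes):
        consumed = [resources[i] for i in range(len(resources)) if c_r[i][j] < 0]
        produced = [resources[i] for i in range(len(resources)) if c_r[i][j] > 0]
        edges += [(ri, ro, node) for ri in consumed for ro in produced]
    return edges


nodes = ["N1", "N2", "N3"]
resources = ["r1", "r2", "r3", "r4"]
C_J = [[-1, 1, 0, 0],
       [0, -1, 1, 0],
       [0, 0, -1, 1]]
C_R = transpose(C_J)
# C_R == [[-1, 0, 0], [1, -1, 0], [0, 1, -1], [0, 0, 1]]
R_EDGES = r_hdg_edges(C_J, nodes, resources)
# R_EDGES == [("r1", "r2", "N1"), ("r2", "r3", "N2"), ("r3", "r4", "N3")]
```

Transposing `C_R` again returns `C_J`, which mirrors the duality of the two views.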
Representing the JDF workflow structure in a formal graphical structure and its corresponding matrix allows a formal workflow analysis by means of rigorous analytical procedures rather than visual inspection and intuition. The theory of DAGs and its applications (decision trees, Bayesian networks, machine learning, etc.) in many artificial intelligence fields provide a foundation for such a workflow analysis framework. The value of the different abstractions in J-HDG and R-HDG lies in their visualization benefits and in the fact that the resulting HDGs can be analyzed in the same way as the original HDG. This is a crucial feature because the operations and transformations applied to a HDG result in another HDG that can be analyzed using the same core set of analytical procedures. This enables a variety of related representations of a given workflow.
The two HDGs described, J-HDG and R-HDG, provide orthogonal views of a given JDF workflow, and allow for an explicit representation of workflow components (i.e. process nodes and resources) and the interactions among them. The HDGs are used to validate a given workflow, for example a JDF workflow. The validation process detects problems such as cycling among components (i.e. deadlock) and missing or dangling resources. In addition, the HDGs provide a set of semantic-rich information with different abstractions for the MIS/Controller to facilitate JDF workflow execution and management. For example, if a process node is disabled, the HDGs enable an efficient determination of the other processes that can no longer be executed. As another example, if a resource is not available, the HDGs enable an efficient determination of the other resources affected. These examples are not an exhaustive list.
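The impact analysis mentioned above, finding the processes that cannot run once a node is disabled, can be sketched as a downstream traversal of a J-HDG edge list. This Python sketch is an assumption-laden illustration: the function `impacted_nodes`, the breadth-first traversal, the names "alpha" and "beta" for the external vertices, and the example workflow are all introduced here.

```python
from collections import defaultdict, deque


def impacted_nodes(edges, disabled, external=("alpha", "beta")):
    """Return every node reachable downstream of `disabled`, excluding external vertices."""
    successors = defaultdict(set)
    for src, dst, _label in edges:
        successors[src].add(dst)
    seen, queue = set(), deque([disabled])
    while queue:
        v = queue.popleft()
        for w in successors[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen - set(external)


edges = [
    ("alpha", "N1", "r1"),
    ("N1", "N2", "r2"),
    ("N1", "N3", "r2"),
    ("N2", "N4", "r3"),
    ("N3", "beta", "r4"),
    ("N4", "beta", "r5"),
]
```

Here disabling N2 affects only N4, while disabling N1 affects N2, N3 and N4. Running the same traversal on an R-HDG edge list would give the resources affected by an unavailable resource.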
As illustrated in
In the following discussions, we concentrate on applying this set of semantic-rich information to intelligently handling failures/exceptions at run-time. This technique is applicable to general workflows and not limited to JDF workflows. A JDF process node is interchangeable with “task” as a general term.
The simple abortion of a crucial workflow in the presence of failures/exceptions can lead to significant disadvantages. Therefore, any workflow management system needs flexible mechanisms that deal with such failures/exceptions and guarantee the consistent and reliable execution of workflows. Failure/Exception handling in commercial workflow engines is mostly limited to transaction-based mechanisms that only ensure the recovery of persistent data after system failures (e.g. a system crash). This only permits a very rigid handling of expected failure. This disclosure provides more information about the inter-task (or inter-resources) dependencies (such as the connectivity information in J-HDG/R-HDG). As a result, a flexible failure handling strategy is achieved.
For example, referencing workflow illustrated in
The exemplary embodiment has been described with reference to the preferred embodiments. Obviously, modifications and alterations will occur to others upon reading and understanding the preceding detailed description. It is intended that the exemplary embodiment be construed as including all such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims
1. A workflow auto generation system comprising:
- a knowledge database 2 containing service descriptions;
- a workflow modeling inference engine 4 that generates valid workflow models by matching connectivity between various services in the knowledge base;
- a simulator 6 performing a simulation of each workflow; and
- a Graphical User Interface 8 to obtain customer requirements and display views of the workflows.
2. The system according to claim 1, wherein the simulator 6 is a Petri Net simulator, and the Petri Net simulator performs a simulation of each workflow by mapping it to a Petri Net.
3. The system according to claim 1, wherein the Graphical User Interface 8 obtains customer requirements through a series of questions to a User, and the series of questions narrows down the workflow options meeting the customer requirements.
4. The system according to claim 1, wherein the Graphical User Interface 8 displays a service view of workflow, a product view of the workflow, and a Petri Net view of the workflow.
5. A workflow analysis and control system, comprising:
- a workflow client service 120, providing a description of various print jobs to be executed;
- a workflow analysis service 122, performing a Hierarchical Dependence Graph representation and analysis of a workflow, including process and resource dependences; and
- a workflow orchestrator 124, controlling the execution of said print jobs;
- wherein the workflow client service provides input to the workflow analysis service and the workflow analysis service provides input to the workflow orchestrator, and the workflow client service provides a description of various print jobs in JDF.
6. The system according to claim 5, the workflow analysis 122 further comprising:
- a J-HDG, Definition 2, representation of the workflow client service input; and
- a Connectivity Matrix of J-HDG; wherein the Connectivity Matrix transforms the J-HDG, Definition 2, representation to a J-HDG, Definition 6, representation.
7. The system according to claim 5, the workflow analysis 122 further comprising:
- a R-HDG, Definition 3, representation of the workflow client service input; and
- a Connectivity Matrix of R-HDG; wherein the Connectivity Matrix transforms the R-HDG, Definition 2, representation to a R-HDG, Definition 6, representation.
8. The system according to claim 5, the workflow analysis 122 further comprising:
- a J-HDG, Definition 2, representation of the workflow client service input;
- a Connectivity Matrix of J-HDG;
- a R-HDG, Definition 3, representation of the workflow client service input; and
- a Connectivity Matrix of R-HDG;
- wherein the Connectivity Matrix transforms the J-HDG, Definition 2, representation to a J-HDG, Definition 6, representation; and the Connectivity Matrix transforms the R-HDG, Definition 2, representation to a R-HDG, Definition 6, representation.
Type: Application
Filed: Apr 30, 2004
Publication Date: Nov 17, 2005
Applicant:
Inventors: Tong Sun (Penfield, NY), John Walker (Rochester, NY), Shriram Revankar (Webster, NY), Narasimha Gottumukkala (Ruston, LA)
Application Number: 10/836,298