System and method for managing dynamic content assembly
A dynamic content assembly system, including an application and underlying database, with methods to support the creation, transformation and management of relationship information between resources and to enable dynamic assembly of content based on these relationships.
This invention relates generally to a system for creating, linking, and assembling electronic content. More specifically, it relates to a system and method for dynamically assembling content from different sources.
BACKGROUND OF THE INVENTION
In today's content publishing environment, content may be generated and edited using a variety of editors, such as Microsoft Word, Web editors (e.g., Arbortext's Contributor), and Extensible Markup Language (XML) editors (e.g., Arbortext's Epic Editor). Similarly, this content may be published to a variety of output media formats, including print, Portable Document Format (PDF), various forms of Hypertext Markup Language (HTML), Wireless Markup Language (WML), and PostScript. Content may also be published to compiled formats such as HTML-Help, MS Reader, formats for personal digital assistants (PDAs), and formats for mobile phones.
Assembling documents from various formats can be challenging. The documents must be assembled from many different pieces with many different cross-document links. While the task of storing documents and their components in a repository is currently being handled by multiple vendors, there is a need to automate the dynamic assembly of document components and their related links to other document components. This assembly is currently being done in a laborious way requiring extensive special case programming.
The complexity of dynamic content assembly across multiple media formats, audiences, compound documents, and versions can be costly when done via manual processes such as creating multiple documents, cutting and pasting content, or completely recreating information with additional review required for all newly created information.
Accordingly, there is a need in the art for a system or method to manage the dynamic creation, linking, and assembly of content.
BRIEF SUMMARY OF THE INVENTION
The present invention is designed to support the defining, creating, modifying, storing, reusing, validating, resolving, and exchanging of multiple link types. Each link or link collection may have multiple audiences defined. Each link or link collection may have multiple media-appropriate, output-resolution filters. Each link or link collection may be validated for the given contexts of media format, audience, compound document usage, and document or sub-document component version.
While multiple embodiments are disclosed, still other embodiments of the present invention will become apparent to those skilled in the art from the following detailed description. As will be apparent, the invention is capable of modifications in various obvious aspects, all without departing from the spirit and scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not restrictive.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is capable of dynamically managing content linking and assembly. Dynamic Content Assembly Management, or “DCAM,” is generally the process of supporting and enabling the creation, transformation and management of relationship information between content resources and dynamic assembly of content based on these relationships. Thus, generally, a DCAM system provides a means for defining, creating, modifying, linking, reusing, validating, resolving, and assembling dynamic content using an efficient and automated process.
Dynamic Content Assembly (DCA) can be divided into many functional areas such as: embedding traversal points such as hyperlinks; embedding content from one resource into another resource; transforming content to a representation in the desired data format; filtering content based on meta-information such as the target audience for a specific portion of the content; and transforming the content to a representation with the desired style.
The DCAM System (or Link Management System) of the present invention is an application and underlying database, which supports the creation, transformation and management of relationship information between resources. Links can serve multiple purposes such as:
- indicating a point of traversal to a related piece of content, for example, a hyperlink in a web document.
- embedding textual content of one resource at a particular location in another resource, for example, a company biography paragraph may be stored separately and used by reference in multiple pages.
- embedding graphical content of one resource at a particular location in another resource, for example, a logo or graphic in a Hypertext Markup Language (HTML) page which is stored separately from the HTML page and could be used by reference in multiple pages.
- embedding content from an external application, for example, including product specifications from a product data management system in a marketing data sheet about that product.
- indicating a related collection of information by associating multiple pieces of related content, for example, a retailer providing a related collection to a purchaser of a certain product.
- indicating the audience for the content, which allows creation of a single source of content that supports multiple views, for example, the views could be driven by the person reading the document deciding between the expert view and the beginner's view, by the parent document needing the repair description rather than the build description, or by the publishing process needing the Spanish version rather than the English version.
The present invention enables dynamic assembly of content based on these relationships or links.
The term link, as used herein, is meant generally and includes not only hyperlinks but also “include,” “fileref,” or any other means for embedding text or content within a document. Thus, the concept of link management in the DCAM system supports all types of dynamic content assembly management.
Generally, Extensible Markup Language (XML) provides a useful, human and machine readable, standardized syntax for representing content, metadata, links, queries, transforms, and configuration information, which permits simplified processing of dynamic content. Document content (prose) can be created as XML natively, or transformed from proprietary applications to XML. Data from a variety of software applications, including directly from many databases, is accessible as XML. W3C Standards such as XML Linking Language (XLink), XML Path Language (XPath), and Extensible Stylesheet Language (XSL) can be leveraged to provide application programmers and users with a standard, application interchangeable syntax.
In today's content publishing environment, content may need to be published to print, Portable Document Format (PDF), multiple forms of HTML including different web sites, locations, and languages, compiled formats such as HTML-Help and MS Reader, formats for palm devices, formats for cell phones, and/or exchanged with partners and customers in a source format such as XML. Each media type has its own functional capabilities for representing dynamic content. For example, a print document might represent a navigation link as ‘See also “The Definitive Guide,” chapter 13, paragraph 12’ while an HTML file might have a hyperlink embedded in the phrase “Get More Information” to a specific paragraph in another HTML file.
For each media format the “link” needs to be represented natively for that format. For example, in HTML a link could be defined by a simple anchor tag such as: <a href=“http://www.arbortext.com”>. However, more complex links may require dynamic HTML using a scripting language such as JavaScript. For printed documents, the same cross reference may appear as “Contact Arbortext, Inc.” or alternatively “See Arbortext, Incorporated's web site at www.arbortext.com.”
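As a minimal sketch of this idea, the following Python fragment renders one stored link in the native form of two media formats. The Link class, its field names, and the render function are hypothetical names chosen for illustration; they are not part of the system described here.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Link:
    """Hypothetical media-neutral link record (field names are illustrative)."""
    title: str
    uri: str


def render(link: Link, media: str) -> str:
    """Return the link in the native form of the requested media format."""
    if media == "html":
        # HTML can express traversal directly with an anchor element.
        return f'<a href="{escape(link.uri, quote=True)}">{escape(link.title)}</a>'
    if media == "print":
        # Print has no traversal mechanism, so the target is spelled out.
        return f"See {link.title} at {link.uri}"
    raise ValueError(f"unsupported media format: {media}")


link = Link(title="Arbortext, Inc.", uri="http://www.arbortext.com")
print(render(link, "html"))
print(render(link, "print"))
```

Keeping the stored link free of any media-specific syntax is what would allow a single link to be resolved differently for each output format.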
As links become more complex—one to many, many to many, rings, etc.—the ability to create, verify, and resolve links for multiple publish media becomes exceedingly difficult to do manually. A DCAM system enables resolving links appropriately for each media format.
Several types of links may be managed using the DCAM system of the present invention: simple links (also called outbound or navigation links), inbound links, graphic links, extended links, taxonomy links, object to object links, links for use with other applications, links wherein the application decides the traversal means, third party links, related information links, and media object links.
A simple link associates exactly two resources, one local and one remote, with an arc going from the former to the latter. Thus, a simple link is always an outbound link (going away from the linking element). An outbound link is used when an explicit location within an object references a location outside the object. With this type of link, the profiling of the link determines the resolution. Conversely, if the arc of the link were to start at a remote resource and end at a local resource, the link is inbound. An inbound link relationship is maintained entirely within the link manager. A graphic link is used when an explicit location within an object references different graphics based on profiling. With this type of link, the profiling of the link determines the resolution. An extended link associates an arbitrary number of resources. The participating resources may be any combination of remote and local. A taxonomy link associates information with a subject taxonomy. Taxonomy links are used to associate topics with subjects and categories of subjects. This allows improved searching for appropriate topics both for specific subjects such as cardiac infarction and general subjects such as heart disease. A third party link is a link wherein neither the starting resource nor the ending resource is local. Related information links are used when an object does not explicitly reference another object but a relationship exists. This type of link is a third party arc entirely maintained in the link repository. The links provided are in relation to a given resource. Media object links are frequently context sensitive links. Resolution may be document dependent, output dependent or language dependent.
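The arc directions described above can be sketched as a small classification function. This is an illustrative Python sketch only: the ArcKind enumeration and classify_arc function are hypothetical names, and the LOCAL case (both ends local) is an assumption, since the description does not name that combination.

```python
from enum import Enum


class ArcKind(Enum):
    OUTBOUND = "outbound"        # local start, remote end (a simple link)
    INBOUND = "inbound"          # remote start, local end
    THIRD_PARTY = "third-party"  # neither end is local
    LOCAL = "local"              # assumption: both ends local, not named in the text


def classify_arc(start_is_local: bool, end_is_local: bool) -> ArcKind:
    """Classify an arc by whether its start and end resources are local."""
    if start_is_local and not end_is_local:
        return ArcKind.OUTBOUND
    if not start_is_local and end_is_local:
        return ArcKind.INBOUND
    if not start_is_local and not end_is_local:
        return ArcKind.THIRD_PARTY
    return ArcKind.LOCAL


print(classify_arc(True, False).value)
print(classify_arc(False, False).value)
```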
XML further simplifies managing a document as a hierarchical collection of sub-document level components. These collections are often referred to as “compound documents.” For example, a cookbook is made of 20 recipe chapters, but each recipe chapter is stored individually. A given recipe chapter may be reused in numerous books. Creating and managing a link in a sub-document component which may be reused in multiple contexts is difficult. If these documents and components are managed in a content management system, they may also be versioned. The present invention provides a DCAM system that may be used to enable the managing and resolving of links for different document contexts and different document versions.
Additionally, the DCAM system of the present invention may be used to publish alternate views of a document targeting different audiences. For example, a catalog may contain different prices for customers, partners, and employees. The catalog may be different for North America, Europe, and Asia, and may also differ when in print versus web format. Making links work in all of these contexts is often handled by manual processes and redundant copies of the document content. The present invention enables the assembly of content dynamically to suit the needs of a given audience.
Documents may be a combination of textual content created by authors and database content derived from queries performed against a database application. The invention, in providing a DCAM system, provides mechanisms for managing the database derived content the same way that the textual content is managed. This allows the database derived content to be profiled for different audiences and to be published appropriately for different output media formats.
The present invention supports reuse, repurposing and dynamic assembly. This includes multichannel publishing such as output sensitive linking, information reuse such as context sensitive linking, dynamic content such as inclusion and data merge, and content supply chain such as web services. Performing link management using the present invention has several benefits. A single link may be managed with different output resolutions—thus providing simplified support for multichannel publishing. Further, a single link may be managed with different resolutions based on different contexts. This enables template-based authoring and reusing rather than duplicating business data with context sensitive links. Linked content may be dynamically included based on context and can include or reference the content. A web services interface supports information exchange across the enterprise and a Simple Object Access Protocol (SOAP) Application Program Interface (API) (discussed more fully below) is provided for manipulating a link repository.
The link manager manages component or linking relationships internal to an object, across objects, across versions, across publishing cycles, and across supply chains.
As explained above, managing the dynamic assembly of content involves determining an access point or end point of a relationship arc or link. The arc leads from the access point to contents for dynamic assembly. Thus, for example, a hyperlink leads to a url. The content to which the arc leads may be verified based upon rules. Once the appropriate relationship is verified, the content is made available from the access point.
A link management application generally performs certain functions. These functions typically require the application to include means to create, modify, remove, store, configure the behavior of, validate, resolve and exchange link information with other applications. The architecture of a DCAM System in accordance with a first embodiment of the invention is shown in
The Epic Editor Authoring User Interface 40 is a user interface used within the Epic Editor environment. It is an interface for creating, modifying, removing and managing linking information. Generally, operations performed by an author within the editing environment are exposed through this interface.
The DCA Repository Authoring User Interface 42 is a user interface for creating, modifying, removing and managing linking information directly in the link repository. Generally, operations performed by an author within this environment are exposed through this interface.
The DCA Configuration 44 is the configuration information for the DCA system based on a specific data model. The focus of this application is configuration rather than customization. To this end, anything which can be abstracted to this configuration generally should be.
The Configuration Interface 46 is a simple user interface for inputting and modifying configuration information for the DCA system based on a data model.
The DCA Repository 48 is a database application for storage, search and retrieval of the link information. The DCA Repository Administrator User Interface 50 is the user interface for performing IT administration functions on the DCA Repository 48. The DCA Repository SDK 52 is a documented Application Program Interface (API) for programmatically communicating with the DCA Repository 48.
The Linkbase 54 is an XLink compliant linkbase XML document. These documents may be used to export portions of the DCA Repository 48 or to import modifications into the DCA Repository 48.
The DCA Resolver 56 is a filter in the content pipeline for resolving linking information. The DCA Resolver SDK 58 is a documented API of the Java (or other) methods used in the DCA Resolver 56.
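A resolver of this kind can be pictured as a filter that parses a document, looks up each link, and writes the resolved form back into the document. The following Python sketch makes several assumptions for illustration: the link element with a ref attribute, the para element used for embedded content, the REPOSITORY dictionary, and the resolve_links function are all hypothetical and do not reflect the actual DCA Resolver SDK.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-in for the DCA Repository 48.
REPOSITORY = {
    "bio": {"type": "embed", "content": "Acme was founded in 1990."},
    "home": {"type": "navigation", "uri": "http://www.example.com", "title": "Home"},
}


def resolve_links(xml_text: str) -> str:
    """Parse a document, resolve each <link ref="..."/>, and return the result."""
    root = ET.fromstring(xml_text)
    # Snapshot the matches first, because the loop rewrites the elements it visits.
    for element in list(root.iter("link")):
        entry = REPOSITORY[element.get("ref")]  # raises KeyError for unknown links
        element.attrib.clear()
        if entry["type"] == "navigation":
            # Navigation link: insert updated markup that points at the target.
            element.tag = "a"
            element.set("href", entry["uri"])
            element.text = entry["title"]
        else:
            # Embedding link: place the referenced content into the document.
            element.tag = "para"
            element.text = entry["content"]
    return ET.tostring(root, encoding="unicode")


source = '<doc><link ref="bio"/><p>Visit <link ref="home"/>.</p></doc>'
print(resolve_links(source))
```

Because the source document carries only a reference, the same source could be resolved against a different repository state, or for a different output format, without being edited.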
The DCA Validator 60 is a validation program to verify the validity of link information.
The DCA server is a system designed to manage linking information externally from documents, while still maintaining actual links as if they solely exist within documents. The DCA server provides late-binding linking in a variety of formats. Using this, a link may be formatted into a variety of link syntaxes at publish time. The DCA system comprises three major components: a server tier, a transport tier, and a client tier.
The design of the DCA system is flexible, enabling bindings to be made to any database. In a particular embodiment the overall language used in the design of the DCA system is Java. The system may be built for an Oracle Platform. Alternately, any other relational database platform may be used.
Explained differently, the DCAM System includes a Server Tier, a Transport Tier, and a Client Tier.
The server tier of the DCAM System comprises a database bound to a server application. The server application contains logic necessary to interact with the database, and exposes a full API. All database access is performed through the server API. The server component is central to the DCA System. The server component is responsible for storing and managing all linking information within the system. The server component comprises a database binding, classes representing database objects, and an API that connects the components, in addition to providing a simple way to interface with the server.
A server diagram is provided at
In one embodiment, the Server component is built on top of a SQL compliant database.
The links table 70 contains all links in the system. Each link includes one or more titles (as it has multilingual support) and may have metadata properties associated with it. The link_metadata table 72 contains a list of metadata properties related to the links. The link_title table 74 contains one or more names for each link. Again, links may have multiple titles defined, each in a different language. The link_folders table 76 contains a hierarchical listing of folders used to categorize and classify links. The link_folder_metadata table 78 contains metadata properties that are related to link folders. The link_folder_title table 80 contains titles that are directly related to each link folder. Multiple titles may be specified, each in a different language.
The resources table 82 contains a listing of resources within a repository. Each addressable object (document, element, etc.) has a resource definition specified in this table. Resources may be addressed by URI and when URIs are stored, they are broken down into their component atoms. All resources that are considered top-level documents have an isDocument flag set. The resource_metadata table 84 contains metadata properties that are related to the resources. The resource_title table 86 contains titles that are directly related to each resource. Multiple titles may be specified, each in a different language. The resource_folders table 88 contains metadata properties that are related to resource folders. The resource_folder_title table 90 contains titles that are directly related to each resource folder. Multiple titles may be specified, each in a different language. The resource_pairs table 92 contains a listing of all resource pairs in the system. Resource pairs are components of a link that comprise a starting and ending resource (both of which are relationships to the resources table), as well as information about the role and traversal constraints. Resource pairs may be related to profiles to scope their usage. Resource pairs are bound to links, and upon deletion of a link, that deletion will be cascaded to its resource pairs. The properties table 94 maintains a listing of XML attributes pertinent to each resource pair (such as graphic size, etc.) that are placed in markup at resolution time.
The profiles table 96 contains profiles defined within the system. Profiles are defined on a per doctype basis and may span multiple doctypes. Profiles are applicability attribute values that, when referenced, set the usage and scope of a resource pair. The resource_pair_profile_xref table 98 contains relationships of resource pairs to profiles. A resource pair may be related with as many profiles as desired. The named_profiles table 100 contains a mapping of names to profiles. This may be used to create a grouping or categorization of profiles. Further shown are a link_folder_metadata table 106, a link_folders table 108, a link_title table 110, and a link_folder_title table 112.
A configuration table 104 may be provided containing system configuration information relevant to the DCAM server.
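The relationship between links, resources, and resource pairs, including the cascaded deletion described above, can be illustrated with a reduced schema. The sketch below uses Python's built-in sqlite3 module purely for convenience. The table names follow the description, but every column name and the sample data are assumptions made for illustration, and most tables are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE links (id INTEGER PRIMARY KEY);
CREATE TABLE resources (id INTEGER PRIMARY KEY, uri TEXT NOT NULL);
CREATE TABLE resource_pairs (
    id INTEGER PRIMARY KEY,
    -- Deleting a link removes the resource pairs bound to it.
    link_id INTEGER NOT NULL REFERENCES links(id) ON DELETE CASCADE,
    start_resource INTEGER NOT NULL REFERENCES resources(id),
    end_resource INTEGER NOT NULL REFERENCES resources(id)
);
""")

conn.execute("INSERT INTO links (id) VALUES (1)")
conn.execute("INSERT INTO resources (id, uri) VALUES (1, 'doc.xml#p1'), (2, 'other.xml')")
conn.execute(
    "INSERT INTO resource_pairs (id, link_id, start_resource, end_resource) "
    "VALUES (1, 1, 1, 2)"
)
before = conn.execute("SELECT COUNT(*) FROM resource_pairs").fetchone()[0]

conn.execute("DELETE FROM links WHERE id = 1")
after = conn.execute("SELECT COUNT(*) FROM resource_pairs").fetchone()[0]
print(before, after)
```

The resources themselves survive the deletion, which is consistent with resources being addressable objects that may participate in other links.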
The transport tier of the DCAM System provides the communication link between the client tier and the server tier. The transport tier facilitates client/server communications.
The format for transport requests is SOAP (Simple Object Access Protocol). Thus, in one embodiment, the transport layer may be referred to as a SOAP transport layer. Each SOAP request is considered a transaction boundary. Alternately, transport-oriented languages (such as EJB, RMI, and CORBA) may be used to perform object marshaling.
The transport tier manages user sessions and transactions by interpreting requests, executing transactions atomically, and returning the results. Each operation is performed by creating one or more data transfer objects along with a command, sending the objects and command wrapped in a SOAP request, processing the command, which returns zero or more data transfer objects as results, and returning the results of the operation wrapped in a SOAP response.
The client tier contains everything needed by users to create, delete and manage links. One embodiment of the client side implementation includes Epic customizations (menu items, hooks, etc.), user interfaces, and a client API that provides direct access to the server. The client API facilitates communications to the transport tier. Calls to the server are done within a transaction. The client component includes all classes necessary to create and submit a transaction and to receive the results in the form of classes that represent server side objects. As shown in
The user may be provided with the ability to profile groups. Thus, the user may save and name a choice of profiles, which may later be applied to an object by selecting the named profile group, rather than requiring each choice to be selected each time. The user may also save and name a choice of profiles, which may later be used to designate the profiles to use for resolution purposes (such as at block 14 of
Hierarchical profiles may be provided to allow the user to apply a group of profiles simultaneously to an object based on selecting a containment node. Typically, the profiles are applied at the leaf level. Radio profiles may be provided. Radio profiles are mutually exclusive profiles where only one choice is allowed. Profiling may be done via containment. Containment is the concept where a single profile value represents the inclusion of other lower level profiles (for example, “top secret” including “secret, classified, and unclassified”). Further, named profiles may be used (profile groups and profile filters), profiles may be restricted to or from particular elements, and logical expressions (AND, OR, NOT, EQUAL, XOR) may be included in profile filters.
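Containment and logical profile filters can be sketched together as follows. This Python fragment is an illustration under stated assumptions: the CONTAINS mapping, the tuple-based expression format, and the expand and matches functions are hypothetical, and the description does not specify how filters are actually encoded or evaluated.

```python
# Hypothetical containment hierarchy: each value implies the values listed for it.
CONTAINS = {
    "top secret": {"secret", "classified", "unclassified"},
}


def expand(values: set[str]) -> set[str]:
    """Return the values plus every lower level value they contain."""
    expanded: set[str] = set()
    pending = list(values)
    while pending:
        value = pending.pop()
        if value not in expanded:
            expanded.add(value)
            pending.extend(CONTAINS.get(value, ()))
    return expanded


def matches(expr: tuple, profiles: dict[str, set[str]]) -> bool:
    """Evaluate a nested filter such as ("AND", ("EQUAL", attr, value), ...)."""
    op, *args = expr
    if op == "EQUAL":
        attribute, value = args
        return value in profiles.get(attribute, set())
    if op == "NOT":
        return not matches(args[0], profiles)
    if op == "AND":
        return all(matches(arg, profiles) for arg in args)
    if op == "OR":
        return any(matches(arg, profiles) for arg in args)
    if op == "XOR":
        left, right = args
        return matches(left, profiles) != matches(right, profiles)
    raise ValueError(f"unknown operator: {op}")


profiles = {"audience": {"expert"}, "security": expand({"top secret"})}
expert_not_spanish = ("AND", ("EQUAL", "audience", "expert"),
                      ("NOT", ("EQUAL", "language", "es")))
print(matches(expert_not_spanish, profiles))
print(matches(("EQUAL", "security", "secret"), profiles))
```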
A screen shot of an Apply Profiles dialog 250 is illustrated in
A screen shot of a Search DCAM Links dialog 280 is illustrated at
A screen shot of an Export Linkbase dialog 300 is illustrated at
A screen shot of a Profile Filter dialog 308 is shown at
The Profile Filter Group dialog 312 is shown at
As shown in
The functioning of the link resolution filter 340 is illustrated in
To use the client component, an application calls the establishClientSession( ) method in the Application API. This call initiates a connection to the transport tier, which passes on the create session request to the server. If the user authenticates, a session is created. Once a Client Session object is created, user actions may generate calls to the Service interface. That interface generates a SOAP request and sends it to the transport tier which processes the transaction through the server. Upon completion, the server returns a SOAP document containing all of the results.
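The shape of such a request can be illustrated by wrapping a command and its data transfer objects in a SOAP envelope. In the Python sketch below, only the SOAP 1.1 envelope namespace is standard. The urn:example:dcam namespace, the createLink command, the session attribute, the object element, and the build_request function are all hypothetical and are not the actual client API.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
DCAM_NS = "urn:example:dcam"  # hypothetical namespace, for illustration only


def build_request(command: str, session_id: str, objects: list[dict[str, str]]) -> str:
    """Wrap a command and its data transfer objects in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # One command per request, since each request is a transaction boundary.
    cmd = ET.SubElement(body, f"{{{DCAM_NS}}}{command}", {"session": session_id})
    for obj in objects:
        dto = ET.SubElement(cmd, f"{{{DCAM_NS}}}object")
        for name, value in obj.items():
            ET.SubElement(dto, f"{{{DCAM_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


request = build_request("createLink", "s-1", [{"title": "Home"}])
print(request)
```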
Data merge, in accordance with the present invention, permits including content from a separate data store. Data merge is the incorporation of references to external data sources in a document, and the periodic resolution of those references. A reference to some external data is called a query. Three authoring stages may be identified: query declaration; reference to a query declaration; and update of one or more query results. Named declarations are reusable, that is, they may be referenced many times. They are given a name when created and their location defines their scope. The name must be unique for the scope. Query declarations with no name are not reusable. The point of declaration is the only reference. Query results appear at the location of a query reference. Results are inserted at the time a reference is inserted or whenever a reference to a query is updated.
A query declaration must refer to an external data source, called the query definition. A query definition comprises a UI component and a formal definition. The UI component includes: the name of the query definition, the parameters that must be passed to the query, whether the query returns document content or name/value pairs, and, if document content is returned, a representative top level tag for quick context verification. The name of the query definition links a document's query declaration to a query definition. The formal definition includes: a source stage, one or more transformation stages, a description of the order in which the stages are to be applied, and a mapping of UI parameters to actual parameters for each stage. The source stage may be any program that generates a Document Object Model (DOM) node. The actual source may be a database, a file, a URL, or some external process. The program is responsible for presenting the result as a node, perhaps using some simple markup to represent value pairs. The transformation stages take a DOM node as input and generate a new DOM node as output.
The data merge framework is designed for flexibility and extensibility. It adapts the Model-View-Controller (MVC) design paradigm and uses a pipeline structure for handling data acquisition and processing. The MVC paradigm separates the business logic from the user interface, allowing both sides to be modified independently. A pipeline allows easy reuse of components.
DOM Server 350 refers to the program which is capable of manipulating a DOM instance and displaying a DOM instance when in the interactive mode. Data Merge Controller 352 is a program that consists of helper functions that are assembled from the DOM API. It receives requests from the user, distributes the requests to queries and then inserts the result into a document. It is also responsible for updating changes and retrieving data fields. A Query 354 comprises one Source and zero or more Transformers. A Source is a component that can generate a DOM node. A Transformer takes an existing node as input and returns a modified node as the output. A Query process generates a DOM node (which may be an element, document fragment, or document) and returns it to the Data Merge Controller. The components in a Query communicate with each other via a DOM node. A Query generally comprises three components: SQL Query, Transformation and XPath. The Query process retrieves data from a database and transforms the result using an XSLT transformation. Before returning the result, an XPath filter is applied to retrieve only the components of interest. The query process involves a sequence of transformations.
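The Source, Transformer, and XPath stages can be sketched as a simple pipeline in which each stage hands a node to the next. The Python fragment below is illustrative only. It uses ElementTree elements in place of DOM nodes, a plain function in place of an XSLT stage (the Python standard library has no XSLT processor), and the limited XPath subset that ElementTree supports. All function names, the parts table, and the row and result element names are assumptions.

```python
import copy
import sqlite3
import xml.etree.ElementTree as ET
from typing import Callable

Source = Callable[[], ET.Element]
Transformer = Callable[[ET.Element], ET.Element]


def sql_source(conn: sqlite3.Connection, sql: str) -> Source:
    """Build a Source that presents SQL rows as simple name/value markup."""
    def run() -> ET.Element:
        cursor = conn.execute(sql)
        columns = [description[0] for description in cursor.description]
        rows = ET.Element("rows")
        for record in cursor:
            row = ET.SubElement(rows, "row")
            for name, value in zip(columns, record):
                ET.SubElement(row, name).text = str(value)
        return rows
    return run


def uppercase_names(node: ET.Element) -> ET.Element:
    """Stand-in for an XSLT stage: returns a modified copy of the node."""
    result = copy.deepcopy(node)
    for name in result.iter("name"):
        name.text = (name.text or "").upper()
    return result


def run_query(source: Source, transformers: list[Transformer], xpath: str) -> ET.Element:
    """Run the source, apply each transformer in order, then filter by path."""
    node = source()
    for transform in transformers:
        node = transform(node)
    result = ET.Element("result")
    result.extend(node.findall(xpath))
    return result


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (name TEXT, price INTEGER)")
conn.execute("INSERT INTO parts VALUES ('bolt', 2), ('nut', 1)")
query_source = sql_source(conn, "SELECT name, price FROM parts ORDER BY name")
merged = run_query(query_source, [uppercase_names], "./row[price='2']/name")
print(ET.tostring(merged, encoding="unicode"))
```

Because every stage accepts and returns a node, a stage can be reused in other queries or replaced without changing the stages around it.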
Although the present invention has been described with reference to preferred embodiments, persons skilled in the art will recognize that changes may be made in form and detail without departing from the spirit and scope of the invention.
Claims
1. A method for managing dynamic assembly of content in a document, comprising:
- parsing a document to find at least one link;
- resolving the at least one link against a DCAM Repository resulting in a resolved link to content;
- assembling the resolved link to content into the document.
2. The method of claim 1, further including profiling the link when the link is resolved.
3. The method of claim 1, further including processing data merge queries.
4. The method of claim 1, further including determining link type of the at least one link.
5. The method of claim 4, wherein if the link is a navigation link, updated markup is inserted into the document.
6. The method of claim 4, wherein if the link is an embedding of content, the method further includes embedding the content into the document.
7. A method for managing dynamic assembly of content, comprising:
- determining an access point, the access point invoking a dynamic content assembly request;
- selecting one of a plurality of contents for assembly from the access point; and
- assembling said contents dynamically based on rules; and
- making said dynamically assembled content available from said access point.
8. The method of claim 7, wherein the access point is the start point of a relationship arc.
9. A method for managing creation of links between and within content modules, comprising:
- determining first and second end points, the first end point being a start of a relationship arc and the second end point being an end of a relationship arc;
- creating a relationship arc;
- indicating an appropriate audience for the relationship arc; and
- dynamically resolving the relationship arc based on the audience.
10. The method of claim 9, wherein determining end points and creating a relationship arc further comprise:
- determining possible end points based on rules;
- capturing metadata about each possible end point based on rules; and
- validating the end points used within a document or system.
11. The method of claim 10, wherein possible end points are found using a navigation mechanism for finding end points.
12. A method for dynamically assembling documents, comprising:
- entering a target with assembly rules;
- entering a link that points to the target; and
- resolving the link into content.
13. A system for managing dynamic content assembly, comprising:
- a server tier comprising a database bound to a server application, the database including linking information;
- a client tier enabling a user to create, delete and manage links; and
- a transport tier for communicating between the server tier and the client tier.
14. The system of claim 13, wherein the server application includes logic for interacting with the database.
15. The system of claim 13, wherein the transport tier is a SOAP transport layer.
Type: Application
Filed: May 5, 2004
Publication Date: Jul 13, 2006
Inventors: John Dreystadt (Pickney, MI), Timothy Allen (Saline, MI), John Koenig (Ann Arbor, MI), Curt Malouin (Northville, MI), Ying-Che Fang (Ann Arbor, MI), Andrew Dobrowolski (Saline, MI)
Application Number: 10/839,109
International Classification: G06F 17/24 (20060101); G06F 17/21 (20060101);