System and method of translating a relational database into an XML document and vice versa
A method of translating a relational database into an XML document comprises generating an Extended Entity Relationship model from relational schema associated with the relational database and applying a schema translation process to the Extended Entity Relationship model to map the relational schema into a Document Type Definition (DTD) of an XML schema. An XML Tree Model is then generated from the Document Type Definition, the XML Tree Model being representative of one or more data semantics of the relational schema which are preserved as one or more data semantics in the XML document. Relational data from the relational database is then converted into the XML document using the relational schema and the XML schema from the Document Type Definition and/or the XML Tree Model. There is also described a method of translating an XML database into a relational database which comprises the reversal of the steps of the aforementioned method, and a system for implementing both methods.
The present invention relates to a method of translating a relational database into an XML document, a method of translating an XML database into a relational database, a method of data transmission of relational data through an XML document between a sender and a receiver over a network, a computer program, computer program product, a system of translating a relational database into an XML document and a system of translating an XML database into a relational database.
BACKGROUND OF INVENTION
Internet applications today are faced with the problem of replicating, transforming, exporting, or saving data from one format to another. This process may be laborious, tedious and error prone. The Internet holds within it the potential for integrating all information into a global network, promising access to information any time and anywhere. However, this potential has yet to be realised. At present, the Internet is merely an access medium. To realize the Internet's potential, intelligent search, data exchange, adaptive presentation, and data recovery are needed. The Internet must go beyond setting an information access standard, which means a standard way of representing data, so that software can search, move, display, recover and otherwise manipulate information currently hidden in contextual obscurity.
XML (Extensible Markup Language) has emerged as the standard for data interchange over the Internet. Interoperation of relational databases and XML databases requires schema translation and data conversion between the relational and XML databases. The translated XML schema may assist in the sharing of business data with other systems, interoperability with incompatible systems, exposing legacy data to applications that use XML such as e-commerce, object persistence using XML, and content syndication. In recent years, with the growing importance of XML documents as a means to represent data on the World Wide Web, much research has been carried out on devising new technologies to store and retrieve XML documents using relational databases.
XML databases are available from the key Relational Database Vendors in the marketplace as an extender or cartridge to a relational database management system. Most XML-enabled database management systems such as Oracle, SQL Server and Sybase can only translate a few relations into an XML document. However, they cannot transform the whole relational database into an XML document nor synchronize a relational database into a replicate XML database.
Moreover, in such conventional systems and methods, the translation does not take data semantic constraints into account, and thus these methods may not be sufficient for an information highway on the web. The demand on databases is increasing in e-commerce. Aoying Zhou, Hongjun Lu, Shihui Zheng, Yuqi Liang, Long Zhang, Wenyun Ji, and Zengping Tian describe a visual based XML document management system (a VXMLR system) in the paper entitled ‘A Visual XML-Relational Database System’, published as Proceedings of the 27th VLDB Conference, Roma, Italy, 2001 pp. 646-648. In this system, firstly an XML document is parsed into a Document Object Model (DOM) tree and the Document Type Definition (DTD) of the document is extracted. The Document Object Model tree is then mapped into a relational table and stored in a database. For processing XML queries, the path expression queries are transformed into SQL statements and submitted to the underlying Relational Database Management Systems (RDBMS). VXMLR maintains some statistics of data and a path directory, which are used in the query rewriting process to reduce the number of SQL statements and simplify join conditions.
Mary Fernandez, Wang-Chiew Tan and Dan Suciu in the document entitled ‘SilkRoute: trading between relations and XML, Computer Networks’, Volume 33, Issues 1-6, June 2000, pp. 723-745 describe a general framework for mapping relational databases to XML virtual views using a declarative query language, RXL (Relational-to-XML Transformation Language). The resultant view is formulated by application using XML-Query Language (QL) to extract XML data.
In a document by Masatoshi Yoshikawa and Toshiyuki Amagasa entitled ‘XRel: A path-based approach to storage and retrieval of XML documents using relational databases’, published as ACM Transactions on Internet Technology, Vol. 1 No. 1, August 2001, pp. 110-141, an XML document is decomposed into a set of nodes that are stored in several tables along with encoded path information from the root to each node. XML documents are stored using a fixed relational schema without any information about DTDs and also utilize indices such as the B+-tree supported by DBMS. To process XML queries, an algorithm is presented for translating a core subset of XPath expressions into SQL queries.
Jayavel Shanmugasundaram, Eugene Shekita, Rimon Barr, Michael Carey, Bruce Lindsay, Hamid Pirahesh, and Berthold Reinwald, in a document entitled ‘Efficiently Publishing Relational Data as XML Documents’, published as Proceedings of the 26th VLDB Conference, Cairo, Egypt, 2000, pp. 65-76, describe an SQL language extension, namely an XML constructor, for constructing complex XML documents directly in the relational engine. Different execution plans for generating the content of an XML document were explored. The results show that constructing XML documents inside the relational engine could have significant performance benefits.
Joseph Fong, Francis Pang, and Chris Bloor in a document entitled ‘Converting Relational Database into XML Document’, published as Proceedings of First International Workshop on Electronic Business Hubs, September, 2001, pp. 61-65 describe a method to translate XQL into SQL in an XML gateway. The described translation process adopts a symbolic transformation of node navigation in an XQL query graph to a relation join table navigation in an SQL query graph.
Joseph Fong and Tharam Dillon in a document entitled ‘Towards Query Translation from XQL to SQL’, published as Proc. of 9th IFIP 2.6 Working Conference on Database Semantics (DS-9) by World Scientific Publisher in 2001, pp. 113-129, describe a comparison of the performance of an XML-Enabled Database and a Native XML database, in which Native XML databases are recommended for systems with very complex structures. In a document by Joseph Fong, H K Wong, and Anthony Fong entitled ‘Performance Analysis between XML-Enabled Database and Native XML Database’, a book chapter of XML Data Management, edited by Akmal Chaudhri, Addison-Wesley, USA, March, 2003, steps are described for converting a relational database into an XML document. The described steps show how to translate relational schema into XML schema, followed by manually mapping data to an XML document.
Multi-database systems are systems that provide interoperation and a varying degree of integration among multiple databases. There are different approaches to multidatabase interoperability. Global schema integration is an approach that is based on complete integration of multiple databases in order to provide a global schema. However, this approach has several disadvantages, one of which is that it is difficult to identify relationships among attributes of two schemas and to identify relationships among entity types and relationship types. Another approach is known as the Multidatabase Language Approach. The aim of this approach is to perform queries involving several databases at the same time. However, this approach requires users to learn another language and users may find it difficult to understand each individual database schema.
Some database management systems (e.g. Oracle, DB2) allow input of XQL queries to allow users to retrieve XML documents. However, the data retrieved are actually stored in tables in the relational database and are not stored in an XML database.
Conventional methods for storing XML documents in relational databases can roughly be classified into three categories: structure-mapping, model-mapping and semantic-preserving approaches.
The Model-Mapping Approach:
There have been several studies that use fixed relational schemas to store XML documents. Such approaches are known as model-mapping approaches. Each such approach has different mapping rules and database schema.
The “Edge” approach is described in Kanne, C., and Moerkotte, G., Efficient Storage of XML Data, Proceedings of the 16th International Conference on Data Engineering, 2000, Page(s): 198-198 and stores the XML data as a directed graph/tree in a single relational table. This approach maintains edges individually. Therefore it needs to concatenate the edges to form a path for processing user queries. As a simple table, it only keeps edge-labels, rather than the labeled paths. Therefore a large number of joins is needed to check edge connections.
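The join cost described above can be illustrated with a minimal Python sketch using the standard `sqlite3` module. The `Edge` table layout (`source`, `target`, `label`, `value`) and the sample data are assumptions made for illustration only and are not taken from the cited work. Because each row holds a single edge, a three-step path needs two self-joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Edge (source INTEGER, target INTEGER, label TEXT, value TEXT);
INSERT INTO Edge VALUES (0, 1, 'Company', NULL);
INSERT INTO Edge VALUES (1, 2, 'Department', NULL);
INSERT INTO Edge VALUES (2, 3, 'Name', 'Sales');
INSERT INTO Edge VALUES (1, 4, 'Department', NULL);
INSERT INTO Edge VALUES (4, 5, 'Name', 'Research');
""")

# Path /Company/Department/Name: one self-join per additional path step,
# because the table stores edge labels only, not whole labeled paths.
PATH_QUERY = """
SELECT e3.value
FROM Edge e1
JOIN Edge e2 ON e2.source = e1.target
JOIN Edge e3 ON e3.source = e2.target
WHERE e1.label = 'Company'
  AND e2.label = 'Department'
  AND e3.label = 'Name'
ORDER BY e3.target
"""

names = [row[0] for row in conn.execute(PATH_QUERY)]
print(names)
```

A path with n steps needs n-1 joins under this layout, which is the drawback the paragraph above refers to.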
Similar to the “Edge” approach, Thomas Kudrass, in a document entitled ‘Management of XML documents without schema in relational database systems’, published as Information and Software Technology, Volume 44, Issue 4, March 2002, Page(s): 269-275 describes an edge table enriched by an amount of information in order to distinguish between different target nodes. In this approach, the content of a document is stored in a leaf value (Leaf table) or in an attribute value (Attr table). Both are referenced from the Edge table via a foreign key. The edges of the document tree are identified by a source node and a target node. Each document has a unique ID so that an edge can be assigned to one document. A drawback of this approach is that the decomposition of a document produces a lot of tuples to be inserted into the database. Therefore, the load time may increase for a large document.
Masatoshi Yoshikawa and Toshiyuki Amagasa, in a document entitled ‘XRel: A Path-Based Approach to Storage and Retrieval of XML Documents Using Relational Databases’, published as ACM Transactions on Internet Technology, Vol. 1, No. 1, August 2001, Page(s): 110-141 describe a system (XRel) in which an XML document is decomposed into nodes on the basis of its tree structure and stored in relational tables according to the node type, with path information from the root to each node. The XRel system stores the directed graph of an XML document in four tables. The advantage of the XRel system is that it does not require recursive queries, and can perform the same function within the SQL-92 standard.
Haifeng Jiang, Hongjun Lu, Wei Wang, and Jeffrey Xu Yu, in a document entitled ‘Xparent: an efficient RDBMS-Based XML database system’, published as Proceedings of the 18th International Conference on Data Engineering, 2002, Page(s): 335-336 describe a system (Xparent) in which the data model of XPath is adopted to represent XML documents. The Xparent system models a document as an ordered tree. It uses a schema similar to that used in the XRel system. In this system the data-path id replaces the start and end pairs used in the XRel system. The advantage of the Xparent system is that it can be efficiently supported using conventional index mechanisms such as B-tree. One drawback of the Xparent system is that it requires a large number of joins to check edge-connections for processing complex queries.
In XML-Relational conversion, which is described in a document by Latifur Khan, Yan Rao entitled ‘A performance evaluation of storing XML data in relational database management systems’, published as Proceeding of the 3rd international workshop on Web information and data management, November 2001 Page(s): 31-38, each document is stored in two relational tables. This approach preserves the nested structure of an XML document. A shortcoming of this approach is that PathId depends on an element's tag, and it might be the case that some elements occur multiple times, which violates the definition of the primary key (PathId). Extra work is required to solve such conflicts.
The Structure-Mapping Approach:
In structure-mapping, schemas are extracted from XML documents and a database schema is defined for each XML document.
Mary Fernàndez, Wang-Chiew Tan and Dan Suciu, in a document entitled ‘SilkRoute: trading between relations and XML’, published as Computer Networks, Volume 33, Issues 1-6, June 2000, Page(s): 723-745 describe a general framework for mapping relational databases to XML virtual views using a declarative query language, RXL (Relational to XML Transformation Language). The operation starts by writing an RXL query that defines the XML virtual view of the database. The main shortcoming with this approach is that queries over the views often produce composed queries with many unions. Iraklis Varlamis and Michalis Vazirgiannis, in a document entitled ‘Bridging XML-schema and relational databases, a system for generating and manipulating relational databases using valid XML documents’, published as Proceeding of the ACM Symposium on Document Engineering, November 2001, Page(s): 105-114 describe an X-Database system that acts as an interface between the application and database. The basis of the system is an XML-Schema that describes the logical model of interchanged information. A drawback of the X-Database system is that the XML-Schema may be defined only once at the beginning of the process and cannot be changed, whereas in reality the schema changes over time in the majority of applications.
The XPERANTO system described by Michael Carey, Jerry Kiernan, Jayavel Shanmugasundaram, Eugene Shekita, and Subbu Subramanian, in a document entitled ‘XPERANTO: Middleware for Publishing Object-Relational Data as XML Documents’, published as Proceedings of the 26th VLDB Conference, 2000, Page(s): 646-648 operates as middleware on top of an (object) relational database system. This system starts by providing a default virtual view of a given (object) relational database. The user may then create more complex or specialised views based on the default view by using an XML query language. One attractive aspect of the XPERANTO approach is that it works in any existing relational database system because the XPERANTO system generates regular SQL and tags the results outside the database engine.
Aoying Zhou, Hongjun Lu, Shihui Zheng, Yuqi Liang, Long Zhang, Wenyun Ji, and Zengping Tian in a paper entitled “VXMLR: A Visual XML-Relational Database System” published as Proceedings of the 27th VLDB Conference, 2001, pages 646-648 present a visual based XML document management system, VXMLR. In this system, the XML document is parsed into a Document Object Model tree and the DTD of the document is extracted. The document tree is then mapped and stored into a relational table. VXMLR maintains some statistics of data and a path directory, which are used in the query rewriting process to reduce the number of SQL statements and simplify join conditions.
The Semantic-Preserving Approach:
The semantic-preserving approach generates an XML structure that is able to describe the semantics and structure in the underlying relational database.
Wenyue Du, Mong Li Lee and Tok Wang Ling, in a document entitled ‘XML structures for relational data’, published as Proceedings of the Second International Conference on Web Information Systems Engineering, Volume 1, December 2001, Page(s): 151-160 describe a methodology which employs a semantically rich Object-Relational-Attribute model for semi-structured data (ORA-SS) in the translation process. ORA-SS models a rich variety of semantic constraints (strong/weak entities, binary/n-ary/recursive/ISA relationship type, single-valued/multi-valued attributes of entity types or relationship types and cardinality constraints) in the underlying relational database, and represents the implicit structures of relational data using hierarchy and referencing. ORA-SS preserves the inherent semantics and implicit structure in relational schema.
J. Fong, H. K. Wong and Z. Cheng, in a document entitled ‘Converting relational database into XML documents with DOM’, published as Information and Software Technology, Volume 45, Issue 6, April 2003, Pages 335-355 describe a system in which the relational schema are denormalized into joined tables which are transformed into Document Object Models (DOMs) according to their data dependency constraints. These DOMs are integrated into a single DOM which is translated into an XML document. The data dependency constraints in the de-normalized relational schema are mapped into XML document trees in elements and sub-elements. In the process, the partial functional dependencies are mapped into elements and attributes. The transitive data dependencies are mapped into element, sub-element, and sub-sub-elements in the XML documents. The multi-valued dependencies are mapped into multiple sub-elements under one element. The join dependencies are mapped into a group element. As a result, the data semantics in the relational schema are translated and preserved in the XML document.
Angela Cristina Duta, Ken Barker, Reda Alhajj, in a document entitled ‘ConvRel: relationship conversion to XML nested structures’, published as Proceedings of the 2004 ACM symposium on applied computing, March 2004, Page(s): 698-702 describe a system in which relational schemas are transformed into nested-based XML schema for each relational data source.
In summary, there is a need for a system having a relational database for traditional data processing and also its equivalent XML database for various applications (such as Bank-to-Bank (B2B) applications) with improved performance in the online conversion from relational data to an XML document. Furthermore, as users may prefer to keep two production database systems for computing, there is a need for a system in which a relational database may be used for internal data processing and its counterpart XML database may be used for external Internet data transmission. There is also a need for a method for converting between a relational database and an XML database which improves database performance, enables automatic XML database recovery in the case of system failures, and is easy to use enabling users to use their own familiar query language.
SUMMARY OF THE INVENTION
According to a first aspect of the present invention there is provided a method of translating a relational database into an XML document comprising the steps of:
generating an Extended Entity Relationship (EER) model from relational schema associated with said relational database;
applying a schema translation process to the Extended Entity Relationship model to map the relational schema into a Document Type Definition (DTD) of an XML schema;
generating a XML Tree Model from said Document Type Definition representative of one or more data semantics of the relational schema which are preserved as one or more data semantics in said XML document; and
converting relational data from said relational database into said XML document using said relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
Preferably, the step of applying a schema translation process comprises mapping the relational schema with associated relational schema constraints into said Document Type Definition.
Preferably, the step of generating a XML Tree Model comprises generating a plurality of XML Tree Models representative of one or more data semantics of the relational schema. In a preferred embodiment, the method further comprises updating said relational database and said XML database by translating an update transaction from said relational database in Structured Query Language into an update transaction of said XML database as a Document Object Model.
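As a rough illustration of keeping the two databases in step, the following Python sketch applies one update both as an SQL statement and as a Document Object Model operation. It does not parse SQL text; the update is passed in as structured arguments, which is a simplifying assumption. The function name `apply_update`, the `Employee` table and the convention that a relation maps to an element of the same name with columns as attributes are all hypothetical.

```python
import sqlite3
from xml.dom import minidom

def apply_update(conn, dom, table, key_col, key_val, col, new_val):
    """Apply the same single-column update to the relational and XML sides.

    Assumption: each tuple of `table` is an element named `table` whose
    columns are stored as attributes. Identifiers are trusted in this sketch.
    """
    # Relational side: a parameterised SQL UPDATE.
    sql = "UPDATE {} SET {} = ? WHERE {} = ?".format(table, col, key_col)
    conn.execute(sql, (new_val, key_val))
    # XML side: the equivalent change made through the DOM API.
    for node in dom.getElementsByTagName(table):
        if node.getAttribute(key_col) == str(key_val):
            node.setAttribute(col, str(new_val))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Employee VALUES (10, 'Ann')")
dom = minidom.parseString('<Employees><Employee emp_id="10" name="Ann"/></Employees>')

apply_update(conn, dom, "Employee", "emp_id", 10, "name", "Anne")

db_name = conn.execute("SELECT name FROM Employee WHERE emp_id = 10").fetchone()[0]
xml_name = dom.getElementsByTagName("Employee")[0].getAttribute("name")
print(db_name, xml_name)
```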
According to a second aspect of the present invention there is provided a method of translating an XML database into a relational database comprising the steps of:
generating a XML Tree Model from said XML database;
generating a Document Type Definition representative of one or more data semantics of an XML schema associated with said XML database;
generating an Extended Entity Relationship (EER) model from said XML schema;
applying a schema translation process to the Extended Entity Relationship model to map the XML schema into a relational schema representative of said relational database, said data semantics of said XML schema being preserved as one or more data semantics in said relational database; and
converting XML data from said XML database into said relational database using said relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
Preferably, said XML schema comprise one or more elements each having an associated data occurrence, and wherein the step of applying a schema translation process further comprises for each element in said XML schema, locating a corresponding target relation, and loading into a tuple of said target relation the data occurrence of said element according to one or more data semantics of said XML database.
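The loading step described above can be sketched in Python as follows. The `TARGETS` mapping from element name to target relation, the sample document and the rule that a child tuple takes its foreign key from the parent element are assumptions chosen for illustration, not rules stated in this specification.

```python
import sqlite3
import xml.etree.ElementTree as ET

XML_TEXT = """<Departments>
  <Department dept_id="1" name="Sales">
    <Employee emp_id="10" name="Ann"/>
    <Employee emp_id="11" name="Bob"/>
  </Department>
</Departments>"""

# Hypothetical mapping: element tag -> (target relation, own columns,
# foreign-key column copied from the parent element, or None).
TARGETS = {
    "Department": ("Department", ["dept_id", "name"], None),
    "Employee": ("Employee", ["emp_id", "name"], "dept_id"),
}

def load(conn, node, parent=None):
    """Locate the target relation of each element and insert one tuple."""
    target = TARGETS.get(node.tag)
    if target is not None:
        table, cols, fk = target
        cols = list(cols)
        values = [node.get(c) for c in cols]
        if fk is not None and parent is not None:
            cols.append(fk)
            values.append(parent.get(fk))
        placeholders = ", ".join("?" * len(cols))
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            table, ", ".join(cols), placeholders)
        conn.execute(sql, values)
    for child in node:
        load(conn, child, node)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, name TEXT, "
             "dept_id INTEGER REFERENCES Department(dept_id))")

load(conn, ET.fromstring(XML_TEXT))
employees = conn.execute(
    "SELECT name, dept_id FROM Employee ORDER BY emp_id").fetchall()
print(employees)
```

Elements with no entry in the mapping, such as the wrapping root element here, produce no tuple and are only traversed.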
According to a third aspect of the present invention there is provided a method of data transmission of relational data through an XML document between a sender and a receiver over a network comprising the method defined above for translating relational data into an XML document, transmitting from said sender said XML document over said network, receiving at said receiver said XML document, and converting said XML document into a relational language used in said receiver.
Preferably, the step of converting said XML document into a local relational schema used in said receiver comprises:
generating a XML Tree Model from said XML document;
generating a Document Type Definition representative of one or more data semantics of an XML schema associated with said XML document;
generating an Extended Entity Relationship (EER) model from said XML schema;
applying a schema translation process to the Extended Entity Relationship model to map the XML schema into said local relational schema representative of a relational database of said receiver, said data semantics of said XML schema being preserved as one or more data semantics in said relational database of said receiver; and
converting XML data from said XML database into said local relational database using said local relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
Preferably, said network is the Internet.
According to a fourth aspect of the present invention there is provided a computer program comprising program instructions for causing a computer to perform one or more of the methods defined above.
According to a fifth aspect of the present invention there is provided a computer program product comprising the computer program defined above.
According to a sixth aspect of the present invention there is provided a system arranged to perform any one or more of the methods defined above.
According to a seventh aspect of the present invention there is provided a system of translating a relational database into an XML document comprising:
an Extended Entity Relationship (EER) model generator for generating an Extended Entity Relationship (EER) model from relational schema associated with said relational database;
means for applying a schema translation process to the Extended Entity Relationship model to map the relational schema into a Document Type Definition (DTD) of an XML schema;
a generator for generating a XML Tree Model from said Document Type Definition representative of one or more data semantics of the relational schema which are preserved as one or more data semantics in said XML document; and
a converter for converting relational data from said relational database into said XML document using said relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
According to an eighth aspect of the present invention there is provided a system of translating an XML database into a relational database comprising:
a XML Tree Model generator for generating a XML Tree Model from said XML database;
a Document Type Definition generator for generating a Document Type Definition representative of one or more data semantics of an XML schema associated with said XML database;
an Extended Entity Relationship (EER) model generator for generating an Extended Entity Relationship (EER) model from said XML schema;
means for applying a schema translation process to the Extended Entity Relationship model to map the XML schema into a relational schema representative of said relational database, said data semantics of said XML schema being preserved as one or more data semantics in said relational database; and
a converter for converting XML data from said XML database into said relational database using said relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
One or more preferred embodiments of the invention are advantageous for assisting improvements in database performance, automating XML database recovery in the case of system failures, and permitting users to use their own familiar query language which renders the systems and methods easy to use.
One or more preferred embodiments of the invention provide an incrementally maintainable XML database for efficient and effective Internet computing on the web which is particularly useful in the field of e-commerce.
Internet computing performance may be improved as a replicate XML database and its counterpart relational database may be processed in parallel for both internal data processing computing and external data transmission on the Internet. Furthermore, one or more preferred embodiments of the invention enable an XML database to be recovered by its counterpart relational database in the event that the XML database is down.
To make relational tables compatible with the XML document, one or more preferred embodiments of the invention propose a scheme for translating a relational database into an XML document according to its topology mapping. The scheme may preserve the original relational database constraints which has the benefit that XML documents may be made compatible with a relational database and vice versa.
Thus, one or more preferred embodiments of the invention provide a pair of information capacity equivalent relational and XML databases for rapid and user friendly computing on the Internet.
In one or more embodiments of the present invention, the DTD is used as the logical schema and the XML Tree Model is suggested as the conceptual schema. Thus, users may rely on the XML Tree Model to improve the conceptual structure for understanding the data requirement constraints of an XML database.
XML schema provides a means of using XML instances to define augmented DTDs.
DESCRIPTION OF DRAWINGS
Preferred features of the invention will now be described, for the sake of illustration only, with reference to the following figures in which:
Document Type Definition (DTD) is a logical schema of the XML model. There is currently no standard format for the conceptual level of the XML model, and preferred embodiments of the invention present a XML Tree Model as a diagrammatic representation of a DTD to form an XML conceptual model. The XML Tree Model may represent diagrammatically the data semantics of an XML database. The XML Tree Model may transform the constraints of a DTD in a topological structure of hierarchy nodes representing all elements within the DTD. Furthermore, the XML Tree Model may confirm the constraints according to user requirements.
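To make the relationship between a DTD and a tree of hierarchy nodes concrete, the following Python sketch reads a small DTD and builds a simple parent-to-children structure with occurrence indicators. The sample DTD, the function names and the restriction to comma-separated sequence content models are assumptions of this sketch; it is not the XML Tree Model generator of the embodiment.

```python
import re

DTD_TEXT = """
<!ELEMENT Company (Department+)>
<!ELEMENT Department (Name, Employee*)>
<!ELEMENT Employee (Name, Phone?)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Phone (#PCDATA)>
"""

# Matches only simple declarations of the form <!ELEMENT name (a, b*, c?)>.
ELEMENT_RE = re.compile(r"<!ELEMENT\s+(\w+)\s+\(([^)]*)\)>")

def parse_dtd(dtd_text):
    """Map each element name to a list of (child name, occurrence indicator)."""
    tree = {}
    for name, content in ELEMENT_RE.findall(dtd_text):
        children = []
        for part in content.split(","):
            part = part.strip()
            if part and part != "#PCDATA":
                indicator = part[-1] if part[-1] in "*+?" else ""
                children.append((part.rstrip("*+?"), indicator))
        tree[name] = children
    return tree

def find_root(tree):
    """Return the element that is never used as a child of another element."""
    referenced = {child for kids in tree.values() for child, _ in kids}
    candidates = [name for name in tree if name not in referenced]
    if len(candidates) != 1:
        raise ValueError("expected exactly one root element")
    return candidates[0]

tree = parse_dtd(DTD_TEXT)
root_name = find_root(tree)
print(root_name, tree[root_name])
```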
According to a preferred embodiment of the invention, an XML document may be considered to have a hierarchical tree structure as follows. Every XML document must have one root element 1. The root element 1 is at the highest hierarchical level. The root element 1 contains all the other elements 2, 3 and its attributes 4. The other elements 2, 3 are in hierarchical order, such that each is a parent node or a child node relative to another. The relatively higher level is the parent node and the relatively lower level is the child node.
An element 2 may be considered to be the basic building block of an XML document.
An element name should start with a letter or underscore character. An element may have sub-elements 3 under it. However, an empty element does not have a sub-element. Between an element 2 and a sub-element 3, there may be declarations 5 which control the occurrences of sub-elements 3. For example, element instances in a Document Type Definition (DTD) may be defined with an occurrence indicator. The “*” operator may be used, for example, to identify “set” sub-elements that can occur from zero to many times under a parent element. The “+” occurrence indicator may be used to specify one to many times occurrence under a parent element. The “?” occurrence indicator may be used to specify zero to one time occurrence under a parent element.
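The three occurrence indicators can be summarised in a short Python sketch that maps a minimum and maximum occurrence to the corresponding DTD indicator and emits an element declaration. The function names and the `Customer` example are hypothetical and serve only to illustrate the notation.

```python
from typing import Dict, Optional, Tuple

def occurrence_indicator(min_occurs: int, max_occurs: Optional[int]) -> str:
    """Return the DTD occurrence indicator; max_occurs=None means 'many'."""
    if min_occurs == 0 and max_occurs is None:
        return "*"   # zero to many times
    if min_occurs == 1 and max_occurs is None:
        return "+"   # one to many times
    if min_occurs == 0 and max_occurs == 1:
        return "?"   # zero to one time
    if min_occurs == 1 and max_occurs == 1:
        return ""    # exactly once: no indicator is written
    raise ValueError("cardinality has no single DTD occurrence indicator")

def element_declaration(parent: str,
                        children: Dict[str, Tuple[int, Optional[int]]]) -> str:
    """Build an <!ELEMENT ...> declaration with a sequence content model."""
    parts = [name + occurrence_indicator(lo, hi)
             for name, (lo, hi) in children.items()]
    return "<!ELEMENT {} ({})>".format(parent, ", ".join(parts))

declaration = element_declaration("Customer", {
    "Name": (1, 1),
    "Order": (0, None),
    "Phone": (1, None),
    "Fax": (0, 1),
})
print(declaration)  # <!ELEMENT Customer (Name, Order*, Phone+, Fax?)>
```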
Attributes 4 give more information about an element 2 and reside inside the element 2. An attribute 4 may further define the behaviour of an element 2 and allow it to have extended links by giving it an identifier.
The components of the XML Tree Model preferably consist of the Element 2, Attributes 4, Occurrence indicator 5, Id, Idref, Group element, Sub-element 3 and Component element.
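A minimal Python sketch using the standard `xml.etree.ElementTree` module shows how a root element, elements, sub-elements and attributes nest into hierarchical levels. The element and attribute names (`Company`, `Department`, `Employee`, `id`, `dept_ref`) are hypothetical; the `id` and `dept_ref` attributes merely stand in for Id and Idref style links.

```python
import xml.etree.ElementTree as ET

# Root element: the single highest-level node that contains all others.
root = ET.Element("Company")
# Element with an identifier-style attribute.
dept = ET.SubElement(root, "Department", {"id": "d1"})
# Sub-element whose attribute refers back to the identifier above.
emp = ET.SubElement(dept, "Employee", {"dept_ref": "d1"})
ET.SubElement(emp, "Name").text = "A. Smith"

def levels(node, level=0):
    """Yield (level, tag) pairs in document order; the root is level 0."""
    yield level, node.tag
    for child in node:
        yield from levels(child, level + 1)

outline = list(levels(root))
for level, tag in outline:
    print("  " * level + tag)
```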
The architecture 10 comprises a Sender/Receiver station and a Receiver/Sender station. Each station comprises a user interface 16 coupled to a data extraction stage 18, each data extraction stage 18 also being coupled to a respective database system 12, 14 and the Internet 20. Each database system 12, 14 contains a relational database 22 for local use, an XML database 24 for transmitting and receiving data and a data conversion stage 26 for converting the format and data between the two databases.
At the sending station, which comprises the database 12, data stored in the relational database 22 may be converted by the data conversion stage 26 to XML format and stored in the XML database 24 prior to transmission over the Internet 20 as an XML document 28. The data to be transmitted is extracted by the data extraction stage 18 under instruction from the user interface 16 associated with the sending station.
At the receiving station which comprises the database 14, the received XML document 28 is stored in the XML database 24 by a data storage stage 30 coupled to the Internet 20 and the database system 14. The stored document is then converted by the data conversion stage 26 to relational format and stored in the relational database 22, from which it may be accessed via the associated data extraction stage 18 by the user interface 16 for local use.
XML data received by the sender station from the Internet 20 may be processed in a similar manner to that described above. An XML document received from the Internet 20 is stored in the database system 12 by a further data storage stage 30.
A benefit of one or more preferred embodiments of the invention is that XML documents may be made compatible with the relational database and vice versa. A pair of information capacity equivalent relational and XML databases may be created for rapid and user friendly computing on the Internet.
According to a preferred embodiment of the invention, in replicating a relational database into an XML database, relational data may be copied into the XML document by transforming the topology data structure of relational tables into the hierarchical data structure of XML documents. As an example, a view of the relational database may be selected with a root relation and transformed into four topological XML documents according to their data semantics for transmission. One benefit of using an XML document as an intermediate data exchange on the Internet is to enable different receivers to expect a standard document on the Internet which can be mapped into their own relational database for processing.
After schema translation 38, relational data from the relational database 42 may be converted in a data conversion stage 44 into XML documents 46 by loading the relational data into XML documents under the control of the relational schema 32 and the XML schema created in the stage 40. Tuples of relational tables are loaded into the object instances of elements in the XML documents according to their constraints. By following a stepwise procedure as shown in
The procedure for conversion between a relational database and the corresponding XML database and vice versa as shown in
Step 1: Reverse engineer relational and XML logical schema into an EER model and a XML Tree Model.
Step 2: Perform schema translation from relational to XML and vice versa by carrying out the following sub-steps:
(1) Defining a root element
(2) Mapping weak entity between relational and XML databases
(3) Mapping participation between relational and XML document
(4) Mapping cardinality between relational and XML databases
(5) Mapping aggregation between relational and XML databases
(6) Mapping is-a relationship between relational and XML databases
(7) Mapping generalisation between relational and XML database
(8) Mapping categorisation between relational and XML databases
(9) Mapping n-ary relationship between relational and XML databases
Step 3: Perform data conversion from relational database into XML documents by carrying out the following sub-steps:
(1) Converting relations into a DOM or JDOM (Java Document Object Model)
(2) Integrating XML documents using JDOM
(3) Manipulating XML documents using JDOM
Step 4: Perform conversion from XML database into relational database.
The above steps will now be described in more detail with reference to the figures.
Step 1: Reverse Engineer Relational and XML Logical Schema into an EER Model
To reverse-engineer relational and XML logical schema into an EER model, a relational classification table (see for example Table 1 below) may be used to define the relationship between keys and attributes in all relations, and data semantics may be recovered in the form of an EER model.
An XML classification table (see for example Table 2 below) may be used to define the association among elements.
An example of an algorithm which may be used to map relations into topological XML documents is set out below.
Algorithm:
For the purposes of this specification the notations of data constraints referred to may be defined as follows:
Functional dependency: A functional dependency is a statement of the form X→Y, where X and Y are sets of attributes. The FD: X→Y holds for relation R if whenever s and t are tuples of R where s[X]=t[X], then s[Y]=t[Y].
Multi-valued dependency: Let R be a relation, and let X, Y, and Z be attributes of R. Then Y is multi-dependent on X, written MVD: X→→Y|Z, if and only if the set of Y-values matching a given (X-value, Z-value) pair in R depends only on the X-value and is independent of the Z-value.
Join dependency: Let R be a relation, and let A, B, . . . , Z be arbitrary subsets of the set of attributes of R. Then the join dependency JD *(A, B, . . . , Z) is said to hold for R if and only if R is equal to the join of its projections on A, B, . . . , Z, that is, R = R[A] ⋈ R[B] ⋈ . . . ⋈ R[Z].
Transitive dependency: A functional dependency X→Y in a relation schema R is a transitive dependency if there is a set of attributes Z that is neither a candidate key nor a subset of any key of R, and both X→Z and Z→Y hold.
Partial dependency: A functional dependency X→Y is a partial dependency if some attribute A∈X can be removed from X and the dependency still holds.
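The definitions of functional and partial dependency above can be checked against a set of tuples with a minimal Python sketch. The function names and the dictionary-based representation of tuples are assumptions made for illustration. Such a check can only show that a dependency is not violated by the given data; it cannot prove that the dependency holds for the schema in general.

```python
from typing import Dict, Hashable, Sequence

Row = Dict[str, Hashable]


def holds_fd(rows: Sequence[Row], x: Sequence[str], y: Sequence[str]) -> bool:
    """Return True if X -> Y is not violated: equal X-values imply equal Y-values."""
    seen = {}
    for row in rows:
        x_value = tuple(row[a] for a in x)
        y_value = tuple(row[a] for a in y)
        # setdefault stores the first Y-value seen for this X-value and
        # returns the stored value on every later occurrence.
        if seen.setdefault(x_value, y_value) != y_value:
            return False
    return True


def is_partial_dependency(rows: Sequence[Row], x: Sequence[str],
                          y: Sequence[str]) -> bool:
    """Return True if some attribute can be removed from X and X -> Y still holds.

    Assumption of this sketch: X must have at least two attributes, so the
    reduced determinant is never empty.
    """
    if len(x) < 2:
        return False
    for removed in x:
        reduced = [a for a in x if a != removed]
        if holds_fd(rows, reduced, y):
            return True
    return False
```

With hypothetical tuples in which `rate` is determined by `type` alone, the dependency {loan, type}→rate holds but is partial, because `loan` can be removed from the determinant.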
In an EER model, it is possible to navigate from one entity to another in a one-to-many cardinality, which corresponds to navigating from a parent element to its child elements in the hierarchical containment of an XML Tree Model. Navigability specifies the feasibility of traversal from an entity to its related entities, so a relationship can be directional. Navigation proceeds from a parent element to a child element, then from that child (the child table of the previous relationship) to its own children, and so on.
In an EER model, a superclass entity data occurrence should include its subclass entity data occurrences. On the other hand, a subclass entity may have its own attributes. Thus, a superclass entity requested by the user should include its relevant subclass entity.
Step 2: Schema Translation from Relational to XML and Vice Versa
The conceptual and logical schema (data semantics) in the EER model may be mapped from relational to XML and vice versa according to their data dependencies constraints. These constraints can then be transformed into DTD as XML schema in the following manner:
Step 2.1: Define a Root Element
To select a root element, its relevant information must be put into an XML schema. Relevance is concerned with the entities that are related to an entity selected by the user for processing. The relevant classes include the selected entity and all its related entities that are navigable. Navigability specifies whether traversal from an entity to its related entity is possible.
To make relational schema compatible with the XML schema, based on each constraint in the relational schema, the relational schema with its semantic constraints are mapped in stage 40 of
Given the DTD information of the XML to be stored, a structure may be created called the XML Tree Model that mirrors the structure of the DTD. Each node in the XML Tree model represents an XML element in a rectangle, an XML attribute in an oval, and an operator in a circle. These may be put together in a hierarchical containment under a root element node, with element nodes under a parent element node.
Furthermore, it is possible to link elements together with an Identifier (ID) and an Identifier Reference (IDREF). An element with an IDREF refers to an element with an ID. Each ID must have a unique address. Nodes can refer to each other by using the ID and IDREF.
Elements may cross-reference each other by ID and IDREF such that an element having an IDREF can refer to another element with the appropriate ID.
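The ID/IDREF cross-referencing described above can be illustrated with a minimal Python sketch using the standard `xml.etree.ElementTree` module. ElementTree does not read DTD attribute types, so the sketch assumes that the ID attribute is literally named `id` and the IDREF attribute `idref`; these names, the function name `resolve_idrefs` and the element names in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def resolve_idrefs(root: ET.Element, id_attr: str = "id",
                   idref_attr: str = "idref") -> List[Tuple[ET.Element, ET.Element]]:
    """Pair every referring element with the element whose ID it refers to."""
    ids = {}
    for element in root.iter():
        key = element.get(id_attr)
        if key is not None:
            if key in ids:
                # Each ID must be unique within the document.
                raise ValueError(f"duplicate ID {key!r}")
            ids[key] = element

    links = []
    for element in root.iter():
        ref = element.get(idref_attr)
        if ref is not None:
            if ref not in ids:
                raise ValueError(f"IDREF {ref!r} does not match any ID")
            links.append((element, ids[ref]))
    return links
```

For a document such as `<Bank><Interest_Type id="T1"/><Loan idref="T1"/></Bank>`, the function returns a single pair linking the `Loan` element to the `Interest_Type` element.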
Thus, to draw an XML Tree Model, an element is selected as a root and then its relevant information is put into a document. The selection is usually driven by the nature of the data being handled and its perceived business application.
XML is in the form of a spool of text in a particular sequence and the sequence will affect the output statement and finally the whole database schema. An XML schema may be considered to consist of a root element and then each element is laid down one by one as branches and leaves in the schema. There is a top-down relationship of the element in an XML schema. Even the element's attributes are also ordered in the schema.
On the other hand, the XML Tree Model node diagram is graphical. A node in an XML Tree Model does not carry any ordering information, and there is no explicit root-branch relationship between nodes in the diagram.
In order to solve the problem due to this structural difference, an arbitrary root element, a database object, has to be created in order to start the branching from root. Branching from this root element are the basic classes and various constraints included in the XML Tree Model specification. To prepare for the transformation, the non-ordered XML Tree Model node diagram must be replaced with a listing of all related components in the entity diagram. This process may be termed “Decomposition”. With the component list, a process sequence may be drawn to transform each kind of XML Tree Model component into its XML correspondence of DTD.
Entity E 54 is selected for mapping and, in the XML Tree Model 67, becomes the Root Element E 68. Entities F 60 and H 62 become the sub-elements F 70 and H 72 respectively, entity G 64 becoming sub-element G 74. The operators 76 indicate that each sub-element occurs at least once. The navigable entities in the EER Model are mapped as sub elements under root elements in a hierarchy structure. All elements are declared as EMPTY in this situation. Each attribute of the relevant entity is mapped into the attribute of the corresponding element.
The mapping procedure may operate both ways and may be used to map from the XML Tree Model into the EER Model.
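The forward direction of this mapping, from an XML Tree Model to DTD declarations, can be sketched in Python as follows. The `Node` class, the function `to_dtd`, and the choice of declaring every attribute as `CDATA #REQUIRED` are assumptions made for illustration; elements without sub-elements are declared EMPTY, as described above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One element of a (hypothetical) in-memory XML Tree Model."""
    name: str
    attributes: List[str] = field(default_factory=list)
    # Each child is stored with its occurrence indicator: "", "?", "*" or "+".
    children: List[Tuple["Node", str]] = field(default_factory=list)


def to_dtd(root: Node) -> str:
    """Emit DTD declarations for the tree, parent before children."""
    lines: List[str] = []

    def visit(node: Node) -> None:
        if node.children:
            content = ", ".join(child.name + occ for child, occ in node.children)
            lines.append(f"<!ELEMENT {node.name} ({content})>")
        else:
            lines.append(f"<!ELEMENT {node.name} EMPTY>")
        for attribute in node.attributes:
            # Assumption: every attribute is mandatory character data.
            lines.append(f"<!ATTLIST {node.name} {attribute} CDATA #REQUIRED>")
        for child, _ in node.children:
            visit(child)

    visit(root)
    return "\n".join(lines)
```

A root element E with sub-elements F and H, each occurring at least once, would be built as `Node("E", ["e1"], [(Node("F", ["f1"]), "+"), (Node("H"), "+")])`, where the attribute names `e1` and `f1` are placeholders.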
Step 2.2: Mapping Weak Entity Between Relational and XML Databases
A weak entity depends on its strong entity in that the primary key of the weak entity is also a foreign key referring to the primary key of its strong entity. In DTD, a strong entity may be transformed into an element with ID and a weak entity into another element which refers to the “ID” element using IDREF as shown in
The existence dependency constraints may be preserved in the translated XML schema as shown below:
An example of algorithms for schema translation between the relational and XML in
Step 2.3: Mapping Participation Between Relational and XML Document
A child table is in total participation with a parent table in that all data occurrences of the child table must participate in a relationship with the parent table. A foreign key of a child table in total participation must refer to the primary key of its parent table. A child table is in partial participation with a parent table in that a foreign key of a child table in partial participation can be a null value.
In preferred embodiments of the present invention, the functional dependency of relational schema may be preserved in the XML schema where a foreign key determines a referred primary key and an instance of a child element determines a data occurrence of a parent element as shown below in Table 3:
In DTD, the total participation may be translated into a mandatory occurrence and a partial participation into an optional occurrence as shown in
In the XML Tree Model 158, the entity R1 144 becomes the element E1 160 and has an attribute A1 162 from the subclass entity A1. The entity R2 146 becomes the sub-element E2* 164 having an attribute A2 166 from the subclass entity A2. The entity R3 148 becomes the sub-element E3* 168 having an attribute A3 170 from the subclass entity A3. The “*” operator identifies sub-elements that can occur from zero to n times under a parent element. The procedure is reversible and the relational schema may be obtained from the XML Tree Model.
Step 2.4: Mapping Cardinality Between Relational and XML Databases
One-to-one cardinality indicates that a foreign key of a child table refers to a primary key of a parent table in one to one occurrence. One-to-many cardinality indicates that a primary key of a parent table is referred by many foreign keys of a child table in one to many occurrences. Many-to-many cardinality indicates that a primary key of a parent table is referred by many foreign keys of a child table and vice versa.
Table 5 below illustrates that the functional dependency and multi-valued dependency of relational schema are preserved in the translated XML schema used in the three above-described cases of cardinality as shown in
Examples of preferred algorithms of schema translation between relational and XML for use in the methods illustrated in
Step 2.5: Mapping Aggregation Between Relational and XML Databases
An aggregation specifies a whole-part relationship within an aggregate such that an entity represents the whole of the aggregate and a constituent entity represents part of the aggregate. The aggregate may be taken as an entity which is mapped into an element. A DTD may be used to construct the part relationships in the element content.
In the methods of
Examples of preferred algorithms of schema translation between relational and XML for use in the methods of
Step 2.6: Mapping Is-a Relationship Between Relational and XML Databases
The is-a relationship relates a subclass entity to a superclass entity such that the data in the subclass must be included in the superclass. The superclass and subclass must also have the same domain value, which is why they can be related in an is-a relationship.
In DTD, each subclass entity may be transformed as a child element which refers to its parent element such that each parent element can have zero to one child elements.
The procedure is reversible and the relational schema may be obtained from the XML Tree Model.
In the methods of
Examples of preferred algorithms of schema translation between relational and XML for use in the method of
Step 2.7: Mapping Generalisation Between Relational and XML Database
The generalisation defines a relationship between entities to build a taxonomy of classes: One entity is a more general description of a set of other entities. In DTD, the general superclass entity may be transformed into an element, the element type originating from the superclass.
In the methods of
Examples of preferred algorithms of schema translation between relational and XML for use in the methods of
Step 2.8: Mapping Categorisation Between Relational and XML Databases
A subclass table is a subset of a categorisation of its superclass tables in which the data occurrence of the subclass table appears in one and only one superclass table. In DTD, the superclass may be transformed into an element, and the common subclass into a sub-element. Each element receives an additional “artificial” ID attribute declared as #REQUIRED referred by the common sub-element's IDREF.
In the methods of
Examples of preferred algorithms of schema translation between relational and XML for use in the methods of
Step 2.9: Mapping N-Ary Relationship Between Relational and XML Databases
Multiple tables relate to each other in an n-ary relationship. An n-ary relationship is represented by a relationship relation whose compound primary key is made up of components that refer to the primary keys of the related tables. In DTD, the entities in the n-ary relationship may be transformed as shown in
In the methods of
Examples of preferred algorithms of schema translation between relational and XML for use in the methods of
Thus, in step 2 described above, the data dependencies constraints in the relational schema may be mapped into XML Tree Models and the declarations of elements and attributes are mapped into DTD. In the process, the various data semantics of cardinality, participation, aggregation, generalisation, and categorisation are preserved in the hierarchical containment elements and attributes of the XML documents.
Step 3: Data Conversion from Relational Database into XML Documents
According to a preferred embodiment of the present invention, after schema translation, data conversion may then be carried out by loading relational data into XML documents. Tuples of the relational tables may be loaded into the object instances of elements in the XML documents according to their constraints.
According to preferred embodiments of the invention, the method preferably preserves the structural constraints (cardinality and participation) of the relationships from the underlying relational database source and represents the flat relation structures in a compact nested XML structure.
As the result of the schema translation in step 2 described above, an EER model may be translated into different embodiments of XML schemas based on the selected root elements. For each translated XML schema, the corresponding source relation may be read sequentially by embedded SQL, that is, one tuple at a time, starting from a parent relation. The tuple can then be loaded into an XML document according to the mapped XML DTD. The corresponding child relation tuple(s) may then be read, and loaded into the XML document. According to preferred embodiments, corresponding parent and child relations in the source relational database are processed according to the translated parent and child elements in the mapped DTD.
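The parent-then-child loading order described above can be illustrated with a minimal Python sketch using the standard `sqlite3` and `xml.etree.ElementTree` modules in place of embedded SQL. The table names `patient` and `record_folder`, their columns and the element names are hypothetical and are assumed only for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_relations_into_xml(conn: sqlite3.Connection) -> ET.Element:
    """Load parent tuples, then their child tuples, into nested XML elements."""
    root = ET.Element("Patient_Records")

    # Read the parent relation sequentially, one tuple at a time.
    parents = conn.execute(
        "SELECT patient_id, name FROM patient ORDER BY patient_id")
    for patient_id, name in parents:
        parent_element = ET.SubElement(
            root, "Patient", {"patient_id": str(patient_id), "name": name})

        # Read the child tuples that refer to this parent tuple and nest
        # them under the parent element (containment replaces the foreign key).
        children = conn.execute(
            "SELECT folder_no FROM record_folder "
            "WHERE patient_id = ? ORDER BY folder_no", (patient_id,))
        for (folder_no,) in children:
            ET.SubElement(parent_element, "Record_Folder",
                          {"folder_no": str(folder_no)})
    return root
```

The resulting tree can be serialised with `ET.tostring(root, encoding="unicode")`. In this sketch the foreign key `patient_id` of the child relation is not copied into the child element, since the relationship is carried by the nesting.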
Step 3.1: Convert Relations into a DOM (Document Object Model) or a JDOM (Java Document Object Model)
According to each data semantic, relations may be converted into DOMs as follows:
Data Conversion Algorithm:
Step 3.2: Integrating XML Documents Using DOMs
An XML structure may be represented as a linked list where one element follows another. A DOM technique may be employed for implementation. In a preferred embodiment, each set of relations in a data dependencies relationship may be translated into a DOM. These DOM nodes may then be integrated into a single DOM node, and translated into an XML document using, for example, the following algorithm:
Step 3.3: Manipulating XML Documents Using DOMs
In a preferred embodiment, when a DOM parser reads an XML document, it first creates a document object, from which the whole XML document may be traversed. During the merging of two or more DOMs, every element/node in one DOM may be compared with those in the others, with respect not only to structure, such as the parent/child relationship, but also to value. A search algorithm such as the procedure getNode set out below may be defined for matching elements/nodes within n DOMs. The algorithm may match the same elements in a document. First, the XML database is inspected and the node list that contains the desired elements is derived.
Algorithm of Procedure getNode
To integrate DOMs, one main DOM tree is focussed upon, and duplicate elements in the other two are deleted after their child elements have been appended to the main element in the program. Duplicate elements are deleted not only to avoid double checking every time the procedure getNode( ) is run, but also to avoid duplicate appending. The integration algorithm may be divided into search, deletion and insertion as follows:
Algorithm of the Procedure of Integration
The above integration algorithm checks the property of Node first. According to different Node types, TEXT Node will be checked within the other two nodes. When the function finishes its job, an integrated DOM is created.
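The general idea of integrating several DOMs into one main DOM can be illustrated with a minimal Python sketch using the standard `xml.dom.minidom` module. This is not the getNode or integration procedure of this specification: the matching rule (same tag name and same attribute values for elements, same stripped text for TEXT nodes) and all function names are assumptions, and the sketch leaves the other DOMs unchanged instead of deleting their duplicate elements.

```python
from typing import List
from xml.dom import Node, minidom


def _same_element(a, b) -> bool:
    """Two element nodes match if tag name and all attribute values agree."""
    return (a.nodeType == Node.ELEMENT_NODE
            and b.nodeType == Node.ELEMENT_NODE
            and a.tagName == b.tagName
            and dict(a.attributes.items()) == dict(b.attributes.items()))


def _find_match(parent, candidate):
    """Search the children of parent for a node equivalent to candidate."""
    for child in parent.childNodes:
        if candidate.nodeType == Node.ELEMENT_NODE:
            if _same_element(child, candidate):
                return child
        elif candidate.nodeType == Node.TEXT_NODE:
            if (child.nodeType == Node.TEXT_NODE
                    and child.data.strip() == candidate.data.strip()):
                return child
    return None


def merge_into(main_parent, other_parent) -> None:
    """Append to main_parent whatever other_parent has that it lacks."""
    document = main_parent.ownerDocument
    for node in other_parent.childNodes:
        if node.nodeType == Node.TEXT_NODE and not node.data.strip():
            continue  # ignore whitespace-only text
        match = _find_match(main_parent, node)
        if match is None:
            # Insertion: copy the unmatched subtree into the main DOM.
            main_parent.appendChild(document.importNode(node, True))
        elif node.nodeType == Node.ELEMENT_NODE:
            # Duplicate element: merge its children instead of appending it.
            merge_into(match, node)


def integrate(xml_documents: List[str]):
    """Integrate XML documents with the same root into the first one."""
    doms = [minidom.parseString(text) for text in xml_documents]
    main = doms[0]
    for other in doms[1:]:
        if other.documentElement.tagName != main.documentElement.tagName:
            raise ValueError("documents must share the same root element")
        merge_into(main.documentElement, other.documentElement)
    return main
```

Recursing into a matched element, instead of appending it again, is what prevents duplicate appending in this sketch.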
In a preferred embodiment of the present invention, the above steps 3.1 to 3.3 may be carried out by converting relations into a JDOM (Java Document Object Model).
Step 4: Conversion from an XML Database into a Relational Database
As the result of the schema translation in step 2, an XML Tree Model may be translated into an EER model. For each element in the source XML schema, its corresponding target relation may be located. The data occurrence of an element may then be loaded into the tuple of the relation according to the data semantic. Element and sub-element data occurrences in the source XML database may be processed according to the translated parent and child relations in the mapped relational schema with a template as shown below:
An example of a Data Conversion algorithm which may be used in this step is as follows:
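By way of illustration only, and not as the algorithm of this specification, the loading of element and sub-element data occurrences into parent and child relations can be sketched in Python with the standard `sqlite3` and `xml.etree.ElementTree` modules. The element names, table names and columns are hypothetical. The sketch assumes that containment in the XML document corresponds to a foreign key in the child relation.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_xml_into_relations(xml_text: str, conn: sqlite3.Connection) -> None:
    """Load Patient elements and their Record_Folder sub-elements into tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patient ("
        "patient_id TEXT PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS record_folder ("
        "folder_no TEXT PRIMARY KEY, "
        "patient_id TEXT NOT NULL REFERENCES patient(patient_id))")

    root = ET.fromstring(xml_text)
    with conn:  # one transaction: commit on success, roll back on error
        for patient in root.findall("Patient"):
            patient_id = patient.get("patient_id")
            # The parent element becomes a tuple of the parent relation.
            conn.execute("INSERT INTO patient VALUES (?, ?)",
                         (patient_id, patient.get("name")))
            for folder in patient.findall("Record_Folder"):
                # The containing element supplies the foreign key value.
                conn.execute("INSERT INTO record_folder VALUES (?, ?)",
                             (folder.get("folder_no"), patient_id))
```

Loading the parent tuple before its child tuples keeps the foreign key of each child tuple valid at the time it is inserted.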
Updating the Databases
To update replicate relational databases and XML databases, a synchronization update may be performed by translating a source relational database program into a target XML database program, and then processing them concurrently. These replicate relational databases and XML databases continue to support the relational database while developing an information-capacity-equivalent XML database for the same application. An incremental mapping from the relational database to the XML database may be maintained. In a preferred embodiment, applications on the relational database may be rewritten and moved to the XML database.
The basic idea of the synchronization update of a pair of relational and XML databases according to a preferred embodiment of the present invention is shown in
At the relational site, DML statements that update the relational database may be monitored. Every time an update operation changes the relational database, the changes may also be recorded in the corresponding XML database by the translated XML database program. This may be implemented by transforming the update transactions of relational database program into the update transactions of the replicate XML database program which perform the same operations on the database as did the original relational database programs. The pseudo code for the overall algorithm of synchronization may be as follows:
Algorithm
Update Transaction Translation from SQL to DOM
Three update transactions: insert, update and delete may be translated from SQL to DOM as follows:
For the INSERT Transaction:
Attribute values may be specified for a sub-element instance to be inserted in an element Ek. The values for attributes corresponding to fields in Rk may be denoted by v1, v2, . . . vn and the values of the foreign keys in Rk may be denoted by V1, V2, . . . Vn.
An example of a suitable algorithm is as follows:
The syntax of insert algorithm may be as set out in Table 12 below:
For the UPDATE Transaction:
If it is desired, for example, to replace the value of an attribute A in the element Ek with the value V, basically, two cases are considered. In the first case, A is not a foreign key. It corresponds to a data item in the corresponding relation R and thus a DOM command is required to perform the replacement in the XML database. In the second case, A is a foreign key. Replacing a value in this case involves changing the element sub-element relationship rather than the attribute value. Value (A) is the content of attribute A in the relation type R before update.
An example of a suitable algorithm for use in this procedure is as follows:
The syntax of update algorithm is:
For the DELETE Transaction:
A simple delete-only statement in the relational database corresponds to the XML database delete statement for a given XML schema. The delete-sub-element-Ek-only statement has the following properties:
1. Remove sub-element Ek from all elements in which it participates as a sub-element
2. Do not remove sub-element Ek for each element where Ek participates as an element.
An example of a suitable algorithm for use in this procedure is as follows:
The syntax of delete algorithm is:
After converting the relational database into the XML database, in order to synchronise the update of these two databases, the update transaction of the relational database program may be translated into the update transaction of the XML database program. Once translated, these two programs may update both the relational database and the XML database concurrently for synchronised updating.
As shown in
Asynchronous Update Transactions, Translation and Processing of SQL and JDOM
In a further preferred embodiment of the present invention, after converting the relational database into an XML database, the two databases may be updated asynchronously by translating and processing relational database transactions into XML database transactions. Once translated, the update transactions may be processed asynchronously, firstly the SQL and then, for example, the Java Document Object Model (JDOM). In the pre-process, the data to be transmitted on the web is extracted from the relational database. The data is then converted/replicated into an XML document which is stored in a replicate XML database. Each translated JDOM update transaction is to be processed after each successful SQL update transaction.
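The ordering rule stated above, in which each translated update on the XML side is processed only after the corresponding SQL update has succeeded, can be illustrated with a minimal Python sketch. The standard `xml.dom.minidom` module stands in for JDOM, and the function name `process_update` and its parameters are hypothetical.

```python
import sqlite3
from typing import Callable, Sequence


def process_update(conn: sqlite3.Connection, doc, sql: str,
                   params: Sequence, dom_update: Callable) -> bool:
    """Run the SQL update first; apply the DOM update only if it succeeded.

    dom_update is the translated update transaction, given as a callable
    that receives the DOM document.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
    except sqlite3.Error:
        return False  # the XML document is left unchanged
    dom_update(doc)
    return True
```

If the SQL statement fails, for example on a duplicate primary key, the callable is never invoked and the XML document is not changed.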
Three update transactions, namely, insert, update and delete may be translated from SQL to DOM as follows:
For the INSERT Transaction:
To insert a sub-element into an element Ek its attribute values should be specified. The values for attributes corresponding to fields in Rk may be denoted by v1, v2, . . . vn and the values of the foreign keys in Rk may be denoted by V1, V2, . . . , Vn and the non-key values may be denoted by N1, . . . , Nn.
An example of a suitable algorithm is as follows:
The syntax of insert algorithm is:
For the UPDATE Transaction:
If the value (A) of an attribute A in the relation R is to be replaced with the value V in the element Ek of the translated XML document (database) X, basically, two cases may be considered. In the first case, attribute A is not a foreign key but instead corresponds to a data item in the corresponding relation R. A JDOM command is required to perform the replacement in the XML database. In the second case, attribute A is a foreign key. Replacing a value in this case involves changing the element sub-element relationship rather than the attribute value in the translated XML database.
An example of a suitable algorithm is as follows:
The syntax of update algorithm is:
For the DELETE Transaction:
A simple delete-only statement in the RDB corresponds to the XML database delete statement for a given XML schema. The delete-sub-element-Ek-only statement has the following properties:
1. Remove sub-element Ek from all elements in which it participates as a sub-element
2. Do not remove sub-element Ek for each element where Ek participates as an element.
An example of a suitable algorithm is as follows:
The syntax of delete algorithm is:
Two case studies are described below to illustrate the implementation of the above described preferred embodiments of the present invention.
Case Study 1:
This case study is of a Hospital Database System. An EER of the system is shown in
In the following, underlined italic text denotes a primary key and * denotes a foreign key.
By following the stepwise procedures according to preferred embodiments of the present invention as described in Steps 1 to 4 above in connection with FIGS. 1 to 13b, the relational schema of this case study may be converted into the XML Schema as follows:
Step 1: Reverse Engineer Relational Schema into an EER Model
By using a classification table, the EER model shown in
Step 2.1: Define a Root Element
As this case study is about patients' records, it is advisable to use a meaningful name for the root element. The entity name, ‘Patient’, should preferably not be used as the root element name because it is desired to hold all the patient records in one XML file. Another reason is that it may be desirable to add some other attributes to the root element to describe the system itself. Thus, it is preferable to use Patient_Records as the root element for the DTD:
XML Schemas
<!ELEMENT Patient_Records (Patient)+>
Starting from the entity Patient 582 in the EER model of
An XML Tree Model that starts from the entity Patient_Records may then be formed and this is shown in
The root element Patient_Records 620 has the entity Patient 622 as a direct child. The entity Patient 622 has the entities Borrower 624, Borrow 626 and Record Folder 628 related to it. The entities Borrower 624 and Borrow 626 are each in a zero-to-many relationship 630, 632 with the entity Patient 622 and the entity Record Folder 628 is in a one-to-many relationship 634 with the entity Patient 622. The entity Record Folder 628 has the entity Medical Record 636 as a direct child. In the XML Tree Model, the element Medical Record 636 is in a relationship 644 with either Outpatient 638, Ward 640 or AE 642.
As the entities Record Folder 628 and Medical Record 636 are navigable from the Patient entity 622, all those entities may then be mapped into the elements of the XML schema. The attributes of those elements may be defined by using the definition of the relational schema as shown below in Table 26:
Step 2.2: Map Weak Entity into the Content Model
This is not applicable in this case study.
Step 2.3: Map Participation into the Content Model
The relationship between the entities Patient 622 and the Record Folder 628 is total participation. The relationship between the entities Record Folder 628 and Medical Record 636 is also total participation. Therefore, the content model of the XML schema is translated as shown below in Table 27. Not all foreign keys in the relational schema will be mapped into XML schema as they will be represented in containment or ID and IDREF.
Step 2.4: Map Cardinality into the Content Model
The relationship between the entities Borrower 624 and Borrow 626, and the entity Record_Folder 628 is many-to-many cardinality as a borrower may borrow many record folders and a record folder may be borrowed by many borrowers. In this many-to-many cardinality, the relationship between the entities Borrow and Borrower will not be included for the purposes of this case study as they are in a many-to-one relationship. The translated XML schema together with the many-to-many relationship is shown below in Table 28:
As the entity Loan_History shown in
Step 2.5: Map Aggregation into the Content Model
This step is not applicable in this case study.
Step 2.6: Map Is-a into the Content Model
This step is not applicable in this case study.
Step 2.7: Map Generalisation into the Content Model
As the medical record may be an AE, a ward or an outpatient record, it is a disjoint generalisation. The translated XML schema for the entity Medical Record may be as shown below in Table 29:
Step 2.8: Map Categorisation into the Content Model
Although there is a categorisation in this case study, it is not navigable from the entity Patient. Thus this step is not applicable for this case study.
Step 2.9: Map N-Ary Relationship into the Content Model
This step is not applicable in this case study.
As a result, the final XML DTD and example of XML document are as follows:
The Translated XML DTD
Step 3: Data Conversion from Relational Database into XML Document
As a result of schema translation in step 2, relational data may be loaded into an XML document as follows:
An Example of XML Document is:
Case Study 2
This case study is for a bank loan application. In this study, a loan with an identity number belongs to a customer who has a customer identity number. Customers have mortgage loans secured by loan securities. Each loan interest type may be accrued by multiple interest types. Each interest type may be assigned to different loans. Customers open accounts at different branches with a maturity date. Each loan is charged with interest of a rate of an interest type. All of these may be described in an extended entity relationship model such as that shown in
Starting from the entity Loan 672 in the EER model of
The relational schemas for this case study are shown in Tables 30 to 37 below.
Transforming Relational Database into XML Documents:
(a) Schema Translation from Relational to Topological XML Tree Model
After classifying each attribute in a classification table, their constraints may be derived as set out in Table 38:
(i) Map Relational Schema into Group Topological XML Tree Model
The relations R1, R2 and R3 of the relational schema, where R1 is defined by R1(*Customer, *Security), R2 by R2(*Security, *Loan, Maturity_Date) and R3 by R3(*Loan, *Customer), are joined into the relation R(*Customer, *Security, *Loan, Maturity_Date). The relation R is then transformed into a group of elements in an XML Tree Model.
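The joining of R1, R2 and R3 into the relation R can be illustrated with a minimal Python sketch of a natural join over tuples held as dictionaries. The function name `natural_join` and the data values are hypothetical and serve only as an example.

```python
from typing import Dict, List

Row = Dict[str, str]


def natural_join(left: List[Row], right: List[Row]) -> List[Row]:
    """Join two relations on all attributes that they have in common."""
    result = []
    for left_row in left:
        for right_row in right:
            common = left_row.keys() & right_row.keys()
            if all(left_row[a] == right_row[a] for a in common):
                # Merge the two tuples; common attributes agree by construction.
                result.append({**left_row, **right_row})
    return result


# Hypothetical tuples for R1(Customer, Security),
# R2(Security, Loan, Maturity_Date) and R3(Loan, Customer).
r1 = [{"Customer": "C1", "Security": "S1"},
      {"Customer": "C2", "Security": "S2"}]
r2 = [{"Security": "S1", "Loan": "L1", "Maturity_Date": "2030-01-01"},
      {"Security": "S2", "Loan": "L2", "Maturity_Date": "2031-06-30"}]
r3 = [{"Loan": "L1", "Customer": "C1"}]

# R(Customer, Security, Loan, Maturity_Date) = R1 join R2 join R3
r = natural_join(natural_join(r1, r2), r3)
```

With these sample tuples, only the combination for customer C1 survives the final join, because R3 contains no tuple relating loan L2 to customer C2.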
(ii) Map Relational Schema into Multiple Topological XML Tree Model
The relations R1 and R2 of the relational schema, defined as R1(*Customer, Credit_Card) and R2(*Customer, Debit_Acct), are joined into the relation R(*Customer, Credit_Card, Debit_Acct). The relation R is then transformed into a group of sub-elements of multiple occurrences in an XML Tree Model.
(iii) Map Relational Schema into a Single Sub-Element Topological XML Tree Model
The relational schema comprises the relations R1 and R2, defined in this case as R1(Type, Enter_Date, Description) and R2(*Type, Effective_Date, Rate). These relations are mapped into a relation R(Type, Effective_Date, Enter_Date, Rate, Description). The relation R is then transformed into a single sub-element topological XML Tree Model.
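A single sub-element mapping of this kind might be sketched in Python as below, with one `Interest_Rate` sub-element nested inside its `Interest_Type` parent. The sample values and the function name `nest_interest_rate` are assumptions; the element and attribute names follow the interest type and interest rate attributes named in this case study.

```python
import xml.etree.ElementTree as ET

def nest_interest_rate(type_row, rate_row):
    """Nest a rate tuple as the single sub-element of its interest type."""
    if type_row["Type"] != rate_row["Type"]:
        raise ValueError("rate tuple does not belong to this interest type")
    parent = ET.Element("Interest_Type", {
        "type": type_row["Type"],
        "enter_date": type_row["Enter_Date"],
        "description": type_row["Description"],
    })
    ET.SubElement(parent, "Interest_Rate", {
        "effective_date": rate_row["Effective_Date"],
        "rate": rate_row["Rate"],
    })
    return parent

# Hypothetical tuples of R1 and R2.
interest_type = {"Type": "Fixed", "Enter_Date": "2004-01-01", "Description": "Fixed rate"}
interest_rate = {"Type": "Fixed", "Effective_Date": "2004-02-01", "Rate": "5.25"}

element = nest_interest_rate(interest_type, interest_rate)
```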
(iv) Map Relational Schema into a referral Topological XML Tree Model
The relational schema comprising the relation R(Loan_ID, Type) is mapped into a referral topological XML Tree Model as shown in
Finally, the above translated XML Tree Models of FIGS. 18 to 21 are integrated into an XML Tree Model as shown in
The XML Tree Model of
The element mortgage 732 also has the sub-element Interest Type 758. The sub-element Interest Type 758 has the sub-element Interest_Rate 760 and the attributes type 762, enter_date 764, and description 766. The sub-element Interest_Rate 760 has the attributes effective_date 768 and rate 770. The element Loan 712 having the idref 750 refers to the element Interest Type 720 having the ID 756.
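The referral between Loan and Interest Type can be illustrated with a short Python sketch. `xml.etree.ElementTree` does not validate ID/IDREF attributes, so the reference is resolved here with a plain dictionary lookup. The attribute names `type_id` and `type_ref`, the sample values, and the helper `resolve_interest_type` are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document fragment: Loan refers to Interest_Type by identifier.
bank = ET.Element("Bank")
ET.SubElement(bank, "Interest_Type", {"type_id": "T1", "description": "Fixed rate"})
ET.SubElement(bank, "Interest_Type", {"type_id": "T2", "description": "Floating rate"})
ET.SubElement(bank, "Loan", {"loan_id": "L1", "type_ref": "T2"})

def resolve_interest_type(root, loan):
    """Follow the loan's reference to the Interest_Type element it names."""
    index = {e.get("type_id"): e for e in root.iter("Interest_Type")}
    return index.get(loan.get("type_ref"))  # None if the reference dangles

loan = bank.find("Loan")
target = resolve_interest_type(bank, loan)
```

A referral keeps Loan and Interest Type as sibling elements rather than nesting one inside the other, so one interest type can be shared by several loans.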
(b) Map XML Tree Model into XML Schema
In this case study, Bank is selected as the root of the XML document for the application. Then the integrated XML Tree Model may be mapped into an XML Schema (DTD) as follows:
(c) Data Conversion from Relational to XML Document
Case (i): Relations→Group Topological XML Document:
To convert the data from the relational database into the XML document, firstly a reorganized relation R1(*Customer, *Security, *Loan, Maturity_Date) is loaded into a group of element data instances in an XML document (1) as follows:
Relation R1 = Relation Customer-Security ⋈ Relation Security-Loan ⋈ Relation Loan-Customer
Relation R1 is shown in Table 39.
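A data-level version of this step could look like the following Python sketch, which joins three tables in an in-memory SQLite database and loads each joined tuple into a group element. The table names, the sample rows, the element names `Mortgages` and `Mortgage`, and the choice to store the maturity date as an attribute are assumptions, not the layout of Table 39.

```python
import sqlite3
import xml.etree.ElementTree as ET

def load_group_document(conn):
    """Join the three relations and load each joined tuple into a group element."""
    rows = conn.execute(
        """
        SELECT cs.Customer, cs.Security, sl.Loan, sl.Maturity_Date
        FROM Customer_Security AS cs
        JOIN Security_Loan AS sl ON sl.Security = cs.Security
        JOIN Loan_Customer AS lc ON lc.Loan = sl.Loan AND lc.Customer = cs.Customer
        ORDER BY cs.Customer, cs.Security, sl.Loan
        """
    ).fetchall()
    root = ET.Element("Mortgages")  # hypothetical wrapper element
    for customer, security, loan, maturity_date in rows:
        group = ET.SubElement(root, "Mortgage", {"maturity_date": maturity_date})
        ET.SubElement(group, "Customer").text = customer
        ET.SubElement(group, "Security").text = security
        ET.SubElement(group, "Loan").text = loan
    return root

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer_Security (Customer TEXT, Security TEXT);
    CREATE TABLE Security_Loan (Security TEXT, Loan TEXT, Maturity_Date TEXT);
    CREATE TABLE Loan_Customer (Loan TEXT, Customer TEXT);
    INSERT INTO Customer_Security VALUES ('C1', 'S1');
    INSERT INTO Security_Loan VALUES ('S1', 'L1', '2010-12-31');
    INSERT INTO Loan_Customer VALUES ('L1', 'C1');
    INSERT INTO Loan_Customer VALUES ('L2', 'C1');
    """
)
document = load_group_document(conn)
xml_text = ET.tostring(document, encoding="unicode")
```

Loan `L2` has no matching security in the sample data, so it drops out of the join and only one group element is produced.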
Case (ii): Relations→Multiple Sub-Element Topological XML Document:
To convert the data from the relational database into the XML document, secondly, a reorganized relation R2(*Customer, Credit_Card, Debit_Acct) is loaded into a multiple sub-element topological XML document (2) as follows:
Relation R2 = Relation Customer-Credit_Card ⋈ Relation Customer-Debit_Acct
Relation R2 is shown in Table 40.
XML Document (2)
Case (iii): Relations→Single Sub-Element Topological XML Document:
To convert the data from the relational database into the XML document, thirdly a reorganized relation R3(Type, Effective_Date, Enter_Date, Rate, Description) is loaded into a single sub-element topological XML document (3) as follows:
Relation R3 = Relation Interest_Type ⋈ Relation Interest_Rate
Relation R3 is shown in Table 41.
XML Document (3)
Case (iv): Relations→Referral Topological XML Document:
To convert the data from the relational database into the XML document, fourthly, a reorganized relation R4(Loan, Type) is loaded into a referral topological XML document (4) as follows:
Relation R4=Relation Loan-Interest_Type
Relation R4 is shown in Table 42.
XML Document (4)
Then all of the above relations are integrated into an XML document by use of a DOM tree as follows:
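One way such an integration could be sketched uses Python's `xml.dom.minidom`. The fragment documents below are hypothetical stand-ins for XML documents (1) to (4), and the function name `integrate` is an assumption; the root name `Bank` follows the root selected earlier in this case study.

```python
from xml.dom import minidom

def integrate(fragments, root_name="Bank"):
    """Import each fragment's root element under one shared root in a new DOM tree."""
    doc = minidom.Document()
    root = doc.createElement(root_name)
    doc.appendChild(root)
    for xml_text in fragments:
        fragment = minidom.parseString(xml_text)
        # importNode copies the subtree so that it is owned by the target document.
        root.appendChild(doc.importNode(fragment.documentElement, True))
    return doc

# Hypothetical fragments standing in for the four translated documents.
fragments = [
    '<Mortgage maturity_date="2010-12-31"><Loan>L1</Loan></Mortgage>',
    '<Customer id="C1"><Credit_Card>CC-100</Credit_Card></Customer>',
    '<Interest_Type type="Fixed"><Interest_Rate rate="5.25"/></Interest_Type>',
]

integrated = integrate(fragments)
integrated_xml = integrated.documentElement.toxml()
```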
Update Transactions from SQL to XML Document
At the relational database (RDB) site, DML statements that update the relational database are monitored. Every time an update operation (insert/delete/update) changes the relational database, the changes may also be applied to the corresponding XML database. This may be implemented by transforming the update transactions of the relational database program into the update transactions of the replicate XML database program which perform the same operations on the database as did the original relational database programs.
Update Transaction Translation from SQL to DOM
Three update transactions, namely insert, update and delete may be translated from SQL to DOM as follows. The example given shows the actual SQL and its translated DOM statements.
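A minimal sketch of how the three transactions might be paired, assuming a hypothetical `Customer(Customer_ID, Name)` table in SQLite and a DOM document whose root holds `Customer` elements with `id` and `name` attributes. None of these names come from the patent's own listings; the customer name "Tomi" is taken from this case study.

```python
import sqlite3
from xml.dom import minidom

def find_customers(doc, cust_id):
    """Return the Customer elements whose id attribute matches."""
    return [e for e in doc.getElementsByTagName("Customer")
            if e.getAttribute("id") == cust_id]

def insert_customer(conn, doc, cust_id, name):
    conn.execute("INSERT INTO Customer (Customer_ID, Name) VALUES (?, ?)", (cust_id, name))
    elem = doc.createElement("Customer")          # DOM counterpart of the SQL INSERT
    elem.setAttribute("id", cust_id)
    elem.setAttribute("name", name)
    doc.documentElement.appendChild(elem)

def update_customer(conn, doc, cust_id, new_name):
    conn.execute("UPDATE Customer SET Name = ? WHERE Customer_ID = ?", (new_name, cust_id))
    for elem in find_customers(doc, cust_id):     # DOM counterpart of the SQL UPDATE
        elem.setAttribute("name", new_name)

def delete_customer(conn, doc, cust_id):
    conn.execute("DELETE FROM Customer WHERE Customer_ID = ?", (cust_id,))
    for elem in find_customers(doc, cust_id):     # DOM counterpart of the SQL DELETE
        elem.parentNode.removeChild(elem)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Customer_ID TEXT PRIMARY KEY, Name TEXT)")
doc = minidom.parseString("<Bank/>")

insert_customer(conn, doc, "C1", "Tomi")
insert_customer(conn, doc, "C2", "Ann")
update_customer(conn, doc, "C2", "Anna")
delete_customer(conn, doc, "C1")

remaining_rows = conn.execute("SELECT Customer_ID, Name FROM Customer").fetchall()
remaining_xml = doc.documentElement.toxml()
```

Each function applies the SQL statement and its DOM counterpart together, so the relational database and the XML document stay in step after every transaction.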
Firstly, if it is desired to insert a new record into the RDB, the corresponding change is applied to the XMLDB simultaneously, as shown below:
FIGS. 23 to 31 show sample display screens which may be used for Case Study 2 described above during the implementation of the above-described method embodying the present invention.
Finally, if it is desired to delete the customer “Tomi”, a ‘Find’ operation may be executed to find the customer to be deleted, then the above-described Delete operation may be applied directly. A message box may appear on the display to show that the record has been deleted (as shown in the display screen 784 of
In summary, one or more preferred embodiments of the present invention provide a method for converting a relational database into one or more XML documents according to its topology mapping. The schema translation and data conversion procedures are provided with steps and mapping rules to recover the data constraint semantics of the relational database into an Extended Entity Relationship model, which may then be mapped into an XML Tree Model and an XML schema. The target XML schema may be presented as a DTD. The constraints of the relational schema, expressed as functional dependencies, inclusion dependencies and multi-valued dependencies, may be represented in the translated XML schema. The translation may be constructed through an extracted XML view of the relational database, which may be based on a selection of its root element (an entity) and its relevant and navigable elements (the selected entity plus its navigable entities) to fulfil the data requirement of an XML document. The translation process involves mapping each constraint of the relational schema into a hierarchical containment of the XML Tree Model. The conversion is preferably capable of preserving the original relational database constraints. The resulting XML structure is thereby able to reflect the semantics and structure of the underlying relational database.
One or more preferred embodiments of the present invention may assist in improving the performance of Internet computing by allowing parallel processing for data exchange on the Internet as well as data processing of relational data. Also, the reliability of an XML database may be improved by recovery from its counterpart relational database.
Various modifications to the embodiments of the present invention described above may be made. For example, other components and method steps can be added or substituted for those above. Thus, although the invention has been described above using particular embodiments, many variations are possible within the scope of the claims, as will be clear to the skilled reader, without departing from the spirit and scope of the invention.
Claims
1. A method of translating a relational database into an XML document comprising the steps of:
- generating an Extended Entity Relationship (EER) model from relational schema associated with said relational database;
- applying a schema translation process to the Extended Entity Relationship model to map the relational schema into a Document Type Definition (DTD) of an XML schema;
- generating an XML Tree Model from said Document Type Definition representative of one or more data semantics of the relational schema which are preserved as one or more data semantics in said XML document; and
- converting relational data from said relational database into said XML document using said relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
2. A method according to claim 1, wherein the step of generating said Extended Entity Relationship model comprises reverse-engineering logical relational schema associated with said relational database.
3. A method according to claim 1, wherein the step of applying a schema translation process comprises mapping the relational schema with associated relational schema constraints into said Document Type Definition.
4. A method according to claim 1, wherein the step of applying a schema translation process comprises mapping the relational schema into a topological XML Document Type Definition.
5. A method according to claim 1, wherein the step of applying a schema translation process comprises defining a root element prior to mapping the relational schema into said Document Type Definition (DTD), said root element being representative of an element in said relational database.
6. A method according to claim 5, wherein the step of defining a root element comprises selecting said root element.
7. A method according to claim 5, wherein the step of defining a root element comprises selecting said root element from a relational entity table in said relational database.
8. A method according to claim 5, wherein said relational database comprises one or more entities, and said XML document comprises said root element, and one or more sub-elements, and wherein the step of applying a schema translation process further comprises one or more of the following steps after defining said root element:
- (a) mapping a weak entity from said relational database into said XML document;
- (b) mapping participation between entities in said relational database into said XML document;
- (c) mapping cardinality between entities in said relational database into said XML document;
- (d) mapping aggregation between entities in said relational database into said XML document;
- (e) mapping one or more is-a relationships between entities in said relational database into said XML document;
- (f) mapping one or more generalisations between entities in said relational database into said XML document;
- (g) mapping one or more categorisations between entities in said relational database into said XML document; and
- (h) mapping one or more single and/or multiple (n-ary) relationships between entities in said relational database into said XML document.
9. A method according to claim 1, wherein said relational database comprises one or more entities, and said XML document comprises said root element, and one or more sub-elements, and wherein the step of applying a schema translation process comprises mapping related entities in said relational database into relevant elements in said XML document based on navigability of the entities.
10. A method according to claim 1, wherein the step of converting relational data from said relational database into said XML document comprises:
- (a) converting one or more relations associated with relational data in said relational database into a Document Object Model (DOM); and
- (b) manipulating said XML document using said Document Object Model.
11. A method according to claim 10, wherein said Document Object Model (DOM) is a Java Document Object Model (JDOM).
12. A method according to claim 5, wherein each mapping step generates a new XML document, and wherein the step of converting relational data from said relational database into said XML document comprises:
- (a) converting one or more relations associated with relational data in said relational database into a Document Object Model (DOM); and
- (b) integrating XML documents using said Document Object Model to form an XML database corresponding to said relational database.
13. A method according to claim 1 wherein the step of generating an XML Tree Model comprises generating a plurality of XML Tree Models representative of one or more data semantics of the relational schema.
14. A method according to claim 12, further comprising updating said relational database and said XML database by translating an update transaction from said relational database in Structured Query Language into an update transaction of said XML database as a Document Object Model.
15. A method according to claim 14, wherein said transactions update said relational database and said XML database concurrently to produce a synchronized update.
16. A method according to claim 14, wherein said updating of said relational database is effected prior to or after said update of said XML database to produce an asynchronized update.
17. A method according to claim 1, wherein the step of generating an Extended Entity Relationship (EER) model comprises recovering one or more data semantics associated with said relational schema from a classification table.
18. A method of translating an XML database into a relational database comprising the steps of:
- generating an XML Tree Model from said XML database;
- generating a Document Type Definition representative of one or more data semantics of an XML schema associated with said XML database;
- generating an Extended Entity Relationship (EER) model from said XML schema;
- applying a schema translation process to the Extended Entity Relationship model to map the XML schema into a relational schema representative of said relational database, said data semantics of said XML schema being preserved as one or more data semantics in said relational database; and
- converting XML data from said XML database into said relational database using said relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
19. A method according to claim 18, wherein the step of generating said Extended Entity Relationship model comprises reverse-engineering logical relational schema associated with said relational database.
20. A method according to claim 18, wherein the step of applying a schema translation process comprises mapping the XML schema with associated XML schema constraints into said Document Type Definition.
21. A method according to claim 18, wherein the step of applying a schema translation process comprises mapping a topological XML Document Type Definition into said Extended Entity Relationship.
22. A method according to claim 18, wherein said XML schema comprise one or more elements each having an associated data occurrence, and wherein the step of applying a schema translation process further comprises for each element in said XML schema, locating a corresponding target relation, and loading into a tuple of said target relation the data occurrence of said element according to one or more data semantics of said XML database.
23. A method according to claim 18, wherein the step of generating an XML Tree Model comprises generating a plurality of XML Tree Models representative of one or more data semantics of the XML schema.
24. A method according to claim 18, further comprising updating said relational database and said XML database by translating an update transaction from said XML database as a Document Object Model into an update transaction of said relational database in Structured Query Language.
25. A method according to claim 24, wherein said transactions update said relational database and said XML database concurrently to produce a synchronized update.
26. A method according to claim 24, wherein said updating of said relational database is effected prior to or after said update of said XML database to produce an asynchronized update.
27. A method of data transmission of relational data through an XML document between a sender and a receiver over a network comprising the method of claim 1 for translating relational data into an XML document, transmitting from said sender said XML document over said network, receiving at said receiver said XML document, and converting said XML document into a relational language used in said receiver.
28. A method according to claim 27, wherein the step of converting said XML document into a local relational schema used in said receiver comprises:
- generating an XML Tree Model from said XML document;
- generating a Document Type Definition representative of one or more data semantics of an XML schema associated with said XML document;
- generating an Extended Entity Relationship (EER) model from said XML schema;
- applying a schema translation process to the Extended Entity Relationship model to map the XML schema into said local relational schema representative of a relational database of said receiver, said data semantics of said XML schema being preserved as one or more data semantics in said relational database of said receiver; and
- converting XML data from said XML database into said local relational database using said local relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
29. A method according to claim 28, wherein said network is the Internet.
30. A method according to claim 28, wherein said network is the Internet.
31. A computer program comprising program instructions for causing a computer to perform the method of claim 1.
32. A computer program comprising program instructions for causing a computer to perform the method of claim 18.
33. A computer program product comprising the computer program of claim 31.
34. A computer program product comprising the computer program of claim 32.
35. A system arranged to perform the method of claim 1.
36. A system arranged to perform the method of claim 18.
37. A system of translating a relational database into an XML document comprising:
- an Extended Entity Relationship (EER) model generator for generating an Extended Entity Relationship (EER) model from relational schema associated with said relational database;
- means for applying a schema translation process to the Extended Entity Relationship model to map the relational schema into a Document Type Definition (DTD) of an XML schema;
- a generator for generating an XML Tree Model from said Document Type Definition representative of one or more data semantics of the relational schema which are preserved as one or more data semantics in said XML document; and
- a converter for converting relational data from said relational database into said XML document using said relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
38. A system of translating an XML database into a relational database comprising:
- an XML Tree Model generator for generating an XML Tree Model from said XML database;
- a Document Type Definition generator for generating a Document Type Definition representative of one or more data semantics of an XML schema associated with said XML database;
- an Extended Entity Relationship (EER) model generator for generating an Extended Entity Relationship (EER) model from said XML schema;
- means for applying a schema translation process to the Extended Entity Relationship model to map the XML schema into a relational schema representative of said relational database, said data semantics of said XML schema being preserved as one or more data semantics in said relational database; and
- a converter for converting XML data from said XML database into said relational database using said relational schema and said XML schema from said Document Type Definition and/or said XML Tree Model.
Type: Application
Filed: Feb 3, 2005
Publication Date: Aug 3, 2006
Inventor: Joseph Fong (Discovery Bay)
Application Number: 11/049,831
International Classification: G06F 7/00 (20060101);