Database management systems and methods using data normalization and defragmentation techniques
Improved systems and methods for database management using data normalization and defragmentation techniques are provided. At least one exchange processor in communication with an exchange computer system receives market data from the exchange computer system, processes the market information, and transmits the market data to a master processor. The master processor receives the market data, processes the data using at least one normalization process to generate normalized data including an intra-day file and an archival file, and stores the intra-day file and the archival file in the master database. The master processor transmits the intra-day file and the archival file to the at least one regional processor. The regional processor receives a request for information from a customer computer system in communication with the regional processor, queries the intra-day file and the archival file to identify matching market data in response to the request, and transmits the matching market data to the customer computer system.
The present disclosure relates generally to the field of computer database management systems and methods. More specifically, the present disclosure relates to improved database management systems and methods using data normalization and defragmentation.
Related Art
Database management systems are a critical part of today's computing technology. Database management systems of various designs exist, such as relational databases, columnar databases, object databases, and other types of databases. Additionally, databases can be distributed across multiple computing systems/platforms, and are scalable to accommodate various data requirements.
In the field of financial information processing and technology, database management systems are of critical importance in storing and managing financial data, often in real time. For example, the various stock exchanges of the United States and other countries (e.g., the New York Stock Exchange (NYSE)) each maintain sophisticated computer systems and associated database management systems which store live (and historical) stock market and exchange data. Each of these systems utilizes proprietary data formats and database management functions.
Due to the highly proprietary nature of each exchange's computer systems and database management systems/technology, it is difficult to rapidly and efficiently search for desired financial data (e.g., stock data, market data, etc.) across multiple exchanges. Accordingly, what would be desirable, but has not yet been provided, are improved database management systems and methods using data normalization and defragmentation techniques, which solve the foregoing and other needs.
SUMMARY
The present disclosure relates to improved systems and methods for database management using data normalization and defragmentation techniques. The system includes at least one exchange processor in communication with an exchange computer system, a master processor in communication with the at least one exchange processor, and at least one regional processor in communication with the master processor. The at least one exchange processor receives market data from the exchange computer system, processes the market information, and transmits the market data to the master processor. The master processor receives the market data, processes the data using at least one normalization process to generate normalized data including an intra-day file and an archival file, and stores the intra-day file and the archival file in the master database. The master processor transmits the intra-day file and the archival file to the at least one regional processor, which stores the intra-day file and the archival file in a regional database. The regional processor receives a request for information from a customer computer system in communication with the regional processor, queries the intra-day file and the archival file to identify matching market data in response to the request, and transmits the matching market data to the customer computer system.
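By way of non-limiting illustration, the normalization step summarized above could be sketched in Python as follows. The two input formats, the `MarketUpdate` record, and the function names are hypothetical and are not taken from the disclosure. The sketch only shows how updates arriving in differing proprietary formats might be mapped to a common record holding a source, a product identifier, a data value, and a time value.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class MarketUpdate:
    """Hypothetical normalized record shared by all sources."""
    source: str        # exchange the update came from
    product_id: str    # product identifier (e.g. a ticker symbol)
    value: Decimal     # data value (here, a price)
    time_ms: int       # time value, in milliseconds since midnight


def normalize_pipe(source: str, line: str) -> MarketUpdate:
    """Assumed format A: 'SYMBOL|PRICE|MS_SINCE_MIDNIGHT'."""
    symbol, price, ms = line.split("|")
    return MarketUpdate(source, symbol, Decimal(price), int(ms))


def normalize_dict(source: str, msg: dict) -> MarketUpdate:
    """Assumed format B: price in cents, time as 'HH:MM:SS.mmm'."""
    hours, minutes, seconds = msg["ts"].split(":")
    whole, _, frac = seconds.partition(".")
    time_ms = ((int(hours) * 60 + int(minutes)) * 60 + int(whole)) * 1000
    # Pad or truncate the fractional part to exactly three digits.
    time_ms += int(frac.ljust(3, "0")[:3]) if frac else 0
    price = Decimal(msg["px_cents"]) / 100
    return MarketUpdate(source, msg["sym"], price, time_ms)


# Two differently formatted updates describing the same price and time.
a = normalize_pipe("EXCH_A", "AAPL|189.25|34200000")
b = normalize_dict("EXCH_B", {"sym": "AAPL", "px_cents": 18925, "ts": "09:30:00.000"})
```

Once both updates are in the common form, a query for a given product need not know which exchange format the data originally arrived in.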
The foregoing features of the invention will be apparent from the following Detailed Description of the Invention, taken in connection with the accompanying drawings, in which:
The present disclosure relates to improved database management systems and methods, as described in detail below in connection with
The master processor 12 and the regional processors 16a-16b could be implemented using a wide variety of computer hardware, such as individual servers, groups of servers (server farms), cloud computing platforms, or other suitable computing devices, running suitable operating systems (e.g., LINUX, UNIX, etc.). The customer computers 22a-22b could be any suitable computing system capable of accessing the regional processors 16a-16b, such as personal computers, tablet computers, smart phones, etc. The exchange processors 24a-24c are customized processors that are in communication with one or more exchange data systems (and, optionally, located at the same physical location as such data systems) and which include customized software and hardware components for rapidly (e.g., in real time) aggregating market data from the one or more exchanges and transmitting such information to the master processor 12 for normalization by the master processor. As will be discussed in greater detail below, the master database 14 and the regional databases 20a-20b are customized databases with built-in features that allow for very rapid searching of market data by users of the system. Although specific processors are shown in
In the event that the user desires a depth-of-book presentation of information, step 162 occurs, wherein the regional processor loads incremental and snapshot files (such as the files discussed above in connection with
Having thus described the system and method in detail, it is to be understood that the foregoing description is not intended to limit the spirit or scope thereof. It will be understood that the embodiments of the present disclosure described herein are merely exemplary and that a person skilled in the art can make any variations and modifications without departing from the spirit and scope of the disclosure. All such variations and modifications, including those discussed above, are intended to be included within the scope of the disclosure.
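By way of non-limiting illustration, the replay of an incremental data file to locate a customer-requested time, and the return of the corresponding snapshot data in a top-of-book or depth-of-book format (see claims 10-13 below), could be sketched in Python as follows. The `IncrementalEntry` record, the in-memory lists standing in for the two files, and the assumption that each incremental entry carries a position in the snapshot file are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncrementalEntry:
    time: int            # hypothetical time value of the update
    snapshot_index: int  # assumed position of the matching snapshot


def replay_until(incremental, snapshots, requested_time):
    """Replay entries in time order and return the snapshot referenced by
    the last entry at or before requested_time (None if there is none)."""
    located = None
    for entry in incremental:  # entries are assumed to be time-ordered
        if entry.time > requested_time:
            break
        located = entry
    return None if located is None else snapshots[located.snapshot_index]


def top_of_book(snapshot):
    """Reduce a depth-of-book snapshot to its best bid and best ask."""
    best_bid = max(snapshot["bids"], key=lambda level: level[0], default=None)
    best_ask = min(snapshot["asks"], key=lambda level: level[0], default=None)
    return {"bid": best_bid, "ask": best_ask}


# Each snapshot lists (price, size) levels for both sides of the book.
snapshots = [
    {"bids": [(99, 10)], "asks": [(101, 5)]},
    {"bids": [(100, 7), (99, 10)], "asks": [(101, 5), (102, 8)]},
]
incremental = [IncrementalEntry(100, 0), IncrementalEntry(200, 1)]

depth = replay_until(incremental, snapshots, 250)  # full depth-of-book view
top = top_of_book(depth)                           # best bid and ask only
```

In this sketch the depth-of-book response is the located snapshot itself, while the top-of-book response is derived from it by keeping only the best price level on each side.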
Claims
1. An improved database management system, comprising: at least one master processor programmed to:
- receive, over a period of time, a plurality of market data updates, each market data update from among the plurality of market data updates comprising a product identifier, a data value of a product identified by the product identifier, and a time value to which the data value relates;
- generate an intra-day file comprising a plurality of time chunks, each time chunk indexed by time and corresponding to an interval of time;
- assign each product identifier, data value, and time value from the market data updates to a respective time chunk based on the time value and the interval of time, wherein the intra-day file is fragmented with respect to a given product because data values for the given product are spread out over multiple time chunks, but the intra-day file is queryable based on a time index of each time chunk; and
- defragment the intra-day file to generate an archival file comprising a plurality of product fields that are indexed by a source of a market data update, wherein each product field from among the plurality of product fields includes a time-series of data values for a respective product, and wherein to defragment the intra-day file, the processor is programmed to: group product identifiers and their respective values from the intra-day file into a time series of data values; and assign each of the groups into a respective data record.
2. The system of claim 1, further comprising at least one exchange processor, associated with a source of one or more market data updates, in communication with the master processor, wherein the at least one exchange processor includes a ring buffer and a database, the at least one exchange processor storing the one or more market data updates in at least one of the ring buffer and the database prior to transmitting the market data to the master processor that generates the intra-day file and the archival file.
3. The system of claim 2, wherein the at least one exchange processor and the master processor perform an in-memory synchronization process to synchronize information shared by the at least one exchange processor and the master processor.
4. The system of claim 1, wherein the master processor stores the intra-day file and the archival file in a master database.
5. The system of claim 1, wherein the master processor transmits historical data and incremental data to at least one regional processor, the at least one regional processor storing the historical data and the incremental data in at least one regional database.
6. The system of claim 1, wherein market data updates correspond to a plurality of financial products at different times.
7. The system of claim 6, wherein each of the product fields includes market data corresponding to a single financial product.
8. The system of claim 7, wherein to defragment the intra-day file to create the archival file, the master processor is further programmed to sort the market data in time sequence.
9. The system of claim 1, wherein the master processor is further programmed to: generate, during a normalization process, an intra-day streaming data file from the market data updates, the intra-day streaming data file including an exchange identifier field and a plurality of objects associated with the exchange identifier field.
10. The system of claim 9, wherein the normalization process generates an incremental data file and a snapshot data file, the snapshot data file indexed by the incremental data file.
11. The system of claim 10, further comprising at least one regional processor programmed to replay the incremental data file until a time requested by a customer is located.
12. The system of claim 11, wherein the at least one regional processor returns snapshot data from the snapshot data file when the time requested by the customer is located in the incremental data file.
13. The system of claim 11, wherein the at least one regional processor returns matching information to a customer computer system in one or more of a top-of-book or a depth-of-book format.
14. A method, comprising:
- receiving, by a master processor, over a period of time, a plurality of market data updates, each market data update from among the plurality of market data updates comprising a product identifier, a data value of a product identified by the product identifier, and a time value to which the data value relates;
- generating, by the master processor, an intra-day file comprising a plurality of time chunks, each time chunk indexed by time and corresponding to an interval of time;
- assigning, by the master processor, each product identifier, data value, and time value from the market data updates to a respective time chunk based on the time value and the interval of time, wherein the intra-day file is fragmented with respect to a given product because data values for the given product are spread out over multiple time chunks, but the intra-day file is queryable based on a time index of each time chunk; and
- defragmenting, by the master processor, the intra-day file to generate an archival file comprising a plurality of product fields that are indexed by a source of a market data update, wherein each product field from among the plurality of product fields includes a time-series of data values for a respective product, and wherein defragmenting the intra-day file comprises: grouping, by the master processor, product identifiers and their respective values from the intra-day file into a time series of data values; and assigning, by the master processor, each of the groups into a respective data record.
15. The method of claim 14, further comprising:
- storing, by at least one exchange processor, the one or more market data updates in at least one of a ring buffer and a database prior to transmitting the market data to the master processor that generates the intra-day file and the archival file, wherein the at least one exchange processor is associated with a source of one or more market data updates and is in communication with the master processor.
16. The method of claim 15, further comprising:
- performing, by the at least one exchange processor and the master processor, an in-memory synchronization process to synchronize information shared by the at least one exchange processor and the master processor.
17. The method of claim 14, further comprising:
- storing, by the master processor, the intra-day file and the archival file in a master database.
18. The method of claim 14, further comprising:
- transmitting, by the master processor, historical data and incremental data to at least one regional processor, the at least one regional processor storing the historical data and the incremental data in at least one regional database.
19. The method of claim 14, wherein market data updates correspond to a plurality of financial products at different times.
20. The method of claim 19, wherein each of the product fields includes market data corresponding to a single financial product.
21. The method of claim 20, wherein defragmenting the intra-day file to create the archival file comprises:
- sorting, by the master processor, the market data in time sequence.
22. The method of claim 14, further comprising:
- generating, by the master processor during a normalization process, an intra-day streaming data file from the market data updates, the intra-day streaming data file including an exchange identifier field and a plurality of objects associated with the exchange identifier field.
23. The method of claim 22, wherein the normalization process generates an incremental data file and a snapshot data file, the snapshot data file indexed by the incremental data file.
24. The method of claim 23, further comprising:
- replaying, by at least one regional processor, the incremental data file until a time requested by a customer is located.
25. The method of claim 24, further comprising:
- returning, by the at least one regional processor, snapshot data from the snapshot data file when the time requested by the customer is located in the incremental data file.
26. The method of claim 24, further comprising:
- returning, by the at least one regional processor, matching information to a customer computer system in one or more of a top-of-book or a depth-of-book format.
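By way of non-limiting illustration, the time-chunked intra-day file and its defragmentation into an archival file, as recited in claims 1 and 14, could be sketched in Python as follows. The tuple layout `(source, product, value, time)`, the use of in-memory dictionaries in place of files, and the function names are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def build_intraday(updates, interval):
    """Assign each update to the time chunk covering its time value.

    The result is keyed by chunk start time, so it can be queried by time,
    but the values for any one product are scattered across chunks.
    """
    chunks = defaultdict(list)
    for source, product, value, time in updates:
        chunk_start = (time // interval) * interval
        chunks[chunk_start].append((source, product, value, time))
    return dict(sorted(chunks.items()))


def defragment(intraday):
    """Regroup the chunks into one time series per product, by source."""
    archival = defaultdict(lambda: defaultdict(list))
    for chunk_start in sorted(intraday):
        for source, product, value, time in intraday[chunk_start]:
            archival[source][product].append((time, value))
    for products in archival.values():
        for series in products.values():
            series.sort(key=lambda point: point[0])  # time sequence
    return {source: dict(products) for source, products in archival.items()}


# Hypothetical updates: (source, product, value, time in seconds).
updates = [
    ("NYSE", "IBM", 140.0, 5),
    ("NYSE", "AAPL", 189.0, 12),
    ("NYSE", "IBM", 140.5, 65),
    ("NASDAQ", "AAPL", 189.1, 70),
]
intraday = build_intraday(updates, interval=60)
archival = defragment(intraday)
```

In this sketch the IBM values sit in two different chunks of `intraday` (those starting at 0 and 60), whereas `archival["NYSE"]["IBM"]` holds them as a single contiguous time series, which is the sense in which the archival form is defragmented with respect to a product.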
Type: Grant
Filed: Aug 6, 2020
Date of Patent: Oct 24, 2023
Patent Publication Number: 20220044259
Assignee: MAYSTREET INC. (New York, NY)
Inventors: Niall Douglas (Kerry Pike), Robert Leahy (New York, NY), Michael Lehr (New York, NY)
Primary Examiner: Andre D Boyce
Application Number: 16/986,646
International Classification: G06Q 30/0201 (20230101); G06F 16/17 (20190101); G06F 16/11 (20190101); G06F 16/178 (20190101);