Patents by Inventor Paul Geoffrey Brown
Paul Geoffrey Brown has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210303574
Abstract: Improved techniques for performing Matrix-Related operations (e.g., Matrix Multiplication, Matrix Transpose) in Relational Database systems are disclosed. Techniques provide Matrix Data Sets for performing Matrix-Related operations in Relational Databases more efficiently than conventional techniques. By way of example, Matrix Data can be partitioned such that the data in each partition can be processed directly in a cache memory of a processor, thereby reducing the need for copying data as is conventionally done in Relational Databases. In addition, database queries involving Matrix-Related operations can be optimized for a Relational Database by providing Matrix Operations that can be directly used as declarative statements in a Database Query language (e.g., SQL).
Type: Application
Filed: December 31, 2020
Publication date: September 30, 2021
Inventor: Paul Geoffrey Brown
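For context, the sketch below shows the conventional way a matrix product can be written as a plain SQL join and aggregation over (row, column, value) triples. It is not the technique claimed in the application; it only illustrates the kind of relational baseline the abstract compares against. The table names `a` and `b`, the triple layout, and the helper `matmul_sql` are assumptions made for this example.

```python
import sqlite3


def matmul_sql(a, b):
    """Multiply two dense matrices (lists of rows) using a SQL join.

    Hypothetical helper: each matrix is stored as (row, col, value)
    triples, and the product is a join on the shared dimension k
    followed by SUM ... GROUP BY.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (i INTEGER, k INTEGER, v REAL)")
    conn.execute("CREATE TABLE b (k INTEGER, j INTEGER, v REAL)")
    conn.executemany(
        "INSERT INTO a VALUES (?, ?, ?)",
        [(i, k, v) for i, row in enumerate(a) for k, v in enumerate(row)],
    )
    conn.executemany(
        "INSERT INTO b VALUES (?, ?, ?)",
        [(k, j, v) for k, row in enumerate(b) for j, v in enumerate(row)],
    )
    rows = conn.execute(
        "SELECT a.i, b.j, SUM(a.v * b.v) "
        "FROM a JOIN b ON a.k = b.k "
        "GROUP BY a.i, b.j"
    ).fetchall()
    conn.close()

    # Rebuild a dense result from the (i, j, value) triples.
    result = [[0.0] * len(b[0]) for _ in range(len(a))]
    for i, j, v in rows:
        result[i][j] = v
    return result
```

Every cell here travels through the database as its own tuple, which is the per-element copying overhead that partitioning matrix data into cache-sized blocks is meant to reduce.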
-
Patent number: 7912805
Abstract: A pattern-based data matching system matches pattern-based data. The data matching system generates a regular expression pattern for input datasets and describes similarity measures between the generated patterns. The data matching system analyzes an input dataset in terms of symbol classes, generalizing input values into a general pattern to allow identification or extrapolation of overlap between input datasets, aiding in matching fields in databases that are being merged and in learning a pattern for an input dataset. For each sequence of data values, the present system computes a compact pattern describing the sequence. Embodiments of the data matching system comprise noise reduction and repetitive pattern discovery in the input dataset and calculation of recall and precision of the generated pattern.
Type: Grant
Filed: December 15, 2008
Date of Patent: March 22, 2011
Assignee: International Business Machines Corporation
Inventors: Paul Geoffrey Brown, Jussi Petri Myllymaki
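The idea of generalizing values into symbol classes can be illustrated with a minimal sketch. The class choices (ASCII digit, ASCII letter, literal punctuation), the run-length merging rule, and the function names are assumptions for this example only; the patent's own pattern language, noise reduction, and similarity measures are not reproduced here.

```python
import re
import string
from itertools import groupby


def symbol_class(ch):
    """Map one character to a regex fragment for its (assumed) symbol class."""
    if ch in string.digits:
        return r"\d"
    if ch in string.ascii_letters:
        return "[A-Za-z]"
    return re.escape(ch)  # punctuation and spaces stay literal


def run_lengths(value):
    """Collapse a value into (symbol class, run length) pairs."""
    classes = (symbol_class(c) for c in value)
    return [(cls, len(list(group))) for cls, group in groupby(classes)]


def generalize(values):
    """Return one regex covering all values, or None if their shapes differ."""
    encoded = [run_lengths(v) for v in values]
    shapes = {tuple(cls for cls, _ in runs) for runs in encoded}
    if len(shapes) != 1:
        return None  # this sketch does not merge differently shaped values
    parts = []
    for position in zip(*encoded):
        cls = position[0][0]
        counts = [n for _, n in position]
        lo, hi = min(counts), max(counts)
        if lo == hi == 1:
            quantifier = ""
        elif lo == hi:
            quantifier = "{%d}" % lo
        else:
            quantifier = "{%d,%d}" % (lo, hi)
        parts.append(cls + quantifier)
    return "".join(parts)


pattern = generalize(["555-1234", "867-5309"])
```

Two columns whose generalized patterns coincide or overlap heavily are plausible candidates for holding the same kind of field, which is the matching use case the abstract describes.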
-
Patent number: 7685086
Abstract: A scheme is used to automatically discover algebraic constraints between pairs of columns in relational data. The constraints may be “fuzzy” in that they hold for most, but not all, of the records, and the columns may be in the same table or different tables. The scheme first identifies candidate sets of column value pairs that are likely to satisfy an algebraic constraint. For each candidate, the scheme constructs algebraic constraints by applying statistical histogramming, segmentation, or clustering techniques to samples of column values. In query-optimization mode, the scheme automatically partitions the data into normal and exception records. During subsequent query processing, queries can be modified to incorporate the constraints; the optimizer uses the constraints to identify new, more efficient access paths. The results are then combined with the results of executing the original query against the (small) set of exception records.
Type: Grant
Filed: August 21, 2007
Date of Patent: March 23, 2010
Assignee: International Business Machines Corporation
Inventors: Paul Geoffrey Brown, Peter Jay Haas
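A "fuzzy" algebraic constraint of the kind described above can be pictured as an interval that the difference between two columns falls into for most rows. The sketch below is a simplification under stated assumptions: the operator is fixed to subtraction, a single interval is fitted by taking the narrowest window that covers a chosen fraction of the sample (a stand-in for the histogramming, segmentation, or clustering step), and the names `fit_interval`, `partition`, and `coverage` are hypothetical.

```python
import math


def fit_interval(diffs, coverage=0.8):
    """Return the narrowest [lo, hi] containing `coverage` of the differences.

    Assumes `diffs` is non-empty and 0 < coverage <= 1.
    """
    ordered = sorted(diffs)
    n = len(ordered)
    window = max(1, math.ceil(coverage * n))
    # Slide a window of fixed size over the sorted values; keep the tightest.
    start = min(
        range(n - window + 1),
        key=lambda s: ordered[s + window - 1] - ordered[s],
    )
    return ordered[start], ordered[start + window - 1]


def partition(rows, lo, hi):
    """Split (col_a, col_b) rows into those satisfying lo <= b - a <= hi
    and the exception records that violate it."""
    normal, exceptions = [], []
    for a, b in rows:
        (normal if lo <= b - a <= hi else exceptions).append((a, b))
    return normal, exceptions


# Hypothetical sample: (order_day, ship_day) pairs, one of them an outlier.
sample = [(0, 2), (10, 13), (20, 21), (30, 33), (40, 42),
          (50, 52), (60, 63), (70, 71), (80, 82), (90, 150)]
lo, hi = fit_interval([b - a for a, b in sample])
normal, exceptions = partition(sample, lo, hi)
```

With such a constraint, a predicate on one column can be rewritten into a range predicate on the other for the normal rows, while the small exception set is queried separately and the two results are combined.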
-
Patent number: 7647293
Abstract: A system and method of discovering dependencies between relational database column pairs and application of discoveries to query optimization is provided. For each candidate column pair remaining after simultaneously generating column pairs, pruning pairs not satisfying specified heuristic constraints, and eliminating pairs with trivial instances of correlation, a random sample of data values is collected. A candidate column pair is tested for the existence of a soft functional dependency (FD), and if a dependency is not found, statistically tested for correlation using a robust chi-squared statistic. Column pairs for which either a soft FD or a statistical correlation exists are prioritized for recommendation to a query optimizer, based on any of: strength of dependency, degree of correlation, or adjustment factor; statistics for recommended column pairs are tracked to improve selectivity estimates.
Type: Grant
Filed: June 10, 2004
Date of Patent: January 12, 2010
Assignee: International Business Machines Corporation
Inventors: Paul Geoffrey Brown, Peter Jay Haas, Ihab F. Ilyas, Volker G. Markl
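The two tests named in the abstract can be sketched on a sample of value pairs. Both definitions below are assumptions chosen for illustration: the soft-FD strength is taken as the ratio of distinct left-hand values to distinct pairs (1.0 meaning the left column determines the right one in the sample), and the correlation test uses the plain Pearson chi-squared statistic rather than the robust variant the patent refers to. The function names are hypothetical.

```python
from collections import Counter


def soft_fd_strength(pairs):
    """Strength of a soft FD left -> right on a non-empty sample of pairs.

    Assumed measure: |distinct left values| / |distinct (left, right) pairs|.
    Values near 1.0 mean each left value maps to (almost) one right value.
    """
    distinct_left = len({a for a, _ in pairs})
    distinct_pairs = len(set(pairs))
    return distinct_left / distinct_pairs


def chi_squared(pairs):
    """Plain Pearson chi-squared statistic for independence of two columns."""
    n = len(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    joint = Counter(pairs)
    statistic = 0.0
    for a, count_a in left.items():
        for b, count_b in right.items():
            expected = count_a * count_b / n
            observed = joint.get((a, b), 0)
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical sample: city determines country, so the strength is 1.0.
sample = [("Paris", "FR"), ("Lyon", "FR"), ("Berlin", "DE"), ("Paris", "FR")]
strength = soft_fd_strength(sample)
```

A pair that passes neither test is dropped; pairs that pass are ranked so that the optimizer collects joint statistics only where they are likely to improve selectivity estimates.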
-
Patent number: 7543006
Abstract: A sampling infrastructure/scheme that supports flexible, efficient, scalable and uniform sampling is disclosed. A sample is maintained in a compact histogram form while the sample footprint stays below a specified upper bound. If, at any point, the sample footprint exceeds the upper bound, then the compact representation is abandoned, the sample purged to obtain a subsample. The histogram of the purged subsample is expanded to a bag of values while sampling remaining data values of the partitioned subset. The expanded purged subsample is converted to a histogram and uniform random samples are yielded. The sampling scheme retains the bounded footprint property and to a partial degree the compact representation of the Concise Sampling scheme, while ensuring statistical uniformity. Samples from at least two partitioned subsets are merged on demand to yield uniform merged samples of combined partitions wherein the merged samples also maintain the histogram representation and bounded footprint property.
Type: Grant
Filed: August 31, 2006
Date of Patent: June 2, 2009
Assignee: International Business Machines Corporation
Inventors: Paul Geoffrey Brown, Peter Jay Haas
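The building blocks mentioned in the abstract (a sample held as a histogram, its footprint, expansion to a bag of values, and purging to a subsample) can be sketched as follows. This is only an illustration of those data-structure operations, not the patented sampling algorithm: the footprint accounting (one slot for a value seen once, two slots for a value with a count) and the purge by simple random sampling without replacement are assumptions, as are all function names.

```python
import random
from collections import Counter


def footprint(hist):
    """Storage slots used by a histogram sample.

    Assumed accounting: a value seen once is stored alone (1 slot);
    a repeated value is stored as a (value, count) pair (2 slots).
    """
    return sum(1 if count == 1 else 2 for count in hist.values())


def expand(hist):
    """Expand the compact histogram into a bag (list) of values."""
    return list(hist.elements())


def purge(hist, target_size, rng):
    """Draw a uniform subsample of `target_size` values and re-compress it."""
    bag = expand(hist)
    kept = rng.sample(bag, min(target_size, len(bag)))
    return Counter(kept)


# Hypothetical usage: purge whenever the footprint exceeds a bound.
sample = Counter("aaabbc")  # a:3, b:2, c:1
bound = 4
if footprint(sample) > bound:
    sample = purge(sample, target_size=4, rng=random.Random(0))
```

Because the subsample is drawn uniformly from the expanded bag, every value occurrence in the original sample has the same chance of surviving the purge, which is the uniformity property the abstract emphasizes.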
-
Publication number: 20090132454
Abstract: A pattern-based data matching system matches pattern-based data. The data matching system generates a regular expression pattern for input datasets and describes similarity measures between the generated patterns. The data matching system analyzes an input dataset in terms of symbol classes, generalizing input values into a general pattern to allow identification or extrapolation of overlap between input datasets, aiding in matching fields in databases that are being merged and in learning a pattern for an input dataset. For each sequence of data values, the present system computes a compact pattern describing the sequence. Embodiments of the data matching system comprise noise reduction and repetitive pattern discovery in the input dataset and calculation of recall and precision of the generated pattern.
Type: Application
Filed: December 15, 2008
Publication date: May 21, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Paul Geoffrey Brown, Jussi Petri Myllymaki
-
Patent number: 7487150
Abstract: A pattern-based data matching method matches pattern-based data. The data matching method generates a regular expression pattern for input datasets and describes similarity measures between the generated patterns. The data matching method analyzes an input dataset in terms of symbol classes, generalizing input values into a general pattern to allow identification or extrapolation of overlap between input datasets, aiding in matching fields in databases that are being merged and in learning a pattern for an input dataset. For each sequence of data values, the present method computes a compact pattern describing the sequence. Embodiments of the data matching method comprise noise reduction and repetitive pattern discovery in the input dataset and calculation of recall and precision of the generated pattern.
Type: Grant
Filed: July 2, 2005
Date of Patent: February 3, 2009
Assignee: International Business Machines Corporation
Inventors: Paul Geoffrey Brown, Jussi Petri Myllymaki
-
Publication number: 20080059540
Abstract: A sampling infrastructure/scheme that supports flexible, efficient, scalable and uniform sampling is disclosed. A sample is maintained in a compact histogram form while the sample footprint stays below a specified upper bound. If, at any point, the sample footprint exceeds the upper bound, then the compact representation is abandoned, the sample purged to obtain a subsample. The histogram of the purged subsample is expanded to a bag of values while sampling remaining data values of the partitioned subset. The expanded purged subsample is converted to a histogram and uniform random samples are yielded. The sampling scheme retains the bounded footprint property and to a partial degree the compact representation of the Concise Sampling scheme, while ensuring statistical uniformity. Samples from at least two partitioned subsets are merged on demand to yield uniform merged samples of combined partitions wherein the merged samples also maintain the histogram representation and bounded footprint property.
Type: Application
Filed: August 31, 2006
Publication date: March 6, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: PAUL GEOFFREY BROWN, PETER JAY HAAS
-
Patent number: 7277873
Abstract: A scheme is used to automatically discover algebraic constraints between pairs of columns in relational data. The constraints may be “fuzzy” in that they hold for most, but not all, of the records, and the columns may be in the same table or different tables. The scheme first identifies candidate sets of column value pairs that are likely to satisfy an algebraic constraint. For each candidate, the scheme constructs algebraic constraints by applying statistical histogramming, segmentation, or clustering techniques to samples of column values. In query-optimization mode, the scheme automatically partitions the data into normal and exception records. During subsequent query processing, queries can be modified to incorporate the constraints; the optimizer uses the constraints to identify new, more efficient access paths. The results are then combined with the results of executing the original query against the (small) set of exception records.
Type: Grant
Filed: October 31, 2003
Date of Patent: October 2, 2007
Assignee: International Business Machines Corporation
Inventors: Paul Geoffrey Brown, Peter Jay Haas