Patents by Inventor Paul Geoffrey Brown

Paul Geoffrey Brown has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

MATRIX-RELATED OPERATIONS IN RELATIONAL DATABASES SYSTEMS INCLUDING MASSIVELY PARALLEL PROCESSING SYSTEMS

Publication number: 20210303574

Abstract: Improved techniques for performing Matrix-Related operations (e.g., Matrix Multiplication, Matrix Transpose) in Relational Database systems are disclosed. Techniques provide Matrix Data Sets for performing Matrix-Related operations in Relational Databases more efficiently than conventional techniques. By way of example, Matrix Data can be partitioned such that the data in each partition can be processed directly in a cache memory of a processor, thereby reducing the need for copying data as is conventionally done in Relational Databases. In addition, database queries involving Matrix-Related operations can be optimized for a Relational Database by providing Matrix Operations that can be directly used as declarative statements in a Database Query language (e.g., SQL).

Type: Application

Filed: December 31, 2020

Publication date: September 30, 2021

Inventor: Paul Geoffrey Brown
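
For orientation, the following is a minimal sketch of the conventional relational formulation that the abstract above contrasts itself with: a matrix stored as (row, col, value) tuples and multiplied with a join followed by a grouped sum. It is not the technique claimed in the application. The function names `matmul_relational` and `transpose_relational` and the tuple layout are assumptions made for illustration.

```python
from collections import defaultdict

def matmul_relational(a_rows, b_rows):
    """Multiply two matrices stored as (row, col, value) tuples.

    Mirrors, in spirit, the SQL formulation:
      SELECT a.row, b.col, SUM(a.val * b.val)
      FROM a JOIN b ON a.col = b.row
      GROUP BY a.row, b.col
    The tuple layout is an assumption, not taken from the application.
    """
    # Index B by its row so the join condition a.col = b.row is a lookup.
    b_by_row = defaultdict(list)
    for r, c, v in b_rows:
        b_by_row[r].append((c, v))

    # Grouped sum keyed by (a.row, b.col).
    acc = defaultdict(float)
    for i, k, a_val in a_rows:
        for j, b_val in b_by_row.get(k, []):
            acc[(i, j)] += a_val * b_val
    return sorted((i, j, v) for (i, j), v in acc.items())

def transpose_relational(rows):
    """Transpose by swapping the row and column of every tuple."""
    return sorted((c, r, v) for r, c, v in rows)

A = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0)]
B = [(0, 0, 5.0), (0, 1, 6.0), (1, 0, 7.0), (1, 1, 8.0)]
product = matmul_relational(A, B)
```

Every output cell here is assembled from individual tuples, which suggests why the abstract emphasizes partitions that can be processed directly in a processor's cache memory.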

System for matching pattern-based data

Patent number: 7912805

Abstract: A pattern-based data matching system matches pattern-based data. The data matching system generates a regular expression pattern for input datasets and describes similarity measures between the generated patterns. The data matching system analyzes an input dataset in terms of symbol classes, generalizing input values into a general pattern to allow identification or extrapolation of overlap between input datasets, aiding in matching fields in databases that are being merged and in learning a pattern for an input dataset. For each sequence of data values, the present system computes a compact pattern describing the sequence. Embodiments of the data matching system comprise noise reduction and repetitive pattern discovery in the input dataset and calculation of recall and precision of the generated pattern.

Type: Grant

Filed: December 15, 2008

Date of Patent: March 22, 2011

Assignee: International Business Machines Corporation

Inventors: Paul Geoffrey Brown, Jussi Petri Myllymaki
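
The generalization step described in the abstract above (values reduced to symbol classes, then compared as patterns) might look roughly like the sketch below. The symbol classes, the run-length encoding, and the Jaccard-style overlap measure are all assumptions chosen for illustration; the patent's own classes and similarity measures are not given in this listing.

```python
import re

def symbol_class(ch):
    """Map one character to an assumed symbol class (ASCII only)."""
    if ch.isascii() and ch.isdigit():
        return r"\d"
    if ch.isascii() and ch.isalpha():
        return "[A-Za-z]"
    return re.escape(ch)  # punctuation and everything else stay literal

def generalize(value):
    """Collapse a value into a run-length regex over symbol classes."""
    runs = []
    for ch in value:
        cls = symbol_class(ch)
        if runs and runs[-1][0] == cls:
            runs[-1][1] += 1
        else:
            runs.append([cls, 1])
    return "".join(c if n == 1 else f"{c}{{{n}}}" for c, n in runs)

def dataset_patterns(values):
    """The set of generalized patterns that occur in a column."""
    return {generalize(v) for v in values}

def pattern_overlap(col_a, col_b):
    """Jaccard overlap of two columns' pattern sets (an assumed measure)."""
    pa, pb = dataset_patterns(col_a), dataset_patterns(col_b)
    union = pa | pb
    return len(pa & pb) / len(union) if union else 0.0

phone_pattern = generalize("555-1234")
```

Two columns holding phone numbers in the same layout would produce identical pattern sets and an overlap of 1.0, which is the kind of signal the abstract describes for matching fields in databases that are being merged.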

Method for discovering undeclared and fuzzy rules in databases

Patent number: 7685086

Abstract: A scheme is used to automatically discover algebraic constraints between pairs of columns in relational data. The constraints may be “fuzzy” in that they hold for most, but not all, of the records, and the columns may be in the same table or different tables. The scheme first identifies candidate sets of column value pairs that are likely to satisfy an algebraic constraint. For each candidate, the scheme constructs algebraic constraints by applying statistical histogramming, segmentation, or clustering techniques to samples of column values. In query-optimization mode, the scheme automatically partitions the data into normal and exception records. During subsequent query processing, queries can be modified to incorporate the constraints; the optimizer uses the constraints to identify new, more efficient access paths. The results are then combined with the results of executing the original query against the (small) set of exception records.

Type: Grant

Filed: August 21, 2007

Date of Patent: March 23, 2010

Assignee: International Business Machines Corporation

Inventors: Paul Geoffrey Brown, Peter Jay Haas
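
As a rough illustration of a "fuzzy" algebraic constraint of the kind the abstract above describes, the sketch below fits a band on the difference between two columns that holds for most rows, then splits the rows into normal and exception records. The narrowest-window fitting rule, the `coverage` parameter, and the function names are assumptions; the patent's histogramming, segmentation, and clustering steps are not reproduced here.

```python
import math

def fit_band(pairs, coverage=0.85):
    """Find the narrowest interval [lo, hi] such that lo <= b - a <= hi
    holds for at least `coverage` of the (a, b) pairs.

    Assumes `pairs` is non-empty. The fitting rule is illustrative only.
    """
    diffs = sorted(b - a for a, b in pairs)
    n = len(diffs)
    k = max(1, math.ceil(coverage * n))  # rows the band must cover
    # Slide a window of k sorted differences and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: diffs[i + k - 1] - diffs[i])
    return diffs[best], diffs[best + k - 1]

def partition(pairs, band):
    """Split rows into those satisfying the constraint and the exceptions."""
    lo, hi = band
    normal = [p for p in pairs if lo <= p[1] - p[0] <= hi]
    exceptions = [p for p in pairs if not (lo <= p[1] - p[0] <= hi)]
    return normal, exceptions

# Hypothetical (order_day, ship_day) values; one row ships unusually late.
rows = [(1, 3), (2, 5), (3, 5), (4, 8), (5, 7),
        (6, 9), (7, 9), (8, 11), (9, 12), (10, 40)]
band = fit_band(rows)
normal, exceptions = partition(rows, band)
```

With a band such as "ship_day is 2 to 4 days after order_day", a query on one column can be rewritten with an added range predicate on the other, run against the normal records, and combined with the result of the original query on the small exception set.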

Detecting correlation from data

Patent number: 7647293

Abstract: A system and method of discovering dependencies between relational database column pairs and application of discoveries to query optimization is provided. For each candidate column pair remaining after simultaneously generating column pairs, pruning pairs not satisfying specified heuristic constraints, and eliminating pairs with trivial instances of correlation, a random sample of data values is collected. A candidate column pair is tested for the existence of a soft functional dependency (FD), and if a dependency is not found, statistically tested for correlation using a robust chi-squared statistic. Column pairs for which either a soft FD or a statistical correlation exists are prioritized for recommendation to a query optimizer, based on any of: strength of dependency, degree of correlation, or adjustment factor; statistics for recommended column pairs are tracked to improve selectivity estimates.

Type: Grant

Filed: June 10, 2004

Date of Patent: January 12, 2010

Assignee: International Business Machines Corporation

Inventors: Paul Geoffrey Brown, Peter Jay Haas, Ihab F. Ilyas, Volker G. Markl
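
The two tests named in the abstract above can be sketched on a sample of value pairs as follows. The soft-FD strength used here (distinct values of the first column divided by distinct value pairs) and the plain Pearson chi-squared statistic are assumptions for illustration; the "robust" variant and the thresholds used by the patent are not specified in this listing.

```python
from collections import Counter

def soft_fd_strength(pairs):
    """Ratio of distinct a-values to distinct (a, b) pairs.

    A value near 1.0 suggests a soft functional dependency a -> b,
    because almost every a-value appears with a single b-value.
    Assumes `pairs` is a non-empty list of hashable 2-tuples.
    """
    distinct_a = len({a for a, _ in pairs})
    distinct_ab = len(set(pairs))
    return distinct_a / distinct_ab

def chi_squared(pairs):
    """Plain Pearson chi-squared statistic over the contingency table."""
    n = len(pairs)
    joint = Counter(pairs)
    row = Counter(a for a, _ in pairs)
    col = Counter(b for _, b in pairs)
    stat = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n  # count expected if independent
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical samples: model determines make; x and y are independent.
model_make = [("Camry", "Toyota"), ("Corolla", "Toyota"),
              ("Civic", "Honda"), ("Camry", "Toyota")]
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
correlated = [(0, 0), (1, 1), (0, 0), (1, 1)]
```

In this sketch a pair would first be checked with `soft_fd_strength`; only if that falls short would `chi_squared` be compared against a critical value for the table's degrees of freedom.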

Flexible, efficient and scalable sampling

Patent number: 7543006

Abstract: A sampling infrastructure/scheme that supports flexible, efficient, scalable and uniform sampling is disclosed. A sample is maintained in a compact histogram form while the sample footprint stays below a specified upper bound. If, at any point, the sample footprint exceeds the upper bound, then the compact representation is abandoned, the sample purged to obtain a subsample. The histogram of the purged subsample is expanded to a bag of values while sampling remaining data values of the partitioned subset. The expanded purged subsample is converted to a histogram and uniform random samples are yielded. The sampling scheme retains the bounded footprint property and to a partial degree the compact representation of the Concise Sampling scheme, while ensuring statistical uniformity. Samples from at least two partitioned subsets are merged on demand to yield uniform merged samples of combined partitions wherein the merged samples also maintain the histogram representation and bounded footprint property.

Type: Grant

Filed: August 31, 2006

Date of Patent: June 2, 2009

Assignee: International Business Machines Corporation

Inventors: Paul Geoffrey Brown, Peter Jay Haas
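
The compact histogram form and the purge step mentioned in the abstract above can be pictured with the sketch below. It covers only the representation, not the full scheme: the footprint convention (one slot for a singleton value, two for a value with a count), the Bernoulli purge, and all function names are assumptions, and the merging of samples from separate partitions is omitted.

```python
import random
from collections import Counter

def footprint(hist):
    """Slots used by a compact sample: 1 for a value seen once,
    2 for a (value, count) pair. The convention is assumed here."""
    return sum(1 if count == 1 else 2 for count in hist.values())

def expand(hist):
    """Expand the histogram into a bag (list) of individual values."""
    return list(hist.elements())

def purge(hist, keep_prob, rng):
    """Subsample by keeping each sampled item independently with
    probability keep_prob (a Bernoulli subsample)."""
    kept = Counter()
    for value, count in hist.items():
        survivors = sum(1 for _ in range(count) if rng.random() < keep_prob)
        if survivors:
            kept[value] = survivors
    return kept

sample = Counter("aaabcc")           # a: 3, b: 1, c: 2
rng = random.Random(0)               # fixed seed so runs are repeatable
smaller = purge(sample, 0.5, rng)    # shrink once the footprint is too large
```

Converting back to compact form is just `Counter(expand(sample))`, which mirrors the abstract's round trip from histogram to bag of values and back.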

SYSTEM FOR MATCHING PATTERN-BASED DATA

Publication number: 20090132454

Abstract: A pattern-based data matching system matches pattern-based data. The data matching system generates a regular expression pattern for input datasets and describes similarity measures between the generated patterns. The data matching system analyzes an input dataset in terms of symbol classes, generalizing input values into a general pattern to allow identification or extrapolation of overlap between input datasets, aiding in matching fields in databases that are being merged and in learning a pattern for an input dataset. For each sequence of data values, the present system computes a compact pattern describing the sequence. Embodiments of the data matching system comprise noise reduction and repetitive pattern discovery in the input dataset and calculation of recall and precision of the generated pattern.

Type: Application

Filed: December 15, 2008

Publication date: May 21, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Paul Geoffrey Brown, Jussi Petri Myllymaki

Method for matching pattern-based data

Patent number: 7487150

Abstract: A pattern-based data matching method matches pattern-based data. The data matching method generates a regular expression pattern for input datasets and describes similarity measures between the generated patterns. The data matching method analyzes an input dataset in terms of symbol classes, generalizing input values into a general pattern to allow identification or extrapolation of overlap between input datasets, aiding in matching fields in databases that are being merged and in learning a pattern for an input dataset. For each sequence of data values, the present method computes a compact pattern describing the sequence. Embodiments of the data matching method comprise noise reduction and repetitive pattern discovery in the input dataset and calculation of recall and precision of the generated pattern.

Type: Grant

Filed: July 2, 2005

Date of Patent: February 3, 2009

Assignee: International Business Machines Corporation

Inventors: Paul Geoffrey Brown, Jussi Petri Myllymaki

FLEXIBLE, EFFICIENT AND SCALABLE SAMPLING

Publication number: 20080059540

Abstract: A sampling infrastructure/scheme that supports flexible, efficient, scalable and uniform sampling is disclosed. A sample is maintained in a compact histogram form while the sample footprint stays below a specified upper bound. If, at any point, the sample footprint exceeds the upper bound, then the compact representation is abandoned, the sample purged to obtain a subsample. The histogram of the purged subsample is expanded to a bag of values while sampling remaining data values of the partitioned subset. The expanded purged subsample is converted to a histogram and uniform random samples are yielded. The sampling scheme retains the bounded footprint property and to a partial degree the compact representation of the Concise Sampling scheme, while ensuring statistical uniformity. Samples from at least two partitioned subsets are merged on demand to yield uniform merged samples of combined partitions wherein the merged samples also maintain the histogram representation and bounded footprint property.

Type: Application

Filed: August 31, 2006

Publication date: March 6, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Paul Geoffrey Brown, Peter Jay Haas

Method for discovering undeclared and fuzzy rules in databases

Patent number: 7277873

Abstract: A scheme is used to automatically discover algebraic constraints between pairs of columns in relational data. The constraints may be “fuzzy” in that they hold for most, but not all, of the records, and the columns may be in the same table or different tables. The scheme first identifies candidate sets of column value pairs that are likely to satisfy an algebraic constraint. For each candidate, the scheme constructs algebraic constraints by applying statistical histogramming, segmentation, or clustering techniques to samples of column values. In query-optimization mode, the scheme automatically partitions the data into normal and exception records. During subsequent query processing, queries can be modified to incorporate the constraints; the optimizer uses the constraints to identify new, more efficient access paths. The results are then combined with the results of executing the original query against the (small) set of exception records.

Type: Grant

Filed: October 31, 2003

Date of Patent: October 2, 2007

Assignee: International Business Machines Corporation

Inventors: Paul Geoffrey Brown, Peter Jay Haas