Patents by Inventor Austin Clifford

Austin Clifford has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10387422
    Abstract: Provided are a system, method and computer program product for redistribution of data in an online shared nothing database, said shared nothing database comprising a plurality of original partitions and at least one new partition.
    Type: Grant
    Filed: December 9, 2014
    Date of Patent: August 20, 2019
    Assignee: International Business Machines Corporation
    Inventors: Enzo Cialini, Austin Clifford, Garrett Fitzsimons
  • Publication number: 20190251292
    Abstract: A computer-implemented method, computer program product and system for identifying pseudonymized data within data sources. One or more data repositories within one or more of the data sources are selected. One or more privacy data models are provided, where each of the privacy data models includes pattern(s) and/or parameter(s). One or more of the one or more privacy data models are selected. Data identification information is generated, where the data identification information indicates a presence or absence of pseudonymized data and of non-pseudonymized data within the one or more of the data sources. The data identification information is generated utilizing the pattern(s) and/or the parameter(s) to determine pseudonymized data.
    Type: Application
    Filed: April 25, 2019
    Publication date: August 15, 2019
    Inventors: Pedro Barbas, Austin Clifford, Konrad Emanowicz, Patrick G. O'Sullivan
  • Patent number: 10346380
    Abstract: Embodiments of the present invention provide a method, system and computer program product for test data generation using unique common factor sequencing. In an embodiment of the invention, a method for test data generation using unique common factor sequencing is provided. The method includes loading a table for population with test data in a test data generation tool executing in a memory of a computer. A column set of multiple columns in the table associated with a key to the table is selected for processing and different cardinality sequence values are assigned to the columns in the set such that the cardinality sequence values do not share a common factor except for unity as in the case of prime numbers.
    Type: Grant
    Filed: September 19, 2015
    Date of Patent: July 9, 2019
    Assignee: International Business Machines Corporation
    Inventors: Austin Clifford, Konrad Emanowicz, Enda McCallig, Gary Murtagh, Clare Scally
  • Publication number: 20190130132
    Abstract: A computer-implemented method, computer program product and system for identifying pseudonymized data within data sources. One or more data repositories within one or more of the data sources are selected. One or more privacy data models are provided, where each of the privacy data models includes pattern(s) and/or parameter(s). One or more of the one or more privacy data models are selected. Data identification information is generated, where the data identification information indicates a presence or absence of pseudonymized data and of non-pseudonymized data within the one or more of the data sources. The data identification information is generated utilizing the pattern(s) and/or the parameter(s) to determine pseudonymized data.
    Type: Application
    Filed: November 1, 2017
    Publication date: May 2, 2019
    Inventors: Pedro Barbas, Austin Clifford, Konrad Emanowicz, Patrick G. O'Sullivan
  • Publication number: 20190095461
    Abstract: Disclosed is an approach comprising a column partitioned into a plurality of partitions including an empty partition and a plurality of filled partitions each comprising data entries associated with a set of parameters having parameter values, the data entries compressed in accordance with a compression dictionary. The approach comprises receiving forecasted parameter values for an expected set of data entries to be stored in an empty partition; predicting a recurrence frequency of the data entries in the expected set using the forecasted parameter values by evaluating the respective compression dictionaries of the filled partitions with a machine learning algorithm; generating a predictive compression dictionary for the expected set of data entries based on the predicted recurrence frequency of the data entries in the expected set; receiving the expected set of data entries; and compressing at least part of the received expected set of data entries using the predictive compression dictionary.
    Type: Application
    Filed: November 29, 2018
    Publication date: March 28, 2019
    Inventors: Sami Abed, Pedro Barbas, Austin Clifford, Konrad Emanowicz
  • Patent number: 10169361
    Abstract: Disclosed is a computer-implemented method of compressing data in a columnar database comprising at least one column partitioned into a plurality of partitions including at least one empty partition and a plurality of filled partitions each comprising data entries associated with a set of parameters having parameter values relevant to the recurrence frequency of the data entry in the partition, the data entries being compressed in accordance with a compression dictionary based on the respective recurrence frequencies of the data entries in the filled partition.
    Type: Grant
    Filed: November 16, 2015
    Date of Patent: January 1, 2019
    Assignee: International Business Machines Corporation
    Inventors: Sami Abed, Pedro M. Barbas, Austin Clifford, Konrad Emanowicz
  • Patent number: 9773027
    Abstract: In an exemplary embodiment of this disclosure, a method for loading data from a backup image of a database includes selecting a subset statement defining a subset of the data in the database. Tables of the database are identified based on metadata of the database. A target database is written having the structure but not the data of the identified tables. One or more table statements are constructed, by a computer processor, defining a subset of each identified table based on the subset statement. Selected data is unloaded from a backup image into the target database using respective table statements as filters.
    Type: Grant
    Filed: September 11, 2015
    Date of Patent: September 26, 2017
    Assignee: International Business Machines Corporation
    Inventors: Sami Abed, Austin Clifford, Konrad Emanowicz, Gareth Jenkins
  • Publication number: 20170139947
    Abstract: Disclosed is a computer-implemented method of compressing data in a columnar database comprising at least one column partitioned into a plurality of partitions including at least one empty partition and a plurality of filled partitions each comprising data entries associated with a set of parameters having parameter values relevant to the recurrence frequency of the data entry in the partition, the data entries being compressed in accordance with a compression dictionary based on the respective recurrence frequencies of the data entries in the filled partition.
    Type: Application
    Filed: November 16, 2015
    Publication date: May 18, 2017
    Inventors: Sami Abed, Pedro M. Barbas, Austin Clifford, Konrad Emanowicz
  • Patent number: 9589019
    Abstract: A method and system are provided for performance analysis of a database. The method includes receiving a proposed data model, generating a hypothetical query workload using a plurality of sample query templates representing different query constructs for the proposed data model, generating hypothetical optimizer statistics using predefined generating rules that include a projected cardinality for the proposed data model and creating a sample empty database and database schema using the proposed data model. The method also includes applying the hypothetical optimizer statistics to the sample empty database, based on generating the hypothetical optimizer statistics, applying each query construct of the hypothetical query workload to the database schema and estimating a cost of the hypothetical query workload for the proposed data model.
    Type: Grant
    Filed: May 6, 2013
    Date of Patent: March 7, 2017
    Assignee: International Business Machines Corporation
    Inventors: Austin Clifford, Konrad Emanowicz, Enda McCallig
  • Patent number: 9471607
    Abstract: In an exemplary embodiment of this disclosure, a method for loading data from a backup image of a database includes selecting a subset statement defining a subset of the data in the database. Tables of the database are identified based on metadata of the database. A target database is written having the structure but not the data of the identified tables. One or more table statements are constructed, by a computer processor, defining a subset of each identified table based on the subset statement. Selected data is unloaded from a backup image into the target database using respective table statements as filters.
    Type: Grant
    Filed: August 14, 2013
    Date of Patent: October 18, 2016
    Assignee: International Business Machines Corporation
    Inventors: Sami Abed, Austin Clifford, Konrad Emanowicz, Gareth Jenkins
  • Publication number: 20160299952
    Abstract: Provided are a system, method and computer program product for redistribution of data in an online shared nothing database, said shared nothing database comprising a plurality of original partitions and at least one new partition.
    Type: Application
    Filed: December 9, 2014
    Publication date: October 13, 2016
    Applicant: International Business Machines Corporation
    Inventors: Enzo Cialini, Austin Clifford, Garrett Fitzsimons
  • Patent number: 9465840
    Abstract: Dynamically identifying and preventing skewed partitions in a shared-nothing database is provided. The database management system software receives a parameter for identifying a threshold value associated with at least one distribution key value. Optimizer statistics are gathered on a first table that is distributed across one or more partitions in the shared-nothing database, wherein the first table includes a first table name. Distribution key skew is identified based on the gathered optimizer statistics indicating the threshold value being exceeded. A second table with an alternate distribution key is created having a second table name for receiving overflow data rows associated with the at least one distribution key value based on the identified distribution key skew. A union all view is created based on the first and the second table.
    Type: Grant
    Filed: March 14, 2014
    Date of Patent: October 11, 2016
    Assignee: International Business Machines Corporation
    Inventors: Austin Clifford, Konrad Emanowicz, Enda McCallig
  • Patent number: 9460152
    Abstract: Dynamically identifying and preventing skewed partitions in a shared-nothing database is provided. The database management system software receives a parameter for identifying a threshold value associated with at least one distribution key value. Optimizer statistics are gathered on a first table that is distributed across one or more partitions in the shared-nothing database, wherein the first table includes a first table name. Distribution key skew is identified based on the gathered optimizer statistics indicating the threshold value being exceeded. A second table with an alternate distribution key is created having a second table name for receiving overflow data rows associated with the at least one distribution key value based on the identified distribution key skew. A union all view is created based on the first and the second table.
    Type: Grant
    Filed: November 7, 2014
    Date of Patent: October 4, 2016
    Assignee: International Business Machines Corporation
    Inventors: Austin Clifford, Konrad Emanowicz, Enda McCallig
  • Patent number: 9436734
    Abstract: Embodiments of the present invention provide a method, system and computer program product for pre-migration performance prediction of a database management system (DBMS). In an embodiment of the invention, a method for pre-migration performance prediction of a DBMS can include executing a calibration workload in a target DBMS to produce a conversion factor of cost of executing the calibration workload to temporal performance of executing the calibration workload. The method also can include subsequently submitting a sample workload from a database of a source DBMS for evaluation of cost of execution on an empty replica of the database in the target DBMS. Finally, the method can include predicting a temporal performance of the sample workload in the target DBMS as a product of the conversion factor and the cost of execution of the sample workload on the empty replica of the database in the target DBMS.
    Type: Grant
    Filed: October 20, 2013
    Date of Patent: September 6, 2016
    Assignee: International Business Machines Corporation
    Inventors: Austin Clifford, Enda McCallig
  • Publication number: 20160188418
    Abstract: In an exemplary embodiment of this disclosure, a method for loading data from a backup image of a database includes selecting a subset statement defining a subset of the data in the database. Tables of the database are identified based on metadata of the database. A target database is written having the structure but not the data of the identified tables. One or more table statements are constructed, by a computer processor, defining a subset of each identified table based on the subset statement. Selected data is unloaded from a backup image into the target database using respective table statements as filters.
    Type: Application
    Filed: September 11, 2015
    Publication date: June 30, 2016
    Inventors: Sami Abed, Austin Clifford, Konrad Emanowicz, Gareth Jenkins
  • Publication number: 20160012093
    Abstract: Embodiments of the present invention provide a method, system and computer program product for test data generation using unique common factor sequencing. In an embodiment of the invention, a method for test data generation using unique common factor sequencing is provided. The method includes loading a table for population with test data in a test data generation tool executing in memory of a computer. A column set of multiple columns in the table associated with a key to the table can be selected for processing and different cardinality sequence values are assigned to the columns in the set such that the cardinality sequence values do not share a common factor except for unity as in the case of prime numbers.
    Type: Application
    Filed: September 19, 2015
    Publication date: January 14, 2016
    Inventors: Austin Clifford, Konrad Emanowicz, Enda McCallig, Gary Murtagh, Clare Scally
  • Patent number: 9171025
    Abstract: Embodiments of the present invention provide a system and computer program product for test data generation using unique common factor sequencing. In an embodiment of the invention, a computer program product for test data generation using unique common factor sequencing is provided. The computer program product includes loading a table for population with test data in a test data generation tool executing in a memory of a computer. A column set of multiple columns in the table associated with a key to the table is selected for processing and different cardinality sequence values are assigned to the columns in the set such that the cardinality sequence values do not share a common factor except for unity as in the case of prime numbers.
    Type: Grant
    Filed: August 23, 2013
    Date of Patent: October 27, 2015
    Assignee: International Business Machines Corporation
    Inventors: Austin Clifford, Konrad Emanowicz, Enda McCallig, Gary Murtagh, Clare Scally
  • Patent number: 9171026
    Abstract: Embodiments of the present invention provide a method for test data generation using unique common factor sequencing. In an embodiment of the invention, a method for test data generation using unique common factor sequencing is provided. The method includes loading a table for population with test data in a test data generation tool executing in a memory of a computer. A column set of multiple columns in the table associated with a key to the table is selected for processing and different cardinality sequence values are assigned to the columns in the set such that the cardinality sequence values do not share a common factor except for unity as in the case of prime numbers.
    Type: Grant
    Filed: August 23, 2013
    Date of Patent: October 27, 2015
    Assignee: International Business Machines Corporation
    Inventors: Austin Clifford, Konrad Emanowicz, Enda McCallig, Gary Murtagh, Clare Scally
  • Publication number: 20150261816
    Abstract: Dynamically identifying and preventing skewed partitions in a shared-nothing database is provided. The database management system software receives a parameter for identifying a threshold value associated with at least one distribution key value. Optimizer statistics are gathered on a first table that is distributed across one or more partitions in the shared-nothing database, wherein the first table includes a first table name. Distribution key skew is identified based on the gathered optimizer statistics indicating the threshold value being exceeded. A second table with an alternate distribution key is created having a second table name for receiving overflow data rows associated with the at least one distribution key value based on the identified distribution key skew. A union all view is created based on the first and the second table.
    Type: Application
    Filed: March 14, 2014
    Publication date: September 17, 2015
    Applicant: International Business Machines Corporation
    Inventors: Austin Clifford, Konrad Emanowicz, Enda McCallig
  • Publication number: 20150261840
    Abstract: Dynamically identifying and preventing skewed partitions in a shared-nothing database is provided. The database management system software receives a parameter for identifying a threshold value associated with at least one distribution key value. Optimizer statistics are gathered on a first table that is distributed across one or more partitions in the shared-nothing database, wherein the first table includes a first table name. Distribution key skew is identified based on the gathered optimizer statistics indicating the threshold value being exceeded. A second table with an alternate distribution key is created having a second table name for receiving overflow data rows associated with the at least one distribution key value based on the identified distribution key skew. A union all view is created based on the first and the second table.
    Type: Application
    Filed: November 7, 2014
    Publication date: September 17, 2015
    Inventors: Austin Clifford, Konrad Emanowicz, Enda McCallig
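
Illustrative note: several abstracts in this listing describe test data generation in which the columns of a key are assigned cardinality sequence values that share no common factor other than unity. The sketch below is one possible reading of that idea, not the patented implementation. It assumes each key column simply cycles through its own range of values (row number modulo the column's cardinality). The function names pairwise_coprime and generate_keys are hypothetical and do not come from the patents. Under that assumption, pairwise coprime cardinalities yield a distinct combination of column values for every row up to the product of the cardinalities.

```python
from itertools import combinations
from math import gcd, prod


def pairwise_coprime(cardinalities):
    """Return True if no two cardinalities share a factor other than 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(cardinalities, 2))


def generate_keys(cardinalities, n_rows):
    """Build n_rows composite-key tuples; column i cycles with period cardinalities[i].

    Assumption for illustration only: each column value is row % cardinality.
    """
    if not pairwise_coprime(cardinalities):
        raise ValueError("cardinalities must be pairwise coprime")
    if n_rows > prod(cardinalities):
        raise ValueError("n_rows exceeds the number of distinct key combinations")
    return [tuple(row % c for c in cardinalities) for row in range(n_rows)]


# Cardinalities 3, 4 and 5 are pairwise coprime, so all 60 key tuples are distinct.
rows = generate_keys([3, 4, 5], 60)
print(len(rows), len(set(rows)))
```

With cardinalities that do share a factor, such as 2 and 4, the combined sequence would repeat after 4 rows rather than 8, which is why the sketch rejects them.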