Patents by Inventor Oliver Schabenberger

Oliver Schabenberger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9798755
    Abstract: A data processing system having multiple executable threads is configured to generate a cross-product matrix in a single pass through data. An example system comprises memory for receiving the data, a processor having a plurality of executable threads, and software code for generating a cross-product matrix in a single pass through the data. The software code includes threaded variable levelization code for generating thread specific binary trees for classification variables, variable tree merge code for combining the thread-specific trees into overall trees for the classification variables, effect levelization code for generating sub-matrices of the cross-product matrix using the overall trees for the classification variables, and cross-product matrix generation code for generating the cross-product matrix by storing and ordering the elements of the sub-matrices in contiguous memory space.
    Type: Grant
    Filed: February 12, 2015
    Date of Patent: October 24, 2017
    Assignee: SAS Institute Inc.
    Inventors: James Howard Goodnight, Oliver Schabenberger
  • Patent number: 9665405
    Abstract: Systems and methods are provided for generating multiple system state projections using a grid computing environment. A central coordinator software component executes on a root data processor and provides commands and data to a plurality of node coordinator software components. A node coordinator software component manages threads which execute on its associated node data processor and which perform a set of matrix operations. Stochastic simulations use results of the matrix operations to generate multiple state projections. Additional processing can be performed by the grid computing environment based upon the generated state projections, such as to develop possible change information for users.
    Type: Grant
    Filed: December 21, 2016
    Date of Patent: May 30, 2017
    Assignee: SAS Institute Inc.
    Inventors: James Howard Goodnight, Steve Krueger, Oliver Schabenberger, Christopher D. Bailey
  • Patent number: 9633104
    Abstract: This disclosure describes methods, systems, computer-readable media, and apparatuses for efficiently calculating group-by statistics. A data set that includes multiple entries is accessed. The multiple entries are grouped into group-by subsets which are formed on two or more group-by variables and which are subsets of the data set. Cardinality data is determined for each of the group-by subsets, wherein cardinality data represents a number of entries in a group-by subset. At least one summary of data in each of the group-by subsets is generated, wherein each of the summaries includes the cardinality data determined for the group-by subset. Objects for the group-by subsets are initialized such that the objects store the summaries. The objects may then be used to generate multiple statistical summaries of the data set.
    Type: Grant
    Filed: May 5, 2014
    Date of Patent: April 25, 2017
    Assignee: SAS Institute Inc.
    Inventors: Xunlei Wu, Oliver Schabenberger
  • Patent number: 9507833
    Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
    Type: Grant
    Filed: April 29, 2016
    Date of Patent: November 29, 2016
    Assignee: SAS Institute Inc.
    Inventors: Georges H. Guirguis, Scott Pope, Oliver Schabenberger
  • Publication number: 20160246852
    Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
    Type: Application
    Filed: January 15, 2016
    Publication date: August 25, 2016
    Applicant: SAS Institute Inc.
    Inventors: Scott Pope, Georges H. Guirguis, Oliver Schabenberger
  • Publication number: 20160246853
    Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
    Type: Application
    Filed: April 29, 2016
    Publication date: August 25, 2016
    Applicant: SAS Institute Inc.
    Inventors: Georges H. Guirguis, Scott Pope, Oliver Schabenberger
  • Patent number: 9268796
    Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
    Type: Grant
    Filed: May 29, 2012
    Date of Patent: February 23, 2016
    Assignee: SAS Institute Inc.
    Inventors: Scott Pope, Georges H. Guirguis, Oliver Schabenberger
  • Publication number: 20150154238
    Abstract: Systems and methods are provided for a data processing system having multiple executable threads that is configured to generate a cross-product matrix in a single pass through data to be analyzed. An example system comprises memory for receiving the data to be analyzed, a processor having a plurality of executable threads for executing code to analyze data, and software code for generating a cross-product matrix in a single pass through data to be analyzed.
    Type: Application
    Filed: February 12, 2015
    Publication date: June 4, 2015
    Inventors: James Howard Goodnight, Oliver Schabenberger
  • Publication number: 20150149241
    Abstract: Systems and methods are provided for generating multiple system state projections for one or more scenarios using a grid computing environment. A central coordinator software component executes on a root data processor and provides commands and data to a plurality of node coordinator software components. A node coordinator software component manages threads which execute on its associated node data processor and which perform a set of matrix operations. Stochastic simulations use results of the matrix operations to generate multiple state projections. Additional processing can be performed by the grid computing environment based upon the generated state projections, such as to develop risk information for users.
    Type: Application
    Filed: November 17, 2014
    Publication date: May 28, 2015
    Inventors: James Howard Goodnight, Steve Krueger, Oliver Schabenberger, Christopher D. Bailey
  • Patent number: 8996518
    Abstract: Systems and methods are provided for a data processing system having multiple executable threads that is configured to generate a cross-product matrix in a single pass through data to be analyzed. An example system comprises memory for receiving the data to be analyzed, a processor having a plurality of executable threads for executing code to analyze data, and software code for generating a cross-product matrix in a single pass through data to be analyzed. The software code includes threaded variable levelization code for generating a plurality of thread specific binary trees for a plurality of classification variables, variable tree merge code for combining a plurality of the thread-specific trees into a plurality of overall trees for the plurality of classification variables, effect levelization code for generating sub-matrices using the plurality of the overall trees for the plurality of classification variables, and cross-product matrix generation code for generating the cross-product matrix.
    Type: Grant
    Filed: December 20, 2010
    Date of Patent: March 31, 2015
    Assignee: SAS Institute Inc.
    Inventors: Oliver Schabenberger, James Howard Goodnight
  • Publication number: 20140351196
    Abstract: Systems and methods for determining an optimal splitting scheme for a node in a classification decision tree. A computing system may receive input data related to a decision tree to be generated from a data set. The input data identifies a target attribute of the data set and a set of candidate attributes of the data set to be used as nodes in the decision tree. The computing system may determine, using a clustering algorithm and the set of candidate attributes, a number of potential splitting schemes to be used to split a node in the decision tree. The computing system may calculate a splitting measurement for each of the plurality of potential splitting schemes. The computing system may select an optimal splitting scheme from the plurality of potential splitting schemes for each node in the decision tree based on the splitting measurement.
    Type: Application
    Filed: May 21, 2014
    Publication date: November 27, 2014
    Applicant: SAS Institute Inc.
    Inventors: Xiangqian Hu, Xunlei Wu, Xiangxiang Meng, Oliver Schabenberger
  • Publication number: 20140330826
    Abstract: Systems and methods for data reduction of a data set are included. A computing system may group data points in a data set into a number of data point bubbles represented by a number of representative points. A data point bubble may include one or more data points from the data set and a representative point from the data set. The computing system may calculate a cluster assignment for the representative point by executing a clustering algorithm using the number of representative points.
    Type: Application
    Filed: May 5, 2014
    Publication date: November 6, 2014
    Applicant: SAS Institute Inc.
    Inventors: Xiangqian Hu, Xunlei Wu, Xiangxiang Meng, Oliver Schabenberger
  • Publication number: 20140330827
    Abstract: This disclosure describes methods, systems, computer-readable media, and apparatuses for efficiently calculating group-by statistics. A data set that includes multiple entries is accessed. The multiple entries are grouped into group-by subsets which are formed on two or more group-by variables and which are subsets of the data set. Cardinality data is determined for each of the group-by subsets, wherein cardinality data represents a number of entries in a group-by subset. At least one summary of data in each of the group-by subsets is generated, wherein each of the summaries includes the cardinality data determined for the group-by subset. Objects for the group-by subsets are initialized such that the objects store the summaries. The objects may then be used to generate multiple statistical summaries of the data set.
    Type: Application
    Filed: May 5, 2014
    Publication date: November 6, 2014
    Applicant: SAS Institute Inc.
    Inventors: Xunlei Wu, Oliver Schabenberger
  • Publication number: 20130325825
    Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
    Type: Application
    Filed: May 29, 2012
    Publication date: December 5, 2013
    Inventors: Scott Pope, Georges H. Guirguis, Oliver Schabenberger
  • Patent number: 8271537
    Abstract: Systems and methods are provided for a grid computing system that performs analytical calculations on data stored in a distributed database system. A grid-enabled software component at a control node is configured to invoke database management software (DBMS) at the control node to cause the DBMS at a plurality of the worker nodes to make available data to the grid-enabled software component local to its node; instruct the grid-enabled software components at the plurality of worker nodes to perform an analytical calculation on the received data and to send the results of the data analysis to the grid-enabled software component at the control node; and assemble the results of the data analysis performed by the grid-enabled software components at the plurality of worker nodes.
    Type: Grant
    Filed: November 15, 2010
    Date of Patent: September 18, 2012
    Assignee: SAS Institute Inc.
    Inventors: Oliver Schabenberger, Steve Krueger
  • Publication number: 20120159489
    Abstract: Systems and methods are provided for a data processing system having multiple executable threads that is configured to generate a cross-product matrix in a single pass through data to be analyzed. An example system comprises memory for receiving the data to be analyzed, a processor having a plurality of executable threads for executing code to analyze data, and software code for generating a cross-product matrix in a single pass through data to be analyzed.
    Type: Application
    Filed: December 20, 2010
    Publication date: June 21, 2012
    Inventors: Oliver Schabenberger, James Howard Goodnight
  • Publication number: 20120124100
    Abstract: Systems and methods are provided for a grid computing system that performs analytical calculations on data stored in a distributed database system. A grid-enabled software component at a control node is configured to invoke database management software (DBMS) at the control node to cause the DBMS at a plurality of the worker nodes to make available data to the grid-enabled software component local to its node; instruct the grid-enabled software components at the plurality of worker nodes to perform an analytical calculation on the received data and to send the results of the data analysis to the grid-enabled software component at the control node; and assemble the results of the data analysis performed by the grid-enabled software components at the plurality of worker nodes.
    Type: Application
    Filed: November 15, 2010
    Publication date: May 17, 2012
    Applicant: SAS Institute Inc.
    Inventors: Oliver Schabenberger, Steve Krueger
  • Publication number: 20110202329
    Abstract: Systems and methods are provided for generating multiple system state projections for one or more scenarios using a grid computing environment. A central coordinator software component executes on a root data processor and provides commands and data to a plurality of node coordinator software components. A node coordinator software component manages threads which execute on its associated node data processor and which perform a set of matrix operations. Stochastic simulations use results of the matrix operations to generate multiple state projections. Additional processing can be performed by the grid computing environment based upon the generated state projections, such as to develop risk information for users.
    Type: Application
    Filed: February 12, 2010
    Publication date: August 18, 2011
    Inventors: James Howard Goodnight, Steve Krueger, Oliver Schabenberger, Christopher D. Bailey
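
Several abstracts in this listing (for example, patent 9268796) describe estimating a quantile for data spread across nodes by binning the values and bounding the quantiles of each bin. The Python sketch below is a minimal illustration of that general idea, not the patented implementation. The function names (`bin_counts`, `estimate_quantile`), the equal-width bins, the use of cumulative counts for the quantile bounds, and the linear interpolation inside the identified bin are all assumptions made for this example; the abstracts do not specify them.

```python
from bisect import bisect_right
from typing import List, Sequence


def bin_counts(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Count how many values fall into each bin (hypothetical per-node step)."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Clamp so that the overall maximum lands in the last bin.
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts


def estimate_quantile(nodes: Sequence[Sequence[float]], q: float,
                      num_bins: int = 10) -> float:
    """Estimate quantile q (0..1) from data held on several nodes.

    Assumption: equal-width bins over the global [min, max] range, and
    linear interpolation inside the bin whose quantile bounds contain q.
    """
    lo = min(min(n) for n in nodes if n)
    hi = max(max(n) for n in nodes if n)
    if lo == hi:
        return float(lo)

    width = (hi - lo) / num_bins
    edges = [lo + i * width for i in range(num_bins)] + [hi]

    # Each node reports only its bin counts; the counts are then summed.
    totals = [0] * num_bins
    for node_values in nodes:
        for i, c in enumerate(bin_counts(node_values, edges)):
            totals[i] += c

    n_total = sum(totals)
    cumulative = 0
    for i, c in enumerate(totals):
        lower_q = cumulative / n_total        # lower quantile bound of bin i
        upper_q = (cumulative + c) / n_total  # upper quantile bound of bin i
        if c > 0 and lower_q <= q <= upper_q:
            frac = (q - lower_q) / (upper_q - lower_q)
            return edges[i] + frac * (edges[i + 1] - edges[i])
        cumulative += c
    return float(hi)


if __name__ == "__main__":
    # Three "nodes", each holding part of the data.
    nodes = [[1, 2, 3, 4], [5, 6, 7], [8, 9, 10]]
    print(estimate_quantile(nodes, 0.5, num_bins=3))
```

In this sketch only the bin counts cross node boundaries, so the amount of data exchanged depends on the number of bins rather than the number of observations. The estimate is only as precise as the width of the identified bin.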