Patents by Inventor Oliver Schabenberger
Oliver Schabenberger has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9798755
Abstract: A data processing system having multiple executable threads is configured to generate a cross-product matrix in a single pass through data. An example system comprises memory for receiving the data, a processor having a plurality of executable threads, and software code for generating a cross-product matrix in a single pass through the data. The software code includes threaded variable levelization code for generating thread specific binary trees for classification variables, variable tree merge code for combining the thread-specific trees into overall trees for the classification variables, effect levelization code for generating sub-matrices of the cross-product matrix using the overall trees for the classification variables, and cross-product matrix generation code for generating the cross-product matrix by storing and ordering the elements of the sub-matrices in contiguous memory space.
Type: Grant
Filed: February 12, 2015
Date of Patent: October 24, 2017
Assignee: SAS Institute Inc.
Inventors: James Howard Goodnight, Oliver Schabenberger
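The single-pass idea in the abstract above can be illustrated with a small sketch. It is not the patented implementation: it assumes one classification variable and one numeric variable per row, uses plain dictionaries keyed by column labels in place of the thread-specific binary trees, and all function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_crossproducts(rows):
    """One pass over a chunk: sparse X'X keyed by pairs of column labels."""
    acc = Counter()
    for level, x in rows:
        # Design row: intercept, an indicator for the observed level, and x.
        cols = [(("intercept",), 1.0), (("class", level), 1.0), (("x",), x)]
        for name_i, v_i in cols:
            for name_j, v_j in cols:
                acc[(name_i, name_j)] += v_i * v_j
    return acc


def crossproduct_matrix(rows, n_threads=2):
    """Accumulate per-thread partial results, merge them, then order them."""
    chunks = [rows[i::n_threads] for i in range(n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        partials = list(pool.map(partial_crossproducts, chunks))

    # Merge step (stand-in for the tree merge): add the per-thread results.
    total = Counter()
    for partial in partials:
        total.update(partial)

    # Order the levels seen by any thread and lay the matrix out densely.
    levels = sorted({name[1] for name, _ in total if name[0] == "class"})
    columns = [("intercept",)] + [("class", lv) for lv in levels] + [("x",)]
    matrix = [[total.get((ci, cj), 0.0) for cj in columns] for ci in columns]
    return columns, matrix


columns, matrix = crossproduct_matrix([("a", 1.0), ("b", 2.0), ("a", 3.0)])
```

Because each thread keys its partial sums by level rather than by column position, no thread needs to know the full set of levels before reading its data; the column order is fixed only after the merge.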
-
Patent number: 9665405
Abstract: Systems and methods are provided for generating multiple system state projections using a grid computing environment. A central coordinator software component executes on a root data processor and provides commands and data to a plurality of node coordinator software components. A node coordinator software component manages threads which execute on its associated node data processor and which perform a set of matrix operations. Stochastic simulations use results of the matrix operations to generate multiple state projections. Additional processing can be performed by the grid computing environment based upon the generated state projections, such as to develop possible change information for users.
Type: Grant
Filed: December 21, 2016
Date of Patent: May 30, 2017
Assignee: SAS Institute Inc.
Inventors: James Howard Goodnight, Steve Krueger, Oliver Schabenberger, Christopher D. Bailey
-
Patent number: 9633104
Abstract: This disclosure describes methods, systems, computer-readable media, and apparatuses for efficiently calculating group-by statistics. A data set that includes multiple entries is accessed. The multiple entries are grouped into group-by subsets which are formed on two or more group-by variables and which are subsets of the data set. Cardinality data is determined for each of the group-by subsets, wherein cardinality data represents a number of entries in a group-by subset. At least one summary of data in each of the group-by subsets is generated, wherein each of the summaries includes the cardinality data determined for the group-by subset. Objects for the group-by subsets are initialized such that the objects store the summaries. The objects may then be used to generate multiple statistical summaries of the data set.
Type: Grant
Filed: May 5, 2014
Date of Patent: April 25, 2017
Assignee: SAS Institute Inc.
Inventors: Xunlei Wu, Oliver Schabenberger
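As a rough illustration of the group-by abstract above, the sketch below stores one summary object per group, including the group's cardinality, and derives several statistics from it afterwards. The class name, the choice of stored sums, and the two group-by variables are assumptions made for the example, not details taken from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class GroupSummary:
    """Summary object for one group-by subset (hypothetical layout)."""
    count: int = 0          # cardinality of the subset
    total: float = 0.0      # sum of the values
    total_sq: float = 0.0   # sum of the squared values

    def add(self, value):
        self.count += 1
        self.total += value
        self.total_sq += value * value

    @property
    def mean(self):
        return self.total / self.count

    @property
    def variance(self):
        """Sample variance, or None when the subset has fewer than 2 entries."""
        if self.count < 2:
            return None
        return (self.total_sq - self.total ** 2 / self.count) / (self.count - 1)


def summarize(rows):
    """Group rows of (var1, var2, value) on the two group-by variables."""
    summaries = defaultdict(GroupSummary)
    for var1, var2, value in rows:
        summaries[(var1, var2)].add(value)
    return dict(summaries)


summaries = summarize([("east", "a", 1.0), ("east", "a", 3.0), ("west", "b", 5.0)])
```

Once the summaries exist, the mean, the variance, and the counts can all be read from the stored objects without another pass over the data.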
-
Patent number: 9507833
Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
Type: Grant
Filed: April 29, 2016
Date of Patent: November 29, 2016
Assignee: SAS Institute Inc.
Inventors: Georges H. Guirguis, Scott Pope, Oliver Schabenberger
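The binning approach in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions: each node is a non-empty list of numbers, the bins are equal-width, and the final estimate is obtained by linear interpolation inside the identified bin, which is a choice made for this example and not necessarily what the patent does. The function name and parameters are hypothetical.

```python
def estimate_quantile(nodes, q, n_bins=10):
    """Estimate quantile q (0..1) from per-node bin counts only."""
    lo = min(min(node) for node in nodes)
    hi = max(max(node) for node in nodes)
    if lo == hi:
        return lo
    width = (hi - lo) / n_bins

    def node_counts(values):
        # Each node counts how many of its local values fall in each bin.
        counts = [0] * n_bins
        for v in values:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
        return counts

    # Only the small count vectors are combined across nodes.
    totals = [sum(c) for c in zip(*(node_counts(node) for node in nodes))]
    n = sum(totals)

    cumulative = 0
    for idx, count in enumerate(totals):
        lower_q = cumulative / n              # lower quantile bound of the bin
        upper_q = (cumulative + count) / n    # upper quantile bound of the bin
        if count and lower_q <= q <= upper_q:
            # Assumption: interpolate linearly within the identified bin.
            frac = (q - lower_q) / (upper_q - lower_q)
            return lo + (idx + frac) * width
        cumulative += count
    return hi


estimate = estimate_quantile([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9, 10]], q=0.5)
```

The estimate is only as precise as the bin width; a finer result could be obtained by re-binning the identified bin, which this sketch leaves out.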
-
Publication number: 20160246852
Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
Type: Application
Filed: January 15, 2016
Publication date: August 25, 2016
Applicant: SAS INSTITUTE INC.
Inventors: Scott Pope, Georges H. Guirguis, Oliver Schabenberger
-
Publication number: 20160246853
Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
Type: Application
Filed: April 29, 2016
Publication date: August 25, 2016
Applicant: SAS Institute Inc.
Inventors: Georges H. Guirguis, Scott Pope, Oliver Schabenberger
-
Patent number: 9268796
Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
Type: Grant
Filed: May 29, 2012
Date of Patent: February 23, 2016
Assignee: SAS Institute Inc.
Inventors: Scott Pope, Georges H. Guirguis, Oliver Schabenberger
-
Publication number: 20150154238
Abstract: Systems and methods are provided for a data processing system having multiple executable threads that is configured to generate a cross-product matrix in a single pass through data to be analyzed. An example system comprises memory for receiving the data to be analyzed, a processor having a plurality of executable threads for executing code to analyze data, and software code for generating a cross-product matrix in a single pass through data to be analyzed.
Type: Application
Filed: February 12, 2015
Publication date: June 4, 2015
Inventors: James Howard Goodnight, Oliver Schabenberger
-
Publication number: 20150149241
Abstract: Systems and methods are provided for generating multiple system state projections for one or more scenarios using a grid computing environment. A central coordinator software component executes on a root data processor and provides commands and data to a plurality of node coordinator software components. A node coordinator software component manages threads which execute on its associated node data processor and which perform a set of matrix operations. Stochastic simulations use results of the matrix operations to generate multiple state projections. Additional processing can be performed by the grid computing environment based upon the generated state projections, such as to develop risk information for users.
Type: Application
Filed: November 17, 2014
Publication date: May 28, 2015
Inventors: James Howard Goodnight, Steve Krueger, Oliver Schabenberger, Christopher D. Bailey
-
Patent number: 8996518
Abstract: Systems and methods are provided for a data processing system having multiple executable threads that is configured to generate a cross-product matrix in a single pass through data to be analyzed. An example system comprises memory for receiving the data to be analyzed, a processor having a plurality of executable threads for executing code to analyze data, and software code for generating a cross-product matrix in a single pass through data to be analyzed. The software code includes threaded variable levelization code for generating a plurality of thread specific binary trees for a plurality of classification variables, variable tree merge code for combining a plurality of the thread-specific trees into a plurality of overall trees for the plurality of classification variables, effect levelization code for generating sub-matrices using the plurality of the overall trees for the plurality of classification variables, and cross-product matrix generation code for generating the cross-product matrix.
Type: Grant
Filed: December 20, 2010
Date of Patent: March 31, 2015
Assignee: SAS Institute Inc.
Inventors: Oliver Schabenberger, James Howard Goodnight
-
Publication number: 20140351196
Abstract: Systems and methods for determining an optimal splitting scheme for a node in a classification decision tree. A computing system may receive input data related to a decision tree to be generated from a data set. The input data identifies a target attribute of the data set and a set of candidate attributes of the data set to be used as nodes in the decision tree. The computing system may determine, using a clustering algorithm and the set of candidate attributes, a number of potential splitting schemes to be used to split a node in the decision tree. The computing system may calculate a splitting measurement for each of the plurality of potential splitting schemes. The computing system may select an optimal splitting scheme from the plurality of potential splitting schemes for each node in the decision tree based on the splitting measurement.
Type: Application
Filed: May 21, 2014
Publication date: November 27, 2014
Applicant: SAS Institute Inc.
Inventors: Xiangqian Hu, Xunlei Wu, Xiangxiang Meng, Oliver Schabenberger
-
Publication number: 20140330826
Abstract: Systems and methods for data reduction of a data set are included. A computing system may group data points in a data set into a number of data point bubbles represented by a number of representative points. A data point bubble may include one or more data points from the data set and a representative point from the data set. The computing system may calculate a cluster assignment for the representative point by executing a clustering algorithm using the number of representative points.
Type: Application
Filed: May 5, 2014
Publication date: November 6, 2014
Applicant: SAS Institute Inc.
Inventors: Xiangqian Hu, Xunlei Wu, Xiangxiang Meng, Oliver Schabenberger
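A minimal sketch of the data-reduction idea in the abstract above: points are grouped into bubbles around representative points, only the representatives are clustered, and every point inherits the cluster of its bubble. The way representatives are picked (evenly spaced rows) and the small k-means routine are assumptions chosen to keep the example short; the function names are hypothetical.

```python
import math


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def kmeans(points, k, iterations=10):
    """Tiny k-means with a naive deterministic start (first k points)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        members = [[] for _ in range(k)]
        for p in points:
            members[nearest(p, centers)].append(p)
        for i, group in enumerate(members):
            if group:
                centers[i] = [sum(c) / len(group) for c in zip(*group)]
    return [nearest(p, centers) for p in points]


def cluster_with_bubbles(data, n_bubbles, k):
    # Assumption: representatives are evenly spaced rows of the data set.
    step = max(1, len(data) // n_bubbles)
    representatives = data[::step][:n_bubbles]
    # Each data point joins the bubble of its nearest representative.
    bubble_of = [nearest(p, representatives) for p in data]
    # The clustering algorithm runs on the representatives only.
    rep_cluster = kmeans(representatives, k)
    return [rep_cluster[b] for b in bubble_of]


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = cluster_with_bubbles(data, n_bubbles=2, k=2)
```

The clustering cost then depends on the number of bubbles and not on the number of data points, which is the point of the reduction.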
-
Publication number: 20140330827
Abstract: This disclosure describes methods, systems, computer-readable media, and apparatuses for efficiently calculating group-by statistics. A data set that includes multiple entries is accessed. The multiple entries are grouped into group-by subsets which are formed on two or more group-by variables and which are subsets of the data set. Cardinality data is determined for each of the group-by subsets, wherein cardinality data represents a number of entries in a group-by subset. At least one summary of data in each of the group-by subsets is generated, wherein each of the summaries includes the cardinality data determined for the group-by subset. Objects for the group-by subsets are initialized such that the objects store the summaries. The objects may then be used to generate multiple statistical summaries of the data set.
Type: Application
Filed: May 5, 2014
Publication date: November 6, 2014
Applicant: SAS INSTITUTE INC.
Inventors: Xunlei Wu, Oliver Schabenberger
-
Publication number: 20130325825
Abstract: In accordance with the teachings described herein, systems and methods are provided for estimating quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined that are each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The specified quantile is estimated based on an identified one of the plurality of data bins that includes the specified quantile based on the lower and upper quantile bounds.
Type: Application
Filed: May 29, 2012
Publication date: December 5, 2013
Inventors: Scott Pope, Georges H. Guirguis, Oliver Schabenberger
-
Patent number: 8271537
Abstract: Systems and methods are provided for a grid computing system that performs analytical calculations on data stored in a distributed database system. A grid-enabled software component at a control node is configured to invoke database management software (DBMS) at the control node to cause the DBMS at a plurality of the worker nodes to make available data to the grid-enabled software component local to its node; instruct the grid-enabled software components at the plurality of worker nodes to perform an analytical calculation on the received data and to send the results of the data analysis to the grid-enabled software component at the control node; and assemble the results of the data analysis performed by the grid-enabled software components at the plurality of worker nodes.
Type: Grant
Filed: November 15, 2010
Date of Patent: September 18, 2012
Assignee: SAS Institute Inc.
Inventors: Oliver Schabenberger, Steve Krueger
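The control-node and worker-node pattern in the abstract above can be sketched in a few lines. This is only an analogy: threads stand in for worker nodes, in-memory lists stand in for the data each node's DBMS makes available locally, and the analytical calculation is a simple mean. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def worker_summary(local_rows):
    """Worker node: analyze only the locally available data."""
    return len(local_rows), sum(local_rows)


def control_node_mean(partitions):
    """Control node: instruct the workers, then assemble their results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        results = list(pool.map(worker_summary, partitions))
    n = sum(count for count, _ in results)
    total = sum(subtotal for _, subtotal in results)
    return total / n


overall_mean = control_node_mean([[1.0, 2.0], [3.0], [4.0, 6.0]])
```

Only the small per-worker results travel to the control node; the rows themselves stay where they are stored.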
-
Publication number: 20120159489
Abstract: Systems and methods are provided for a data processing system having multiple executable threads that is configured to generate a cross-product matrix in a single pass through data to be analyzed. An example system comprises memory for receiving the data to be analyzed, a processor having a plurality of executable threads for executing code to analyze data, and software code for generating a cross-product matrix in a single pass through data to be analyzed.
Type: Application
Filed: December 20, 2010
Publication date: June 21, 2012
Inventors: Oliver Schabenberger, James Howard Goodnight
-
Publication number: 20120124100
Abstract: Systems and methods are provided for a grid computing system that performs analytical calculations on data stored in a distributed database system. A grid-enabled software component at a control node is configured to invoke database management software (DBMS) at the control node to cause the DBMS at a plurality of the worker nodes to make available data to the grid-enabled software component local to its node; instruct the grid-enabled software components at the plurality of worker nodes to perform an analytical calculation on the received data and to send the results of the data analysis to the grid-enabled software component at the control node; and assemble the results of the data analysis performed by the grid-enabled software components at the plurality of worker nodes.
Type: Application
Filed: November 15, 2010
Publication date: May 17, 2012
Applicant: SAS Institute Inc.
Inventors: Oliver Schabenberger, Steve Krueger
-
Publication number: 20110202329
Abstract: Systems and methods are provided for generating multiple system state projections for one or more scenarios using a grid computing environment. A central coordinator software component executes on a root data processor and provides commands and data to a plurality of node coordinator software components. A node coordinator software component manages threads which execute on its associated node data processor and which perform a set of matrix operations. Stochastic simulations use results of the matrix operations to generate multiple state projections. Additional processing can be performed by the grid computing environment based upon the generated state projections, such as to develop risk information for users.
Type: Application
Filed: February 12, 2010
Publication date: August 18, 2011
Inventors: James Howard Goodnight, Steve Krueger, Oliver Schabenberger, Christopher D. Bailey