Patents by Inventor Joseph Yarmus
Joseph Yarmus has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9471545
Abstract: Processes, machines, and stored machine instructions are provided for approximating value densities in data. While generating a resulting density model to approximate value densities in a set of data, density modeling logic selects a functional component of a first model to vary based at least in part on how much the functional component contributes to how well the first model approximates the value densities. The density modeling logic then uses at least the functional component and a variation of the functional component as seed components to determine adjusted functional components of a second model by iteratively determining, in an expectation step, how much the seed components contribute to how well the second model explains the values, and, in a maximization step, new seed components, optionally to be used in further iterations, based at least in part on how much of the values are attributable to the seed components.
Type: Grant
Filed: February 11, 2013
Date of Patent: October 18, 2016
Assignee: Oracle International Corporation
Inventors: Boriana Lubomirova Milenova, Marcos M Campos, Joseph Yarmus
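
The abstract above describes seeding an expectation-maximization (EM) loop with a component and a variation of it. The following is a minimal sketch of that general idea for a one-dimensional Gaussian mixture, not the patented implementation. The function names, the toy data, the selection rule in `pick_component`, and the choice of a one-standard-deviation shift as the "variation" are all assumptions made for illustration.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def pick_component(components):
    """Hypothetical selection rule: vary the component with the largest weight * variance."""
    return max(range(len(components)), key=lambda k: components[k][0] * components[k][2])

def seed_from(component):
    """Return the component and a shifted variation of it, splitting its weight."""
    weight, mean, var = component
    return [(weight / 2, mean, var), (weight / 2, mean + math.sqrt(var), var)]

def em(data, components, iterations=100):
    """Refine (weight, mean, variance) seed components with plain EM."""
    for _ in range(iterations):
        # Expectation step: how much each component explains each value.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in components]
            total = sum(dens)
            if total == 0.0:
                resp.append([1.0 / len(dens)] * len(dens))
            else:
                resp.append([d / total for d in dens])
        # Maximization step: new components from the values attributed to each.
        updated = []
        for k, old in enumerate(components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                updated.append(old)  # nothing attributed; keep the old component
                continue
            mean = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, data)) / nk
            updated.append((nk / len(data), mean, max(var, 1e-6)))
        components = updated
    return components

# Toy data with two well-separated clusters (illustrative only).
data = [0.0, 0.2, -0.1, 0.1, -0.2, 10.0, 10.2, 9.9, 10.1, 9.8]

# First model: a single Gaussian fitted to all values.
mean = sum(data) / len(data)
var = sum((x - mean) ** 2 for x in data) / len(data)
first_model = [(1.0, mean, var)]

# Second model: replace the chosen component by itself plus a variation, then run EM.
chosen = pick_component(first_model)
seeds = first_model[:chosen] + seed_from(first_model[chosen]) + first_model[chosen + 1:]
second_model = em(data, seeds)
```

On this toy data the two refined components are expected to settle near the two clusters. A real density model would also need a stopping rule for deciding when further splits no longer help, which this sketch omits.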
-
Patent number: 9292550
Abstract: Systems, methods, and other embodiments associated with feature generation and model selection for generalized linear models are described. In one embodiment, a method includes ordering candidate features in a dataset being considered by a streamwise feature selection process according to an inclusion score that reflects a likelihood that a given candidate feature will be included in the GLM. The ordered candidate features are provided to the streamwise feature selection process for acceptance testing. In one embodiment, the method also includes selecting penalty criterion for use in the acceptance testing that is based on characteristics of the dataset.
Type: Grant
Filed: February 21, 2013
Date of Patent: March 22, 2016
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventor: Joseph Yarmus
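
As a rough illustration of ordering candidates before streamwise acceptance testing, the sketch below ranks features by a score and then tests them one at a time. It is a simplified stand-in, not the patented method: the inclusion score is assumed to be the absolute correlation with the target, the model is a greedy fit on residuals rather than a full GLM, and the penalty is a fixed number rather than one derived from the dataset. All names and data are hypothetical.

```python
def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def streamwise_select(features, target, penalty):
    """Order features by an assumed inclusion score, then accept them one by one."""
    ordered = sorted(
        features,
        key=lambda name: abs(correlation(features[name], target)),
        reverse=True,
    )
    mean_y = sum(target) / len(target)
    residual = [y - mean_y for y in target]
    accepted = []
    for name in ordered:
        xs = features[name]
        mx = sum(xs) / len(xs)
        centered = [x - mx for x in xs]
        sxx = sum(c * c for c in centered)
        if sxx == 0:
            continue  # a constant feature cannot explain anything
        # Fit the feature to what the accepted features have not yet explained.
        slope = sum(c * r for c, r in zip(centered, residual)) / sxx
        new_residual = [r - slope * c for r, c in zip(residual, centered)]
        improvement = sum(r * r for r in residual) - sum(r * r for r in new_residual)
        # Acceptance test: keep the feature only if it beats the penalty.
        if improvement > penalty:
            accepted.append(name)
            residual = new_residual
    return ordered, accepted

# Hypothetical toy dataset.
target = [2.0, 4.1, 5.9, 8.0, 10.1, 11.9]
features = {
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "constant": [1.0] * 6,
    "signal": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
}
ordered, accepted = streamwise_select(features, target, penalty=1.0)
```

Testing the most promising candidates first matters in a streamwise process because each feature is considered once, in order, against the model built so far.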
-
Patent number: 9135309
Abstract: A computer-implemented method of creating a data mining model in a database management system comprises accepting a database language statement at the database management system, the database language statement indicating a dataset and a data mining model to be created from the dataset, and creating, in the database management system, the indicated data mining model using the indicated dataset, wherein creation and application of the data mining model does not require moving data to a separate data mining engine.
Type: Grant
Filed: November 18, 2011
Date of Patent: September 15, 2015
Assignee: Oracle International Corporation
Inventors: Wei Li, Shiby Thomas, Joseph Yarmus, Ari W. Mozes, Mahesh Jagannath
-
Publication number: 20140236965
Abstract: Systems, methods, and other embodiments associated with feature generation and model selection for generalized linear models are described. In one embodiment, a method includes ordering candidate features in a dataset being considered by a streamwise feature selection process according to an inclusion score that reflects a likelihood that a given candidate feature will be included in the GLM. The ordered candidate features are provided to the streamwise feature selection process for acceptance testing. In one embodiment, the method also includes selecting penalty criterion for use in the acceptance testing that is based on characteristics of the dataset.
Type: Application
Filed: February 21, 2013
Publication date: August 21, 2014
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Joseph YARMUS
-
Publication number: 20140229147
Abstract: Processes, machines, and stored machine instructions are provided for approximating value densities in data. While generating a resulting density model to approximate value densities in a set of data, density modeling logic selects a functional component of a first model to vary based at least in part on how much the functional component contributes to how well the first model approximates the value densities. The density modeling logic then uses at least the functional component and a variation of the functional component as seed components to determine adjusted functional components of a second model by iteratively determining, in an expectation step, how much the seed components contribute to how well the second model explains the values, and, in a maximization step, new seed components, optionally to be used in further iterations, based at least in part on how much of the values are attributable to the seed components.
Type: Application
Filed: February 11, 2013
Publication date: August 14, 2014
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventors: Boriana Lubomirova Milenova, Marcos M. Campos, Joseph Yarmus
-
Patent number: 8280915
Abstract: Binning of predictor values used for generating a data mining model provides useful reduction in memory footprint and computation during the computationally dominant decision tree build phase, but reduces the information loss of the model and reduces the introduction of false information artifacts. A method of binning data in a database for data mining modeling in a database system, the data stored in a database table in the database system, the data mining modeling having selected at least one predictor and one target for the data, the data including a plurality of values of the predictor and a plurality of values of the target, the method comprises constructing a binary tree for the predictor that splits the values of the predictor into a plurality of portions, pruning the binary tree, and defining as bins of the predictor leaves of the tree that remain after pruning, each leaf of the tree representing a portion of the values of the predictor.
Type: Grant
Filed: February 1, 2006
Date of Patent: October 2, 2012
Assignee: Oracle International Corporation
Inventors: Mahesh Jagannath, Chitra Bhagwat, Joseph Yarmus, Ari W. Mozes
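
The build, prune, and read-off-the-leaves sequence in the abstract above can be sketched as follows. This is an illustration under assumptions, not the patented procedure: splits are chosen by Gini impurity reduction, pruning collapses a split whose gain falls below a hypothetical `min_gain` threshold, and the depth limit and toy data are invented for the example.

```python
def gini(labels):
    """Gini impurity of a list of target labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def build(pairs, depth=0, max_depth=3):
    """Build a binary tree over (predictor_value, target) pairs sorted by value."""
    labels = [t for _, t in pairs]
    node = {"pairs": pairs, "gain": 0.0, "cut": None, "left": None, "right": None}
    if depth == max_depth or len(set(labels)) <= 1:
        return node
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between equal predictor values
        left, right = labels[:i], labels[i:]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        gain = gini(labels) - weighted
        if best is None or gain > best[0]:
            best = (gain, i)
    if best is None:
        return node
    gain, i = best
    node["gain"] = gain
    node["cut"] = (pairs[i - 1][0] + pairs[i][0]) / 2
    node["left"] = build(pairs[:i], depth + 1, max_depth)
    node["right"] = build(pairs[i:], depth + 1, max_depth)
    return node

def prune(node, min_gain):
    """Bottom-up: collapse a split over two leaves when its gain is too small."""
    if node["cut"] is None:
        return node
    prune(node["left"], min_gain)
    prune(node["right"], min_gain)
    both_leaves = node["left"]["cut"] is None and node["right"]["cut"] is None
    if both_leaves and node["gain"] < min_gain:
        node["cut"] = node["left"] = node["right"] = None
    return node

def bins(node):
    """Each remaining leaf becomes one bin, reported as (min_value, max_value)."""
    if node["cut"] is None:
        values = [v for v, _ in node["pairs"]]
        return [(min(values), max(values))]
    return bins(node["left"]) + bins(node["right"])

# Hypothetical predictor/target pairs.
data = sorted([(1, "a"), (2, "a"), (3, "a"), (10, "b"), (11, "b"), (12, "a"), (13, "b")])
tree = prune(build(data, max_depth=2), min_gain=0.2)
result = bins(tree)
```

Here the weak split inside the upper range is pruned away, so the predictor ends up with two bins whose boundary follows the target rather than a fixed-width grid.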
-
Publication number: 20120066260
Abstract: A computer-implemented method of creating a data mining model in a database management system comprises accepting a database language statement at the database management system, the database language statement indicating a dataset and a data mining model to be created from the dataset, and creating, in the database management system, the indicated data mining model using the indicated dataset, wherein creation and application of the data mining model does not require moving data to a separate data mining engine.
Type: Application
Filed: November 18, 2011
Publication date: March 15, 2012
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventors: Wei LI, Shiby THOMAS, Joseph YARMUS, Ari W. MOZES, Mahesh JAGANNATH
-
Patent number: 8065326
Abstract: Decision trees are efficiently represented in a relational database. A computer-implemented method of representing a decision tree model in relational form comprises providing a directed acyclic graph comprising a plurality of nodes and a plurality of links, each link connecting a plurality of nodes, encoding a tree structure by including in each node a parent-child relationship of the node with other nodes, encoding in each node information relating to a split represented by the node, the split information including a splitting predictor and a split value, and encoding in each node a target histogram.
Type: Grant
Filed: February 1, 2006
Date of Patent: November 22, 2011
Assignee: Oracle International Corporation
Inventors: Wei Li, Shiby Thomas, Joseph Yarmus, Ari W. Mozes, Mahesh Jagannath
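
One way to picture a decision tree stored in relational form is a table with one row per node. The sketch below uses Python's built-in `sqlite3` module; the table name, column names, JSON encoding of the histogram, and the "less than or equal goes left" rule are assumptions for illustration and are not taken from the patent.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per node: parent/child links, split information, and a target histogram.
conn.execute(
    """
    CREATE TABLE tree_nodes (
        node_id         INTEGER PRIMARY KEY,
        parent_id       INTEGER,
        left_child      INTEGER,
        right_child     INTEGER,
        split_predictor TEXT,
        split_value     REAL,
        target_histogram TEXT
    )
    """
)
rows = [
    # Root splits on a hypothetical "age" predictor at 30.
    (0, None, 1, 2, "age", 30.0, json.dumps({"yes": 5, "no": 5})),
    # Leaves carry only a histogram of target values.
    (1, 0, None, None, None, None, json.dumps({"yes": 4, "no": 1})),
    (2, 0, None, None, None, None, json.dumps({"yes": 1, "no": 4})),
]
conn.executemany("INSERT INTO tree_nodes VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

def predict(conn, record):
    """Walk from the root to a leaf and return the most frequent target there."""
    node_id = 0
    while True:
        left, right, predictor, value, histogram = conn.execute(
            "SELECT left_child, right_child, split_predictor, split_value, "
            "target_histogram FROM tree_nodes WHERE node_id = ?",
            (node_id,),
        ).fetchone()
        if predictor is None:  # leaf node
            counts = json.loads(histogram)
            return max(counts, key=counts.get)
        node_id = left if record[predictor] <= value else right

young = predict(conn, {"age": 25})
older = predict(conn, {"age": 40})
```

Keeping the histogram on every node, not only on leaves, lets a query report class distributions at any depth without rescanning the training data.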
-
Patent number: 7571159
Abstract: A method, system, and computer program product for counting predictor-target pairs for a decision tree model provides the capability to generate count tables that is quicker and more efficient than previous techniques. A method of counting predictor-target pairs for a decision tree model, the decision tree model based on data stored in a database, the data comprising a plurality of rows of data, at least one predictor and at least one target, comprises generating a bitmap for each split node of data stored in a database system by intersecting a parent node bitmap and a bitmap of a predictor that satisfies a condition of the node, intersecting each split node bitmap with each predictor bitmap and with each target bitmap to form intersected bitmaps, and counting bits of each intersected bitmap to generate a count of predictor-target pairs.
Type: Grant
Filed: February 1, 2006
Date of Patent: August 4, 2009
Assignee: Oracle International Corporation
Inventors: Shiby Thomas, Wei Li, Joseph Yarmus, Mahesh Jagannath, Ari W. Mozes
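
The bitmap intersections described in the abstract above can be sketched with Python integers standing in for bitmaps, where bit i represents row i. The column names and data are hypothetical, and a database system would use its own bitmap structures rather than Python integers.

```python
def build_bitmaps(values):
    """Map each distinct value to an int bitmap; bit i is set when row i has that value."""
    bitmaps = {}
    for row, value in enumerate(values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps

def count_pairs(node_bitmap, predictor_bitmaps, target_bitmaps):
    """Count rows in the node for every (predictor value, target value) pair."""
    counts = {}
    for p_value, p_bits in predictor_bitmaps.items():
        for t_value, t_bits in target_bitmaps.items():
            intersected = node_bitmap & p_bits & t_bits
            counts[(p_value, t_value)] = bin(intersected).count("1")
    return counts

# Hypothetical six-row dataset: two predictors and one target.
color = ["red", "blue", "red", "red", "blue", "red"]
size = ["S", "S", "L", "L", "L", "S"]
target = ["yes", "no", "yes", "no", "no", "yes"]

color_bitmaps = build_bitmaps(color)
size_bitmaps = build_bitmaps(size)
target_bitmaps = build_bitmaps(target)

# The root node contains every row.
root = (1 << len(target)) - 1
# Split node for the condition color == "red": parent bitmap AND predictor bitmap.
red_node = root & color_bitmaps["red"]
# Count size/target pairs among the rows that reached the split node.
pair_counts = count_pairs(red_node, size_bitmaps, target_bitmaps)
```

Because each count is a bitwise AND followed by a bit count, no row-level scan of the table is needed once the bitmaps exist.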
-
Publication number: 20070192341
Abstract: A method, system, and computer program product for counting predictor-target pairs for a decision tree model provides the capability to generate count tables that is quicker and more efficient than previous techniques. A method of counting predictor-target pairs for a decision tree model, the decision tree model based on data stored in a database, the data comprising a plurality of rows of data, at least one predictor and at least one target, comprises generating a bitmap for each split node of data stored in a database system by intersecting a parent node bitmap and a bitmap of a predictor that satisfies a condition of the node, intersecting each split node bitmap with each predictor bitmap and with each target bitmap to form intersected bitmaps, and counting bits of each intersected bitmap to generate a count of predictor-target pairs.
Type: Application
Filed: February 1, 2006
Publication date: August 16, 2007
Inventors: Shiby Thomas, Wei Li, Joseph Yarmus, Mahesh Jagannath, Ari Mozes
-
Publication number: 20070185896
Abstract: Binning of predictor values used for generating a data mining model provides useful reduction in memory footprint and computation during the computationally dominant decision tree build phase, but reduces the information loss of the model and reduces the introduction of false information artifacts. A method of binning data in a database for data mining modeling in a database system, the data stored in a database table in the database system, the data mining modeling having selected at least one predictor and one target for the data, the data including a plurality of values of the predictor and a plurality of values of the target, the method comprises constructing a binary tree for the predictor that splits the values of the predictor into a plurality of portions, pruning the binary tree, and defining as bins of the predictor leaves of the tree that remain after pruning, each leaf of the tree representing a portion of the values of the predictor.
Type: Application
Filed: February 1, 2006
Publication date: August 9, 2007
Inventors: Mahesh Jagannath, Chitra Bhagwat, Joseph Yarmus, Ari Mozes
-
Publication number: 20070179966
Abstract: Decision trees are efficiently represented in a relational database. A computer-implemented method of representing a decision tree model in relational form comprises providing a directed acyclic graph comprising a plurality of nodes and a plurality of links, each link connecting a plurality of nodes, encoding a tree structure by including in each node a parent-child relationship of the node with other nodes, encoding in each node information relating to a split represented by the node, the split information including a splitting predictor and a split value, and encoding in each node a target histogram.
Type: Application
Filed: February 1, 2006
Publication date: August 2, 2007
Inventors: Wei Li, Shiby Thomas, Joseph Yarmus, Ari Mozes, Mahesh Jagannath
-
Publication number: 20050246354
Abstract: An implementation of NMF functionality integrated into a relational database management system provides the capability to apply NMF to relational datasets and to sparse datasets. A database management system comprises a multi-dimensional data table operable to store data and a processing unit operable to perform non-negative matrix factorization on data stored in the multi-dimensional data table and to generate a plurality of data tables, each data table being smaller than the multi-dimensional data table and having reduced dimensionality relative to the multi-dimensional data table. The multi-dimensional data table may be a relational data table.
Type: Application
Filed: August 27, 2004
Publication date: November 3, 2005
Inventors: Pablo Tamayo, George Tang, Mark McCracken, Mahesh Jagannath, Marcos Campos, Boriana Milenova, Joseph Yarmus, Pavani Kuntala
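
For readers unfamiliar with non-negative matrix factorization (NMF), the sketch below factors a small non-negative table V into two smaller non-negative tables W and H so that W times H approximates V. It uses the well-known multiplicative update rules for squared error in plain Python; the application does not state which NMF algorithm it uses, so this choice, the function names, and the toy matrix are assumptions.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transpose(m):
    return [list(row) for row in zip(*m)]

def frobenius_error(v, w, h):
    """Squared reconstruction error between V and W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0):
    """Factor V (n x m) into W (n x rank) and H (rank x m), all entries non-negative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    eps = 1e-9  # guards against division by zero
    for _ in range(iterations):
        # Update H: H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(rank)]
        # Update W: W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)] for i in range(n)]
    return w, h

# Hypothetical 4 x 3 non-negative table with rank 2.
v = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 3.0],
    [2.0, 1.0, 5.0],
]
w, h = nmf(v, rank=2)
error = frobenius_error(v, w, h)
```

The two factors are the "smaller tables with reduced dimensionality" in spirit: W has one row per original row and H has one column per original column, each with only `rank` entries along the shared dimension.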
-
Publication number: 20050049990
Abstract: An implementation of SVM functionality improves efficiency, time consumption, and data security, reduces the parameter tuning challenges presented to the inexperienced user, and reduces the computational costs of building SVM models. A system for support vector machine processing comprises data stored in the system, a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the system, based on a plurality of model-building parameters, a parameter estimation unit operable to estimate values for at least some of the model-building parameters, and an apply unit operable to apply the support vector machine model using the data stored in the system.
Type: Application
Filed: August 27, 2004
Publication date: March 3, 2005
Inventors: Boriana Milenova, Joseph Yarmus, Marcos Campos, Mark McCracken
-
Publication number: 20050050087
Abstract: An implementation of SVM functionality integrated into a relational database management system (RDBMS) improves efficiency, time consumption, and data security, reduces the parameter tuning challenges presented to the inexperienced user, and reduces the computational costs of building SVM models. A database management system comprises data stored in the database management system and a processing unit comprising a client application programming interface operable to provide an interface to client software, a build unit operable to build a support vector machine model on at least a portion of the data stored in the database management system, and an apply unit operable to apply the support vector machine model using the data stored in the database management system. The database management system may be a relational database management system.
Type: Application
Filed: August 27, 2004
Publication date: March 3, 2005
Inventors: Boriana Milenova, Joseph Yarmus, Marcos Campos, Mark McCracken