Patents by Inventor Ari W. Mozes
Ari W. Mozes has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11176480
Abstract: Systems, methods, and other embodiments are disclosed for partitioning models in a database. In one embodiment, a set of training data is parsed into multiple data partitions based on partition keys, where the data partitions are identified by the partition keys and are used for training data mining models. The multiple data partitions are analyzed to generate partition metrics data. Algorithm data, identifying at least one algorithm for processing the multiple data partitions, and resources data, identifying available modeling resources for processing the multiple data partitions, are read. The partition metrics data, the algorithm data, and the resources data are processed to generate an organization data structure. The organization data structure is configured to control distribution and processing of the multiple data partitions across the available modeling resources to generate a composite model object that includes a separately trained data mining model for each partition of the multiple partitions.
Type: Grant
Filed: August 2, 2016
Date of Patent: November 16, 2021
Assignee: Oracle International Corporation
Inventors: Ari W. Mozes, Boriana L. Milenova, Marcos M. Campos, Mark A. McCracken, Gayathri P. Ayyappan
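The partition-then-train idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `train_partitioned_model`, the row-count metric, the largest-first ordering, and the per-partition mean used as a stand-in "model" are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def train_partitioned_model(rows, partition_key, target):
    """Split rows by partition key and fit one stand-in model per partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)

    # Hypothetical partition metric: the row count of each partition.
    metrics = {key: len(part) for key, part in partitions.items()}

    # Hypothetical organization step: process the largest partitions first.
    order = sorted(metrics, key=metrics.get, reverse=True)

    # Stand-in "model": the mean of the target column within the partition.
    return {key: mean(r[target] for r in partitions[key]) for key in order}


rows = [
    {"region": "east", "sales": 10.0},
    {"region": "east", "sales": 20.0},
    {"region": "west", "sales": 5.0},
]
composite = train_partitioned_model(rows, "region", "sales")
```

Here `composite` plays the role of the composite model object: one separately fitted entry per partition key.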
-
Patent number: 10885047
Abstract: Systems, methods, and other embodiments are disclosed for performing data mining. In one embodiment, transaction records are read one at a time. Each transaction record represents a transaction for at least one item and includes an item identifier and a metric value for the item. The number-of-occurrences of at least one candidate item set in the transaction records are counted to generate a total count for the candidate item set. The candidate item set includes one or more items. As the counting proceeds, at least one aggregate metric value associated with the candidate item set is accumulated by summing metric values across the number-of-occurrences for each item represented in the candidate item set. A determination is made as to whether the candidate item set is a frequent item set in the transaction records by comparing the total count to a threshold value.
Type: Grant
Filed: July 1, 2016
Date of Patent: January 5, 2021
Assignee: Oracle International Corporation
Inventors: Ari W. Mozes, Dongfang Bai
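A minimal Python sketch of counting candidate item sets while accumulating a metric in the same pass is given here. It rests on assumptions not stated in the abstract: each record is a dict mapping an item identifier to its metric value, candidates are all item combinations of a fixed size, and "frequent" means the count is at least the threshold. The name `frequent_item_sets` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def frequent_item_sets(transactions, set_size, threshold):
    """Count candidate item sets and sum their metric values in one pass."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for record in transactions:  # records are read one at a time
        for candidate in combinations(sorted(record), set_size):
            counts[candidate] += 1
            # Accumulate the metric of every item in the candidate set.
            totals[candidate] += sum(record[item] for item in candidate)
    # Keep only candidates whose total count reaches the threshold.
    return {c: (counts[c], totals[c]) for c in counts if counts[c] >= threshold}


transactions = [
    {"bread": 2.0, "milk": 1.5},
    {"bread": 2.0, "milk": 1.5, "eggs": 3.0},
    {"eggs": 3.0, "milk": 1.5},
]
result = frequent_item_sets(transactions, set_size=2, threshold=2)
```

In this toy data, the pairs (bread, milk) and (eggs, milk) each occur twice, so both are returned together with their summed metric values.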
-
Publication number: 20180004816
Abstract: Systems, methods, and other embodiments are disclosed for performing data mining. In one embodiment, transaction records are read one at a time. Each transaction record represents a transaction for at least one item and includes an item identifier and a metric value for the item. The number-of-occurrences of at least one candidate item set in the transaction records are counted to generate a total count for the candidate item set. The candidate item set includes one or more items. As the counting proceeds, at least one aggregate metric value associated with the candidate item set is accumulated by summing metric values across the number-of-occurrences for each item represented in the candidate item set. A determination is made as to whether the candidate item set is a frequent item set in the transaction records by comparing the total count to a threshold value.
Type: Application
Filed: July 1, 2016
Publication date: January 4, 2018
Inventors: Ari W. MOZES, Dongfang BAI
-
Publication number: 20170308809
Abstract: Systems, methods, and other embodiments are disclosed for partitioning models in a database. In one embodiment, a set of training data is parsed into multiple data partitions based on partition keys, where the data partitions are identified by the partition keys and are used for training data mining models. The multiple data partitions are analyzed to generate partition metrics data. Algorithm data, identifying at least one algorithm for processing the multiple data partitions, and resources data, identifying available modeling resources for processing the multiple data partitions, are read. The partition metrics data, the algorithm data, and the resources data are processed to generate an organization data structure. The organization data structure is configured to control distribution and processing of the multiple data partitions across the available modeling resources to generate a composite model object that includes a separately trained data mining model for each partition of the multiple partitions.
Type: Application
Filed: August 2, 2016
Publication date: October 26, 2017
Inventors: Ari W. MOZES, Boriana L. MILENOVA, Marcos M. CAMPOS, Mark A. MCCRACKEN, Gayathri P. AYYAPPAN
-
Patent number: 9135309
Abstract: A computer-implemented method of creating a data mining model in a database management system comprises accepting a database language statement at the database management system, the database language statement indicating a dataset and a data mining model to be created from the dataset, and creating, in the database management system, the indicated data mining model using the indicated dataset, wherein creation and application of the data mining model does not require moving data to a separate data mining engine.
Type: Grant
Filed: November 18, 2011
Date of Patent: September 15, 2015
Assignee: Oracle International Corporation
Inventors: Wei Li, Shiby Thomas, Joseph Yarmus, Ari W. Mozes, Mahesh Jagannath
-
Patent number: 8280915
Abstract: Binning of predictor values used for generating a data mining model provides useful reduction in memory footprint and computation during the computationally dominant decision tree build phase, but reduces the information loss of the model and reduces the introduction of false information artifacts. A method of binning data in a database for data mining modeling in a database system, the data stored in a database table in the database system, the data mining modeling having selected at least one predictor and one target for the data, the data including a plurality of values of the predictor and a plurality of values of the target, the method comprises constructing a binary tree for the predictor that splits the values of the predictor into a plurality of portions, pruning the binary tree, and defining as bins of the predictor leaves of the tree that remain after pruning, each leaf of the tree representing a portion of the values of the predictor.
Type: Grant
Filed: February 1, 2006
Date of Patent: October 2, 2012
Assignee: Oracle International Corporation
Inventors: Mahesh Jagannath, Chitra Bhagwat, Joseph Yarmus, Ari W. Mozes
-
Publication number: 20120066260
Abstract: A computer-implemented method of creating a data mining model in a database management system comprises accepting a database language statement at the database management system, the database language statement indicating a dataset and a data mining model to be created from the dataset, and creating, in the database management system, the indicated data mining model using the indicated dataset, wherein creation and application of the data mining model does not require moving data to a separate data mining engine.
Type: Application
Filed: November 18, 2011
Publication date: March 15, 2012
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventors: Wei LI, Shiby THOMAS, Joseph YARMUS, Ari W. MOZES, Mahesh JAGANNATH
-
Patent number: 8065326
Abstract: Decision trees are efficiently represented in a relational database. A computer-implemented method of representing a decision tree model in relational form comprises providing a directed acyclic graph comprising a plurality of nodes and a plurality of links, each link connecting a plurality of nodes, encoding a tree structure by including in each node a parent-child relationship of the node with other nodes, encoding in each node information relating to a split represented by the node, the split information including a splitting predictor and a split value, and encoding in each node a target histogram.
Type: Grant
Filed: February 1, 2006
Date of Patent: November 22, 2011
Assignee: Oracle International Corporation
Inventors: Wei Li, Shiby Thomas, Joseph Yarmus, Ari W. Mozes, Mahesh Jagannath
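One way to picture a decision tree stored in relational form is the following Python sketch using an in-memory SQLite table. The table name `tree_nodes`, its columns, the extra `branch` column, and the JSON encoding of the target histogram are all assumptions for illustration; the abstract only says that each node carries its parent-child relationship, a splitting predictor, a split value, and a target histogram.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE tree_nodes (
           node_id INTEGER PRIMARY KEY,
           parent_id INTEGER,       -- NULL for the root node
           branch TEXT,             -- 'le' or 'gt' side of the parent's split
           split_predictor TEXT,    -- NULL for leaf nodes
           split_value REAL,
           target_histogram TEXT    -- JSON map of target value to count
       )"""
)
conn.executemany(
    "INSERT INTO tree_nodes VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, None, None, "age", 30.0, json.dumps({"yes": 5, "no": 5})),
        (2, 1, "le", None, None, json.dumps({"yes": 1, "no": 4})),
        (3, 1, "gt", None, None, json.dumps({"yes": 4, "no": 1})),
    ],
)

COLUMNS = "node_id, split_predictor, split_value, target_histogram"


def predict(conn, record):
    """Walk from the root to a leaf and return the most common target value."""
    node = conn.execute(
        f"SELECT {COLUMNS} FROM tree_nodes WHERE parent_id IS NULL"
    ).fetchone()
    while node[1] is not None:  # a non-NULL split predictor means an inner node
        branch = "le" if record[node[1]] <= node[2] else "gt"
        node = conn.execute(
            f"SELECT {COLUMNS} FROM tree_nodes WHERE parent_id = ? AND branch = ?",
            (node[0], branch),
        ).fetchone()
    histogram = json.loads(node[3])
    return max(histogram, key=histogram.get)
```

Calling `predict(conn, {"age": 25})` follows the 'le' branch to node 2, whose histogram favors "no".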
-
Patent number: 7756853
Abstract: A method and mechanism for performing improved frequent itemset operations is provided. A set of item groups are divided into a plurality of subsets. Each item group is composed of a set of data items. Possible combinations of data items that may frequently appear together in the same item group are referred to as candidate combinations. Candidate combinations comprising a first set of data items are identified, and thereafter the occurrence of each candidate combination in any item group in each subset is counted by comparing item bitmaps, associated with items in the candidate combination, in each subset in turn. The comparison of item bitmaps is performed in volatile memory. A total frequent itemset count that describes the frequency of candidate combinations in items groups across all subsets is obtained. Thereafter, the total frequent itemset count for candidate combinations having a larger number of data items may be determined.
Type: Grant
Filed: August 27, 2004
Date of Patent: July 13, 2010
Assignee: Oracle International Corporation
Inventors: Wei Li, Ari W. Mozes, Hakan Jakobsson
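The bitmap comparison described in this abstract can be sketched in Python with integers used as bitmaps. This is an illustrative sketch under assumptions: bit i of an item's bitmap is set when the item appears in the i-th item group of a subset, and the function names `build_item_bitmaps` and `count_candidate` are hypothetical.

```python
def build_item_bitmaps(item_groups):
    """Map each item to an integer bitmap; bit i is set if group i has the item."""
    bitmaps = {}
    for position, group in enumerate(item_groups):
        for item in group:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << position)
    return bitmaps


def count_candidate(candidate, subsets):
    """Count item groups containing every item of the candidate, subset by subset."""
    total = 0
    for subset in subsets:  # each subset is processed in turn
        bitmaps = build_item_bitmaps(subset)
        combined = (1 << len(subset)) - 1  # start with all groups selected
        for item in candidate:
            combined &= bitmaps.get(item, 0)
        total += bin(combined).count("1")  # number of matching groups
    return total


item_groups = [{"a", "b"}, {"a", "b", "c"}, {"b", "c"}, {"a", "b"}]
subsets = [item_groups[:2], item_groups[2:]]
pair_count = count_candidate(("a", "b"), subsets)
```

Splitting the groups into subsets keeps each set of bitmaps small enough to compare in memory, and the per-subset counts simply add up to the total.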
-
Patent number: 7571159
Abstract: A method, system, and computer program product for counting predictor-target pairs for a decision tree model provides the capability to generate count tables that is quicker and more efficient than previous techniques. A method of counting predictor-target pairs for a decision tree model, the decision tree model based on data stored in a database, the data comprising a plurality of rows of data, at least one predictor and at least one target, comprises generating a bitmap for each split node of data stored in a database system by intersecting a parent node bitmap and a bitmap of a predictor that satisfies a condition of the node, intersecting each split node bitmap with each predictor bitmap and with each target bitmap to form intersected bitmaps, and counting bits of each intersected bitmap to generate a count of predictor-target pairs.
Type: Grant
Filed: February 1, 2006
Date of Patent: August 4, 2009
Assignee: Oracle International Corporation
Inventors: Shiby Thomas, Wei Li, Joseph Yarmus, Mahesh Jagannath, Ari W. Mozes
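The intersect-and-count steps can be sketched in Python, again using integers as bitmaps with one bit per row. The toy columns (`outlook`, `windy`, `play`) and the helper names are assumptions for illustration only.

```python
def value_bitmaps(column):
    """Build one bitmap per distinct value; bit i is set if row i has the value."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def popcount(bits):
    """Count the set bits of an integer bitmap."""
    return bin(bits).count("1")


# Hypothetical data: two predictors and one target over five rows.
outlook = ["sunny", "sunny", "rain", "rain", "sunny"]
windy = ["yes", "no", "yes", "no", "no"]
play = ["no", "yes", "no", "yes", "yes"]

outlook_bm = value_bitmaps(outlook)
windy_bm = value_bitmaps(windy)
play_bm = value_bitmaps(play)

# Split node bitmap: parent (root) bitmap intersected with the node condition.
root = (1 << len(play)) - 1
sunny_node = root & outlook_bm["sunny"]

# Intersect the node bitmap with each predictor and target bitmap, then count.
counts = {
    (w, p): popcount(sunny_node & w_bits & p_bits)
    for w, w_bits in windy_bm.items()
    for p, p_bits in play_bm.items()
}
```

Each entry of `counts` is the number of rows in the "sunny" node with a given (windy, play) pair, obtained without rescanning the rows.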
-
Patent number: 7359890
Abstract: A number, of the blocks of data to be prefetched into a buffer cache, is determined dynamically at run time (e.g. during execution of a query), based at least in part on the load placed on the buffer cache. An application program (such as a database) is responsive to the number (also called “prefetch size”), to determine the amount of prefetching. A sequence of instructions (also called “prefetch size daemon”) computes the prefetch size based on, for example, the number of prefetched blocks aged out before use. The prefetch size daemon dynamically revises the prefetch size based on usage of the buffer cache, thereby to form a feedback loop. Depending on the embodiment, at times of excessive use of the buffer cache, prefetching may even be turned off.
Type: Grant
Filed: May 8, 2002
Date of Patent: April 15, 2008
Assignee: Oracle International Corporation
Inventors: Chi Ku, Arvind Nithrakashyap, Ari W. Mozes
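The feedback loop of the prefetch size daemon can be sketched as a single revision step in Python. The halving and doubling policy, the `waste_limit` and `max_size` parameters, and the rule for probing again after prefetching was turned off are all hypothetical; the abstract only states that the size is revised based on, for example, the number of prefetched blocks aged out before use.

```python
def revise_prefetch_size(current_size, prefetched, aged_out_unused,
                         max_size=64, waste_limit=0.25):
    """Return a revised prefetch size from one interval of cache statistics."""
    if prefetched == 0:
        # Nothing was prefetched (e.g. prefetching was off): probe with size 1.
        return max(1, current_size)
    waste = aged_out_unused / prefetched
    if waste > waste_limit:
        # Too many prefetched blocks aged out unused: back off.
        # Halving a size of 1 yields 0, which turns prefetching off.
        return current_size // 2
    # Prefetched blocks are being used: grow, up to the cap.
    return min(max_size, max(1, current_size * 2))
```

Calling this function once per monitoring interval with fresh counters forms the loop: heavy cache pressure shrinks the prefetch size, and light pressure lets it grow back.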
-
Patent number: 7272589
Abstract: A method evaluates a plurality of candidate index sets for a workload of database statements in a database system by first generating baseline statistics for each statement in the workload. An index superset is formed by combining an existing or current index set and a proposed index set. A candidate index set is derived from the index superset, the candidate index being one of the plurality of candidate index sets. Statistics for a statement are generated by first creating an execution plan which represents an efficient series of steps for executing the statement given the candidate index set. The execution plan is evaluated, and statistics based on the evaluation of the execution plan are generated and recorded. The cost of the execution plan is then determined and statistics are generated. Statistics for each candidate index set are rolled up and presented to a user or an index tuning mechanism.
Type: Grant
Filed: November 1, 2000
Date of Patent: September 18, 2007
Assignee: Oracle International Corporation
Inventors: Todd P. Guay, Gregory S. Smith, Ari W. Mozes, Gaylen D. Royal
-
Publication number: 20040193629
Abstract: A system and method for determining an adequate sample size for statistics collection is disclosed. A mechanism for automatically determining an adequate sample size for both statistics and histograms is provided. The sample size determination is accomplished via an iterative approach where the process starts with a small sample, and for each attribute which may need more data, the sample size is increased while restricting the information collected to only those attributes that require the larger sample.
Type: Application
Filed: April 6, 2004
Publication date: September 30, 2004
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Ari W. Mozes
-
Patent number: 6732085
Abstract: A system and method for determining an adequate sample size for statistics collection is disclosed. A mechanism for automatically determining an adequate sample size for both statistics and histograms is provided. The sample size determination is accomplished via an iterative approach where the process starts with a small sample, and for each attribute which may need more data, the sample size is increased while restricting the information collected to only those attributes that require the larger sample.
Type: Grant
Filed: May 31, 2001
Date of Patent: May 4, 2004
Assignee: Oracle International Corporation
Inventor: Ari W. Mozes
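The iterative approach can be sketched in Python as follows. The adequacy test (`is_adequate`, which accepts a sample when at most half of its values are distinct), the growth factor, and the function name `choose_sample_sizes` are assumptions for illustration; the abstract does not specify how adequacy is judged.

```python
import random


def is_adequate(values):
    """Hypothetical adequacy test: values repeat often enough in the sample."""
    return len(set(values)) <= 0.5 * len(values)


def choose_sample_sizes(table, initial_size=10, growth=4, seed=0):
    """Grow the sample only for the columns that still need more data."""
    rng = random.Random(seed)
    n_rows = len(next(iter(table.values())))
    pending = set(table)  # columns whose sample size is not yet settled
    chosen = {}
    size = min(initial_size, n_rows)
    while pending:
        row_ids = rng.sample(range(n_rows), size)
        for column in sorted(pending):  # only pending columns are re-examined
            sample = [table[column][i] for i in row_ids]
            if is_adequate(sample) or size == n_rows:
                chosen[column] = size
                pending.discard(column)
        size = min(n_rows, size * growth)
    return chosen


table = {"status": ["open"] * 1000, "id": list(range(1000))}
sizes = choose_sample_sizes(table)
```

In this toy table, the constant `status` column is settled by the first small sample, while the all-distinct `id` column keeps growing until the whole table is read.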
-
Publication number: 20040064430
Abstract: A container object data structure for storing metadata associated with multiple queues is provided for processing data elements in first-in, first-out fashion. In one embodiment, the container object is implemented in a database environment providing statement syntax for creating data objects, such as tables and views, to implement user schema. Queue metadata can comprise one or more pointers for data element access and control during one or more queue operations, such as an enqueue, dequeue, or update operation.
Type: Application
Filed: September 27, 2002
Publication date: April 1, 2004
Inventors: Jonathan D. Klein, Amit Ganesh, Chi Young Ku, Ari W. Mozes
-
Patent number: 6691099
Abstract: A method and system for determining when to collect, save, and/or utilize histograms is disclosed. A mechanism for automatically deciding when to collect histograms upon request from the user is provided. The histogram collection decision is based on the columns the user is interested in, the role these columns play in the queries as submitted to the system, and the underlying distribution for these columns, e.g., as seen in a random sample. The user specifies which columns are of interest, and the database is configured to collect column usage information that describes how each column is being used in the workload. This column usage information could be stored in memory and periodically flushed to disk. Given a set of potential columns, the distribution of those columns is viewed in combination with the usage information to determine which columns should have histograms.
Type: Grant
Filed: May 31, 2001
Date of Patent: February 10, 2004
Assignee: Oracle International Corporation
Inventor: Ari W. Mozes
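Combining column usage with the sampled distribution might look like the following Python sketch. The usage labels ("equality", "range", "join"), the `skew_ratio` threshold, and the function name `needs_histogram` are hypothetical; the abstract states only that usage information and the distribution seen in a sample are considered together.

```python
from collections import Counter


def needs_histogram(sample, usage, skew_ratio=2.0):
    """Decide whether a column should get a histogram.

    sample: values of the column from a random sample.
    usage: set of roles the column plays in the workload (hypothetical labels).
    """
    # Assumption: only columns used in filter predicates benefit.
    if not sample or not usage & {"equality", "range"}:
        return False
    counts = Counter(sample)
    expected = len(sample) / len(counts)  # per-value count if uniform
    # Assumption: a value far more frequent than uniform signals skew.
    return max(counts.values()) > skew_ratio * expected


skewed = ["a"] * 8 + ["b", "c"]
uniform = ["a", "b", "c", "d"]
```

With these inputs, `needs_histogram(skewed, {"equality"})` is true, while a uniform column or a column never used in a filter is skipped.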