Patents by Inventor Charu Chandra Aggarwal
Charu Chandra Aggarwal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8321579
Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.
Type: Grant
Filed: July 26, 2007
Date of Patent: November 27, 2012
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Rajesh Bordawekar, Dina Thomas, Philip Shilung Yu
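The partition, local-count, and aggregate steps described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: it uses exact counts from `collections.Counter` in place of the FCM sketch structure, and a thread pool stands in for the processing cores. The function names `partition`, `local_count`, and `parallel_count` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(stream, n_parts):
    """Split the stream into n_parts round-robin portions (an assumed scheme)."""
    return [stream[i::n_parts] for i in range(n_parts)]


def local_count(portion):
    """Sequential kernel: exact item counts for one portion.

    Assumption: exact counting replaces the hash-based sketch of the abstract.
    """
    return Counter(portion)


def parallel_count(stream, n_workers=4):
    """Count items per portion in parallel, then aggregate the local counts."""
    portions = partition(stream, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        local_counts = list(pool.map(local_count, portions))
    total = Counter()
    for counts in local_counts:
        total.update(counts)  # Counter.update adds counts together
    return total
```

For example, `parallel_count(list("abracadabra"))` aggregates the per-portion counters into one counter for the whole stream. Because the local counts are simply added, the result does not depend on how the stream was partitioned.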
-
Patent number: 7937269
Abstract: Systems and methods are provided for real-time classification of streaming data. In particular, systems and methods for real-time classification of continuous data streams implement micro-clustering methods for offline and online processing of training data to build and dynamically update training models that are used for classification, as well as incrementally clustering the data over contiguous segments of a continuous data stream (in real-time) into a plurality of micro-clusters from which target profiles are constructed which define/model the behavior of the data in individual segments of the data stream.
Type: Grant
Filed: August 22, 2005
Date of Patent: May 3, 2011
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Philip Shilung Yu
-
Patent number: 7676458
Abstract: A method of querying a hierarchically organized sensor network, said network being sensor network with a global coordinator node at a top level which receives data from lower level intermediate nodes which are either leader nodes for lower level nodes or sensor nodes, wherein a sensor node i at a lowest level receives a signal Y(i,t) at time t, said method including constructing a sketch $S_{wkt} = (S_{wkt}^{1}, \ldots, S_{wkt}^{n})$ for an internal node k from $S_{wkt}^{j} = \sum_{i \in \mathrm{LeafDescendents}(k)} \sum_{q=1}^{t} b_{wiq} \cdot r_{iq}^{j}$, wherein component $S_{wkt}^{j}$ is a sketch of a descendent of node k, $r_{it}^{j}$ is a random variable associated with each sensor node i and time instant t wherein index j refers to independently drawn instantiations of the random variable, bit $b_{wit}$ represents a state of sensor node i for signal value w=Y(i,t) at time t, and LeafDescendents(k) are the lowest level sensor nodes under node k, wherein said sketch is adapted for responding to queries regarding a state of said network.
Type: Grant
Filed: August 28, 2007
Date of Patent: March 9, 2010
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Philip S. Yu
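The sketch in this abstract is a double sum: over the leaf sensors i under node k, and over time instants q, of a state bit times a random variable. The following Python sketch shows one way to compute it under stated assumptions: the inner sum is read as running over q = 1..t, the bit is taken to be 1 when Y(i,q) equals w and 0 otherwise, and the random variables are drawn from {-1, +1}. None of these choices is confirmed by the abstract, and the names `make_random_vars` and `node_sketch` are hypothetical.

```python
import random


def make_random_vars(sensor_ids, n_times, n_components, seed=0):
    """Draw r[(i, q)][j] for each sensor i, time q, and component j.

    Assumption: values are +1 or -1; the abstract only says 'random variable'.
    """
    rng = random.Random(seed)
    return {
        (i, q): [rng.choice((-1, 1)) for _ in range(n_components)]
        for i in sensor_ids
        for q in range(1, n_times + 1)
    }


def node_sketch(leaf_descendents, w, signal, r, t, n_components):
    """Sketch vector for an internal node: one entry per component j.

    signal[i][q - 1] holds Y(i, q). The bit for (w, i, q) is assumed to be
    1 when Y(i, q) == w, so only those terms contribute to the sum.
    """
    sketch = [0] * n_components
    for i in leaf_descendents:
        for q in range(1, t + 1):
            if signal[i][q - 1] == w:
                for j in range(n_components):
                    sketch[j] += r[(i, q)][j]
    return sketch
```

Because the sketch is a plain sum over leaf sensors, the sketch of a parent node equals the component-wise sum of its children's sketches, which is what lets intermediate nodes combine results on the way up the hierarchy.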
-
Publication number: 20090063432
Abstract: A method of querying a hierarchically organized sensor network, said network being sensor network with a global coordinator node at a top level which receives data from lower level intermediate nodes which are either leader nodes for lower level nodes or sensor nodes, wherein a sensor node i at a lowest level receives a signal Y(i,t) at time t, said method including constructing a sketch $S_{wkt} = (S_{wkt}^{1}, \ldots, S_{wkt}^{n})$ for an internal node k from $S_{wkt}^{j} = \sum_{i \in \mathrm{LeafDescendents}(k)} \sum_{q=1}^{t} b_{wiq} \cdot r_{iq}^{j}$, wherein component $S_{wkt}^{j}$ is a sketch of a descendent of node k, $r_{it}^{j}$ is a random variable associated with each sensor node i and time instant t wherein index j refers to independently drawn instantiations of the random variable, bit $b_{wit}$ represents a state of sensor node i for signal value w=Y(i,t) at time t, and LeafDescendents(k) are the lowest level sensor nodes under node k, wherein said sketch is adapted for responding to queries regarding a state of said network.
Type: Application
Filed: August 28, 2007
Publication date: March 5, 2009
Inventors: Charu Chandra Aggarwal, Philip S. Yu
-
Publication number: 20090031175
Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.
Type: Application
Filed: July 26, 2007
Publication date: January 29, 2009
Inventors: CHARU CHANDRA AGGARWAL, RAJESH BORDAWEKAR, DINA THOMAS, PHILIP SHILUNG YU
-
Patent number: 6922700
Abstract: A system and method for providing similarity indexing and searching in multi-dimensional databases. In one aspect, given a set of data points in a multidimensional space, the values of the data points on each dimension are partitioned into a plurality of grids, wherein each grid is assigned a grid value. Given a target data point, similarity candidates (i.e., data points that are similar to the target data point) are identified based on matching grid values. An inverted grid index comprising an index on the data points falling into each grid of each dimension is utilized to identify similarity candidates. A similarity selection process is employed to select the closest identified similarity candidates for output, which utilizes a similarity function to measure the closeness of each identified similarity candidate to the target data point. A preferred similarity function is one that considers a subset of the dimensions in which a point falls within a similar grid of the target point.
Type: Grant
Filed: May 16, 2000
Date of Patent: July 26, 2005
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Philip Shi-lung Yu
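A small Python sketch can make the inverted grid index concrete. It assumes equal-width grids per dimension and, as a simplified stand-in for the similarity function, ranks candidates by the number of dimensions in which they share the target's grid. Both choices, and the names `grid_value`, `build_index`, and `most_similar`, are illustrative assumptions rather than details taken from the patent.

```python
from collections import defaultdict


def grid_value(x, lo, hi, n_grids):
    """Map a coordinate to a grid number 0..n_grids-1 (equal-width cells, assumed)."""
    if hi == lo:
        return 0
    g = int((x - lo) / (hi - lo) * n_grids)
    return min(max(g, 0), n_grids - 1)


def build_index(points, n_grids):
    """Build the inverted index: (dimension, grid value) -> set of point ids."""
    dims = len(points[0])
    bounds = [
        (min(p[d] for p in points), max(p[d] for p in points))
        for d in range(dims)
    ]
    index = defaultdict(set)
    for pid, p in enumerate(points):
        for d in range(dims):
            index[(d, grid_value(p[d], bounds[d][0], bounds[d][1], n_grids))].add(pid)
    return index, bounds


def most_similar(target, index, bounds, n_grids, k=1):
    """Return ids of the k candidates matching the target's grid in most dimensions."""
    matches = defaultdict(int)
    for d, x in enumerate(target):
        key = (d, grid_value(x, bounds[d][0], bounds[d][1], n_grids))
        for pid in index.get(key, ()):
            matches[pid] += 1
    # More matching dimensions first; ties broken by point id.
    ranked = sorted(matches, key=lambda pid: (-matches[pid], pid))
    return ranked[:k]
```

Only points that share at least one grid with the target are ever scored, so the index avoids a scan over the full data set when the grids are reasonably selective.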
-
Patent number: 6714975
Abstract: A method for dynamically placing objects in slots on a web page in response to a current client request for the web page comprises the steps of classifying users into user groups based on one or more user characteristics, accumulating self-learning data based on user click behavior for each user group, matching the current client request with a corresponding user group and scheduling real-time selection of the slots for the objects on the web page based on the self-learning data of the corresponding user group.
Type: Grant
Filed: March 31, 1997
Date of Patent: March 30, 2004
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Joel Leonard Wolf, Philip Shi-lung Yu
-
Patent number: 6487541
Abstract: A rating of a plurality of ratings is predicted. The rating is associated with a user of a plurality of users and the rating corresponds to an item of a plurality of items. One of the plurality of ratings, corresponding to at least one of the plurality of items, is provided for each of the plurality of users. A predictability relation between ones of the plurality of users and other ones of the plurality of users is calculated based on ratings provided by users. One of a plurality of nodes is assigned to each of the plurality of users. Ones of the plurality of nodes are connected with other ones of the plurality of nodes by a plurality of edges based on the predictability relation. A graph which includes the plurality of nodes and the plurality of edges is searched for a path from a node assigned to the user of the plurality of users to another node assigned to another user of the plurality of users.
Type: Grant
Filed: January 22, 1999
Date of Patent: November 26, 2002
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Joel Leonard Wolf, Philip Shi-Lung Yu
-
Patent number: 6360227
Abstract: A graph taxonomy of information which is represented by a plurality of vectors is generated. The graph taxonomy includes a plurality of nodes and a plurality of edges. The plurality of nodes is generated, and each node of the plurality of nodes is associated with ones of the plurality of vectors. A tree hierarchy is established based on the plurality of nodes. A plurality of distances between ones of the plurality of nodes is calculated. Ones of the plurality of nodes are connected with other ones of the plurality of nodes by ones of the plurality of edges based on the plurality of distances. The information represented by the plurality of vectors may be, for example, a plurality of documents such as Web Pages.
Type: Grant
Filed: January 29, 1999
Date of Patent: March 19, 2002
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Stephen C. Gates, Philip Shi-Lung Yu
-
Patent number: 6349309
Abstract: A method of analyzing information in the form of a plurality of data values. The plurality of data values represent a plurality of objects. The plurality of data values are distributed in a data space. A set of features which characterize each of the plurality of objects is identified. The plurality of data values are stored in a database. Each of the plurality of data values corresponds to at least one of the plurality of objects based on the set of features. Ones of the plurality of data values stored in the database are partitioned into a plurality of clusters. A respective orientation associated with a position in data space of data values which are contained in each respective cluster of the plurality of clusters is calculated based on the set of features. If desired, information may be analyzed for finding peer groups in e-commerce applications.
Type: Grant
Filed: May 24, 1999
Date of Patent: February 19, 2002
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yu
-
Patent number: 6307965
Abstract: A system and method are provided to analyze information stored in a computer data base by detecting clusters of related or correlated data values. Data values stored in the data base represent a set of objects. A data value is stored in the data base as an instance of a set of features that characterize the objects. The features are the dimensions of the feature space of the data base. Each cluster includes not only a subset of related data values stored in the data base but also a subset of features. The data values in a cluster are data values that are a short distance apart, in the sense of a metric, when projected onto a subspace that corresponds to the subset of features of the cluster. A set of k clusters may be detected such that the average number of features of the subsets of features of the clusters is l.
Type: Grant
Filed: April 30, 1998
Date of Patent: October 23, 2001
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Joel Leonard Wolf, Philip Shi-Lung Yu
-
Patent number: 6289354
Abstract: Information is analyzed in the form of a plurality of data values that represent a plurality of objects. A set of features that characterize each object of the plurality of objects is identified. The plurality of data values are stored in a database. Each data value corresponds to at least one of the plurality of objects based on the set of features. Ones of the plurality of data values stored in the database are partitioned into a plurality of clusters. Each cluster of the plurality of clusters is assigned to one respective node of a plurality of nodes arranged in a tree hierarchy. Ones of the plurality of nodes of the tree hierarchy are traversed. If desired, information may be analyzed for finding peer groups in e-commerce applications.
Type: Grant
Filed: October 7, 1998
Date of Patent: September 11, 2001
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Brent Tzion Hailpern, Joel Leonard Wolf, Philip Shi-Lung Yu
-
Patent number: 6263327
Abstract: A computerized method of online mining of inference rules in a large database. The method is comprised of two stages, a preprocessing stage followed by an online rule generation stage. The preprocessing stage is further defined to be a two step process that involves the generation of large itemsets. The present method defines large itemsets by how the items in the itemsets relate to each other rather than their level of presence. The measure by which itemsets are said to relate to each other is defined by a computed figure of merit, K1. The first substep of the preprocessing stage involves finding those itemsets that possess a minimum computed collective strength of K1. From those found itemsets, a second user-supplied input, K2, is used to prune those itemsets with inference strength below K2.
Type: Grant
Filed: March 10, 2000
Date of Patent: July 17, 2001
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yu
-
Patent number: 6236985
Abstract: A method of analyzing information in the form of a plurality of data records. Each data record includes one or more data values. The data values are partitioned into a plurality of data signatures. Data values of data signatures are compared to data values of data records. Based on the result of the comparison an index is associated with each data record. A bound corresponding to the index is calculated based on a user defined target value and an objective function. If desired, information may be analyzed for finding peer groups in e-commerce applications.
Type: Grant
Filed: October 7, 1998
Date of Patent: May 22, 2001
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Joel Leonard Wolf, Philip Shi-Lung Yu
-
Patent number: 6151589
Abstract: A method for performing continuous auctions over a computer network system consisting of a server/seller and multiple clients/buyers. The seller makes information about the type of sale items, the number of sale items, minimum bid price, time limits for bids to be submitted, and estimated time interval to the next auction decision available to the buyer by displaying it on buyers' computer terminals. Each buyer responds by entering a bid and such bid's duration, within the time limits set by the seller, into the auction system through buyers' computer terminals. Additionally, a buyer's bid entry time is saved by the system. The response time for present buyers is determined to schedule the next auction. At least one auction winner, whose bid is within bid duration, is selected through a dynamically adjusted customer selection method.
Type: Grant
Filed: September 10, 1998
Date of Patent: November 21, 2000
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yoo
-
Patent number: 6094645
Abstract: A computer method of online mining of inference rules in a large database comprising a preprocessing stage and an online rule generation stage. The preprocessing stage includes first finding itemsets that possess a minimum computed collective strength K1, and second, pruning the itemsets with inference strength below a predetermined inference strength, K2. The online rule generation stage utilizes the itemsets organized into an adjacency lattice to generate inference rules with inference strength K2.
Type: Grant
Filed: November 21, 1997
Date of Patent: July 25, 2000
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yu
-
Patent number: 6092064
Abstract: A computer method of online mining of quantitative association rules consisting of two stages, a preprocessing stage followed by an online rule generation stage. The required computational effort is reduced by the preprocessing stage, defined by preprocessing data to organize the relationship between antecedent attributes to create a hierarchically arranged multidimensional indexing structure. The resulting structure facilitates the performance of the second stage, online processing, which involves the generation of quantitative association rules. The second stage, online rule generation, utilizes the multidimensional index structure created by the preprocessing stage by first finding the areas in the data which correspond to the rules and then using a merging step to create a merged tree that carefully combines interesting regions to give a hierarchical representation of the rule set. The merged tree is then used to generate the rules.
Type: Grant
Filed: November 4, 1997
Date of Patent: July 18, 2000
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yu
-
Patent number: 6012126
Abstract: A system and method for caching objects of non-uniform size. A caching logic includes a selection logic and an admission control logic. The admission control logic determines whether an object that is accessed but not currently in the cache may be cached at all. The admission control logic uses an auxiliary LRU stack which contains the identities and time stamps of the objects which have been recently accessed. Thus, the memory required is relatively small. The auxiliary cache serves as a dynamic popularity list and an object may be admitted to the cache if and only if it appears on the popularity list. The selection logic selects one or more of the objects in the cache which have to be purged when a new object enters the cache. The order of removal of the objects is prioritized based both on the size as well as the frequency of access of the object and may be adjusted by a time to obsolescence factor (TTO).
Type: Grant
Filed: October 29, 1996
Date of Patent: January 4, 2000
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Marina Aleksandrovna Epelman, Joel Leonard Wolf, Philip Shi-lung Yu
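The interplay between admission control and selection in this abstract can be sketched in a few lines of Python. This is an illustration under assumptions, not the patented logic: the auxiliary LRU stack is modeled with an `OrderedDict` of identities and time stamps, an object is admitted only if it was already on that list when accessed, and the eviction priority is assumed to be access count divided by size (lowest evicted first). The time to obsolescence factor is omitted, and the class name `AdmissionCache` is hypothetical.

```python
from collections import OrderedDict


class AdmissionCache:
    def __init__(self, capacity, aux_size):
        self.capacity = capacity      # total size units available
        self.used = 0
        self.cache = {}               # key -> (size, access_count)
        self.aux = OrderedDict()      # key -> time stamp, least recent first
        self.aux_size = aux_size
        self.clock = 0

    def _touch_aux(self, key):
        """Record an access on the popularity list; report if key was already there."""
        seen = key in self.aux
        self.aux[key] = self.clock
        self.aux.move_to_end(key)
        while len(self.aux) > self.aux_size:
            self.aux.popitem(last=False)  # drop the least recently accessed identity
        return seen

    def _make_room(self, size):
        """Selection logic (assumed rule): evict lowest access count per size unit."""
        while self.used + size > self.capacity:
            victim = min(self.cache, key=lambda k: self.cache[k][1] / self.cache[k][0])
            self.used -= self.cache.pop(victim)[0]

    def access(self, key, size):
        """Process one request for an object of positive size; return True on a hit."""
        self.clock += 1
        if key in self.cache:
            stored_size, count = self.cache[key]
            self.cache[key] = (stored_size, count + 1)
            self._touch_aux(key)
            return True
        recently_seen = self._touch_aux(key)
        if recently_seen and size <= self.capacity:
            self._make_room(size)
            self.cache[key] = (size, 1)
            self.used += size
        return False
```

With this rule an object requested only once never displaces cached content; it must be requested again while its identity is still on the auxiliary list. Since the list stores identities and time stamps rather than the objects themselves, it stays small relative to the cache.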
-
Patent number: 5943667
Abstract: A computer method of removing simple and strict redundant association rules generated from large collections of data. A compact set of rules is presented to an end user which is devoid of many redundancies in the discovery of data patterns. The method is directed primarily to on-line applications such as the Internet and Intranet. Given a number of large itemsets as input, simple redundancies are removed by generating all maximal ancestors, the frontier set, for each large itemset. The set of maximal ancestors share a hierarchical relationship with the large itemset from which they were derived and further satisfy an inequality whereby the ratio of respective support values is less than the reciprocal of some user defined confidence value. The resulting compact rule set is displayed to an end user at some specified level of support and confidence. The method is also able to generate the full set of rules from the compact set.
Type: Grant
Filed: June 3, 1997
Date of Patent: August 24, 1999
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Philip Shi-lung Yu
-
Patent number: 5924116
Abstract: A method and system of collaboratively caching information to allow improved caching decisions by a lower level or sibling node. In a caching hierarchy, the client and/or servers may factor in the caching status at the higher level in deciding whether to cache an object and which objects are to be replaced. The PICS protocol may be used to pass the caching information of some or all the upper hierarchy down the hierarchy. Furthermore, the caching status information can also be used to direct the object request to the closest higher level proxy which has potentially cached the object, instead of blindly requesting it from the next immediate higher level proxy. A selection policy used to select objects for replacement in the cache may be prioritized not only on the size and the frequency of access of the object, but also on the access time required to get the object if it is not cached.
Type: Grant
Filed: April 2, 1997
Date of Patent: July 13, 1999
Assignee: International Business Machines Corporation
Inventors: Charu Chandra Aggarwal, Peter Kenneth Malkin, Robert Jeffrey Schloss, Philip Shi-lung Yu