Patents by Inventor Charu Chandra Aggarwal

Charu Chandra Aggarwal has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8321579
    Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.
    Type: Grant
    Filed: July 26, 2007
    Date of Patent: November 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Rajesh Bordawekar, Dina Thomas, Philip Shilung Yu
  • Patent number: 7937269
    Abstract: Systems and methods are provided for real-time classification of streaming data. In particular, systems and methods for real-time classification of continuous data streams implement micro-clustering methods for offline and online processing of training data to build and dynamically update training models that are used for classification, as well as incrementally clustering the data over contiguous segments of a continuous data stream (in real-time) into a plurality of micro-clusters from which target profiles are constructed which define/model the behavior of the data in individual segments of the data stream.
    Type: Grant
    Filed: August 22, 2005
    Date of Patent: May 3, 2011
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Philip Shilung Yu
  • Patent number: 7676458
    Abstract: A method of querying a hierarchically organized sensor network, said network being a sensor network with a global coordinator node at a top level which receives data from lower level intermediate nodes which are either leader nodes for lower level nodes or sensor nodes, wherein a sensor node i at a lowest level receives a signal Y(i,t) at time t, said method including constructing a sketch $S_{wkt} = (S_{wkt}^{1}, \ldots, S_{wkt}^{n})$ for an internal node k from $S_{wkt}^{j} = \sum_{i \in \mathrm{LeafDescendents}(k)} \sum_{q=1}^{t} b_{wiq} \cdot r_{iq}^{j}$, wherein component $S_{wkt}^{j}$ is a sketch of a descendent of node k, $r_{it}^{j}$ is a random variable associated with each sensor node i and time instant t wherein index j refers to independently drawn instantiations of the random variable, bit $b_{wit}$ represents a state of sensor node i for signal value w=Y(i,t) at time t, and LeafDescendents(k) are the lowest level sensor nodes under node k, wherein said sketch is adapted for responding to queries regarding a state of said network.
    Type: Grant
    Filed: August 28, 2007
    Date of Patent: March 9, 2010
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Philip S. Yu
  • Publication number: 20090063432
    Abstract: A method of querying a hierarchically organized sensor network, said network being a sensor network with a global coordinator node at a top level which receives data from lower level intermediate nodes which are either leader nodes for lower level nodes or sensor nodes, wherein a sensor node i at a lowest level receives a signal Y(i,t) at time t, said method including constructing a sketch $S_{wkt} = (S_{wkt}^{1}, \ldots, S_{wkt}^{n})$ for an internal node k from $S_{wkt}^{j} = \sum_{i \in \mathrm{LeafDescendents}(k)} \sum_{q=1}^{t} b_{wiq} \cdot r_{iq}^{j}$, wherein component $S_{wkt}^{j}$ is a sketch of a descendent of node k, $r_{it}^{j}$ is a random variable associated with each sensor node i and time instant t wherein index j refers to independently drawn instantiations of the random variable, bit $b_{wit}$ represents a state of sensor node i for signal value w=Y(i,t) at time t, and LeafDescendents(k) are the lowest level sensor nodes under node k, wherein said sketch is adapted for responding to queries regarding a state of said network.
    Type: Application
    Filed: August 28, 2007
    Publication date: March 5, 2009
    Inventors: Charu Chandra Aggarwal, Philip S. Yu
  • Publication number: 20090031175
    Abstract: Systems and methods for parallel stream item counting are disclosed. A data stream is partitioned into portions and the portions are assigned to a plurality of processing cores. A sequential kernel is executed at each processing core to compute a local count for items in an assigned portion of the data stream for that processing core. The counts are aggregated for all the processing cores to determine a final count for the items in the data stream. A frequency-aware counting method (FCM) for data streams includes dynamically capturing relative frequency phases of items from a data stream and placing the items in a sketch structure using a plurality of hash functions where a number of hash functions is based on the frequency phase of the item. A zero-frequency table is provided to reduce errors due to absent items.
    Type: Application
    Filed: July 26, 2007
    Publication date: January 29, 2009
    Inventors: Charu Chandra Aggarwal, Rajesh Bordawekar, Dina Thomas, Philip Shilung Yu
  • Patent number: 6922700
    Abstract: A system and method for providing similarity indexing and searching in multi-dimensional databases. In one aspect, given a set of data points in a multidimensional space, the values of the data points on each dimension are partitioned into a plurality of grids, wherein each grid is assigned a grid value. Given a target data point, similarity candidates (i.e., data points that are similar to the target data point) are identified based on matching grid values. An inverted grid index comprising an index on the data points falling into each grid of each dimension is utilized to identify similarity candidates. A similarity selection process is employed to select the closest identified similarity candidates for output, which utilizes a similarity function to measure the closeness of each identified similarity candidate to the target data point. A preferred similarity function is one that considers a subset of the dimensions in which a point falls within a similar grid of the target point.
    Type: Grant
    Filed: May 16, 2000
    Date of Patent: July 26, 2005
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Philip Shi-lung Yu
  • Patent number: 6714975
    Abstract: A method for dynamically placing objects in slots on a web page in response to a current client request for the web page comprises the steps of classifying users into user groups based on one or more user characteristics, accumulating self-learning data based on user click behavior for each user group, matching the current client request with a corresponding user group, and scheduling real-time selection of the slots for the objects on the web page based on the self-learning data of the corresponding user group.
    Type: Grant
    Filed: March 31, 1997
    Date of Patent: March 30, 2004
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Joel Leonard Wolf, Philip Shi-lung Yu
  • Patent number: 6487541
    Abstract: A rating of a plurality of ratings is predicted. The rating is associated with a user of a plurality of users and the rating corresponds to an item of a plurality of items. One of the plurality of ratings, corresponding to at least one of the plurality of items, is provided for each of the plurality of users. A predictability relation between ones of the plurality of users and other ones of the plurality of users is calculated based on ratings provided by users. One of a plurality of nodes is assigned to each of the plurality of users. Ones of the plurality of nodes are connected with other ones of the plurality of nodes by a plurality of edges based on the predictability relation. A graph which includes the plurality of nodes and the plurality of edges is searched for a path from a node assigned to the user of the plurality of users to another node assigned to another user of the plurality of users.
    Type: Grant
    Filed: January 22, 1999
    Date of Patent: November 26, 2002
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Joel Leonard Wolf, Philip Shi-Lung Yu
  • Patent number: 6360227
    Abstract: A graph taxonomy of information which is represented by a plurality of vectors is generated. The graph taxonomy includes a plurality of nodes and a plurality of edges. The plurality of nodes is generated, and each node of the plurality of nodes is associated with ones of the plurality of vectors. A tree hierarchy is established based on the plurality of nodes. A plurality of distances between ones of the plurality of nodes is calculated. Ones of the plurality of nodes are connected with other ones of the plurality of nodes by ones of the plurality of edges based on the plurality of distances. The information represented by the plurality of vectors may be, for example, a plurality of documents such as Web Pages.
    Type: Grant
    Filed: January 29, 1999
    Date of Patent: March 19, 2002
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Stephen C. Gates, Philip Shi-Lung Yu
  • Patent number: 6349309
    Abstract: A method of analyzing information in the form of a plurality of data values. The plurality of data values represent a plurality of objects. The plurality of data values are distributed in a data space. A set of features which characterize each of the plurality of objects is identified. The plurality of data values are stored in a database. Each of the plurality of data values corresponds to at least one of the plurality of objects based on the set of features. Ones of the plurality of data values stored in the database are partitioned into a plurality of clusters. A respective orientation associated with a position in data space of data values which are contained in each respective cluster of the plurality of clusters is calculated based on the set of features. If desired, information may be analyzed for finding peer groups in e-commerce applications.
    Type: Grant
    Filed: May 24, 1999
    Date of Patent: February 19, 2002
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yu
  • Patent number: 6307965
    Abstract: A system and method are provided to analyze information stored in a computer data base by detecting clusters of related or correlated data values. Data values stored in the data base represent a set of objects. A data value is stored in the data base as an instance of a set of features that characterize the objects. The features are the dimensions of the feature space of the data base. Each cluster includes not only a subset of related data values stored in the data base but also a subset of features. The data values in a cluster are data values that are a short distance apart, in the sense of a metric, when projected onto a subspace that corresponds to the subset of features of the cluster. A set of k clusters may be detected such that the average number of features of the subsets of features of the clusters is l.
    Type: Grant
    Filed: April 30, 1998
    Date of Patent: October 23, 2001
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Joel Leonard Wolf, Philip Shi-Lung Yu
  • Patent number: 6289354
    Abstract: Information is analyzed in the form of a plurality of data values that represent a plurality of objects. A set of features that characterize each object of the plurality of objects is identified. The plurality of data values are stored in a database. Each data value corresponds to at least one of the plurality of objects based on the set of features. Ones of the plurality of data values stored in the database are partitioned into a plurality of clusters. Each cluster of the plurality of clusters is assigned to one respective node of a plurality of nodes arranged in a tree hierarchy. Ones of the plurality of nodes of the tree hierarchy are traversed. If desired, information may be analyzed for finding peer groups in e-commerce applications.
    Type: Grant
    Filed: October 7, 1998
    Date of Patent: September 11, 2001
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Brent Tzion Hailpern, Joel Leonard Wolf, Philip Shi-Lung Yu
  • Patent number: 6263327
    Abstract: A computerized method of online mining of inference rules in a large database. The method is comprised of two stages, a preprocessing stage followed by an online rule generation stage. The preprocessing stage is further defined to be a two step process that involves the generation of large itemsets. The present method defines large itemsets by how the items in the itemsets relate to each other rather than their level of presence. The measure by which itemsets are said to relate to each other is defined by a computed figure of merit, K1. The first substep of the preprocessing stage involves finding those itemsets that possess a minimum computed collective strength of K1. From those found itemsets, a second user supplied input, K2, is used to prune those itemsets with inference strength below K2.
    Type: Grant
    Filed: March 10, 2000
    Date of Patent: July 17, 2001
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yu
  • Patent number: 6236985
    Abstract: A method of analyzing information in the form of a plurality of data records. Each data record includes one or more data values. The data values are partitioned into a plurality of data signatures. Data values of data signatures are compared to data values of data records. Based on the result of the comparison an index is associated with each data record. A bound corresponding to the index is calculated based on a user defined target value and an objective function. If desired, information may be analyzed for finding peer groups in e-commerce applications.
    Type: Grant
    Filed: October 7, 1998
    Date of Patent: May 22, 2001
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Joel Leonard Wolf, Philip Shi-Lung Yu
  • Patent number: 6151589
    Abstract: A method for performing continuous auctions over a computer network system consisting of a server/seller and multiple clients/buyers. The seller makes information about the type of sale items, the number of sale items, minimum bid price, time limits for bids to be submitted, and estimated time interval to the next auction decision available to the buyer by displaying it on buyers' computer terminals. Each buyer responds by entering a bid and such bid's duration, within the time limits set by the seller, into the auction system through buyers' computer terminals. Additionally, a buyer's bid entry time is saved by the system. The response time for present buyers is determined in order to schedule the next auction. At least one auction winner, whose bid is within bid duration, is selected through a dynamically adjusted customer selection method.
    Type: Grant
    Filed: September 10, 1998
    Date of Patent: November 21, 2000
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yu
  • Patent number: 6094645
    Abstract: A computer method of online mining of inference rules in a large database comprising a preprocessing stage and an online rule generation stage. The pre-processing stage includes first finding itemsets that possess a minimum computed collective strength K1, and second, pruning the itemsets with inference strength below a predetermined inference strength, K2. The online rule generation stage utilizes the itemsets organized into an adjacency lattice to generate inference rules with inference strength K2.
    Type: Grant
    Filed: November 21, 1997
    Date of Patent: July 25, 2000
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yu
  • Patent number: 6092064
    Abstract: A computer method of online mining of quantitative association rules consisting of two stages, a preprocessing stage followed by an online rule generation stage. The required computational effort is reduced by the preprocessing stage, defined by preprocessing data to organize the relationship between antecedent attributes to create a hierarchically arranged multidimensional indexing structure. The resulting structure facilitates the performance of the second stage, online processing, which involves the generation of quantitative association rules. The second stage, online rule generation, utilizes the multidimensional index structure created by the preprocessing stage by first finding the areas in the data which correspond to the rules and then uses a merging step to create a merged tree in order to carefully combine interesting regions in order to give a hierarchical representation of the rule set. The merged tree is then used in order to actually generate the rules.
    Type: Grant
    Filed: November 4, 1997
    Date of Patent: July 18, 2000
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Philip Shi-Lung Yu
  • Patent number: 6012126
    Abstract: A system and method for caching objects of non-uniform size. A caching logic includes a selection logic and an admission control logic. The admission control logic determines whether an object that is not currently in the cache when accessed may be cached at all. The admission control logic uses an auxiliary LRU stack which contains the identities and time stamps of the objects which have been recently accessed. Thus, the memory required is relatively small. The auxiliary cache serves as a dynamic popularity list and an object may be admitted to the cache if and only if it appears on the popularity list. The selection logic selects one or more of the objects in the cache which have to be purged when a new object enters the cache. The order of removal of the objects is prioritized based both on the size as well as the frequency of access of the object and may be adjusted by a time to obsolescence factor (TTO).
    Type: Grant
    Filed: October 29, 1996
    Date of Patent: January 4, 2000
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Marina Aleksandrovna Epelman, Joel Leonard Wolf, Philip Shi-lung Yu
  • Patent number: 5943667
    Abstract: A computer method of removing simple and strict redundant association rules generated from large collections of data. A compact set of rules is presented to an end user which is devoid of many redundancies in the discovery of data patterns. The method is directed primarily to on-line applications such as the Internet and Intranet. Given a number of large itemsets as input, simple redundancies are removed by generating all maximal ancestors, the frontier set, for each large itemset. The set of maximal ancestors share a hierarchical relationship with the large itemset from which they were derived and further satisfy an inequality whereby the ratio of respective support values is less than the reciprocal of some user defined confidence value. The resulting compact rule set is displayed to an end user at some specified level of support and confidence. The method is also able to generate the full set of rules from the compact set.
    Type: Grant
    Filed: June 3, 1997
    Date of Patent: August 24, 1999
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Philip Shi-lung Yu
  • Patent number: 5924116
    Abstract: A method and system of collaboratively caching information to allow improved caching decisions by a lower level or sibling node. In a caching hierarchy, the client and/or servers may factor in the caching status at the higher level in deciding whether to cache an object and which objects are to be replaced. The PICS protocol may be used to pass the caching information of some or all the upper hierarchy down the hierarchy. Furthermore, the caching status information can also be used to direct the object request to the closest higher level proxy which has potentially cached the object, instead of blindly requesting it from the next immediate higher level proxy. A selection policy used to select objects for replacement in the cache may be prioritized not only on the size and the frequency of access of the object, but also on the access time required to get the object if it is not cached.
    Type: Grant
    Filed: April 2, 1997
    Date of Patent: July 13, 1999
    Assignee: International Business Machines Corporation
    Inventors: Charu Chandra Aggarwal, Peter Kenneth Malkin, Robert Jeffrey Schloss, Philip Shi-lung Yu
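
The abstract of patent 5924116 above says that objects may be selected for replacement based on their size, their frequency of access, and the access time needed to fetch them when not cached. It does not give a formula, so the sketch below is only an illustration under stated assumptions: the `CachedObject` fields, the `retention_score` formula (frequency times fetch time, divided by size), and the function names are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CachedObject:
    # Hypothetical record for one cached object (field names are assumptions).
    name: str
    size_bytes: int       # space the object occupies in the cache
    access_count: int     # how often it has been requested
    fetch_seconds: float  # time needed to fetch it again if evicted


def retention_score(obj: CachedObject) -> float:
    # Assumed scoring rule: objects that are small, frequently accessed, and
    # slow to re-fetch are worth more per byte of cache space.
    return obj.access_count * obj.fetch_seconds / obj.size_bytes


def select_for_replacement(objects: List[CachedObject],
                           bytes_needed: int) -> List[CachedObject]:
    # Evict the lowest-scoring objects first until enough space is freed.
    victims: List[CachedObject] = []
    freed = 0
    for obj in sorted(objects, key=retention_score):
        if freed >= bytes_needed:
            break
        victims.append(obj)
        freed += obj.size_bytes
    return victims


if __name__ == "__main__":
    cache = [
        CachedObject("a", size_bytes=1000, access_count=10, fetch_seconds=0.5),
        CachedObject("b", size_bytes=5000, access_count=2, fetch_seconds=0.1),
        CachedObject("c", size_bytes=200, access_count=1, fetch_seconds=2.0),
    ]
    print([o.name for o in select_for_replacement(cache, 4000)])
```

Dividing by size is one common way to compare objects of non-uniform size; other weightings of the same three factors would fit the abstract equally well.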