Patents Represented by Attorney, Agent or Law Firm Khanh Q. Tran
  • Patent number: 5870748
    Abstract: A method is disclosed for determining the correlation among data sets having a numerical attribute and a 0-1 attribute. First, a numerical attribute is divided into a plurality of buckets, and each data set is placed into a single bucket according to the value of the numerical attribute. The number of data sets in each bucket and the number of data sets with a 0-1 attribute of 1 are counted. Second, an axis corresponding to the total number of data sets in the first through a particular bucket (X axis) and an axis corresponding to the total number of data sets with a 0-1 attribute of 1 in the first through a particular bucket (Y axis) are virtually established, and points corresponding to the respective values of the first through the particular bucket are virtually plotted. Third, after a plane is constructed in this manner, the pair of points separated by an interval of T×N or T or larger that has the largest slope is found.
    Type: Grant
    Filed: October 25, 1996
    Date of Patent: February 9, 1999
    Assignee: International Business Machines Corporation
    Inventors: Yasuhiko Morimoto, Takeshi Fukuda, Shinichi Morishita, Takeshi Tokuyama
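
The plotting step in the abstract above can be illustrated with a brute-force sketch. The function name `steepest_bucket_range` and the `min_count` parameter (standing in for the minimum interval on the X axis) are assumptions made for illustration; the abstract does not say how the search over point pairs is carried out, so this is not the patented procedure.

```python
from itertools import accumulate

def steepest_bucket_range(totals, ones, min_count):
    """Return (i, j, slope) for the steepest pair of cumulative points.

    totals[k] is the number of data sets in bucket k, and ones[k] is the
    number of those whose 0-1 attribute is 1. Point k on the plane is
    (sum of totals[:k], sum of ones[:k]). The pair (i, j) covers buckets
    i..j-1. min_count is a hypothetical minimum X-axis separation.
    """
    xs = [0] + list(accumulate(totals))  # cumulative data-set counts
    ys = [0] + list(accumulate(ones))    # cumulative counts with attribute 1
    best = None
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            dx = xs[j] - xs[i]
            if dx == 0 or dx < min_count:
                continue  # pair is too close together on the X axis
            slope = (ys[j] - ys[i]) / dx
            if best is None or slope > best[2]:
                best = (i, j, slope)
    return best

result = steepest_bucket_range([10, 10, 10, 10], [1, 8, 9, 2], min_count=20)
print(result)
```

The slope between two points is the fraction of data sets in the covered buckets whose 0-1 attribute is 1, so the steepest admissible pair marks the densest range of at least `min_count` data sets. The double loop is quadratic in the number of buckets and is meant only to make the geometry concrete.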
  • Patent number: 5812997
    Abstract: A method is described for finding correlation between a plurality of data having two kinds of numerical attributes and a true-false attribute. The method comprises the steps of: constituting a plane with two numerical attributes, dividing the plane into meshes, and counting the number of data in each mesh (also called a "bucket") and the number of data whose true-false attribute represents true. If each mesh is assumed to be a pixel, such a plane can be considered as a plane image in which the number of data corresponds to brilliance, and the number of data whose true-false attribute represents true corresponds to saturation. The method further includes the step of segmenting an admissible image which is convex along an axis of the plane according to a predetermined condition θ to find an area with strong correlation. If the segmented area as the admissible image satisfies the above-described condition such as the maximized support rule, the method also presents the area to the user.
    Type: Grant
    Filed: October 25, 1996
    Date of Patent: September 22, 1998
    Assignee: International Business Machines Corporation
    Inventors: Yasuhiko Morimoto, Takeshi Fukuda, Shinichi Morishita, Takeshi Tokuyama
  • Patent number: 5813002
    Abstract: A method for detecting deviations in a database is disclosed, comprising the steps of: determining respective frequencies of occurrence for the attribute values of the data items, and identifying any itemset whose similarity value satisfies a predetermined criterion as a deviation, based on the frequencies of occurrence. The determination of the frequencies of occurrence includes computing an overall similarity value for the database, and for each first itemset, computing a difference between the overall similarity value and the similarity value of a second itemset. The second itemset has all the data items except those of the first itemset. Preferably, a smoothing factor is used for indicating how much dissimilarity in an itemset can be reduced by removing a subset of items from the itemset. The smoothing factor is evaluated as each item is incrementally removed from the itemset, thereby allowing a data item to be identified as a deviation when the difference in similarity value is the highest.
    Type: Grant
    Filed: July 31, 1996
    Date of Patent: September 22, 1998
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Andreas Arning
  • Patent number: 5799311
    Abstract: A method and system are disclosed for generating a decision-tree classifier from a training set of records, independent of the system memory size. The method comprises the steps of: generating an attribute list for each attribute of the records, sorting the attribute lists for numeric attributes, and generating a decision tree by repeatedly partitioning the records using the attribute lists. For each node, split points are evaluated to determine the best split test for partitioning the records at the node. Preferably, a gini index and class histograms are used in determining the best splits. The gini index indicates how well a split point separates the records while the class histograms reflect the class distribution of the records at the node. Also, a hash table is built as the attribute list of the split attribute is divided among the child nodes, which is then used for splitting the remaining attribute lists of the node.
    Type: Grant
    Filed: May 8, 1996
    Date of Patent: August 25, 1998
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Manish Mehta, John Christopher Shafer
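
The role of the gini index and class histograms in the abstract above can be sketched for a single sorted numeric attribute list. This is a minimal illustration, not the patented procedure: the function names and the choice of the midpoint between adjacent distinct values as the candidate split point are assumptions.

```python
from collections import Counter

def gini_from_counts(counts, n):
    """Gini index 1 - sum(p_c^2) for a class histogram over n records."""
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_numeric_split(values, labels):
    """Scan one sorted attribute list and return (threshold, weighted_gini).

    Two class histograms are kept while scanning: one for records at or
    below the candidate split and one for records above it. Candidate
    thresholds are midpoints between adjacent distinct values (assumed).
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    below = Counter()
    above = Counter(label for _, label in pairs)
    best_threshold, best_gini = None, float("inf")
    for i in range(n - 1):
        value, label = pairs[i]
        below[label] += 1   # move this record across the split
        above[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue        # cannot split between equal values
        n_below, n_above = i + 1, n - (i + 1)
        weighted = (n_below / n) * gini_from_counts(below, n_below) \
                 + (n_above / n) * gini_from_counts(above, n_above)
        if weighted < best_gini:
            best_threshold = (value + next_value) / 2
            best_gini = weighted
    return best_threshold, best_gini

print(best_numeric_split([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"]))
```

A lower weighted gini index means the split separates the classes more cleanly; the example data separates perfectly at 6.5, giving an index of 0.0.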
  • Patent number: 5799300
    Abstract: A method for performing a range-sum query in a database, in which the data is represented as a multi-dimensional data cube, is disclosed. The method comprises the steps of: selecting a subset of the dimensions of the data cube; computing a set of prefix-sums along the selected dimensions, based on the aggregate values in the cube corresponding to the queried ranges; and generating a range-sum result based on the computed prefix-sums. Two d-dimensional arrays A and P are used for representing the data cube and the prefix-sums of the data cube, respectively. By maintaining the prefix-sum array P of the same size as the data cube, all range queries for a given cube can be answered in constant time, irrespective of the size of the sub-cube circumscribed by a query, using the inverse binary operator of the SUM operator. Alternatively, only auxiliary information for any user-specified fraction of the size of the d-dimensional data cube is maintained, to minimize the required system storage.
    Type: Grant
    Filed: December 12, 1996
    Date of Patent: August 25, 1998
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Ching-Tien Ho, Ramakrishnan Srikant
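
The constant-time range-sum idea in the abstract above can be shown for the two-dimensional case. The arrays are named A and P as in the abstract; the function names and the restriction to d = 2 are assumptions made to keep the sketch short.

```python
def prefix_sums_2d(A):
    """Build P, where P[i][j] is the sum of A[0..i][0..j] (inclusive)."""
    rows, cols = len(A), len(A[0])
    P = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            up = P[i - 1][j] if i > 0 else 0
            left = P[i][j - 1] if j > 0 else 0
            diag = P[i - 1][j - 1] if i > 0 and j > 0 else 0
            P[i][j] = A[i][j] + up + left - diag
    return P

def range_sum(P, r1, c1, r2, c2):
    """Sum of A[r1..r2][c1..c2] using four lookups in P.

    Subtraction is the inverse of SUM, so the query cost does not depend
    on the size of the queried sub-cube.
    """
    def at(i, j):
        return P[i][j] if i >= 0 and j >= 0 else 0
    return at(r2, c2) - at(r1 - 1, c2) - at(r2, c1 - 1) + at(r1 - 1, c1 - 1)

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
P = prefix_sums_2d(A)
print(range_sum(P, 1, 1, 2, 2))  # 5 + 6 + 8 + 9 = 28
```

P has the same size as A, which matches the full-storage variant in the abstract; the reduced-storage alternative it mentions is not shown here.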
  • Patent number: 5796827
    Abstract: An apparatus and method are disclosed for encoding and transferring data from a transmitter to a receiver, using the human body as a transmission medium. The transmitter includes an electric field generator, a data encoder which operates by modulating the electric field, and electrodes to couple the electric field through the human body. The receiver includes electrodes, in physical contact with, or close proximity to, a part of the human body, for detecting an electric field carried through the body, and a demodulator for extracting the data from the modulated electric field. An authenticator, connected to the receiver, processes the encoded data and validates the authenticity of the transmission. The apparatus and method are used to identify and authorize a possessor of the transmitter. The possessor then has secure access to, and can obtain delivery of, goods and services such as the distribution of money, phone privileges, building access, and commodities.
    Type: Grant
    Filed: November 14, 1996
    Date of Patent: August 18, 1998
    Assignee: International Business Machines Corporation
    Inventors: Don Coppersmith, Prabhakar Raghavan, Thomas G. Zimmerman
  • Patent number: 5787274
    Abstract: A method and apparatus are disclosed for generating a decision tree classifier from a training set of records. The method comprises the steps of: pre-sorting the records based on each numeric record attribute, creating a decision tree breadth-first, and pruning the tree based on the MDL principle. Preferably, the pre-sorting includes generating a class list and attribute lists, and independently sorting the numeric attribute lists. The growing of the tree includes evaluating possible splitting criteria and selecting a splitting test for each leaf node, based on a splitting index, and updating the class list to reflect new leaf nodes. In a preferred embodiment, the splitting index is a gini index. The pruning preferably includes encoding the decision tree and splitting tests in an MDL-based code, and determining whether to convert a node into a leaf node, prune its child nodes, or leave the node intact, based on the code length of the node.
    Type: Grant
    Filed: November 29, 1995
    Date of Patent: July 28, 1998
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Manish Mehta, Jorma Johannes Rissanen
  • Patent number: 5764872
    Abstract: A method and apparatus for displaying stream lines in a space are disclosed. First, the space is divided into a plurality of tetrahedral cells. Position data of each vertex of the tetrahedral cells and vector data at each position are collected. A critical point for each tetrahedral cell is then computed. The critical point is within the tetrahedral cell and for which the vector data becomes zero. Using the collected position and vector data, a Jacobian matrix J is calculated when such a critical point is found, and eigenvalues of the Jacobian matrix J are also calculated. Next, the starting point of a stream line within a tetrahedral cell is calculated for each of the eigenvalues by moving a microscopic distance from the critical point. Finally, a stream line is calculated from the starting point, and the stream line is displayed.
    Type: Grant
    Filed: June 27, 1995
    Date of Patent: June 9, 1998
    Assignee: International Business Machines Corporation
    Inventors: Koji Koyamada, Takayuki Ito
  • Patent number: 5748023
    Abstract: An integrator is disclosed that is capable of outputting the same integration result with respect to the same bit pattern even if there are fluctuations in the integrating period, semiconductor device process, or the power supply voltage. The disclosed integrator includes: (1) a first integrator having a first amplifier, for integrating a reference voltage during an integrating period, (2) a second integrator having a second amplifier, for integrating an input signal during the integrating period, and (3) control means for outputting a signal regulating a gain of the first amplifier to the first amplifier so that an output of the first integrator varies in correspondence with the integrating period, and for regulating a gain of the second amplifier by means of the signal.
    Type: Grant
    Filed: June 24, 1996
    Date of Patent: May 5, 1998
    Inventors: Martin Hassner, Seiji Koyama, Tohru Nozawa, Asao Terukina, Tamura Tetsuya
  • Patent number: 5742811
    Abstract: A method and apparatus are disclosed for mining generalized sequential patterns from a large database of data sequences, taking into account user specified constraints on the time-gap between adjacent elements of the patterns, sliding time-window, and taxonomies over data items. The invention first identifies the items with at least a minimum support, i.e., those contained in more than a minimum number of data sequences. The items are used as a seed set to generate candidate sequences. Next, the support of the candidate sequences is counted. The invention then identifies those candidate sequences that are frequent, i.e., those with a support above the minimum support. The frequent candidate sequences are entered into the set of sequential patterns, and are used to generate the next group of candidate sequences.
    Type: Grant
    Filed: October 10, 1995
    Date of Patent: April 21, 1998
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Ramakrishnan Srikant
  • Patent number: 5724573
    Abstract: A method and apparatus are disclosed for mining quantitative association rules from a relational table of records. The method comprises the steps of: partitioning the values of selected quantitative attributes into intervals, combining adjacent attribute values and intervals into ranges, generating candidate itemsets, determining frequent itemsets, and outputting an association rule when the support for a frequent itemset bears a predetermined relationship to the support for a subset of the frequent itemset. Preferably, the partitioning step includes determining whether to partition and the number of partitions based on a partial incompleteness measure. The candidate generation includes discarding those itemsets not meeting a user-specified interest level and those having a subset which is not a frequent itemset. The frequent itemsets are determined using super-candidates that include information of the candidate itemsets.
    Type: Grant
    Filed: December 22, 1995
    Date of Patent: March 3, 1998
    Assignee: International Business Machines Corporation
    Inventors: Rakesh Agrawal, Ramakrishnan Srikant
  • Patent number: 5682470
    Abstract: A method and apparatus are disclosed for achieving collective consistency in the detection and reporting of failures in a distributed computing system having multiple processors. Each processor is capable of being called by a parallel application for system status. Initially, each processor sends the other processors its view on the status of the processors. It then waits for similar views from other processors except those regarded as failed in its own view. If the received views are identical to the view of the processor, the processor returns its view to the parallel application. In a preferred embodiment, if the views are not identical to its view, the processor sets its view to the union of the received views and its current view. The steps are then repeated. Alternately, the steps are repeated if the processor does not have information that each of the processors not regarded as failed in its view forms an identical union view.
    Type: Grant
    Filed: September 1, 1995
    Date of Patent: October 28, 1997
    Assignee: International Business Machines Corporation
    Inventors: Cynthia Dwork, Ching-Tien Ho, Hovey Raymond Strong, Jr.