Patents Assigned to Health Discovery Corporation
  • Publication number: 20090204557
    Abstract: An automated method and system are provided for receiving an input of flow cytometry data and analyzing the data using one or more support vector machines to generate an output in which the flow cytometry data is classified into two or more categories. The one or more support vector machines utilize a kernel that captures distributional data within the input data. Such a distributional kernel is constructed by using a distance function (divergence) between two distributions. In the preferred embodiment, a kernel based upon the Bhattacharyya affinity is used. The distributional kernel is applied to classification of flow cytometry data obtained from patients suspected of having myelodysplastic syndrome.
    Type: Application
    Filed: February 8, 2009
    Publication date: August 13, 2009
    Applicant: HEALTH DISCOVERY CORPORATION
    Inventor: Hong Zhang
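As an illustration of the distributional kernel described in the abstract above, here is a minimal sketch of the Bhattacharyya affinity between two discrete distributions. It assumes each sample has already been summarized as a histogram over the same bins; the function names `bhattacharyya_affinity` and `gram_matrix` are hypothetical, and the patent itself may estimate the distributions differently.

```python
import math


def bhattacharyya_affinity(p, q):
    """Bhattacharyya affinity between two histograms over the same bins.

    Both inputs are normalized to sum to 1 first. The result lies in
    [0, 1]: 1 for identical distributions, 0 for disjoint supports.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    return sum(math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q))


def gram_matrix(histograms):
    """Pairwise affinities, usable as a precomputed kernel matrix for an SVM."""
    return [[bhattacharyya_affinity(p, q) for q in histograms] for p in histograms]


# Hypothetical example: three samples binned into four channels.
samples = [[10, 20, 30, 40], [12, 18, 33, 37], [40, 30, 20, 10]]
K = gram_matrix(samples)
```

Because the affinity is an inner product of square-rooted probabilities, the resulting matrix is symmetric and positive semi-definite, which is what an SVM with a precomputed kernel expects.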
  • Patent number: 7542947
    Abstract: The data mining platform comprises a plurality of system modules, each formed from a plurality of components. Each module has an input data component, a data analysis engine for processing the input data, an output data component for outputting the results of the data analysis, and a web server to access and monitor the other modules within the unit and to provide communication to other units. Each module processes a different type of data, for example, a first module processes microarray (gene expression) data while a second module processes biomedical literature on the Internet for information supporting relationships between genes and diseases and gene functionality. In the preferred embodiment, the data analysis engine is a kernel-based learning machine, and in particular, one or more support vector machines (SVMs).
    Type: Grant
    Filed: October 30, 2007
    Date of Patent: June 2, 2009
    Assignee: Health Discovery Corporation
    Inventors: Isabelle Guyon, Edward P. Reiss, René Doursat, Jason Aaron Edward Weston, David D. Lewis
  • Patent number: 7542959
    Abstract: Identification of a determinative subset of features from within a large set of features is performed by training a support vector machine to rank the features according to classifier weights, where features are removed to determine how their removal affects the value of the classifier weights. The features having the smallest weight values are removed and a new support vector machine is trained with the remaining features. The process is repeated until a relatively small subset of features remains that is capable of accurately separating the data into different patterns or classes. The method is applied for selecting the smallest number of genes that are capable of accurately distinguishing between medical conditions such as cancer and non-cancer.
    Type: Grant
    Filed: August 21, 2007
    Date of Patent: June 2, 2009
    Assignee: Health Discovery Corporation
    Inventors: Stephen Barnhill, Isabelle Guyon, Jason Weston
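The train, rank, remove, retrain loop in the abstract above can be sketched as follows. This is an illustrative simplification, not the patented implementation: it assumes a bias-free linear SVM fitted by full-batch subgradient descent on the regularized hinge loss, removes one feature per round, and uses hypothetical names (`train_linear_svm`, `svm_rfe`) and toy data.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a bias-free linear SVM by full-batch subgradient descent on the
    L2-regularized hinge loss and return the weight vector."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:  # this point violates the margin
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def svm_rfe(X, y, n_keep=1):
    """Recursive feature elimination: retrain, then drop the feature with
    the smallest squared weight, until n_keep features remain."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


# Toy data (an assumption): column 0 carries the class signal,
# columns 1 and 2 are small-amplitude noise.
X = [[2.0, 0.1, -0.1], [1.5, -0.1, 0.1], [-2.0, 0.1, 0.1], [-1.5, -0.1, -0.1]]
y = [1, 1, -1, -1]
kept, eliminated = svm_rfe(X, y, n_keep=1)
```

On real gene-expression data the features would normally be standardized first, and several features are often removed per round to keep the number of retraining passes manageable.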
  • Patent number: 7475048
    Abstract: A computer-implemented method is provided for ranking features within a large dataset containing a large number of features according to each feature's ability to separate data into classes. For each feature, a support vector machine separates the dataset into two classes and determines the margins between extremal points in the two classes. The margins for all of the features are compared and the features are ranked based upon the size of the margin, with the highest ranked features corresponding to the largest margins. A subset of features for classifying the dataset is selected from a group of the highest ranked features. In one embodiment, the method is used to identify the best genes for disease prediction and diagnosis using gene expression data from micro-arrays.
    Type: Grant
    Filed: November 7, 2002
    Date of Patent: January 6, 2009
    Assignee: Health Discovery Corporation
    Inventors: Jason Weston, André Elisseeff, Bernhard Schölkopf, Fernando Perez-Cruz, Isabelle Guyon
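The per-feature margin ranking in the abstract above can be illustrated with a simplified sketch. It assumes that, for a single feature, the margin of a separating classifier reduces to the gap between the extremal points of the two classes along that feature, and it treats overlapping classes as having zero margin. The helper names and the toy data are hypothetical, and features are assumed to be on comparable scales.

```python
def feature_margin(values, labels):
    """Gap between the two classes along one feature (0.0 if they overlap).

    Labels are assumed to be +1 / -1, with both classes present.
    """
    pos = [v for v, label in zip(values, labels) if label == 1]
    neg = [v for v, label in zip(values, labels) if label == -1]
    # Either class may lie on the upper side, so check both orientations.
    gap = max(min(pos) - max(neg), min(neg) - max(pos))
    return max(gap, 0.0)


def rank_features_by_margin(X, y):
    """Return feature indices sorted by decreasing margin, plus the margins."""
    d = len(X[0])
    margins = [feature_margin([row[j] for row in X], y) for j in range(d)]
    order = sorted(range(d), key=lambda j: margins[j], reverse=True)
    return order, margins


# Toy data (an assumption): feature 0 separates the classes widely,
# feature 2 narrowly, and feature 1 not at all.
X = [[0.0, 1.0, 1.0], [0.2, 3.0, 1.5], [3.0, 2.0, 2.0], [3.5, 4.0, 2.5]]
y = [-1, -1, 1, 1]
order, margins = rank_features_by_margin(X, y)
```

A subset for classification would then be taken from the front of `order`, mirroring the selection step the abstract describes.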
  • Patent number: 7444308
    Abstract: The data mining platform comprises a plurality of system modules (500, 550), each formed from a plurality of components. Each module has an input data component (502, 552), a data analysis engine (504, 554) for processing the input data, an output data component (506, 556) for outputting the results of the data analysis, and a web server (510) to access and monitor the other modules within the unit and to provide communication to other units. Each module processes a different type of data, for example, a first module processes microarray (gene expression) data while a second module processes biomedical literature on the Internet for information supporting relationships between genes and diseases and gene functionality.
    Type: Grant
    Filed: June 17, 2002
    Date of Patent: October 28, 2008
    Assignee: Health Discovery Corporation
    Inventors: Isabelle Guyon, Edward P. Reiss, René Doursat, Jason Aaron Edward Weston
  • Patent number: 7383237
    Abstract: Digitized image data are input into a processor where a detection component identifies the areas (objects) of particular interest in the image and, by segmentation, separates those objects from the background. A feature extraction component formulates numerical values relevant to the classification task from the segmented objects. Results of the preceding analysis steps are input into a trained learning machine classifier which produces an output which may consist of an index discriminating between two possible diagnoses, or some other output in the desired output format. In one embodiment, digitized image data are input into a plurality of subsystems, each subsystem having one or more support vector machines. Pre-processing may include the use of known transformations which facilitate extraction of the useful data. Each subsystem analyzes the data relevant to a different feature or characteristic found within the image.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: June 3, 2008
    Assignee: Health Discovery Corporation
    Inventors: Hong Zhang, Garry Carls, Stephen D. Barnhill
  • Patent number: 7366719
    Abstract: There is described a method for manipulation, storage, modeling, visualization, and quantification of datasets, which correspond to target strings. An iterative algorithm is used to generate comparison strings corresponding to some set of points that can serve as the domain of an iterative function. The comparison string is scored by evaluating a function having the comparison string and one of the plurality of target strings as inputs. The score measures a relationship between a comparison string and a target string. The evaluation may be repeated for a number of the other target strings. The score or some other property corresponding to the comparison string is used to determine the target string's placement on a map. The target string may also be marked by a point on a visual display.
    Type: Grant
    Filed: October 6, 2004
    Date of Patent: April 29, 2008
    Assignee: Health Discovery Corporation
    Inventor: Sandy C. Shaw
  • Patent number: 7353215
    Abstract: Learning machines, such as support vector machines, are used to analyze datasets to recognize patterns within the dataset using kernels that are selected according to the nature of the data to be analyzed. Where the dataset possesses structural characteristics, locational kernels can be utilized to provide measures of similarity among data points within the dataset. The locational kernels are then combined to generate a decision function, or kernel, that can be used to analyze the dataset. Where invariance transformations or noise is present, tangent vectors are defined to identify relationships between the invariance or noise and the data points. A covariance matrix is formed using the tangent vectors, then used in generation of the kernel for recognizing patterns in the dataset.
    Type: Grant
    Filed: May 7, 2002
    Date of Patent: April 1, 2008
    Assignee: Health Discovery Corporation
    Inventors: Peter L. Bartlett, André Elisseeff, Bernhard Schoelkopf
  • Patent number: 7318051
    Abstract: In a pre-processing step prior to training a learning machine, the quantity of features to be processed is reduced using feature selection methods selected from the group consisting of recursive feature elimination (RFE), minimizing the number of non-zero parameters of the system (l0-norm minimization), evaluation of a cost function to identify a subset of features that are compatible with constraints imposed by the learning set, unbalanced correlation score, and transductive feature selection. The features remaining after feature selection are then used to train a learning machine for purposes of pattern classification, regression, clustering and/or novelty detection.
    Type: Grant
    Filed: May 20, 2002
    Date of Patent: January 8, 2008
    Assignee: Health Discovery Corporation
    Inventors: Jason Aaron Edward Weston, André Elisseeff, Bernhard Schoelkopf, Fernando Pérez-Cruz
  • Patent number: 7299213
    Abstract: The spectral kernel machine combines kernel functions and spectral graph theory for solving problems of machine learning. The data points in the dataset are placed in the form of a matrix known as a kernel matrix, or Gram matrix, containing all pairwise kernels between the data points. The dataset is regarded as nodes of a fully connected graph. A weight equal to the kernel between the two nodes is assigned to each edge of the graph. The adjacency matrix of the graph is equivalent to the kernel matrix, also known as the Gram matrix. The eigenvectors and their corresponding eigenvalues provide information about the properties of the graph, and thus, the dataset. The second eigenvector can be thresholded to approximate the class assignment of graph nodes.
    Type: Grant
    Filed: September 12, 2005
    Date of Patent: November 20, 2007
    Assignee: Health Discovery Corporation
    Inventor: Nello Cristianini
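The thresholding step in the abstract above can be sketched in a few lines. This example assumes a Gaussian kernel on one-dimensional points, obtains the second eigenvector by power iteration with deflation, and thresholds at zero; the kernel, the threshold value, and all function names are illustrative assumptions rather than details taken from the patent.

```python
import math


def gaussian_kernel(x, y, sigma=2.0):
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def matvec(a, v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def power_iteration(a, iters=500):
    """Dominant eigenpair of a symmetric matrix via power iteration."""
    v = [float(i + 1) for i in range(len(a))]  # fixed, non-symmetric start
    for _ in range(iters):
        w = matvec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(x * y for x, y in zip(v, matvec(a, v)))
    return eigval, v


def second_eigenvector(k):
    """Deflate the dominant eigenpair, then run power iteration again."""
    n = len(k)
    lam1, v1 = power_iteration(k)
    deflated = [[k[i][j] - lam1 * v1[i] * v1[j] for j in range(n)]
                for i in range(n)]
    _, v2 = power_iteration(deflated)
    return v2


# Toy data (an assumption): two loose groups on a line.
points = [0.0, 1.0, 3.0, 4.0]
K = [[gaussian_kernel(a, b) for b in points] for a in points]  # Gram matrix
v2 = second_eigenvector(K)
labels = [1 if x > 0 else 0 for x in v2]  # threshold at zero
```

The sign of an eigenvector is arbitrary, so only the grouping of the points is meaningful, not which group receives label 1. For larger datasets a library eigensolver would be the more practical choice.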
  • Patent number: 7117188
    Abstract: The methods, systems and devices of the present invention comprise use of Support Vector Machines and RFE (Recursive Feature Elimination) for the identification of patterns that are useful for medical diagnosis, prognosis and treatment. SVM-RFE can be used with varied data sets.
    Type: Grant
    Filed: January 24, 2002
    Date of Patent: October 3, 2006
    Assignee: Health Discovery Corporation
    Inventors: Isabelle Guyon, Jason Aaron Edward Weston
  • Patent number: 6996549
    Abstract: Digitized image data are input into a processor where a detection component identifies the areas (objects) of particular interest in the image and, by segmentation, separates those objects from the background. A feature extraction component formulates numerical values relevant to the classification task from the segmented objects. Results of the preceding analysis steps are input into a trained learning machine classifier which produces an output which may consist of an index discriminating between two possible diagnoses, or some other output in the desired output format. In one embodiment, digitized image data are input into a plurality of subsystems, each subsystem having one or more support vector machines. Pre-processing may include the use of known transformations which facilitate extraction of the useful data. Each subsystem analyzes the data relevant to a different feature or characteristic found within the image.
    Type: Grant
    Filed: January 23, 2002
    Date of Patent: February 7, 2006
    Assignee: Health Discovery Corporation
    Inventors: Hong Zhang, Garry Carls, Stephen D. Barnhill
  • Patent number: 6944602
    Abstract: The spectral kernel machine combines kernel functions and spectral graph theory for solving problems of machine learning. The data points in the dataset are placed in the form of a matrix known as a kernel matrix, or Gram matrix, containing all pairwise kernels between the data points. The dataset is regarded as nodes of a fully connected graph. A weight equal to the kernel between the two nodes is assigned to each edge of the graph. The adjacency matrix of the graph is equivalent to the kernel matrix, also known as the Gram matrix. The eigenvectors and their corresponding eigenvalues provide information about the properties of the graph, and thus, the dataset. The second eigenvector can be thresholded to approximate the class assignment of graph nodes.
    Type: Grant
    Filed: March 1, 2002
    Date of Patent: September 13, 2005
    Assignee: Health Discovery Corporation
    Inventor: Nello Cristianini
  • Patent number: 6920451
    Abstract: There is described a method for manipulation, storage, modeling, visualization, and quantification of datasets, which correspond to target strings. A number of target strings are provided. An iterative algorithm is used to generate comparison strings corresponding to some set of points that can serve as the domain of an iterative function. Preferably these points are located in the complex plane, such as in and/or near the Mandelbrot Set or a Julia Set. These comparison strings are also datasets. The comparison string is scored by evaluating a function having the comparison string and one of the plurality of target strings as inputs. The score measures a relationship between a comparison string and a target string. The evaluation may be repeated for a number of the other target strings. The score or some other property corresponding to the comparison string is used to determine the target string's placement on a map. The target string may also be marked by a point on a visual display.
    Type: Grant
    Filed: January 19, 2001
    Date of Patent: July 19, 2005
    Assignee: Health Discovery Corporation
    Inventor: Sandy C. Shaw