Patents by Inventor Jason Weston

Jason Weston has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

RECURSIVE FEATURE ELIMINATION METHOD USING SUPPORT VECTOR MACHINES

Publication number: 20110106735

Abstract: Identification of a determinative subset of features from within a group of features is performed by training a support vector machine using training samples with class labels to determine a value of each feature, where features are removed based on their value. One or more features having the smallest values are removed and an updated kernel matrix is generated using the remaining features. The process is repeated until a predetermined number of features remain which are capable of accurately separating the data into different classes. In some embodiments, features are eliminated by a ranking criterion based on a Lagrange multiplier corresponding to each training sample.

Type: Application

Filed: November 11, 2010

Publication date: May 5, 2011

Applicant: Health Discovery Corporation

Inventors: Jason Weston, André Elisseeff, Bernhard Schölkopf, Fernando Perez-Cruz, Isabelle Guyon
SYSTEM FOR PROVIDING DATA ANALYSIS SERVICES USING A SUPPORT VECTOR MACHINE FOR PROCESSING DATA RECEIVED FROM A REMOTE SOURCE

Publication number: 20100256988

Abstract: A network-based system is provided for performing data analysis services using a support vector machine for analyzing data received from a remote user connected to the network. The user transmits a data set to be analyzed along with an account identifier that allows the analysis service provider to collect payment for the processing services. Once payment has been confirmed, the service provider's server transmits the analysis results to the remote user.

Type: Application

Filed: June 11, 2010

Publication date: October 7, 2010

Applicant: Health Discovery Corporation

Inventors: Stephen Barnhill, Isabelle Guyon, Jason Weston
Method for feature selection in a support vector machine using feature ranking

Patent number: 7805388

Abstract: In a pre-processing step prior to training a learning machine, pre-processing includes reducing the quantity of features to be processed using feature selection methods selected from the group consisting of recursive feature elimination (RFE), minimizing the number of non-zero parameters of the system (l0-norm minimization), evaluation of cost function to identify a subset of features that are compatible with constraints imposed by the learning set, unbalanced correlation score, transductive feature selection and single feature using margin-based ranking. The features remaining after feature selection are then used to train a learning machine for purposes of pattern classification, regression, clustering and/or novelty detection.

Type: Grant

Filed: October 30, 2007

Date of Patent: September 28, 2010

Assignee: Health Discovery Corporation

Inventors: Jason Weston, André Elisseeff, Bernhard Schölkopf, Fernando Perez-Cruz, Isabelle Guyon
System for providing data analysis services using a support vector machine for processing data received from a remote source

Patent number: 7797257

Abstract: A computer system for performing data analysis services using a support vector machine for analyzing data received from a remote source on a distributed network includes a server in communication with the distributed network for receiving a data set and a financial account identifier associated with the remote source. The server communicates over the distributed network with a financial institution to receive funds from a financial account identified by the financial account identifier. A processor receives one or more data sets from the remote source and pre-processes the data to enhance meaning within the data set. The pre-processed data is used to train and test a support vector machine for recognizing patterns within the data. Live data is processed using the trained and tested support vector machine to generate an output which is transmitted to the remote source after the server confirms that payment for the data processing service has been received.

Type: Grant

Filed: October 29, 2007

Date of Patent: September 14, 2010

Assignee: Health Discovery Corporation

Inventors: Stephen Barnhill, Isabelle Guyon, Jason Weston
Method and apparatus for transductive support vector machines

Patent number: 7778949

Abstract: Disclosed is a method for training a transductive support vector machine. The support vector machine is trained based on labeled training data and unlabeled test data. A non-convex objective function which optimizes a hyperplane classifier for classifying the unlabeled test data is decomposed into a convex function and a concave function. A local approximation of the concave function at a hyperplane is calculated, and the approximation of the concave function is combined with the convex function such that the result is a convex problem. The convex problem is then solved to determine an updated hyperplane. This method is performed iteratively until the solution converges.

Type: Grant

Filed: March 21, 2007

Date of Patent: August 17, 2010

Assignee: NEC Laboratories America, Inc.

Inventors: Ronan Collobert, Jason Weston, Leon Bottou
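
The iterative scheme in the abstract of patent 7778949 (split a non-convex objective into a convex and a concave part, linearize the concave part at the current solution, solve the resulting convex problem, repeat) can be illustrated on a toy problem. The sketch below is not the patented method: it assumes a one-dimensional objective f(x) = x**4 - 2*x**2 chosen only because its convex subproblem has a closed-form solution, and the function name `cccp_minimize` is hypothetical.

```python
import math


def cccp_minimize(x0, tol=1e-12, max_iter=100):
    """Minimise the toy objective f(x) = x**4 - 2*x**2 (an assumption,
    standing in for the transductive SVM objective).

    convex part  u(x) = x**4
    concave part v(x) = -2*x**2, with slope v'(x) = -4*x
    """
    x = x0
    for _ in range(max_iter):
        # Local linear approximation of the concave part at the current x.
        slope = -4.0 * x
        # Convex subproblem: minimise x**4 + slope*x, i.e. solve 4*x**3 = -slope.
        target = -slope / 4.0
        x_new = math.copysign(abs(target) ** (1.0 / 3.0), target)
        if abs(x_new - x) < tol:  # the iterates have converged
            return x_new
        x = x_new
    return x
```

Starting from 0.5 the iterates move toward the local minimum at x = 1, and starting from -0.5 they move toward x = -1, so the result depends on the starting point, as is usual for non-convex problems.
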
SUPERVISED SEMANTIC INDEXING AND ITS EXTENSIONS

Publication number: 20100185659

Abstract: A system and method for determining a similarity between a document and a query includes providing a frequently used dictionary and an infrequently used dictionary in storage memory. For each word or gram in the infrequently used dictionary, n words or grams are correlated from the frequently used dictionary based on a first score. Features for a vector of the infrequently used words or grams are replaced with features from a vector of the correlated words or grams from the frequently used dictionary when the features from a vector of the correlated words or grams meet a threshold value. A similarity score is determined between weight vectors of a query and one or more documents in a corpus by employing the features from the vector of the correlated words or grams that met the threshold value.

Type: Application

Filed: September 18, 2009

Publication date: July 22, 2010

Applicant: NEC Laboratories America, Inc.

Inventors: Bing Bai, Jason Weston, Ronan Collobert, David Grangier
SUPERVISED SEMANTIC INDEXING AND ITS EXTENSIONS

Publication number: 20100179933

Abstract: A system and method for determining a similarity between a document and a query includes building a weight vector for each of a plurality of documents in a corpus of documents stored in memory and building a weight vector for a query input into a document retrieval system. A weight matrix is generated which distinguishes between relevant documents and lower ranked documents by comparing document/query tuples using a gradient step approach. A similarity score is determined between weight vectors of the query and documents in a corpus by determining a product of a document weight vector, a query weight vector and the weight matrix.

Type: Application

Filed: September 18, 2009

Publication date: July 15, 2010

Applicant: NEC Laboratories America, Inc.

Inventors: Bing Bai, Jason Weston, Ronan Collobert, David Grangier
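
The scoring rule in the abstract of publication 20100179933 (a product of a query weight vector, a weight matrix and a document weight vector) can be sketched in a few lines. The abstract only says the matrix is learned with a "gradient step approach" over document/query tuples; the margin ranking loss, the learning rate and the function names below are assumptions made for illustration.

```python
def score(W, q, d):
    """Similarity f(q, d) = q^T W d for a query vector q and document vector d."""
    return sum(q[i] * W[i][j] * d[j] for i in range(len(q)) for j in range(len(d)))


def train_step(W, q, d_pos, d_neg, lr=0.1, margin=1.0):
    """One gradient step on the assumed ranking loss
    max(0, margin - f(q, d_pos) + f(q, d_neg)); W is updated in place."""
    if margin - score(W, q, d_pos) + score(W, q, d_neg) > 0:
        for i in range(len(q)):
            for j in range(len(d_pos)):
                # Gradient of the loss with respect to W[i][j] is
                # -q[i] * (d_pos[j] - d_neg[j]), so step in the opposite direction.
                W[i][j] += lr * q[i] * (d_pos[j] - d_neg[j])
    return W


# Toy vocabulary of three terms; W starts as the identity (an assumption).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.0, 0.0]
relevant = [0.0, 1.0, 0.0]      # shares no term with the query
lower_ranked = [0.0, 0.0, 1.0]
for _ in range(10):
    train_step(W, query, relevant, lower_ranked)
```

After these steps the relevant document scores higher than the lower-ranked one even though neither shares a term with the query, which is the kind of cross-term association an off-diagonal weight matrix can capture.
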
Deep Neural Networks and Methods for Using Same

Publication number: 20090210218

Abstract: A method and system for labeling a selected word of a sentence using a deep neural network includes, in one exemplary embodiment, determining an index term corresponding to each feature of the word, transforming the index term or terms of the word into a vector, and predicting a label for the word using the vector. The method and system, in another exemplary embodiment, includes determining, for each word in the sentence, an index term corresponding to each feature of the word, transforming the index term or terms of each word in the sentence into a vector, applying a convolution operation to the vector of the selected word and at least one of the vectors of the other words in the sentence, to transform the vectors into a matrix of vectors, each of the vectors in the matrix including a plurality of row values, constructing a single vector from the vectors in the matrix, and predicting a label for the selected word using the single vector.

Type: Application

Filed: February 9, 2009

Publication date: August 20, 2009

Applicant: NEC Laboratories America, Inc.

Inventors: Ronan Collobert, Jason Weston
Semantic Search Via Role Labeling

Publication number: 20090204605

Abstract: A method and system for searching for information contained in a database of documents each includes an offline part and an online part. The offline part includes predicting, in a first computer process, semantic data for sentences of the documents contained in the database and storing this data in a database. The online part includes querying the database for information with a semantically-sensitive query, predicting, in a real time computer process, semantic data for the query, and determining, in a second computer process, a matching score against all the documents in the database, which incorporates the semantic data for the sentences and the query.

Type: Application

Filed: February 2, 2009

Publication date: August 13, 2009

Applicant: NEC Laboratories America, Inc.

Inventors: Bing Bai, Jason Weston, Ronan Collobert
METHOD FOR TRAINING A LEARNING MACHINE HAVING A DEEP MULTI-LAYERED NETWORK WITH LABELED AND UNLABELED TRAINING DATA

Publication number: 20090204558

Abstract: A method for training a learning machine having a deep network with a plurality of layers, includes applying a regularizer to one or more of the layers of the deep network; training the regularizer with unlabeled data; and training the deep network with labeled data. Also, an apparatus for use in discriminative classification and regression, including an input device for inputting unlabeled and labeled data associated with a phenomenon of interest; a processor; and a memory communicating with the processor. The memory includes instructions executable by the processor for implementing a learning machine having a deep network structure and training the learning machine by applying a regularizer to one or more of the layers of the deep network; training the regularizer with unlabeled data; and training the deep network with labeled data.

Type: Application

Filed: February 6, 2009

Publication date: August 13, 2009

Applicant: NEC Laboratories America, Inc.

Inventors: Jason Weston, Ronan Collobert
Large Scale Manifold Transduction

Publication number: 20090204556

Abstract: A method for training a learning machine for use in discriminative classification and regression includes randomly selecting, in a first computer process, an unclassified datapoint associated with a phenomenon of interest; determining, in a second computer process, a set of datapoints associated with the phenomenon of interest that is likely to be in the same class as the selected unclassified datapoint; predicting, in a third computer process, a class label for the selected unclassified datapoint; predicting a class label for the set of datapoints in a fourth computer process; combining the predicted class labels in a fifth computer process, to predict a composite class label that describes the selected unclassified datapoint and the set of datapoints; and using the combined class label to adjust at least one parameter of the learning machine in a sixth computer process.

Type: Application

Filed: February 2, 2009

Publication date: August 13, 2009

Applicant: NEC Laboratories America, Inc.

Inventors: Jason Weston, Ronan Collobert
Feature selection method using support vector machine classifier

Patent number: 7542959

Abstract: Identification of a determinative subset of features from within a large set of features is performed by training a support vector machine to rank the features according to classifier weights, where features are removed to determine how their removal affects the value of the classifier weights. The features having the smallest weight values are removed and a new support vector machine is trained with the remaining weights. The process is repeated until a relatively small subset of features remains that is capable of accurately separating the data into different patterns or classes. The method is applied for selecting the smallest number of genes that are capable of accurately distinguishing between medical conditions such as cancer and non-cancer.

Type: Grant

Filed: August 21, 2007

Date of Patent: June 2, 2009

Assignee: Health Discovery Corporation

Inventors: Stephen Barnhill, Isabelle Guyon, Jason Weston
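
The elimination loop described in the abstract of patent 7542959 (train, rank features by classifier weight, drop the lowest ranked, retrain) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a linear classifier trained by batch subgradient descent on the regularized hinge loss stands in for a real SVM solver, the squared weight is assumed as the ranking criterion, one feature is removed per round, and all function names and hyperparameters are hypothetical.

```python
def train_linear_classifier(X, y, epochs=200, lr=0.1, lam=0.01):
    """Batch subgradient descent on the L2-regularised hinge loss
    (an assumed stand-in for an SVM solver). Labels must be +1 or -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # this sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j]
                grad_b -= yi
        w = [wj - lr * (lam * wj + gj / n) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def recursive_feature_elimination(X, y, n_keep):
    """Return the indices of the n_keep features that survive elimination."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w, _ = train_linear_classifier(X_active, y)
        # Assumed ranking criterion: squared weight; drop the smallest.
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active


# Toy data: features 0 and 2 carry the class signal, features 1 and 3 are constant.
X = [[2.0, 0.0, 1.0, 0.0],
     [1.5, 0.0, 0.5, 0.0],
     [-2.0, 0.0, -1.0, 0.0],
     [-1.5, 0.0, -0.5, 0.0]]
y = [1, 1, -1, -1]
selected = recursive_feature_elimination(X, y, n_keep=2)  # [0, 2]
```

Retraining after every removal is what distinguishes this from a single ranking pass: the weights of the surviving features are re-estimated once a discarded feature no longer contributes.
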
Pre-processed feature ranking for a support vector machine

Patent number: 7475048

Abstract: A computer-implemented method is provided for ranking features within a large dataset containing a large number of features according to each feature's ability to separate data into classes. For each feature, a support vector machine separates the dataset into two classes and determines the margins between extremal points in the two classes. The margins for all of the features are compared and the features are ranked based upon the size of the margin, with the highest ranked features corresponding to the largest margins. A subset of features for classifying the dataset is selected from a group of the highest ranked features. In one embodiment, the method is used to identify the best genes for disease prediction and diagnosis using gene expression data from micro-arrays.

Type: Grant

Filed: November 7, 2002

Date of Patent: January 6, 2009

Assignee: Health Discovery Corporation

Inventors: Jason Weston, André Elisseeff, Bernhard Schölkopf, Fernando Perez-Cruz, Isabelle Guyon
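
The per-feature ranking in the abstract of patent 7475048 can be sketched for the simplest case. When a single feature separates two classes, a hard-margin classifier on that feature places its threshold midway between the closest opposing points, so its margin is half the gap between them. The sketch below assumes that simplification and computes the gap directly instead of training an SVM per feature; overlapping classes receive a negative value. The function names are hypothetical.

```python
def feature_margin(values, labels):
    """Gap between the two classes along one feature; negative if they overlap.
    Labels must be +1 or -1."""
    pos = [v for v, c in zip(values, labels) if c == 1]
    neg = [v for v, c in zip(values, labels) if c == -1]
    # Either class may lie on the upper side, so check both orientations.
    return max(min(pos) - max(neg), min(neg) - max(pos))


def rank_features_by_margin(X, labels):
    """Return feature indices ordered from largest to smallest margin."""
    n_features = len(X[0])
    margins = [feature_margin([row[j] for row in X], labels)
               for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: margins[j], reverse=True)


# Toy data: feature 0 separates the classes cleanly, features 1 and 2 overlap.
X = [[5.0, 1.0, 0.2],
     [6.0, 3.0, 0.9],
     [1.0, 2.0, 0.1],
     [2.0, 4.0, 0.8]]
labels = [1, 1, -1, -1]
ranking = rank_features_by_margin(X, labels)  # [0, 2, 1]
```

A subset for classification would then be taken from the front of `ranking`, matching the abstract's selection "from a group of the highest ranked features".
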
METHOD FOR FEATURE SELECTION IN A SUPPORT VECTOR MACHINE USING FEATURE RANKING

Publication number: 20080233576

Abstract: In a pre-processing step prior to training a learning machine, pre-processing includes reducing the quantity of features to be processed using feature selection methods selected from the group consisting of recursive feature elimination (RFE), minimizing the number of non-zero parameters of the system (l0-norm minimization), evaluation of cost function to identify a subset of features that are compatible with constraints imposed by the learning set, unbalanced correlation score, transductive feature selection and single feature using margin-based ranking. The features remaining after feature selection are then used to train a learning machine for purposes of pattern classification, regression, clustering and/or novelty detection.

Type: Application

Filed: October 30, 2007

Publication date: September 25, 2008

Inventors: Jason Weston, André Elisseeff, Bernhard Schölkopf, Fernando Perez-Cruz, Isabelle Guyon
FAST SEMANTIC EXTRACTION USING A NEURAL NETWORK ARCHITECTURE

Publication number: 20080221878

Abstract: A system and method for semantic extraction using a neural network architecture includes indexing each word in an input sentence into a dictionary and using these indices to map each word to a d-dimensional vector (the features of which are learned). Together with this, position information for a word of interest (the word to be labeled) and a verb of interest (the verb that the semantic role is being predicted for) with respect to a given word are also used. These positions are integrated by employing a linear layer that is adapted to the input sentence. Several linear transformations and squashing functions are then applied to output class probabilities for semantic role labels. All the weights for the whole architecture are trained by backpropagation.

Type: Application

Filed: February 29, 2008

Publication date: September 11, 2008

Applicant: NEC Laboratories America, Inc.

Inventors: Ronan Collobert, Jason Weston
DATA MINING PLATFORM FOR BIOINFORMATICS AND OTHER KNOWLEDGE DISCOVERY

Publication number: 20080097938

Abstract: The data mining platform comprises a plurality of system modules, each formed from a plurality of components. Each module has an input data component, a data analysis engine for processing the input data, an output data component for outputting the results of the data analysis, and a web server to access and monitor the other modules within the unit and to provide communication to other units. Each module processes a different type of data, for example, a first module processes microarray (gene expression) data while a second module processes biomedical literature on the Internet for information supporting relationships between genes and diseases and gene functionality. In the preferred embodiment, the data analysis engine is a kernel-based learning machine, and in particular, one or more support vector machines (SVMs).

Type: Application

Filed: October 30, 2007

Publication date: April 24, 2008

Inventors: Isabelle Guyon, Edward Reiss, Rene Doursat, Jason Weston, David Lewis
KERNELS AND KERNEL METHODS FOR SPECTRAL DATA

Publication number: 20080097940

Abstract: Support vector machines are used to classify data contained within a structured dataset such as a plurality of signals generated by a spectral analyzer. The signals are pre-processed to ensure alignment of peaks across the spectra. Similarity measures are constructed to provide a basis for comparison of pairs of samples of the signal. A support vector machine is trained to discriminate between different classes of the samples and to identify the most predictive features within the spectra. In a preferred embodiment, feature selection is performed to reduce the number of features that must be considered.

Type: Application

Filed: October 30, 2007

Publication date: April 24, 2008

Inventors: Asa Ben-Hur, André Elisseeff, Olivier Chapelle, Jason Weston
DATA MINING PLATFORM FOR BIOINFORMATICS AND OTHER KNOWLEDGE DISCOVERY

Publication number: 20080097939

Abstract: The data mining platform comprises a plurality of system modules, each formed from a plurality of components. Each module has an input data component, a data analysis engine for processing the input data, an output data component for outputting the results of the data analysis, and a web server to access and monitor the other modules within the unit and to provide communication to other units. Each module processes a different type of data, for example, a first module processes microarray (gene expression) data while a second module processes biomedical literature on the Internet for information supporting relationships between genes and diseases and gene functionality. In the preferred embodiment, the data analysis engine is a kernel-based learning machine, and in particular, one or more support vector machines (SVMs).

Type: Application

Filed: October 30, 2007

Publication date: April 24, 2008

Inventors: Isabelle Guyon, Edward Reiss, Rene Doursat, Jason Weston, David Lewis
SYSTEM FOR PROVIDING DATA ANALYSIS SERVICES USING A SUPPORT VECTOR MACHINE FOR PROCESSING DATA RECEIVED FROM A REMOTE SOURCE

Publication number: 20080059392

Abstract: A computer system for performing data analysis services using a support vector machine for analyzing data received from a remote source on a distributed network includes a server in communication with the distributed network for receiving a data set and a financial account identifier associated with the remote source. The server communicates over the distributed network with a financial institution to receive funds from a financial account identified by the financial account identifier. A processor receives one or more data sets from the remote source and pre-processes the data to enhance meaning within the data set. The pre-processed data is used to train and test a support vector machine for recognizing patterns within the data. Live data is processed using the trained and tested support vector machine to generate an output which is transmitted to the remote source after the server confirms that payment for the data processing service has been received.

Type: Application

Filed: October 29, 2007

Publication date: March 6, 2008

Inventors: Stephen Barnhill, Isabelle Guyon, Jason Weston
Feature selection method using support vector machine classifier

Publication number: 20080033899

Abstract: Identification of a determinative subset of features from within a large set of features is performed by training a support vector machine to rank the features according to classifier weights, where features are removed to determine how their removal affects the value of the classifier weights. The features having the smallest weight values are removed and a new support vector machine is trained with the remaining weights. The process is repeated until a relatively small subset of features remains that is capable of accurately separating the data into different patterns or classes. The method is applied for selecting the smallest number of genes that are capable of accurately distinguishing between medical conditions such as cancer and non-cancer.

Type: Application

Filed: August 21, 2007

Publication date: February 7, 2008

Inventors: Stephen Barnhill, Isabelle Guyon, Jason Weston
