Patents by Inventor Tomas Mikolov

Tomas Mikolov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

CLASSIFYING DATA OBJECTS

Publication number: 20240220527

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.
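The steps named in this abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the toy two-dimensional vectors, the score-weighted sum used as the aggregate, and cosine similarity as the measure of "closest" are all assumptions, and the names `cosine` and `label_from_scores` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def label_from_scores(term_vectors, category_scores):
    """Pick the vocabulary term closest to the score-weighted sum of label vectors.

    Assumption: the aggregate is a weighted sum and closeness is cosine similarity.
    """
    dim = len(next(iter(term_vectors.values())))
    aggregate = [0.0] * dim
    for label, score in category_scores.items():
        for d, x in enumerate(term_vectors[label]):
            aggregate[d] += score * x
    return max(term_vectors, key=lambda t: cosine(term_vectors[t], aggregate))

# Toy vocabulary with made-up 2-D representations.
term_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.0, 1.0],
    "pet": [0.7, 0.7],
    "car": [-1.0, 0.2],
}
# A classifier that is split evenly between the "cat" and "dog" categories.
scores = {"cat": 0.5, "dog": 0.5}
chosen = label_from_scores(term_vectors, scores)
print(chosen)
```

With these toy values the aggregate points midway between "cat" and "dog", so the selected label is a term that was not one of the scored categories.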

Type: Application

Filed: March 15, 2024

Publication date: July 4, 2024

Inventors: Gregory Sean Corrado, Tomas Mikolov, Samuel Bengio, Yoram Singer, Jonathon Shlens, Andrea L. Frome, Jeffrey Adgate Dean, Mohammad Norouzi

Classifying data objects

Patent number: 11960519

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.

Type: Grant

Filed: August 20, 2020

Date of Patent: April 16, 2024

Assignee: Google LLC

Inventors: Gregory Sean Corrado, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea L. Frome, Jeffrey Adgate Dean, Mohammad Norouzi

COMPUTING NUMERIC REPRESENTATIONS OF WORDS IN A HIGH-DIMENSIONAL SPACE

Publication number: 20240070392

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
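One way to picture training a classifier together with an embedding function is the toy sketch below. It assumes a skip-gram-style setup in which a softmax classifier predicts a neighbouring word from the current word's embedding; the abstract does not specify this arrangement, and the function name `train_embeddings`, the window size, the learning rate, and the tiny corpus are all hypothetical.

```python
import math
import random

def train_embeddings(sentences, dim=8, window=1, epochs=200, lr=0.1, seed=0):
    """Jointly train word embeddings and a softmax context classifier with plain SGD."""
    rng = random.Random(seed)
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)
    emb = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    out = [[0.0] * dim for _ in range(size)]  # classifier weights, one row per word

    # Build (center word, context word) training pairs from each word sequence.
    pairs = []
    for s in sentences:
        for i in range(len(s)):
            for j in range(max(0, i - window), min(len(s), i + window + 1)):
                if j != i:
                    pairs.append((index[s[i]], index[s[j]]))

    losses = []
    for _ in range(epochs):
        total = 0.0
        for center, context in pairs:
            h = emb[center]
            logits = [sum(o * x for o, x in zip(out[k], h)) for k in range(size)]
            top = max(logits)
            exps = [math.exp(z - top) for z in logits]
            norm = sum(exps)
            probs = [e / norm for e in exps]
            total += -math.log(probs[context])
            grad_h = [0.0] * dim
            for k in range(size):
                err = probs[k] - (1.0 if k == context else 0.0)
                for d in range(dim):
                    grad_h[d] += err * out[k][d]   # uses the pre-update weight
                    out[k][d] -= lr * err * h[d]
            for d in range(dim):
                h[d] -= lr * grad_h[d]
        losses.append(total / len(pairs))

    # Associate each vocabulary word with its trained numeric representation.
    return {w: emb[index[w]] for w in vocab}, losses

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vectors, losses = train_embeddings(sentences)
print(sorted(vectors), round(losses[0], 3), round(losses[-1], 3))
```

The returned dictionary plays the role of the final association between each word and its representation; the per-epoch average loss is returned only so that training progress can be inspected.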

Type: Application

Filed: November 6, 2023

Publication date: February 29, 2024

Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean

Computing numeric representations of words in a high-dimensional space

Patent number: 11809824

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.

Type: Grant

Filed: February 12, 2021

Date of Patent: November 7, 2023

Assignee: Google LLC

Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean

Computing numeric representations of words in a high-dimensional space

Patent number: 10922488

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.

Type: Grant

Filed: March 25, 2019

Date of Patent: February 16, 2021

Assignee: Google LLC

Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean

CLASSIFYING DATA OBJECTS

Publication number: 20200380023

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.

Type: Application

Filed: August 20, 2020

Publication date: December 3, 2020

Inventors: Gregory Sean Corrado, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea L. Frome, Jeffrey Adgate Dean, Mohammad Norouzi

Classifying data objects

Patent number: 10769191

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.

Type: Grant

Filed: December 19, 2014

Date of Patent: September 8, 2020

Assignee: Google LLC

Inventors: Gregory Sean Corrado, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea L. Frome, Jeffrey Adgate Dean, Mohammad Norouzi

Translating terms using numeric representations

Patent number: 10503837

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for translating terms using numeric representations. One of the methods includes obtaining data that associates each term in a vocabulary of terms in a first language with a respective high-dimensional representation of the term; obtaining data that associates each term in a vocabulary of terms in a second language with a respective high-dimensional representation of the term; receiving a first language term; and determining a translation into the second language of the first language term from the high-dimensional representation of the first language term and the high-dimensional representations of terms in the vocabulary of terms in the second language.
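The abstract does not say how the translation is determined from the two sets of representations. The sketch below assumes one plausible arrangement: a linear mapping (supplied here by hand, not learned) projects the first-language vector into the second-language space, and the nearest second-language term by cosine similarity is returned. The vectors, the `mapping` matrix, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def apply_mapping(matrix, vec):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def translate(term, source_vectors, target_vectors, mapping):
    """Project the source term into the target space and return the nearest target term."""
    projected = apply_mapping(mapping, source_vectors[term])
    return max(target_vectors, key=lambda t: cosine(target_vectors[t], projected))

# Toy representations: the target space is the source space rotated by 90 degrees.
source_vectors = {"one": [1.0, 0.0], "two": [0.0, 1.0]}
target_vectors = {"uno": [0.0, 1.0], "dos": [-1.0, 0.0]}
mapping = [[0.0, -1.0],
           [1.0, 0.0]]  # assumed to be known; in practice it would have to be estimated

print(translate("one", source_vectors, target_vectors, mapping))
print(translate("two", source_vectors, target_vectors, mapping))
```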

Type: Grant

Filed: October 30, 2017

Date of Patent: December 10, 2019

Assignee: Google LLC

Inventors: Ilya Sutskever, Tomas Mikolov, Jeffrey Adgate Dean, Quoc V. Le

Computing numeric representations of words in a high-dimensional space

Patent number: 10241997

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.

Type: Grant

Filed: August 21, 2017

Date of Patent: March 26, 2019

Assignee: Google LLC

Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean

Translating terms using numeric representations

Patent number: 9805028

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for translating terms using numeric representations. One of the methods includes obtaining data that associates each term in a vocabulary of terms in a first language with a respective high-dimensional representation of the term; obtaining data that associates each term in a vocabulary of terms in a second language with a respective high-dimensional representation of the term; receiving a first language term; and determining a translation into the second language of the first language term from the high-dimensional representation of the first language term and the high-dimensional representations of terms in the vocabulary of terms in the second language.

Type: Grant

Filed: September 17, 2015

Date of Patent: October 31, 2017

Assignee: Google Inc.

Inventors: Ilya Sutskever, Tomas Mikolov, Jeffrey Adgate Dean, Quoc V. Le

Computing numeric representations of words in a high-dimensional space

Patent number: 9740680

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.

Type: Grant

Filed: May 18, 2015

Date of Patent: August 22, 2017

Assignee: Google Inc.

Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean

Feature-augmented neural networks and applications of same

Patent number: 9519858

Abstract: A system is described herein which uses a neural network having an input layer that accepts an input vector and a feature vector. The input vector represents at least part of input information, such as, but not limited to, a word or phrase in a sequence of input words. The feature vector provides supplemental information pertaining to the input information. The neural network produces an output vector based on the input vector and the feature vector. In one implementation, the neural network is a recurrent neural network. Also described herein are various applications of the system, including a machine translation application.
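A single step of a recurrent network whose input layer takes both an input vector and a feature vector might look like the sketch below. The layer sizes, the weight values, the tanh and softmax choices, and the names `step` and `run` are assumptions made for illustration; the abstract only states that the output depends on both vectors.

```python
import math

def matvec(matrix, vec):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def softmax(z):
    """Numerically stable softmax."""
    top = max(z)
    exps = [math.exp(x - top) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def step(x, f, h_prev, W_x, W_f, W_h, W_out):
    """One recurrent step: the hidden state sees the input, the feature vector, and the previous state."""
    pre = [a + b + c for a, b, c in zip(matvec(W_x, x), matvec(W_f, f), matvec(W_h, h_prev))]
    h = [math.tanh(p) for p in pre]
    return h, softmax(matvec(W_out, h))

# Toy sizes: 3-word vocabulary (one-hot input), 2-D feature vector, 2 hidden units.
W_x = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]
W_f = [[0.5, -0.5], [0.2, 0.3]]
W_h = [[0.1, 0.0], [0.0, 0.1]]
W_out = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

def run(word_ids, feature):
    """Feed a word sequence through the network with a fixed feature vector."""
    h, y = [0.0, 0.0], None
    for w in word_ids:
        x = [1.0 if i == w else 0.0 for i in range(3)]
        h, y = step(x, feature, h, W_x, W_f, W_h, W_out)
    return y

y_a = run([0, 2, 1], [1.0, 0.0])
y_b = run([0, 2, 1], [0.0, 1.0])
print(y_a, y_b)
```

Running the same word sequence with two different feature vectors yields two different output distributions, which is the point of supplying supplemental information alongside the input.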

Type: Grant

Filed: February 10, 2013

Date of Patent: December 13, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventors: Geoffrey G. Zweig, Tomas Mikolov

Classifying Data Objects

Publication number: 20150178383

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.

Type: Application

Filed: December 19, 2014

Publication date: June 25, 2015

Inventors: Gregory Sean Corrado, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea L. Frome, Jeffrey Adgate Dean, Mohammad Norouzi

Computing numeric representations of words in a high-dimensional space

Patent number: 9037464

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.

Type: Grant

Filed: March 15, 2013

Date of Patent: May 19, 2015

Assignee: Google Inc.

Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean

RELATIONAL SIMILARITY MEASUREMENT

Publication number: 20140249799

Abstract: Relational similarity measuring embodiments are presented that generally involve creating a relational similarity model that, given two pairs of words, is used to measure a degree of relational similarity between the two relations respectively exhibited by these word pairs. In one exemplary embodiment this involves creating a combined relational similarity model from a plurality of relational similarity models. This is generally accomplished by first selecting a plurality of relational similarity models, each of which measures relational similarity between two pairs of words, and each of which is trained or created using a different method or linguistic/textual resource. The selected models are then combined to form the combined relational similarity model. The combined model inputs two pairs of words and outputs a relational similarity indicator representing a measure the degree of relational similarity between the word pairs.
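Combining several relational similarity models into one can be sketched as follows. The abstract does not state how the models are combined, so the weighted average used here is an assumption, as are the two toy component models (a vector-offset model and a lookup in a hand-made relation table) and every name and value in the example.

```python
import math

# Hypothetical toy word vectors and a hypothetical hand-labelled relation table.
VECTORS = {"king": [1.0, 1.0], "queen": [1.0, -1.0], "man": [0.2, 1.0], "woman": [0.2, -1.0]}
RELATIONS = {("king", "queen"): "gender", ("man", "woman"): "gender", ("man", "king"): "royalty"}

def offset(pair):
    """Vector difference between the second and first word of a pair."""
    a, b = VECTORS[pair[0]], VECTORS[pair[1]]
    return [y - x for x, y in zip(a, b)]

def offset_model(pair_a, pair_b):
    """Cosine similarity between the two pairs' vector offsets."""
    u, v = offset(pair_a), offset(pair_b)
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0

def table_model(pair_a, pair_b):
    """1.0 if both pairs carry the same known relation label, else 0.0."""
    label_a, label_b = RELATIONS.get(pair_a), RELATIONS.get(pair_b)
    return 1.0 if label_a is not None and label_a == label_b else 0.0

def combine(models, weights):
    """Return a combined model that outputs the weighted average of the component scores."""
    total = sum(weights)
    def combined(pair_a, pair_b):
        return sum(w * m(pair_a, pair_b) for m, w in zip(models, weights)) / total
    return combined

combined = combine([offset_model, table_model], [0.5, 0.5])
same = combined(("king", "queen"), ("man", "woman"))
different = combined(("king", "queen"), ("man", "king"))
print(same, different)
```

Each component model is built from a different resource, and the combined model takes two word pairs and returns a single relational similarity indicator.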

Type: Application

Filed: March 4, 2013

Publication date: September 4, 2014

Applicant: Microsoft Corporation

Inventors: Wen-tau Yih, Geoffrey Zweig, Christopher Meek, Alisa Zhila, Tomas Mikolov

Feature-Augmented Neural Networks and Applications of Same

Publication number: 20140229158

Abstract: A system is described herein which uses a neural network having an input layer that accepts an input vector and a feature vector. The input vector represents at least part of input information, such as, but not limited to, a word or phrase in a sequence of input words. The feature vector provides supplemental information pertaining to the input information. The neural network produces an output vector based on the input vector and the feature vector. In one implementation, the neural network is a recurrent neural network. Also described herein are various applications of the system, including a machine translation application.

Type: Application

Filed: February 10, 2013

Publication date: August 14, 2014

Applicant: MICROSOFT CORPORATION

Inventors: Geoffrey G. Zweig, Tomas Mikolov, Alejandro Acero