Patents by Inventor Tomas Mikolov
Tomas Mikolov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11960519Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.Type: GrantFiled: August 20, 2020Date of Patent: April 16, 2024Assignee: Google LLCInventors: Gregory Sean Corrado, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea L Frome, Jeffrey Adgate Dean, Mohammad Norouzi
-
Publication number: 20240070392Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.Type: ApplicationFiled: November 6, 2023Publication date: February 29, 2024Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean
-
Patent number: 11809824Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.Type: GrantFiled: February 12, 2021Date of Patent: November 7, 2023Assignee: Google LLCInventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean
-
Patent number: 10922488Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.Type: GrantFiled: March 25, 2019Date of Patent: February 16, 2021Assignee: Google LLCInventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean
-
Publication number: 20200380023Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.Type: ApplicationFiled: August 20, 2020Publication date: December 3, 2020Inventors: Gregory Sean Corrado, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea L. Frome, Jeffrey Adgate Dean, Mohammad Norouzi
-
Patent number: 10769191Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.Type: GrantFiled: December 19, 2014Date of Patent: September 8, 2020Assignee: Google LLCInventors: Gregory Sean Corrado, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea L. Frome, Jeffrey Adgate Dean, Mohammad Norouzi
-
Patent number: 10503837Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for translating terms using numeric representations. One of the methods includes obtaining data that associates each term in a vocabulary of terms in a first language with a respective high-dimensional representation of the term; obtaining data that associates each term in a vocabulary of terms in a second language with a respective high-dimensional representation of the term; receiving a first language term; and determining a translation into the second language of the first language term from the high-dimensional representation of the first language term and the high-dimensional representations of terms in the vocabulary of terms in the second language.Type: GrantFiled: October 30, 2017Date of Patent: December 10, 2019Assignee: Google LLCInventors: Ilya Sutskever, Tomas Mikolov, Jeffrey Adgate Dean, Quoc V. Le
-
Patent number: 10241997Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.Type: GrantFiled: August 21, 2017Date of Patent: March 26, 2019Assignee: Google LLCInventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean
-
Patent number: 9805028Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for translating terms using numeric representations. One of the methods includes obtaining data that associates each term in a vocabulary of terms in a first language with a respective high-dimensional representation of the term; obtaining data that associates each term in a vocabulary of terms in a second language with a respective high-dimensional representation of the term; receiving a first language term; and determining a translation into the second language of the first language term from the high-dimensional representation of the first language term and the high-dimensional representations of terms in the vocabulary of terms in the second language.Type: GrantFiled: September 17, 2015Date of Patent: October 31, 2017Assignee: Google Inc.Inventors: Ilya Sutskever, Tomas Mikolov, Jeffrey Adgate Dean, Quoc V. Le
-
Patent number: 9740680Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.Type: GrantFiled: May 18, 2015Date of Patent: August 22, 2017Assignee: Google Inc.Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean
-
Patent number: 9519858Abstract: A system is described herein which uses a neural network having an input layer that accepts an input vector and a feature vector. The input vector represents at least part of input information, such as, but not limited to, a word or phrase in a sequence of input words. The feature vector provides supplemental information pertaining to the input information. The neural network produces an output vector based on the input vector and the feature vector. In one implementation, the neural network is a recurrent neural network. Also described herein are various applications of the system, including a machine translation application.Type: GrantFiled: February 10, 2013Date of Patent: December 13, 2016Assignee: Microsoft Technology Licensing, LLCInventors: Geoffrey G. Zweig, Tomas Mikolov
-
Publication number: 20150178383Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying data objects. One of the methods includes obtaining data that associates each term in a vocabulary of terms with a respective high-dimensional representation of the term; obtaining classification data for a data object, wherein the classification data includes a respective score for each of a plurality of categories, and wherein each of the categories is associated with a respective category label; computing an aggregate high-dimensional representation for the data object from high-dimensional representations for the category labels associated with the categories and the respective scores; identifying a first term in the vocabulary of terms having a high-dimensional representation that is closest to the aggregate high-dimensional representation; and selecting the first term as a category label for the data object.Type: ApplicationFiled: December 19, 2014Publication date: June 25, 2015Inventors: Gregory Sean Corrado, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea L. Frome, Jeffrey Adgate Dean, Mohammad Norouzi
-
Patent number: 9037464Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtained trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.Type: GrantFiled: March 15, 2013Date of Patent: May 19, 2015Assignee: Google Inc.Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado, Jeffrey A. Dean
-
Publication number: 20140249799Abstract: Relational similarity measuring embodiments are presented that generally involve creating a relational similarity model that, given two pairs of words, is used to measure a degree of relational similarity between the two relations respectively exhibited by these word pairs. In one exemplary embodiment this involves creating a combined relational similarity model from a plurality of relational similarity models. This is generally accomplished by first selecting a plurality of relational similarity models, each of which measures relational similarity between two pairs of words, and each of which is trained or created using a different method or linguistic/textual resource. The selected models are then combined to form the combined relational similarity model. The combined model inputs two pairs of words and outputs a relational similarity indicator representing a measure the degree of relational similarity between the word pairs.Type: ApplicationFiled: March 4, 2013Publication date: September 4, 2014Applicant: Microsoft CorporationInventors: Wen-tau Yih, Geoffrey Zweig, Christopher Meek, Alisa Zhila, Tomas Mikolov
-
Publication number: 20140229158Abstract: A system is described herein which uses a neural network having an input layer that accepts an input vector and a feature vector. The input vector represents at least part of input information, such as, but not limited to, a word or phrase in a sequence of input words. The feature vector provides supplemental information pertaining to the input information. The neural network produces an output vector based on the input vector and the feature vector. In one implementation, the neural network is a recurrent neural network. Also described herein are various applications of the system, including a machine translation application.Type: ApplicationFiled: February 10, 2013Publication date: August 14, 2014Applicant: MICROSOFT CORPORATIONInventors: Geoffrey G. Zweig, Tomas Mikolov, Alejandro Acero