Patents by Inventor Vincent O. Vanhoucke
Vincent O. Vanhoucke has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12073823
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
Type: Grant
Filed: November 10, 2023
Date of Patent: August 27, 2024
Assignee: Google LLC
Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
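This abstract describes two sequence-training speech models that each obtain the current neural network parameters and a batch of speech-feature frames, then produce optimized parameters. Below is a minimal sketch of that pattern; the shared parameter tensor, the placeholder loss, and the function name `optimize` are illustrative assumptions, not details taken from the patent.

```python
import torch

# Shared parameters that each sequence-training model obtains and updates.
shared_params = torch.zeros(4)

def optimize(params, batch_frames, learning_rate=0.01):
    """One illustrative optimization step over a batch of speech-feature frames."""
    params = params.clone().requires_grad_(True)
    # Placeholder objective; a real system would use a sequence-level loss
    # computed over whole training utterances.
    loss = ((batch_frames - params) ** 2).mean()
    loss.backward()
    return (params - learning_rate * params.grad).detach()

# First sequence-training model: first batch of frames, current parameters.
first_batch = torch.randn(8, 4)   # 8 frames of 4-dimensional speech features
shared_params = optimize(shared_params, first_batch)

# Second sequence-training model: second batch, the (now updated) parameters.
second_batch = torch.randn(8, 4)
shared_params = optimize(shared_params, second_batch)
```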
-
Publication number: 20240087559
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
Type: Application
Filed: November 10, 2023
Publication date: March 14, 2024
Applicant: Google LLC
Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
-
Patent number: 11854534
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
Type: Grant
Filed: December 20, 2022
Date of Patent: December 26, 2023
Assignee: Google LLC
Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
-
Patent number: 11809955
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
Type: Grant
Filed: September 28, 2022
Date of Patent: November 7, 2023
Assignee: Google LLC
Inventors: Christian Szegedy, Vincent O. Vanhoucke
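The image-processing entries in this listing all describe the same flow: data characterizing an input image is passed through a sequence of subnetworks, from lowest to highest, and the resulting alternative representation goes through an output layer. A minimal sketch of that flow follows; the subnetwork contents, layer sizes, and the 10-way output are placeholder assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

# Three stand-in subnetworks arranged in a sequence from lowest to highest.
subnetworks = nn.Sequential(
    nn.Sequential(nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU()),   # lowest subnetwork
    nn.Sequential(nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU()),  # middle subnetwork
    nn.Sequential(nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU()),  # highest subnetwork
)
output_layer = nn.Linear(64, 10)  # e.g. a 10-way image classification output

image = torch.randn(1, 3, 32, 32)                     # data characterizing an input image
alternative_representation = subnetworks(image)       # processed through each subnetwork in sequence
pooled = alternative_representation.mean(dim=(2, 3))  # collapse spatial dims: shape (1, 64)
logits = output_layer(pooled)                         # output generated from the input image
```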
-
Publication number: 20230014634
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
Type: Application
Filed: September 28, 2022
Publication date: January 19, 2023
Inventors: Christian Szegedy, Vincent O. Vanhoucke
-
Patent number: 11557277
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
Type: Grant
Filed: December 15, 2021
Date of Patent: January 17, 2023
Assignee: Google LLC
Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
-
Patent number: 11550871
Abstract: Structured documents are processed using convolutional neural networks. For example, the processing can include receiving a rendered form of a structured document; mapping a grid of cells to the rendered form; assigning a respective numeric embedding to each cell in the grid, comprising, for each cell: identifying content in the structured document that corresponds to a portion of the rendered form that is mapped to the cell, mapping the identified content to a numeric embedding for the identified content, and assigning the numeric embedding for the identified content to the cell; generating a matrix representation of the structured document from the numeric embeddings assigned to the cells of the grid; and generating neural network features of the structured document by processing the matrix representation of the structured document through a subnetwork comprising one or more convolutional neural network layers.
Type: Grant
Filed: August 19, 2019
Date of Patent: January 10, 2023
Assignee: Google LLC
Inventor: Vincent O. Vanhoucke
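A rough sketch of the grid-of-cells idea from this abstract: each cell of a grid mapped onto the rendered document gets a numeric embedding for the content that falls inside it, the per-cell embeddings form a matrix representation, and convolutional layers turn that into document features. The vocabulary, grid size, embedding size, and layer shapes below are illustrative assumptions.

```python
import torch
import torch.nn as nn

GRID_H, GRID_W, EMBED_DIM = 16, 16, 8
vocab = {"<empty>": 0, "name": 1, "address": 2, "total": 3}  # hypothetical cell contents
embed = nn.Embedding(len(vocab), EMBED_DIM)

# Assign each grid cell the token of the document content mapped to it.
cell_tokens = torch.zeros(GRID_H, GRID_W, dtype=torch.long)
cell_tokens[2, 3] = vocab["name"]
cell_tokens[5, 3] = vocab["address"]
cell_tokens[10, 12] = vocab["total"]

# Matrix representation from the per-cell embeddings: (1, EMBED_DIM, H, W).
matrix = embed(cell_tokens).permute(2, 0, 1).unsqueeze(0)

# Subnetwork of convolutional layers producing neural network features.
conv_subnetwork = nn.Sequential(
    nn.Conv2d(EMBED_DIM, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
)
features = conv_subnetwork(matrix)  # shape (1, 32, GRID_H, GRID_W)
```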
-
Patent number: 11462035
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
Type: Grant
Filed: March 12, 2021
Date of Patent: October 4, 2022
Assignee: Google LLC
Inventors: Christian Szegedy, Vincent O. Vanhoucke
-
Patent number: 11341364
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network that is used to control a robotic agent interacting with a real-world environment.
Type: Grant
Filed: September 20, 2018
Date of Patent: May 24, 2022
Assignee: Google LLC
Inventors: Konstantinos Bousmalis, Alexander Irpan, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Julian Ibarz, Sergey Vladimir Levine, Kurt Konolige, Vincent O. Vanhoucke, Matthew Laurance Kelcey
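The abstract states only the high-level goal, so the following is a very generic sketch of training an action selection network: observations in, per-action scores out, updated from labeled action choices. The observation and action dimensions and the supervised-style update are placeholder assumptions and are not taken from the patent.

```python
import torch
import torch.nn as nn

# Action selection network: scores 8 candidate actions from a 32-dim observation.
action_selection_net = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 8),
)
optimizer = torch.optim.Adam(action_selection_net.parameters(), lr=1e-3)

observations = torch.randn(16, 32)           # batch of environment observations
chosen_actions = torch.randint(0, 8, (16,))  # actions treated as good for those observations
loss = nn.functional.cross_entropy(action_selection_net(observations), chosen_actions)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```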
-
Publication number: 20220108686
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
Type: Application
Filed: December 15, 2021
Publication date: April 7, 2022
Applicant: Google LLC
Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
-
Patent number: 11227582
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
Type: Grant
Filed: January 6, 2021
Date of Patent: January 18, 2022
Assignee: Google LLC
Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
-
Publication number: 20210334605
Abstract: A neural network system that includes: multiple subnetworks that includes: a first subnetwork including multiple first modules, each first module including: a pass-through convolutional layer configured to process the subnetwork input for the first subnetwork to generate a pass-through output; an average pooling stack of neural network layers that collectively processes the subnetwork input for the first subnetwork to generate an average pooling output; a first stack of convolutional neural network layers configured to collectively process the subnetwork input for the first subnetwork to generate a first stack output; a second stack of convolutional neural network layers that are configured to collectively process the subnetwork input for the first subnetwork to generate a second stack output; and a concatenation layer configured to concatenate the pass-through output, the average pooling output, the first stack output, and the second stack output to generate a first module output for the first module.
Type: Application
Filed: July 9, 2021
Publication date: October 28, 2021
Inventors: Vincent O. Vanhoucke, Christian Szegedy, Sergey Ioffe
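The module described in this abstract maps naturally onto an Inception-style block: four parallel branches (a pass-through convolution, an average-pooling stack, and two convolutional stacks) over the same subnetwork input, concatenated along the channel dimension. The kernel sizes and channel counts below are illustrative assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

class FirstModule(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        # Pass-through convolutional layer.
        self.pass_through = nn.Conv2d(in_channels, 32, kernel_size=1)
        # Average pooling stack of neural network layers.
        self.avg_pool_stack = nn.Sequential(
            nn.AvgPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, 32, kernel_size=1),
        )
        # First stack of convolutional neural network layers.
        self.first_stack = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=1),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
        )
        # Second stack of convolutional neural network layers.
        self.second_stack = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=1),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
        )

    def forward(self, subnetwork_input):
        # Concatenation layer: join the four branch outputs channel-wise.
        return torch.cat(
            [
                self.pass_through(subnetwork_input),
                self.avg_pool_stack(subnetwork_input),
                self.first_stack(subnetwork_input),
                self.second_stack(subnetwork_input),
            ],
            dim=1,
        )

module = FirstModule(in_channels=64)
out = module(torch.randn(1, 64, 28, 28))  # -> shape (1, 128, 28, 28)
```

Per the abstract, several such modules make up the first subnetwork, with further subnetworks following the same pattern.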
-
Patent number: 11062181
Abstract: A neural network system that includes: multiple subnetworks that includes: a first subnetwork including multiple first modules, each first module including: a pass-through convolutional layer configured to process the subnetwork input for the first subnetwork to generate a pass-through output; an average pooling stack of neural network layers that collectively processes the subnetwork input for the first subnetwork to generate an average pooling output; a first stack of convolutional neural network layers configured to collectively process the subnetwork input for the first subnetwork to generate a first stack output; a second stack of convolutional neural network layers that are configured to collectively process the subnetwork input for the first subnetwork to generate a second stack output; and a concatenation layer configured to concatenate the pass-through output, the average pooling output, the first stack output, and the second stack output to generate a first module output for the first module.
Type: Grant
Filed: August 26, 2019
Date of Patent: July 13, 2021
Assignee: Google LLC
Inventors: Vincent O. Vanhoucke, Christian Szegedy, Sergey Ioffe
-
Publication number: 20210201092
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
Type: Application
Filed: March 12, 2021
Publication date: July 1, 2021
Inventors: Christian Szegedy, Vincent O. Vanhoucke
-
Publication number: 20210125601
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
Type: Application
Filed: January 6, 2021
Publication date: April 29, 2021
Applicant: Google LLC
Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
-
Patent number: 10977529
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
Type: Grant
Filed: April 13, 2020
Date of Patent: April 13, 2021
Assignee: Google LLC
Inventors: Christian Szegedy, Vincent O. Vanhoucke
-
Patent number: 10916238
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
Type: Grant
Filed: April 30, 2020
Date of Patent: February 9, 2021
Assignee: Google LLC
Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
-
Publication number: 20200311491
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image processing using deep neural networks. One of the methods includes receiving data characterizing an input image; processing the data characterizing the input image using a deep neural network to generate an alternative representation of the input image, wherein the deep neural network comprises a plurality of subnetworks, wherein the subnetworks are arranged in a sequence from lowest to highest, and wherein processing the data characterizing the input image using the deep neural network comprises processing the data through each of the subnetworks in the sequence; and processing the alternative representation of the input image through an output layer to generate an output from the input image.
Type: Application
Filed: April 13, 2020
Publication date: October 1, 2020
Inventors: Christian Szegedy, Vincent O. Vanhoucke
-
Publication number: 20200279134
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network that is used to control a robotic agent interacting with a real-world environment.
Type: Application
Filed: September 20, 2018
Publication date: September 3, 2020
Inventors: Konstantinos Bousmalis, Alexander Irpan, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Julian Ibarz, Sergey Vladimir Levine, Kurt Konolige, Vincent O. Vanhoucke, Matthew Laurance Kelcey
-
Publication number: 20200258500
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
Type: Application
Filed: April 30, 2020
Publication date: August 13, 2020
Applicant: Google LLC
Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani