Patents by Inventor Brendan Shillingford

Brendan Shillingford has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230306258
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a video data generation neural network having a plurality of video generation network parameters. In one aspect, a method includes generating one or more sequences of training video frames using the video data generation neural network in accordance with current values of the video data generation network parameters; obtaining one or more sequences of target video frames; and training the video data generation neural network using training signals derived from a similarity between respective embeddings of the training and target video frames. The embeddings are generated by a video data embedding neural network.
    Type: Application
    Filed: September 8, 2021
    Publication date: September 28, 2023
    Inventors: Ioannis Alexandros Assael, Brendan Shillingford
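    A minimal, hypothetical sketch of the training signal this abstract describes, written in PyTorch; VideoGenerator and VideoEmbedder are stand-in modules with illustrative sizes, not the networks from the filing:

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class VideoGenerator(nn.Module):
          # Stand-in video data generation network: latent vector -> sequence of frames.
          def __init__(self, latent_dim=64, frames=8, channels=3, size=32):
              super().__init__()
              self.frames, self.channels, self.size = frames, channels, size
              self.net = nn.Linear(latent_dim, frames * channels * size * size)

          def forward(self, z):
              x = self.net(z)
              return x.view(-1, self.frames, self.channels, self.size, self.size)

      class VideoEmbedder(nn.Module):
          # Stand-in video data embedding network: frames -> embedding vector.
          def __init__(self, frames=8, channels=3, size=32, embed_dim=128):
              super().__init__()
              self.net = nn.Linear(frames * channels * size * size, embed_dim)

          def forward(self, video):
              return self.net(video.flatten(start_dim=1))

      generator, embedder = VideoGenerator(), VideoEmbedder()
      optimizer = torch.optim.Adam(generator.parameters(), lr=1e-4)

      target_video = torch.rand(4, 8, 3, 32, 32)    # target video frames
      z = torch.randn(4, 64)                        # latent input

      generated_video = generator(z)                # frames from current parameter values
      gen_emb = embedder(generated_video)
      tgt_emb = embedder(target_video)

      # Training signal derived from the similarity between the two embeddings.
      loss = 1.0 - F.cosine_similarity(gen_emb, tgt_emb, dim=-1).mean()
      loss.backward()
      optimizer.step()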
  • Publication number: 20220223162
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for bandwidth extension. One of the methods includes obtaining a low-resolution version of an input, the low-resolution version of the input comprising a first number of samples at a first sample rate over a first time period; and generating, from the low-resolution version of the input, a high-resolution version of the input comprising a second, larger number of samples at a second, higher sample rate over the first time period. Generating the high-resolution version includes generating a representation of the low-resolution version of the input; processing the representation of the low-resolution version of the input through a conditioning neural network to generate a conditioning input; and processing the conditioning input using a generative neural network to generate the high-resolution version of the input.
    Type: Application
    Filed: April 30, 2020
    Publication date: July 14, 2022
    Inventors: Ioannis Alexandros Assael, Thomas Chadwick Walters, Archit Gupta, Brendan Shillingford
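    A hypothetical sketch of the bandwidth-extension pipeline described above, in PyTorch: a conditioning network maps a representation of the low-resolution audio to a conditioning input at the target rate, and a generative network produces the high-resolution output. Module names, layer shapes, and the 2x upsampling factor are placeholders, not taken from the filing:

      import torch
      import torch.nn as nn

      class ConditioningNetwork(nn.Module):
          def __init__(self, hidden=64, upsample=2):
              super().__init__()
              self.conv = nn.Conv1d(1, hidden, kernel_size=9, padding=4)
              self.upsample = nn.Upsample(scale_factor=upsample, mode='nearest')

          def forward(self, low_res):                # (batch, 1, n_samples)
              rep = torch.tanh(self.conv(low_res))   # representation of the low-res input
              return self.upsample(rep)              # conditioning input at the higher rate

      class GenerativeNetwork(nn.Module):
          def __init__(self, hidden=64):
              super().__init__()
              self.out = nn.Conv1d(hidden, 1, kernel_size=3, padding=1)

          def forward(self, conditioning):
              return self.out(conditioning)          # high-resolution waveform

      low_res = torch.randn(2, 1, 8000)              # e.g. 1 s of audio at 8 kHz
      conditioning = ConditioningNetwork()(low_res)
      high_res = GenerativeNetwork()(conditioning)   # 1 s with twice the samples
      print(high_res.shape)                          # torch.Size([2, 1, 16000])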
  • Patent number: 11386900
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing visual speech recognition. In one aspect, a method comprises receiving a video comprising a plurality of video frames, wherein each video frame depicts a pair of lips; processing the video using a visual speech recognition neural network to generate, for each output position in an output sequence, a respective output score for each token in a vocabulary of possible tokens, wherein the visual speech recognition neural network comprises one or more volumetric convolutional neural network layers and one or more time-aggregation neural network layers; wherein the vocabulary of possible tokens comprises a plurality of phonemes; and determining a sequence of words expressed by the pair of lips depicted in the video using the output scores.
    Type: Grant
    Filed: May 20, 2019
    Date of Patent: July 12, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Brendan Shillingford, Ioannis Alexandros Assael, Joao Ferdinando Gomes de Freitas
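    A hypothetical sketch of the architecture named in this abstract, in PyTorch: volumetric (3D) convolutions over a lip-region video, a time-aggregation layer over the frame sequence, and per-position scores over a phoneme vocabulary. Layer choices, sizes, and the decoding step are illustrative only:

      import torch
      import torch.nn as nn

      class VisualSpeechRecognizer(nn.Module):
          def __init__(self, num_phonemes=40, hidden=64):
              super().__init__()
              # Volumetric convolution over (channels, time, height, width).
              self.conv3d = nn.Conv3d(3, hidden, kernel_size=(3, 5, 5), padding=(1, 2, 2))
              self.pool = nn.AdaptiveAvgPool3d((None, 1, 1))   # keep the time axis, pool space
              # Time-aggregation layer over the frame sequence.
              self.time_agg = nn.GRU(hidden, hidden, batch_first=True)
              self.scores = nn.Linear(hidden, num_phonemes)

          def forward(self, video):                  # (batch, 3, time, H, W)
              x = torch.relu(self.conv3d(video))
              x = self.pool(x).squeeze(-1).squeeze(-1)   # (batch, hidden, time)
              x, _ = self.time_agg(x.transpose(1, 2))    # (batch, time, hidden)
              return self.scores(x)                  # score per output position and phoneme

      model = VisualSpeechRecognizer()
      video = torch.rand(1, 3, 16, 64, 64)           # 16 frames of a cropped lip region
      phoneme_scores = model(video)                  # (1, 16, 40)
      # A decoder (e.g. CTC beam search with a lexicon) would map these scores to words.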
  • Patent number: 11355097
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: June 7, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Yutian Chen, Scott Ellison Reed, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Heiga Zen, Ioannis Alexandros Assael, Brendan Shillingford, Joao Ferdinando Gomes de Freitas
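    A hypothetical sketch of the adaptive setup described above, in PyTorch: per-speaker embedding vectors are learned jointly with the parameters of an audio-generation network from (text, audio, speaker id) data. The tiny text-to-audio network and all sizes are placeholders for the generative model in the filing:

      import torch
      import torch.nn as nn

      class AdaptiveAudioModel(nn.Module):
          def __init__(self, num_speakers, vocab_size=50, embed_dim=32, audio_len=200):
              super().__init__()
              self.speaker_embeddings = nn.Embedding(num_speakers, embed_dim)  # learned per speaker
              self.text_encoder = nn.Embedding(vocab_size, embed_dim)
              self.decoder = nn.Linear(2 * embed_dim, audio_len)

          def forward(self, text_ids, speaker_ids):
              text_rep = self.text_encoder(text_ids).mean(dim=1)     # crude text summary
              speaker_rep = self.speaker_embeddings(speaker_ids)     # voice characteristics
              return self.decoder(torch.cat([text_rep, speaker_rep], dim=-1))

      model = AdaptiveAudioModel(num_speakers=10)
      optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)      # embeddings and weights trained together

      text_ids = torch.randint(0, 50, (4, 12))       # token ids for portions of the first text
      speaker_ids = torch.randint(0, 10, (4,))       # which speaker read each portion
      target_audio = torch.randn(4, 200)             # stand-in audio targets

      loss = nn.functional.mse_loss(model(text_ids, speaker_ids), target_audio)
      loss.backward()
      optimizer.step()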
  • Patent number: 11250838
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a video speech recognition model having a plurality of model parameters on a set of unlabeled video-audio data, using a trained audio speech recognition model. During training, the parameter values of the trained audio speech recognition model are generally kept fixed and only the parameter values of the video speech recognition model are adjusted. Once trained, the video speech recognition model can be used to recognize speech from video when corresponding audio is not available.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: February 15, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Brendan Shillingford, Ioannis Alexandros Assael, Joao Ferdinando Gomes de Freitas
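    A hypothetical sketch of the training scheme in this abstract, in PyTorch: a trained audio speech recognition model is kept fixed and supplies targets on unlabeled video-audio pairs, while only the video model's parameters are updated. Both models below are trivial placeholders, and the distillation loss is an illustrative choice:

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      audio_model = nn.Linear(80, 40)                # stand-in trained audio ASR (80 features -> 40 tokens)
      video_model = nn.Linear(512, 40)               # stand-in video ASR to be trained

      for p in audio_model.parameters():             # keep the trained audio model's parameters fixed
          p.requires_grad = False

      optimizer = torch.optim.Adam(video_model.parameters(), lr=1e-4)

      audio_feats = torch.randn(8, 20, 80)           # unlabeled clip: audio features per frame
      video_feats = torch.randn(8, 20, 512)          # same clip: video features per frame

      with torch.no_grad():
          teacher_probs = F.softmax(audio_model(audio_feats), dim=-1)

      student_log_probs = F.log_softmax(video_model(video_feats), dim=-1)

      # Only the video model is adjusted to match the fixed audio model's predictions.
      loss = F.kl_div(student_log_probs, teacher_probs, reduction='batchmean')
      loss.backward()
      optimizer.step()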
  • Publication number: 20220036172
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating olfactory predictions using neural networks. One of the methods includes receiving scene data characterizing a scene in an environment; processing the scene data using a representation neural network to generate a representation of the scene; and processing the representation of the scene using a prediction neural network to generate as output an olfactory prediction that characterizes a predicted smell of the scene at a particular observer location.
    Type: Application
    Filed: July 29, 2020
    Publication date: February 3, 2022
    Inventors: Brendan Shillingford, Jakob Nicolaus Foerster
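    A hypothetical sketch of the two-stage pipeline described above, in PyTorch: a representation network encodes the scene data, and a prediction network maps that representation plus an observer location to an olfactory prediction. All dimensions, inputs, and module names are illustrative assumptions:

      import torch
      import torch.nn as nn

      representation_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))
      prediction_net = nn.Sequential(nn.Linear(32 + 3, 64), nn.ReLU(), nn.Linear(64, 16))

      scene_data = torch.randn(1, 128)               # stand-in encoding of observations of the scene
      observer_location = torch.tensor([[1.0, 0.5, 0.0]])

      scene_representation = representation_net(scene_data)
      olfactory_prediction = prediction_net(
          torch.cat([scene_representation, observer_location], dim=-1))
      print(olfactory_prediction.shape)              # predicted smell descriptor at the observer location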
  • Publication number: 20210110831
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing visual speech recognition. In one aspect, a method comprises receiving a video comprising a plurality of video frames, wherein each video frame depicts a pair of lips; processing the video using a visual speech recognition neural network to generate, for each output position in an output sequence, a respective output score for each token in a vocabulary of possible tokens, wherein the visual speech recognition neural network comprises one or more volumetric convolutional neural network layers and one or more time-aggregation neural network layers; wherein the vocabulary of possible tokens comprises a plurality of phonemes; and determining a sequence of words expressed by the pair of lips depicted in the video using the output scores.
    Type: Application
    Filed: May 20, 2019
    Publication date: April 15, 2021
    Inventors: Brendan Shillingford, Ioannis Alexandros Assael, Joao Ferdinando Gomes de Freitas
  • Publication number: 20210020160
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers.
    Type: Application
    Filed: October 1, 2020
    Publication date: January 21, 2021
    Inventors: Yutian Chen, Scott Ellison Reed, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Heiga Zen, Ioannis Alexandros Assael, Brendan Shillingford, Joao Ferdinando Gomes de Freitas
  • Patent number: 10810993
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: October 20, 2020
    Assignee: DeepMind Technologies Limited
    Inventors: Yutian Chen, Scott Ellison Reed, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Heiga Zen, Ioannis Alexandros Assael, Brendan Shillingford, Joao Ferdinando Gomes de Freitas
  • Publication number: 20200160843
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a video speech recognition model having a plurality of model parameters on a set of unlabeled video-audio data, using a trained audio speech recognition model. During training, the parameter values of the trained audio speech recognition model are generally kept fixed and only the parameter values of the video speech recognition model are adjusted. Once trained, the video speech recognition model can be used to recognize speech from video when corresponding audio is not available.
    Type: Application
    Filed: November 18, 2019
    Publication date: May 21, 2020
    Inventors: Brendan Shillingford, Ioannis Alexandros Assael, Joao Ferdinando Gomes de Freitas
  • Publication number: 20200135172
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers.
    Type: Application
    Filed: October 28, 2019
    Publication date: April 30, 2020
    Inventors: Yutian Chen, Scott Ellison Reed, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Heiga Zen, Ioannis Alexandros Assael, Brendan Shillingford, Joao Ferdinando Gomes de Freitas