Patents by Inventor Alejandro Luebs

Alejandro Luebs has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230368804
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values. (A rough illustrative sketch of this decoding loop appears after this listing.)
    Type: Application
    Filed: May 8, 2023
    Publication date: November 16, 2023
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Patent number: 11756561
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data. (A rough illustrative sketch of this quantize-and-transmit idea appears after this listing.)
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: September 12, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
  • Patent number: 11676613
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Grant
    Filed: May 27, 2021
    Date of Patent: June 13, 2023
    Assignee: Google LLC
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Publication number: 20220319527
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.
    Type: Application
    Filed: February 17, 2022
    Publication date: October 6, 2022
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
  • Patent number: 11257507
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: February 22, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
  • Publication number: 20210366495
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Application
    Filed: May 27, 2021
    Publication date: November 25, 2021
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Patent number: 11024321
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: June 1, 2021
    Assignee: Google LLC
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Publication number: 20200234725
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.
    Type: Application
    Filed: January 17, 2020
    Publication date: July 23, 2020
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
  • Publication number: 20200176004
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Application
    Filed: November 30, 2018
    Publication date: June 4, 2020
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Patent number: 10110187
    Abstract: Mixture-model-based soft-clipping detection includes receiving input audio samples, generating soft-clipping information indicating whether the input audio samples include soft-clipping distortion, and outputting the soft-clipping information. Generating the soft-clipping information includes fitting a mixture model to the input audio samples, wherein fitting the mixture model to the input audio samples includes generating a fitted mixture model, such that the fitted mixture model has fitted parameters, and evaluating a soft-clipping distortion metric based on the parameters of the fitted mixture model, wherein evaluating the soft-clipping distortion metric includes identifying a soft-clipping distortion value. (A rough illustrative sketch appears after this listing.)
    Type: Grant
    Filed: June 26, 2017
    Date of Patent: October 23, 2018
    Assignee: Google LLC
    Inventors: Alejandro Luebs, Fritz Obermeyer
  • Patent number: 10045137
    Abstract: Techniques of performing acoustic echo cancellation involve providing a bi-magnitude filtering operation that performs a first filtering operation when a magnitude of an incoming audio signal to be output from a loudspeaker is less than a specified threshold and a second filtering operation when the magnitude of the incoming audio signal is greater than the threshold. The first filtering operation may take the form of a convolution between the incoming audio signal and a first impulse response function. The second filtering operation may take the form of a convolution between a nonlinear function of the incoming audio signal and a second impulse response function. For such a convolution, the bi-magnitude filtering operation involves providing, as the incoming audio signal, samples of the incoming audio signal over a specified window of time. The first and second impulse response functions may be determined from a signal captured by a microphone. (A rough illustrative sketch appears after this listing.)
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: August 7, 2018
    Assignee: Google LLC
    Inventors: Jan Skoglund, Yiteng Huang, Alejandro Luebs
  • Publication number: 20180007482
    Abstract: Techniques of performing acoustic echo cancellation involve providing a bi-magnitude filtering operation that performs a first filtering operation when a magnitude of an incoming audio signal to be output from a loudspeaker is less than a specified threshold and a second filtering operation when the magnitude of the incoming audio signal is greater than the threshold. The first filtering operation may take the form of a convolution between the incoming audio signal and a first impulse response function. The second filtering operation may take the form of a convolution between a nonlinear function of the incoming audio signal and a second impulse response function. For such a convolution, the bi-magnitude filtering operation involves providing, as the incoming audio signal, samples of the incoming audio signal over a specified window of time. The first and second impulse response functions may be determined from a signal captured by a microphone.
    Type: Application
    Filed: June 30, 2017
    Publication date: January 4, 2018
    Inventors: Jan Skoglund, Yiteng Huang, Alejandro Luebs
  • Publication number: 20170221502
    Abstract: Existing post-filtering methods for microphone array speech enhancement have two common deficiencies. First, they assume that noise is either white or diffuse and cannot deal with point interferers. Second, they estimate the post-filter coefficients using only two microphones at a time and average over all microphone pairs, yielding a suboptimal solution. The provided method describes a post-filtering solution that implements signal models which handle white noise, diffuse noise, and point interferers. The method also implements a globally optimized least-squares approach across the microphones in a microphone array, providing a better solution than conventional methods. Experimental results demonstrate that the described method outperforms conventional methods in various acoustic scenarios. (A rough illustrative sketch appears after this listing.)
    Type: Application
    Filed: February 3, 2016
    Publication date: August 3, 2017
    Applicant: Google Inc.
    Inventors: Yiteng Huang, Alejandro Luebs, Jan Skoglund, Willem Bastiaan Kleijn
  • Patent number: 9721582
    Abstract: Existing post-filtering methods for microphone array speech enhancement have two common deficiencies. First, they assume that noise is either white or diffuse and cannot deal with point interferers. Second, they estimate the post-filter coefficients using only two microphones at a time and average over all microphone pairs, yielding a suboptimal solution. The provided method describes a post-filtering solution that implements signal models which handle white noise, diffuse noise, and point interferers. The method also implements a globally optimized least-squares approach across the microphones in a microphone array, providing a better solution than conventional methods. Experimental results demonstrate that the described method outperforms conventional methods in various acoustic scenarios.
    Type: Grant
    Filed: February 3, 2016
    Date of Patent: August 1, 2017
    Assignee: Google Inc.
    Inventors: Yiteng Huang, Alejandro Luebs, Jan Skoglund, Willem Bastiaan Kleijn
  • Patent number: 9721580
    Abstract: Provided are methods and systems for situation-dependent transient noise suppression for audio signals. Different strategies (e.g., levels of aggressiveness) of transient suppression and signal restoration are applied to audio signals associated with participants in a video/audio conference depending on whether or not each participant is speaking (e.g., whether a voiced segment or an unvoiced/non-speech segment of audio is present). If no participants are speaking or there is an unvoiced/non-speech sound present, a more aggressive strategy for transient suppression and signal restoration is utilized. On the other hand, where voiced audio is detected (e.g., a participant is speaking), the methods and systems apply a softer, less aggressive suppression and restoration process. (A rough illustrative sketch appears after this listing.)
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: August 1, 2017
    Assignee: Google Inc.
    Inventors: Jan Skoglund, Alejandro Luebs
  • Publication number: 20150279386
    Abstract: Provided are methods and systems for situation-dependent transient noise suppression for audio signals. Different strategies (e.g., levels of aggressiveness) of transient suppression and signal restoration are applied to audio signals associated with participants in a video/audio conference depending on whether or not each participant is speaking (e.g., whether a voiced segment or an unvoiced/non-speech segment of audio is present). If no participants are speaking or there is an unvoiced/non-speech sound present, a more aggressive strategy for transient suppression and signal restoration is utilized. On the other hand, where voiced audio is detected (e.g., a participant is speaking), the methods and systems apply a softer, less aggressive suppression and restoration process.
    Type: Application
    Filed: March 31, 2014
    Publication date: October 1, 2015
    Applicant: Google Inc.
    Inventors: Jan Skoglund, Alejandro Luebs
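
The sketches below are rough, hypothetical illustrations of the general techniques named in the abstracts above, written in Python with NumPy; none of them reproduces the claimed methods.

The neural speech-coding entries (patent 11024321 and its related applications) describe a decoder that conditions an auto-regressive generative neural network on a sequence derived from parametric coder parameters and, at each decoder time step, samples one speech sample from the network's score distribution. The sketch below illustrates only the shape of that decoding loop; the stand-in network with fixed random weights, the 256-level sample grid, the context length, and the one-conditioning-vector-per-step layout are assumptions for illustration, not the patented model.

```python
import numpy as np

rng = np.random.default_rng(0)

N_LEVELS = 256          # assumed: 8-bit-style grid of possible sample values
CONTEXT = 64            # assumed: number of past samples fed to the toy net
COND_DIM = 16           # assumed: size of each conditioning vector

# Fixed random weights stand in for a trained auto-regressive generative network.
W_ctx = rng.normal(scale=0.1, size=(N_LEVELS, CONTEXT))
W_cond = rng.normal(scale=0.1, size=(N_LEVELS, COND_DIM))


def toy_autoregressive_net(context, cond_vec):
    """Map (recent reconstruction, conditioning vector) to a score distribution
    over the possible sample values (softmax over N_LEVELS logits)."""
    logits = W_ctx @ context + W_cond @ cond_vec
    logits -= logits.max()                        # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


def decode(conditioning_sequence, n_steps):
    """Generate one speech sample per decoder time step by sampling from the
    network's score distribution, conditioning on the portion of the
    conditioning sequence for that step (one vector per step is an assumption)."""
    grid = np.linspace(-1.0, 1.0, N_LEVELS)       # possible speech sample values
    recon = np.zeros(CONTEXT)                     # current reconstruction sequence
    out = []
    for t in range(n_steps):
        probs = toy_autoregressive_net(recon[-CONTEXT:], conditioning_sequence[t])
        sample = rng.choice(grid, p=probs)        # sample a speech sample value
        out.append(sample)
        recon = np.append(recon, sample)
    return np.array(out)


# Example: a made-up conditioning sequence, as if derived from a parametric bitstream.
cond = rng.normal(size=(100, COND_DIM))
speech = decode(cond, n_steps=100)
print(speech[:5])
```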
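
The discrete-latent-representation entries (patent 11257507 and its related filings) turn encoder outputs into codebook indices so that only those integer indices need to be transmitted to the decoder. The sketch below shows nearest-codebook-entry quantization and decoder-side lookup only; the random codebook and the stand-in "encoder output" are assumptions, and the patent's neural encoder and decoder are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)

CODEBOOK_SIZE = 512     # assumed number of discrete codes
LATENT_DIM = 8          # assumed latent vector size

codebook = rng.normal(size=(CODEBOOK_SIZE, LATENT_DIM))   # shared by encoder and decoder


def encode_to_indices(latents):
    """Quantize each latent vector to the index of its nearest codebook entry.
    Only these integer indices need to be transmitted."""
    d = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)


def decode_from_indices(indices):
    """Recover the quantized latents on the decoder side by codebook lookup."""
    return codebook[indices]


# Stand-in "encoder output" for a chunk of audio (the patent uses neural networks here).
latents = rng.normal(size=(20, LATENT_DIM))
indices = encode_to_indices(latents)            # the discrete latent representation
recovered = decode_from_indices(indices)
print(indices[:10], recovered.shape)
```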
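
Patent 10110187 detects soft clipping by fitting a mixture model to input audio samples and evaluating a distortion metric from the fitted parameters. The sketch below follows that general shape with a plain two-component 1-D Gaussian mixture fitted by EM on sample magnitudes, plus a toy metric based on the component nearest full scale; both the choice of mixture model and the metric are assumptions, not the patent's formulation.

```python
import numpy as np


def fit_gmm_1d(x, n_iter=50):
    """Fit a 2-component 1-D Gaussian mixture with plain EM.
    Returns (weights, means, variances)."""
    w = np.array([0.5, 0.5])
    mu = np.array([np.percentile(x, 25), np.percentile(x, 90)])
    var = np.array([x.var(), x.var()]) + 1e-6
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        p = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return w, mu, var


def soft_clipping_score(samples):
    """Toy distortion metric from the fitted parameters: mass of the mixture
    component sitting nearest full scale (an assumption, not the patent's metric)."""
    w, mu, var = fit_gmm_1d(np.abs(samples))
    rail = np.argmax(mu)                  # component closest to |x| = 1.0
    return float(w[rail] * np.exp(-0.5 * (1.0 - mu[rail]) ** 2 / var[rail]))


clean = np.random.default_rng(2).normal(scale=0.2, size=48000)
clipped = np.tanh(4.0 * clean)            # crude soft-clipping of the same signal
print(soft_clipping_score(clean), soft_clipping_score(clipped))
```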
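
The bi-magnitude echo-cancellation entries (patent 10045137 and publication 20180007482) switch between a linear convolution of the incoming signal and a convolution of a nonlinear function of it, depending on the signal's magnitude. The sketch below hard-codes both impulse responses and uses a tanh nonlinearity purely for illustration; in the described techniques the impulse responses are determined from the microphone input, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(3)

THRESHOLD = 0.5          # assumed magnitude threshold
WINDOW = 128             # assumed window of incoming samples used per output sample

# Assumed-known impulse responses; the described techniques estimate these.
h_linear = rng.normal(scale=0.05, size=WINDOW)
h_nonlinear = rng.normal(scale=0.05, size=WINDOW)


def nonlinearity(x):
    """Assumed memoryless nonlinearity modelling loudspeaker overdrive."""
    return np.tanh(2.0 * x)


def bi_magnitude_echo_estimate(far_end):
    """Estimate the echo sample by sample: convolve the windowed incoming signal
    with h_linear when the current sample is small in magnitude, and convolve a
    nonlinear function of it with h_nonlinear when it is large."""
    x = np.concatenate([np.zeros(WINDOW - 1), far_end])
    echo = np.zeros(len(far_end))
    for n in range(len(far_end)):
        window = x[n:n + WINDOW][::-1]            # most recent sample first
        if abs(far_end[n]) < THRESHOLD:
            echo[n] = h_linear @ window
        else:
            echo[n] = h_nonlinear @ nonlinearity(window)
    return echo


far_end = rng.uniform(-1, 1, size=1000)           # signal sent to the loudspeaker
echo = bi_magnitude_echo_estimate(far_end)
mic = echo + 0.01 * rng.normal(size=1000)         # microphone: echo plus sensor noise
residual = mic - echo                             # echo-cancelled microphone signal
print(np.std(mic), np.std(residual))
```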
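
The microphone-array post-filtering entries (patent 9721582 and publication 20170221502) estimate the post-filter from a least-squares problem posed over all microphone pairs jointly, rather than averaging pair-wise estimates. The sketch below does this for a single frequency bin under a deliberately simplified model with only a desired point source and diffuse noise (point interferers are omitted); the model, the steering vector, and the diffuse-coherence matrix are assumptions, not the patented formulation.

```python
import numpy as np


def postfilter_gain_ls(cross_psd, steer, gamma_diffuse):
    """Per-frequency-bin post-filter gain from a least-squares fit over ALL
    microphone pairs jointly (instead of averaging pair-wise estimates).

    Simplified signal model (an assumption, omitting point interferers):
        cross_psd[i, j] ~= phi_s * steer[i] * conj(steer[j]) + phi_v * gamma_diffuse[i, j]
    Unknowns: phi_s (desired-signal PSD) and phi_v (diffuse-noise PSD).
    """
    m = len(steer)
    rows, rhs = [], []
    for i in range(m):
        for j in range(i + 1, m):          # every distinct microphone pair
            rows.append([steer[i] * np.conj(steer[j]), gamma_diffuse[i, j]])
            rhs.append(cross_psd[i, j])
    A = np.array(rows)
    b = np.array(rhs)
    # Real-valued least squares on the stacked real/imag parts of all pair equations.
    A_ri = np.vstack([A.real, A.imag])
    b_ri = np.concatenate([b.real, b.imag])
    phi_s, phi_v = np.linalg.lstsq(A_ri, b_ri, rcond=None)[0]
    phi_s, phi_v = max(phi_s, 0.0), max(phi_v, 0.0)
    return phi_s / (phi_s + phi_v + 1e-12)   # Wiener-style post-filter gain


# Tiny synthetic example for one frequency bin of a 4-microphone array.
rng = np.random.default_rng(4)
m = 4
steer = np.exp(1j * rng.uniform(0, 2 * np.pi, m))       # assumed steering vector
gamma = np.eye(m) + 0.3 * (1 - np.eye(m))               # toy diffuse-coherence matrix
true_s, true_v = 2.0, 0.5
cross = true_s * np.outer(steer, steer.conj()) + true_v * gamma
print(postfilter_gain_ls(cross, steer, gamma))           # ~ 2.0 / 2.5 = 0.8
```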
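
The transient-noise-suppression entries (patent 9721580 and publication 20150279386) choose a more or less aggressive suppression strategy depending on whether voiced speech is present. The sketch below illustrates only that situation-dependent choice, with two made-up parameter sets applied as a per-frame spectral gain; the voiced/unvoiced detection and the patent's signal-restoration step are not reproduced, and the `is_voiced` flag is supplied by the caller.

```python
import numpy as np


def suppress_frame(frame, noise_psd, is_voiced):
    """Suppress one audio frame with settings chosen by whether voiced speech
    is present. Only the situation-dependent choice of aggressiveness is
    sketched here; transient detection and restoration are not reproduced."""
    # Softer settings while a participant is speaking, aggressive ones otherwise.
    over_subtraction = 1.0 if is_voiced else 3.0
    gain_floor = 0.3 if is_voiced else 0.05

    spec = np.fft.rfft(frame * np.hanning(len(frame)))
    psd = np.abs(spec) ** 2
    gain = np.maximum(1.0 - over_subtraction * noise_psd / (psd + 1e-12), gain_floor)
    return np.fft.irfft(spec * gain, n=len(frame))


# Toy usage on a background-noise frame (e.g. between words), with a rough noise estimate.
rng = np.random.default_rng(5)
frame = 0.05 * rng.normal(size=480)
noise_psd = np.full(len(frame) // 2 + 1, np.mean(np.abs(np.fft.rfft(frame)) ** 2))
aggressive = suppress_frame(frame, noise_psd, is_voiced=False)
gentle = suppress_frame(frame, noise_psd, is_voiced=True)
print(np.std(aggressive), np.std(gentle))   # the non-speech strategy attenuates more
```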