Patents by Inventor Alejandro LUEBS
Alejandro LUEBS has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230368804
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
Type: Application
Filed: May 8, 2023
Publication date: November 16, 2023
Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
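The decoding loop this abstract describes can be sketched as follows. The scoring rule below is a toy placeholder for the auto-regressive generative neural network, and the names (`decode`, `num_levels`) are illustrative, not from the patent; only the loop structure (conditioning, score distribution, sampling per time step) mirrors the claim.

```python
import numpy as np

def softmax(scores):
    e = np.exp(scores - scores.max())
    return e / e.sum()

def decode(conditioning, num_levels=8, seed=0):
    """At each decoder time step, turn the current reconstruction plus
    that step's conditioning value into a score distribution over the
    possible sample values, then draw one sample from it."""
    rng = np.random.default_rng(seed)
    reconstruction = []
    for cond in conditioning:              # one conditioning value per step
        scores = np.full(num_levels, -2.0)
        scores[cond % num_levels] = 3.0    # conditioning biases the scores
        if reconstruction:                 # autoregressive term: favour continuity
            scores[reconstruction[-1]] += 1.0
        probs = softmax(scores)
        reconstruction.append(int(rng.choice(num_levels, p=probs)))
    return reconstruction

samples = decode([0, 1, 2, 3, 4, 5])   # one speech sample per decoder step
```

A real decoder would condition a trained network (e.g. a WaveNet-style model) on the parametric coder parameters instead of the hand-written scoring above.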
-
Patent number: 11756561
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.
Type: Grant
Filed: February 17, 2022
Date of Patent: September 12, 2023
Assignee: DeepMind Technologies Limited
Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
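The core idea, transmitting only discrete codebook indices, can be illustrated with a minimal vector-quantization sketch. The tiny hand-written codebook stands in for one that would be learned jointly with the encoder and decoder; nothing here is the patented model itself.

```python
import numpy as np

# Toy codebook of discrete embedding vectors (a learned codebook in practice).
codebook = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

def encode(latents):
    """Map each continuous encoder output to the index of its nearest
    codebook vector; only these integer indices cross the channel."""
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def decode(indices):
    """Look the indices back up in the shared codebook and reconstruct
    from the quantised embeddings."""
    return codebook[indices]

z = np.array([[0.1, 0.9], [0.8, 0.2]])   # continuous encoder outputs
indices = encode(z)                      # discrete representation: [1, 2]
reconstruction = decode(indices)         # decoder-side embeddings
```

Because each index addresses a shared codebook, the bitrate is log2(codebook size) bits per latent rather than the cost of the continuous vector.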
-
Patent number: 11676613
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
Type: Grant
Filed: May 27, 2021
Date of Patent: June 13, 2023
Assignee: Google LLC
Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
-
Publication number: 20220319527
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.
Type: Application
Filed: February 17, 2022
Publication date: October 6, 2022
Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
-
Patent number: 11257507
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.
Type: Grant
Filed: January 17, 2020
Date of Patent: February 22, 2022
Assignee: DeepMind Technologies Limited
Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
-
Publication number: 20210366495
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
Type: Application
Filed: May 27, 2021
Publication date: November 25, 2021
Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
-
Patent number: 11024321
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
Type: Grant
Filed: November 30, 2018
Date of Patent: June 1, 2021
Assignee: Google LLC
Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
-
Publication number: 20200234725
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.
Type: Application
Filed: January 17, 2020
Publication date: July 23, 2020
Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
-
Publication number: 20200176004
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
Type: Application
Filed: November 30, 2018
Publication date: June 4, 2020
Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
-
Patent number: 10110187
Abstract: Mixture model based soft-clipping detection includes receiving input audio samples, generating soft-clipping information indicating whether the input audio samples include soft-clipping distortion, and outputting the soft-clipping information. Generating the soft-clipping information includes fitting a mixture model to the input audio samples, wherein fitting the mixture model to the input audio samples includes generating a fitted mixture model, such that the fitted mixture model has fitted parameters, and evaluating a soft-clipping distortion metric based on the parameters of the fitted mixture model, wherein evaluating the soft-clipping distortion metric includes identifying a soft-clipping distortion value.
Type: Grant
Filed: June 26, 2017
Date of Patent: October 23, 2018
Assignee: GOOGLE LLC
Inventors: Alejandro Luebs, Fritz Obermeyer
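The pipeline (fit a mixture model, then derive a distortion value from its fitted parameters) can be sketched with a two-component Gaussian mixture on sample magnitudes. Both the EM fit and the final metric (weight of the component nearest full scale) are illustrative choices; the abstract does not specify the mixture family or the metric.

```python
import numpy as np

def fit_mixture(x, iters=30):
    """Tiny EM fit of a two-component 1-D Gaussian mixture, a stand-in
    for the mixture model fitted to the input audio samples."""
    mu = np.array([np.percentile(x, 25.0), np.percentile(x, 90.0)])
    var = np.array([x.var(), x.var()]) + 1e-6
    w = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: per-sample component responsibilities
        p = w * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances
        n = r.sum(axis=0)
        w = n / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n + 1e-6
    return w, mu, var

def soft_clipping_value(audio):
    """Illustrative distortion value: the weight of the component whose
    mean sits highest, i.e. how much probability mass piles up near
    full scale, as soft clipping tends to cause."""
    w, mu, _ = fit_mixture(np.abs(np.asarray(audio, float)))
    return float(w[np.argmax(mu)])

t = np.linspace(0.0, 1.0, 4000)
clipped = np.tanh(3.0 * 0.8 * np.sin(2.0 * np.pi * 50.0 * t))  # soft-clipped tone
value = soft_clipping_value(clipped)
```

Soft clipping compresses peaks toward a saturation level, so magnitudes concentrate there; a mixture fit exposes that concentration through its fitted weights and means.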
-
Patent number: 10045137
Abstract: Techniques of performing acoustic echo cancellation involve providing a bi-magnitude filtering operation that performs a first filtering operation when a magnitude of an incoming audio signal to be output from a loudspeaker is less than a specified threshold and a second filtering operation when the magnitude of the incoming audio signal is greater than the threshold. The first filtering operation may take the form of a convolution between the incoming audio signal and a first impulse response function. The second filtering operation may take the form of a convolution between a nonlinear function of the incoming audio signal and a second impulse response function. For such a convolution, the bi-magnitude filtering operation involves providing, as the incoming audio signal, samples of the incoming audio signal over a specified window of time. The first and second impulse response functions may be determined from an input signal input into a microphone.
Type: Grant
Filed: June 30, 2017
Date of Patent: August 7, 2018
Assignee: Google LLC
Inventors: Jan Skoglund, Yiteng Huang, Alejandro Luebs
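The two-path convolution structure can be sketched directly from the abstract. The choice of tanh as the nonlinear function and the specific impulse responses are assumptions for illustration; the patent leaves both open.

```python
import numpy as np

def bi_magnitude_filter(x, h_low, h_high, threshold):
    """Split the loudspeaker signal by instantaneous magnitude: samples
    below the threshold drive a linear convolution with h_low, samples
    at or above it drive a convolution of a nonlinear function of the
    signal (tanh here) with h_high. The sum is the echo estimate."""
    x = np.asarray(x, float)
    low = np.where(np.abs(x) < threshold, x, 0.0)
    high = np.where(np.abs(x) >= threshold, np.tanh(x), 0.0)
    echo_low = np.convolve(low, h_low)[: len(x)]
    echo_high = np.convolve(high, h_high)[: len(x)]
    return echo_low + echo_high   # estimate to subtract from the microphone signal

x = np.array([0.1, -0.2, 0.3, 1.5, -1.8])
echo = bi_magnitude_filter(x, h_low=[1.0, 0.5], h_high=[0.3], threshold=1.0)
```

Using a nonlinear path only for large magnitudes matches the physics: loudspeakers behave roughly linearly at low drive levels and distort near their excursion limits.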
-
Publication number: 20180007482
Abstract: Techniques of performing acoustic echo cancellation involve providing a bi-magnitude filtering operation that performs a first filtering operation when a magnitude of an incoming audio signal to be output from a loudspeaker is less than a specified threshold and a second filtering operation when the magnitude of the incoming audio signal is greater than the threshold. The first filtering operation may take the form of a convolution between the incoming audio signal and a first impulse response function. The second filtering operation may take the form of a convolution between a nonlinear function of the incoming audio signal and a second impulse response function. For such a convolution, the bi-magnitude filtering operation involves providing, as the incoming audio signal, samples of the incoming audio signal over a specified window of time. The first and second impulse response functions may be determined from an input signal input into a microphone.
Type: Application
Filed: June 30, 2017
Publication date: January 4, 2018
Inventors: Jan Skoglund, Yiteng Huang, Alejandro Luebs
-
Publication number: 20170221502
Abstract: Existing post-filtering methods for microphone array speech enhancement have two common deficiencies. First, they assume that noise is either white or diffuse and cannot deal with point interferers. Second, they estimate the post-filter coefficients using only two microphones at a time, performing averaging over all the microphone pairs, yielding a suboptimal solution. The provided method describes a post-filtering solution that implements signal models which handle white noise, diffuse noise, and point interferers. The method also implements a globally optimized least-squares approach over the microphones in a microphone array, providing a more optimal solution than existing conventional methods. Experimental results demonstrate the described method outperforming conventional methods in various acoustic scenarios.
Type: Application
Filed: February 3, 2016
Publication date: August 3, 2017
Applicant: Google Inc.
Inventors: Yiteng HUANG, Alejandro LUEBS, Jan SKOGLUND, Willem Bastiaan KLEIJN
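The "globally optimized least-squares" contrast with pairwise averaging can be illustrated with a deliberately simplified model: suppose each microphone pair's observed cross-PSD is the desired-signal PSD plus diffuse noise scaled by that pair's known coherence. Stacking one equation per pair and solving once is the global approach; the model below is an assumption for illustration, and the patented signal models additionally cover point interferers.

```python
import numpy as np

def estimate_psds(cross_psds, diffuse_coherence):
    """Solve ONE least-squares problem over all microphone pairs for
    the desired-signal PSD S and diffuse-noise PSD N, under the toy
    per-pair model C_ij = S + N * Gamma_ij, where Gamma_ij is the
    known diffuse-field coherence for that pair's spacing."""
    b = np.asarray(cross_psds, float)
    A = np.column_stack([np.ones_like(b), np.asarray(diffuse_coherence, float)])
    (S, N), *_ = np.linalg.lstsq(A, b, rcond=None)   # joint fit, not averaging
    return S, N

# Synthetic check: three pairs generated from S = 2.0, N = 1.0
gammas = [0.9, 0.5, 0.2]
observed = [2.0 + 1.0 * g for g in gammas]
S, N = estimate_psds(observed, gammas)
```

With every pair constraining the same unknowns simultaneously, the estimate uses all the array geometry at once instead of averaging independent two-microphone estimates.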
-
Patent number: 9721582
Abstract: Existing post-filtering methods for microphone array speech enhancement have two common deficiencies. First, they assume that noise is either white or diffuse and cannot deal with point interferers. Second, they estimate the post-filter coefficients using only two microphones at a time, performing averaging over all the microphone pairs, yielding a suboptimal solution. The provided method describes a post-filtering solution that implements signal models which handle white noise, diffuse noise, and point interferers. The method also implements a globally optimized least-squares approach over the microphones in a microphone array, providing a more optimal solution than existing conventional methods. Experimental results demonstrate the described method outperforming conventional methods in various acoustic scenarios.
Type: Grant
Filed: February 3, 2016
Date of Patent: August 1, 2017
Assignee: GOOGLE INC.
Inventors: Yiteng Huang, Alejandro Luebs, Jan Skoglund, Willem Bastiaan Kleijn
-
Patent number: 9721580
Abstract: Provided are methods and systems for providing situation-dependent transient noise suppression for audio signals. Different strategies (e.g., levels of aggressiveness) of transient suppression and signal restoration are applied to audio signals associated with participants in a video/audio conference depending on whether or not each participant is speaking (e.g., whether a voiced segment or an unvoiced/non-speech segment of audio is present). If no participants are speaking or there is an unvoiced/non-speech sound present, a more aggressive strategy for transient suppression and signal restoration is utilized. On the other hand, where voiced audio is detected (e.g., a participant is speaking), the methods and systems apply a softer, less aggressive suppression and restoration process.
Type: Grant
Filed: March 31, 2014
Date of Patent: August 1, 2017
Assignee: Google Inc.
Inventors: Jan Skoglund, Alejandro Luebs
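The situation-dependent switch reduces to a per-frame gain decision, sketched below. The gain values and the upstream voiced/transient detectors feeding this function are illustrative assumptions, not taken from the patent.

```python
def suppress(frames, voiced, transient, soft_gain=0.5, aggressive_gain=0.1):
    """Situation-dependent transient suppression: frames flagged as
    transient get a mild gain while voiced speech is present and a much
    more aggressive gain otherwise; non-transient frames pass through."""
    out = []
    for frame, v, t in zip(frames, voiced, transient):
        if not t:
            out.append(frame)                    # no transient: leave untouched
        elif v:
            out.append(frame * soft_gain)        # speaker active: suppress gently
        else:
            out.append(frame * aggressive_gain)  # no speech: suppress hard
    return out

gains = suppress([1.0, 1.0, 1.0],
                 voiced=[True, False, True],
                 transient=[True, True, False])  # -> [0.5, 0.1, 1.0]
```

Keeping suppression soft during voiced segments protects speech quality, while keystrokes and other transients in non-speech segments can be attenuated heavily without audible cost.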
-
Publication number: 20150279386
Abstract: Provided are methods and systems for providing situation-dependent transient noise suppression for audio signals. Different strategies (e.g., levels of aggressiveness) of transient suppression and signal restoration are applied to audio signals associated with participants in a video/audio conference depending on whether or not each participant is speaking (e.g., whether a voiced segment or an unvoiced/non-speech segment of audio is present). If no participants are speaking or there is an unvoiced/non-speech sound present, a more aggressive strategy for transient suppression and signal restoration is utilized. On the other hand, where voiced audio is detected (e.g., a participant is speaking), the methods and systems apply a softer, less aggressive suppression and restoration process.
Type: Application
Filed: March 31, 2014
Publication date: October 1, 2015
Applicant: Google Inc.
Inventors: Jan SKOGLUND, Alejandro LUEBS