Patents by Inventor Sze Chie Lim

Sze Chie Lim has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

REPRESENTING AUDIO CHANNELS AND OBJECTS IN A CONTAINER FORMAT

Publication number: 20260171100

Abstract: A method including generating a codec agnostic container including a first parameter, the first parameter including metadata describing how to render audio from a position in three-dimensional space, generating a codec dependent container including a second parameter, the second parameter including information associated with an audio codec used to compress data corresponding to the audio, and generating an audio package, the audio package including the codec agnostic container, the codec dependent container, and the compressed data corresponding to the audio.

Type: Application

Filed: December 12, 2025

Publication date: June 18, 2026

Inventors: Sze Chie Lim, Joseph Cullen, Shawn Singh, Francis Galligan, Jan Skoglund

Immersive audio package

Patent number: 12531075

Abstract: A method including receiving first audio data, receiving second audio data, compressing the first audio data as first compressed audio data, compressing the second audio data as second compressed audio data, generating a codec dependent container including a parameter associated with compressing the first audio data, compressing the second audio data, a reference to the first compressed audio data, and a reference to the second compressed audio data, generating a codec agnostic container including at least one parameter representing time-varying data associated with playback of the first audio data and the second audio data, and generating an audio package including the codec dependent container and the codec agnostic container.

Type: Grant

Filed: July 20, 2023

Date of Patent: January 20, 2026

Assignee: Google LLC

Inventors: Sze Chie Lim, Shawn Singh, Anjali Wheeler, Jani Huoponen, Jan Skoglund

Identifying salient features for generative networks

Patent number: 12242567

Abstract: Implementations identify a small set of independent, salient features from an input signal. The salient features may be used for conditioning a generative network, making the generative network robust to noise. The salient features may facilitate compression and data transmission. An example method includes receiving an input signal and extracting salient features for the input signal by providing the input signal to an encoder trained to extract salient features. The salient features may be independent and have a sparse distribution. The encoder may be configured to generate almost identical features from two input signals a system designer deems equivalent. The method also includes conditioning a generative network using the salient features. In some implementations, the method may also include extracting a plurality of time sequences from the input signal and extracting the salient features for each time sequence.

Type: Grant

Filed: May 16, 2019

Date of Patent: March 4, 2025

Assignee: Google LLC

Inventors: Willem Bastiaan Kleijn, Sze Chie Lim, Michael Chinen, Jan Skoglund

Spatial audio recording from home assistant devices

Patent number: 12200465

Abstract: The technology generally relates to spatial audio communication between devices. For example, a first device and a second device may be connected via a communication link. The first device may capture audio signals in an environment through two or more microphones. The first device may encode the captured audio with spatial configuration data. The first device may transmit the encoded audio via the communication link to the second device. The second device may decode the encoded audio into binaural or ambisonic audio to be output by one or more speakers of the second device. The binaural or ambisonic audio may be converted into spatial audio to be output. The second device may output the binaural or spatial audio to create an immersive listening experience.

Type: Grant

Filed: May 19, 2022

Date of Patent: January 14, 2025

Assignee: Google LLC

Inventors: Rajeev Conrad Nongpiur, Qian Zhang, Andrew James Sutter, Kung-Wei Liu, Jihan Li, Hélène Bahu, Leonardo Kusumo, Sze Chie Lim, Marco Tagliasacchi, Neil Zeghidour, Michael Takezo Chinen

SPECIFYING LOUDNESS IN AN IMMERSIVE AUDIO PACKAGE

Publication number: 20240329915

Abstract: A method including generating an audio stream including a first substream as first audio data and a second substream as second audio data, generating a first loudness parameter associated with playback of the first substream, generating a second loudness parameter associated with playback of the second substream, and generating an audio package including an identification corresponding to the first audio data, an identification corresponding to the second audio data, and a codec agnostic container including the first loudness parameter, and the second loudness parameter.

Type: Application

Filed: July 14, 2023

Publication date: October 3, 2024

Inventors: Sze Chie Lim, Shawn Singh

IMMERSIVE AUDIO PACKAGE

Publication number: 20240331709

Abstract: A method including receiving first audio data, receiving second audio data, compressing the first audio data as first compressed audio data, compressing the second audio data as second compressed audio data, generating a codec dependent container including a parameter associated with compressing the first audio data, compressing the second audio data, a reference to the first compressed audio data, and a reference to the second compressed audio data, generating a codec agnostic container including at least one parameter representing time-varying data associated with playback of the first audio data and the second audio data, and generating an audio package including the codec dependent container and the codec agnostic container.

Type: Application

Filed: July 20, 2023

Publication date: October 3, 2024

Inventors: Sze Chie Lim, Shawn Singh, Anjali Wheeler, Jani Huoponen, Jan Skoglund

Speech coding using auto-regressive generative neural networks

Patent number: 12062380

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

Type: Grant

Filed: May 8, 2023

Date of Patent: August 13, 2024

Assignee: Google LLC

Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim

Spatial Audio Recording from Home Assistant Devices

Publication number: 20230379645

Abstract: The technology generally relates to spatial audio communication between devices. For example, a first device and a second device may be connected via a communication link. The first device may capture audio signals in an environment through two or more microphones. The first device may encode the captured audio with spatial configuration data. The first device may transmit the encoded audio via the communication link to the second device. The second device may decode the encoded audio into binaural or ambisonic audio to be output by one or more speakers of the second device. The binaural or ambisonic audio may be converted into spatial audio to be output. The second device may output the binaural or spatial audio to create an immersive listening experience.

Type: Application

Filed: May 19, 2022

Publication date: November 23, 2023

Inventors: Rajeev Conrad Nongpiur, Qian Zhang, Andrew James Sutter, Kung-Wei Liu, Jihan Li, Hélène Bahu, Leonardo Kusumo, Sze Chie Lim, Marco Tagliasacchi, Neil Zeghidour, Michael Takezo Chinen

SPEECH CODING USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

Publication number: 20230368804

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

Type: Application

Filed: May 8, 2023

Publication date: November 16, 2023

Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim

Speech coding using content latent embedding vectors and speaker latent embedding vectors

Patent number: 11756561

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.

Type: Grant

Filed: February 17, 2022

Date of Patent: September 12, 2023

Assignee: DeepMind Technologies Limited

Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters

Speech coding using auto-regressive generative neural networks

Patent number: 11676613

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

Type: Grant

Filed: May 27, 2021

Date of Patent: June 13, 2023

Assignee: Google LLC

Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim

SPEECH CODING USING CONTENT LATENT EMBEDDING VECTORS AND SPEAKER LATENT EMBEDDING VECTORS

Publication number: 20220319527

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.

Type: Application

Filed: February 17, 2022

Publication date: October 6, 2022

Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters

Speech coding using content latent embedding vectors and speaker latent embedding vectors

Patent number: 11257507

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.

Type: Grant

Filed: January 17, 2020

Date of Patent: February 22, 2022

Assignee: DeepMind Technologies Limited

Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters

SPEECH CODING USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

Publication number: 20210366495

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

Type: Application

Filed: May 27, 2021

Publication date: November 25, 2021

Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim

IDENTIFYING SALIENT FEATURES FOR GENERATIVE NETWORKS

Publication number: 20210287038

Abstract: Implementations identify a small set of independent, salient features from an input signal. The salient features may be used for conditioning a generative network, making the generative network robust to noise. The salient features may facilitate compression and data transmission. An example method includes receiving an input signal and extracting salient features for the input signal by providing the input signal to an encoder trained to extract salient features. The salient features may be independent and have a sparse distribution. The encoder may be configured to generate almost identical features from two input signals a system designer deems equivalent. The method also includes conditioning a generative network using the salient features. In some implementations, the method may also include extracting a plurality of time sequences from the input signal and extracting the salient features for each time sequence.

Type: Application

Filed: May 16, 2019

Publication date: September 16, 2021

Inventors: Willem Bastiaan Kleijn, Sze Chie Lim, Michael Chinen, Jan Skoglund

Speech coding using auto-regressive generative neural networks

Patent number: 11024321

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

Type: Grant

Filed: November 30, 2018

Date of Patent: June 1, 2021

Assignee: Google LLC

Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim

Coding of a soundfield representation

Patent number: 10839815

Abstract: A method includes: receiving a representation of a soundfield, the representation characterizing the soundfield around a point in space; decomposing the received representation into independent signals; and encoding the independent signals, wherein a quantization noise for any of the independent signals has a common spatial profile with the independent signal.

Type: Grant

Filed: May 6, 2019

Date of Patent: November 17, 2020

Assignee: Google LLC

Inventors: Willem Bastiaan Kleijn, Jan Skoglund, Sze Chie Lim

Blind source separation using similarity measure

Patent number: 10770091

Abstract: A method includes: receiving time instants of audio signals generated by a set of microphones at a location; determining a distortion measure between frequency components of at least some of the received audio signals; determining a similarity measure for the frequency components using the determined distortion measure; and processing the audio signals based on the determined similarity measure.

Type: Grant

Filed: January 23, 2017

Date of Patent: September 8, 2020

Assignee: Google LLC

Inventors: Willem Bastiaan Kleijn, Sze Chie Lim

SPEECH CODING USING DISCRETE LATENT REPRESENTATIONS

Publication number: 20200234725

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.

Type: Application

Filed: January 17, 2020

Publication date: July 23, 2020

Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters

SPEECH CODING USING AUTO-REGRESSIVE GENERATIVE NEURAL NETWORKS

Publication number: 20200176004

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.

Type: Application

Filed: November 30, 2018

Publication date: June 4, 2020

Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
