Patents by Inventor Sze Chie Lim

Sze Chie Lim has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230379645
    Abstract: The technology generally relates to spatial audio communication between devices. For example, a first device and a second device may be connected via a communication link. The first device may capture audio signals in an environment through two or more microphones. The first device may encode the captured audio with spatial configuration data. The first device may transmit the encoded audio via the communication link to the second device. The second device may decode the encoded audio into binaural or ambisonic audio to be output by one or more speakers of the second device. The binaural or ambisonic audio may be converted into spatial audio to be output. The second device may output the binaural or spatial audio to create an immersive listening experience.
    Type: Application
    Filed: May 19, 2022
    Publication date: November 23, 2023
    Inventors: Rajeev Conrad Nongpiur, Qian Zhang, Andrew James Sutter, Kung-Wei Liu, Jihan Li, Hélène Bahu, Leonardo Kusumo, Sze Chie Lim, Marco Tagliasacchi, Neil Zeghidour, Michael Takezo Chinen
  • Publication number: 20230368804
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Application
    Filed: May 8, 2023
    Publication date: November 16, 2023
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Patent number: 11756561
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively to decode, i.e., reconstruct, the input audio data.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: September 12, 2023
    Assignee: DeepMind Technologies Limited
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
  • Patent number: 11676613
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Grant
    Filed: May 27, 2021
    Date of Patent: June 13, 2023
    Assignee: Google LLC
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Publication number: 20220319527
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively to decode, i.e., reconstruct, the input audio data.
    Type: Application
    Filed: February 17, 2022
    Publication date: October 6, 2022
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
  • Patent number: 11257507
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively to decode, i.e., reconstruct, the input audio data.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: February 22, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
  • Publication number: 20210366495
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Application
    Filed: May 27, 2021
    Publication date: November 25, 2021
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Publication number: 20210287038
    Abstract: Implementations identify a small set of independent, salient features from an input signal. The salient features may be used for conditioning a generative network, making the generative network robust to noise. The salient features may facilitate compression and data transmission. An example method includes receiving an input signal and extracting salient features for the input signal by providing the input signal to an encoder trained to extract salient features. The salient features may be independent and have a sparse distribution. The encoder may be configured to generate almost identical features from two input signals a system designer deems equivalent. The method also includes conditioning a generative network using the salient features. In some implementations, the method may also include extracting a plurality of time sequences from the input signal and extracting the salient features for each time sequence.
    Type: Application
    Filed: May 16, 2019
    Publication date: September 16, 2021
    Inventors: Willem Bastiaan Kleijn, Sze Chie Lim, Michael Chinen, Jan Skoglund
  • Patent number: 11024321
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: June 1, 2021
    Assignee: Google LLC
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Patent number: 10839815
    Abstract: A method includes: receiving a representation of a soundfield, the representation characterizing the soundfield around a point in space; decomposing the received representation into independent signals; and encoding the independent signals, wherein a quantization noise for any of the independent signals has a common spatial profile with the independent signal.
    Type: Grant
    Filed: May 6, 2019
    Date of Patent: November 17, 2020
    Assignee: Google LLC
    Inventors: Willem Bastiaan Kleijn, Jan Skoglund, Sze Chie Lim
  • Patent number: 10770091
    Abstract: A method includes: receiving time instants of audio signals generated by a set of microphones at a location; determining a distortion measure between frequency components of at least some of the received audio signals; determining a similarity measure for the frequency components using the determined distortion measure; and processing the audio signals based on the determined similarity measure.
    Type: Grant
    Filed: January 23, 2017
    Date of Patent: September 8, 2020
    Assignee: GOOGLE LLC
    Inventors: Willem Bastiaan Kleijn, Sze Chie Lim
  • Publication number: 20200234725
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively to decode, i.e., reconstruct, the input audio data.
    Type: Application
    Filed: January 17, 2020
    Publication date: July 23, 2020
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
  • Publication number: 20200176004
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
    Type: Application
    Filed: November 30, 2018
    Publication date: June 4, 2020
    Inventors: Willem Bastiaan Kleijn, Jan K. Skoglund, Alejandro Luebs, Sze Chie Lim
  • Publication number: 20190259397
    Abstract: A method includes: receiving a representation of a soundfield, the representation characterizing the soundfield around a point in space; decomposing the received representation into independent signals; and encoding the independent signals, wherein a quantization noise for any of the independent signals has a common spatial profile with the independent signal.
    Type: Application
    Filed: May 6, 2019
    Publication date: August 22, 2019
    Inventors: Willem Bastiaan Kleijn, Jan Skoglund, Sze Chie Lim
  • Patent number: 10332530
    Abstract: A method includes: receiving a representation of a soundfield, the representation characterizing the soundfield around a point in space; decomposing the received representation into independent signals; and encoding the independent signals, wherein a quantization noise for any of the independent signals has a common spatial profile with the independent signal.
    Type: Grant
    Filed: January 27, 2017
    Date of Patent: June 25, 2019
    Assignee: GOOGLE LLC
    Inventors: Willem Bastiaan Kleijn, Jan Skoglund, Sze Chie Lim
  • Publication number: 20180218740
    Abstract: A method includes: receiving a representation of a soundfield, the representation characterizing the soundfield around a point in space; decomposing the received representation into independent signals; and encoding the independent signals, wherein a quantization noise for any of the independent signals has a common spatial profile with the independent signal.
    Type: Application
    Filed: January 27, 2017
    Publication date: August 2, 2018
    Inventors: Willem Bastiaan Kleijn, Jan Skoglund, Sze Chie Lim
  • Patent number: 10015618
    Abstract: Techniques of rendering sound for a listener involve producing, as the amplitude of each of the source driving signals, a sum of two terms: a first term based on a solution s† to the equation b=A·s, and a second term based on a projection of a specified vector ? onto the nullspace of A, ? not being a solution to the equation b=A·s. Along these lines, in one example, the first term is equivalent to a Moore-Penrose pseudoinverse, e.g., AH(AAH)?1·b. In general, any solution to the equation b=A·s is satisfactory. The specified vector that is projected onto the nullspace of A is defined to reduce the coherence of the net sound field. Advantageously, the resulting operator is both linear time-invariant and idempotent so that the sound field may be faithfully reproduce both inside the RSF and at a sufficient range outside the RSF to cover a human head.
    Type: Grant
    Filed: August 1, 2017
    Date of Patent: July 3, 2018
    Assignee: GOOGLE LLC
    Inventors: Willem Bastiaan Kleijn, Andrew Allen, Jan Skoglund, Sze Chie Lim
  • Publication number: 20180182412
    Abstract: A method includes: receiving time instants of audio signals generated by a set of microphones at a location; determining a distortion measure between frequency components of at least some of the received audio signals; determining a similarity measure for the frequency components using the determined distortion measure; and processing the audio signals based on the determined similarity measure.
    Type: Application
    Filed: January 23, 2017
    Publication date: June 28, 2018
    Inventors: Willem Bastiaan Kleijn, Sze Chie Lim