Patents by Inventor Jordi Pons Puig

Jordi Pons Puig has filed for patents to protect the following inventions. This listing includes pending patent applications as well as patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240022868
    Abstract: Described herein is a method for training a machine learning algorithm. The method may comprise receiving a first input multichannel audio signal, generating, using the machine learning algorithm, an intermediate audio signal based on the first input multichannel audio signal, and rendering the intermediate audio signal into a first output multichannel audio signal. Further, the method may comprise improving the machine learning algorithm based on a difference between the first input multichannel audio signal and the first output multichannel audio signal. Also described herein are an apparatus for generating an intermediate audio format from an input multichannel audio signal and a corresponding computer program product comprising a computer-readable storage medium with instructions adapted to carry out the method when executed by a device having processing capability. (A code sketch of this training loop appears after this listing.)
    Type: Application
    Filed: October 14, 2021
    Publication date: January 18, 2024
    Applicant: Dolby International AB
    Inventors: Daniel Arteaga, Jordi Pons Puig
  • Publication number: 20240005942
    Abstract: Described is a method of training a deep-learning-based system for sound source separation. The system comprises a separation stage for frame-wise extraction of representations of sound sources from a representation of an audio signal, and a clustering stage for generating, for each frame, a vector indicative of an assignment permutation of extracted frames of representations of sound sources to respective sound sources. The representation of the audio signal is a waveform-based representation. The separation stage is trained using frame-level permutation invariant training. Further, the clustering stage is trained to generate embedding vectors for the frames of the audio signal that make it possible to determine estimates of the respective assignment permutations between extracted sound signals and the labels of sound sources that had been used for the frames. Also described is a method of using the deep-learning-based system for sound source separation. (A sketch of a frame-level permutation invariant loss appears after this listing.)
    Type: Application
    Filed: October 13, 2021
    Publication date: January 4, 2024
    Applicants: Dolby Laboratories Licensing Corporation, Dolby International AB
    Inventors: Xiaoyu Liu, Jordi Pons Puig
  • Publication number: 20230377584
    Abstract: The present disclosure relates to a method and system for performing packet loss concealment using a neural network system. The method comprises obtaining a representation of an incomplete audio signal, inputting the representation of the incomplete audio signal to an encoder neural network, and outputting a latent representation of a predicted complete audio signal. The latent representation is input to a decoder neural network, which outputs a representation of a predicted complete audio signal comprising a reconstruction of the original portion of the complete audio signal, wherein the encoder neural network and the decoder neural network have been trained together with an adversarial neural network. (A sketch of such an adversarial encoder/decoder setup appears after this listing.)
    Type: Application
    Filed: October 14, 2021
    Publication date: November 23, 2023
    Applicant: Dolby International AB
    Inventors: Santiago Pascual, Joan Serra, Jordi Pons Puig
  • Publication number: 20230245674
    Abstract: Described is a method of training a neural-network-based system for determining an indication of an audio quality of an audio input. The method includes obtaining, as input, at least one training set comprising audio samples. The audio samples include audio samples of a first type and audio samples of a second type, wherein each of the first type of audio samples is labelled with information indicative of a respective predetermined audio quality metric, and wherein each of the second type of audio samples is labelled with information indicative of a respective audio quality metric relative to that of a reference audio sample. The method further includes inputting the training set to the neural-network-based system and iteratively training the system to predict the respective label information of the audio samples in the training set. (A sketch combining the two label types in one training loss appears after this listing.)
    Type: Application
    Filed: June 21, 2021
    Publication date: August 3, 2023
    Applicant: Dolby International AB
    Inventors: Joan Serra, Jordi Pons Puig, Santiago Pascual
  • Publication number: 20220406323
    Abstract: A speech separation server comprises a deep-learning encoder with nonlinear activation. The encoder is programmed to take a mixture audio waveform in the time domain, learn generalized patterns from the mixture audio waveform, and generate an encoded representation that effectively characterizes the mixture audio waveform for speech separation. (A sketch of such a learned waveform encoder appears after this listing.)
    Type: Application
    Filed: October 20, 2020
    Publication date: December 22, 2022
    Applicants: Dolby Laboratories Licensing Corporation, Dolby International AB
    Inventors: Berkan Kadioglu, Michael Getty Horgan, Jordi Pons Puig, Xiaoyu Liu
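
The code sketches below are editorial illustrations of the techniques summarized in the abstracts above; they are not taken from the patent filings, and every class name, hyperparameter, and loss choice is an assumption.

For publication 20240022868, a minimal PyTorch-style sketch of the described training loop, assuming a 5-channel input, a 4-channel intermediate format, a fixed linear renderer, and an L1 difference; all of these are placeholders:

    import torch
    import torch.nn as nn

    class IntermediateEncoder(nn.Module):                # hypothetical network
        def __init__(self, in_channels=5, inter_channels=4):
            super().__init__()
            self.net = nn.Conv1d(in_channels, inter_channels, kernel_size=1)

        def forward(self, x):                            # x: (batch, in_channels, samples)
            return self.net(x)                           # intermediate audio signal

    def render(intermediate, mix_matrix):
        # Stand-in renderer: a fixed linear mix back to the input channel layout.
        return torch.einsum("oc,bct->bot", mix_matrix, intermediate)

    model = IntermediateEncoder()
    mix_matrix = torch.randn(5, 4)                       # placeholder rendering matrix
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.L1Loss()

    def training_step(multichannel_input):               # (batch, 5, samples)
        intermediate = model(multichannel_input)
        output = render(intermediate, mix_matrix)        # first output multichannel signal
        loss = loss_fn(output, multichannel_input)       # difference between input and output
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

The key point is that the loss is computed between the original multichannel input and the rendered output, so the intermediate format can be learned without ground-truth labels for the intermediate signal itself.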
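
For publication 20240005942, a sketch of a frame-level permutation invariant loss, assuming the separated and reference sources are already framed; the tensor shapes and the L1 error are assumptions:

    import itertools
    import torch

    def frame_level_pit_loss(estimates, references):
        # estimates, references: (batch, n_sources, n_frames, frame_dim)
        n_sources = estimates.shape[1]
        losses = []
        for perm in itertools.permutations(range(n_sources)):
            permuted = estimates[:, list(perm)]                  # reorder estimated sources
            err = (permuted - references).abs().mean(dim=-1)     # (batch, n_sources, n_frames)
            losses.append(err.mean(dim=1))                       # average over sources
        losses = torch.stack(losses, dim=0)                      # (n_perms, batch, n_frames)
        best_per_frame, best_perm = losses.min(dim=0)            # best permutation per frame
        return best_per_frame.mean(), best_perm

The per-frame permutation indices returned here are what a separately trained clustering stage would have to recover from its embedding vectors in order to assign extracted frames to consistent sources across frames.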
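
For publication 20230377584, a sketch of the adversarial encoder/decoder training, assuming 1-D convolutional networks, fixed-length clips, and a least-squares GAN objective plus an L1 reconstruction term; all architectures are placeholders:

    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Conv1d(1, 64, 15, stride=4, padding=7), nn.ReLU(),
                            nn.Conv1d(64, 128, 15, stride=4, padding=7))
    decoder = nn.Sequential(nn.ConvTranspose1d(128, 64, 16, stride=4, padding=6), nn.ReLU(),
                            nn.ConvTranspose1d(64, 1, 16, stride=4, padding=6))
    discriminator = nn.Sequential(nn.Conv1d(1, 32, 15, stride=4, padding=7), nn.ReLU(),
                                  nn.Conv1d(32, 1, 15, stride=4, padding=7))

    g_opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)
    d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

    def training_step(incomplete, complete):             # both: (batch, 1, 4096)
        predicted = decoder(encoder(incomplete))         # predicted complete signal
        # Discriminator step: push real clips toward 1, predicted clips toward 0.
        d_loss = ((discriminator(complete) - 1) ** 2).mean() \
                 + (discriminator(predicted.detach()) ** 2).mean()
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()
        # Encoder/decoder step: fool the discriminator and stay close to the target.
        g_loss = ((discriminator(predicted) - 1) ** 2).mean() \
                 + (predicted - complete).abs().mean()
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()
        return d_loss.item(), g_loss.item()

At inference time only the encoder and decoder are used; the discriminator exists solely to shape training.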
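
For publication 20230245674, a sketch of training a quality predictor on the two label types, assuming precomputed feature vectors, a mean-squared error for the absolutely labelled samples, and a margin ranking loss for the samples labelled relative to a reference; the feature dimension and model are placeholders:

    import torch
    import torch.nn as nn

    quality_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))
    regression_loss = nn.MSELoss()                       # first type: absolute metric labels
    ranking_loss = nn.MarginRankingLoss(margin=0.1)      # second type: relative labels
    opt = torch.optim.Adam(quality_net.parameters(), lr=1e-4)

    def training_step(abs_feats, abs_scores, rel_feats, ref_feats, rel_sign):
        # abs_feats: (n, 128) with absolute scores abs_scores: (n, 1).
        # rel_feats/ref_feats: (m, 128) sample/reference pairs; rel_sign is (m,)
        # with +1 where the sample is labelled better than its reference, -1 where worse.
        loss = regression_loss(quality_net(abs_feats), abs_scores)
        loss = loss + ranking_loss(quality_net(rel_feats).squeeze(-1),
                                   quality_net(ref_feats).squeeze(-1),
                                   rel_sign)
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()

Both label types update the same network, so the absolute scores anchor the scale of the predictions while the relative pairs only constrain their ordering.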
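
For publication 20220406323, a sketch of a learned time-domain encoder with a nonlinear activation, in the spirit of the abstract; the filter count, kernel size, stride, and the choice of ReLU are assumptions:

    import torch
    import torch.nn as nn

    class WaveformEncoder(nn.Module):
        def __init__(self, n_filters=256, kernel_size=16, stride=8):
            super().__init__()
            self.conv = nn.Conv1d(1, n_filters, kernel_size, stride=stride, bias=False)
            self.activation = nn.ReLU()                  # the nonlinear activation

        def forward(self, mixture):                      # mixture: (batch, 1, samples)
            return self.activation(self.conv(mixture))   # (batch, n_filters, frames)

    encoder = WaveformEncoder()
    mixture = torch.randn(2, 1, 16000)                   # two one-second clips at 16 kHz
    encoded = encoder(mixture)                           # representation fed to separation

The encoded representation is what the downstream speech separation stages consume in place of a fixed transform such as a spectrogram.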