Patents by Inventor Roy M. FEJGIN
Roy M. FEJGIN has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12361956
Abstract: Computer-implemented methods for training a neural network, as well as for implementing audio encoders and decoders via trained neural networks, are provided. The neural network may receive an input audio signal, generate an encoded audio signal and decode the encoded audio signal. A loss function generating module may receive the decoded audio signal and a ground truth audio signal, and may generate a loss function value corresponding to the decoded audio signal. Generating the loss function value may involve applying a psychoacoustic model. The neural network may be trained based on the loss function value. The training may involve updating at least one weight of the neural network.
Type: Grant
Filed: November 13, 2023
Date of Patent: July 15, 2025
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Roy M. Fejgin, Grant A. Davidson, Chih-Wei Wu, Vivek Kumar
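The loss described in this abstract can be illustrated with a minimal sketch: a reconstruction error weighted per frequency band before averaging. The function name `perceptual_loss` and the use of per-band weights as the psychoacoustic model's output are assumptions for illustration, not the patent's actual formulation.

```python
import numpy as np

def perceptual_loss(decoded, ground_truth, band_weights):
    """Mean squared error weighted per band.

    band_weights stands in for the output of a psychoacoustic model
    (e.g., a masking curve per frequency band); a larger weight means
    errors in that band are treated as more audible.
    """
    err = (np.asarray(decoded) - np.asarray(ground_truth)) ** 2
    return float(np.mean(err * np.asarray(band_weights)))
```

In a training loop, this scalar would drive a gradient step that updates at least one weight of the network, as the abstract describes.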
-
Publication number: 20240079019
Abstract: Computer-implemented methods for training a neural network, as well as for implementing audio encoders and decoders via trained neural networks, are provided. The neural network may receive an input audio signal, generate an encoded audio signal and decode the encoded audio signal. A loss function generating module may receive the decoded audio signal and a ground truth audio signal, and may generate a loss function value corresponding to the decoded audio signal. Generating the loss function value may involve applying a psychoacoustic model. The neural network may be trained based on the loss function value. The training may involve updating at least one weight of the neural network.
Type: Application
Filed: November 13, 2023
Publication date: March 7, 2024
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Roy M. FEJGIN, Grant A. DAVIDSON, Chih-Wei WU, Vivek KUMAR
-
Publication number: 20230395086
Abstract: Described herein is a method of processing an audio signal using a neural network or using a first and a second neural network. Further described is a method of training said neural network or of jointly training a set of said first and said second neural network. Moreover, described is a method of obtaining and transmitting a latent feature space representation of a perceptual domain audio signal using a neural network and a method of obtaining an audio signal from a latent feature space representation of a perceptual domain audio signal using a neural network. Also described are respective apparatuses and computer program products.
Type: Application
Filed: October 14, 2021
Publication date: December 7, 2023
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Mark S. VINTON, Cong ZHOU, Roy M. FEJGIN, Grant A. DAVIDSON
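The pipeline this abstract describes can be sketched as three stages: a perceptual-domain transform, an encoder producing a latent representation, and a decoder recovering the signal. The log-magnitude transform and the linear projections standing in for the two neural networks are illustrative assumptions; the publication does not fix these choices.

```python
import numpy as np

def to_perceptual(frame):
    # Stand-in perceptual transform: compress magnitudes, roughly the
    # way a perceptual-domain front end might.
    return np.log1p(np.abs(frame))

def encode(perceptual_frame, W):
    # Linear projection standing in for the first (encoder) network;
    # its output is the latent feature space representation.
    return W @ perceptual_frame

def decode(latent, W):
    # Pseudo-inverse projection standing in for the second (decoder)
    # network, mapping the latent representation back to the signal.
    return np.linalg.pinv(W) @ latent
```

The latent vector produced by `encode` is what would be transmitted; the receiver runs `decode` (its own trained network) to obtain the audio signal.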
-
Publication number: 20230368807
Abstract: A system for suppressing noise and enhancing speech and a related method are disclosed. The system trains a neural network model that takes banded energies corresponding to an original noisy waveform and produces a speech value indicating the amount of speech present in each band at each frame. The neural network model comprises a feature extraction block that implements some lookahead. The feature extraction block is followed by an encoder with steady down-sampling along the frequency domain forming a contracting path. The encoder is followed by a corresponding decoder with steady up-sampling along the frequency domain forming an expanding path. The decoder receives scaled output feature maps from the encoder at a corresponding level. The decoder is followed by a classification block that generates a speech value indicating an amount of speech present for each frequency band of the plurality of frequency bands at each frame of the plurality of frames.
Type: Application
Filed: October 29, 2021
Publication date: November 16, 2023
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Xiaoyu LIU, Michael Getty HORGAN, Roy M. FEJGIN, Paul HOLMBERG
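The contracting/expanding structure described here can be sketched with a single downsample level along the frequency axis, a skip connection, and a sigmoid classification head. The averaging/repeat operations and additive skip are toy stand-ins for the trained convolutional layers; they only illustrate the data flow, not the patent's actual network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def speech_values(banded_energies):
    # banded_energies: (frames, bands) array; bands must be even here.
    frames, bands = banded_energies.shape
    # Contracting path: downsample along frequency (pairwise mean).
    down = banded_energies.reshape(frames, bands // 2, 2).mean(axis=2)
    # Expanding path: upsample along frequency back to full resolution.
    up = np.repeat(down, 2, axis=1)
    # Skip connection: feature maps from the encoder at this level.
    skip = up + banded_energies
    # Classification block: a speech value per band, per frame.
    return sigmoid(skip - skip.mean())
```

The output has the same (frames, bands) shape as the input, with each entry in (0, 1) indicating how much speech that band/frame contains.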
-
Patent number: 11817111
Abstract: Computer-implemented methods for training a neural network, as well as for implementing audio encoders and decoders via trained neural networks, are provided. The neural network may receive an input audio signal, generate an encoded audio signal and decode the encoded audio signal. A loss function generating module may receive the decoded audio signal and a ground truth audio signal, and may generate a loss function value corresponding to the decoded audio signal. Generating the loss function value may involve applying a psychoacoustic model. The neural network may be trained based on the loss function value. The training may involve updating at least one weight of the neural network.
Type: Grant
Filed: April 10, 2019
Date of Patent: November 14, 2023
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Roy M. Fejgin, Grant A. Davidson, Chih-Wei Wu, Vivek Kumar
-
Publication number: 20220392458
Abstract: Described herein is a method of waveform decoding, the method including the steps of: (a) receiving, by a waveform decoder, a bitstream including a finite bitrate representation of a source signal; (b) waveform decoding the finite bitrate representation of the source signal to obtain a waveform approximation of the source signal; (c) providing the waveform approximation of the source signal to a generative model that implements a probability density function, to obtain a probability distribution for a reconstructed signal of the source signal; and (d) generating the reconstructed signal of the source signal based on the probability distribution. Also described are a method and system for waveform coding and a method of training a generative model.
Type: Application
Filed: October 16, 2020
Publication date: December 8, 2022
Applicants: Dolby Laboratories Licensing Corporation, DOLBY INTERNATIONAL AB
Inventors: Janusz Klejsa, Arijit Biswas, Lars Villemoes, Roy M. Fejgin, Cong Zhou
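Steps (c) and (d) above can be sketched by conditioning a simple distribution on the waveform approximation and drawing the reconstruction from it. Modeling the conditional as a Gaussian centered on the approximation is an illustrative assumption; the publication's generative model is a trained network, not a fixed Gaussian.

```python
import numpy as np

def reconstruct(waveform_approx, scale=0.01, rng=None):
    # The generative model is sketched as a Gaussian conditioned on the
    # waveform approximation; sampling it yields the reconstructed signal.
    # With rng=None, the distribution's mean (its most likely signal)
    # is returned instead of a random draw.
    approx = np.asarray(waveform_approx, dtype=float)
    if rng is None:
        return approx.copy()
    return rng.normal(loc=approx, scale=scale)
```

The point of the scheme is that the decoder's coarse waveform approximation steers the generative model, which then fills in a plausible reconstruction.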
-
Patent number: 11183202
Abstract: Methods for detecting whether a rendered version of a specified seamless connection ("SSC") at a connection point between two audio segment sequences results in an audible discontinuity, and methods for analyzing at least one SSC between audio segment sequences to determine whether a renderable version of each SSC would have an audible discontinuity at the connection point when rendered, and in appropriate cases, for a SSC having a renderable version which is determined to have an audible discontinuity when rendered, correcting at least one audio segment of at least one segment sequence to be connected in accordance with the SSC in an effort to ensure that rendering of the SSC will result in seamless connection without an audible discontinuity. Other aspects are editing systems configured to implement any of the methods, and storage media and rendering systems which store audio data generated in accordance with any of the methods.
Type: Grant
Filed: July 26, 2016
Date of Patent: November 23, 2021
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Roy M. Fejgin, Freddie Sanchez, Vinay Melkote, Michael Ward
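The detect-then-correct idea in this abstract can be sketched with a simple amplitude check at the connection point and a crossfade as the correction. The threshold value and the crossfade correction are hypothetical simplifications; the patent's detection is perceptual, not a single-sample comparison.

```python
import numpy as np

def audible_discontinuity(seg_a, seg_b, threshold=0.1):
    # Flag a connection point where the waveform jumps by more than a
    # (hypothetical) audibility threshold between the last sample of
    # one segment and the first sample of the next.
    return abs(seg_a[-1] - seg_b[0]) > threshold

def correct_connection(seg_a, seg_b, n=4):
    # Correct the segments by crossfading across the connection point
    # so the rendered connection has no abrupt jump.
    fade = np.linspace(1.0, 0.0, n)
    tail = seg_a[-n:] * fade + seg_b[:n] * (1.0 - fade)
    return np.concatenate([seg_a[:-n], tail, seg_b[n:]])
```

An editing system would run the detector on each specified connection and apply the correction only where a discontinuity would be audible.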
-
Publication number: 20210082444
Abstract: Computer-implemented methods for training a neural network, as well as for implementing audio encoders and decoders via trained neural networks, are provided. The neural network may receive an input audio signal, generate an encoded audio signal and decode the encoded audio signal. A loss function generating module may receive the decoded audio signal and a ground truth audio signal, and may generate a loss function value corresponding to the decoded audio signal. Generating the loss function value may involve applying a psychoacoustic model. The neural network may be trained based on the loss function value. The training may involve updating at least one weight of the neural network.
Type: Application
Filed: April 10, 2019
Publication date: March 18, 2021
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Roy M. FEJGIN, Grant A. DAVIDSON, Chih-Wei WU, Vivek KUMAR
-
Patent number: 10553224
Abstract: A method for performing inter-channel encoding of a multi-channel audio signal comprising channel signals for N channels, with N being an integer greater than 1, is described. The method comprises determining a basic graph comprising the N channels as nodes and comprising directed edges between at least some of the N channels. Furthermore, the method comprises determining an inter-channel coding graph from the basic graph, such that the inter-channel coding graph is a directed acyclic graph, and such that a cumulated cost of the signals of the nodes of the inter-channel coding graph is reduced.
Type: Grant
Filed: October 2, 2018
Date of Patent: February 4, 2020
Assignees: Dolby Laboratories Licensing Corporation, Dolby International AB
Inventors: Janusz Klejsa, Roy M. Fejgin, Mark S. Vinton
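The directed-acyclic-graph construction described here can be sketched greedily: visit the channels in a fixed order and, for each one, either code it independently or predict it from the cheapest already-visited channel. Restricting edges to point "forward" in the visiting order makes the result acyclic by construction; this greedy order-restricted search is a simplification of the patent's graph determination, and the cost tables are hypothetical.

```python
def coding_graph(pred_cost, independent_cost):
    # pred_cost[i][j]: cost of coding channel j predicted from channel i.
    # independent_cost[j]: cost of coding channel j on its own.
    n = len(independent_cost)
    edges, total = [], independent_cost[0]  # channel 0 is always coded independently
    for j in range(1, n):
        # Only earlier channels may serve as predictors, so no cycles
        # can form and the coding graph is a DAG.
        best_i = min(range(j), key=lambda i: pred_cost[i][j])
        if pred_cost[best_i][j] < independent_cost[j]:
            edges.append((best_i, j))
            total += pred_cost[best_i][j]
        else:
            total += independent_cost[j]
    return edges, total
```

The returned edge list is the inter-channel coding graph and `total` is the cumulated cost it achieves, which the method seeks to reduce relative to coding every channel independently.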
-
Publication number: 20190103119
Abstract: A method for performing inter-channel encoding of a multi-channel audio signal comprising channel signals for N channels, with N being an integer greater than 1, is described. The method comprises determining a basic graph comprising the N channels as nodes and comprising directed edges between at least some of the N channels. Furthermore, the method comprises determining an inter-channel coding graph from the basic graph, such that the inter-channel coding graph is a directed acyclic graph, and such that a cumulated cost of the signals of the nodes of the inter-channel coding graph is reduced.
Type: Application
Filed: October 2, 2018
Publication date: April 4, 2019
Applicants: Dolby Laboratories Licensing Corporation, DOLBY INTERNATIONAL AB
Inventors: Janusz KLEJSA, SR., Roy M. FEJGIN, Mark S. VINTON
-
Patent number: 10068577
Abstract: A method of encoding adaptive audio, comprising receiving N objects and associated spatial metadata that describes the continuing motion of these objects, and partitioning the audio into segments based on the spatial metadata. The method encodes adaptive audio having objects and channel beds by capturing a continuing motion of a number N objects in a time-varying matrix trajectory comprising a sequence of matrices, coding coefficients of the time-varying matrix trajectory in spatial metadata to be transmitted via a high-definition audio format for rendering the adaptive audio through a number M output channels, and segmenting the sequence of matrices into a plurality of sub-segments based on the spatial metadata, wherein the plurality of sub-segments are configured to facilitate coding of one or more characteristics of the adaptive audio.
Type: Grant
Filed: April 23, 2015
Date of Patent: September 4, 2018
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Vinay Melkote, Malcolm James Law, Roy M. Fejgin
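The segmentation step above can be sketched by scanning the time-varying matrix trajectory and starting a new sub-segment wherever consecutive rendering matrices change by more than a threshold. Both the Frobenius-norm change measure and the `tol` value are illustrative assumptions, not the patent's actual segmentation criterion.

```python
import numpy as np

def segment_trajectory(matrices, tol=0.5):
    # matrices: a sequence of N-object x M-channel rendering matrices.
    # Start a new sub-segment wherever consecutive matrices differ by
    # more than tol (Frobenius norm), so nearly-static stretches of the
    # trajectory can be coded together.
    boundaries = [0]
    for t in range(1, len(matrices)):
        if np.linalg.norm(matrices[t] - matrices[t - 1]) > tol:
            boundaries.append(t)
    return boundaries
```

Each returned index marks the first frame of a sub-segment; the coder can then fit or quantize the matrix coefficients per sub-segment.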
-
Publication number: 20180218749
Abstract: Methods for detecting whether a rendered version of a specified seamless connection ("SSC") at a connection point between two audio segment sequences results in an audible discontinuity, and methods for analyzing at least one SSC between audio segment sequences to determine whether a renderable version of each SSC would have an audible discontinuity at the connection point when rendered, and in appropriate cases, for a SSC having a renderable version which is determined to have an audible discontinuity when rendered, correcting at least one audio segment of at least one segment sequence to be connected in accordance with the SSC in an effort to ensure that rendering of the SSC will result in seamless connection without an audible discontinuity. Other aspects are editing systems configured to implement any of the methods, and storage media and rendering systems which store audio data generated in accordance with any of the methods.
Type: Application
Filed: July 26, 2016
Publication date: August 2, 2018
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Roy M. FEJGIN, Freddie SANCHEZ, Vinay MELKOTE, Michael WARD
-
Publication number: 20170047071
Abstract: A method of encoding adaptive audio, comprising receiving N objects and associated spatial metadata that describes the continuing motion of these objects, and partitioning the audio into segments based on the spatial metadata. The method encodes adaptive audio having objects and channel beds by capturing a continuing motion of a number N objects in a time-varying matrix trajectory comprising a sequence of matrices, coding coefficients of the time-varying matrix trajectory in spatial metadata to be transmitted via a high-definition audio format for rendering the adaptive audio through a number M output channels, and segmenting the sequence of matrices into a plurality of sub-segments based on the spatial metadata, wherein the plurality of sub-segments are configured to facilitate coding of one or more characteristics of the adaptive audio.
Type: Application
Filed: April 23, 2015
Publication date: February 16, 2017
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Vinay MELKOTE, Malcolm James LAW, Roy M. FEJGIN