Patents by Inventor Lie Lu
Lie Lu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12340822
Abstract: A method of audio content identification includes using a two-stage classifier. The first stage includes previously-existing classifiers and the second stage includes a new classifier. The outputs of the first and second stages, calculated over different time periods, are combined to generate a steering signal. The final classification results from a combination of the steering signal and the outputs of the first and second stages. In this manner, a new classifier may be added without disrupting existing classifiers.
Type: Grant
Filed: August 18, 2021
Date of Patent: June 24, 2025
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Guiping Wang, Lie Lu
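As a rough illustration of the two-stage idea in this abstract (a sketch, not the patented implementation), the snippet below blends an existing first-stage classifier with a new second-stage classifier through a steering signal; the smoothing windows, the `smooth` helper, and the blending rule are all assumptions.

```python
from collections import deque

def smooth(history, value, window):
    """Running mean of the most recent `window` scores."""
    history.append(value)
    while len(history) > window:
        history.popleft()
    return sum(history) / len(history)

class TwoStageClassifier:
    """Combines first-stage (existing) and second-stage (new) classifier scores."""

    def __init__(self, short_window=10, long_window=100):
        self.stage1_hist = deque()
        self.stage2_hist = deque()
        self.short_window = short_window
        self.long_window = long_window

    def classify(self, stage1_score, stage2_score):
        # Stage outputs are smoothed over different time periods.
        s1 = smooth(self.stage1_hist, stage1_score, self.short_window)
        s2 = smooth(self.stage2_hist, stage2_score, self.long_window)
        # Steering signal: how much weight to give the new classifier.
        steering = s2 / (s1 + s2 + 1e-9)
        # Final result blends both stages without disturbing the existing one.
        return (1.0 - steering) * stage1_score + steering * stage2_score
```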
-
Patent number: 12335715
Abstract: An audio processing system and method that calculates, based on spatial metadata of the audio objects, a panning coefficient for each audio object in relation to each of a plurality of predefined channel coverage zones, and converts the audio signal into submixes in relation to the predefined channel coverage zones based on the calculated panning coefficients and the audio objects. Each submix indicates a sum of components of the audio objects in relation to one of the predefined channel coverage zones. The system generates a submix gain by applying audio processing to each submix and controls an object gain applied to each audio object, the object gain being a function of the panning coefficients for each audio object and the submix gains in relation to each of the predefined channel coverage zones.
Type: Grant
Filed: December 20, 2023
Date of Patent: June 17, 2025
Assignee: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Alan J. Seefeldt, Lie Lu, Chen Zhang
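A minimal sketch of the submix idea described in the abstract, assuming an inverse-distance panner, NumPy arrays, and a user-supplied per-submix processor; none of these choices come from the patent itself.

```python
import numpy as np

def pan_coefficients(object_positions, zone_centers):
    """Simple inverse-distance panner: one coefficient per (object, zone)."""
    p = np.asarray(object_positions, dtype=float)
    z = np.asarray(zone_centers, dtype=float)
    d = np.linalg.norm(p[:, None, :] - z[None, :, :], axis=-1)
    w = 1.0 / (d + 1e-6)
    return w / w.sum(axis=1, keepdims=True)          # each row sums to 1

def process_objects(object_signals, object_positions, zone_centers, submix_processor):
    """object_signals: (num_objects, num_samples) array of per-object audio."""
    x = np.asarray(object_signals, dtype=float)
    g = pan_coefficients(object_positions, zone_centers)     # (num_objects, num_zones)
    submixes = g.T @ x                                        # one submix per zone
    submix_gains = np.array([submix_processor(s) for s in submixes])
    # Object gain: panning-weighted combination of the submix gains.
    object_gains = g @ submix_gains
    return x * object_gains[:, None]

# Example per-submix processor: attenuate submixes whose RMS exceeds a target level.
def limiter(submix, target_rms=0.1):
    rms = np.sqrt(np.mean(submix ** 2) + 1e-12)
    return min(1.0, target_rms / rms)
```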
-
Patent number: 12334098
Abstract: In an embodiment, a method comprises: transforming one or more frames of a two-channel time domain audio signal into a time-frequency domain representation including a plurality of time-frequency tiles, wherein the frequency domain of the time-frequency domain representation includes a plurality of frequency bins grouped into subbands. For each time-frequency tile, the method comprises: calculating spatial parameters and a level for the time-frequency tile; modifying the spatial parameters using shift and squeeze parameters; obtaining a softmask value for each frequency bin using the modified spatial parameters, the level and subband information; and applying the softmask values to the time-frequency tile to generate a modified time-frequency tile of an estimated audio source.
Type: Grant
Filed: June 11, 2021
Date of Patent: June 17, 2025
Assignees: Dolby Laboratories Licensing Corporation, DOLBY INTERNATIONAL AB
Inventors: Aaron Steven Master, Lie Lu, Harald Mundt
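The following sketch illustrates the per-tile flow in simplified form, assuming a panning-index spatial parameter and a Gaussian softmask; the subband grouping mentioned in the abstract is omitted for brevity, and the `shift`, `squeeze`, and `width` values are illustrative.

```python
import numpy as np

def softmask_separation(L, R, shift=0.0, squeeze=1.0, width=0.2):
    """L, R: complex STFTs of the two channels, shape (frames, bins).
    Returns the modified time-frequency tiles of the estimated source."""
    eps = 1e-12
    # Level of each time-frequency tile (per-subband statistics omitted here).
    level = np.abs(L) + np.abs(R)
    # Spatial parameter: panning index in [-1, 1] from the channel magnitudes.
    pan = (np.abs(R) - np.abs(L)) / (level + eps)
    # Modify the spatial parameter using shift and squeeze parameters.
    pan_mod = (pan - shift) * squeeze
    # Softmask per bin: near 1 at the target position, falling off smoothly.
    mask = np.exp(-(pan_mod ** 2) / (2.0 * width ** 2))
    return mask * L, mask * R
```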
-
Publication number: 20250191604
Abstract: The present disclosure relates to a method and system for processing audio for source separation. The method comprises obtaining an input audio signal (A) comprising at least two channels and processing the input audio signal (A) with a spatial cue based separation module (10) to obtain an intermediate audio signal (B). The spatial cue based separation module (10) is configured to determine a mixing parameter of the at least two channels of the input audio signal (A) and modify the channels, based on the mixing parameter, to obtain the intermediate audio signal (B). The method further comprises processing the intermediate audio signal (B) with a source cue based separation module (20) to generate an output audio signal (C), wherein the source cue based separation module (20) is configured to implement a neural network trained to predict a noise reduced output audio signal (C) given the intermediate audio signal (B).
Type: Application
Filed: March 17, 2023
Publication date: June 12, 2025
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Aaron Steven MASTER, Lie LU
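A hedged sketch of the two-module pipeline: the spatial-cue stage below estimates a single least-squares mixing parameter and forms the intermediate signal (B), while the source-cue stage is passed in as an already-trained callable; both the parameter estimate and the interface are assumptions, not the disclosed method.

```python
import numpy as np

def spatial_cue_separation(left, right):
    """Signal A (two channels) -> intermediate signal B."""
    # Mixing parameter: least-squares estimate of how much `right` leaks into `left`.
    a = np.dot(left, right) / (np.dot(right, right) + 1e-12)
    # Modify the channels based on the mixing parameter to suppress the cross-talk.
    return left - a * right

def separate(left, right, source_cue_model):
    """source_cue_model: trained network mapping signal B to a noise-reduced signal C."""
    intermediate = spatial_cue_separation(left, right)   # signal B
    return source_cue_model(intermediate)                # signal C
```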
-
Publication number: 20250182774
Abstract: A method and system for separating a target audio source from a multi-channel audio input including N audio signals, N >= 3. The N audio signals are combined into at least two unique signal pairs, and pairwise source separation is performed on each signal pair to generate at least two processed signal pairs, each processed signal pair including source separated versions of the audio signals in the signal pair. The at least two processed signal pairs are combined to form the target audio source having N target audio signals corresponding to the N audio signals.
Type: Application
Filed: March 17, 2023
Publication date: June 5, 2025
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Aaron Steven MASTER, Lie LU, Scott Gregory NORCROSS
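A small sketch of the pairwise strategy, assuming every unique channel pair is processed and that each output channel simply averages its separated versions; the pair selection and recombination rule in the application may differ.

```python
from itertools import combinations
import numpy as np

def pairwise_separate(channels, pair_separator):
    """channels: (N, num_samples) with N >= 3; pair_separator: (2, T) -> (2, T)."""
    n = channels.shape[0]
    acc = np.zeros(channels.shape)
    counts = np.zeros(n)
    for i, j in combinations(range(n), 2):       # at least two unique pairs when N >= 3
        separated = pair_separator(channels[[i, j]])
        acc[i] += separated[0]
        acc[j] += separated[1]
        counts[i] += 1
        counts[j] += 1
    # N target audio signals corresponding to the N input signals.
    return acc / counts[:, None]
```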
-
Publication number: 20250184681
Abstract: The present disclosure relates to a method and audio processing arrangement for extracting a target mid (and optionally a target side) audio signal from a stereo audio signal. The method comprises obtaining (S1) a plurality of consecutive time segments of the stereo audio signal and obtaining (S2), for each of a plurality of frequency bands of each time segment of the stereo audio signal, at least one of a target panning parameter (θ) and a target phase difference parameter (φ). The method further comprises extracting (S3), for each time segment and each frequency band, a partial mid signal representation (211, 212) based on at least one of the target panning parameter (θ) and the target phase difference parameter (φ) of each frequency band, and forming (S4) the target mid audio signal (M) by combining the partial mid signal representations (211, 212) for each frequency band and time segment.
Type: Application
Filed: March 3, 2023
Publication date: June 5, 2025
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Aaron Steven MASTER, Lie LU
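The sketch below is one plausible reading of the partial-mid extraction, assuming the target panning parameter acts as a mixing angle and the target phase difference is removed before combining; the exact extraction formula in the application is not reproduced here.

```python
import numpy as np

def extract_target_mid(L_bands, R_bands, thetas, phis):
    """L_bands, R_bands: complex per-band signals, shape (num_bands, T).
    thetas, phis: target panning and phase-difference parameters per band."""
    partial_mids = []
    for Lb, Rb, theta, phi in zip(L_bands, R_bands, thetas, phis):
        # Remove the target phase difference before combining the channels.
        Rb_aligned = Rb * np.exp(-1j * phi)
        # Partial mid signal for this band, steered by the target panning parameter.
        partial_mids.append(np.cos(theta) * Lb + np.sin(theta) * Rb_aligned)
    # Form the target mid audio signal M by combining the partial representations.
    return np.sum(partial_mids, axis=0)
```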
-
Publication number: 20250166652
Abstract: A method for volume leveling of an audio signal using a volume leveling control signal. The method comprises determining a noise reliability ratio w(n) as the ratio of noise-like frames over all frames in a current time segment, determining a PGC noise confidence score X_PGN(n) indicating a likelihood that professionally generated content (PGC) noise is present in the time segment, and determining, for the time segment, whether the noise reliability ratio is above a predetermined threshold. When the noise reliability ratio is above the predetermined threshold, the volume leveling control signal is updated based on the PGC noise confidence score; when the noise reliability ratio is below the predetermined threshold, the volume leveling control signal is left unchanged. Volume leveling is improved by preventing boosting of, for example, phone-recorded environmental noise in UGC, while keeping the original behavior for other types of content.
Type: Application
Filed: February 6, 2023
Publication date: May 22, 2025
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Ziyu YANG, Lie LU, Zhiwei SHUANG
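A minimal sketch of the gating logic in the abstract, assuming a 0.5 threshold and a simple multiplicative update of the control signal by the PGC noise confidence score; both choices are illustrative.

```python
def update_leveler_control(control, frames_are_noise, pgc_noise_confidence,
                           threshold=0.5):
    """frames_are_noise: list of booleans, one per frame in the current time segment."""
    # w(n): ratio of noise-like frames over all frames in the segment.
    w = sum(frames_are_noise) / max(len(frames_are_noise), 1)
    if w > threshold:
        # Noise dominates the segment: update based on how likely it is PGC noise,
        # so UGC noise (e.g. phone-recorded environmental noise) is not boosted.
        return control * pgc_noise_confidence
    return control   # below threshold: leave the control signal unchanged
```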
-
Publication number: 20250142285
Abstract: Diffuse or spatially large audio objects may be identified for special processing. A decorrelation process may be performed on audio signals corresponding to the large audio objects to produce decorrelated large audio object audio signals. These decorrelated large audio object audio signals may be associated with object locations, which may be stationary or time-varying locations. For example, the decorrelated large audio object audio signals may be rendered to virtual or actual speaker locations. The output of such a rendering process may be input to a scene simplification process. The decorrelation, associating and/or scene simplification processes may be performed prior to a process of encoding the audio data.
Type: Application
Filed: January 7, 2025
Publication date: May 1, 2025
Applicants: Dolby Laboratories Licensing Corporation, Dolby International AB
Inventors: Dirk Jeroen BREEBAART, Lie LU, Nicolas R. TSINGOS, Antonio MATEOS SOLE
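A rough sketch of the pipeline for large objects, assuming a size threshold, a delay-based stand-in for a proper all-pass decorrelator, and a placeholder scene-simplification step; all of these are illustrative simplifications.

```python
import numpy as np

def decorrelate(signal, delay):
    """Cheap decorrelator: a short delay standing in for an all-pass filter.
    Assumes the signal is longer than the delay."""
    return np.concatenate([np.zeros(delay), signal[:-delay]])

def simplify_scene(rendered_objects):
    """Placeholder for the scene-simplification / clustering stage before encoding."""
    return rendered_objects

def preprocess_large_objects(objects, speaker_positions, size_threshold=0.5):
    """objects: list of dicts with 'audio' (1-D array), 'position', and 'size'."""
    rendered = []
    for obj in objects:
        if obj["size"] > size_threshold:                      # identified as spatially large
            for k, pos in enumerate(speaker_positions):       # associate with speaker locations
                rendered.append({"audio": decorrelate(obj["audio"], 441 * (k + 1)),
                                 "position": pos})
        else:
            rendered.append(obj)
    return simplify_scene(rendered)
```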
-
Publication number: 20250062736
Abstract: A volume leveler controller and controlling method are disclosed. In one embodiment, a volume leveler controller includes an audio content classifier for identifying the content type of an audio signal in real time and an adjusting unit for adjusting a volume leveler in a continuous manner based on the identified content type. The adjusting unit may be configured to positively correlate the dynamic gain of the volume leveler with informative content types of the audio signal, and negatively correlate the dynamic gain of the volume leveler with interfering content types of the audio signal.
Type: Application
Filed: November 4, 2024
Publication date: February 20, 2025
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Jun WANG, Lie LU, Alan J. SEEFELDT
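A sketch of the continuous adjustment described above, assuming classifier confidences arrive as a dictionary and that "dialog"/"music" count as informative while "noise"/"background" count as interfering; the 0.5 correlation weights are arbitrary.

```python
def leveler_dynamic_gain(content_confidences, base_gain=1.0):
    """content_confidences: e.g. {'dialog': 0.8, 'music': 0.1, 'noise': 0.05}."""
    informative = (content_confidences.get("dialog", 0.0)
                   + content_confidences.get("music", 0.0))
    interfering = (content_confidences.get("noise", 0.0)
                   + content_confidences.get("background", 0.0))
    # Positively correlate the gain with informative content types,
    # negatively with interfering content types.
    gain = base_gain * (1.0 + 0.5 * informative - 0.5 * interfering)
    return max(0.0, gain)
```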
-
Publication number: 20250045585
Abstract: The present disclosure relates to a method for designing a processor (20) and a computer-implemented neural network. The method comprises obtaining input data and corresponding ground truth target data and providing the input data to a processor (20) for outputting a first prediction of target data given the input data. The method further comprises providing the latent variables output by a processor module (21:1, 21:2, ..., 21:n-1) to a supervisor module (22:1, 22:2, 22:3, ..., 22:n-1), which outputs a second prediction of target data based on the latent variables, and determining first and second loss measures by comparing the predictions of target data with the ground truth target data. The method further comprises training the processor (20) and the supervisor modules (22:1, 22:2, 22:3, ..., 22:n-1) based on the first and second loss measures, and adjusting the processor by at least one of removing, replacing and adding a processor module.
Type: Application
Filed: December 8, 2022
Publication date: February 6, 2025
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Jundai SUN, Lie LU, Zhiwei SHUANG, Yuanxing MA
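A framework-free sketch of the training scheme, with processor and supervisor modules passed in as callables; the loss weighting `alpha` and the convention that every module except the last has a supervisor are assumptions.

```python
def train_step(processor_modules, supervisor_modules, x, target, loss_fn, alpha=0.5):
    """supervisor_modules[i] supervises the latent output of processor_modules[i];
    there is one supervisor for every processor module except the last."""
    supervisor_losses = []
    latent = x
    for i, module in enumerate(processor_modules):
        latent = module(latent)                            # latent variables of module i
        if i < len(supervisor_modules):
            prediction2 = supervisor_modules[i](latent)    # second prediction of the target
            supervisor_losses.append(loss_fn(prediction2, target))
    first_loss = loss_fn(latent, target)                   # first prediction (processor output)
    # Combined loss to train both the processor and the supervisor modules;
    # a real implementation would backpropagate this in a deep-learning framework.
    return first_loss + alpha * sum(supervisor_losses)
```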
-
Publication number: 20250037729
Abstract: A method for performing denoising on audio signals is provided. In some implementations, the method involves determining an aggressiveness control parameter value that modulates a degree of speech preservation to be applied. In some implementations, the method involves obtaining a training set of training samples, a training sample having a noisy audio signal and a target denoising mask. In some implementations, the method involves training a machine learning model, wherein the trained machine learning model is usable to take, as an input, a noisy test audio signal and generate a corresponding denoised test audio signal, and wherein the aggressiveness control parameter value is used for: 1) generating a frequency domain representation of the noisy audio signals included in the training set; 2) modifying the target denoising masks; 3) determining an architecture of the machine learning model; or 4) determining a loss during training of the machine learning model.
Type: Application
Filed: November 8, 2022
Publication date: January 30, 2025
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Jundai Sun, Lie Lu
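One of the listed uses of the aggressiveness control parameter, sketched under the assumption that it modifies the target denoising masks with a simple power law (higher values suppress more, i.e. less speech preservation); the other uses (input representation, architecture, loss) are not shown.

```python
import numpy as np

def modify_target_mask(target_mask, aggressiveness):
    """target_mask in [0, 1]; aggressiveness >= 1 pushes mask values toward 0."""
    return np.clip(target_mask, 0.0, 1.0) ** aggressiveness

def make_training_sample(noisy_spectrogram, target_mask, aggressiveness=1.5):
    """Builds one training sample: frequency-domain input plus modified target mask."""
    return {
        "input": np.abs(noisy_spectrogram),
        "target": modify_target_mask(target_mask, aggressiveness),
    }
```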
-
Patent number: 12212953
Abstract: Diffuse or spatially large audio objects may be identified for special processing. A decorrelation process may be performed on audio signals corresponding to the large audio objects to produce decorrelated large audio object audio signals. These decorrelated large audio object audio signals may be associated with object locations, which may be stationary or time-varying locations. For example, the decorrelated large audio object audio signals may be rendered to virtual or actual speaker locations. The output of such a rendering process may be input to a scene simplification process. The decorrelation, associating and/or scene simplification processes may be performed prior to a process of encoding the audio data.
Type: Grant
Filed: July 10, 2023
Date of Patent: January 28, 2025
Assignees: Dolby Laboratories Licensing Corporation, Dolby International AB
Inventors: Dirk Jeroen Breebaart, Lie Lu, Nicolas R. Tsingos, Antonio Mateos Sole
-
Patent number: 12177647
Abstract: Systems and methods for preserving headphone rendering mode (HRM) in object clustering are described. In an embodiment, an object-based audio data processing system includes a processor configured to receive a plurality of audio objects, wherein an audio object of the plurality of audio objects is associated with respective object metadata that indicates respective spatial position information and an HRM; determine a plurality of cluster positions by applying an extended hybrid distance metric to a spatial coding algorithm to calculate a partial loudness for each of the audio objects; render the audio objects to the cluster positions to form a plurality of clusters by applying the extended hybrid distance metric to the spatial coding algorithm to calculate object-to-cluster gains; and transmit the clusters to a spatial reproduction system.
Type: Grant
Filed: September 8, 2022
Date of Patent: December 24, 2024
Assignees: DOLBY LABORATORIES LICENSING CORPORATION, DOLBY INTERNATIONAL AB
Inventors: Ziyu Yang, Lie Lu, Heiko Purnhagen, Jeremy Grant Stoddard, Dirk Jeroen Breebaart
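A sketch of the extended-hybrid-distance idea, assuming the metric adds a fixed penalty when object and cluster headphone rendering modes differ and that object-to-cluster gains are inverse-distance weights; the actual metric and its partial-loudness weighting are not reproduced here.

```python
import numpy as np

def hybrid_distance(obj_pos, obj_hrm, cluster_pos, cluster_hrm, hrm_penalty=10.0):
    """Spatial distance plus a penalty when the headphone rendering modes differ."""
    spatial = np.linalg.norm(np.asarray(obj_pos, dtype=float)
                             - np.asarray(cluster_pos, dtype=float))
    return spatial + (hrm_penalty if obj_hrm != cluster_hrm else 0.0)

def object_to_cluster_gains(obj, clusters):
    """obj: dict with 'pos' and 'hrm'; clusters: list of dicts with 'pos' and 'hrm'."""
    d = np.array([hybrid_distance(obj["pos"], obj["hrm"], c["pos"], c["hrm"])
                  for c in clusters])
    w = 1.0 / (d + 1e-6)
    # Normalised gains used to render the object into the clusters.
    return w / w.sum()
```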
-
Patent number: 12166460
Abstract: A volume leveler controller and controlling method are disclosed. In one embodiment, a volume leveler controller includes an audio content classifier for identifying the content type of an audio signal in real time and an adjusting unit for adjusting a volume leveler in a continuous manner based on the identified content type. The adjusting unit may be configured to positively correlate the dynamic gain of the volume leveler with informative content types of the audio signal, and negatively correlate the dynamic gain of the volume leveler with interfering content types of the audio signal.
Type: Grant
Filed: July 20, 2023
Date of Patent: December 10, 2024
Assignee: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Jun Wang, Lie Lu, Alan J. Seefeldt
-
Publication number: 20240355348
Abstract: A method of audio processing includes classifying an audio signal as noise or as non-noise using a first model. For a noise signal, the audio signal is classified as user-generated content (UGC) noise or as professionally-generated content (PGC) noise using a second model. For a non-noise signal or PGC noise, the audio signal is processed using a first audio processing process. For UGC noise, the audio signal is processed using a second audio processing process.
Type: Application
Filed: August 23, 2022
Publication date: October 24, 2024
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Ziyu Yang, Zhiwei Shuang, Lie Lu
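The routing logic of the two classifiers, sketched with the models and processing paths passed in as callables; the label strings are placeholders.

```python
def route_audio(signal, noise_model, ugc_pgc_model, process_default, process_ugc):
    """noise_model: signal -> 'noise' or 'non_noise';
    ugc_pgc_model: signal -> 'ugc_noise' or 'pgc_noise'."""
    if noise_model(signal) == "noise":
        if ugc_pgc_model(signal) == "ugc_noise":
            return process_ugc(signal)       # second audio processing process (UGC noise)
    return process_default(signal)           # first process: non-noise or PGC noise
```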
-
Patent number: 12118987
Abstract: The present application relates to a method of extracting audio features in a dialog detector in response to an input audio signal, the method comprising dividing the input audio signal into a plurality of frames, extracting frame audio features from each frame, determining a set of context windows, each context window including a number of frames surrounding a current frame, deriving, for each context window, a relevant context audio feature for the current frame based on the frame audio features of the frames in the respective context window, and concatenating each context audio feature to form a combined feature vector to represent the current frame. Context windows of different lengths can improve the response speed and robustness.
Type: Grant
Filed: April 13, 2020
Date of Patent: October 15, 2024
Inventors: Lie Lu, Xin Liu
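A simplified sketch of the combined feature vector, assuming three context-window lengths and mean pooling as the "relevant context audio feature"; both are stand-ins for the choices in the patent.

```python
import numpy as np

def combined_feature_vector(frame_features, current, window_lengths=(5, 20, 80)):
    """frame_features: (num_frames, feat_dim) array of per-frame audio features.
    Returns the concatenated context features representing the current frame."""
    contexts = []
    for w in window_lengths:
        start = max(0, current - w + 1)            # frames surrounding the current frame
        # Relevant context audio feature for this window: mean of the frame features.
        contexts.append(frame_features[start:current + 1].mean(axis=0))
    return np.concatenate(contexts)
```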
-
Publication number: 20240334146
Abstract: Systems and methods for preserving headphone rendering mode (HRM) in object clustering are described. In an embodiment, an object-based audio data processing system includes a processor configured to receive a plurality of audio objects, wherein an audio object of the plurality of audio objects is associated with respective object metadata that indicates respective spatial position information and an HRM; determine a plurality of cluster positions by applying an extended hybrid distance metric to a spatial coding algorithm to calculate a partial loudness for each of the audio objects; render the audio objects to the cluster positions to form a plurality of clusters by applying the extended hybrid distance metric to the spatial coding algorithm to calculate object-to-cluster gains; and transmit the clusters to a spatial reproduction system.
Type: Application
Filed: September 8, 2022
Publication date: October 3, 2024
Applicants: DOLBY LABORATORIES LICENSING CORPORATION, DOLBY INTERNATIONAL AB
Inventors: Ziyu Yang, Lie Lu, Heiko Purnhagen, Jeremy Grant Stoddard, Dirk Jeroen Breebaart
-
Patent number: 12073828
Abstract: Described herein is a method for Convolutional Neural Network (CNN) based speech source separation, wherein the method includes the steps of: (a) providing multiple frames of a time-frequency transform of an original noisy speech signal; (b) inputting the time-frequency transform of said multiple frames into an aggregated multi-scale CNN having a plurality of parallel convolution paths; (c) extracting and outputting, by each parallel convolution path, features from the input time-frequency transform of said multiple frames; (d) obtaining an aggregated output of the outputs of the parallel convolution paths; and (e) generating an output mask for extracting speech from the original noisy speech signal based on the aggregated output. Further described herein are an apparatus for CNN-based speech source separation and a respective computer program product comprising a computer-readable storage medium with instructions adapted to carry out said method when executed by a device having processing capability.
Type: Grant
Filed: May 13, 2020
Date of Patent: August 27, 2024
Assignee: Dolby Laboratories Licensing Corporation
Inventors: Jundai Sun, Zhiwei Shuang, Lie Lu, Shaofan Yang, Jia Dai
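A minimal PyTorch sketch of an aggregated multi-scale CNN (the framework, channel counts, and kernel sizes are assumptions): parallel convolution paths with different kernel sizes process the same multi-frame time-frequency input, their outputs are concatenated, and a sigmoid head produces the speech mask.

```python
import torch
import torch.nn as nn

class MultiScaleMaskNet(nn.Module):
    def __init__(self, in_channels=1, channels=16, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # One parallel convolution path per kernel size (per "scale").
        self.paths = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, channels, k, padding=k // 2),
                nn.ReLU(),
            )
            for k in kernel_sizes
        ])
        self.head = nn.Conv2d(channels * len(kernel_sizes), 1, kernel_size=1)

    def forward(self, x):
        """x: (batch, 1, frames, freq_bins) time-frequency transform of multiple frames."""
        features = [path(x) for path in self.paths]      # features per parallel path
        aggregated = torch.cat(features, dim=1)          # aggregated output of the paths
        return torch.sigmoid(self.head(aggregated))      # output mask in [0, 1]
```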
-
Publication number: 20240205629
Abstract: An audio processing system and method that calculates, based on spatial metadata of the audio objects, a panning coefficient for each audio object in relation to each of a plurality of predefined channel coverage zones, and converts the audio signal into submixes in relation to the predefined channel coverage zones based on the calculated panning coefficients and the audio objects. Each submix indicates a sum of components of the audio objects in relation to one of the predefined channel coverage zones. The system generates a submix gain by applying audio processing to each submix and controls an object gain applied to each audio object, the object gain being a function of the panning coefficients for each audio object and the submix gains in relation to each of the predefined channel coverage zones.
Type: Application
Filed: December 20, 2023
Publication date: June 20, 2024
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Alan J. SEEFELDT, Lie LU, Chen ZHANG
-
Publication number: 20240187807
Abstract: A method for clustering audio objects may involve identifying a plurality of audio objects, wherein each audio object of the plurality of audio objects is associated with respective metadata that indicates respective spatial position information and respective rendering metadata. The method may involve assigning audio objects of the plurality of audio objects to categories of rendering metadata of a plurality of categories of rendering metadata, wherein at least one category of rendering metadata comprises a plurality of types of rendering metadata to be preserved. The method may involve determining an allocation of a plurality of audio object clusters to each category of rendering metadata. The method may involve rendering audio objects of the plurality of audio objects to an allocated plurality of audio object clusters based on the metadata that indicates spatial position information and based on the assignments of the audio objects to the categories of rendering metadata.
Type: Application
Filed: February 15, 2022
Publication date: June 6, 2024
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Ziyu Yang, Lie Lu
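A sketch of the category-aware clustering flow, assuming a proportional cluster allocation per rendering-metadata category and nearest-centroid assignment within each category; both are illustrative simplifications of the method described above.

```python
from collections import defaultdict
import numpy as np

def cluster_by_rendering_category(objects, total_clusters=8):
    """objects: dicts with 'position' (3-vector) and 'rendering_category' (e.g. HRM type)."""
    # Assign objects to categories of rendering metadata.
    by_category = defaultdict(list)
    for obj in objects:
        by_category[obj["rendering_category"]].append(obj)
    # Allocate clusters to each category in proportion to its object count (at least one).
    allocation = {cat: max(1, round(total_clusters * len(objs) / len(objects)))
                  for cat, objs in by_category.items()}
    clustered = {}
    for cat, objs in by_category.items():
        positions = np.array([o["position"] for o in objs], dtype=float)
        centroids = positions[: min(allocation[cat], len(objs))]   # crude initialisation
        # Render each object to the nearest cluster within its own category,
        # so rendering metadata is never mixed across clusters.
        labels = [int(np.argmin(np.linalg.norm(centroids - p, axis=1)))
                  for p in positions]
        clustered[cat] = list(zip(objs, labels))
    return clustered
```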