Patents by Inventor Ju-Chiang Wang

Ju-Chiang Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250078857
    Abstract: The present disclosure describes techniques for implementing improved audio source separation. A complex spectrum X, a time-frequency representation of the audio signals, is split into K bands along the frequency axis by applying band-split operations to X. Each band, denoted X_k, k = 1, . . . , K, comprises one or more frequency bins. An individual multilayer perceptron is applied to each band X_k to extract latent representations, yielding outputs H_k^0. A time-domain transformer and a frequency-domain transformer are applied to the stacked representation H^0, and the two are applied repeatedly in an interleaved manner L times to obtain the output H^L from the transformer blocks. H^L is input into a multi-band mask estimation sub-model, and a complex ideal ratio mask is generated from the sub-model's outputs.
    Type: Application
    Filed: August 31, 2023
    Publication date: March 6, 2025
    Inventors: Wei Tsung LU, Ju-Chiang WANG
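    The abstract above traces a full pipeline: per-band MLPs produce H_k^0, interleaved time- and frequency-domain transformers refine the stacked H^0 into H^L, and a multi-band mask-estimation sub-model yields a complex ideal ratio mask. The PyTorch sketch below follows that flow under stated assumptions; the band edges, model dimensions, and the BandSplitSeparator name are illustrative, not the patent's implementation.

```python
import torch
import torch.nn as nn

class BandSplitSeparator(nn.Module):
    """Minimal sketch of band-split separation with interleaved time- and
    frequency-domain transformers. All sizes here are assumptions."""
    def __init__(self, band_edges, d_model=64, n_layers=2, n_heads=4):
        super().__init__()
        self.band_edges = band_edges  # must tile the full frequency axis

        # One MLP per band X_k; inputs are the band's real+imag bins.
        self.band_mlps = nn.ModuleList(
            nn.Sequential(nn.Linear(2 * (hi - lo), d_model), nn.GELU(),
                          nn.Linear(d_model, d_model))
            for lo, hi in band_edges)

        def make_layer():
            return nn.TransformerEncoderLayer(
                d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.time_blocks = nn.ModuleList(make_layer() for _ in range(n_layers))
        self.freq_blocks = nn.ModuleList(make_layer() for _ in range(n_layers))

        # Multi-band mask estimation: real+imag mask values per band bin.
        self.mask_heads = nn.ModuleList(
            nn.Linear(d_model, 2 * (hi - lo)) for lo, hi in band_edges)

    def forward(self, X):  # X: (B, F, T) complex spectrum
        B, _, T = X.shape
        H = []
        for (lo, hi), mlp in zip(self.band_edges, self.band_mlps):
            Xk = X[:, lo:hi, :]                    # band X_k: (B, f_k, T)
            xk = torch.cat([Xk.real, Xk.imag], 1)  # (B, 2*f_k, T)
            H.append(mlp(xk.transpose(1, 2)))      # (B, T, d_model)
        H = torch.stack(H, dim=1)                  # H^0: (B, K, T, d_model)
        K = H.shape[1]

        # Interleave time- and frequency-domain transformers L times.
        for t_blk, f_blk in zip(self.time_blocks, self.freq_blocks):
            # time-domain: attend across frames T within each band
            H = t_blk(H.reshape(B * K, T, -1)).reshape(B, K, T, -1)
            # frequency-domain: attend across bands K at each frame
            H = H.transpose(1, 2).reshape(B * T, K, -1)
            H = f_blk(H).reshape(B, T, K, -1).transpose(1, 2)

        # Estimate a complex ideal ratio mask band by band, then apply it.
        masks = []
        for k, head in enumerate(self.mask_heads):
            re, im = head(H[:, k]).chunk(2, dim=-1)          # (B, T, f_k) x2
            masks.append(torch.complex(re, im).transpose(1, 2))
        M = torch.cat(masks, dim=1)                          # (B, F, T)
        return M * X                                         # separated source
```

    With band_edges covering the whole frequency axis, e.g. [(0, 64), (64, 160), (160, 257)] for a 257-bin spectrum, the module maps a complex spectrum to a separated complex spectrum of the same shape.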
  • Publication number: 20240404494
    Abstract: The present disclosure describes techniques for implementing automatic music audio transcription. A deep neural network model may be configured. The model comprises a spectral cross-attention sub-model configured to project the spectral representation of each time step t, denoted S_t, into a set of latent arrays at that time step, denoted Z_t^h, where h denotes the h-th iteration. The model comprises a plurality of latent transformers configured to perform self-attention on the set of latent arrays Z_t^h. The model further comprises a set of temporal transformers configured to enable communication between any pair of latent arrays Z_t at different time steps. Training data may be augmented by randomly mixing a plurality of types of datasets, comprising a vocal dataset and an instrument dataset. The deep neural network model may be trained using the augmented training data.
    Type: Application
    Filed: June 1, 2023
    Publication date: December 5, 2024
    Inventors: Wei Tsung LU, Ju-Chiang WANG, Yun-Ning HUNG
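    Per the abstract above, each iteration h cross-attends the latents of time step t to the spectral frame S_t, applies self-attention among those latents, and then lets latents at different time steps communicate. Below is a hedged PyTorch sketch of one such block; the latent count, dimensions, and the PerceiverTFBlock name are assumptions rather than the patent's design.

```python
import torch
import torch.nn as nn

class PerceiverTFBlock(nn.Module):
    """Sketch of one iteration h: spectral cross-attention -> latent
    self-attention -> temporal self-attention. Sizes are assumptions."""
    def __init__(self, d=64, n_latents=8, n_heads=4):
        super().__init__()
        self.cross = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.latent_tf = nn.TransformerEncoderLayer(d, n_heads, batch_first=True)
        self.temporal_tf = nn.TransformerEncoderLayer(d, n_heads, batch_first=True)

    def forward(self, Z, S):
        # Z: (B, T, N, d) latent arrays; S: (B, T, F, d) spectral tokens.
        B, T, N, d = Z.shape
        Zf, Sf = Z.reshape(B * T, N, d), S.reshape(B * T, -1, d)
        # Spectral cross-attention: latents at step t query the frame S_t.
        Z2, _ = self.cross(Zf, Sf, Sf)
        # Latent transformer: self-attention among the N latents of step t.
        Z2 = self.latent_tf(Z2).reshape(B, T, N, d)
        # Temporal transformer: each latent index attends across time steps.
        Z2 = Z2.transpose(1, 2).reshape(B * N, T, d)
        Z2 = self.temporal_tf(Z2).reshape(B, N, T, d).transpose(1, 2)
        return Z2
```

    In a full model, Z would start from a learned initialization repeated across time steps, and several such blocks would be stacked to realize the iterations h.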
  • Publication number: 20240395231
    Abstract: The present disclosure describes techniques for tracking beats and downbeats of audio, such as human voices, in real time. Audio may be received in real time and split into a sequence of segments. A sequence of audio features representing those segments may be extracted. A continuous sequence of activations, indicative of the probabilities of beats or downbeats occurring in the segments, may be generated using a machine learning model with causal mechanisms. Timings of the beats or downbeats may then be determined from the continuous sequence of activations by fusing local rhythmic information for the current segment with information indicative of beats or downbeats in previous segments of the sequence.
    Type: Application
    Filed: May 23, 2023
    Publication date: November 28, 2024
    Inventors: Yun-Ning HUNG, Ju-Chiang WANG, Mojtaba HEYDARI
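    The central constraint in the abstract above is causality: the activation for a segment may depend only on that segment and earlier ones, which is what lets the tracker run on streaming audio. A minimal PyTorch sketch under that assumption follows; the feature size, layer choices, and threshold peak-picking are illustrative stand-ins for the patent's fusion of local and past rhythmic information.

```python
import torch
import torch.nn as nn

class CausalBeatTracker(nn.Module):
    """Sketch of a causal activation model: only past context is used,
    so it can run on streaming segments. Layer sizes are assumptions."""
    def __init__(self, n_feats=81, hidden=128):
        super().__init__()
        # Causal 1-D conv: left-pad so frame t never sees frames > t.
        self.pad = nn.ConstantPad1d((4, 0), 0.0)
        self.conv = nn.Conv1d(n_feats, hidden, kernel_size=5)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)  # unidirectional
        self.head = nn.Linear(hidden, 2)  # beat and downbeat activations

    def forward(self, feats, state=None):
        # feats: (B, T, n_feats) features for the current segment.
        x = self.conv(self.pad(feats.transpose(1, 2))).transpose(1, 2)
        x, state = self.rnn(x, state)  # carry hidden state across segments
        return torch.sigmoid(self.head(x)), state  # (B, T, 2), state

# Streaming usage: the recurrent state fuses info from previous segments.
model = CausalBeatTracker()
state = None
for segment in torch.randn(10, 1, 50, 81):       # 10 fake feature segments
    act, state = model(segment, state)
    beats = (act[0, :, 0] > 0.5).nonzero().squeeze(-1)  # naive peak picking
```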
  • Patent number: 12106740
    Abstract: Devices, systems, and methods related to implementing supervised metric learning during the training of a deep neural network model are disclosed herein. In examples, audio input may be received, where the audio input includes a plurality of song fragments from a plurality of songs. For each song fragment, an aligning function may be performed to center the fragment based on determined beat information, thereby creating a plurality of aligned song fragments. For each of the aligned song fragments, an embedding vector may be obtained from the deep neural network model. A batch of aligned song fragments may then be selected, and a training tuple may be selected from the batch. A loss metric may be generated based on the selected training tuple, and one or more weights of the deep neural network model may be updated based on the loss metric.
    Type: Grant
    Filed: October 15, 2021
    Date of Patent: October 1, 2024
    Assignee: Lemon Inc.
    Inventors: Ju-Chiang Wang, Jordan Smith, Wei Tsung Lu
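    The training recipe above has three moving parts: beat-centered alignment, batch embedding, and a tuple-based metric loss. The PyTorch sketch below uses a triplet loss as one concrete instance of a training tuple; align_to_beat, the embedding network, and every size here are hypothetical stand-ins for the patented method.

```python
import torch
import torch.nn as nn

def align_to_beat(fragment, beat_sample, length=32768):
    """Center a 1-D song fragment on a detected beat position
    (hypothetical helper; beat detection itself is a separate model)."""
    start = max(0, beat_sample - length // 2)
    piece = fragment[start:start + length]
    return nn.functional.pad(piece, (0, length - piece.numel()))

embed_net = nn.Sequential(  # stand-in for the deep neural network model
    nn.Conv1d(1, 16, 1024, stride=512), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(16, 64))
loss_fn = nn.TripletMarginLoss(margin=0.5)
opt = torch.optim.Adam(embed_net.parameters(), lr=1e-4)

# One training step on a batch of aligned fragments: anchor and positive
# come from the same song, the negative from a different song.
anchor, positive, negative = (torch.randn(8, 1, 32768) for _ in range(3))
loss = loss_fn(embed_net(anchor), embed_net(positive), embed_net(negative))
opt.zero_grad(); loss.backward(); opt.step()  # update the model weights
```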
  • Patent number: 11854558
    Abstract: Devices, systems, and methods are disclosed herein for causing an apparatus to generate music information from audio data using a transformer-based neural network model with a multilevel transformer for audio analysis, comprising a spectral transformer and a temporal transformer.
    Type: Grant
    Filed: October 15, 2021
    Date of Patent: December 26, 2023
    Assignee: Lemon Inc.
    Inventors: Wei Tsung Lu, Ju-Chiang Wang, Minz Won, Keunwoo Choi, Xuchen Song
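    The multilevel design pairs a spectral transformer, which attends across frequency within each time frame, with a temporal transformer, which attends across frames. Below is a minimal PyTorch sketch of one such block; the mean-pooled frame summary is a simplification (the patented architecture may aggregate frames differently), and all names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class SpecTemporalBlock(nn.Module):
    """Sketch of one multilevel block: a spectral transformer attends
    across frequency within each frame; a temporal transformer then
    attends across frames. Dimensions are illustrative assumptions."""
    def __init__(self, d=64, n_heads=4):
        super().__init__()
        self.spectral = nn.TransformerEncoderLayer(d, n_heads, batch_first=True)
        self.temporal = nn.TransformerEncoderLayer(d, n_heads, batch_first=True)

    def forward(self, x):
        # x: (B, T, F, d) -- frequency tokens per time frame.
        B, T, F, d = x.shape
        x = self.spectral(x.reshape(B * T, F, d)).reshape(B, T, F, d)
        # Summarize each frame (mean over frequency) for temporal attention.
        frame = self.temporal(x.mean(dim=2))       # (B, T, d)
        return x + frame.unsqueeze(2)              # broadcast back over F
```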
  • Publication number: 20230386437
    Abstract: Systems and methods directed to identifying music theory labels for audio tracks are described. More specifically, a first training set of audio portions may be generated from a plurality of audio tracks, with segments within those tracks labeled according to a plurality of music theory labels. A deep neural network model may then be trained using the first training set as input, a first loss function for music theory label identifications of the audio portions, and a second loss function for segment boundary identifications within the audio portions. In examples, the music theory label identifications and the segment boundary identifications are generated by the deep neural network model. When a first audio track is received, segment boundary identifications and music theory labels for segments within it are generated using the trained model.
    Type: Application
    Filed: May 26, 2022
    Publication date: November 30, 2023
    Inventors: Ju-Chiang Wang, Yun-Ning Hung, Jordan Smith
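    This entry's abstract describes joint training with two losses, one for music theory label identification and one for segment boundary identification. The short sketch below shows that two-headed setup; the section taxonomy, model, and synthetic targets are invented purely for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical label set; the patent's exact taxonomy isn't given here.
SECTIONS = ["intro", "verse", "chorus", "bridge", "outro"]

class StructureTagger(nn.Module):
    """Sketch of a model trained with two losses: per-frame music-theory
    labels and segment-boundary detection. Sizes are assumptions."""
    def __init__(self, n_feats=128, hidden=128):
        super().__init__()
        self.body = nn.GRU(n_feats, hidden, batch_first=True)
        self.label_head = nn.Linear(hidden, len(SECTIONS))  # loss 1
        self.boundary_head = nn.Linear(hidden, 1)           # loss 2

    def forward(self, x):                # x: (B, T, n_feats)
        h, _ = self.body(x)
        return self.label_head(h), self.boundary_head(h).squeeze(-1)

model = StructureTagger()
x = torch.randn(4, 100, 128)            # fake audio-portion features
labels = torch.randint(len(SECTIONS), (4, 100))
bounds = torch.zeros(4, 100); bounds[:, ::25] = 1.0  # fake boundary targets
logits, b_logits = model(x)
loss = (nn.functional.cross_entropy(logits.transpose(1, 2), labels)
        + nn.functional.binary_cross_entropy_with_logits(b_logits, bounds))
loss.backward()
```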
  • Publication number: 20230121764
    Abstract: Devices, systems, and methods related to implementing supervised metric learning during the training of a deep neural network model are disclosed herein. In examples, audio input may be received, where the audio input includes a plurality of song fragments from a plurality of songs. For each song fragment, an aligning function may be performed to center the fragment based on determined beat information, thereby creating a plurality of aligned song fragments. For each of the aligned song fragments, an embedding vector may be obtained from the deep neural network model. A batch of aligned song fragments may then be selected, and a training tuple may be selected from the batch. A loss metric may be generated based on the selected training tuple, and one or more weights of the deep neural network model may be updated based on the loss metric.
    Type: Application
    Filed: October 15, 2021
    Publication date: April 20, 2023
    Inventors: Ju-Chiang Wang, Jordan Smith, Wei Tsung Lu
  • Publication number: 20230124006
    Abstract: Devices, systems, and methods are disclosed herein for causing an apparatus to generate music information from audio data using a transformer-based neural network model with a multilevel transformer for audio analysis, comprising a spectral transformer and a temporal transformer.
    Type: Application
    Filed: October 15, 2021
    Publication date: April 20, 2023
    Inventors: Wei Tsung Lu, Ju-Chiang Wang, Minz Won, Keunwoo Choi, Xuchen Song