Patents by Inventor Ju-Chiang Wang

Ju-Chiang Wang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250078857
    Abstract: The present disclosure describes techniques for implementing improved audio source separation. A complex spectrum X, a time-frequency representation of the audio signals, is split into K bands along the frequency axis by applying band-split operations to X. Each band, denoted X_k, k = 1, . . . , K, comprises one or more frequency bins. An individual multilayer perceptron is applied to each band X_k to extract latent representations, yielding outputs H_k^0. A time-domain transformer and a frequency-domain transformer are applied to the stacked representation H^0, and the two are applied repeatedly in an interleaved manner L times to obtain the output H^L from the transformer blocks. H^L is input into a multi-band mask estimation sub-model, and a complex ideal ratio mask is generated from the sub-model's outputs.
    Type: Application
    Filed: August 31, 2023
    Publication date: March 6, 2025
    Inventors: Wei Tsung LU, Ju-Chiang WANG
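    The abstract above traces a full pipeline: per-band MLPs produce H_k^0, interleaved time- and frequency-domain transformers refine the stacked H^0 into H^L, and a multi-band mask-estimation sub-model yields a complex ideal ratio mask. The PyTorch sketch below follows that flow under stated assumptions; the band edges, model dimensions, and the BandSplitSeparator name are illustrative, not the patent's implementation.

```python
import torch
import torch.nn as nn

class BandSplitSeparator(nn.Module):
    """Minimal sketch of band-split separation with interleaved time- and
    frequency-domain transformers. All sizes here are assumptions."""
    def __init__(self, band_edges, d_model=64, n_layers=2, n_heads=4):
        super().__init__()
        self.band_edges = band_edges  # must tile the full frequency axis

        # One MLP per band X_k; inputs are the band's real+imag bins.
        self.band_mlps = nn.ModuleList(
            nn.Sequential(nn.Linear(2 * (hi - lo), d_model), nn.GELU(),
                          nn.Linear(d_model, d_model))
            for lo, hi in band_edges)

        def make_layer():
            return nn.TransformerEncoderLayer(
                d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.time_blocks = nn.ModuleList(make_layer() for _ in range(n_layers))
        self.freq_blocks = nn.ModuleList(make_layer() for _ in range(n_layers))

        # Multi-band mask estimation: real+imag mask values per band bin.
        self.mask_heads = nn.ModuleList(
            nn.Linear(d_model, 2 * (hi - lo)) for lo, hi in band_edges)

    def forward(self, X):  # X: (B, F, T) complex spectrum
        B, _, T = X.shape
        H = []
        for (lo, hi), mlp in zip(self.band_edges, self.band_mlps):
            Xk = X[:, lo:hi, :]                    # band X_k: (B, f_k, T)
            xk = torch.cat([Xk.real, Xk.imag], 1)  # (B, 2*f_k, T)
            H.append(mlp(xk.transpose(1, 2)))      # (B, T, d_model)
        H = torch.stack(H, dim=1)                  # H^0: (B, K, T, d_model)
        K = H.shape[1]

        # Interleave time- and frequency-domain transformers L times.
        for t_blk, f_blk in zip(self.time_blocks, self.freq_blocks):
            # time-domain: attend across frames T within each band
            H = t_blk(H.reshape(B * K, T, -1)).reshape(B, K, T, -1)
            # frequency-domain: attend across bands K at each frame
            H = H.transpose(1, 2).reshape(B * T, K, -1)
            H = f_blk(H).reshape(B, T, K, -1).transpose(1, 2)

        # Estimate a complex ideal ratio mask band by band, then apply it.
        masks = []
        for k, head in enumerate(self.mask_heads):
            re, im = head(H[:, k]).chunk(2, dim=-1)          # (B, T, f_k) x2
            masks.append(torch.complex(re, im).transpose(1, 2))
        M = torch.cat(masks, dim=1)                          # (B, F, T)
        return M * X                                         # separated source
```

    With band_edges covering the whole frequency axis, e.g. [(0, 64), (64, 160), (160, 257)] for a 257-bin spectrum, the module maps a complex spectrum to a separated complex spectrum of the same shape.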
  • Publication number: 20240404494
    Abstract: The present disclosure describes techniques for implementing automatic music audio transcription. A deep neural network model may be configured. The model comprises a spectral cross-attention sub-model configured to project the spectral representation of each time step t, denoted S_t, into a set of latent arrays at that time step, denoted Z_t^h, where h denotes the h-th iteration. The model comprises a plurality of latent transformers configured to perform self-attention on the set of latent arrays Z_t^h. The model further comprises a set of temporal transformers configured to enable communication between any pair of latent arrays Z_t at different time steps. Training data may be augmented by randomly mixing a plurality of types of datasets, comprising a vocal dataset and an instrument dataset. The deep neural network model may be trained using the augmented training data.
    Type: Application
    Filed: June 1, 2023
    Publication date: December 5, 2024
    Inventors: Wei Tsung LU, Ju-Chiang WANG, Yun-Ning HUNG
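    Per the abstract above, each iteration h cross-attends the latents of time step t to the spectral frame S_t, applies self-attention among those latents, and then lets latents at different time steps communicate. Below is a hedged PyTorch sketch of one such block; the latent count, dimensions, and the PerceiverTFBlock name are assumptions rather than the patent's design.

```python
import torch
import torch.nn as nn

class PerceiverTFBlock(nn.Module):
    """Sketch of one iteration h: spectral cross-attention -> latent
    self-attention -> temporal self-attention. Sizes are assumptions."""
    def __init__(self, d=64, n_latents=8, n_heads=4):
        super().__init__()
        self.cross = nn.MultiheadAttention(d, n_heads, batch_first=True)
        self.latent_tf = nn.TransformerEncoderLayer(d, n_heads, batch_first=True)
        self.temporal_tf = nn.TransformerEncoderLayer(d, n_heads, batch_first=True)

    def forward(self, Z, S):
        # Z: (B, T, N, d) latent arrays; S: (B, T, F, d) spectral tokens.
        B, T, N, d = Z.shape
        Zf, Sf = Z.reshape(B * T, N, d), S.reshape(B * T, -1, d)
        # Spectral cross-attention: latents at step t query the frame S_t.
        Z2, _ = self.cross(Zf, Sf, Sf)
        # Latent transformer: self-attention among the N latents of step t.
        Z2 = self.latent_tf(Z2).reshape(B, T, N, d)
        # Temporal transformer: each latent index attends across time steps.
        Z2 = Z2.transpose(1, 2).reshape(B * N, T, d)
        Z2 = self.temporal_tf(Z2).reshape(B, N, T, d).transpose(1, 2)
        return Z2
```

    In a full model, Z would start from a learned initialization repeated across time steps, and several such blocks would be stacked to realize the iterations h.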
  • Publication number: 20240395231
    Abstract: The present disclosure describes techniques for tracking beats and downbeats of audio, such as human voices, in real time. Audio may be received in real time and split into a sequence of segments. A sequence of audio features representing those segments may be extracted. A continuous sequence of activations, indicative of the probabilities of beats or downbeats occurring in the segments, may be generated using a machine learning model with causal mechanisms. Timings of the beats or downbeats may then be determined from the continuous sequence of activations by fusing local rhythmic information for the current segment with information indicative of beats or downbeats in previous segments of the sequence.
    Type: Application
    Filed: May 23, 2023
    Publication date: November 28, 2024
    Inventors: Yun-Ning HUNG, Ju-Chiang WANG, Mojtaba HEYDARI
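    The central constraint in the abstract above is causality: the activation for a segment may depend only on that segment and earlier ones, which is what lets the tracker run on streaming audio. A minimal PyTorch sketch under that assumption follows; the feature size, layer choices, and threshold peak-picking are illustrative stand-ins for the patent's fusion of local and past rhythmic information.

```python
import torch
import torch.nn as nn

class CausalBeatTracker(nn.Module):
    """Sketch of a causal activation model: only past context is used,
    so it can run on streaming segments. Layer sizes are assumptions."""
    def __init__(self, n_feats=81, hidden=128):
        super().__init__()
        # Causal 1-D conv: left-pad so frame t never sees frames > t.
        self.pad = nn.ConstantPad1d((4, 0), 0.0)
        self.conv = nn.Conv1d(n_feats, hidden, kernel_size=5)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)  # unidirectional
        self.head = nn.Linear(hidden, 2)  # beat and downbeat activations

    def forward(self, feats, state=None):
        # feats: (B, T, n_feats) features for the current segment.
        x = self.conv(self.pad(feats.transpose(1, 2))).transpose(1, 2)
        x, state = self.rnn(x, state)  # carry hidden state across segments
        return torch.sigmoid(self.head(x)), state  # (B, T, 2), state

# Streaming usage: the recurrent state fuses info from previous segments.
model = CausalBeatTracker()
state = None
for segment in torch.randn(10, 1, 50, 81):       # 10 fake feature segments
    act, state = model(segment, state)
    beats = (act[0, :, 0] > 0.5).nonzero().squeeze(-1)  # naive peak picking
```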
  • Patent number: 12106740
    Abstract: Devices, systems, and methods related to implementing supervised metric learning during the training of a deep neural network model are disclosed herein. In examples, audio input may be received, where the audio input includes a plurality of song fragments from a plurality of songs. For each song fragment, an aligning function may be performed to center the fragment based on determined beat information, thereby creating a plurality of aligned song fragments. For each of the aligned song fragments, an embedding vector may be obtained from the deep neural network model. A batch of aligned song fragments may then be selected, and a training tuple may be selected from the batch. A loss metric may be generated based on the selected training tuple, and one or more weights of the deep neural network model may be updated based on the loss metric.
    Type: Grant
    Filed: October 15, 2021
    Date of Patent: October 1, 2024
    Assignee: Lemon Inc.
    Inventors: Ju-Chiang Wang, Jordan Smith, Wei Tsung Lu
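    The training recipe above has three moving parts: beat-centered alignment, batch embedding, and a tuple-based metric loss. The PyTorch sketch below uses a triplet loss as one concrete instance of a training tuple; align_to_beat, the embedding network, and every size here are hypothetical stand-ins for the patented method.

```python
import torch
import torch.nn as nn

def align_to_beat(fragment, beat_sample, length=32768):
    """Center a 1-D song fragment on a detected beat position
    (hypothetical helper; beat detection itself is a separate model)."""
    start = max(0, beat_sample - length // 2)
    piece = fragment[start:start + length]
    return nn.functional.pad(piece, (0, length - piece.numel()))

embed_net = nn.Sequential(  # stand-in for the deep neural network model
    nn.Conv1d(1, 16, 1024, stride=512), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(16, 64))
loss_fn = nn.TripletMarginLoss(margin=0.5)
opt = torch.optim.Adam(embed_net.parameters(), lr=1e-4)

# One training step on a batch of aligned fragments: anchor and positive
# come from the same song, the negative from a different song.
anchor, positive, negative = (torch.randn(8, 1, 32768) for _ in range(3))
loss = loss_fn(embed_net(anchor), embed_net(positive), embed_net(negative))
opt.zero_grad(); loss.backward(); opt.step()  # update the model weights
```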
  • Patent number: 11854558
    Abstract: Devices, systems, and methods are disclosed herein for causing an apparatus to generate music information from audio data using a transformer-based neural network model with a multilevel transformer for audio analysis, comprising a spectral transformer and a temporal transformer.
    Type: Grant
    Filed: October 15, 2021
    Date of Patent: December 26, 2023
    Assignee: Lemon Inc.
    Inventors: Wei Tsung Lu, Ju-Chiang Wang, Minz Won, Keunwoo Choi, Xuchen Song
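    The multilevel design pairs a spectral transformer, which attends across frequency within each time frame, with a temporal transformer, which attends across frames. Below is a minimal PyTorch sketch of one such block; the mean-pooled frame summary is a simplification (the patented architecture may aggregate frames differently), and all names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class SpecTemporalBlock(nn.Module):
    """Sketch of one multilevel block: a spectral transformer attends
    across frequency within each frame; a temporal transformer then
    attends across frames. Dimensions are illustrative assumptions."""
    def __init__(self, d=64, n_heads=4):
        super().__init__()
        self.spectral = nn.TransformerEncoderLayer(d, n_heads, batch_first=True)
        self.temporal = nn.TransformerEncoderLayer(d, n_heads, batch_first=True)

    def forward(self, x):
        # x: (B, T, F, d) -- frequency tokens per time frame.
        B, T, F, d = x.shape
        x = self.spectral(x.reshape(B * T, F, d)).reshape(B, T, F, d)
        # Summarize each frame (mean over frequency) for temporal attention.
        frame = self.temporal(x.mean(dim=2))       # (B, T, d)
        return x + frame.unsqueeze(2)              # broadcast back over F
```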
  • Publication number: 20230386437
    Abstract: Systems and methods directed to identifying music theory labels for audio tracks are described. More specifically, a first training set of audio portions may be generated from a plurality of audio tracks, with segments within those tracks labeled according to a plurality of music theory labels. A deep neural network model may then be trained using the first training set as input, a first loss function for music theory label identifications of the audio portions, and a second loss function for segment boundary identifications within the audio portions. In examples, the music theory label identifications and the segment boundary identifications are generated by the deep neural network model. When a first audio track is received, segment boundary identifications and music theory labels for segments within it are generated using the trained model.
    Type: Application
    Filed: May 26, 2022
    Publication date: November 30, 2023
    Inventors: Ju-Chiang Wang, Yun-Ning Hung, Jordan Smith
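    This entry's abstract describes joint training with two losses, one for music theory label identification and one for segment boundary identification. The short sketch below shows that two-headed setup; the section taxonomy, model, and synthetic targets are invented purely for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical label set; the patent's exact taxonomy isn't given here.
SECTIONS = ["intro", "verse", "chorus", "bridge", "outro"]

class StructureTagger(nn.Module):
    """Sketch of a model trained with two losses: per-frame music-theory
    labels and segment-boundary detection. Sizes are assumptions."""
    def __init__(self, n_feats=128, hidden=128):
        super().__init__()
        self.body = nn.GRU(n_feats, hidden, batch_first=True)
        self.label_head = nn.Linear(hidden, len(SECTIONS))  # loss 1
        self.boundary_head = nn.Linear(hidden, 1)           # loss 2

    def forward(self, x):                # x: (B, T, n_feats)
        h, _ = self.body(x)
        return self.label_head(h), self.boundary_head(h).squeeze(-1)

model = StructureTagger()
x = torch.randn(4, 100, 128)            # fake audio-portion features
labels = torch.randint(len(SECTIONS), (4, 100))
bounds = torch.zeros(4, 100); bounds[:, ::25] = 1.0  # fake boundary targets
logits, b_logits = model(x)
loss = (nn.functional.cross_entropy(logits.transpose(1, 2), labels)
        + nn.functional.binary_cross_entropy_with_logits(b_logits, bounds))
loss.backward()
```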
  • Publication number: 20230121764
    Abstract: Devices, systems, and methods related to implementing supervised metric learning during the training of a deep neural network model are disclosed herein. In examples, audio input may be received, where the audio input includes a plurality of song fragments from a plurality of songs. For each song fragment, an aligning function may be performed to center the fragment based on determined beat information, thereby creating a plurality of aligned song fragments. For each of the aligned song fragments, an embedding vector may be obtained from the deep neural network model. A batch of aligned song fragments may then be selected, and a training tuple may be selected from the batch. A loss metric may be generated based on the selected training tuple, and one or more weights of the deep neural network model may be updated based on the loss metric.
    Type: Application
    Filed: October 15, 2021
    Publication date: April 20, 2023
    Inventors: Ju-Chiang Wang, Jordan Smith, Wei Tsung Lu
  • Publication number: 20230124006
    Abstract: Devices, systems, and methods are disclosed herein for causing an apparatus to generate music information from audio data using a transformer-based neural network model with a multilevel transformer for audio analysis, comprising a spectral transformer and a temporal transformer.
    Type: Application
    Filed: October 15, 2021
    Publication date: April 20, 2023
    Inventors: Wei Tsung Lu, Ju-Chiang Wang, Minz Won, Keunwoo Choi, Xuchen Song