Patents by Inventor Kai Zhen

Kai Zhen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Apparatus and method for speech processing using a densely connected hybrid neural network

Patent number: 11837220

Abstract: Disclosed is a speech processing apparatus and method using a densely connected hybrid neural network. The speech processing method includes inputting a time domain sample of N*1 dimension for an input speech into a densely connected hybrid network; passing the time domain sample through a plurality of dense blocks in the densely connected hybrid network; reshaping the time domain samples into M subframes by passing the time domain samples through the plurality of dense blocks; inputting the M subframes into gated recurrent unit (GRU) components of N/M-dimension; and outputting clean speech, obtained by removing noise from the input speech, by passing the M subframes through the GRU components.

Type: Grant

Filed: May 5, 2021

Date of Patent: December 5, 2023

Assignees: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Minje Kim, Mi Suk Lee, Seung Kwon Beack, Jongmo Sung, Tae Jin Lee, Jin Soo Choi, Kai Zhen
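
The reshaping and recurrent steps named in the abstract above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the dense blocks are omitted, the GRU weights are random and untrained, biases are left out, and the names gru_over_subframes and random_gru_params are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def random_gru_params(d, seed=0):
    """Untrained d-by-d weight matrices for one GRU cell (assumption: no biases)."""
    rng = np.random.default_rng(seed)
    names = ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")
    return {name: 0.1 * rng.standard_normal((d, d)) for name in names}

def gru_over_subframes(frame, m, params):
    """Reshape an N-sample frame into M subframes and run a GRU across them."""
    n = frame.shape[0]
    if n % m != 0:
        raise ValueError("N must be divisible by M")
    d = n // m                          # each subframe has N/M samples
    subframes = frame.reshape(m, d)     # shape (M, N/M)
    h = np.zeros(d)                     # hidden state of dimension N/M
    outputs = []
    for x in subframes:
        z = sigmoid(params["Wz"] @ x + params["Uz"] @ h)        # update gate
        r = sigmoid(params["Wr"] @ x + params["Ur"] @ h)        # reset gate
        h_tilde = np.tanh(params["Wh"] @ x + params["Uh"] @ (r * h))
        h = (1.0 - z) * h + z * h_tilde
        outputs.append(h)
    return np.concatenate(outputs)      # back to an N-sample frame

frame = np.linspace(-1.0, 1.0, 512)     # N = 512 time domain samples
out = gru_over_subframes(frame, 4, random_gru_params(128))  # M = 4
print(out.shape)
```

With N = 512 and M = 4, each GRU step sees one 128-sample subframe, so the recurrence runs over only four steps per frame.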

Method and apparatus for processing audio signal

Patent number: 11790926

Abstract: A method and apparatus for processing an audio signal are disclosed. According to an example embodiment, a method of processing an audio signal may include acquiring a final audio signal for an initial audio signal using a plurality of neural network models generating output audio signals by encoding and decoding input audio signals, calculating a difference between the initial audio signal and the final audio signal in a time domain, converting the initial audio signal and the final audio signal into Mel-spectra, calculating a difference between the Mel-spectra of the initial audio signal and the final audio signal in a frequency domain, training the plurality of neural network models based on results calculated in the time domain and the frequency domain, and generating, from the initial audio signal, a new final audio signal distinguished from the final audio signal using the trained neural network models.

Type: Grant

Filed: January 22, 2021

Date of Patent: October 17, 2023

Assignees: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Mi Suk Lee, Seung Kwon Beack, Jongmo Sung, Tae Jin Lee, Jin Soo Choi, Minje Kim, Kai Zhen
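
The two difference terms described in the abstract above (one in the time domain, one between Mel-spectra) can be combined into a single training objective. The sketch below is a minimal illustration under stated assumptions: the squared-error form, the log compression, the triangular filterbank and the mel_weight parameter are not taken from the patent, and all function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_fft, sample_rate, n_mels):
    """Triangular filters spaced evenly on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mel_spectrum(signal, bank):
    """Log mel spectrum of one frame whose length equals the FFT size."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    return np.log(bank @ power + 1e-8)

def combined_loss(initial, final, bank, mel_weight=1.0):
    """Time domain error plus a weighted Mel-spectrum error (assumed squared-error form)."""
    time_term = np.mean((initial - final) ** 2)
    mel_term = np.mean((mel_spectrum(initial, bank) - mel_spectrum(final, bank)) ** 2)
    return time_term + mel_weight * mel_term

n = np.arange(512)
initial = np.sin(2.0 * np.pi * 0.05 * n)
final = initial + 0.01 * np.cos(2.0 * np.pi * 0.2 * n)   # stand-in for a decoded signal
bank = mel_filterbank(n_fft=512, sample_rate=16000, n_mels=20)
print(combined_loss(initial, final, bank))
```

In a real training loop the same two terms would be computed with a differentiable framework so that gradients reach the neural network models.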

Residual coding method of linear prediction coding coefficient based on collaborative quantization, and computing device for performing the method

Patent number: 11488613

Abstract: Disclosed are a method for coding a residual signal of LPC coefficients based on collaborative quantization and a computing device for performing the method. The residual signal coding method includes: generating encoded LPC coefficients and LPC residual signals by performing LPC analysis and quantization on an input speech; determining a predicted LPC residual signal by applying the LPC residual signal to cross module residual learning; performing LPC synthesis using the encoded LPC coefficients and the predicted LPC residual signal; and determining an output speech that is a synthesized output according to a result of performing the LPC synthesis.

Type: Grant

Filed: November 13, 2020

Date of Patent: November 1, 2022

Assignees: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Minje Kim, Kai Zhen, Mi Suk Lee, Seung Kwon Beack, Jongmo Sung, Tae Jin Lee, Jin Soo Choi
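
The LPC analysis and synthesis steps that frame the method above can be illustrated with a textbook autocorrelation-method sketch. It is not the patented method: quantization and the cross module residual learning stage are left out, so the residual is passed to synthesis unchanged, and the function names are hypothetical.

```python
import numpy as np

def lpc_coefficients(speech, order):
    """Autocorrelation-method LPC: solve the normal equations R a = r."""
    n = speech.size
    full = np.correlate(speech, speech, mode="full")
    r = full[n - 1 : n + order]                       # autocorrelation lags 0..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)               # tiny ridge for numerical safety
    return np.linalg.solve(R, r[1 : order + 1])

def lpc_residual(speech, coeffs):
    """Analysis filter: e[n] = s[n] - sum_k a_k * s[n - k]."""
    order = coeffs.size
    padded = np.concatenate([np.zeros(order), speech])
    residual = np.empty(speech.size)
    for n in range(speech.size):
        past = padded[n : n + order][::-1]            # s[n-1], ..., s[n-order]
        residual[n] = speech[n] - coeffs @ past
    return residual

def lpc_synthesis(residual, coeffs):
    """Synthesis filter: s[n] = e[n] + sum_k a_k * s[n - k]."""
    order = coeffs.size
    out = np.zeros(residual.size + order)
    for n in range(residual.size):
        past = out[n : n + order][::-1]
        out[n + order] = residual[n] + coeffs @ past
    return out[order:]

rng = np.random.default_rng(0)
t = np.arange(400)
speech = np.sin(2.0 * np.pi * 0.03 * t) + 0.1 * rng.standard_normal(t.size)
coeffs = lpc_coefficients(speech, order=8)
residual = lpc_residual(speech, coeffs)
rebuilt = lpc_synthesis(residual, coeffs)
print(np.max(np.abs(rebuilt - speech)))
```

Because the residual carries far less energy than the speech itself, it is the natural signal to hand to a learned coder, which is where the abstract's residual learning stage would sit.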

Audio signal encoding method and apparatus and audio signal decoding method and apparatus using psychoacoustic-based weighted error function

Patent number: 11416742

Abstract: Provided is a training method of a neural network that is applied to an audio signal encoding method using an audio signal encoding apparatus, the training method including generating a masking threshold of a first audio signal before training is performed, calculating a weight matrix to be applied to a frequency component of the first audio signal based on the masking threshold, generating a weighted error function obtained by correcting a preset error function using the weight matrix, and generating a second audio signal by applying a parameter learned using the weighted error function to the first audio signal.

Type: Grant

Filed: September 5, 2018

Date of Patent: August 16, 2022

Assignees: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Jongmo Sung, Minje Kim, Aswin Sivaraman, Kai Zhen
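
A weighted error function of the kind described above can be sketched as a per-frequency weight applied to a squared error. The abstract does not give the weight formula, so the logarithmic signal-to-mask form below is an assumption, as are the function names; the masking threshold itself is taken as a given input rather than computed from a psychoacoustic model.

```python
import numpy as np

def weight_matrix(power_db, mask_db):
    """Assumed weighting: grows with how far the signal power exceeds the masking threshold."""
    return np.log10(10.0 ** (0.1 * (power_db - mask_db)) + 1.0)

def weighted_error(target_spec, estimate_spec, weights):
    """Preset squared error corrected by the per-frequency weights."""
    return np.mean(weights * (target_spec - estimate_spec) ** 2)

power_db = np.array([60.0, 20.0])    # bin 0 is clearly audible, bin 1 is masked
mask_db = np.array([30.0, 40.0])     # hypothetical masking thresholds in dB
weights = weight_matrix(power_db, mask_db)

target = np.array([1.0, 1.0])
audible_error = weighted_error(target, np.array([0.5, 1.0]), weights)
masked_error = weighted_error(target, np.array([1.0, 0.5]), weights)
print(audible_error > masked_error)
```

The same spectral error costs more in the audible bin than in the masked bin, which is the behaviour a psychoacoustically weighted objective is meant to provide.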

Audio signal encoding method and audio signal decoding method, and encoder and decoder performing the same

Patent number: 11276413

Abstract: Disclosed are an audio signal encoding method and audio signal decoding method, and an encoder and decoder performing the same. The audio signal encoding method includes applying an audio signal to a training model including N autoencoders provided in a cascade structure, encoding an output result derived through the training model, and generating a bitstream with respect to the audio signal based on the encoded output result.

Type: Grant

Filed: August 16, 2019

Date of Patent: March 15, 2022

Assignees: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Mi Suk Lee, Jongmo Sung, Minje Kim, Kai Zhen
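
One way to picture N autoencoders in a cascade is to let each stage code whatever the earlier stages failed to reconstruct. The abstract does not say how the stages are connected, so this residual arrangement is an assumption, and the ToyAutoencoder class below is a hypothetical stand-in (a uniform quantizer) for a trained network. Bitstream generation is omitted.

```python
import numpy as np

class ToyAutoencoder:
    """Stand-in for a trained autoencoder: the code is a uniformly quantized copy."""

    def __init__(self, step):
        self.step = step

    def encode(self, x):
        return np.round(x / self.step).astype(int)

    def decode(self, code):
        return code * self.step

def cascade_encode(signal, stages):
    """Each stage encodes the residual left over by the stages before it."""
    codes = []
    residual = signal
    for stage in stages:
        code = stage.encode(residual)
        codes.append(code)
        residual = residual - stage.decode(code)
    return codes

def cascade_decode(codes, stages):
    """The reconstruction is the sum of all stage outputs."""
    return sum(stage.decode(code) for stage, code in zip(stages, codes))

stages = [ToyAutoencoder(0.5), ToyAutoencoder(0.1), ToyAutoencoder(0.02)]  # N = 3
signal = np.linspace(-1.0, 1.0, 101)
codes = cascade_encode(signal, stages)
rebuilt = cascade_decode(codes, stages)
print(np.max(np.abs(rebuilt - signal)))
```

The per-stage codes are what an entropy coder would then pack into the bitstream.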

APPARATUS AND METHOD FOR SPEECH PROCESSING USING A DENSELY CONNECTED HYBRID NEURAL NETWORK

Publication number: 20210350796

Abstract: Disclosed is a speech processing apparatus and method using a densely connected hybrid neural network. The speech processing method includes inputting a time domain sample of N*1 dimension for an input speech into a densely connected hybrid network; passing the time domain sample through a plurality of dense blocks in the densely connected hybrid network; reshaping the time domain samples into M subframes by passing the time domain samples through the plurality of dense blocks; inputting the M subframes into gated recurrent unit (GRU) components of N/M-dimension; and outputting clean speech, obtained by removing noise from the input speech, by passing the M subframes through the GRU components.

Type: Application

Filed: May 5, 2021

Publication date: November 11, 2021

Applicants: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Minje KIM, Mi Suk LEE, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Jin Soo CHOI, Kai ZHEN

METHOD AND APPARATUS FOR PROCESSING AUDIO SIGNAL

Publication number: 20210233547

Abstract: A method and apparatus for processing an audio signal are disclosed. According to an example embodiment, a method of processing an audio signal may include acquiring a final audio signal for an initial audio signal using a plurality of neural network models generating output audio signals by encoding and decoding input audio signals, calculating a difference between the initial audio signal and the final audio signal in a time domain, converting the initial audio signal and the final audio signal into Mel-spectra, calculating a difference between the Mel-spectra of the initial audio signal and the final audio signal in a frequency domain, training the plurality of neural network models based on results calculated in the time domain and the frequency domain, and generating, from the initial audio signal, a new final audio signal distinguished from the final audio signal using the trained neural network models.

Type: Application

Filed: January 22, 2021

Publication date: July 29, 2021

Applicants: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Mi Suk LEE, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Jin Soo CHOI, Minje KIM, Kai ZHEN

RESIDUAL CODING METHOD OF LINEAR PREDICTION CODING COEFFICIENT BASED ON COLLABORATIVE QUANTIZATION, AND COMPUTING DEVICE FOR PERFORMING THE METHOD

Publication number: 20210142812

Abstract: Disclosed are a method for coding a residual signal of LPC coefficients based on collaborative quantization and a computing device for performing the method. The residual signal coding method includes: generating encoded LPC coefficients and LPC residual signals by performing LPC analysis and quantization on an input speech; determining a predicted LPC residual signal by applying the LPC residual signal to cross module residual learning; performing LPC synthesis using the encoded LPC coefficients and the predicted LPC residual signal; and determining an output speech that is a synthesized output according to a result of performing the LPC synthesis.

Type: Application

Filed: November 13, 2020

Publication date: May 13, 2021

Applicants: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Minje KIM, Kai ZHEN, Mi Suk LEE, Seung Kwon BEACK, Jongmo SUNG, Tae Jin LEE, Jin Soo CHOI

AUDIO SIGNAL ENCODING METHOD AND AUDIO SIGNAL DECODING METHOD, AND ENCODER AND DECODER PERFORMING THE SAME

Publication number: 20200135220

Abstract: Disclosed are an audio signal encoding method and audio signal decoding method, and an encoder and decoder performing the same. The audio signal encoding method includes applying an audio signal to a training model including N autoencoders provided in a cascade structure, encoding an output result derived through the training model, and generating a bitstream with respect to the audio signal based on the encoded output result.

Type: Application

Filed: August 16, 2019

Publication date: April 30, 2020

Applicants: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Mi Suk LEE, Jongmo SUNG, Minje KIM, Kai ZHEN

AUDIO SIGNAL ENCODING METHOD AND APPARATUS AND AUDIO SIGNAL DECODING METHOD AND APPARATUS USING PSYCHOACOUSTIC-BASED WEIGHTED ERROR FUNCTION

Publication number: 20190164052

Abstract: Provided is a training method of a neural network that is applied to an audio signal encoding method using an audio signal encoding apparatus, the training method including generating a masking threshold of a first audio signal before training is performed, calculating a weight matrix to be applied to a frequency component of the first audio signal based on the masking threshold, generating a weighted error function obtained by correcting a preset error function using the weight matrix, and generating a second audio signal by applying a parameter learned using the weighted error function to the first audio signal.

Type: Application

Filed: September 5, 2018

Publication date: May 30, 2019

Applicants: Electronics and Telecommunications Research Institute, The Trustees of Indiana University

Inventors: Jongmo SUNG, Minje KIM, Aswin SIVARAMAN, Kai ZHEN