Patents by Inventor Yuanxun KANG

Yuanxun KANG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and system for generating mixed voice data

Patent number: 11508397

Abstract: The present disclosure discloses a method and system for generating mixed voice data, and belongs to the technical field of voice recognition. In the method for generating mixed voice data according to the present disclosure, a pure voice and noise are collected first, normalization processing is performed on the collected voice data, randomization processing is performed on processed data, then GAIN processing is performed on the data, and finally filter processing is performed to obtain mixed voice data. The system for generating mixed voice data according to the present disclosure includes a collecting unit, a calculating unit, and a storage unit, the collecting unit being electrically connected to the calculating unit, and the calculating unit being connected to the storage unit through a data transmitting unit. The present disclosure provides the method and the system to meet the data requirement of deep learning.

Type: Grant

Filed: May 11, 2020

Date of Patent: November 22, 2022

Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.

Inventors: Yuanxun Kang, Zehuang Fang, Wanjian Feng
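
Illustrative sketch (not taken from the patent): the abstract above names four processing stages, normalization, randomization, GAIN, and filtering, without giving their details. The Python sketch below shows one way such a pipeline might look. Everything specific in it is an assumption made for illustration: peak normalization, a random noise segment as the randomization step, a random gain drawn in decibels, additive mixing, a moving-average filter, and all function and parameter names.

```python
import numpy as np


def normalize(x):
    """Scale a signal to unit peak amplitude (assumed form of normalization)."""
    x = np.asarray(x, dtype=np.float64)
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x


def random_segment(x, length, rng):
    """Pick a random window of `length` samples, tiling the signal if it is
    too short (assumed form of randomization)."""
    if len(x) < length:
        x = np.tile(x, int(np.ceil(length / len(x))))
    start = rng.integers(0, len(x) - length + 1)
    return x[start:start + length]


def mix_voice(voice, noise, rng, gain_db_range=(-20.0, 0.0), taps=5):
    """Hypothetical pipeline: normalize, randomize, apply gain, mix, filter."""
    voice = normalize(voice)
    noise = normalize(noise)
    noise = random_segment(noise, len(voice), rng)

    # Assumed GAIN step: scale the noise by a random level given in dB.
    gain = 10.0 ** (rng.uniform(*gain_db_range) / 20.0)
    mixed = voice + gain * noise

    # Assumed filter step: a simple moving-average FIR low-pass filter.
    kernel = np.ones(taps) / taps
    return np.convolve(mixed, kernel, mode="same")
```

A call such as `mix_voice(voice, noise, np.random.default_rng(0))` returns an array of the same length as `voice`. Repeating the call with different noise recordings would yield many distinct mixtures from one clean recording, which is the kind of data volume the abstract says deep learning requires.
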

Method for constructing voice detection model and voice endpoint detection system

Patent number: 11295761

Abstract: The present disclosure discloses a method for constructing a voice detection model and a voice endpoint detection system, and belongs to the technical field of voice recognition. In the method for constructing a voice detection model according to the present disclosure, audio data is first collected and a mixed voice is synthesized, feature extraction is performed on the mixed voice to obtain a 62-dimensional feature, and then the 62-dimensional feature is input to a recurrent neural network (RNN) model for training to obtain a voice detection model. The voice endpoint detection system according to the present disclosure includes a collecting unit, a calculating unit, a transmitting unit, and a terminal, the collecting unit being electrically connected to the calculating unit, and the calculating unit and the terminal being respectively connected to the transmitting unit. The voice detection model can be applied to a real-time conference communication device.

Type: Grant

Filed: May 11, 2020

Date of Patent: April 5, 2022

Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.

Inventors: Zehuang Fang, Yuanxun Kang, Wanjian Feng

Methods and devices for RNN-based noise reduction in real-time conferences

Patent number: 11024324

Abstract: Disclosed herein is a method for RNN-based noise reduction in a real-time conference, comprising: performing frame-and-window for a speech signal to obtain a logarithmic spectrum of the speech signal, and placing the logarithmic spectrum into the RNN model to determine a noise reduction suppression coefficient, and then obtaining the denoised speech signal by applying the noise reduction suppression coefficient to the logarithmic spectrum of the original signal, thereby achieving utilization of the RNN noise reduction method in real-time conferences. In the present disclosure, when inputting the RNN model for estimation, only the logarithmic spectrum of the current frame needs to be inputted. The RNN model of the present disclosure has few requirements on inputted information, without performing huge preprocessing on the received speech signal, which in turn reduces computation burden, increases response speed, and enhances real-time performance.

Type: Grant

Filed: August 22, 2018

Date of Patent: June 1, 2021

Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.

Inventor: Yuanxun Kang
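
Illustrative sketch (not taken from the patent): the abstract above describes framing and windowing a speech signal, feeding the logarithmic spectrum of the current frame to an RNN, and applying the resulting suppression coefficient to the spectrum. The Python sketch below walks through those steps under stated assumptions. The frame size, hop size, periodic Hann window, overlap-add reconstruction, reuse of the noisy phase, and the `TinyRNN` class (an untrained stand-in with random weights) are all hypothetical choices, not details from the patent.

```python
import numpy as np

FRAME = 512  # assumed frame length in samples
HOP = 256    # assumed hop size (50% overlap)


class TinyRNN:
    """Untrained stand-in for the RNN model; weights are random (hypothetical)."""

    def __init__(self, n_bins, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.standard_normal((hidden, n_bins)) * 0.01
        self.w_rec = rng.standard_normal((hidden, hidden)) * 0.01
        self.w_out = rng.standard_normal((n_bins, hidden)) * 0.01
        self.h = np.zeros(hidden)

    def step(self, log_spec):
        """Take only the current frame's log spectrum; earlier context lives
        in the recurrent state. Returns per-bin coefficients in (0, 1)."""
        self.h = np.tanh(self.w_in @ log_spec + self.w_rec @ self.h)
        return 1.0 / (1.0 + np.exp(-(self.w_out @ self.h)))


def denoise(signal, model):
    """Frame, window, estimate a suppression coefficient, and overlap-add."""
    signal = np.asarray(signal, dtype=np.float64)
    n = np.arange(FRAME)
    # Periodic Hann window: overlapping copies sum to 1 at 50% overlap.
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / FRAME)
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - FRAME + 1, HOP):
        spec = np.fft.rfft(signal[start:start + FRAME] * window)
        log_spec = np.log(np.abs(spec) + 1e-8)
        coeff = model.step(log_spec)
        # Apply the coefficient in the log domain: log|X| + log(coeff).
        magnitude = np.exp(log_spec + np.log(coeff))
        frame = np.fft.irfft(magnitude * np.exp(1j * np.angle(spec)), n=FRAME)
        out[start:start + FRAME] += frame
    return out
```

Calling `denoise(x, TinyRNN(FRAME // 2 + 1))` runs the full loop, but because the stand-in model is untrained the output is not meaningfully denoised. The point of the sketch is the per-frame data flow: each call to `step` sees one frame only, which is the property the abstract credits for the low computation burden.
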

METHODS AND DEVICES FOR RNN-BASED NOISE REDUCTION IN REAL-TIME CONFERENCES

Publication number: 20210035594

Abstract: Disclosed herein is a method for RNN-based noise reduction in a real-time conference, comprising: performing frame-and-window for a speech signal to obtain a logarithmic spectrum of the speech signal, and placing the logarithmic spectrum into the RNN model to determine a noise reduction suppression coefficient, and then obtaining the denoised speech signal by applying the noise reduction suppression coefficient to the logarithmic spectrum of the original signal, thereby achieving utilization of the RNN noise reduction method in real-time conferences. In the present disclosure, when inputting the RNN model for estimation, only the logarithmic spectrum of the current frame needs to be inputted. The RNN model of the present disclosure has few requirements on inputted information, without performing huge preprocessing on the received speech signal, which in turn reduces computation burden, increases response speed, and enhances real-time performance.

Type: Application

Filed: August 22, 2018

Publication date: February 4, 2021

Applicant: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.

Inventor: Yuanxun Kang

METHOD AND SYSTEM FOR GENERATING MIXED VOICE DATA

Publication number: 20200365174

Abstract: The present disclosure discloses a method and system for generating mixed voice data, and belongs to the technical field of voice recognition. In the method for generating mixed voice data according to the present disclosure, a pure voice and noise are collected first, normalization processing is performed on the collected voice data, randomization processing is performed on processed data, then GAIN processing is performed on the data, and finally filter processing is performed to obtain mixed voice data. The system for generating mixed voice data according to the present disclosure includes a collecting unit, a calculating unit, and a storage unit, the collecting unit being electrically connected to the calculating unit, and the calculating unit being connected to the storage unit through a data transmitting unit. The present disclosure provides the method and the system to meet the data requirement of deep learning.

Type: Application

Filed: May 11, 2020

Publication date: November 19, 2020

Inventors: Yuanxun KANG, Zehuang FANG, Wanjian FENG

METHOD FOR CONSTRUCTING VOICE DETECTION MODEL AND VOICE ENDPOINT DETECTION SYSTEM

Publication number: 20200365173

Abstract: The present disclosure discloses a method for constructing a voice detection model and a voice endpoint detection system, and belongs to the technical field of voice recognition. In the method for constructing a voice detection model according to the present disclosure, audio data is first collected and a mixed voice is synthesized, feature extraction is performed on the mixed voice to obtain a 62-dimensional feature, and then the 62-dimensional feature is input to a recurrent neural network (RNN) model for training to obtain a voice detection model. The voice endpoint detection system according to the present disclosure includes a collecting unit, a calculating unit, a transmitting unit, and a terminal, the collecting unit being electrically connected to the calculating unit, and the calculating unit and the terminal being respectively connected to the transmitting unit. The voice detection model can be applied to a real-time conference communication device.

Type: Application

Filed: May 11, 2020

Publication date: November 19, 2020

Inventors: Zehuang FANG, Yuanxun KANG, Wanjian FENG