Patents by Inventor Qiongqiong WANG
Qiongqiong WANG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240038244
Abstract: A speaker subset selection means 81 selects speakers corresponding to an attribute from subset information of an entire speaker to determine a subset of a speech model from which test utterance is identified. A speaker identification means 82 identifies a speaker of the test utterance from a subset of the determined speech model based on features extracted from the test utterance.
Type: Application
Filed: December 25, 2020
Publication date: February 1, 2024
Applicant: NEC Corporation
Inventors: Qiongqiong Wang, Takafumi KOSHINAKA
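The two-stage idea in the abstract above (narrow the candidates by an attribute, then identify only within that subset) can be sketched as follows. This is a minimal illustration, not the claimed method: the `site` attribute, the 2-dimensional speaker models, and cosine scoring are all assumptions introduced here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def select_subset(speakers, attribute, value):
    """Keep only the speakers whose attribute matches the requested value."""
    return {
        name: info
        for name, info in speakers.items()
        if info["attributes"].get(attribute) == value
    }

def identify(test_features, subset):
    """Return the speaker in the subset whose model is closest to the test features."""
    return max(subset, key=lambda name: cosine(test_features, subset[name]["model"]))

# Hypothetical enrollment data: attribute names and vectors are illustrative only.
speakers = {
    "spk_a": {"attributes": {"site": "tokyo"}, "model": [1.0, 0.0]},
    "spk_b": {"attributes": {"site": "tokyo"}, "model": [0.0, 1.0]},
    "spk_c": {"attributes": {"site": "osaka"}, "model": [0.9, 0.1]},
}

subset = select_subset(speakers, "site", "tokyo")
best = identify([0.8, 0.2], subset)
```

Restricting the search to the subset means `spk_c` is never scored, even though its model is also close to the test features.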
-
Publication number: 20230368809
Abstract: A speech enhancement means 81 determines an enhancement mask generated based on a mask for speech enhancement, when a test utterance is input as speech data. A first hyper-parameter optimization means 82 determines, when the test utterance is input, a first hyper-parameter which is a hyper-parameter representing the degree to which the signal representing the test utterance is kept using the mask, and the first hyper-parameter which is set to take into account a downstream task that is processed using an enhanced test utterance. A mask generation means 83 generates an adaptive mask from the determined enhancement mask and the first hyper-parameter that enhances the test utterance for the downstream task. The mask generation means 83 generates the adaptive mask in which the first hyper-parameter is a power of the mask.
Type: Application
Filed: October 15, 2020
Publication date: November 16, 2023
Applicant: NEC Corporation
Inventors: Qiongqiong WANG, Takafumi Koshinaka
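The last sentence of the abstract above says the hyper-parameter acts as a power of the mask. A minimal sketch of that operation, assuming a time-frequency mask with values in [0, 1] and a hypothetical exponent named `alpha` (how the exponent is optimized for the downstream task is not shown here):

```python
def adaptive_mask(mask, alpha):
    """Raise every mask value to the power alpha.

    For mask values in [0, 1], alpha = 1 keeps the mask unchanged and
    alpha = 0 turns it into all ones (the input passes through untouched).
    """
    return [[m ** alpha for m in row] for row in mask]

def apply_mask(spectrogram, mask):
    """Element-wise product of a magnitude spectrogram and a mask."""
    return [
        [s * m for s, m in zip(spec_row, mask_row)]
        for spec_row, mask_row in zip(spectrogram, mask)
    ]

# Illustrative values only: 2 frames x 2 frequency bins.
mask = [[0.25, 1.0], [0.81, 0.0]]
softened = adaptive_mask(mask, alpha=0.5)
enhanced = apply_mask([[2.0, 2.0], [2.0, 2.0]], softened)
```

With `alpha=0.5` the mask values move toward 1, so more of the original signal is kept than with the unmodified mask.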
-
Patent number: 11817103
Abstract: Provided is a pattern recognition apparatus to provide classification robustness to any kind of domain variability. The pattern recognition apparatus 500 based on Neural Network (NN) includes: NN training unit 501 that trains an NN model to generate NN parameters, based on at least one first feature vector and at least one domain vector indicating one of subsets in a specific domain, wherein the first feature vector is extracted from each of the subsets, the domain vector indicates an identifier corresponding to each of the subsets; and NN verification unit 502 that verifies a pair of second feature vectors in the specific domain to output whether the pair indicates the same individual or not, based on a target domain vector and the NN parameters.
Type: Grant
Filed: September 15, 2017
Date of Patent: November 14, 2023
Assignee: NEC CORPORATION
Inventors: Qiongqiong Wang, Takafumi Koshinaka
-
Patent number: 11798564
Abstract: A spoofing detection apparatus 100 includes a multi-channel spectrogram creation unit 10 and an evaluation unit 40. The multi-channel spectrogram creation unit 10 extracts different types of spectrograms from speech data and integrates the different types of spectrograms to create a multi-channel spectrogram. The evaluation unit 40 evaluates the created multi-channel spectrogram by applying the created multi-channel spectrogram to a classifier constructed using labeled multi-channel spectrograms as training data and classifies it as either genuine or spoof.
Type: Grant
Filed: June 28, 2019
Date of Patent: October 24, 2023
Assignee: NEC CORPORATION
Inventors: Qiongqiong Wang, Kong Aik Lee, Takafumi Koshinaka
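The abstract above describes stacking several spectrogram types into one multi-channel input. The sketch below illustrates only that stacking step, under stated assumptions: the two channel types (magnitude and log-power), the frame size, and the hop are choices made here for illustration, the DFT is a naive one to keep the example dependency-free, and the classifier is omitted.

```python
import cmath
import math

def split_frames(signal, size, hop):
    """Cut the signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def magnitude_spectrum(frame):
    """Magnitude of a naive DFT, keeping the non-negative frequency bins."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def multi_channel_spectrogram(signal, size=8, hop=4):
    """Return [magnitude, log-power], each shaped frames x bins."""
    magnitude = [magnitude_spectrum(f) for f in split_frames(signal, size, hop)]
    # The small constant avoids log(0) on empty bins.
    log_power = [[math.log(m * m + 1e-10) for m in row] for row in magnitude]
    return [magnitude, log_power]

# Toy input: a sine wave with two cycles per 8-sample frame.
signal = [math.sin(2 * math.pi * 2 * t / 8) for t in range(16)]
channels = multi_channel_spectrogram(signal)
```

The result is indexed as `channels[channel][frame][bin]`, which is the layout a 2-D convolutional classifier would typically consume after conversion to an array.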
-
Patent number: 11600273
Abstract: The speech processing apparatus 100 includes an air microphone speech recognition unit 101 which recognizes speech from an air microphone 200 acquiring speech through air, a wearable microphone speech recognition unit 102 which recognizes speech from a wearable microphone 300, a sensing unit 103 which measures environmental conditions, a weight decision unit 104 which calculates the weights for recognition results of the air microphone speech recognition unit 101 and the wearable microphone speech recognition unit 102 on the basis of the environmental conditions, and a combination unit 105 which combines the recognition results outputted from the air microphone speech recognition unit 101 and the wearable microphone speech recognition unit 102, using the weights.
Type: Grant
Filed: February 14, 2018
Date of Patent: March 7, 2023
Assignee: NEC CORPORATION
Inventors: Qiongqiong Wang, Takafumi Koshinaka
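The weight decision and combination steps in the abstract above can be sketched as a weighted sum of per-hypothesis scores. Everything specific here is an assumption for illustration: ambient noise level in dB as the measured condition, a logistic mapping from noise to weight, and the `midpoint_db` and `slope` values.

```python
import math

def decide_weights(noise_db, midpoint_db=60.0, slope=0.2):
    """Map a measured noise level to (air weight, wearable weight).

    Louder environments shift trust toward the wearable microphone.
    The logistic form and its parameters are illustrative assumptions.
    """
    w_wearable = 1.0 / (1.0 + math.exp(-slope * (noise_db - midpoint_db)))
    return 1.0 - w_wearable, w_wearable

def combine(air_scores, wearable_scores, w_air, w_wearable):
    """Weighted sum of the two recognizers' scores for each hypothesis."""
    hypotheses = set(air_scores) | set(wearable_scores)
    return {
        h: w_air * air_scores.get(h, 0.0) + w_wearable * wearable_scores.get(h, 0.0)
        for h in hypotheses
    }

# Hypothetical recognition results: hypothesis -> score.
air = {"turn left": 0.7, "turn right": 0.3}
wearable = {"turn left": 0.2, "turn right": 0.8}

w_air, w_wearable = decide_weights(noise_db=80.0)
combined = combine(air, wearable, w_air, w_wearable)
best = max(combined, key=combined.get)
```

In this toy setup a noisy environment (80 dB) lets the wearable microphone's result dominate, while a quiet one (for example 40 dB) favors the air microphone.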
-
Patent number: 11580967
Abstract: A speech feature extraction apparatus 100 includes a voice activity detection unit 103 that drops non-voice frames from frames corresponding to an input speech utterance, and calculates a posterior of being voiced for each frame, a voice activity detection process unit 106 that calculates a function value as weights in pooling frames to produce an utterance-level feature, from a given voice activity detection posterior, and an utterance-level feature extraction unit 112 that extracts an utterance-level feature, from the frame on a basis of multiple frame-level features, using the function values.
Type: Grant
Filed: June 29, 2018
Date of Patent: February 14, 2023
Assignee: NEC CORPORATION
Inventors: Qiongqiong Wang, Koji Okabe, Kong Aik Lee, Takafumi Koshinaka
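The pooling described in the abstract above (drop non-voice frames, then weight the remaining frames by a function of their voice activity posterior) can be sketched as a weighted mean. The 0.5 threshold and the identity weighting function are assumptions made here; the abstract does not specify either.

```python
def pool_utterance(frame_features, vad_posteriors, threshold=0.5, weight_fn=lambda p: p):
    """Weighted mean of frame-level features, using VAD posteriors as weights.

    Frames whose posterior falls below `threshold` are dropped first.
    `weight_fn` turns a posterior into a pooling weight (identity by default).
    """
    kept = [
        (features, weight_fn(posterior))
        for features, posterior in zip(frame_features, vad_posteriors)
        if posterior >= threshold
    ]
    total = sum(weight for _, weight in kept)
    if total == 0:
        raise ValueError("no voiced frames left to pool")
    dim = len(kept[0][0])
    return [sum(weight * features[d] for features, weight in kept) / total for d in range(dim)]

# Illustrative data: three frames of 2-dimensional features.
frame_features = [[1.0, 0.0], [3.0, 2.0], [100.0, 100.0]]
vad_posteriors = [0.9, 0.6, 0.1]
utterance_feature = pool_utterance(frame_features, vad_posteriors)
```

The third frame is treated as non-voice and ignored, so its outlying values do not distort the utterance-level feature.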
-
Publication number: 20220358934
Abstract: A spoofing detection apparatus 100 includes a multi-channel spectrogram creation unit 10 and an evaluation unit 40. The multi-channel spectrogram creation unit 10 extracts different types of spectrograms from speech data and integrates the different types of spectrograms to create a multi-channel spectrogram. The evaluation unit 40 evaluates the created multi-channel spectrogram by applying the created multi-channel spectrogram to a classifier constructed using labeled multi-channel spectrograms as training data and classifies it as either genuine or spoof.
Type: Application
Filed: June 28, 2019
Publication date: November 10, 2022
Applicant: NEC Corporation
Inventors: Qiongqiong WANG, Kong Aik LEE, Takafumi KOSHINAKA
-
Publication number: 20220335950
Abstract: A spoofing detection apparatus 100 includes a multi-channel spectrogram creation unit 10 and an evaluation unit 40. The multi-channel spectrogram creation unit 10 extracts different types of spectrograms from speech data and integrates the different types of spectrograms to create a multi-channel spectrogram. The evaluation unit 40 evaluates the created multi-channel spectrogram by applying the created multi-channel spectrogram to a classifier constructed using labeled multi-channel spectrograms as training data and classifies it as either genuine or spoof.
Type: Application
Filed: October 18, 2019
Publication date: October 20, 2022
Applicant: NEC Corporation
Inventors: Qiongqiong WANG, Takafumi KOSHINAKA, Kong Aik LEE
-
Patent number: 11403545
Abstract: A pattern recognition apparatus for discriminative training includes: a similarity calculator that calculates similarities among training data; a statistics calculator that calculates statistics from the similarities in accordance with current labels for the training data; and a discriminative probabilistic linear discriminant analysis (PLDA) trainer that receives the training data, the statistics of the training data, the current labels and PLDA parameters, and updates the PLDA parameters and the labels of the training data.
Type: Grant
Filed: March 9, 2017
Date of Patent: August 2, 2022
Assignee: NEC CORPORATION
Inventors: Qiongqiong Wang, Takafumi Koshinaka
-
Publication number: 20220130397
Abstract: A speaker recognition system includes a non-transitory computer readable medium configured to store instructions. The speaker recognition system further includes a processor connected to the non-transitory computer readable medium. The processor is configured to execute the instructions for extracting acoustic features from each frame of a plurality of frames in input speech data. The processor is configured to execute the instructions for calculating a saliency value for each frame of the plurality of frames using a first neural network (NN) based on the extracted acoustic features, wherein the first NN is a trained NN using speaker posteriors. The processor is configured to execute the instructions for extracting a speaker feature using the saliency value for each frame of the plurality of frames.
Type: Application
Filed: February 5, 2020
Publication date: April 28, 2022
Inventors: Qiongqiong WANG, Koji OKABE, Takafumi KOSHINAKA
-
Publication number: 20210390158
Abstract: A covariance matrix computation unit 81 computes a pseudo-in-domain covariance matrix from one or both of a within class covariance matrix and a between class covariance matrix of an out-of-domain Probabilistic Linear Discriminant Analysis (PLDA) model. A simultaneous diagonalization unit 82 computes a generalized eigenvalue and an eigenvector for a pseudo-in-domain covariance matrix and the class covariance matrix of the out-of-domain PLDA model on the basis of simultaneous diagonalization. An adaptation unit 83 computes one or both of a within class covariance matrix and a between class covariance matrix of an in-domain PLDA model using the generalized eigenvalues and eigenvectors. The covariance matrix computation unit 81 computes the pseudo-in-domain covariance matrix based on the out-of-domain PLDA model and a covariance matrix of in-domain data.
Type: Application
Filed: March 28, 2019
Publication date: December 16, 2021
Applicant: NEC Corporation
Inventors: Kong Aik LEE, Qiongqiong WANG, Takafumi KOSHINAKA
-
Publication number: 20210256970
Abstract: A speech feature extraction apparatus 100 includes a voice activity detection unit 103 that drops non-voice frames from frames corresponding to an input speech utterance, and calculates a posterior of being voiced for each frame, a voice activity detection process unit 106 that calculates a function value as weights in pooling frames to produce an utterance-level feature, from a given voice activity detection posterior, and an utterance-level feature extraction unit 112 that extracts an utterance-level feature, from the frame on a basis of multiple frame-level features, using the function values.
Type: Application
Filed: June 29, 2018
Publication date: August 19, 2021
Applicant: NEC Corporation
Inventors: Qiongqiong WANG, Koji OKABE, Kong Aik LEE, Takafumi KOSHINAKA
-
Publication number: 20210027778
Abstract: The speech processing apparatus 100 includes an air microphone speech recognition unit 101 which recognizes speech from an air microphone 200 acquiring speech through air, a wearable microphone speech recognition unit 102 which recognizes speech from a wearable microphone 300, a sensing unit 103 which measures environmental conditions, a weight decision unit 104 which calculates the weights for recognition results of the air microphone speech recognition unit 101 and the wearable microphone speech recognition unit 102 on the basis of the environmental conditions, and a combination unit 105 which combines the recognition results outputted from the air microphone speech recognition unit 101 and the wearable microphone speech recognition unit 102, using the weights.
Type: Application
Filed: February 14, 2018
Publication date: January 28, 2021
Applicant: NEC Corporation
Inventors: Qiongqiong WANG, Takafumi KOSHINAKA
-
Patent number: 10803875
Abstract: A speaker recognition system includes a non-transitory computer readable medium configured to store instructions. The speaker recognition system further includes a processor connected to the non-transitory computer readable medium. The processor is configured to execute the instructions for extracting acoustic features from each frame of a plurality of frames in input speech data. The processor is configured to execute the instructions for calculating a saliency value for each frame of the plurality of frames using a first neural network (NN) based on the extracted acoustic features, wherein the first NN is a trained NN using speaker posteriors. The processor is configured to execute the instructions for extracting a speaker feature using the saliency value for each frame of the plurality of frames.
Type: Grant
Filed: February 8, 2019
Date of Patent: October 13, 2020
Assignee: NEC CORPORATION
Inventors: Qiongqiong Wang, Koji Okabe, Takafumi Koshinaka
-
Publication number: 20200258527
Abstract: A speaker recognition system includes a non-transitory computer readable medium configured to store instructions. The speaker recognition system further includes a processor connected to the non-transitory computer readable medium. The processor is configured to execute the instructions for extracting acoustic features from each frame of a plurality of frames in input speech data. The processor is configured to execute the instructions for calculating a saliency value for each frame of the plurality of frames using a first neural network (NN) based on the extracted acoustic features, wherein the first NN is a trained NN using speaker posteriors. The processor is configured to execute the instructions for extracting a speaker feature using the saliency value for each frame of the plurality of frames.
Type: Application
Filed: February 8, 2019
Publication date: August 13, 2020
Inventors: Qiongqiong WANG, Koji OKABE, Takafumi KOSHINAKA
-
Publication number: 20200211567
Abstract: Provided is a pattern recognition apparatus to provide classification robustness to any kind of domain variability. The pattern recognition apparatus 500 based on Neural Network (NN) includes: NN training unit 501 that trains an NN model to generate NN parameters, based on at least one first feature vector and at least one domain vector indicating one of subsets in a specific domain, wherein the first feature vector is extracted from each of the subsets, the domain vector indicates an identifier corresponding to each of the subsets; and NN verification unit 502 that verifies a pair of second feature vectors in the specific domain to output whether the pair indicates the same individual or not, based on a target domain vector and the NN parameters.
Type: Application
Filed: September 15, 2017
Publication date: July 2, 2020
Applicant: NEC Corporation
Inventors: Qiongqiong WANG, Takafumi KOSHINAKA
-
Patent number: 10614343
Abstract: A pattern recognition apparatus using domain adaptation 10 comprises an estimation unit 11. The estimation unit 11 estimates PLDA (Probabilistic Linear Discriminant Analysis) parameters and transformation parameters from features of a first domain data and a second domain data so as to maximize/minimize an objective function with respect to the features.
Type: Grant
Filed: September 16, 2015
Date of Patent: April 7, 2020
Assignee: NEC CORPORATION
Inventors: Qiongqiong Wang, Takafumi Koshinaka
-
Publication number: 20190347565
Abstract: A pattern recognition apparatus for discriminative training includes: a similarity calculator that calculates similarities among training data; a statistics calculator that calculates statistics from the similarities in accordance with current labels for the training data; and a discriminative probabilistic linear discriminant analysis (PLDA) trainer that receives the training data, the statistics of the training data, the current labels and PLDA parameters, and updates the PLDA parameters and the labels of the training data.
Type: Application
Filed: March 9, 2017
Publication date: November 14, 2019
Applicant: NEC Corporation
Inventors: Qiongqiong WANG, Takafumi KOSHINAKA
-
Publication number: 20180253628
Abstract: A pattern recognition apparatus using domain adaptation 10 comprises an estimation unit 11. The estimation unit 11 estimates PLDA (Probabilistic Linear Discriminant Analysis) parameters and transformation parameters from features of a first domain data and a second domain data so as to maximize/minimize an objective function with respect to the features.
Type: Application
Filed: September 16, 2015
Publication date: September 6, 2018
Applicant: NEC Corporation
Inventors: Qiongqiong WANG, Takafumi KOSHINAKA