Patents by Inventor Tomohiro Nakatani

Tomohiro Nakatani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SIGNAL PROCESSING DEVICE, SIGNAL PROCESSING METHOD, AND SIGNAL PROCESSING PROGRAM

Publication number: 20230067132

Abstract: A signal processing apparatus includes a neural network (“NN”), a sorting unit, and a spatial covariance matrix calculation unit. The NN receives a mixed signal, in which sounds of a plurality of sound sources input over a plurality of channels are mixed, and converts it directly in the time domain into separated signals, one for each sound source, which it outputs. The sorting unit sorts the separated signals of each channel output from the NN such that the sound sources of the separated signals are aligned among the plurality of channels. The spatial covariance matrix calculation unit calculates a spatial covariance matrix corresponding to each sound source from the sorted separated signals of each channel output from the sorting unit.

Type: Application

Filed: February 14, 2020

Publication date: March 2, 2023

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Tsubasa OCHIAI, Marc DELCROIX, Rintaro IKESHITA, Keisuke KINOSHITA, Tomohiro NAKATANI, Shoko ARAKI
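
The sorting and spatial covariance steps named in the abstract of publication 20230067132 can be illustrated with a small NumPy sketch. It is not the method of the application: the correlation-based alignment against channel 0, the time-domain averaging, and the function names `sort_sources` and `spatial_covariance` are all assumptions made for illustration, and the neural separation stage is taken as already done.

```python
import itertools
import numpy as np

def sort_sources(separated):
    """Align the source order across channels (illustrative assumption).

    separated: real array of shape (channels, sources, samples).
    Channel 0 is the reference; the sources of every other channel are
    permuted so that they correlate best with the reference sources.
    """
    n_ch, n_src, _ = separated.shape
    ref = separated[0]
    aligned = separated.copy()
    for c in range(1, n_ch):
        def score(perm):
            # Total absolute correlation between reference and permuted sources.
            return sum(abs(np.dot(ref[s], separated[c, perm[s]]))
                       for s in range(n_src))
        best = max(itertools.permutations(range(n_src)), key=score)
        aligned[c] = separated[c, list(best)]
    return aligned

def spatial_covariance(aligned):
    """Per-source covariance across channels, averaged over time.

    aligned: (channels, sources, samples) -> (sources, channels, channels).
    """
    x = aligned.transpose(1, 0, 2)  # (sources, channels, samples)
    return np.einsum('sct,sdt->scd', x, x) / x.shape[-1]
```

The exhaustive search over permutations is only practical for a handful of sources, which is the usual case in this kind of separation task.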

Mask estimation apparatus, model learning apparatus, sound source separation apparatus, mask estimation method, model learning method, sound source separation method, and program

Patent number: 11562765

Abstract: A mask estimation apparatus for estimating mask information for specifying a mask used to extract a signal of a specific sound source from an input audio signal includes a converter which converts the input audio signal into embedded vectors of a predetermined dimension using a trained neural network model and a mask calculator which calculates the mask information by fitting the embedded vectors to a mixed Gaussian model.

Type: Grant

Filed: February 19, 2019

Date of Patent: January 24, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Takuya Higuchi, Tomohiro Nakatani, Keisuke Kinoshita
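
As a rough illustration of the mask calculation in the abstract of patent 11562765, the sketch below fits a Gaussian mixture model to embedding vectors with plain EM and uses the posteriors as soft masks. The spherical covariances, the random initialization, the iteration count, and the name `gmm_masks` are assumptions; the trained network that produces the embeddings is outside the sketch, and the patent's own fitting procedure may differ.

```python
import numpy as np

def gmm_masks(emb, n_src, n_iter=20, seed=0):
    """Fit a spherical Gaussian mixture to embeddings and return posteriors.

    emb: (points, dim) embedding vectors, one per time-frequency point.
    Returns (points, n_src) soft masks whose rows sum to 1.
    """
    rng = np.random.default_rng(seed)
    n, d = emb.shape
    means = emb[rng.choice(n, n_src, replace=False)]
    var = np.full(n_src, emb.var() + 1e-6)
    weights = np.full(n_src, 1.0 / n_src)
    for _ in range(n_iter):
        # E-step: posterior of each component for each embedding.
        sq = ((emb[:, None, :] - means[None]) ** 2).sum(-1)
        log_p = (np.log(weights) - 0.5 * d * np.log(2 * np.pi * var)
                 - 0.5 * sq / var)
        log_p -= log_p.max(axis=1, keepdims=True)
        post = np.exp(log_p)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means, and variances.
        nk = post.sum(axis=0) + 1e-10
        weights = nk / n
        means = (post.T @ emb) / nk[:, None]
        sq = ((emb[:, None, :] - means[None]) ** 2).sum(-1)
        var = (post * sq).sum(axis=0) / (d * nk) + 1e-6
    return post
```

Each column of the returned array can be reshaped to the time-frequency grid and applied to the mixture spectrogram as a mask for one source.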

Learning device and method for updating a parameter of a speech recognition model

Patent number: 11551667

Abstract: A learning device (10) includes a feature extracting unit (11) that extracts features of speech from speech data for training, a probability calculating unit (12) that, on the basis of the features of speech, performs prefix searching using a speech recognition model of which a neural network is representative, and calculates a posterior probability of a recognition character string to obtain a plurality of hypothetical character strings, an error calculating unit (13) that calculates an error by word error rates of the plurality of hypothetical character strings and a correct character string for training, and obtains a parameter for the entire model that minimizes an expected value of summation of loss in the word error rates, and an updating unit (14) that updates a parameter of the model in accordance with the parameter obtained by the error calculating unit (13).

Type: Grant

Filed: February 1, 2019

Date of Patent: January 10, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Shigeki Karita, Atsunori Ogawa, Marc Delcroix, Tomohiro Nakatani
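
The error term in the abstract of patent 11551667 is an expected word-error loss over several hypothesized strings. A minimal sketch of that quantity, assuming the hypotheses and their log posterior scores are already available from the prefix search, could look as follows. The function names are hypothetical, the posteriors are renormalized over the given hypotheses only, and raw word-error counts are used rather than rates.

```python
import math

def word_errors(hyp, ref):
    """Word-level Levenshtein distance between two word lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        cur = [i]
        for j, r in enumerate(ref, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (h != r)))  # substitution or match
        prev = cur
    return prev[-1]

def expected_word_errors(hyps, log_probs, ref):
    """Posterior-weighted word errors over a list of hypotheses.

    hyps: hypothesis strings; log_probs: their log posterior scores;
    ref: the correct string. Posteriors are renormalized over hyps.
    """
    m = max(log_probs)
    w = [math.exp(lp - m) for lp in log_probs]
    z = sum(w)
    ref_words = ref.split()
    return sum(wi / z * word_errors(h.split(), ref_words)
               for wi, h in zip(w, hyps))
```

In training, this scalar would be differentiated with respect to the model parameters through the log scores; the sketch only evaluates it.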

SPEECH SIGNAL PROCESSING DEVICE, SPEECH SIGNAL PROCESSING METHOD, SPEECH SIGNAL PROCESSING PROGRAM, TRAINING DEVICE, TRAINING METHOD, AND TRAINING PROGRAM

Publication number: 20220335965

Abstract: An audio signal processing apparatus (10) includes a first auxiliary feature conversion unit (12) and a second auxiliary feature conversion unit (13) that convert a plurality of signals relating to processing of an audio signal of a target speaker into a plurality of auxiliary features for the plurality of signals using a plurality of auxiliary neural networks corresponding to the plurality of signals, and an audio signal processing unit (11) that estimates information regarding an audio signal of the target speaker included in a mixed audio signal using a main neural network based on an input feature of the mixed audio signal and the plurality of auxiliary features, wherein the plurality of signals relating to processing of the audio signal of the target speaker are two or more pieces of information of different modalities.

Type: Application

Filed: August 7, 2020

Publication date: October 20, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Hiroshi SATO, Tsubasa OCHIAI, Keisuke KINOSHITA, Marc DELCROIX, Tomohiro NAKATANI, Atsunori OGAWA

Speech intelligibility calculating method, speech intelligibility calculating apparatus, and speech intelligibility calculating program

Patent number: 11462228

Abstract: A speech intelligibility calculating method is a method executed by a speech intelligibility calculating apparatus, the speech intelligibility calculating method including: a speech intelligibility calculating step of calculating a speech intelligibility that is an objective assessment index of a speech quality, based on a difference component between features found through an analysis of an input clean speech and an input enhanced speech, using one or more filter banks; and a step of outputting the speech intelligibility calculated at the speech intelligibility calculating step. This speech intelligibility calculating method is capable of calculating a speech intelligibility without any dependency on a speech enhancement method.

Type: Grant

Filed: August 3, 2018

Date of Patent: October 4, 2022

Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, WAKAYAMA UNIVERSITY

Inventors: Shoko Araki, Tomohiro Nakatani, Keisuke Kinoshita, Toshio Irino, Toshie Matsui, Katsuhiko Yamamoto

Estimation device, learning device, estimation method, learning method, and recording medium

Patent number: 11456003

Abstract: An estimation device includes a memory, and processing circuitry coupled to the memory and configured to receive an input of an input audio signal that is an audio signal in which sounds from a plurality of sound sources are mixed, and an input of supplemental information, and output an estimation result of mask information that identifies a mask for extracting a sound of any one of the sound sources included in an entire or a part of a signal included in the input audio signal, the signal being identified by the supplemental information, cause a neural network to iterate a process of outputting the estimation result of the mask information, and cause the neural network to output an estimation result of the mask information for a different sound source, by inputting a different piece of the supplemental information to the neural network at each iteration.

Type: Grant

Filed: January 29, 2019

Date of Patent: September 27, 2022

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki, Lukas Drude, Thilo Christoph Von Neumann

ESTIMATION DEVICE, ESTIMATION METHOD, AND ESTIMATION PROGRAM

Publication number: 20220301570

Abstract: A sound source separation filter information estimation device (10) estimates a covariance matrix having information on a correlation between sound source spectra and information on a correlation between channels as information on sound source separation filter information for separating an individual sound source signal from a mixed acoustic signal.

Type: Application

Filed: August 21, 2019

Publication date: September 22, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Rintaro IKESHITA, Nobutaka ITO, Tomohiro NAKATANI, Hiroshi SAWADA

Signal analysis device for modeling spatial characteristics of source signals, signal analysis method, and recording medium

Patent number: 11423924

Abstract: A signal analysis device includes a memory and processing circuitry coupled to the memory and configured to obtain, for spatial covariance matrices R_j (j is an integer equal to or larger than 1 and equal to or smaller than J) for modeling spatial characteristics of J (J is an integer equal to or larger than 2) source signals that are present in a mixed manner, a simultaneous decorrelation matrix P, defined as a matrix for which all P^H R_j P are diagonal matrices, and/or its Hermitian transpose P^H, as a parameter for decorrelating components corresponding to the J source signals for observation signal vectors based on observation signals acquired at I (I is an integer equal to or larger than 2) different positions.

Type: Grant

Filed: February 1, 2019

Date of Patent: August 23, 2022

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Nobutaka Ito, Tomohiro Nakatani, Shoko Araki
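
For the special case of two sources (J = 2), a matrix P that makes both P^H R_1 P and P^H R_2 P diagonal can be obtained from a generalized eigenvalue decomposition. The sketch below shows only that textbook construction, assuming both matrices are Hermitian positive definite; it is not the procedure claimed in patent 11423924, and for J > 2 an exact simultaneous diagonalization does not exist in general. The function name is hypothetical.

```python
import numpy as np

def simultaneous_decorrelation(r1, r2):
    """Return P such that P^H r1 P and P^H r2 P are both diagonal.

    r1, r2: Hermitian positive definite matrices of equal size.
    """
    l = np.linalg.cholesky(r2)            # r2 = l @ l^H
    l_inv = np.linalg.inv(l)
    c = l_inv @ r1 @ l_inv.conj().T       # whitened r1, Hermitian
    c = (c + c.conj().T) / 2              # remove round-off asymmetry
    _, u = np.linalg.eigh(c)              # c = u @ diag(eigvals) @ u^H
    return l_inv.conj().T @ u             # P = l^{-H} @ u
```

With this P, P^H r2 P is the identity and P^H r1 P holds the generalized eigenvalues on its diagonal.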

DETERMINATION DEVICE, TRAINING DEVICE, DETERMINATION METHOD, AND DETERMINATION PROGRAM

Publication number: 20220262356

Abstract: A reranking device includes a hypothesis input unit configured to receive input of N-best hypotheses associated with scores of a speech recognition accuracy, and a hypothesis selection unit configured to select two hypotheses to be determined from among the input N-best hypotheses. Further, there is a determination unit configured to determine which of the two hypotheses has the higher accuracy by using: first to M-th auxiliary models, each represented by a neural network capable of converting, when the selected two hypotheses are given, the two hypotheses into hidden state vectors, and determining which of the two hypotheses has the higher accuracy based on the hidden state vectors of the two hypotheses; and a main model represented by a neural network capable of determining which of the two hypotheses has the higher accuracy based on the hidden state vectors of the two hypotheses.

Type: Application

Filed: August 8, 2019

Publication date: August 18, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Atsunori OGAWA, Marc DELCROIX, Shigeki KARITA, Tomohiro NAKATANI
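
The pairwise determination in the abstract of publication 20220262356 can be turned into a reranker in several ways. The sketch below assumes one simple scheme, a single pass that keeps the winner of each comparison; the abstract does not state the selection order, so this is an assumption. The `first_is_better` callable is a hypothetical stand-in for the auxiliary and main neural models.

```python
from typing import Callable, Sequence

def rerank_pairwise(hyps: Sequence[str],
                    first_is_better: Callable[[str, str], bool]) -> str:
    """Pick one hypothesis from an N-best list by repeated pairwise comparison.

    first_is_better(a, b) returns True when hypothesis a is judged more
    accurate than hypothesis b. The current winner is compared with each
    remaining candidate in turn.
    """
    if not hyps:
        raise ValueError("need at least one hypothesis")
    best = hyps[0]
    for cand in hyps[1:]:
        if not first_is_better(best, cand):
            best = cand
    return best
```

This uses N - 1 comparisons for N hypotheses, and the result depends on the comparator being reasonably consistent across pairs.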

ABSTRACT GENERATION DEVICE, METHOD, PROGRAM, AND RECORDING MEDIUM

Publication number: 20220189468

Abstract: A speech recognition unit (12) converts an input utterance sequence into a confusion network sequence constituted by a k-best of candidate words of speech recognition results; a lattice generating unit (14) generates a lattice sequence having the candidate words as internal nodes and a combination of k words among the candidate words for an identical speech as an external node, in which edges are extended between internal nodes other than internal nodes included in an identical external node, from the confusion network sequence; an integer programming problem generating unit (16) generates an integer programming problem for selecting a path that maximizes an objective function including at least a coverage score of an important word, of paths following the internal nodes with the edges extended, in the lattice sequence; and the summary generating unit generates a high-quality summary having less speech recognition errors and low redundancy using candidate words indicated by the internal nodes included in the

Type: Application

Filed: January 16, 2020

Publication date: June 16, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Tsutomu HIRAO, Atsunori OGAWA, Tomohiro NAKATANI, Masaaki NAGATA

NOISE SPATIAL COVARIANCE MATRIX ESTIMATION APPARATUS, NOISE SPATIAL COVARIANCE MATRIX ESTIMATION METHOD, AND PROGRAM

Publication number: 20220130406

Abstract: A time-variant noise spatial covariance matrix is estimated effectively. Using time-frequency-divided observation signals based on observation signals acquired by collecting acoustic signals emitted from one or a plurality of sound sources and mask information expressing the occupancy probability of a component of each of the time-frequency-divided observation signals that corresponds to each noise source, a time-independent first noise spatial covariance matrix corresponding to the time-frequency-divided observation signals and the mask information belonging to a long time interval is acquired for each noise source. Further, using the mask information of each of a plurality of different short time intervals, a mixture weight corresponding to each noise source in each short time interval is acquired.

Type: Application

Filed: February 28, 2020

Publication date: April 28, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Tomohiro NAKATANI, Marc DELCROIX, Keisuke KINOSHITA, Shoko ARAKI, Yuki KUBO
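
The two quantities in the abstract of publication 20220130406, a time-independent covariance per noise source and a mixture weight per short interval, might be sketched as below for a single frequency bin. Several points are assumptions not stated in the abstract: the mask-weighted average used for the covariance, the mean mask per block used as the mixture weight, and the final combination of the two into a time-variant matrix. All function names are hypothetical.

```python
import numpy as np

def masked_scm(x, mask):
    """Mask-weighted spatial covariance over a long interval.

    x: (frames, channels) complex observations at one frequency bin.
    mask: (frames,) occupancy probabilities of one noise source.
    """
    num = np.einsum('t,tc,td->cd', mask, x, x.conj())
    return num / max(mask.sum(), 1e-10)

def block_weights(masks, block_len):
    """Mean mask per short interval (assumed form of the mixture weight).

    masks: (sources, frames) -> (sources, n_blocks).
    """
    n_src, n_frames = masks.shape
    n_blocks = n_frames // block_len
    trimmed = masks[:, :n_blocks * block_len]
    return trimmed.reshape(n_src, n_blocks, block_len).mean(axis=2)

def time_varying_scm(scms, weights):
    """Assumed combination: weighted sum of per-source covariances per block.

    scms: (sources, channels, channels); weights: (sources, n_blocks).
    """
    return np.einsum('kb,kcd->bcd', weights, scms)
```

Frames left over after the last full block are ignored by `block_weights`.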

Signal analysis device, signal analysis method, and signal analysis program

Patent number: 11302343

Abstract: A signal analysis device includes an estimation unit that models a sound source position occurrence probability matrix Q using a product of a sound source position probability matrix B and a sound source existence probability matrix A, and estimates at least one of the sound source position probability matrix B and the sound source existence probability matrix A based on the modeling, the sound source position occurrence probability matrix Q being composed of probabilities of arrival of a signal from each sound source position candidate per frame, which is a time section, with respect to a plurality of sound source position candidates. The sound source position probability matrix B is composed of probabilities of arrival of a signal from each sound source position candidate per sound source with respect to a plurality of sound sources.

Type: Grant

Filed: April 4, 2019

Date of Patent: April 12, 2022

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Nobutaka Ito, Tomohiro Nakatani, Shoko Araki
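
The model Q = B A in the abstract of patent 11302343 has the form of a nonnegative matrix factorization. The sketch below uses generic multiplicative updates for the Euclidean cost as a stand-in; the abstract does not say which estimation algorithm the patent uses, so the update rule, the iteration count, and the name `factorize` are assumptions.

```python
import numpy as np

def factorize(q, n_src, n_iter=200, seed=0):
    """Approximate a nonnegative matrix q by b @ a.

    q: (positions, frames) occurrence probabilities.
    Returns b (positions, n_src) with columns summing to 1, and
    a (n_src, frames), rescaled so that b @ a is unchanged.
    """
    rng = np.random.default_rng(seed)
    d, t = q.shape
    b = rng.random((d, n_src)) + 0.1
    a = rng.random((n_src, t)) + 0.1
    eps = 1e-12
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        a *= (b.T @ q) / (b.T @ b @ a + eps)
        b *= (q @ a.T) / (b @ a @ a.T + eps)
    # Normalize each column of b into a distribution over positions.
    scale = b.sum(axis=0, keepdims=True) + eps
    return b / scale, a * scale.T
```

Each column of `b` then reads as a distribution over position candidates for one source, and each column of `a` as the per-frame activity of the sources.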

Neural network based signal processing device, neural network based signal processing method, and signal processing program

Patent number: 11304000

Abstract: A signal processing device includes a power estimating unit that treats the feature quantity of a signal including reverberation as the input; inputs an observation feature quantity corresponding to an observation signal to a neural network which is learnt in such a way that the estimate value of the feature quantity corresponding to the power of the signal having reduced reverberation, from among the input signal, is output; and estimates the estimate value of the feature quantity corresponding to the power of the signal having reduced reverberation and corresponding to the observation signal. Moreover, the signal processing device includes a regression coefficient estimating unit that uses the estimate value of the feature quantity corresponding to the power as obtained as the estimation result by the power estimating unit, and estimates a regression coefficient of the autoregressive process for generating the observation signal.

Type: Grant

Filed: August 1, 2018

Date of Patent: April 12, 2022

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Keisuke Kinoshita, Tomohiro Nakatani, Marc Delcroix
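
The regression coefficient step in the abstract of patent 11304000 resembles a power-weighted linear prediction. The sketch below handles one channel and one frequency bin and takes the power estimate as given, since the neural network that would supply it is not modelled. The delay, the tap count, the small ridge term, and the name `estimate_regression` are assumptions, and the patent's exact formulation may differ.

```python
import numpy as np

def estimate_regression(x, power, delay=3, taps=5):
    """Power-weighted delayed linear prediction for one frequency bin.

    x: (frames,) complex observed signal.
    power: (frames,) positive power estimate of the desired signal.
    Returns (g, y): the regression coefficients and the signal left
    after subtracting the predicted late reverberation.
    """
    n = len(x)
    # Column k holds the observation delayed by (delay + k) frames.
    past = np.zeros((n, taps), dtype=complex)
    for k in range(taps):
        shift = delay + k
        past[shift:, k] = x[:n - shift]
    w = 1.0 / np.maximum(power, 1e-10)
    r = (past.conj().T * w) @ past   # weighted correlation matrix
    p = (past.conj().T * w) @ x      # weighted correlation vector
    g = np.linalg.solve(r + 1e-8 * np.eye(taps), p)
    return g, x - past @ g
```

Setting `power` to all ones reduces this to ordinary least squares, which is convenient for checking the code on synthetic data.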

SIGNAL PROCESSING APPARATUS, LEARNING APPARATUS, SIGNAL PROCESSING METHOD, LEARNING METHOD AND PROGRAM

Publication number: 20220076690

Abstract: A signal processing device according to an embodiment of the present invention includes: a conversion unit configured to convert an input mixed acoustic signal into a plurality of first internal states, a weighting unit configured to generate a second internal state which is a weighted sum of the plurality of first internal states based on auxiliary information regarding an acoustic signal of a target sound source when the auxiliary information is input, and generate the second internal state by selecting one of the plurality of first internal states when the auxiliary information is not input, and a mask estimation unit configured to estimate a mask based on the second internal state.

Type: Application

Filed: February 12, 2020

Publication date: March 10, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Tsubasa OCHIAI, Marc DELCROIX, Keisuke KINOSHITA, Atsunori OGAWA, Tomohiro NAKATANI
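
The weighting unit in the abstract of publication 20220076690 switches between a weighted sum and a plain selection depending on whether auxiliary information is present. A minimal sketch of that switch is given below. The softmax over dot products used to derive the weights, the `default_index` parameter, and the function name are assumptions; the abstract says only that the sum is weighted based on the auxiliary information.

```python
import numpy as np

def combine_states(states, aux=None, default_index=0):
    """Form the second internal state from several first internal states.

    states: (n_states, dim) first internal states.
    aux: (dim,) auxiliary embedding of the target source, or None.
    With aux, returns a softmax-weighted sum of the states; without it,
    returns the state at default_index.
    """
    if aux is None:
        return states[default_index]
    scores = states @ aux
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w /= w.sum()
    return w @ states
```

A mask estimation network would then take the returned vector as its input in either case.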

SIGNAL PROCESSING APPARATUS, SIGNAL PROCESSING METHOD, AND PROGRAM

Publication number: 20220068288

Abstract: To sufficiently suppress noise and reverberation, a convolutional beamformer for calculating, at each time point, a weighted sum of a current signal and a past signal sequence having a predetermined delay and a length of 0 or more is acquired such that it increases a probability expressing a speech-likeness of estimation signals based on a predetermined probability model, where the estimation signals are acquired by applying the convolutional beamformer to frequency-divided observation signals corresponding respectively to a plurality of frequency bands of observation signals acquired by picking up acoustic signals emitted from a sound source, whereupon target signals are acquired by applying the acquired convolutional beamformer to the frequency-divided observation signals.

Type: Application

Filed: July 31, 2019

Publication date: March 3, 2022

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Tomohiro NAKATANI, Keisuke KINOSHITA

Acoustic model training method, speech recognition method, acoustic model training apparatus, speech recognition apparatus, acoustic model training program, and speech recognition program

Patent number: 11264044

Abstract: To begin with, an acoustic model training apparatus extracts speech features representing speech characteristics, and calculates an acoustic-condition feature representing a feature of an acoustic condition of the speech data using an acoustic-condition calculation model that is represented as a neural network, based on an acoustic-condition calculation model parameter characterizing the acoustic-condition calculation model. The acoustic model training apparatus then generates an adjusted parameter that is an acoustic model parameter adjusted based on the acoustic-condition feature, the acoustic model parameter characterizing an acoustic model represented as a neural network to which an output layer of the acoustic-condition calculation model is coupled. The acoustic model training apparatus then updates the acoustic model parameter based on the adjusted parameter and the speech features, and updates the acoustic-condition calculation model parameters based on the adjusted parameter and the speech features.

Type: Grant

Filed: January 26, 2017

Date of Patent: March 1, 2022

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Takuya Yoshioka, Tomohiro Nakatani

SIGNAL SEPARATION APPARATUS, SIGNAL SEPARATION METHOD AND PROGRAM

Publication number: 20210398549

Abstract: The signal separation device includes: cross product calculation means receiving an input of an observed signal that is a mixture of a plurality of target signals, and calculating a cross product of the observed signal; model calculation means updating a parameter of a model for estimating the cross product with a predetermined algorithm using an inverse matrix of a matrix that represents an estimate of the cross product; inverse matrix calculation means calculating the inverse matrix of a matrix by a SIMD command when the parameter is updated; and separation means calculating the target signals using a matrix representing an estimate of the cross product, the updated parameter, and the observed signal.

Type: Application

Filed: July 1, 2019

Publication date: December 23, 2021

Inventors: Hiroshi SAWADA, Rintaro IKESHITA, Nobutaka ITO, Tomohiro NAKATANI

NEURAL NETWORK BASED SIGNAL PROCESSING DEVICE, NEURAL NETWORK BASED SIGNAL PROCESSING METHOD, AND SIGNAL PROCESSING PROGRAM

Publication number: 20210400383

Abstract: A signal processing device includes a power estimating unit that treats the feature quantity of a signal including reverberation as the input; inputs an observation feature quantity corresponding to an observation signal to a neural network which is learnt in such a way that the estimate value of the feature quantity corresponding to the power of the signal having reduced reverberation, from among the input signal, is output; and estimates the estimate value of the feature quantity corresponding to the power of the signal having reduced reverberation and corresponding to the observation signal. Moreover, the signal processing device includes a regression coefficient estimating unit that uses the estimate value of the feature quantity corresponding to the power as obtained as the estimation result by the power estimating unit, and estimates a regression coefficient of the autoregressive process for generating the observation signal.

Type: Application

Filed: August 1, 2018

Publication date: December 23, 2021

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Keisuke KINOSHITA, Tomohiro NAKATANI, Marc DELCROIX

SPEECH INTELLIGIBILITY CALCULATING METHOD, SPEECH INTELLIGIBILITY CALCULATING APPARATUS, AND SPEECH INTELLIGIBILITY CALCULATING PROGRAM

Publication number: 20210375300

Abstract: A speech intelligibility calculating method is a method executed by a speech intelligibility calculating apparatus, the speech intelligibility calculating method including: a speech intelligibility calculating step of calculating a speech intelligibility that is an objective assessment index of a speech quality, based on a difference component between features found through an analysis of an input clean speech and an input enhanced speech, using one or more filter banks; and a step of outputting the speech intelligibility calculated at the speech intelligibility calculating step. This speech intelligibility calculating method is capable of calculating a speech intelligibility without any dependency on a speech enhancement method.

Type: Application

Filed: August 3, 2018

Publication date: December 2, 2021

Applicants: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Wakayama University

Inventors: Shoko ARAKI, Tomohiro NAKATANI, Keisuke KINOSHITA, Toshio IRINO, Toshie MATSUI, Katsuhiko YAMAMOTO

ESTIMATION DEVICE, LEARNING DEVICE, ESTIMATION METHOD, LEARNING METHOD, AND RECORDING MEDIUM

Publication number: 20210366502

Abstract: An estimation device includes a memory, and processing circuitry coupled to the memory and configured to receive an input of an input audio signal that is an audio signal in which sounds from a plurality of sound sources are mixed, and an input of supplemental information, and output an estimation result of mask information that identifies a mask for extracting a sound of any one of the sound sources included in an entire or a part of a signal included in the input audio signal, the signal being identified by the supplemental information, cause a neural network to iterate a process of outputting the estimation result of the mask information, and cause the neural network to output an estimation result of the mask information for a different sound source, by inputting a different piece of the supplemental information to the neural network at each iteration.

Type: Application

Filed: January 29, 2019

Publication date: November 25, 2021

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Keisuke KINOSHITA, Marc DELCROIX, Tomohiro NAKATANI, Shoko ARAKI, Lukas DRUDE, Thilo Christoph VON NEUMANN
