Patents by Inventor Tomohiro Nakatani
Tomohiro Nakatani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210216687
Abstract: A mask estimation apparatus includes processing circuitry configured to estimate, for a target segment to be processed among a plurality of segments of a continuous time, a first mask which is an occupancy ratio of a target signal to an observation signal of the target segment, based on a first feature obtained from a plurality of the observation signals of the target segment recorded at a plurality of locations, and estimate a parameter for modeling a second feature and a second mask which is an occupancy ratio of the target signal to the observation signal based on an estimation result of the first mask in the target segment and the second feature obtained from the plurality of the observation signals of the target segment.
Type: Application
Filed: August 23, 2019
Publication date: July 15, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Tomohiro NAKATANI, Marc DELCROIX, Keisuke KINOSHITA, Nobutaka ITO, Shoko ARAKI
-
Publication number: 20210193161
Abstract: To begin with, an acoustic model training apparatus extracts speech features representing speech characteristics, and calculates an acoustic-condition feature representing a feature of an acoustic condition of the speech data using an acoustic-condition calculation model that is represented as a neural network, based on an acoustic-condition calculation model parameter characterizing the acoustic-condition calculation model. The acoustic model training apparatus then generates an adjusted parameter that is an acoustic model parameter adjusted based on the acoustic-condition feature, the acoustic model parameter characterizing an acoustic model represented as a neural network to which an output layer of the acoustic-condition calculation model is coupled. The acoustic model training apparatus then updates the acoustic model parameter based on the adjusted parameter and the speech features, and updates the acoustic-condition calculation model parameters based on the adjusted parameter and the speech features.
Type: Application
Filed: January 26, 2017
Publication date: June 24, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Marc DELCROIX, Keisuke KINOSHITA, Atsunori OGAWA, Takuya YOSHIOKA, Tomohiro NAKATANI
-
Publication number: 20210056954
Abstract: A learning device (10) includes a feature extracting unit (11) that extracts features of speech from speech data for training, a probability calculating unit (12) that, on the basis of the features of speech, performs prefix searching using a speech recognition model represented by a neural network, and calculates a posterior probability of a recognition character string to obtain a plurality of hypothetical character strings, an error calculating unit (13) that calculates an error from the word error rates of the plurality of hypothetical character strings relative to a correct character string for training, and obtains a parameter for the entire model that minimizes an expected value of summation of loss in the word error rates, and an updating unit (14) that updates a parameter of the model in accordance with the parameter obtained by the error calculating unit (13).
Type: Application
Filed: February 1, 2019
Publication date: February 25, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Shigeki KARITA, Atsunori OGAWA, Marc DELCROIX, Tomohiro NAKATANI
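The quantity this abstract says is minimized, an expected word-error loss over several hypotheses, can be illustrated with a small calculation. The sketch below is not the patented training procedure: it only computes a posterior-weighted word-error count for a fixed hypothesis list. The function names, the example hypotheses, and their log posteriors are all assumptions made for illustration.

```python
import math


def word_errors(hyp, ref):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        cur = [i]
        for j, r in enumerate(ref, start=1):
            cur.append(min(prev[j] + 1,              # extra word in the hypothesis
                           cur[j - 1] + 1,           # reference word missing
                           prev[j - 1] + (h != r)))  # match or substitution
        prev = cur
    return prev[-1]


def expected_word_errors(hyps, log_posts, ref):
    """Average word errors of the hypotheses, weighted by normalized posteriors."""
    top = max(log_posts)
    weights = [math.exp(lp - top) for lp in log_posts]  # shift for numerical stability
    total = sum(weights)
    ref_words = ref.split()
    return sum(w / total * word_errors(h.split(), ref_words)
               for h, w in zip(hyps, weights))


# Hypothetical 3-best list with made-up posteriors 0.5, 0.3, 0.2.
hyps = ["the cat sat", "the cat sad", "a cat sat down"]
log_posts = [math.log(0.5), math.log(0.3), math.log(0.2)]
risk = expected_word_errors(hyps, log_posts, "the cat sat")
```

In a training setting this value would be differentiated with respect to the model parameters; that step is outside the scope of the sketch.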
-
Publication number: 20210049324
Abstract: Disclosed is a model adaptation technology of a language model with higher adaptability. An aspect of the present disclosure relates to an apparatus that includes a first neural network unit that transforms an input symbol and outputs an intermediate state; and a second neural network unit that transforms input auxiliary information and the intermediate state and predicts a symbol following the input symbol, wherein the second neural network unit includes a plurality of hidden layers receiving, as input, the intermediate state and auxiliary information, and pieces of the auxiliary information input to each hidden layer are different from each other.
Type: Application
Filed: February 18, 2019
Publication date: February 18, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Marc DELCROIX, Atsunori OGAWA, Tomohiro NAKATANI, Michael HENTSCHEL
-
Publication number: 20210035564
Abstract: A determination device includes a memory, and processing circuitry coupled to the memory and configured to accept input of a plurality of sequences provided as candidates for a solution to one given input, and determine, for two sequences of the plurality of sequences, a sequence that has a higher accuracy than the other sequence of the two sequences, using a model expressed as a neural network.
Type: Application
Filed: February 1, 2019
Publication date: February 4, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsunori OGAWA, Marc DELCROIX, Shigeki KARITA, Tomohiro NAKATANI
-
Publication number: 20210012790
Abstract: A signal analysis device (1) includes an estimation unit (10) that, when a parameter for modeling spatial characteristics of signals from N signal sources (where N is an integer equal to or larger than 2) is a spatial parameter, estimates a signal source position prior probability which is a mixture weight for modeling a prior distribution of the spatial parameter with respect to each signal source using a mixture distribution that is a linear combination of prior distributions of the spatial parameter with respect to K signal source position candidates (where K is an integer equal to or larger than 2), and which is a probability that a signal arrives from each signal source position candidate per signal source.
Type: Application
Filed: April 5, 2019
Publication date: January 14, 2021
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Nobutaka ITO, Tomohiro NAKATANI, Shoko ARAKI
-
Publication number: 20200411031
Abstract: A signal analysis device includes a memory and processing circuitry coupled to the memory and configured to obtain, for a spatial covariance matrix R_j (j is an integer equal to or larger than 1 and equal to or smaller than J) for modeling spatial characteristics of J (J is an integer equal to or larger than 2) source signals that are present in a mixed manner, a simultaneous decorrelation matrix P, that is, a matrix for which all P^H R_j P are diagonal matrices, and/or its Hermitian transpose P^H, as a parameter for decorrelating components corresponding to the J source signals for observation signal vectors based on observation signals acquired at I (I is an integer equal to or larger than 2) different positions.
Type: Application
Filed: February 1, 2019
Publication date: December 31, 2020
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Nobutaka ITO, Tomohiro NAKATANI, Shoko ARAKI
-
Publication number: 20200411027
Abstract: A signal analysis device includes an estimation unit that models a sound source position occurrence probability matrix Q using a product of a sound source position probability matrix B and a sound source existence probability matrix A, and estimates at least one of the sound source position probability matrix B and the sound source existence probability matrix A based on the modeling, the sound source position occurrence probability matrix Q being composed of probabilities of arrival of a signal from each sound source position candidate per frame, which is a time section, with respect to a plurality of sound source position candidates. The sound source position probability matrix B is composed of probabilities of arrival of a signal from each sound source position candidate per sound source with respect to a plurality of sound sources.
Type: Application
Filed: April 4, 2019
Publication date: December 31, 2020
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Nobutaka ITO, Tomohiro NAKATANI, Shoko ARAKI
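The modeling step in this abstract, Q expressed as the product of B and A, can be shown with a toy forward computation. The sketch below does not perform the estimation the abstract describes; it only builds Q from given B and A. The matrix sizes, the numeric values, and the assumption that every column of B and of A sums to 1 are illustrative choices, not taken from the source.

```python
def matmul(left, right):
    """Plain nested-list matrix product: (K x N) times (N x T) gives (K x T)."""
    inner = len(right)
    return [[sum(row[n] * right[n][t] for n in range(inner))
             for t in range(len(right[0]))]
            for row in left]


# B[k][n]: probability that the signal of source n arrives from position
# candidate k (assumed here: each column sums to 1). K = 3 candidates, N = 2 sources.
B = [[0.8, 0.1],
     [0.1, 0.2],
     [0.1, 0.7]]

# A[n][t]: probability that source n is present in frame t
# (assumed here: each column sums to 1). T = 3 frames.
A = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0]]

# Q[k][t]: probability that a signal arrives from position candidate k in frame t.
Q = matmul(B, A)

# Under the column-sum assumptions above, each frame's column of Q sums to 1.
col_sums = [sum(Q[k][t] for k in range(len(Q))) for t in range(len(Q[0]))]
```

Estimating B or A from an observed Q would be a factorization problem in the reverse direction, which this sketch leaves out.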
-
Patent number: 10878832
Abstract: A feature extraction unit in a mask estimation apparatus extracts, from a plurality of observation signals obtained by observing a plurality of acoustic signals at different positions, feature vectors obtained by collecting time-frequency components of the observation signals for each time-frequency point. A mask update unit uses the feature vectors, a mixture weight of each component distribution, and a shape parameter that is a model parameter capable of controlling a shape of each component distribution, where a probability distribution of the feature vectors is modeled by a mixture distribution consisting of a plurality of component distributions, to estimate masks indicating a proportion in which each component distribution contributes to each time-frequency point. A mixture weight update unit updates the mixture weight based on the updated masks. A parameter update unit updates the shape parameter by using the feature vectors and the masks.
Type: Grant
Filed: December 20, 2016
Date of Patent: December 29, 2020
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Nobutaka Ito, Shoko Araki, Tomohiro Nakatani
-
Publication number: 20200395037
Abstract: A mask estimation apparatus for estimating mask information for specifying a mask used to extract a signal of a specific sound source from an input audio signal includes a converter which converts the input audio signal into embedded vectors of a predetermined dimension using a trained neural network model and a mask calculator which calculates the mask information by fitting the embedded vectors to a mixed Gaussian model.
Type: Application
Filed: February 19, 2019
Publication date: December 17, 2020
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Takuya HIGUCHI, Tomohiro NAKATANI, Keisuke KINOSHITA
-
Publication number: 20200365143
Abstract: A learning device includes a memory, and processing circuitry coupled to the memory and configured to receive an input of a plurality of series for learning having known accuracy, and learn a model represented by a neural network, the model being capable of determining accuracy levels of two series when given feature amounts of the two series among the plurality of series.
Type: Application
Filed: February 1, 2019
Publication date: November 19, 2020
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Atsunori OGAWA, Marc DELCROIX, Shigeki KARITA, Tomohiro NAKATANI
-
Publication number: 20200143819
Abstract: A cluster weight calculator calculates weights corresponding to respective clusters in a mask calculation NN with at least one of the layers divided into the clusters, based on the signals of speech of a target speaker using a cluster weight calculation NN. A mask calculator calculates a mask for extracting features of speech of the target speaker from features in observed speech signals of one or more speakers based on the features in the observation signals of the speech of the one or more speakers using the mask calculation NN weighted by the weights calculated by the cluster weight calculator.
Type: Application
Filed: July 18, 2018
Publication date: May 7, 2020
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Marc DELCROIX, Keisuke KINOSHITA, Atsunori OGAWA, Takuya HIGUCHI, Tomohiro NAKATANI
-
Patent number: 10643633
Abstract: An observation feature value vector is calculated based on observation signals recorded at different positions in a situation in which target sound sources and background noise are present in a mixed manner; masks associated with the target sound sources and a mask associated with the background noise are estimated; a spatial correlation matrix of the target sound sources that includes the background noise is calculated based on the masks associated with the observation signals and the target sound sources; a spatial correlation matrix of the background noise is calculated based on the masks associated with the observation signals and the background noise; and a spatial correlation matrix of the target sound sources is estimated based on the matrix obtained by weighting each of the spatial correlation matrices by predetermined coefficients.
Type: Grant
Filed: December 1, 2016
Date of Patent: May 5, 2020
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Tomohiro Nakatani, Nobutaka Ito, Takuya Higuchi, Shoko Araki, Takuya Yoshioka
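The sequence of steps in this abstract (mask-weighted correlation matrices, then a weighted combination) can be illustrated for a single frequency bin. This is a minimal sketch under stated assumptions, not the patented method: the masks are given rather than estimated, the combination is taken to be a weighted sum, and the coefficients a = 1 and b = -1 are placeholder values standing in for the "predetermined coefficients". All function names and data are hypothetical.

```python
def outer(x):
    """Return the matrix x x^H for a complex vector x given as a list."""
    return [[xi * xj.conjugate() for xj in x] for xi in x]


def masked_correlation(frames, mask):
    """Mask-weighted average of x x^H over frames for one frequency bin."""
    size = len(frames[0])
    acc = [[0j] * size for _ in range(size)]
    for x, weight in zip(frames, mask):
        o = outer(x)
        for i in range(size):
            for j in range(size):
                acc[i][j] += weight * o[i][j]
    total = sum(mask)
    return [[value / total for value in row] for row in acc]


def target_correlation(frames, target_mask, noise_mask, a=1.0, b=-1.0):
    """Combine the two mask-weighted matrices as a * R_target_plus_noise + b * R_noise.

    The default coefficients are assumptions chosen for illustration only.
    """
    r_tn = masked_correlation(frames, target_mask)
    r_n = masked_correlation(frames, noise_mask)
    size = len(r_tn)
    return [[a * r_tn[i][j] + b * r_n[i][j] for j in range(size)]
            for i in range(size)]


# Two microphones, two frames: the first frame is target-dominant,
# the second contains only noise (toy values).
frames = [[2 + 0j, 2j], [1 + 0j, 1 + 0j]]
R = target_correlation(frames, target_mask=[1.0, 0.0], noise_mask=[0.0, 1.0])
```

With these toy inputs the result stays Hermitian, which is the property one would expect any such combination of correlation matrices to preserve when the coefficients are real.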
-
Publication number: 20190267019
Abstract: A feature extraction unit in a mask estimation apparatus extracts, from a plurality of observation signals obtained by observing a plurality of acoustic signals at different positions, feature vectors obtained by collecting time-frequency components of the observation signals for each time-frequency point. A mask update unit uses the feature vectors, a mixture weight of each component distribution, and a shape parameter that is a model parameter capable of controlling a shape of each component distribution, where a probability distribution of the feature vectors is modeled by a mixture distribution consisting of a plurality of component distributions, to estimate masks indicating a proportion in which each component distribution contributes to each time-frequency point. A mixture weight update unit updates the mixture weight based on the updated masks. A parameter update unit updates the shape parameter by using the feature vectors and the masks.
Type: Application
Filed: December 20, 2016
Publication date: August 29, 2019
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Nobutaka ITO, Shoko ARAKI, Tomohiro NAKATANI
-
Publication number: 20180366135
Abstract: An observation feature value vector is calculated based on observation signals recorded at different positions in a situation in which target sound sources and background noise are present in a mixed manner; masks associated with the target sound sources and a mask associated with the background noise are estimated; a spatial correlation matrix of the target sound sources that includes the background noise is calculated based on the masks associated with the observation signals and the target sound sources; a spatial correlation matrix of the background noise is calculated based on the masks associated with the observation signals and the background noise; and a spatial correlation matrix of the target sound sources is estimated based on the matrix obtained by weighting each of the spatial correlation matrices by predetermined coefficients.
Type: Application
Filed: December 1, 2016
Publication date: December 20, 2018
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Tomohiro NAKATANI, Nobutaka ITO, Takuya HIGUCHI, Shoko ARAKI, Takuya YOSHIOKA
-
Patent number: 9754608
Abstract: A noise estimation apparatus which estimates a non-stationary noise component on the basis of the likelihood maximization criterion is provided. The noise estimation apparatus obtains the variance of a noise signal that causes a large value to be obtained by weighted addition of the sums each of which is obtained by adding the product of the log likelihood of a model of an observed signal expressed by a Gaussian distribution in a speech segment and a speech posterior probability in each frame, and the product of the log likelihood of a model of an observed signal expressed by a Gaussian distribution in a non-speech segment and a non-speech posterior probability in each frame, by using complex spectra of a plurality of observed signals up to the current frame.
Type: Grant
Filed: January 30, 2013
Date of Patent: September 5, 2017
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Mehrez Souden, Keisuke Kinoshita, Tomohiro Nakatani, Marc Delcroix, Takuya Yoshioka
-
Patent number: 9208780
Abstract: The processing efficiency and estimation accuracy of a voice activity detection apparatus are improved. An acoustic signal analyzer receives a digital acoustic signal containing a speech signal and a noise signal, generates a non-speech GMM and a speech GMM adapted to a noise environment, by using a silence GMM and a clean-speech GMM in each frame of the digital acoustic signal, and calculates the output probabilities of dominant Gaussian distributions of the GMMs. A speech state probability to non-speech state probability ratio calculator calculates a speech state probability to non-speech state probability ratio based on a state transition model of a speech state and a non-speech state, by using the output probabilities; and a voice activity detection unit judges, from the speech state probability to non-speech state probability ratio, whether the acoustic signal in the frame is in the speech state or in the non-speech state and outputs only the acoustic signal in the speech state.
Type: Grant
Filed: July 15, 2010
Date of Patent: December 8, 2015
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Masakiyo Fujimoto, Tomohiro Nakatani
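The ratio-based decision in this abstract can be sketched with a two-state forward recursion. This is an illustration under assumptions, not the patented apparatus: the per-frame output probabilities are supplied directly instead of coming from noise-adapted GMMs, and the self-transition probability, the uniform initial state, the threshold of 1.0, and the function name are all made up for the example.

```python
def speech_decisions(frame_probs, stay=0.9, threshold=1.0):
    """Label each frame as speech (True) or non-speech (False).

    frame_probs: per-frame (non_speech_prob, speech_prob) output probabilities,
    assumed strictly positive so that the ratio is always defined.
    """
    trans = [[stay, 1.0 - stay],   # transitions out of the non-speech state
             [1.0 - stay, stay]]   # transitions out of the speech state
    alpha = [0.5, 0.5]             # state probabilities before the first frame
    decisions = []
    for probs in frame_probs:
        # Forward step: predict with the transition model, then weight by the
        # frame's output probability for each state.
        new = [probs[s] * sum(alpha[p] * trans[p][s] for p in range(2))
               for s in range(2)]
        norm = sum(new)
        alpha = [value / norm for value in new]
        ratio = alpha[1] / alpha[0]  # speech to non-speech state probability ratio
        decisions.append(ratio > threshold)
    return decisions


# Toy sequence: two noise-like frames followed by three speech-like frames.
frames = [(0.9, 0.1), (0.8, 0.2), (0.2, 0.8), (0.1, 0.9), (0.1, 0.9)]
decisions = speech_decisions(frames)
```

Because the transition model favors staying in the current state, the decision changes one frame after the output probabilities start to favor speech, which shows how the state history smooths frame-by-frame evidence.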
-
Publication number: 20150032445
Abstract: A noise estimation apparatus which estimates a non-stationary noise component on the basis of the likelihood maximization criterion is provided. The noise estimation apparatus obtains the variance of a noise signal that causes a large value to be obtained by weighted addition of the sums each of which is obtained by adding the product of the log likelihood of a model of an observed signal expressed by a Gaussian distribution in a speech segment and a speech posterior probability in each frame, and the product of the log likelihood of a model of an observed signal expressed by a Gaussian distribution in a non-speech segment and a non-speech posterior probability in each frame, by using complex spectra of a plurality of observed signals up to the current frame.
Type: Application
Filed: January 30, 2013
Publication date: January 29, 2015
Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Mehrez Souden, Keisuke Kinoshita, Tomohiro Nakatani, Marc Delcroix, Takuya Yoshioka
-
Patent number: 8848933
Abstract: The initial values of parameter estimates are set, including reverberation parameter estimates, which include a regression coefficient used in a linear convolutional operation for calculating an estimated value of reverberation included in an observed signal, source parameter estimates, which include estimated values of a linear prediction coefficient and a prediction residual power that identify the power spectrum of a source signal, and noise parameter estimates, which include noise power spectrum estimates. Then, the maximum likelihood estimation is used to alternately repeat processing for updating at least one of the reverberation parameter estimates and the noise parameter estimates and processing for updating the source parameter estimates until a predetermined termination condition is satisfied.
Type: Grant
Filed: March 5, 2009
Date of Patent: September 30, 2014
Assignee: Nippon Telegraph and Telephone Corporation
Inventors: Takuya Yoshioka, Tomohiro Nakatani, Masato Miyoshi
-
Publication number: 20130168141
Abstract: A method for producing a substrate with through-electrode includes the steps of: forming recesses or through-holes in either one of a silicon substrate and a glass substrate; forming protrusions in the other substrate; laying the silicon substrate and glass substrate on each other so that the protrusions are inserted in the respective recesses or through-holes; and bonding the silicon substrate and the glass substrate to each other.
Type: Application
Filed: January 24, 2012
Publication date: July 4, 2013
Applicant: PANASONIC CORPORATION
Inventors: Junichi Hozumi, Takumi Taura, Shin Okumura, Tomohiro Nakatani, Ryo Tomoida