Patents by Inventor Tomohiro Nakatani

Tomohiro Nakatani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8467538
    Abstract: A sound source model storage section stores a sound source model that represents an audio signal emitted from a sound source in the form of a probability density function. An observation signal, which is obtained by collecting the audio signal, is converted into a plurality of frequency-specific observation signals each corresponding to one of a plurality of frequency bands. Then, a dereverberation filter corresponding to each frequency band is estimated by using the frequency-specific observation signal for the frequency band on the basis of the sound source model and a reverberation model that represents a relationship for each frequency band among the audio signal, the observation signal and the dereverberation filter. A frequency-specific target signal corresponding to each frequency band is determined by applying the dereverberation filter for the frequency band to the frequency-specific observation signal for the frequency band, and the resulting frequency-specific target signals are integrated.
    Type: Grant
    Filed: February 27, 2009
    Date of Patent: June 18, 2013
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Tomohiro Nakatani, Takuya Yoshioka, Keisuke Kinoshita, Masato Miyoshi
  • Patent number: 8290170
    Abstract: Speech dereverberation is achieved by accepting an observed signal for initialization (1000) and performing likelihood maximization (2000) which includes Fourier Transforms (4000).
    Type: Grant
    Filed: May 1, 2006
    Date of Patent: October 16, 2012
    Assignees: Nippon Telegraph and Telephone Corporation, Georgia Tech Research Corporation
    Inventors: Tomohiro Nakatani, Biing-Hwang Juang
  • Patent number: 8271277
    Abstract: A model application unit calculates linear prediction coefficients of a multi-step linear prediction model by using discrete acoustic signals. Then, a late reverberation predictor calculates linear prediction values obtained by substituting the linear prediction coefficients and the discrete acoustic signals into the linear prediction term of the multi-step linear prediction model, as predicted late reverberations. Next, a frequency domain converter converts the discrete acoustic signals to discrete acoustic signals in the frequency domain and also converts the predicted late reverberations to predicted late reverberations in the frequency domain. A late reverberation eliminator calculates relative values between the amplitude spectra of the discrete acoustic signals expressed in the frequency domain and the amplitude spectra of the predicted late reverberations expressed in the frequency domain, and provides the relative values as predicted amplitude spectra of a dereverberation signal.
    Type: Grant
    Filed: March 5, 2007
    Date of Patent: September 18, 2012
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi
  • Publication number: 20120173234
    Abstract: The processing efficiency and estimation accuracy of a voice activity detection apparatus are improved. An acoustic signal analyzer receives a digital acoustic signal containing a speech signal and a noise signal, generates a non-speech GMM and a speech GMM adapted to a noise environment, by using a silence GMM and a clean-speech GMM in each frame of the digital acoustic signal, and calculates the output probabilities of dominant Gaussian distributions of the GMMs. A speech state probability to non-speech state probability ratio calculator calculates a speech state probability to non-speech state probability ratio based on a state transition model of a speech state and a non-speech state, by using the output probabilities; and a voice activity detection unit judges, from the speech state probability to non-speech state probability ratio, whether the acoustic signal in the frame is in the speech state or in the non-speech state and outputs only the acoustic signal in the speech state.
    Type: Application
    Filed: July 15, 2010
    Publication date: July 5, 2012
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORP.
    Inventors: Masakiyo Fujimoto, Tomohiro Nakatani
  • Publication number: 20110044462
    Abstract: The initial values of parameter estimates are set, including reverberation parameter estimates, which include a regression coefficient used in a linear convolutional operation for calculating an estimated value of reverberation included in an observed signal; source parameter estimates, which include estimated values of a linear prediction coefficient and a prediction residual power that identify the power spectrum of a source signal; and noise parameter estimates, which include noise power spectrum estimates. Then, the maximum likelihood estimation is used to alternately repeat processing for updating at least one of the reverberation parameter estimates and the noise parameter estimates and processing for updating the source parameter estimates until a predetermined termination condition is satisfied.
    Type: Application
    Filed: March 5, 2009
    Publication date: February 24, 2011
    Applicant: Nippon Telegraph and Telephone Corp.
    Inventors: Takuya Yoshioka, Tomohiro Nakatani, Masato Miyoshi
  • Publication number: 20110002473
    Abstract: A sound source model storage section stores a sound source model that represents an audio signal emitted from a sound source in the form of a probability density function. An observation signal, which is obtained by collecting the audio signal, is converted into a plurality of frequency-specific observation signals each corresponding to one of a plurality of frequency bands. Then, a dereverberation filter corresponding to each frequency band is estimated by using the frequency-specific observation signal for the frequency band on the basis of the sound source model and a reverberation model that represents a relationship for each frequency band among the audio signal, the observation signal and the dereverberation filter. A frequency-specific target signal corresponding to each frequency band is determined by applying the dereverberation filter for the frequency band to the frequency-specific observation signal for the frequency band, and the resulting frequency-specific target signals are integrated.
    Type: Application
    Filed: February 27, 2009
    Publication date: January 6, 2011
    Applicant: Nippon Telegraph and Telephone Corporation
    Inventors: Tomohiro Nakatani, Takuya Yoshioka, Keisuke Kinoshita, Masato Miyoshi
  • Publication number: 20090248403
    Abstract: A model application unit calculates linear prediction coefficients of a multi-step linear prediction model by using discrete acoustic signals. Then, a late reverberation predictor calculates linear prediction values obtained by substituting the linear prediction coefficients and the discrete acoustic signals into the linear prediction term of the multi-step linear prediction model, as predicted late reverberations. Next, a frequency domain converter converts the discrete acoustic signals to discrete acoustic signals in the frequency domain and also converts the predicted late reverberations to predicted late reverberations in the frequency domain. A late reverberation eliminator calculates relative values between the amplitude spectra of the discrete acoustic signals expressed in the frequency domain and the amplitude spectra of the predicted late reverberations expressed in the frequency domain, and provides the relative values as predicted amplitude spectra of a dereverberation signal.
    Type: Application
    Filed: March 5, 2007
    Publication date: October 1, 2009
    Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi
  • Publication number: 20090110207
    Abstract: Speech dereverberation is achieved by accepting an observed signal for initialization (1000) and performing likelihood maximization (2000) which includes Fourier Transforms (4000).
    Type: Application
    Filed: May 1, 2006
    Publication date: April 30, 2009
    Applicants: NIPPON TELEGRAPH AND TELEPHONE COMPANY, GEORGIA TECH RESEARCH CORPORATION
    Inventors: Tomohiro Nakatani, Biing-Hwang Juang
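
Illustrative note on publication 20120173234: the abstract describes judging each frame from a ratio of speech state probability to non-speech state probability under a state transition model. The sketch below shows one generic way such a ratio could be computed, using a two-state forward recursion. It is not the method claimed in the application. The per-frame output probabilities are assumed to be given (the GMM adaptation and dominant-Gaussian selection from the abstract are not modelled), and the names speech_ratios and keep_speech_frames, the stay probability, and the threshold are all hypothetical.

```python
EPS = 1e-12  # floor that keeps every probability strictly positive


def speech_ratios(p_speech, p_nonspeech, stay=0.9):
    """Return per-frame ratios P(speech) / P(non-speech).

    p_speech[t] and p_nonspeech[t] are assumed per-frame output
    probabilities of the speech and non-speech models. `stay` is a
    hypothetical self-transition probability shared by both states.
    """
    if len(p_speech) != len(p_nonspeech):
        raise ValueError("probability sequences must have equal length")
    switch = 1.0 - stay
    a_non, a_sp = 0.5, 0.5  # uniform prior over the two states
    ratios = []
    for ps, pn in zip(p_speech, p_nonspeech):
        ps, pn = max(ps, EPS), max(pn, EPS)
        # Forward step: propagate through the transition model,
        # then weight by the frame's output probability.
        new_non = (a_non * stay + a_sp * switch) * pn
        new_sp = (a_sp * stay + a_non * switch) * ps
        total = new_non + new_sp
        a_non, a_sp = new_non / total, new_sp / total
        ratios.append(a_sp / a_non)
    return ratios


def keep_speech_frames(frames, p_speech, p_nonspeech, threshold=1.0, stay=0.9):
    """Keep only the frames whose ratio exceeds a hypothetical threshold."""
    ratios = speech_ratios(p_speech, p_nonspeech, stay)
    return [f for f, r in zip(frames, ratios) if r > threshold]
```

For example, keep_speech_frames(["a", "b", "c", "d"], [0.1, 0.8, 0.9, 0.1], [0.9, 0.2, 0.1, 0.9]) passes four labelled frames through the sketch. Because the transition model favours staying in the current state, a single frame with a high speech probability is not always enough to flip the decision.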