Patents by Inventor Masatsune Tamura

Masatsune Tamura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9280967
    Abstract: According to one embodiment, an apparatus for supporting reading of a document includes a model storage unit, a document acquisition unit, a feature information extraction unit, and an utterance style estimation unit. The model storage unit is configured to store a model that has learned a correspondence relationship between first feature information and an utterance style. The first feature information is extracted from a plurality of sentences in a training document. The document acquisition unit is configured to acquire a document to be read. The feature information extraction unit is configured to extract second feature information from each sentence in the document to be read. The utterance style estimation unit is configured to compare the second feature information of a plurality of sentences in the document to be read with the model, and to estimate an utterance style of each sentence of the document to be read.
    Type: Grant
    Filed: September 14, 2011
    Date of Patent: March 8, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kosei Fume, Masaru Suzuki, Masahiro Morita, Kentaro Tachibana, Kouichirou Mori, Yuji Shimizu, Takehiko Kagoshima, Masatsune Tamura, Tomohiro Yamasaki
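The abstract above describes training a model on sentence-level feature information and then estimating an utterance style per sentence of a new document. The following is a minimal sketch of that idea, not the patented method: the feature choices (sentence length, punctuation, digit counts) and the nearest-centroid "model" are hypothetical stand-ins.

```python
# Illustrative sketch: per-sentence utterance-style estimation by comparing simple
# sentence features against per-style centroids learned from a labelled document.
import numpy as np

def extract_features(sentence: str) -> np.ndarray:
    # Hypothetical "feature information": length, question/exclamation marks, digits.
    return np.array([len(sentence),
                     sentence.count("?"),
                     sentence.count("!"),
                     sum(ch.isdigit() for ch in sentence)], dtype=float)

def train_model(sentences, styles):
    # The "model" here is just one mean feature vector (centroid) per utterance style.
    feats = np.stack([extract_features(s) for s in sentences])
    return {style: feats[np.array(styles) == style].mean(axis=0)
            for style in set(styles)}

def estimate_styles(model, document_sentences):
    # Compare each sentence's features with the model and pick the closest style.
    results = []
    for s in document_sentences:
        f = extract_features(s)
        results.append(min(model, key=lambda st: np.linalg.norm(f - model[st])))
    return results

model = train_model(["Is it raining?", "Breaking news: markets fell sharply."],
                    ["question", "narration"])
print(estimate_styles(model, ["Will it snow tomorrow?", "The report cites 42 cases."]))
```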
  • Publication number: 20160012035
    Abstract: According to an embodiment, a device includes a table creator, an estimator, and a dictionary creator. The table creator is configured to create a table based on similarity between distributions of nodes of speech synthesis dictionaries of a specific speaker in respective first and second languages. The estimator is configured to estimate a matrix to transform the speech synthesis dictionary of the specific speaker in the first language to a speech synthesis dictionary of a target speaker in the first language, based on speech and a recorded text of the target speaker in the first language and the speech synthesis dictionary of the specific speaker in the first language. The dictionary creator is configured to create a speech synthesis dictionary of the target speaker in the second language, based on the table, the matrix, and the speech synthesis dictionary of the specific speaker in the second language.
    Type: Application
    Filed: July 9, 2015
    Publication date: January 14, 2016
    Inventors: Kentaro Tachibana, Masatsune Tamura, Yamato Ohtani
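A rough sketch of the data flow described above, under heavy assumptions: dictionary nodes are reduced to mean vectors, "similarity between distributions" becomes Euclidean distance between means, and the transform is a single least-squares matrix. None of these choices come from the patent itself.

```python
# Sketch: mapping table from node similarity, transform estimated in language 1,
# then a language-2 target-speaker dictionary built from table + transform.
import numpy as np

rng = np.random.default_rng(0)
spec_lang1 = rng.normal(size=(50, 8))   # specific speaker, language-1 node means
spec_lang2 = rng.normal(size=(40, 8))   # specific speaker, language-2 node means
tgt_lang1 = spec_lang1 @ (np.eye(8) + 0.1 * rng.normal(size=(8, 8))) + 0.5  # target, language 1

# 1) Mapping table: pair each language-2 node with its most similar language-1 node.
dists = np.linalg.norm(spec_lang2[:, None, :] - spec_lang1[None, :, :], axis=2)
table = np.argmin(dists, axis=1)

# 2) Transform matrix taking the specific speaker to the target speaker in language 1.
W, *_ = np.linalg.lstsq(spec_lang1, tgt_lang1, rcond=None)

# 3) Target-speaker dictionary in language 2: transform each node and add the
#    residual of its mapped language-1 node (a crude use of the mapping table).
residual = tgt_lang1 - spec_lang1 @ W
tgt_lang2 = spec_lang2 @ W + residual[table]
print(tgt_lang2.shape)
```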
  • Publication number: 20150325232
    Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.
    Type: Application
    Filed: July 16, 2015
    Publication date: November 12, 2015
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kentaro TACHIBANA, Takehiko KAGOSHIMA, Masatsune TAMURA, Masahiro MORITA
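A toy sketch of the watermarking idea in the abstract: encode one watermark bit per pitch mark as a phase offset applied to the pulse spectrum, then pass the pulse-train source through a stand-in vocal tract filter. The bit-to-phase mapping and the one-pole filter are assumptions for illustration only.

```python
# Sketch: phase-modulated pulses at pitch marks carry audio watermark bits.
import numpy as np

fs = 16000

def pulse_with_phase(length, phi):
    # Flat-magnitude pulse whose spectrum receives a constant extra phase phi
    # (a stand-in for modulating the pulse phase at a pitch mark).
    spec = np.exp(1j * phi) * np.ones(length // 2 + 1)
    spec[0] = 1.0  # keep the DC bin real
    return np.fft.irfft(spec, n=length)

f0 = 120.0
period = int(fs / f0)
bits = [1, 0, 1, 1, 0, 0, 1, 0]           # hypothetical watermark payload
source = np.zeros(period * len(bits))
for k, b in enumerate(bits):
    phi = np.pi / 4 if b else -np.pi / 4  # bit -> phase offset at this pitch mark
    source[k * period:(k + 1) * period] += pulse_with_phase(period, phi)

# Very rough "vocal tract filter": a one-pole smoother instead of a
# spectrum-parameter-driven filter.
speech = np.zeros_like(source)
for n in range(1, len(source)):
    speech[n] = source[n] + 0.95 * speech[n - 1]
print(speech.shape)
```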
  • Patent number: 9135910
    Abstract: According to an embodiment, a speech synthesis device includes a first storage, a second storage, a first generator, a second generator, a third generator, and a fourth generator. The first storage is configured to store therein first information obtained from a target uttered voice. The second storage is configured to store therein second information obtained from an arbitrary uttered voice. The first generator is configured to generate third information by converting the second information so as to be close to a target voice quality or prosody. The second generator is configured to generate an information set including the first information and the third information. The third generator is configured to generate fourth information used to generate a synthesized speech, based on the information set. The fourth generator is configured to generate the synthesized speech corresponding to input text using the fourth information.
    Type: Grant
    Filed: February 12, 2013
    Date of Patent: September 15, 2015
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune Tamura, Masahiro Morita
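A minimal sketch of the pipeline described above, under stated assumptions: "first/second information" are plain feature matrices, "conversion" is a mean/variance shift toward the target voice, and the "fourth information" is a single Gaussian trained on the combined set. These simplifications are illustrative, not the patented design.

```python
# Sketch: augment scarce target-speaker data with converted donor data, train on
# the combined set, and draw feature frames from the trained model.
import numpy as np

rng = np.random.default_rng(1)
target_feats = rng.normal(loc=2.0, scale=0.5, size=(200, 24))   # first information
donor_feats = rng.normal(loc=0.0, scale=1.0, size=(2000, 24))   # second information

# First generator: move the donor features toward the target voice quality.
converted = (donor_feats - donor_feats.mean(0)) / donor_feats.std(0)
converted = converted * target_feats.std(0) + target_feats.mean(0)   # third information

info_set = np.vstack([target_feats, converted])                  # second generator
model = {"mean": info_set.mean(0), "std": info_set.std(0)}       # third generator ("fourth information")

def synthesize(n_frames, model, rng):
    # Fourth generator: emit a frame sequence for the input text (text analysis omitted).
    return rng.normal(model["mean"], model["std"], size=(n_frames, len(model["mean"])))

print(synthesize(100, model, rng).shape)
```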
  • Patent number: 9110887
    Abstract: According to one embodiment, a speech synthesis apparatus includes a language analyzer, statistical model storage, model selector, parameter generator, basis model storage, and filter processor. The language analyzer analyzes text data and outputs language information data that represents linguistic information of the text data. The statistical model storage stores statistical models prepared by statistically modeling acoustic information included in speech. The model selector selects a statistical model from the models based on the language information data. The parameter generator generates speech parameter sequences using the statistical model selected by the model selector. The basis model storage stores a basis model including basis vectors, each of which expresses speech information for each limited frequency range. The filter processor outputs synthetic speech by executing filter processing of the speech parameter sequences and the basis model.
    Type: Grant
    Filed: December 26, 2012
    Date of Patent: August 18, 2015
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yamato Ohtani, Masatsune Tamura, Masahiro Morita
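The abstract describes generating per-frame speech parameters from selected statistical models and then filtering with a basis model of band-limited vectors. The sketch below assumes triangular bases, one mean coefficient vector per phone label, and FFT-domain filtering of a placeholder excitation; those shapes and choices are not from the patent.

```python
# Sketch: statistical-model parameter generation plus a basis-model filter stage.
import numpy as np

n_bins, n_bases, frame = 129, 16, 256
rng = np.random.default_rng(2)

# Basis model: overlapping triangular bands, each nonzero only in a limited range.
centers = np.linspace(0, n_bins - 1, n_bases)
width = centers[1] - centers[0]
bases = np.maximum(0.0, 1.0 - np.abs(np.arange(n_bins)[None, :] - centers[:, None]) / width)

# Statistical model storage: one coefficient mean vector per (hypothetical) phone label.
models = {"a": rng.uniform(0.5, 1.0, n_bases), "s": rng.uniform(0.0, 0.5, n_bases)}

def synthesize(labels):
    out = []
    for lab in labels:                       # model selector + parameter generator
        envelope = models[lab] @ bases       # spectrum reconstructed from the basis model
        excitation = rng.normal(size=frame)  # placeholder excitation
        spec = np.fft.rfft(excitation, n=2 * (n_bins - 1)) * envelope
        out.append(np.fft.irfft(spec))       # filter processor
    return np.concatenate(out)

print(synthesize(["a", "s", "a"]).shape)
```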
  • Patent number: 9058807
    Abstract: According to one embodiment, a first storage unit stores n band noise signals obtained by applying n band-pass filters to a noise signal. A second storage unit stores n band pulse signals. A parameter input unit inputs a fundamental frequency, n band noise intensities, and a spectrum parameter. An extraction unit extracts, for each pitch mark, the n band noise signals while shifting them. An amplitude control unit changes the amplitudes of the extracted band noise signals and band pulse signals in accordance with the band noise intensities. A generation unit generates a mixed sound source signal by adding the n band noise signals and the n band pulse signals. Another generation unit combines the mixed sound source signals generated at the respective pitch marks. A vocal tract filter unit generates a speech waveform by applying a vocal tract filter using the spectrum parameter to the generated mixed sound source signal.
    Type: Grant
    Filed: March 18, 2011
    Date of Patent: June 16, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
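A rough sketch of the mixed-excitation scheme above. Band splitting via FFT masks, the per-pitch-mark intensity values, and the toy vocal tract filter are assumptions made for a self-contained example, not the patent's filters.

```python
# Sketch: band noise + band pulse signals mixed per pitch mark, then filtered.
import numpy as np

fs, n_bands, length = 16000, 4, 4096
rng = np.random.default_rng(3)

def band_split(x, n_bands):
    # Split a signal into n_bands contiguous frequency bands via FFT masking.
    spec = np.fft.rfft(x)
    edges = np.linspace(0, len(spec), n_bands + 1, dtype=int)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        masked = np.zeros_like(spec)
        masked[lo:hi] = spec[lo:hi]
        bands.append(np.fft.irfft(masked, n=len(x)))
    return np.stack(bands)

band_noise = band_split(rng.normal(size=length), n_bands)   # first storage unit
pulse = np.zeros(length); pulse[0] = 1.0
band_pulse = band_split(pulse, n_bands)                      # second storage unit

f0, frame = 125.0, 200
pitch_marks = np.arange(0, length - frame, int(fs / f0))
mixed = np.zeros(length)
for pm in pitch_marks:
    intensities = rng.uniform(0, 1, n_bands)   # band noise intensities at this pitch mark
    seg = slice(pm, pm + frame)
    for b in range(n_bands):
        mixed[seg] += intensities[b] * band_noise[b, seg]            # noise component
        mixed[seg] += (1 - intensities[b]) * band_pulse[b, :frame]   # pulse component

speech = np.convolve(mixed, np.exp(-np.arange(64) / 8.0))[:length]   # toy vocal tract filter
print(speech.shape)
```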
  • Publication number: 20150081306
    Abstract: According to an embodiment, a prosody editing device includes an approximate contour generator, a setter, a display controller, an operation receiver, and an updater. The approximate contour generator approximates a contour representing a time series of prosody information with a parametric curve including a control point to generate an approximate contour. The setter sets, on the approximate contour, an operation point corresponding to the control point. The display controller displays, on a display device, an operation screen including the approximate contour on which the operation point is shown. The operation receiver receives an operation that moves an operation point selected on the operation screen. The updater calculates a position of the control point from a moving amount of the operation point and updates the approximate contour.
    Type: Application
    Filed: September 2, 2014
    Publication date: March 19, 2015
    Inventors: Kouichirou MORI, Yu NASU, Masatsune TAMURA, Masahiro MORITA
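A small sketch of the editing loop above, using a cubic Bézier as the parametric curve; the patent does not name a curve family, and sampling the contour at four coarse points to get control points is a crude approximation chosen only for illustration.

```python
# Sketch: approximate an F0 contour with a parametric curve, move a control point
# by the user's drag amount, and regenerate the contour.
import numpy as np

def bezier(control_points, n=100):
    # Evaluate a cubic Bezier curve from 4 (x, y) control points.
    t = np.linspace(0, 1, n)
    b = np.stack([(1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t ** 2 * (1 - t), t ** 3], axis=1)
    return b @ control_points

f0_contour = 120 + 30 * np.sin(np.linspace(0, np.pi, 100))   # original prosody contour (Hz)

# Approximate contour generator: control points from coarse samples of the contour.
idx = np.linspace(0, 99, 4, dtype=int)
ctrl = np.stack([idx.astype(float), f0_contour[idx]], axis=1)
approx = bezier(ctrl)

# Operation receiver + updater: the user drags the second operation point up by 15 Hz.
ctrl[1, 1] += 15.0
updated = bezier(ctrl)
print(float(approx[:, 1].max()), float(updated[:, 1].max()))
```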
  • Publication number: 20140257816
    Abstract: According to an embodiment, a speech synthesis dictionary modification device includes an extracting unit, a display unit, an acquiring unit, a modification unit, and an updating unit. The extracting unit extracts synthesis information containing a feature sequence of a synthetic speech from the synthetic speech generated by using a speech synthesis dictionary containing probability distributions of speech features. The display unit displays an image prompting the user to modify a probability distribution contained in the speech synthesis dictionary on the basis of the synthesis information extracted by the extracting unit. The acquiring unit acquires an instruction to modify the probability distribution contained in the speech synthesis dictionary. The modification unit modifies the probability distribution contained in the speech synthesis dictionary according to the instruction.
    Type: Application
    Filed: January 31, 2014
    Publication date: September 11, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ryo Morinaka, Masatsune Tamura, Masahiro Morita
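A tiny sketch of the modification flow, with an assumed data layout: the "dictionary" is a mapping from unit labels to Gaussian mean/variance pairs, the extracted synthesis information is simply the per-unit means, and the user instruction is a dictionary literal. All of these are placeholders.

```python
# Sketch: extract a feature trace from the dictionary, then apply a user
# instruction that shifts one unit's probability-distribution mean.
import numpy as np

dictionary = {"a": {"mean": np.array([5.0, 1.0]), "var": np.array([0.2, 0.1])}}

def extract_synthesis_info(dictionary, units):
    # Feature sequence of the synthetic speech: here, just the per-unit means.
    return np.stack([dictionary[u]["mean"] for u in units])

trace = extract_synthesis_info(dictionary, ["a", "a"])       # shown to the user
instruction = {"unit": "a", "dim": 0, "delta": +0.5}         # acquired from the UI
dictionary[instruction["unit"]]["mean"][instruction["dim"]] += instruction["delta"]
print(dictionary["a"]["mean"])
```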
  • Publication number: 20130262087
    Abstract: According to one embodiment, a speech synthesis apparatus includes a language analyzer, statistical model storage, model selector, parameter generator, basis model storage, and filter processor. The language analyzer analyzes text data and outputs language information data that represents linguistic information of the text data. The statistical model storage stores statistical models prepared by statistically modeling acoustic information included in speech. The model selector selects a statistical model from the models based on the language information data. The parameter generator generates speech parameter sequences using the statistical model selected by the model selector. The basis model storage stores a basis model including basis vectors, each of which expresses speech information for each limited frequency range. The filter processor outputs synthetic speech by executing filter processing of the speech parameter sequences and the basis model.
    Type: Application
    Filed: December 26, 2012
    Publication date: October 3, 2013
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yamato OHTANI, Masatsune TAMURA, Masahiro MORITA
  • Patent number: 8438033
    Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech; stores, in a voice conversion rule memory, a voice conversion rule for converting the voice quality of source speech into the voice quality of the target speech; extracts, from an input source speech, a source speech spectral parameter of the input source speech; converts the extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule; selects a target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory; generates an aperiodic component spectral parameter from the selected target speech spectral parameter; mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter to obtain a second conversion spectral parameter; and generates a speech waveform from the second conversion spectral parameter.
    Type: Grant
    Filed: July 20, 2009
    Date of Patent: May 7, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
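A sketch of the conversion steps above, under stated assumptions: spectral "parameters" are plain vectors, the conversion rule is a single linear transform, and the periodic/aperiodic split is faked by taking the lower and upper halves of the vectors. Waveform generation is omitted.

```python
# Sketch: convert a source parameter, pick the nearest stored target parameter,
# and mix periodic (converted) and aperiodic (target) components.
import numpy as np

rng = np.random.default_rng(4)
dim = 32
target_memory = rng.normal(size=(500, dim))                 # target speech spectral parameters
rule = np.eye(dim) + 0.05 * rng.normal(size=(dim, dim))     # voice conversion rule

def convert(source_param):
    first = rule @ source_param                              # first conversion spectral parameter
    nearest = target_memory[np.argmin(np.linalg.norm(target_memory - first, axis=1))]
    aperiodic = np.concatenate([np.zeros(dim // 2), nearest[dim // 2:]])   # from selected target
    periodic = np.concatenate([first[:dim // 2], np.zeros(dim // 2)])      # from converted source
    return periodic + aperiodic                              # second conversion spectral parameter

print(convert(rng.normal(size=dim))[:4])
```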
  • Publication number: 20120323569
    Abstract: According to one embodiment, a speech processing apparatus includes a histogram calculation unit, a cumulative frequency calculation unit, and a filter production unit. The histogram calculation unit is configured to calculate a first histogram from a first speech feature extracted from speech data, and to calculate a second histogram from a second speech feature different from the first speech feature. The cumulative frequency calculation unit is configured to calculate a first cumulative frequency by accumulating a frequency of the first histogram, and to calculate a second cumulative frequency by accumulating a frequency of the second histogram. The filter production unit is configured to produce a filter having a characteristic that brings the second cumulative frequency close to the first cumulative frequency.
    Type: Application
    Filed: March 15, 2012
    Publication date: December 20, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yamato Ohtani, Masatsune Tamura, Masahiro Morita
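The abstract describes matching cumulative frequencies of two feature distributions. A plain histogram-matching lookup table, sketched below, illustrates that idea; the patent's actual filter characteristic may differ, and the Gaussian test features are invented for the example.

```python
# Sketch: histograms -> cumulative frequencies -> a mapping that pulls the second
# feature's cumulative frequency toward the first's.
import numpy as np

rng = np.random.default_rng(5)
feat1 = rng.normal(0.0, 1.0, 5000)     # first speech feature
feat2 = rng.normal(1.0, 2.0, 5000)     # second speech feature

bins = np.linspace(-8, 8, 201)
hist1, _ = np.histogram(feat1, bins)   # histogram calculation unit
hist2, _ = np.histogram(feat2, bins)
cum1 = np.cumsum(hist1) / hist1.sum()  # cumulative frequency calculation unit
cum2 = np.cumsum(hist2) / hist2.sum()

# Filter production unit: for each bin of feature 2, find the feature-1 value
# with the same cumulative frequency.
centers = 0.5 * (bins[:-1] + bins[1:])
mapping = np.interp(cum2, cum1, centers)

filtered = np.interp(feat2, centers, mapping)
print(round(filtered.mean(), 2), round(filtered.std(), 2))   # should land near 0 and 1
```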
  • Patent number: 8321208
    Abstract: An information extraction unit extracts L-dimensional spectral envelope information from each frame of speech data by discrete Fourier transform. The spectral envelope information is represented by L points. A basis storage unit stores N bases (L>N>1). Each basis occupies a different frequency band and has its maximum at a peak frequency in the L-dimensional spectral domain. Values at frequencies outside that band along the frequency axis of the spectral domain are zero. Two frequency bands whose peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes, over the L points of the spectral envelope information, the distortion between the spectral envelope information and a linear combination of the bases by changing the coefficient of each basis, and sets the coefficients that minimize the distortion as the spectral envelope parameter of the spectral envelope information.
    Type: Grant
    Filed: December 3, 2008
    Date of Patent: November 27, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Katsumi Tsuchiya, Takehiko Kagoshima
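A compact sketch of the parameterization described above. Triangular bases, an unconstrained least-squares fit, and the smoothed random "envelope" are assumptions for illustration; only the overall structure (overlapping band-limited bases, distortion-minimizing coefficients) follows the abstract.

```python
# Sketch: fit N overlapping band-limited bases to an L-point spectral envelope and
# keep the distortion-minimizing coefficients as the spectral envelope parameter.
import numpy as np

L, N = 257, 24
peaks = np.linspace(0, L - 1, N)
width = peaks[1] - peaks[0]
# Each basis is nonzero only inside its band; adjacent bands overlap.
bases = np.maximum(0.0, 1.0 - np.abs(np.arange(L)[None, :] - peaks[:, None]) / width)

rng = np.random.default_rng(6)
envelope = np.convolve(rng.uniform(0.5, 2.0, L), np.ones(31) / 31, mode="same")  # toy envelope

coeffs, *_ = np.linalg.lstsq(bases.T, envelope, rcond=None)   # parameter calculation unit
reconstruction = coeffs @ bases
print(float(np.mean((reconstruction - envelope) ** 2)))        # residual distortion
```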
  • Publication number: 20120239390
    Abstract: According to one embodiment, an apparatus for supporting reading of a document includes a model storage unit, a document acquisition unit, a feature information extraction unit, and an utterance style estimation unit. The model storage unit is configured to store a model that has learned a correspondence relationship between first feature information and an utterance style. The first feature information is extracted from a plurality of sentences in a training document. The document acquisition unit is configured to acquire a document to be read. The feature information extraction unit is configured to extract second feature information from each sentence in the document to be read. The utterance style estimation unit is configured to compare the second feature information of a plurality of sentences in the document to be read with the model, and to estimate an utterance style of each sentence of the document to be read.
    Type: Application
    Filed: September 14, 2011
    Publication date: September 20, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kosei Fume, Masaru Suzuki, Masahiro Morita, Kentaro Tachibana, Kouichirou Mori, Yuji Shimizu, Takehiko Kagoshima, Masatsune Tamura, Tomohiro Yamasaki
  • Patent number: 8175881
    Abstract: A phoneme sequence corresponding to a target speech is divided into a plurality of segments. A plurality of speech units for each segment is selected from a speech unit memory that stores speech units having at least one frame. The plurality of speech units has prosodic features that match or are similar to those of the target speech. A formant parameter having at least one formant frequency is generated for each frame of the plurality of speech units. A fused formant parameter of each frame is generated from the formant parameters of each frame of the plurality of speech units. A fused speech unit of each segment is generated from the fused formant parameter of each frame. A synthesized speech is generated by concatenating the fused speech unit of each segment.
    Type: Grant
    Filed: August 14, 2008
    Date of Patent: May 8, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Ryo Morinaka, Masatsune Tamura, Takehiko Kagoshima
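A sketch of formant-domain unit fusion as described above, with heavy simplifications: formant extraction is replaced by fixed random frequencies, fusion is a per-frame average, and "generation from a fused formant parameter" is reduced to summing damped sinusoids at the fused frequencies. These stand-ins are not from the patent.

```python
# Sketch: fuse per-frame formant parameters of several candidate units per segment
# and concatenate the resulting fused units.
import numpy as np

fs, frame = 16000, 160

def fused_unit(candidate_formants):
    # candidate_formants: (n_units, n_frames, n_formants) formant frequencies in Hz.
    fused = candidate_formants.mean(axis=0)                  # fused formant parameter per frame
    t = np.arange(frame) / fs
    frames = [sum(np.exp(-40 * t) * np.cos(2 * np.pi * f * t) for f in row) for row in fused]
    return np.concatenate(frames)

rng = np.random.default_rng(7)
segments = [rng.uniform(300, 3000, size=(3, 5, 3)) for _ in range(2)]   # 2 segments, 3 candidates each
speech = np.concatenate([fused_unit(seg) for seg in segments])           # concatenated fused units
print(speech.shape)
```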
  • Publication number: 20120053933
    Abstract: According to one embodiment, a first storage unit stores n band noise signals obtained by applying n band-pass filters to a noise signal. A second storage unit stores n band pulse signals. A parameter input unit inputs a fundamental frequency, n band noise intensities, and a spectrum parameter. An extraction unit extracts, for each pitch mark, the n band noise signals while shifting them. An amplitude control unit changes the amplitudes of the extracted band noise signals and band pulse signals in accordance with the band noise intensities. A generation unit generates a mixed sound source signal by adding the n band noise signals and the n band pulse signals. Another generation unit combines the mixed sound source signals generated at the respective pitch marks. A vocal tract filter unit generates a speech waveform by applying a vocal tract filter using the spectrum parameter to the generated mixed sound source signal.
    Type: Application
    Filed: March 18, 2011
    Publication date: March 1, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
  • Patent number: 8010362
    Abstract: A voice conversion rule and a rule selection parameter are stored. The voice conversion rule converts a spectral parameter vector of a source speaker to a spectral parameter vector of a target speaker. The rule selection parameter represents the spectral parameter vector of the source speaker. A first voice conversion rule for the start time and a second voice conversion rule for the end time of a speech unit of the source speaker are selected based on the spectral parameter vectors at the start time and the end time. An interpolation coefficient corresponding to the spectral parameter vector of each time in the speech unit is calculated from the first voice conversion rule and the second voice conversion rule. A third voice conversion rule corresponding to the spectral parameter vector of each time in the speech unit is calculated by interpolating the first voice conversion rule and the second voice conversion rule with the interpolation coefficient.
    Type: Grant
    Filed: January 22, 2008
    Date of Patent: August 30, 2011
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Takehiko Kagoshima
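A sketch of the rule-interpolation idea above, with assumed details: the rules are linear transforms, and the interpolation coefficient for each frame is derived from distances to the two rule selection parameters. The patent may compute the coefficient differently.

```python
# Sketch: blend the start-time and end-time conversion rules for every frame
# of a speech unit using a per-frame interpolation coefficient.
import numpy as np

rng = np.random.default_rng(8)
dim, n_frames = 16, 12
rule_start = np.eye(dim)
rule_end = np.eye(dim) + 0.1 * rng.normal(size=(dim, dim))
sel_start, sel_end = rng.normal(size=dim), rng.normal(size=dim)   # rule selection parameters
unit = rng.normal(size=(n_frames, dim))                           # source-speaker spectral vectors

converted = np.empty_like(unit)
for t, x in enumerate(unit):
    d0, d1 = np.linalg.norm(x - sel_start), np.linalg.norm(x - sel_end)
    w = d0 / (d0 + d1)                              # interpolation coefficient for this frame
    rule_t = (1 - w) * rule_start + w * rule_end    # third voice conversion rule
    converted[t] = rule_t @ x
print(converted.shape)
```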
  • Publication number: 20100049522
    Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech; stores, in a voice conversion rule memory, a voice conversion rule for converting the voice quality of source speech into the voice quality of the target speech; extracts, from an input source speech, a source speech spectral parameter of the input source speech; converts the extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule; selects a target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory; generates an aperiodic component spectral parameter from the selected target speech spectral parameter; mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter to obtain a second conversion spectral parameter; and generates a speech waveform from the second conversion spectral parameter.
    Type: Application
    Filed: July 20, 2009
    Publication date: February 25, 2010
    Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
  • Patent number: 7630896
    Abstract: A speech synthesis system in a preferred embodiment includes a speech unit storage section, a phonetic environment storage section, a phonetic sequence/prosodic information input section, a plural-speech-unit selection section, a fused-speech-unit sequence generation section, and a fused-speech-unit modification/concatenation section. By fusing a plurality of selected speech units in the fused-speech-unit sequence generation section, a fused speech unit is generated. In the fused-speech-unit sequence generation section, average power information is calculated for the M selected speech units, N speech units are fused together, and the power information of the fused speech unit is corrected to equal the average power information of the M speech units.
    Type: Grant
    Filed: September 23, 2005
    Date of Patent: December 8, 2009
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Gou Hirabayashi, Takehiko Kagoshima
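A short sketch of the power correction described above, under the assumption that speech units are plain waveforms and that "fusing" means sample-wise averaging (the patent's fusion is more elaborate): fuse N of the M selected units, then rescale so the fused unit's power equals the M-unit average.

```python
# Sketch: fuse N units and equalize the fused unit's power with the average
# power of all M selected candidates.
import numpy as np

rng = np.random.default_rng(9)
M, N, length = 8, 3, 400
units = rng.normal(scale=rng.uniform(0.5, 2.0, size=(M, 1)), size=(M, length))

avg_power = np.mean(np.mean(units ** 2, axis=1))     # average power of the M selected units
fused = units[:N].mean(axis=0)                       # fuse N units (sample-wise average)
fused *= np.sqrt(avg_power / np.mean(fused ** 2))    # power correction toward the M-unit average
print(round(float(np.mean(fused ** 2)), 3), round(float(avg_power), 3))
```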
  • Patent number: 7580839
    Abstract: A speech processing apparatus according to an embodiment of the invention includes a conversion-source-speaker speech-unit database, a voice-conversion-rule-learning-data generating means, and a voice-conversion-rule learning means, with which voice conversion rules are created. The voice-conversion-rule-learning-data generating means includes a conversion-target-speaker speech-unit extracting means, an attribute-information generating means, a conversion-source-speaker speech-unit database, and a conversion-source-speaker speech-unit selection means.
    Type: Grant
    Filed: September 19, 2006
    Date of Patent: August 25, 2009
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Takehiko Kagoshima
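A guess at the data flow implied by the component list above, offered only as an illustrative sketch: pair each target-speaker unit with the closest source-speaker unit from the database to form learning data, then fit a single linear conversion rule. The actual selection criteria and rule form in the patent are not specified here.

```python
# Sketch: generate voice-conversion-rule learning data by unit pairing, then
# learn one linear conversion rule from the pairs.
import numpy as np

rng = np.random.default_rng(10)
dim = 20
source_db = rng.normal(size=(300, dim))                     # conversion-source-speaker units
target_units = source_db[:40] @ (np.eye(dim) * 1.1) + 0.2   # conversion-target-speaker units

pairs_src = []
for tgt in target_units:                                    # speech-unit selection means
    pairs_src.append(source_db[np.argmin(np.linalg.norm(source_db - tgt, axis=1))])
pairs_src = np.stack(pairs_src)

rule, *_ = np.linalg.lstsq(pairs_src, target_units, rcond=None)   # rule learning means
print(rule.shape)
```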
  • Publication number: 20090144053
    Abstract: An information extraction unit extracts L-dimensional spectral envelope information from each frame of speech data. The spectral envelope information does not have a spectral fine structure. A basis storage unit stores N bases (L>N>1). Each basis occupies a different frequency band and has its maximum at a peak frequency in the L-dimensional spectral domain. Values at frequencies outside that band along the frequency axis of the spectral domain are zero. Two frequency bands whose peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes a distortion between the spectral envelope information and a linear combination of the bases by changing the coefficient of each basis, and sets the coefficients that minimize the distortion as the spectral envelope parameter of the spectral envelope information.
    Type: Application
    Filed: December 3, 2008
    Publication date: June 4, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune TAMURA, Katsumi TSUCHIYA, Takehiko KAGOSHIMA