Patents by Inventor Ryo Morinaka

Ryo Morinaka has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO). Brief illustrative code sketches of several of the described techniques follow the listing.

  • Patent number: 9830904
    Abstract: According to an embodiment, a text-to-speech device includes a context acquirer, an acoustic model parameter acquirer, a conversion parameter acquirer, a converter, and a waveform generator. The context acquirer is configured to acquire a context sequence affecting fluctuations in voice. The acoustic model parameter acquirer is configured to acquire an acoustic model parameter sequence that corresponds to the context sequence and represents an acoustic model in a standard speaking style of a target speaker. The conversion parameter acquirer is configured to acquire a conversion parameter sequence corresponding to the context sequence to convert an acoustic model parameter in the standard speaking style into one in a different speaking style. The converter is configured to convert the acoustic model parameter sequence using the conversion parameter sequence. The waveform generator is configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.
    Type: Grant
    Filed: June 17, 2016
    Date of Patent: November 28, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yu Nasu, Masatsune Tamura, Ryo Morinaka, Masahiro Morita
  • Publication number: 20160300564
    Abstract: According to an embodiment, a text-to-speech device includes a context acquirer, an acoustic model parameter acquirer, a conversion parameter acquirer, a converter, and a waveform generator. The context acquirer is configured to acquire a context sequence affecting fluctuations in voice. The acoustic model parameter acquirer is configured to acquire an acoustic model parameter sequence that corresponds to the context sequence and represents an acoustic model in a standard speaking style of a target speaker. The conversion parameter acquirer is configured to acquire a conversion parameter sequence corresponding to the context sequence to convert an acoustic model parameter in the standard speaking style into one in a different speaking style. The converter is configured to convert the acoustic model parameter sequence using the conversion parameter sequence. The waveform generator is configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.
    Type: Application
    Filed: June 17, 2016
    Publication date: October 13, 2016
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yu Nasu, Masatsune Tamura, Ryo Morinaka, Masahiro Morita
  • Patent number: 9002711
    Abstract: According to an embodiment, a speech synthesis apparatus includes a selecting unit configured to select speaker's parameters one by one for respective speakers and obtain a plurality of speakers' parameters, the speaker's parameters being prepared for respective pitch waveforms corresponding to speaker's speech sounds, the speaker's parameters including formant frequencies, formant phases, formant powers, and window functions concerning respective formants that are contained in the respective pitch waveforms. The apparatus includes a mapping unit configured to make formants correspond to each other between the plurality of speakers' parameters using a cost function based on the formant frequencies and the formant powers. The apparatus includes a generating unit configured to generate an interpolated speaker's parameter by interpolating, at desired interpolation ratios, the formant frequencies, formant phases, formant powers, and window functions of formants which are made to correspond to each other.
    Type: Grant
    Filed: December 16, 2010
    Date of Patent: April 7, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Ryo Morinaka, Takehiko Kagoshima
  • Publication number: 20140257816
    Abstract: According to an embodiment, a speech synthesis dictionary modification device includes an extracting unit, a display unit, an acquiring unit, a modification unit, and an updating unit. The extracting unit extracts synthesis information, containing a feature sequence of a synthetic speech, from the synthetic speech generated by using a speech synthesis dictionary containing probability distributions of speech features. The display unit displays an image prompting the user to modify a probability distribution contained in the speech synthesis dictionary on the basis of the synthesis information extracted by the extracting unit. The acquiring unit acquires an instruction to modify the probability distribution contained in the speech synthesis dictionary. The modification unit modifies the probability distribution contained in the speech synthesis dictionary according to the instruction.
    Type: Application
    Filed: January 31, 2014
    Publication date: September 11, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ryo Morinaka, Masatsune Tamura, Masahiro Morita
  • Patent number: 8175881
    Abstract: A phoneme sequence corresponding to a target speech is divided into a plurality of segments. A plurality of speech units for each segment is selected from a speech unit memory that stores speech units having at least one frame. The plurality of speech units has a prosodic feature accordant with, or similar to, that of the target speech. A formant parameter having at least one formant frequency is generated for each frame of the plurality of speech units. A fused formant parameter of each frame is generated from formant parameters of each frame of the plurality of speech units. A fused speech unit of each segment is generated from the fused formant parameter of each frame. Synthesized speech is generated by concatenating the fused speech unit of each segment.
    Type: Grant
    Filed: August 14, 2008
    Date of Patent: May 8, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Ryo Morinaka, Masatsune Tamura, Takehiko Kagoshima
  • Publication number: 20110087488
    Abstract: According to an embodiment, a speech synthesis apparatus includes a selecting unit configured to select speaker's parameters one by one for respective speakers and obtain a plurality of speakers' parameters, the speaker's parameters being prepared for respective pitch waveforms corresponding to speaker's speech sounds, the speaker's parameters including formant frequencies, formant phases, formant powers, and window functions concerning respective formants that are contained in the respective pitch waveforms. The apparatus includes a mapping unit configured to make formants correspond to each other between the plurality of speakers' parameters using a cost function based on the formant frequencies and the formant powers. The apparatus includes a generating unit configured to generate an interpolated speaker's parameter by interpolating, at desired interpolation ratios, the formant frequencies, formant phases, formant powers, and window functions of formants which are made to correspond to each other.
    Type: Application
    Filed: December 16, 2010
    Publication date: April 14, 2011
    Inventors: Ryo Morinaka, Takehiko Kagoshima
  • Publication number: 20090326951
    Abstract: Ratios between the powers at the peaks of the respective formants of the spectrum of a pitch-cycle waveform and the powers at the boundaries between the formants are obtained. When these ratios are large, the bandwidths of the window functions are widened. Formant waveforms are then generated by multiplying sinusoidal waveforms, generated from the formant parameter sets on the basis of pitch-cycle waveform generating data, by the window functions with the widened bandwidths, and the pitch-cycle waveform is generated as the sum of these formant waveforms.
    Type: Application
    Filed: April 14, 2009
    Publication date: December 31, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ryo Morinaka, Takehiko Kagoshima
  • Publication number: 20090048844
    Abstract: A phoneme sequence corresponding to a target speech is divided into a plurality of segments. A plurality of speech units for each segment is selected from a speech unit memory that stores speech units having at least one frame. The plurality of speech units has a prosodic feature accordant with, or similar to, that of the target speech. A formant parameter having at least one formant frequency is generated for each frame of the plurality of speech units. A fused formant parameter of each frame is generated from formant parameters of each frame of the plurality of speech units. A fused speech unit of each segment is generated from the fused formant parameter of each frame. Synthesized speech is generated by concatenating the fused speech unit of each segment.
    Type: Application
    Filed: August 14, 2008
    Publication date: February 19, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ryo Morinaka, Masatsune Tamura, Takehiko Kagoshima
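
The following is a minimal sketch, not the patented implementation, of the speaking-style conversion pipeline described in patent 9830904 (and publication 20160300564): acoustic model parameters for a target speaker's standard speaking style are converted into a different style using context-dependent conversion parameters before waveform generation. The table contents, the affine (scale and bias) form of the conversion, and all function names are illustrative assumptions.

```python
# Illustrative sketch of a style-conversion text-to-speech pipeline.
# The affine conversion and the lookup tables are assumptions, not the patented method.
import numpy as np

# Hypothetical lookup tables keyed by context label.
STANDARD_MODEL = {          # acoustic model parameters (e.g. mean vectors) per context
    "a": np.array([1.0, 2.0, 3.0]),
    "i": np.array([0.5, 1.5, 2.5]),
}
CONVERSION_PARAMS = {       # per-context (scale, bias) pairs for the target speaking style
    "a": (np.array([1.1, 0.9, 1.0]), np.array([0.2, 0.0, -0.1])),
    "i": (np.array([1.0, 1.2, 0.8]), np.array([0.0, 0.1, 0.0])),
}

def acquire_context_sequence(text):
    """Context acquirer: map input text to a context sequence (toy version)."""
    return [c for c in text if c in STANDARD_MODEL]

def convert(param_seq, conv_seq):
    """Converter: apply each conversion parameter to the standard-style parameter."""
    return [scale * p + bias for p, (scale, bias) in zip(param_seq, conv_seq)]

def generate_waveform(converted_seq):
    """Waveform generator placeholder: here we simply concatenate the converted frames."""
    return np.concatenate(converted_seq)

contexts = acquire_context_sequence("ai")
standard_seq = [STANDARD_MODEL[c] for c in contexts]        # acoustic model parameter acquirer
conversion_seq = [CONVERSION_PARAMS[c] for c in contexts]   # conversion parameter acquirer
waveform = generate_waveform(convert(standard_seq, conversion_seq))
print(waveform)
```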
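
The next sketch illustrates, under stated assumptions, the speaker-interpolation idea in patent 9002711 (and publication 20110087488): formants of two speakers are put into correspondence with a cost based on formant frequency and power, and the matched formants' frequencies, phases, powers, and window parameters are interpolated at a desired ratio. The cost weights, the representation of the window function by a single width value, and all names are illustrative, not the patented design.

```python
# Illustrative sketch of cost-based formant mapping and interpolation between two speakers.
import numpy as np
from itertools import permutations

def formant_cost(f_a, f_b, w_freq=1.0, w_pow=0.1):
    """Cost between two formants based on their frequencies (Hz) and powers (dB)."""
    return w_freq * abs(f_a["freq"] - f_b["freq"]) + w_pow * abs(f_a["power"] - f_b["power"])

def map_formants(formants_a, formants_b):
    """Find the formant correspondence that minimizes the total cost (brute force)."""
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(formants_b))):
        cost = sum(formant_cost(formants_a[i], formants_b[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best

def interpolate(formants_a, formants_b, ratio):
    """Interpolate frequency, phase, power, and window width of matched formants."""
    mapping = map_formants(formants_a, formants_b)
    out = []
    for i, j in enumerate(mapping):
        a, b = formants_a[i], formants_b[j]
        out.append({k: (1 - ratio) * a[k] + ratio * b[k]
                    for k in ("freq", "phase", "power", "window_width")})
    return out

speaker_a = [{"freq": 700.0, "phase": 0.0, "power": 60.0, "window_width": 40.0},
             {"freq": 1200.0, "phase": 0.3, "power": 55.0, "window_width": 35.0}]
speaker_b = [{"freq": 1100.0, "phase": 0.1, "power": 58.0, "window_width": 30.0},
             {"freq": 800.0, "phase": 0.2, "power": 62.0, "window_width": 45.0}]
print(interpolate(speaker_a, speaker_b, ratio=0.5))
```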
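
The sketch below illustrates the workflow described in publication 20140257816, under the assumption that the speech synthesis dictionary stores Gaussian distributions of speech features (for example, log F0 or duration) per phonetic context. The instruction format, feature names, and update rule are hypothetical stand-ins, not the patented design.

```python
# Illustrative sketch of modifying a probability distribution in a speech synthesis dictionary.
from dataclasses import dataclass

@dataclass
class Gaussian:
    mean: float
    variance: float

# Hypothetical dictionary: feature distributions keyed by (phoneme, feature).
dictionary = {
    ("a", "log_f0"): Gaussian(mean=5.2, variance=0.04),
    ("a", "duration"): Gaussian(mean=80.0, variance=25.0),
}

def extract_synthesis_info(feature_sequence):
    """Extracting unit: summarize the synthetic speech (here, per-feature averages)."""
    return {name: sum(vals) / len(vals) for name, vals in feature_sequence.items()}

def modify(dictionary, key, mean_shift=0.0, variance_scale=1.0):
    """Modification unit: adjust one distribution according to the acquired instruction."""
    dist = dictionary[key]
    dictionary[key] = Gaussian(dist.mean + mean_shift, dist.variance * variance_scale)

# Display/acquire steps (stand-ins): show the extracted info, then apply an instruction,
# e.g. "raise the pitch of /a/ slightly".
info = extract_synthesis_info({"log_f0": [5.1, 5.3, 5.2], "duration": [78.0, 82.0]})
print("synthesis info:", info)
modify(dictionary, ("a", "log_f0"), mean_shift=0.1)
print("updated:", dictionary[("a", "log_f0")])
```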
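
The following is a minimal sketch of the fusion step described in patent 8175881 (and publication 20090048844): several candidate speech units selected for a segment are fused frame by frame in the formant-parameter domain, and the fused units are concatenated. Here the fusion is a plain per-frame average of formant frequencies and waveform generation is left as a stub; both are assumptions for illustration, not the patented algorithm.

```python
# Illustrative sketch of per-frame formant-parameter fusion across candidate speech units.
import numpy as np

def fuse_formant_parameters(candidate_units):
    """Fuse formant parameters of several candidate units, frame by frame.

    Each unit is an array of shape (n_frames, n_formants) holding formant
    frequencies; units are assumed to be time-aligned to the same frame count.
    """
    return np.mean(np.stack(candidate_units), axis=0)

def generate_segment_waveform(fused_formants):
    """Stub for generating a fused speech unit from fused formant parameters."""
    return fused_formants  # a real system would synthesize pitch waveforms here

# Two segments, each with two candidate units of 3 frames x 3 formants (Hz).
segments = [
    [np.array([[700, 1200, 2600], [710, 1210, 2580], [705, 1190, 2610]], float),
     np.array([[720, 1180, 2550], [715, 1200, 2560], [700, 1210, 2590]], float)],
    [np.array([[300, 2300, 3000], [310, 2280, 2990], [305, 2310, 3010]], float),
     np.array([[320, 2250, 2950], [315, 2270, 2960], [300, 2300, 2980]], float)],
]
fused_units = [generate_segment_waveform(fuse_formant_parameters(units)) for units in segments]
synthesized = np.concatenate(fused_units)  # concatenate the fused unit of each segment
print(synthesized.shape)
```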
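
Finally, a minimal sketch of summing windowed sinusoids into a pitch-cycle waveform, as outlined in publication 20090326951. Gaussian windows are assumed, and the rule "widen the window bandwidth when the peak-to-boundary power ratio is large" is reduced to shrinking the window's time spread past a fixed threshold; the threshold, window shape, and parameter values are illustrative, not the patented criterion.

```python
# Illustrative sketch of generating a pitch-cycle waveform as a sum of windowed sinusoids.
import numpy as np

FS = 16000          # sampling rate (Hz)
PITCH_PERIOD = 160  # samples in one pitch cycle (100 Hz pitch)

def formant_waveform(freq_hz, phase, power, bandwidth_scale):
    """One formant waveform: a sinusoid shaped by a Gaussian window.

    A larger bandwidth_scale narrows the window in time, which widens the
    formant's spectral bandwidth.
    """
    n = np.arange(PITCH_PERIOD)
    center = PITCH_PERIOD / 2.0
    sigma = PITCH_PERIOD / (6.0 * bandwidth_scale)
    window = np.exp(-0.5 * ((n - center) / sigma) ** 2)
    return power * window * np.sin(2 * np.pi * freq_hz * n / FS + phase)

def pitch_cycle(formants, peak_to_boundary_ratios, ratio_threshold=20.0):
    """Sum formant waveforms; widen the window bandwidth where the ratio is large."""
    cycle = np.zeros(PITCH_PERIOD)
    for (freq, phase, power), ratio in zip(formants, peak_to_boundary_ratios):
        scale = 1.5 if ratio > ratio_threshold else 1.0
        cycle += formant_waveform(freq, phase, power, bandwidth_scale=scale)
    return cycle

formants = [(700.0, 0.0, 1.0), (1200.0, 0.5, 0.6), (2600.0, 1.0, 0.3)]
ratios = [25.0, 15.0, 30.0]   # peak power / boundary power per formant (assumed values)
print(pitch_cycle(formants, ratios)[:5])
```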