Patents by Inventor Gou Hirabayashi

Gou Hirabayashi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10872597
    Abstract: A speech synthesis dictionary delivery device that delivers a dictionary for performing speech synthesis to terminals, comprises a storage device for speech synthesis dictionary database that stores a first dictionary which includes an acoustic model of a speaker and is associated with identification information of the speaker, that stores a second dictionary which includes an acoustic model generated using voice data of a plurality of speakers, and that stores parameter sets of the speakers to be used with the second dictionary and which are associated with identification information of the speakers, a processor that determines one of the first dictionary and the second dictionary, which should be used in the terminal for a specified speaker, and an input output interface (I/F) that receives the identification information of a speaker transmitted from the terminal and then delivers at least one of a first dictionary, the second dictionary, and a parameter set of the second dictionary, on the basis of the rec
    Type: Grant
    Filed: August 8, 2018
    Date of Patent: December 22, 2020
    Assignees: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
    Inventors: Kouichirou Mori, Gou Hirabayashi, Masahiro Morita, Yamato Ohtani
  • Publication number: 20190066656
    Abstract: A speech synthesis dictionary delivery device that delivers a dictionary for performing speech synthesis to terminals, comprises a storage device for speech synthesis dictionary database that stores a first dictionary which includes an acoustic model of a speaker and is associated with identification information of the speaker, that stores a second dictionary which includes an acoustic model generated using voice data of a plurality of speakers, and that stores parameter sets of the speakers to be used with the second dictionary and which are associated with identification information of the speakers, a processor that determines one of the first dictionary and the second dictionary, which should be used in the terminal for a specified speaker, and an input output interface (I/F) that receives the identification information of a speaker transmitted from the terminal and then delivers at least one of a first dictionary, the second dictionary, and a parameter set of the second dictionary, on the basis of the rec
    Type: Application
    Filed: August 8, 2018
    Publication date: February 28, 2019
    Applicants: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
    Inventors: Kouichirou Mori, Gou Hirabayashi, Masahiro Morita, Yamato Ohtani
  • Patent number: 8868422
    Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text. The speech information includes phonologic information and prosody information. The method can divide the speech information into a plurality of speech units, based on at least one of the phonologic information and the prosody information. The method can search at least two speech units from the plurality of speech units. At least one of the phonologic information and the prosody information in the at least two speech units are identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units as a representative speech unit into a memory.
    Type: Grant
    Filed: September 13, 2010
    Date of Patent: October 21, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Gou Hirabayashi, Takehiko Kagoshima
  • Patent number: 8655664
    Abstract: According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.
    Type: Grant
    Filed: August 11, 2011
    Date of Patent: February 18, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kentaro Tachibana, Gou Hirabayashi, Takehiko Kagoshima
  • Publication number: 20120065981
    Abstract: According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.
    Type: Application
    Filed: August 11, 2011
    Publication date: March 15, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kentaro Tachibana, Gou Hirabayashi, Takehiko Kagoshima
  • Publication number: 20110238420
    Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text. The speech information includes phonologic information and prosody information. The method can divide the speech information into a plurality of speech units, based on at least one of the phonologic information and the prosody information. The method can search at least two speech units from the plurality of speech units. At least one of the phonologic information and the prosody information in the at least two speech units are identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units as a representative speech unit into a memory.
    Type: Application
    Filed: September 13, 2010
    Publication date: September 29, 2011
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Gou Hirabayashi, Takehiko Kagoshima
  • Patent number: 7630896
    Abstract: A speech synthesis system in a preferred embodiment includes a speech unit storage section, a phonetic environment storage section, a phonetic sequence/prosodic information input section, a plural-speech-unit selection section, a fused-speech-unit sequence generation section, and a fused-speech-unit modification/concatenation section. By fusing a plurality of selected speech units in the fused speech unit sequence generation section, a fused speech unit is generated. In the fused speech unit sequence generation section, the average power information is calculated for a plurality of selected M speech units, N speech units are fused together, and the power information of the fused speech unit is so corrected as to be equalized with the average power information of the M speech units.
    Type: Grant
    Filed: September 23, 2005
    Date of Patent: December 8, 2009
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Gou Hirabayashi, Takehiko Kagoshima
  • Publication number: 20090112580
    Abstract: The speech processing apparatus configured to split a first speech waveform and a second speech waveform into a plurality of frequency bands respectively to generate a first band speech waveform and a second band speech waveform each being a component of each frequency band; determine an overlap-added position between the first band speech waveform and the second band speech waveform by the each frequency band so that a high cross correlation between the first band speech waveform and the second band speech waveform is obtained; and overlap-add the first band speech waveform and the second band speech waveform by the each frequency band on the basis of the overlap-added position and integrates overlap-added band speech waveforms in the plurality of frequency bands over all the plurality of frequency bands to generate a concatenated speech waveform.
    Type: Application
    Filed: July 21, 2008
    Publication date: April 30, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Gou Hirabayashi, Dawei Xu, Takehiko Kagoshima
  • Publication number: 20090055188
    Abstract: The prosody control unit pattern generation module generates pitch patterns in respective prosody control units based on language attribute information, the phoneme duration and emphasis degree information, the modification method decision module decides a modification method by smoothing processing with respect to the pitch pattern in a connection portion between the prosody control unit and at least one of previous and next prosody control units based on at least emphasis degree information to generate modification method information, and the pattern connection module modifies pitch patterns generated in respective prosody control units by smoothing processing according to the modification method information and connects them to generate a sentence pitch pattern corresponding to a text to be a target for speech synthesis.
    Type: Application
    Filed: February 22, 2008
    Publication date: February 26, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Gou Hirabayashi, Takehiko Kagoshima
  • Publication number: 20070179779
    Abstract: In language information translating device and method, registered vocabulary information pieces of plural users registered into a user dictionary registering unit are referred to, and when plural vocabulary information pieces having the same direction word exist, a direction word to be added to a basic dictionary is extracted on the basis of at least one of the number of registered vocabulary information pieces of the direction word concerned and the number of registered vocabulary information pieces that are registered vocabulary information pieces of the direction word concerned, the second language expressions corresponding to the registered vocabulary information pieces concerned being coincident with one another, and the basic vocabulary information of the extracted direction word is registered in the basic dictionary.
    Type: Application
    Filed: October 26, 2006
    Publication date: August 2, 2007
    Inventors: Takehiko Kagoshima, Gou Hirabayashi, Yuji Shimizu, Dawei Xu
  • Publication number: 20060224391
    Abstract: A speech synthesis system in a preferred embodiment includes a speech unit storage section, a phonetic environment storage section, a phonetic sequence/prosodic information input section, a plural-speech-unit selection section, a fused-speech-unit sequence generation section, and a fused-speech-unit modification/concatenation section. By fusing a plurality of selected speech units in the fused speech unit sequence generation section, a fused speech unit is generated. In the fused speech unit sequence generation section, the average power information is calculated for a plurality of selected M speech units, N speech units are fused together, and the power information of the fused speech unit is so corrected as to be equalized with the average power information of the M speech units.
    Type: Application
    Filed: September 23, 2005
    Publication date: October 5, 2006
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Gou Hirabayashi, Takehiko Kagoshima
  • Publication number: 20060224380
    Abstract: A pitch pattern generating method includes preparing a memory to store a plurality of pitch patterns each extracted from natural speech, and pattern attribute information corresponding to the pitch patterns, inputting language attribute information obtained by analyzing a text including prosody control units, selecting, from the pitch patterns stored in the memory, a group of pitch patterns corresponding to each of the prosody control units based on the language attribute information, to obtain a plurality of groups corresponding to the prosody control units respectively, generating a new pitch pattern corresponding to the each of prosody control units by fusing pitch patterns of the group, to obtain a plurality of new pitch patterns corresponding to the prosody control units respectively, and generating a pitch pattern corresponding to the text based on the new pitch patterns.
    Type: Application
    Filed: March 22, 2006
    Publication date: October 5, 2006
    Inventors: Gou Hirabayashi, Takehiko Kagoshima
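
The pitch pattern generating method in publication 20060224380 (select a group of stored pitch patterns for each prosody control unit, fuse each group into a new pattern, then build the pitch pattern for the whole text) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the publication: the function names, the exact-match lookup of attributes, the use of linear resampling followed by point-wise averaging as the fusion step, and plain concatenation as the final step.

```python
from typing import Dict, List, Sequence


def resample(pattern: Sequence[float], length: int) -> List[float]:
    """Linearly interpolate a pitch pattern to `length` points (assumed normalization)."""
    if length < 1 or not pattern:
        raise ValueError("need a non-empty pattern and a positive length")
    if len(pattern) == 1 or length == 1:
        return [float(pattern[0])] * length
    scale = (len(pattern) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(pattern) - 1)
        frac = pos - lo
        out.append(pattern[lo] * (1 - frac) + pattern[hi] * frac)
    return out


def fuse(group: Sequence[Sequence[float]], length: int) -> List[float]:
    """Fuse a group of patterns by averaging them point by point (assumed fusion rule)."""
    resampled = [resample(p, length) for p in group]
    return [sum(column) / len(column) for column in zip(*resampled)]


def generate_text_pitch(
    unit_attributes: Sequence[str],
    memory: Dict[str, List[List[float]]],
    group_size: int = 2,
    length: int = 3,
) -> List[float]:
    """Select, fuse, and concatenate patterns for each prosody control unit.

    `memory` maps a hypothetical attribute label to stored pitch patterns;
    selection is a simple exact-match lookup that keeps the first `group_size`.
    """
    text_pattern: List[float] = []
    for attribute in unit_attributes:
        group = memory[attribute][:group_size]
        text_pattern.extend(fuse(group, length))
    return text_pattern


# Hypothetical stored patterns (values in Hz) keyed by attribute label.
memory = {
    "rising": [[100.0, 200.0], [120.0, 220.0, 320.0]],
    "flat": [[150.0, 150.0]],
}
sentence = generate_text_pitch(["rising", "flat"], memory)
print(sentence)
```

A real system would select patterns by a cost over many language attributes and would smooth the joins between units; the sketch keeps only the select, fuse, and combine structure named in the abstract.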