Patents by Inventor Gou Hirabayashi
Gou Hirabayashi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10872597
Abstract: A speech synthesis dictionary delivery device that delivers a dictionary for performing speech synthesis to terminals, comprises a storage device for speech synthesis dictionary database that stores a first dictionary which includes an acoustic model of a speaker and is associated with identification information of the speaker, that stores a second dictionary which includes an acoustic model generated using voice data of a plurality of speakers, and that stores parameter sets of the speakers to be used with the second dictionary and which are associated with identification information of the speakers, a processor that determines one of the first dictionary and the second dictionary, which should be used in the terminal for a specified speaker, and an input output interface (I/F) that receives the identification information of a speaker transmitted from the terminal and then delivers at least one of a first dictionary, the second dictionary, and a parameter set of the second dictionary, on the basis of the rec
Type: Grant
Filed: August 8, 2018
Date of Patent: December 22, 2020
Assignees: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
Inventors: Kouichirou Mori, Gou Hirabayashi, Masahiro Morita, Yamato Ohtani
-
Publication number: 20190066656
Abstract: A speech synthesis dictionary delivery device that delivers a dictionary for performing speech synthesis to terminals, comprises a storage device for speech synthesis dictionary database that stores a first dictionary which includes an acoustic model of a speaker and is associated with identification information of the speaker, that stores a second dictionary which includes an acoustic model generated using voice data of a plurality of speakers, and that stores parameter sets of the speakers to be used with the second dictionary and which are associated with identification information of the speakers, a processor that determines one of the first dictionary and the second dictionary, which should be used in the terminal for a specified speaker, and an input output interface (I/F) that receives the identification information of a speaker transmitted from the terminal and then delivers at least one of a first dictionary, the second dictionary, and a parameter set of the second dictionary, on the basis of the rec
Type: Application
Filed: August 8, 2018
Publication date: February 28, 2019
Applicants: Kabushiki Kaisha Toshiba, Toshiba Digital Solutions Corporation
Inventors: Kouichirou MORI, Gou HIRABAYASHI, Masahiro MORITA, Yamato OHTANI
-
Patent number: 8868422
Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text. The speech information includes phonologic information and prosody information. The method can divide the speech information into a plurality of speech units, based on at least one of the phonologic information and the prosody information. The method can search at least two speech units from the plurality of speech units. At least one of the phonologic information and the prosody information in the at least two speech units are identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units as a representative speech unit into a memory.
Type: Grant
Filed: September 13, 2010
Date of Patent: October 21, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventors: Gou Hirabayashi, Takehiko Kagoshima
-
Patent number: 8655664
Abstract: According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.
Type: Grant
Filed: August 11, 2011
Date of Patent: February 18, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventors: Kentaro Tachibana, Gou Hirabayashi, Takehiko Kagoshima
-
Publication number: 20120065981
Abstract: According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.
Type: Application
Filed: August 11, 2011
Publication date: March 15, 2012
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kentaro Tachibana, Gou Hirabayashi, Takehiko Kagoshima
-
Publication number: 20110238420
Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text. The speech information includes phonologic information and prosody information. The method can divide the speech information into a plurality of speech units, based on at least one of the phonologic information and the prosody information. The method can search at least two speech units from the plurality of speech units. At least one of the phonologic information and the prosody information in the at least two speech units are identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units as a representative speech unit into a memory.
Type: Application
Filed: September 13, 2010
Publication date: September 29, 2011
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Gou Hirabayashi, Takehiko Kagoshima
-
Patent number: 7630896
Abstract: A speech synthesis system in a preferred embodiment includes a speech unit storage section, a phonetic environment storage section, a phonetic sequence/prosodic information input section, a plural-speech-unit selection section, a fused-speech-unit sequence generation section, and a fused-speech-unit modification/concatenation section. By fusing a plurality of selected speech units in the fused speech unit sequence generation section, a fused speech unit is generated. In the fused speech unit sequence generation section, the average power information is calculated for a plurality of selected M speech units, N speech units are fused together, and the power information of the fused speech unit is so corrected as to be equalized with the average power information of the M speech units.
Type: Grant
Filed: September 23, 2005
Date of Patent: December 8, 2009
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Gou Hirabayashi, Takehiko Kagoshima
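Illustration (not from the patent): the power correction described in the abstract above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: waveforms are equal-length lists of floats, "power" is taken to mean mean squared amplitude, and fusion is a plain sample-wise average. The function names average_power, fuse_units, and equalize_power are hypothetical, and the fusion method actually claimed in the patent may differ.

```python
def average_power(waveform):
    """Mean squared amplitude of a waveform (assumed definition of power)."""
    return sum(x * x for x in waveform) / len(waveform)


def fuse_units(units):
    """Hypothetical fusion step: sample-wise average of equal-length units."""
    length = len(units[0])
    return [sum(u[i] for u in units) / len(units) for i in range(length)]


def equalize_power(fused, selected_units):
    """Scale `fused` so its power equals the mean power of the selected units."""
    target = sum(average_power(u) for u in selected_units) / len(selected_units)
    current = average_power(fused)
    if current == 0.0:
        # A silent fused unit cannot be rescaled; return it unchanged.
        return list(fused)
    gain = (target / current) ** 0.5
    return [gain * x for x in fused]


# M = 3 selected units; N = 2 of them are fused, then power-corrected.
selected = [
    [1.0, -1.0, 1.0, -1.0],
    [0.5, -0.5, 0.5, -0.5],
    [2.0, -2.0, 2.0, -2.0],
]
fused = fuse_units(selected[:2])
corrected = equalize_power(fused, selected)
```

Averaging waveforms tends to lower their power, which is why a gain is applied after fusion so that the fused unit matches the level of the M selected units.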
-
Publication number: 20090112580
Abstract: The speech processing apparatus configured to split a first speech waveform and a second speech waveform into a plurality of frequency bands respectively to generate a first band speech waveform and a second band speech waveform each being a component of each frequency band; determine an overlap-added position between the first band speech waveform and the second band speech waveform by the each frequency band so that a high cross correlation between the first band speech waveform and the second band speech waveform is obtained; and overlap-add the first band speech waveform and the second band speech waveform by the each frequency band on the basis of the overlap-added position and integrates overlap-added band speech waveforms in the plurality of frequency bands over all the plurality of frequency bands to generate a concatenated speech waveform.
Type: Application
Filed: July 21, 2008
Publication date: April 30, 2009
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Gou Hirabayashi, Dawei Xu, Takehiko Kagoshima
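Illustration (not from the application): the per-band step in the abstract above, choosing an overlap position with high cross correlation and then overlap-adding, can be sketched roughly as follows. This is a minimal sketch for a single band only. Band splitting and the final integration over bands are omitted, the linear cross-fade is an assumption, and the names cross_correlation, best_overlap_shift, and overlap_add are hypothetical.

```python
def cross_correlation(a, b):
    """Unnormalized cross correlation (dot product) of two equal-length frames."""
    return sum(x * y for x, y in zip(a, b))


def best_overlap_shift(first, second, overlap, max_shift):
    """Find the offset into `second` whose first `overlap` samples best match
    the last `overlap` samples of `first`."""
    tail = first[-overlap:]
    best_shift, best_score = 0, float("-inf")
    for shift in range(max_shift + 1):
        head = second[shift:shift + overlap]
        if len(head) < overlap:
            break  # not enough samples left in `second`
        score = cross_correlation(tail, head)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


def overlap_add(first, second, overlap, shift):
    """Cross-fade the tail of `first` into `second` starting at `shift`."""
    tail = first[-overlap:]
    head = second[shift:shift + overlap]
    mixed = []
    for i in range(overlap):
        weight = (i + 1) / (overlap + 1)  # assumed linear fade-in weight
        mixed.append((1.0 - weight) * tail[i] + weight * head[i])
    return first[:-overlap] + mixed + second[shift + overlap:]


# One band of each waveform; the second is out of phase with the first.
first_band = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
second_band = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]
shift = best_overlap_shift(first_band, second_band, overlap=4, max_shift=4)
joined = overlap_add(first_band, second_band, overlap=4, shift=shift)
```

Searching for the position separately in each band lets each band align on its own phase, rather than forcing a single compromise position for the full-band signal.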
-
Publication number: 20090055188
Abstract: The prosody control unit pattern generation module generates pitch patterns in respective prosody control units based on language attribute information, the phoneme duration and emphasis degree information, the modification method decision module decides a modification method by smoothing processing with respect to the pitch pattern in a connection portion between the prosody control unit and at least one of previous and next prosody control units based on at least emphasis degree information to generate modification method information, and the pattern connection module modifies pitch patterns generated in respective prosody control units by smoothing processing according to the modification method information and connects them to generate a sentence pitch pattern corresponding to a text to be a target for speech synthesis.
Type: Application
Filed: February 22, 2008
Publication date: February 26, 2009
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Gou Hirabayashi, Takehiko Kagoshima
-
Publication number: 20070179779
Abstract: In language information translating device and method, registered vocabulary information pieces of plural users registered into a user dictionary registering unit are referred to, and when plural vocabulary information pieces having the same direction word exist, a direction word to be added to a basic dictionary is extracted on the basis of at least one of the number of registered vocabulary information pieces of the direction word concerned and the number of registered vocabulary information pieces that are registered vocabulary information pieces of the direction word concerned, the second language expressions corresponding to the registered vocabulary information pieces concerned being coincident with one another, and the basic vocabulary information of the extracted direction word is registered in the basic dictionary.
Type: Application
Filed: October 26, 2006
Publication date: August 2, 2007
Inventors: Takehiko Kagoshima, Gou Hirabayashi, Yuji Shimizu, Dawei Xu
-
Publication number: 20060224391
Abstract: A speech synthesis system in a preferred embodiment includes a speech unit storage section, a phonetic environment storage section, a phonetic sequence/prosodic information input section, a plural-speech-unit selection section, a fused-speech-unit sequence generation section, and a fused-speech-unit modification/concatenation section. By fusing a plurality of selected speech units in the fused speech unit sequence generation section, a fused speech unit is generated. In the fused speech unit sequence generation section, the average power information is calculated for a plurality of selected M speech units, N speech units are fused together, and the power information of the fused speech unit is so corrected as to be equalized with the average power information of the M speech units.
Type: Application
Filed: September 23, 2005
Publication date: October 5, 2006
Applicant: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Gou Hirabayashi, Takehiko Kagoshima
-
Publication number: 20060224380
Abstract: A pitch pattern generating method includes preparing a memory to store a plurality of pitch patterns each extracted from natural speech, and pattern attribute information corresponding to the pitch patterns, inputting language attribute information obtained by analyzing a text including prosody control units, selecting, from the pitch patterns stored in the memory, a group of pitch patterns corresponding to each of the prosody control units based on the language attribute information, to obtain a plurality of groups corresponding to the prosody control units respectively, generating a new pitch pattern corresponding to the each of prosody control units by fusing pitch patterns of the group, to obtain a plurality of new pitch patterns corresponding to the prosody control units respectively, and generating a pitch pattern corresponding to the text based on the new pitch patterns.
Type: Application
Filed: March 22, 2006
Publication date: October 5, 2006
Inventors: Gou Hirabayashi, Takehiko Kagoshima