Patents by Inventor Takehiko Kagoshima
Takehiko Kagoshima has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20090043568
Abstract: An accent type is determined by outputting mora synchronized signals, extracting a pitch pattern, which is the variation pattern of voice height (fundamental frequency), from a speech signal entered by a user, generating a mora synchronized pattern from the pitch pattern and the mora synchronized signals, storing typical patterns for the respective accent types, collating the mora synchronized pattern with the reference accent patterns, calculating a matching score of the mora synchronized pattern for each accent type, and determining the accent type by referring to the matching scores.
Type: Application
Filed: February 20, 2008
Publication date: February 12, 2009
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventor: Takehiko Kagoshima
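The matching step described above can be sketched as a nearest-pattern classifier: compare the user's mora-synchronized pitch pattern against one stored reference pattern per accent type and pick the closest. The function names, the Euclidean distance measure, and the example contours below are illustrative assumptions, not the patented formulas.

```python
import math

def classify_accent_type(mora_pattern, reference_patterns):
    """Return the accent type whose stored reference pattern best matches.

    reference_patterns maps accent-type name -> reference pitch contour.
    Euclidean distance stands in for the patent's matching calculation.
    """
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(reference_patterns,
               key=lambda t: distance(mora_pattern, reference_patterns[t]))

# Hypothetical per-mora reference contours for two Japanese accent types.
references = {
    "type0": [0.2, 0.8, 0.9, 0.9],   # flat (heiban)-style contour
    "type1": [1.0, 0.6, 0.4, 0.3],   # initial-accent-style contour
}
print(classify_accent_type([0.9, 0.7, 0.4, 0.2], references))  # prints "type1"
```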
-
Publication number: 20090018836
Abstract: In a speech synthesis, a selecting unit selects one string from first speech unit strings corresponding to a first segment sequence obtained by dividing a phoneme string corresponding to target speech into segments. The selecting unit repeatedly generates, from at most W second speech unit strings corresponding to a second segment sequence that is a partial sequence of the first sequence, third speech unit strings corresponding to a third segment sequence obtained by adding a segment to the second sequence, and selects at most W strings from the third strings based on an evaluation value of each of the third strings. The value is obtained by correcting the total cost of each third string candidate with a penalty coefficient. The coefficient is based on a restriction concerning quickness of speech unit data acquisition, and depends on the extent to which the restriction is approached.
Type: Application
Filed: March 19, 2008
Publication date: January 15, 2009
Applicant: Kabushiki Kaisha Toshiba
Inventors: Masahiro Morita, Takehiko Kagoshima
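The selection loop described above is a beam search of width W: each kept candidate string is extended by one segment, every extension is scored by its total cost corrected with a penalty coefficient, and only the best W survive. The cost and penalty functions below are placeholders, not the patented measures.

```python
def select_unit_string(segments, candidates_for, unit_cost, penalty, W=3):
    """Illustrative beam search over speech unit strings.

    candidates_for(seg) yields candidate units for a segment,
    unit_cost(seg, u) is an additive cost, and penalty(units) models the
    data-acquisition-quickness restriction (1.0 = no restriction).
    """
    beams = [([], 0.0)]  # (unit string so far, accumulated raw cost)
    for seg in segments:
        extended = []
        for units, cost in beams:
            for u in candidates_for(seg):
                extended.append((units + [u], cost + unit_cost(seg, u)))
        # evaluation value = total cost corrected by the penalty coefficient
        extended.sort(key=lambda p: p[1] * penalty(p[0]))
        beams = extended[:W]  # keep at most W strings
    return min(beams, key=lambda p: p[1] * penalty(p[0]))[0]

best = select_unit_string(
    segments=[0, 1, 2],
    candidates_for=lambda seg: ["a", "b"],
    unit_cost=lambda seg, u: 1.0 if u == "a" else 2.0,
    penalty=lambda units: 1.0,   # toy case: no acquisition restriction
)
print(best)  # prints ['a', 'a', 'a']
```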
-
Publication number: 20080312931
Abstract: A speech synthesis system stores a group of speech units in a memory and selects a plurality of speech units from the group based on prosodic information of target speech. The speech units are selected for each of the segments obtained by segmenting a phoneme string of the target speech, so as to minimize distortion between the synthetic speech generated from the selected speech units and the target speech. The system generates a new speech unit for each segment by fusing the selected speech units, obtaining a plurality of new speech units corresponding to the respective segments, and generates synthetic speech by concatenating the new speech units.
Type: Application
Filed: August 18, 2008
Publication date: December 18, 2008
Inventors: Tatsuya Mizutani, Takehiko Kagoshima
-
Publication number: 20080027727
Abstract: A speech unit corpus stores a group of speech units. A selection unit divides a phoneme sequence of target speech into a plurality of segments, and selects a combination of speech units for each segment from the speech unit corpus. An estimation unit estimates a distortion between the target speech and synthesized speech generated by fusing the speech units of the combination for each segment. The selection unit recursively selects the combination of speech units for each segment based on the distortion. A fusion unit generates a new speech unit for each segment by fusing the speech units of the combination selected for that segment. A concatenation unit generates synthesized speech by concatenating the new speech units.
Type: Application
Filed: July 23, 2007
Publication date: January 31, 2008
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masahiro Morita, Takehiko Kagoshima
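A minimal sketch of the fusion step: the units selected for a segment are combined into one new unit. Sample-wise averaging of equal-length waveforms is one common way to "fuse" units and is used here purely for illustration; the patent's actual fusion method may differ.

```python
def fuse_units(units):
    """Average several equal-length waveforms into one fused waveform."""
    n = len(units)
    return [sum(samples) / n for samples in zip(*units)]

fused = fuse_units([[1.0, 2.0, 3.0],
                    [3.0, 2.0, 1.0]])
print(fused)  # prints [2.0, 2.0, 2.0]
```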
-
Publication number: 20070271099
Abstract: A waveform memory stores a plurality of speech unit waveforms. An information memory stores speech unit information together with an address of each of the speech unit waveforms. A selector selects a speech unit sequence corresponding to an input phoneme sequence by referring to the speech unit information. A speech unit waveform acquisition unit acquires the speech unit waveform corresponding to each speech unit of the sequence from the waveform memory by referring to the address. A speech unit concatenation unit generates speech by concatenating the acquired speech unit waveforms. The acquisition unit can acquire at least two speech unit waveforms, corresponding to at least two speech units included in the sequence, from a continuous region of the waveform memory during one access.
Type: Application
Filed: May 8, 2007
Publication date: November 22, 2007
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventor: Takehiko Kagoshima
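The single-access idea can be sketched as follows: when consecutive units in the selected sequence occupy a contiguous region of the waveform store, they are fetched with one read instead of several. The data layout, the accessor callbacks, and all names below are illustrative assumptions.

```python
def acquire_waveforms(sequence, address_of, length_of, read):
    """Fetch unit waveforms, merging reads for units stored contiguously."""
    waveforms = []
    i = 0
    while i < len(sequence):
        start = address_of(sequence[i])
        end = start + length_of(sequence[i])
        j = i + 1
        # extend the region while the next unit starts where this one ends
        while j < len(sequence) and address_of(sequence[j]) == end:
            end += length_of(sequence[j])
            j += 1
        block = read(start, end - start)        # one access for the region
        offset = 0
        for k in range(i, j):                   # split block per unit
            n = length_of(sequence[k])
            waveforms.append(block[offset:offset + n])
            offset += n
        i = j
    return waveforms

memory = list(range(10))     # toy waveform store
reads = []
def read(addr, n):
    reads.append((addr, n))
    return memory[addr:addr + n]

address = {"u0": 0, "u1": 3, "u2": 7}
length = {"u0": 3, "u1": 2, "u2": 2}
waves = acquire_waveforms(["u0", "u1", "u2"],
                          address.__getitem__, length.__getitem__, read)
# "u0" and "u1" are adjacent, so they share one read; "u2" needs its own.
```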
-
Publication number: 20070179779
Abstract: In a language information translating device and method, the registered vocabulary information of plural users, stored in a user dictionary registering unit, is examined. When plural vocabulary information pieces share the same direction word, a direction word to be added to a basic dictionary is extracted based on at least one of the number of vocabulary information pieces registered for that direction word and the number of such pieces whose corresponding second-language expressions coincide with one another. The basic vocabulary information of the extracted direction word is then registered in the basic dictionary.
Type: Application
Filed: October 26, 2006
Publication date: August 2, 2007
Inventors: Takehiko Kagoshima, Gou Hirabayashi, Yuji Shimizu, Dawei Xu
-
Patent number: 7251601
Abstract: A speech synthesis method comprises selecting predetermined formant parameters from formant parameters according to a pitch pattern, phoneme duration, and phoneme symbol string; generating a plurality of sine waves based on the formant frequency and formant phase of the selected formant parameters; multiplying the sine waves by the windowing functions of the selected formant parameters, respectively, to generate a plurality of formant waveforms; adding the formant waveforms to generate a plurality of pitch waveforms; and superposing the pitch waveforms according to a pitch period to generate a speech signal.
Type: Grant
Filed: March 21, 2002
Date of Patent: July 31, 2007
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takehiko Kagoshima, Masami Akamine
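The formant-waveform generation step can be sketched as: each formant contributes a sine wave at its formant frequency and phase, multiplied by its window function, and the windowed formant waveforms are summed into one pitch waveform. All parameter values below are illustrative, not from the patent.

```python
import math

def pitch_waveform(formants, n_samples, sample_rate):
    """Sum windowed formant sine waves into one pitch waveform.

    formants: list of (frequency_hz, phase_rad, window) tuples, where
    window is a list of n_samples amplitude values.
    """
    wave = [0.0] * n_samples
    for freq, phase, window in formants:
        for t in range(n_samples):
            s = math.sin(2 * math.pi * freq * t / sample_rate + phase)
            wave[t] += window[t] * s   # windowed formant waveform
    return wave

# One toy formant with a rectangular window; the first sample is sin(pi/2).
pw = pitch_waveform([(100.0, math.pi / 2, [1.0] * 4)],
                    n_samples=4, sample_rate=8000)
```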
-
Publication number: 20070168189
Abstract: A speech processing apparatus according to an embodiment of the invention includes a conversion-source-speaker speech-unit database, a voice-conversion-rule-learning-data generating means, and a voice-conversion-rule learning means, with which it makes voice conversion rules. The voice-conversion-rule-learning-data generating means includes a conversion-target-speaker speech-unit extracting means, an attribute-information generating means, a conversion-source-speaker speech-unit database, and a conversion-source-speaker speech-unit selection means.
Type: Application
Filed: September 19, 2006
Publication date: July 19, 2007
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune Tamura, Takehiko Kagoshima
-
Patent number: 7184958
Abstract: A speech synthesis method subjects a reference speech signal to windowing to extract a speech pitch wave, using a window function whose length is double a pitch period of the reference speech signal. A linear prediction coefficient is generated by subjecting the reference speech signal to a linear prediction analysis. The speech pitch wave is subjected to inverse filtering based on the linear prediction coefficient to produce a residual pitch wave, which is then stored as information of a speech synthesis unit in a voiced period in a storage. Speech is then synthesized using the information of the speech synthesis unit.
Type: Grant
Filed: March 5, 2004
Date of Patent: February 27, 2007
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takehiko Kagoshima, Masami Akamine
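The inverse-filtering step is the textbook LPC residual computation: given prediction coefficients a[1..p], apply the analysis filter A(z) = 1 - sum_k a_k z^-k to the windowed pitch wave. The sketch below shows only this standard operation, not the full patented pipeline.

```python
def lpc_residual(pitch_wave, lpc_coeffs):
    """Apply the LPC analysis (inverse) filter to obtain the residual.

    residual[n] = x[n] - sum_{k=1}^{p} a_k * x[n-k], with samples before
    the start of the wave treated as zero.
    """
    p = len(lpc_coeffs)
    residual = []
    for n, x in enumerate(pitch_wave):
        predicted = sum(lpc_coeffs[k] * pitch_wave[n - 1 - k]
                        for k in range(p) if n - 1 - k >= 0)
        residual.append(x - predicted)
    return residual

# With a single coefficient of 1.0 the residual is the first difference.
print(lpc_residual([1.0, 2.0, 3.0], [1.0]))  # prints [1.0, 1.0, 1.0]
```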
-
Publication number: 20060271367
Abstract: A pitch pattern generation method which enables generation of a stable pitch pattern with high naturalness is provided. A pattern selection part 10 selects N pitch patterns 101 and M pitch patterns 103 for each prosody control unit from pitch patterns stored in a pitch pattern storage part 14, based on language attribute information 100 obtained by analyzing a text and on phoneme duration 111. A pattern shape generation part 11 fuses the N selected pitch patterns 101 based on the language attribute information 100 to generate a fused pitch pattern, and expands or contracts the fused pitch pattern along the time axis in accordance with the phoneme duration 111 to generate a new pitch pattern 102. An offset control part 12 calculates a statistic of offset values from the M selected pitch patterns 103 and deforms the pitch pattern 102 in accordance with that statistic to output a pitch pattern 104. A pattern connection part 13 connects the pitch pattern 104 generated for each prosody control unit.
Type: Application
Filed: September 23, 2005
Publication date: November 30, 2006
Applicant: Kabushiki Kaisha Toshiba
Inventors: Go Hirabayashi, Takehiko Kagoshima
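The offset-control step can be sketched as: a statistic of the offset values of the M selected patterns shifts the fused pattern so its overall pitch level matches typical natural realizations. Taking the mean as the statistic and "offset = average pitch of a pattern" are illustrative assumptions.

```python
from statistics import mean

def apply_offset(fused_pattern, selected_patterns):
    """Shift the fused pitch pattern toward the mean offset of the
    M selected patterns (offset approximated as a pattern's average)."""
    target_offset = mean(mean(p) for p in selected_patterns)
    shift = target_offset - mean(fused_pattern)
    return [v + shift for v in fused_pattern]

# The fused pattern's level (mean 2.0) is raised to the selected
# patterns' average level (mean of 5.0 and 3.0 = 4.0).
out = apply_offset([1.0, 2.0, 3.0], [[4.0, 5.0, 6.0], [2.0, 3.0, 4.0]])
print(out)  # prints [3.0, 4.0, 5.0]
```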
-
Publication number: 20060224380
Abstract: A pitch pattern generating method includes preparing a memory to store a plurality of pitch patterns, each extracted from natural speech, and pattern attribute information corresponding to the pitch patterns; inputting language attribute information obtained by analyzing a text including prosody control units; selecting, from the pitch patterns stored in the memory, a group of pitch patterns corresponding to each of the prosody control units based on the language attribute information, to obtain a plurality of groups corresponding to the respective prosody control units; generating a new pitch pattern corresponding to each of the prosody control units by fusing the pitch patterns of its group, to obtain a plurality of new pitch patterns; and generating a pitch pattern corresponding to the text based on the new pitch patterns.
Type: Application
Filed: March 22, 2006
Publication date: October 5, 2006
Inventors: Gou Hirabayashi, Takehiko Kagoshima
-
Publication number: 20060224391
Abstract: A speech synthesis system in a preferred embodiment includes a speech unit storage section, a phonetic environment storage section, a phonetic sequence/prosodic information input section, a plural-speech-unit selection section, a fused-speech-unit sequence generation section, and a fused-speech-unit modification/concatenation section. The fused-speech-unit sequence generation section generates a fused speech unit by fusing a plurality of selected speech units. In that section, the average power information is calculated for the M selected speech units, N speech units are fused together, and the power information of the fused speech unit is corrected so as to be equalized with the average power information of the M speech units.
Type: Application
Filed: September 23, 2005
Publication date: October 5, 2006
Applicant: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Gou Hirabayashi, Takehiko Kagoshima
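The power-correction step can be sketched as: scale the fused unit so its power equals the average power of the M originally selected units. Power is taken here as mean squared amplitude; the exact measure in the patent may differ.

```python
from math import sqrt

def power(wave):
    """Mean squared amplitude of a waveform."""
    return sum(s * s for s in wave) / len(wave)

def correct_power(fused, selected_m):
    """Scale the fused unit to the average power of the M selected units."""
    target = sum(power(w) for w in selected_m) / len(selected_m)
    gain = sqrt(target / power(fused))
    return [s * gain for s in fused]

# The fused unit (power 1.0) is scaled up to the selected units'
# average power of 4.0, i.e. by a gain of 2.0.
out = correct_power([1.0, 1.0], [[2.0, 2.0], [2.0, 2.0]])
print(out)  # prints [2.0, 2.0]
```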
-
Publication number: 20050137870
Abstract: A speech synthesis system stores a group of speech units in a memory and selects a plurality of speech units from the group based on prosodic information of target speech. The speech units are selected for each of the segments obtained by segmenting a phoneme string of the target speech, so as to minimize distortion between the synthetic speech generated from the selected speech units and the target speech. The system generates a new speech unit for each segment by fusing the selected speech units, obtaining a plurality of new speech units corresponding to the respective segments, and generates synthetic speech by concatenating the new speech units.
Type: Application
Filed: November 26, 2004
Publication date: June 23, 2005
Inventors: Tatsuya Mizutani, Takehiko Kagoshima
-
Publication number: 20040172251
Abstract: In a synthesis unit generator, a plurality of synthesis speech segments are generated by synthesizing training speech segments labeled with phonetic contexts and input speech segments, while altering the pitch/duration of the input speech segments in accordance with the pitch/duration of the training speech segments. Typical speech segments are selected from the input speech segments on the basis of a distance between the synthesis speech segments and the training speech segments, and are stored in a storage. In addition, a plurality of phonetic context clusters corresponding to the synthesis units are generated on the basis of the distance, and are stored in a storage. A synthesis speech signal is generated by reading out, from the storage, those synthesis units which correspond to the phonetic context clusters including the phonetic contexts of input phonemes, and connecting the selected synthesis units in a speech synthesizer.
Type: Application
Filed: March 5, 2004
Publication date: September 2, 2004
Inventors: Takehiko Kagoshima, Masami Akamine
-
Patent number: 6760703
Abstract: A speech synthesis method generates a speech pitch wave from a reference speech signal by subjecting the reference speech signal to either a Fourier transform or a Fourier series expansion to produce a discrete spectrum, interpolating the discrete spectrum to generate a continuous spectrum, and subjecting the continuous spectrum to an inverse Fourier transform. A linear prediction coefficient is generated by subjecting the reference speech signal to a linear prediction analysis. The speech pitch wave is subjected to inverse filtering based on the linear prediction coefficient to produce a residual pitch wave. Information regarding the residual pitch wave is stored as information of a speech synthesis unit in a voiced period. Speech is then synthesized using the information of the speech synthesis unit.
Type: Grant
Filed: October 7, 2002
Date of Patent: July 6, 2004
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takehiko Kagoshima, Masami Akamine
-
Publication number: 20030088418
Abstract: In a synthesis unit generator, a plurality of synthesis speech segments are generated by synthesizing training speech segments labeled with phonetic contexts and input speech segments, while altering the pitch/duration of the input speech segments in accordance with the pitch/duration of the training speech segments. Typical speech segments are selected from the input speech segments on the basis of a distance between the synthesis speech segments and the training speech segments, and are stored in a storage. In addition, a plurality of phonetic context clusters corresponding to the synthesis units are generated on the basis of the distance, and are stored in a storage. A synthesis speech signal is generated by reading out, from the storage, those synthesis units which correspond to the phonetic context clusters including the phonetic contexts of input phonemes, and connecting the selected synthesis units in a speech synthesizer.
Type: Application
Filed: October 7, 2002
Publication date: May 8, 2003
Inventors: Takehiko Kagoshima, Masami Akamine
-
Patent number: 6553343
Abstract: A speech synthesis method subjects a reference speech signal to windowing to extract an aperiodic speech pitch wave from the reference speech signal. A linear prediction coefficient is generated by subjecting the reference speech signal to a linear prediction analysis. The aperiodic speech pitch wave is subjected to inverse filtering based on the linear prediction coefficient to produce a residual pitch wave. Information regarding the residual pitch wave is stored in the storage as information of a speech synthesis unit in a voiced period. Speech is then synthesized using the information of the speech synthesis unit.
Type: Grant
Filed: October 29, 2001
Date of Patent: April 22, 2003
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takehiko Kagoshima, Masami Akamine
-
Patent number: 6529874
Abstract: A representative pattern memory stores a plurality of initial representative patterns as noise patterns, with a different attribute affixed to each initial representative pattern. A pitch pattern memory stores a large number of natural pitch patterns as accent phrases. A clustering unit assigns each natural pitch pattern to an initial representative pattern based on the attribute of the accent phrase. A transformation parameter generation unit calculates the error between a transformed representative pattern and each natural pitch pattern assigned to the initial representative pattern. A representative pattern generation unit calculates an evaluation function as the sum of the errors between the transformed representative pattern and each natural pitch pattern assigned to it, and updates each initial representative pattern accordingly.
Type: Grant
Filed: September 8, 1998
Date of Patent: March 4, 2003
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takehiko Kagoshima, Takaaki Nii, Shigenobu Seto, Masahiro Morita, Masami Akamine, Yoshinori Shiga
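The update loop described above resembles a k-means-style procedure: natural pitch patterns are assigned to representative patterns, and each representative is re-estimated to minimize the summed error over its cluster, which for squared error is the element-wise mean. The sketch below assigns by nearest distance rather than by attribute and omits the transformation step, so it is a simplification of the patented procedure.

```python
def update_representatives(natural_patterns, representatives):
    """One k-means-style update of representative pitch patterns."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Assign each natural pattern to its nearest representative.
    clusters = [[] for _ in representatives]
    for p in natural_patterns:
        i = min(range(len(representatives)),
                key=lambda j: dist(p, representatives[j]))
        clusters[i].append(p)

    # Re-estimate each representative as its cluster's element-wise mean
    # (the minimizer of summed squared error); keep it if the cluster is empty.
    return [
        [sum(col) / len(cluster) for col in zip(*cluster)] if cluster else rep
        for cluster, rep in zip(clusters, representatives)
    ]

natural = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
reps = [[0.0, 1.0], [10.0, 11.0]]
print(update_representatives(natural, reps))  # prints [[0.0, 1.0], [10.0, 11.0]]
```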
-
Publication number: 20020138253
Abstract: A speech synthesis method comprises selecting predetermined formant parameters from formant parameters according to a pitch pattern, phoneme duration, and phoneme symbol string; generating a plurality of sine waves based on the formant frequency and formant phase of the selected formant parameters; multiplying the sine waves by the windowing functions of the selected formant parameters, respectively, to generate a plurality of formant waveforms; adding the formant waveforms to generate a plurality of pitch waveforms; and superposing the pitch waveforms according to a pitch period to generate a speech signal.
Type: Application
Filed: March 21, 2002
Publication date: September 26, 2002
Inventors: Takehiko Kagoshima, Masami Akamine
-
Patent number: 6332121
Abstract: In a synthesis unit generator, a plurality of synthesis speech segments are generated by synthesizing training speech segments labeled with phonetic contexts and input speech segments, while altering the pitch/duration of the input speech segments in accordance with the pitch/duration of the training speech segments. Typical speech segments are selected from the input speech segments on the basis of a distance between the synthesis speech segments and the training speech segments, and are stored in a storage. In addition, a plurality of phonetic context clusters corresponding to the synthesis units are generated on the basis of the distance, and are stored in a storage. A synthesis speech signal is generated by reading out, from the storage, those synthesis units which correspond to the phonetic context clusters including the phonetic contexts of input phonemes, and connecting the selected synthesis units in a speech synthesizer.
Type: Grant
Filed: November 27, 2000
Date of Patent: December 18, 2001
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takehiko Kagoshima, Masami Akamine