Patents by Inventor Takehiko Kagoshima

Takehiko Kagoshima has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20120053933
    Abstract: According to one embodiment, a first storage unit stores n band noise signals obtained by applying n band-pass filters to a noise signal. A second storage unit stores n band pulse signals. A parameter input unit inputs a fundamental frequency, n band noise intensities, and a spectrum parameter. An extraction unit extracts, for each pitch mark, the n band noise signals while shifting the extraction position. An amplitude control unit changes the amplitudes of the extracted band noise signals and the band pulse signals in accordance with the band noise intensities. A generation unit generates a mixed sound source signal by adding the n band noise signals and the n band pulse signals, superimposing the result based on the pitch marks. A vocal tract filter unit generates a speech waveform by applying a vocal tract filter using the spectrum parameter to the generated mixed sound source signal.
    Type: Application
    Filed: March 18, 2011
    Publication date: March 1, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
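The per-band mixing step described in the abstract above can be sketched in a few lines. This is a hypothetical illustration only: the function name, the use of (1 − a) for the pulse weight, and plain Python lists are assumptions, not the patented implementation.

```python
def mixed_excitation(band_pulses, band_noises, noise_intensities):
    """Mix n band pulse and n band noise signals into one excitation signal.

    Hypothetical sketch: each band's noise is scaled by its noise intensity a
    and its pulse by (1 - a); the n bands are then summed sample by sample.
    """
    assert len(band_pulses) == len(band_noises) == len(noise_intensities)
    length = len(band_pulses[0])
    mixed = [0.0] * length
    for pulse, noise, a in zip(band_pulses, band_noises, noise_intensities):
        for t in range(length):
            mixed[t] += (1.0 - a) * pulse[t] + a * noise[t]
    return mixed
```

With intensity 0 a band contributes only its pulse; with intensity 1, only its noise — which is the voiced/unvoiced control the abstract describes.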
  • Patent number: 8108216
    Abstract: In speech synthesis, a selecting unit selects one string from first speech unit strings corresponding to a first segment sequence obtained by dividing a phoneme string corresponding to target speech into segments. The selecting unit repeatedly generates, based on at most W second speech unit strings corresponding to a second segment sequence that is a partial sequence of the first sequence, third speech unit strings corresponding to a third segment sequence obtained by adding a segment to the second sequence, and selects at most W strings from the third strings based on an evaluation value of each of the third strings. The value is obtained by correcting the total cost of each of the third string candidates with a penalty coefficient for each of the third strings. The coefficient is based on a restriction concerning the quickness of speech unit data acquisition, and depends on the extent to which the restriction is approached.
    Type: Grant
    Filed: March 19, 2008
    Date of Patent: January 31, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masahiro Morita, Takehiko Kagoshima
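The repeated extend-and-prune loop described above is essentially a beam search of width W. A minimal sketch, assuming simple cost callables and a path-dependent penalty coefficient; every name here is illustrative, not taken from the patent:

```python
def beam_search(segments, candidates_for, unit_cost, concat_cost, penalty, W):
    """Width-W beam search over speech unit strings (hypothetical sketch)."""
    beams = [([], 0.0)]                      # (speech unit string, total cost)
    for seg in segments:
        extended = []
        for path, cost in beams:
            for unit in candidates_for(seg):
                c = cost + unit_cost(seg, unit)
                if path:
                    c += concat_cost(path[-1], unit)
                extended.append((path + [unit], c))
        # Evaluation value: the total cost corrected by a penalty coefficient,
        # standing in for the data-acquisition restriction in the abstract.
        extended.sort(key=lambda pc: pc[1] * penalty(pc[0]))
        beams = extended[:W]                 # keep at most W strings
    return min(beams, key=lambda pc: pc[1])[0]
```

Pruning to W candidates per segment keeps the search linear in the number of segments instead of exponential.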
  • Publication number: 20110246199
    Abstract: According to one embodiment, a speech synthesizer generates a speech segment sequence and synthesizes speech by connecting speech segments of the generated speech segment sequence. If a speech segment of a synthesized first speech segment sequence is different from the speech segment of a synthesized second speech segment sequence having the same synthesis unit as the first speech segment sequence, the speech synthesizer disables the speech segment of the first speech segment sequence that is different from the speech segment of the second speech segment sequence.
    Type: Application
    Filed: September 14, 2010
    Publication date: October 6, 2011
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Osamu Nishiyama, Takehiko Kagoshima
  • Publication number: 20110238420
    Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text, the speech information including phonologic information and prosody information. The method can divide the speech information into a plurality of speech units based on at least one of the phonologic information and the prosody information. The method can search for at least two speech units, among the plurality of speech units, in which at least one of the phonologic information and the prosody information is identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units as a representative speech unit in a memory.
    Type: Application
    Filed: September 13, 2010
    Publication date: September 29, 2011
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Gou Hirabayashi, Takehiko Kagoshima
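The grouping in this abstract amounts to deduplication keyed on the matching information. A minimal sketch, assuming exact matches and an illustrative (phonology, prosody, waveform) tuple format:

```python
def store_representatives(units):
    """Keep one waveform per group of matching speech units (hypothetical sketch).

    Units whose phonologic and prosody information coincide share a key; the
    first unit seen for a key becomes the group's representative in memory.
    """
    memory = {}
    for phonology, prosody, waveform in units:
        key = (phonology, prosody)
        if key not in memory:            # first matching unit is the representative
            memory[key] = waveform
    return memory
```

Storing one representative per group is what reduces the memory footprint of the edited speech.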
  • Publication number: 20110087488
    Abstract: According to an embodiment, a speech synthesis apparatus includes a selecting unit configured to select speaker's parameters one by one for respective speakers and obtain a plurality of speakers' parameters, the speaker's parameters being prepared for respective pitch waveforms corresponding to speaker's speech sounds, the speaker's parameters including formant frequencies, formant phases, formant powers, and window functions concerning respective formants that are contained in the respective pitch waveforms. The apparatus includes a mapping unit configured to make formants correspond to each other between the plurality of speakers' parameters using a cost function based on the formant frequencies and the formant powers. The apparatus includes a generating unit configured to generate an interpolated speaker's parameter by interpolating, at desired interpolation ratios, the formant frequencies, formant phases, formant powers, and window functions of formants which are made to correspond to each other.
    Type: Application
    Filed: December 16, 2010
    Publication date: April 14, 2011
    Inventors: Ryo Morinaka, Takehiko Kagoshima
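The mapping-then-interpolation described above can be sketched with a greedy nearest match under a frequency/power cost. This is an assumption-laden illustration (the patent's cost function and matching procedure are not specified here): each speaker parameter is a list of (frequency, power) pairs, and the second list is assumed at least as long as the first.

```python
def map_and_interpolate(f1, f2, ratio, w_freq=1.0, w_pow=1.0):
    """Pair formants of two speakers by a frequency/power cost, then
    linearly interpolate each pair at the given ratio (hypothetical sketch)."""
    pairs = []
    used = set()
    for freq_a, pow_a in f1:
        # Greedy nearest match: pick the cheapest unused formant of speaker 2.
        best_j, best_cost = None, None
        for j, (freq_b, pow_b) in enumerate(f2):
            if j in used:
                continue
            cost = w_freq * abs(freq_a - freq_b) + w_pow * abs(pow_a - pow_b)
            if best_cost is None or cost < best_cost:
                best_j, best_cost = j, cost
        used.add(best_j)
        freq_b, pow_b = f2[best_j]
        pairs.append(((1 - ratio) * freq_a + ratio * freq_b,
                      (1 - ratio) * pow_a + ratio * pow_b))
    return pairs
```

At ratio 0.5 the interpolated speaker sits midway between the two mapped formant sets.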
  • Patent number: 7856357
    Abstract: A speech synthesis system stores a group of speech units in a memory; selects a plurality of speech units from the group based on prosodic information of target speech, the selected speech units corresponding to each of the segments obtained by segmenting a phoneme string of the target speech and minimizing the distortion of the synthetic speech relative to the target speech; generates a new speech unit for each segment by fusing the selected speech units, obtaining a plurality of new speech units corresponding to the respective segments; and generates synthetic speech by concatenating the new speech units.
    Type: Grant
    Filed: August 18, 2008
    Date of Patent: December 21, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Tatsuya Mizutani, Takehiko Kagoshima
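The fuse-then-concatenate pipeline above can be sketched with the simplest plausible fusion, sample-wise averaging of equal-length waveforms. The patent does not specify this averaging; it is an assumption for illustration.

```python
def fuse_units(units):
    """Fuse equal-length unit waveforms by sample-wise averaging (assumed fusion)."""
    n = len(units)
    return [sum(samples) / n for samples in zip(*units)]

def synthesize(segments_units):
    """Fuse the selected units of each segment, then concatenate the results."""
    speech = []
    for units in segments_units:
        speech.extend(fuse_units(units))
    return speech
```

Fusing several candidate units per segment smooths over idiosyncrasies of any single stored unit before concatenation.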
  • Publication number: 20100211392
    Abstract: The speech synthesizing device acquires numerical data at regular time intervals, each piece of the numerical data representing a value having a plurality of digits, detects a change between two values represented by the numerical data that is acquired at two consecutive times, determines which digit of the value represented by the numerical data is used to generate speech data depending on the detected change, generates numerical information that indicates the determined digit of the value represented by the numerical data, and generates speech data from the digit indicated by the numerical information.
    Type: Application
    Filed: September 21, 2009
    Publication date: August 19, 2010
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ryutaro Tokuda, Takehiko Kagoshima
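The digit-selection logic above can be sketched directly: compare two consecutive readings and speak only from the highest digit that changed. The fixed-width zero padding and the function name are assumptions for illustration.

```python
def digits_to_speak(prev, curr, width=5):
    """Return the digits of curr to be spoken, from the highest changed digit
    down (hypothetical sketch; values are zero-padded to a fixed width)."""
    a, b = str(prev).zfill(width), str(curr).zfill(width)
    for i in range(width):
        if a[i] != b[i]:
            return b[i:]          # speak the changed digit and those below it
    return ""                     # no change: nothing new to speak
```

For rapidly updating values (e.g. a stock price), this keeps the spoken output short by skipping unchanged leading digits.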
  • Publication number: 20100049522
    Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech and stores, in a voice conversion rule memory, a voice conversion rule for converting the voice quality of source speech into the voice quality of the target speech. The apparatus extracts, from input source speech, a source speech spectral parameter; converts the extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule; selects a target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory; generates an aperiodic component spectral parameter from the selected target speech spectral parameter; mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter to obtain a second conversion spectral parameter; and generates a speech waveform from the second conversion spectral parameter.
    Type: Application
    Filed: July 20, 2009
    Publication date: February 25, 2010
    Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
  • Patent number: 7668717
    Abstract: A speech synthesis system stores a group of speech units in a memory; selects a plurality of speech units from the group based on prosodic information of target speech, the selected speech units corresponding to each of the segments obtained by segmenting a phoneme string of the target speech and minimizing the distortion of the synthetic speech relative to the target speech; generates a new speech unit for each segment by fusing the selected speech units, obtaining a plurality of new speech units corresponding to the respective segments; and generates synthetic speech by concatenating the new speech units.
    Type: Grant
    Filed: November 26, 2004
    Date of Patent: February 23, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Tatsuya Mizutani, Takehiko Kagoshima
  • Publication number: 20090326951
    Abstract: Ratios between the powers at the peaks of the respective formants of the spectrum of a pitch-cycle waveform and the powers at the boundaries between the formants are obtained; when these ratios are large, the bandwidths of the window functions are widened. Formant waveforms are generated by multiplying sinusoidal waveforms, generated from the formant parameter sets in the pitch-cycle waveform generating data, by the window functions with the widened bandwidths, and the pitch-cycle waveform is generated as the sum of these formant waveforms.
    Type: Application
    Filed: April 14, 2009
    Publication date: December 31, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ryo Morinaka, Takehiko Kagoshima
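The sum-of-windowed-sinusoids construction above can be sketched as follows. A Hanning window stands in for the stored window function, and the (frequency, amplitude) tuples are illustrative; neither is taken from the patent.

```python
import math

def pitch_cycle_waveform(formants, length, sample_rate=16000.0):
    """One pitch-cycle waveform as the sum of windowed sinusoids, one per
    formant (hypothetical sketch; formants are (frequency_hz, amplitude))."""
    wave = [0.0] * length
    for freq, amp in formants:
        for t in range(length):
            # Hanning window as a stand-in for the per-formant window function.
            window = 0.5 - 0.5 * math.cos(2 * math.pi * t / (length - 1))
            wave[t] += amp * window * math.sin(2 * math.pi * freq * t / sample_rate)
    return wave
```

Because the window tapers to zero at both ends, the pitch-cycle waveforms can be overlap-added at successive pitch marks without discontinuities.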
  • Patent number: 7630896
    Abstract: A speech synthesis system in a preferred embodiment includes a speech unit storage section, a phonetic environment storage section, a phonetic sequence/prosodic information input section, a plural-speech-unit selection section, a fused-speech-unit sequence generation section, and a fused-speech-unit modification/concatenation section. In the fused-speech-unit sequence generation section, a fused speech unit is generated by fusing a plurality of selected speech units: the average power information is calculated for the M selected speech units, N of the speech units are fused together, and the power information of the fused speech unit is corrected to equal the average power information of the M speech units.
    Type: Grant
    Filed: September 23, 2005
    Date of Patent: December 8, 2009
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Gou Hirabayashi, Takehiko Kagoshima
  • Publication number: 20090216537
    Abstract: A speech synthesis apparatus includes a text obtaining device that obtains text data for speech synthesis from the outside, a language processor that carries out morphological analysis/parsing on the text data, a prosodic processor that outputs to a speech synthesizer a synthesis unit string based on the prosodic and language-related attributes of the text data such as accents and word classes, the speech synthesizer that generates synthesized speech from the synthesis unit string, and a speech waveform output device that reproduces the synthesized speech either after a prescribed amount has been accumulated or sequentially as it is output.
    Type: Application
    Filed: October 19, 2006
    Publication date: August 27, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Osamu Nishiyama, Masahiro Morita, Takehiko Kagoshima
  • Patent number: 7580839
    Abstract: A speech processing apparatus according to an embodiment of the invention includes a conversion-source-speaker speech-unit database; a voice-conversion-rule-learning-data generating means; and a voice-conversion-rule learning means, with which it makes voice conversion rules. The voice-conversion-rule-learning-data generating means includes a conversion-target-speaker speech-unit extracting means; an attribute-information generating means; a conversion-source-speaker speech-unit database; and a conversion-source-speaker speech-unit selection means.
    Type: Grant
    Filed: September 19, 2006
    Date of Patent: August 25, 2009
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Takehiko Kagoshima
  • Publication number: 20090177474
    Abstract: A speech synthesizer includes a periodic component fusing unit and an aperiodic component fusing unit, which respectively fuse the periodic and aperiodic components of the plurality of speech units selected by a unit selector for each segment. The speech synthesizer is further provided with an adder that adds, edits, and concatenates the periodic and aperiodic components of the fused speech units to generate a speech waveform.
    Type: Application
    Filed: September 18, 2008
    Publication date: July 9, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masahiro Morita, Takehiko Kagoshima
  • Publication number: 20090150157
    Abstract: A word dictionary including sets of a character string constituting a word, a phoneme sequence constituting the word's pronunciation, and the word's part of speech is referenced; an entered text is analyzed and divided into one or more subtexts, and a phoneme sequence and a part-of-speech sequence are generated for each subtext. The part-of-speech sequence of each subtext is collated against a list of part-of-speech sequences to determine whether the phonetic sounds of the subtext are to be converted, and the phonetic sounds of the phoneme sequence are converted in each subtext so determined.
    Type: Application
    Filed: September 15, 2008
    Publication date: June 11, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Takehiko Kagoshima, Noriko Yamanaka, Makoto Yajima
  • Publication number: 20090144053
    Abstract: An information extraction unit extracts L-dimensional spectral envelope information from each frame of speech data; the spectral envelope information does not have a spectral fine structure. A basis storage unit stores N bases (L > N > 1). Each basis covers a different frequency band of the L-dimensional spectral domain and has its maximum at that band's peak frequency; its value is zero for frequencies outside the band along the frequency axis of the spectral domain. Two frequency bands whose peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes the distortion between the spectral envelope information and a linear combination of the bases by varying the coefficients, and sets the coefficients that minimize the distortion as the spectral envelope parameter of the spectral envelope information.
    Type: Application
    Filed: December 3, 2008
    Publication date: June 4, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune Tamura, Katsumi Tsuchiya, Takehiko Kagoshima
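The distortion minimization above is a least-squares fit of coefficients over a fixed set of band-limited bases. A minimal sketch using coordinate descent; the solver choice is an assumption, not the patented method, and the bases here are plain lists of spectral-bin values.

```python
def fit_band_coefficients(envelope, bases, iters=200):
    """Find coefficients c so that sum_i c[i] * bases[i] approximates the
    spectral envelope in the least-squares sense (hypothetical sketch using
    coordinate descent; bases may overlap as in the abstract)."""
    n = len(bases)
    c = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            # Optimal c[i] with the other coefficients held fixed.
            residual = [e - sum(c[j] * bases[j][k] for j in range(n) if j != i)
                        for k, e in enumerate(envelope)]
            num = sum(r * b for r, b in zip(residual, bases[i]))
            den = sum(b * b for b in bases[i])
            if den > 0:
                c[i] = num / den
    return c
```

Because neighboring bands overlap, each coefficient update depends on its neighbors, which is why the fit iterates rather than solving each band independently.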
  • Publication number: 20090112580
    Abstract: The speech processing apparatus is configured to: split a first speech waveform and a second speech waveform into a plurality of frequency bands to generate, for each frequency band, a first band speech waveform and a second band speech waveform; determine, for each frequency band, an overlap-add position between the first and second band speech waveforms so that a high cross-correlation between them is obtained; and overlap-add the first and second band speech waveforms in each frequency band on the basis of that position, integrating the overlap-added band speech waveforms over all the frequency bands to generate a concatenated speech waveform.
    Type: Application
    Filed: July 21, 2008
    Publication date: April 30, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Gou Hirabayashi, Dawei Xu, Takehiko Kagoshima
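The per-band alignment step above can be sketched for a single band: slide the second waveform, pick the shift with the highest correlation over the overlap region, then crossfade. The linear crossfade and shift search range are assumptions for illustration.

```python
def overlap_add(x, y, overlap, max_shift):
    """Concatenate two band waveforms (hypothetical sketch): choose the shift
    of y (0..max_shift) maximizing cross-correlation with the tail of x over
    the overlap region, then crossfade x's tail into y's head at that shift."""
    def corr(s):
        return sum(x[-overlap + k] * y[s + k] for k in range(overlap))
    s = max(range(max_shift + 1), key=corr)
    out = x[:-overlap]
    for k in range(overlap):
        w = k / overlap                      # linear crossfade weight
        out.append((1 - w) * x[-overlap + k] + w * y[s + k])
    out.extend(y[s + overlap:])
    return out
```

Doing this per frequency band, as the abstract describes, lets each band align on its own dominant periodicity before the bands are summed back together.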
  • Publication number: 20090055188
    Abstract: The prosody control unit pattern generation module generates pitch patterns for the respective prosody control units based on language attribute information, phoneme durations, and emphasis degree information. The modification method decision module decides, based on at least the emphasis degree information, a smoothing-based modification method for the pitch pattern at the connection between a prosody control unit and at least one of the previous and next prosody control units, and generates modification method information. The pattern connection module modifies the pitch patterns generated for the respective prosody control units by smoothing according to the modification method information, and connects them to generate a sentence pitch pattern corresponding to the text targeted for speech synthesis.
    Type: Application
    Filed: February 22, 2008
    Publication date: February 26, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Gou Hirabayashi, Takehiko Kagoshima
  • Publication number: 20090055158
    Abstract: A speech translation apparatus includes a speech recognition unit configured to recognize input speech of a first language to generate a first text of the first language, an extraction unit configured to compare original prosody information of the input speech with first synthesized prosody information based on the first text to extract paralinguistic information about each of first words of the first text, a machine translation unit configured to translate the first text to a second text of a second language, a mapping unit configured to allocate the paralinguistic information about each of the first words to each of second words of the second text in accordance with synonymity, a generating unit configured to generate second synthesized prosody information based on the paralinguistic information allocated to each of the second words, and a speech synthesis unit configured to synthesize output speech based on the second synthesized prosody information.
    Type: Application
    Filed: August 21, 2008
    Publication date: February 26, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Dawei Xu, Takehiko Kagoshima
  • Publication number: 20090048844
    Abstract: A phoneme sequence corresponding to a target speech is divided into a plurality of segments. A plurality of speech units for each segment is selected from a speech unit memory that stores speech units having at least one frame. The plurality of speech units has a prosodic feature accordant or similar to the target speech. A formant parameter having at least one formant frequency is generated for each frame of the plurality of speech units. A fused formant parameter of each frame is generated from formant parameters of each frame of the plurality of speech units. A fused speech unit of each segment is generated from the fused formant parameter of each frame. A synthesized speech is generated by concatenating the fused speech unit of each segment.
    Type: Application
    Filed: August 14, 2008
    Publication date: February 19, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ryo Morinaka, Masatsune Tamura, Takehiko Kagoshima