Patents by Inventor Takehiko Kagoshima

Takehiko Kagoshima has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9002711
    Abstract: According to an embodiment, a speech synthesis apparatus includes a selecting unit configured to select speaker's parameters one by one for respective speakers and obtain a plurality of speakers' parameters, the speaker's parameters being prepared for respective pitch waveforms corresponding to speaker's speech sounds, the speaker's parameters including formant frequencies, formant phases, formant powers, and window functions concerning respective formants that are contained in the respective pitch waveforms. The apparatus includes a mapping unit configured to make formants correspond to each other between the plurality of speakers' parameters using a cost function based on the formant frequencies and the formant powers. The apparatus includes a generating unit configured to generate an interpolated speaker's parameter by interpolating, at desired interpolation ratios, the formant frequencies, formant phases, formant powers, and window functions of formants which are made to correspond to each other.
    Type: Grant
    Filed: December 16, 2010
    Date of Patent: April 7, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Ryo Morinaka, Takehiko Kagoshima
  • Patent number: 8868422
    Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text. The speech information includes phonologic information and prosody information. The method can divide the speech information into a plurality of speech units, based on at least one of the phonologic information and the prosody information. The method can search at least two speech units from the plurality of speech units. At least one of the phonologic information and the prosody information in the at least two speech units are identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units as a representative speech unit into a memory.
    Type: Grant
    Filed: September 13, 2010
    Date of Patent: October 21, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Gou Hirabayashi, Takehiko Kagoshima
  • Publication number: 20140180681
    Abstract: A waveform memory that stores a plurality of speech unit waveforms corresponding to respective speech units, wherein an address order of the speech unit waveforms is determined by a sort order of speech units included in a speech unit sequence corresponding to a phoneme sequence of training data, and the speech units included in the speech unit sequence are selected so as to synthesize a speech of the phone sequence.
    Type: Application
    Filed: February 26, 2014
    Publication date: June 26, 2014
    Applicant: Kabushiki Kaisha Toshiba
    Inventor: Takehiko KAGOSHIMA
  • Patent number: 8731933
    Abstract: A speech synthesizing apparatus includes a selector configured to select a plurality of speech units for synthesizing a speech of a phoneme sequence by referring to speech unit information stored in an information memory. Speech unit waveforms corresponding to the speech units are acquired from a plurality of speech unit waveforms stored in a waveform memory, and the speech is synthesized by utilizing the speech unit waveforms acquired. When acquiring the speech unit waveforms, at least two speech unit waveforms from a continuous region of the waveform memory are copied onto a buffer by one access, wherein a data quantity of the at least two speech unit waveforms is less than or equal to a size of the buffer.
    Type: Grant
    Filed: April 10, 2013
    Date of Patent: May 20, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Takehiko Kagoshima
  • Publication number: 20140052446
    Abstract: According to one embodiment, a prosody editing apparatus includes a storage, a first selection unit, a search unit, a normalization unit, a mapping unit, a display, a second selection unit, a restoring unit and a replacing unit. The search unit searches the storage for one or more second prosodic patterns corresponding to attribute information that matches attribute information of the selected phrase. The mapping maps each of the normalized second prosodic patterns on a low-dimensional space. The restoring unit restores a restored prosodic pattern according to the selected coordinates. The replacing unit replaces prosody of synthetic speech generated based on the selected phrase by the restored prosodic pattern.
    Type: Application
    Filed: August 15, 2013
    Publication date: February 20, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kouichirou MORI, Takehiko KAGOSHIMA, Masahiro MORITA
  • Publication number: 20140052447
    Abstract: According to one embodiment, a speech synthesis apparatus is provided with generation, normalization, interpolation and synthesis units. The generation unit generates a first parameter using a prosodic control dictionary of a target speaker and one or more second parameters using a prosodic control dictionary of one or more standard speakers based on language information for an input text. The normalization unit normalizes the one or more second parameters based a normalization parameter. The interpolation unit interpolates the first parameter and the one or more normalized second parameters based on weight information to generate a third parameter and the synthesis unit generates synthesized speech using the third parameter.
    Type: Application
    Filed: August 16, 2013
    Publication date: February 20, 2014
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kentaro TACHIBANA, Takehiko KAGOSHIMA, Masahiro MORITA
  • Patent number: 8655664
    Abstract: According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.
    Type: Grant
    Filed: August 11, 2011
    Date of Patent: February 18, 2014
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Kentaro Tachibana, Gou Hirabayashi, Takehiko Kagoshima
  • Patent number: 8554565
    Abstract: According to one embodiment, a speech synthesizer generates a speech segment sequence and synthesizes speech by connecting speech segments of the generated speech segment sequence. If a speech segment of a synthesized first speech segment sequence is different from the speech segment of a synthesized second speech segment sequence having the same synthesis unit as the first speech segment sequence, the speech synthesizer disables the speech segment of the first speech segment sequence that is different from the speech segment of the second speech segment sequence.
    Type: Grant
    Filed: September 14, 2010
    Date of Patent: October 8, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Osamu Nishiyama, Takehiko Kagoshima
  • Patent number: 8468020
    Abstract: An apparatus for synthesizing a speech including a waveform memory that stores a plurality of speech unit waveforms, an information memory that correspondingly stores speech unit information and an address of each of the speech unit waveforms, a selector that selects a speech unit sequence corresponding to the input phoneme sequence by referring to the speech unit information, a speech unit waveform acquisition unit that acquires a speech unit waveform corresponding to each speech unit of the speech unit sequence from the waveform memory by referring to the address, a speech unit concatenation unit that generates the speech by concatenating the speech unit waveform acquired.
    Type: Grant
    Filed: May 8, 2007
    Date of Patent: June 18, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Takehiko Kagoshima
  • Patent number: 8438014
    Abstract: According to one embodiment, in a speech processing device, an extractor windows a part of the speech signal and extracts a partial waveform. A calculator performs frequency analysis of the partial waveform to calculate a frequency spectrum. An estimator generates an artificial waveform that is a waveform according to an interval between the pitch marks for each harmonic component having a frequency that is a predetermined multiple of a fundamental frequency of the speech signal and estimates harmonic spectral features representing characteristics of the frequency spectrum of the harmonic component from each of the artificial waveforms. A separator separates the partial waveform into a periodic component produced from periodic vocal-fold vibration as an acoustic source and an aperiodic component produced from aperiodic acoustic sources other than the vocal-fold vibration by using the respective harmonic spectral features and the frequency spectrum of the partial waveform.
    Type: Grant
    Filed: January 26, 2012
    Date of Patent: May 7, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masahiro Morita, Javier Latorre, Takehiko Kagoshima
  • Patent number: 8438033
    Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech, stores, in a voice conversion rule memory, a voice conversion rule for converting voice quality of source speech into voice quality of the target speech, extracts, from an input source speech, a source speech spectral parameter of the input source speech, converts extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule, selects target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory, generates an aperiodic component spectral parameter representing from selected target speech spectral parameter, mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter, to obtain a second conversion spectral parameter, and generates a speech waveform from the second conversion spectral parameter.
    Type: Grant
    Filed: July 20, 2009
    Date of Patent: May 7, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
  • Publication number: 20130080155
    Abstract: Apparatus for creating a dictionary for speech synthesis includes a sentence storage unit configured to store N sentences, a sentence display unit configured to selectively display a first sentence which is one of the N sentences, a recording unit configured to record each user speech, a necessity determination unit configured to make a determination of whether to create the dictionary, a dictionary creation unit configured to create the dictionary by utilizing the user speech, and a speech synthesis unit configured to convert a second sentence to a synthesized speech with the dictionary. The determination unit makes the determination under a condition that the recording unit records the user speech of M first sentences (M is less than N) and the determination is based on at least one of an instruction from the user, M and an amount of the recorded user speech.
    Type: Application
    Filed: June 28, 2012
    Publication date: March 28, 2013
    Inventors: Kentaro Tachibana, Masahiro Morita, Takehiko Kagoshima
  • Patent number: 8321208
    Abstract: An information extraction unit extracts spectral envelope information of L-dimension from each frame of speech data by discrete Fourier transform. The spectral envelope information is represented by L points. A basis storage unit stores N bases (L>N>1). Each basis is differently a frequency band having a maximum as a peak frequency in a spectral domain having L-dimension. A value corresponding to a frequency outside the frequency band along a frequency axis of the spectral domain is zero. Two frequency bands of which two peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes a distortion between the spectral envelope information and a linear combination of each basis with a coefficient for each of L points of the spectral envelope information by changing the coefficient, and sets the coefficient of each basis from which the distortion is minimized to a spectral envelope parameter of the spectral envelope information.
    Type: Grant
    Filed: December 3, 2008
    Date of Patent: November 27, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masatsune Tamura, Katsumi Tsuchiya, Takehiko Kagoshima
  • Publication number: 20120239390
    Abstract: According to one embodiment, an apparatus for supporting reading of a document includes a model storage unit, a document acquisition unit, a feature information extraction, and an utterance style estimation unit. The model storage unit is configured to store a model which has trained a correspondence relationship between first feature information and an utterance style. The first feature information is extracted from a plurality of sentences in a training document. The document acquisition unit is configured to acquire a document to be read. The feature information extraction unit is configured to extract second feature information from each sentence in the document to be read. The utterance style estimation unit is configured to compare the second feature information of a plurality of sentences in the document to be read with the model, and to estimate an utterance style of the each sentence of the document to be read.
    Type: Application
    Filed: September 14, 2011
    Publication date: September 20, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kosei Fume, Masaru Suzuki, Masahiro Morita, Kentaro Tachibana, Kouichirou Mori, Yuji Shimizu, Takehiko Kagoshima, Masatsune Tamura, Tomohiro Yamasaki
  • Publication number: 20120185244
    Abstract: According to one embodiment, in a speech processing device, an extractor windows a part of the speech signal and extracts a partial waveform. A calculator performs frequency analysis of the partial waveform to calculate a frequency spectrum. An estimator generates an artificial waveform that is a waveform according to an interval between the pitch marks for each harmonic component having a frequency that is a predetermined multiple of a fundamental frequency of the speech signal and estimates harmonic spectral features representing characteristics of the frequency spectrum of the harmonic component from each of the artificial waveforms. A separator separates the partial waveform into a periodic component produced from periodic vocal-fold vibration as an acoustic source and an aperiodic component produced from aperiodic acoustic sources other than the vocal-fold vibration by using the respective harmonic spectral features and the frequency spectrum of the partial waveform.
    Type: Application
    Filed: January 26, 2012
    Publication date: July 19, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masahiro Morita, Javier Latorre, Takehiko Kagoshima
  • Patent number: 8224646
    Abstract: The speech synthesizing device acquires numerical data at regular time intervals, each piece of the numerical data representing a value having a plurality of digits, detects a change between two values represented by the numerical data that is acquired at two consecutive times, determines which digit of the value represented by the numerical data is used to generate speech data depending on the detected change, generates numerical information that indicates the determined digit of the value represented by the numerical data, and generates speech data from the digit indicated by the numerical information.
    Type: Grant
    Filed: September 21, 2009
    Date of Patent: July 17, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Ryutaro Tokuda, Takehiko Kagoshima
  • Patent number: 8195464
    Abstract: A speech synthesizer includes a periodic component fusing unit and an aperiodic component fusing unit, and fuses periodic components and aperiodic components of a plurality of speech units for each segment, which are selected by a unit selector, by a periodic component fusing unit and an aperiodic component fusing unit, respectively. The speech synthesizer is further provided with an adder, so that the adder adds, edits, and concatenates the periodic components and the aperiodic components of the fused speech units to generate a speech waveform.
    Type: Grant
    Filed: September 18, 2008
    Date of Patent: June 5, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masahiro Morita, Takehiko Kagoshima
  • Patent number: 8175881
    Abstract: A phoneme sequence corresponding to a target speech is divided into a plurality of segments. A plurality of speech units for each segment is selected from a speech unit memory that stores speech units having at least one frame. The plurality of speech units has a prosodic feature accordant or similar to the target speech. A formant parameter having at least one formant frequency is generated for each frame of the plurality of speech units. A fused formant parameter of each frame is generated from formant parameters of each frame of the plurality of speech units. A fused speech unit of each segment is generated from the fused formant parameter of each frame. A synthesized speech is generated by concatenating the fused speech unit of each segment.
    Type: Grant
    Filed: August 14, 2008
    Date of Patent: May 8, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Ryo Morinaka, Masatsune Tamura, Takehiko Kagoshima
  • Patent number: 8170876
    Abstract: A word dictionary including sets of a character string which constitutes a word, a phoneme sequence which constitutes pronunciation of the word and a part of speech of the word is referenced, an entered text is analyzed, the entered text is divided into one or more subtexts, a phoneme sequence and a part of speech sequence are generated for each subtext, the part of speech sequence of the subtext and a list of part of speech sequence are collated to determine whether the phonetic sound of the subtext is to be converted or not, and the phonetic sounds of the phoneme sequence in the subtext whose phonetic sounds are determined to be converted are converted.
    Type: Grant
    Filed: September 15, 2008
    Date of Patent: May 1, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takehiko Kagoshima, Noriko Yamanaka, Makoto Yajima
  • Publication number: 20120065981
    Abstract: According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.
    Type: Application
    Filed: August 11, 2011
    Publication date: March 15, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kentaro Tachibana, Gou Hirabayashi, Takehiko Kagoshima