Patents by Inventor Takehiko Kagoshima
Takehiko Kagoshima has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9002711
Abstract: According to an embodiment, a speech synthesis apparatus includes a selecting unit configured to select speaker's parameters one by one for respective speakers and obtain a plurality of speakers' parameters, the speaker's parameters being prepared for respective pitch waveforms corresponding to speaker's speech sounds, the speaker's parameters including formant frequencies, formant phases, formant powers, and window functions concerning respective formants that are contained in the respective pitch waveforms. The apparatus includes a mapping unit configured to make formants correspond to each other between the plurality of speakers' parameters using a cost function based on the formant frequencies and the formant powers. The apparatus includes a generating unit configured to generate an interpolated speaker's parameter by interpolating, at desired interpolation ratios, the formant frequencies, formant phases, formant powers, and window functions of formants which are made to correspond to each other.
Type: Grant
Filed: December 16, 2010
Date of Patent: April 7, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Ryo Morinaka, Takehiko Kagoshima
-
Patent number: 8868422
Abstract: According to one embodiment, a method for editing speech is disclosed. The method can generate speech information from a text. The speech information includes phonologic information and prosody information. The method can divide the speech information into a plurality of speech units, based on at least one of the phonologic information and the prosody information. The method can search at least two speech units from the plurality of speech units. At least one of the phonologic information and the prosody information in the at least two speech units are identical or similar. In addition, the method can store a speech unit waveform corresponding to one of the at least two speech units as a representative speech unit into a memory.
Type: Grant
Filed: September 13, 2010
Date of Patent: October 21, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventors: Gou Hirabayashi, Takehiko Kagoshima
-
Publication number: 20140180681
Abstract: A waveform memory that stores a plurality of speech unit waveforms corresponding to respective speech units, wherein an address order of the speech unit waveforms is determined by a sort order of speech units included in a speech unit sequence corresponding to a phoneme sequence of training data, and the speech units included in the speech unit sequence are selected so as to synthesize a speech of the phone sequence.
Type: Application
Filed: February 26, 2014
Publication date: June 26, 2014
Applicant: Kabushiki Kaisha Toshiba
Inventor: Takehiko KAGOSHIMA
-
Patent number: 8731933
Abstract: A speech synthesizing apparatus includes a selector configured to select a plurality of speech units for synthesizing a speech of a phoneme sequence by referring to speech unit information stored in an information memory. Speech unit waveforms corresponding to the speech units are acquired from a plurality of speech unit waveforms stored in a waveform memory, and the speech is synthesized by utilizing the speech unit waveforms acquired. When acquiring the speech unit waveforms, at least two speech unit waveforms from a continuous region of the waveform memory are copied onto a buffer by one access, wherein a data quantity of the at least two speech unit waveforms is less than or equal to a size of the buffer.
Type: Grant
Filed: April 10, 2013
Date of Patent: May 20, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventor: Takehiko Kagoshima
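The buffered access described in the abstract of patent 8731933 can be illustrated with a minimal sketch. It is not the patented implementation: the function name `plan_reads`, the `(address, size)` tuple layout, and the greedy grouping rule are all assumptions made for illustration. The sketch groups selected speech unit waveforms that fall within one continuous region no larger than the buffer, so that each group can be fetched by a single access.

```python
from typing import List, Tuple

# Hypothetical representation: each selected speech unit waveform is an
# (address, size) pair locating it in the waveform memory.
Unit = Tuple[int, int]
# One planned access: (start address, length, indices of the units covered).
Read = Tuple[int, int, List[int]]


def plan_reads(units: List[Unit], buffer_size: int) -> List[Read]:
    """Greedily group units into continuous regions that fit the buffer.

    Assumption: units are visited in address order, and a unit joins the
    current region only if the whole region still fits in the buffer.
    """
    order = sorted(range(len(units)), key=lambda i: units[i][0])
    reads: List[Read] = []
    for i in order:
        addr, size = units[i]
        if size > buffer_size:
            raise ValueError("a single unit is larger than the buffer")
        if reads:
            start, length, members = reads[-1]
            new_end = max(start + length, addr + size)
            if new_end - start <= buffer_size:
                # Extend the current region; one access covers this unit too.
                reads[-1] = (start, new_end - start, members + [i])
                continue
        # Otherwise start a new region with its own access.
        reads.append((addr, size, [i]))
    return reads


example_units = [(0, 100), (100, 50), (400, 80), (150, 60)]
example_plan = plan_reads(example_units, buffer_size=256)
```

With these example values, three of the four units lie in one region of 210 bytes and are covered by one access, while the unit at address 400 needs a second access.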
-
Publication number: 20140052446
Abstract: According to one embodiment, a prosody editing apparatus includes a storage, a first selection unit, a search unit, a normalization unit, a mapping unit, a display, a second selection unit, a restoring unit and a replacing unit. The search unit searches the storage for one or more second prosodic patterns corresponding to attribute information that matches attribute information of the selected phrase. The mapping unit maps each of the normalized second prosodic patterns on a low-dimensional space. The restoring unit restores a restored prosodic pattern according to the selected coordinates. The replacing unit replaces prosody of synthetic speech generated based on the selected phrase by the restored prosodic pattern.
Type: Application
Filed: August 15, 2013
Publication date: February 20, 2014
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kouichirou MORI, Takehiko KAGOSHIMA, Masahiro MORITA
-
Publication number: 20140052447
Abstract: According to one embodiment, a speech synthesis apparatus is provided with generation, normalization, interpolation and synthesis units. The generation unit generates a first parameter using a prosodic control dictionary of a target speaker and one or more second parameters using a prosodic control dictionary of one or more standard speakers based on language information for an input text. The normalization unit normalizes the one or more second parameters based on a normalization parameter. The interpolation unit interpolates the first parameter and the one or more normalized second parameters based on weight information to generate a third parameter, and the synthesis unit generates synthesized speech using the third parameter.
Type: Application
Filed: August 16, 2013
Publication date: February 20, 2014
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kentaro TACHIBANA, Takehiko KAGOSHIMA, Masahiro MORITA
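The normalize-then-interpolate flow in the abstract of publication 20140052447 can be sketched as follows. The abstract does not say what the parameters or the normalization parameter are, so this sketch assumes that each parameter is a pitch contour (a list of values per frame) and that normalization shifts a standard speaker's contour so its mean matches the target speaker's mean. The function names are hypothetical.

```python
from statistics import mean
from typing import List


def normalize_to_target(second: List[float], target: List[float]) -> List[float]:
    """Shift a standard speaker's contour to the target speaker's mean.

    Assumption: the normalization parameter is a simple mean offset.
    """
    offset = mean(target) - mean(second)
    return [v + offset for v in second]


def interpolate(first: List[float],
                normalized_seconds: List[List[float]],
                weights: List[float]) -> List[float]:
    """Weighted sum of the first contour and the normalized second contours.

    weights[0] applies to the first parameter, the rest to the second
    parameters in order; the weights are assumed to sum to 1.
    """
    if len(weights) != 1 + len(normalized_seconds):
        raise ValueError("one weight is needed per contour")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    contours = [first] + normalized_seconds
    return [sum(w * c[t] for w, c in zip(weights, contours))
            for t in range(len(first))]


first_param = [100.0, 110.0]        # target speaker (illustrative values)
second_param = [200.0, 220.0]       # standard speaker (illustrative values)
normalized = normalize_to_target(second_param, first_param)
third_param = interpolate(first_param, [normalized], [0.5, 0.5])
```

Normalizing before interpolating keeps the blended contour in the target speaker's range while still borrowing the standard speaker's larger pitch movement.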
-
Patent number: 8655664
Abstract: According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.
Type: Grant
Filed: August 11, 2011
Date of Patent: February 18, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventors: Kentaro Tachibana, Gou Hirabayashi, Takehiko Kagoshima
-
Patent number: 8554565
Abstract: According to one embodiment, a speech synthesizer generates a speech segment sequence and synthesizes speech by connecting speech segments of the generated speech segment sequence. If a speech segment of a synthesized first speech segment sequence is different from the speech segment of a synthesized second speech segment sequence having the same synthesis unit as the first speech segment sequence, the speech synthesizer disables the speech segment of the first speech segment sequence that is different from the speech segment of the second speech segment sequence.
Type: Grant
Filed: September 14, 2010
Date of Patent: October 8, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Osamu Nishiyama, Takehiko Kagoshima
-
Patent number: 8468020
Abstract: An apparatus for synthesizing a speech including a waveform memory that stores a plurality of speech unit waveforms, an information memory that correspondingly stores speech unit information and an address of each of the speech unit waveforms, a selector that selects a speech unit sequence corresponding to the input phoneme sequence by referring to the speech unit information, a speech unit waveform acquisition unit that acquires a speech unit waveform corresponding to each speech unit of the speech unit sequence from the waveform memory by referring to the address, a speech unit concatenation unit that generates the speech by concatenating the speech unit waveform acquired.
Type: Grant
Filed: May 8, 2007
Date of Patent: June 18, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventor: Takehiko Kagoshima
-
Patent number: 8438014
Abstract: According to one embodiment, in a speech processing device, an extractor windows a part of the speech signal and extracts a partial waveform. A calculator performs frequency analysis of the partial waveform to calculate a frequency spectrum. An estimator generates an artificial waveform that is a waveform according to an interval between the pitch marks for each harmonic component having a frequency that is a predetermined multiple of a fundamental frequency of the speech signal and estimates harmonic spectral features representing characteristics of the frequency spectrum of the harmonic component from each of the artificial waveforms. A separator separates the partial waveform into a periodic component produced from periodic vocal-fold vibration as an acoustic source and an aperiodic component produced from aperiodic acoustic sources other than the vocal-fold vibration by using the respective harmonic spectral features and the frequency spectrum of the partial waveform.
Type: Grant
Filed: January 26, 2012
Date of Patent: May 7, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masahiro Morita, Javier Latorre, Takehiko Kagoshima
-
Patent number: 8438033
Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech, stores, in a voice conversion rule memory, a voice conversion rule for converting voice quality of source speech into voice quality of the target speech, extracts, from an input source speech, a source speech spectral parameter of the input source speech, converts extracted source speech spectral parameter into a first conversion spectral parameter by using the voice conversion rule, selects target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory, generates an aperiodic component spectral parameter representing from selected target speech spectral parameter, mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter, to obtain a second conversion spectral parameter, and generates a speech waveform from the second conversion spectral parameter.
Type: Grant
Filed: July 20, 2009
Date of Patent: May 7, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
-
Publication number: 20130080155
Abstract: Apparatus for creating a dictionary for speech synthesis includes a sentence storage unit configured to store N sentences, a sentence display unit configured to selectively display a first sentence which is one of the N sentences, a recording unit configured to record each user speech, a necessity determination unit configured to make a determination of whether to create the dictionary, a dictionary creation unit configured to create the dictionary by utilizing the user speech, and a speech synthesis unit configured to convert a second sentence to a synthesized speech with the dictionary. The determination unit makes the determination under a condition that the recording unit records the user speech of M first sentences (M is less than N) and the determination is based on at least one of an instruction from the user, M and an amount of the recorded user speech.
Type: Application
Filed: June 28, 2012
Publication date: March 28, 2013
Inventors: Kentaro Tachibana, Masahiro Morita, Takehiko Kagoshima
-
Patent number: 8321208
Abstract: An information extraction unit extracts spectral envelope information of L-dimension from each frame of speech data by discrete Fourier transform. The spectral envelope information is represented by L points. A basis storage unit stores N bases (L>N>1). Each basis is differently a frequency band having a maximum as a peak frequency in a spectral domain having L-dimension. A value corresponding to a frequency outside the frequency band along a frequency axis of the spectral domain is zero. Two frequency bands of which two peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes a distortion between the spectral envelope information and a linear combination of each basis with a coefficient for each of L points of the spectral envelope information by changing the coefficient, and sets the coefficient of each basis from which the distortion is minimized to a spectral envelope parameter of the spectral envelope information.
Type: Grant
Filed: December 3, 2008
Date of Patent: November 27, 2012
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Katsumi Tsuchiya, Takehiko Kagoshima
-
Publication number: 20120239390
Abstract: According to one embodiment, an apparatus for supporting reading of a document includes a model storage unit, a document acquisition unit, a feature information extraction unit, and an utterance style estimation unit. The model storage unit is configured to store a model which has trained a correspondence relationship between first feature information and an utterance style. The first feature information is extracted from a plurality of sentences in a training document. The document acquisition unit is configured to acquire a document to be read. The feature information extraction unit is configured to extract second feature information from each sentence in the document to be read. The utterance style estimation unit is configured to compare the second feature information of a plurality of sentences in the document to be read with the model, and to estimate an utterance style of each sentence of the document to be read.
Type: Application
Filed: September 14, 2011
Publication date: September 20, 2012
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kosei Fume, Masaru Suzuki, Masahiro Morita, Kentaro Tachibana, Kouichirou Mori, Yuji Shimizu, Takehiko Kagoshima, Masatsune Tamura, Tomohiro Yamasaki
-
Publication number: 20120185244
Abstract: According to one embodiment, in a speech processing device, an extractor windows a part of the speech signal and extracts a partial waveform. A calculator performs frequency analysis of the partial waveform to calculate a frequency spectrum. An estimator generates an artificial waveform that is a waveform according to an interval between the pitch marks for each harmonic component having a frequency that is a predetermined multiple of a fundamental frequency of the speech signal and estimates harmonic spectral features representing characteristics of the frequency spectrum of the harmonic component from each of the artificial waveforms. A separator separates the partial waveform into a periodic component produced from periodic vocal-fold vibration as an acoustic source and an aperiodic component produced from aperiodic acoustic sources other than the vocal-fold vibration by using the respective harmonic spectral features and the frequency spectrum of the partial waveform.
Type: Application
Filed: January 26, 2012
Publication date: July 19, 2012
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masahiro Morita, Javier Latorre, Takehiko Kagoshima
-
Patent number: 8224646
Abstract: The speech synthesizing device acquires numerical data at regular time intervals, each piece of the numerical data representing a value having a plurality of digits, detects a change between two values represented by the numerical data that is acquired at two consecutive times, determines which digit of the value represented by the numerical data is used to generate speech data depending on the detected change, generates numerical information that indicates the determined digit of the value represented by the numerical data, and generates speech data from the digit indicated by the numerical information.
Type: Grant
Filed: September 21, 2009
Date of Patent: July 17, 2012
Assignee: Kabushiki Kaisha Toshiba
Inventors: Ryutaro Tokuda, Takehiko Kagoshima
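The digit selection described in the abstract of patent 8224646 can be illustrated with a short sketch. The abstract does not state the exact rule for choosing digits, so the rule used here is an assumption: find the most significant decimal digit that changed between two consecutive readings, and speak only that digit and the ones below it. The function names are hypothetical.

```python
def changed_digit_position(prev: int, curr: int) -> int:
    """Return the position (0 = ones place) of the most significant
    decimal digit that differs between prev and curr, or -1 if none."""
    p, c = str(abs(prev)), str(abs(curr))
    width = max(len(p), len(c))
    p, c = p.zfill(width), c.zfill(width)
    for i, (a, b) in enumerate(zip(p, c)):
        if a != b:
            return width - 1 - i
    return -1


def digits_to_speak(prev: int, curr: int) -> str:
    """Return the digits of curr that would be passed on to speech
    generation under the assumed rule; empty string if nothing changed."""
    pos = changed_digit_position(prev, curr)
    if pos < 0:
        return ""
    text = str(abs(curr)).zfill(pos + 1)
    return text[-(pos + 1):]


small_change = digits_to_speak(1234, 1237)   # only the ones digit changed
large_change = digits_to_speak(1234, 1301)   # hundreds digit changed
no_change = digits_to_speak(1234, 1234)
```

Under this rule a slowly changing reading is announced with few digits, and a large jump is announced with more of the value.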
-
Patent number: 8195464
Abstract: A speech synthesizer includes a periodic component fusing unit and an aperiodic component fusing unit, and fuses periodic components and aperiodic components of a plurality of speech units for each segment, which are selected by a unit selector, by a periodic component fusing unit and an aperiodic component fusing unit, respectively. The speech synthesizer is further provided with an adder, so that the adder adds, edits, and concatenates the periodic components and the aperiodic components of the fused speech units to generate a speech waveform.
Type: Grant
Filed: September 18, 2008
Date of Patent: June 5, 2012
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masahiro Morita, Takehiko Kagoshima
-
Patent number: 8175881
Abstract: A phoneme sequence corresponding to a target speech is divided into a plurality of segments. A plurality of speech units for each segment is selected from a speech unit memory that stores speech units having at least one frame. The plurality of speech units has a prosodic feature accordant or similar to the target speech. A formant parameter having at least one formant frequency is generated for each frame of the plurality of speech units. A fused formant parameter of each frame is generated from formant parameters of each frame of the plurality of speech units. A fused speech unit of each segment is generated from the fused formant parameter of each frame. A synthesized speech is generated by concatenating the fused speech unit of each segment.
Type: Grant
Filed: August 14, 2008
Date of Patent: May 8, 2012
Assignee: Kabushiki Kaisha Toshiba
Inventors: Ryo Morinaka, Masatsune Tamura, Takehiko Kagoshima
-
Patent number: 8170876
Abstract: A word dictionary including sets of a character string which constitutes a word, a phoneme sequence which constitutes pronunciation of the word and a part of speech of the word is referenced, an entered text is analyzed, the entered text is divided into one or more subtexts, a phoneme sequence and a part of speech sequence are generated for each subtext, the part of speech sequence of the subtext and a list of part of speech sequence are collated to determine whether the phonetic sound of the subtext is to be converted or not, and the phonetic sounds of the phoneme sequence in the subtext whose phonetic sounds are determined to be converted are converted.
Type: Grant
Filed: September 15, 2008
Date of Patent: May 1, 2012
Assignee: Kabushiki Kaisha Toshiba
Inventors: Takehiko Kagoshima, Noriko Yamanaka, Makoto Yajima
-
Publication number: 20120065981
Abstract: According to an embodiment, a text presentation apparatus presenting text for a speaker to read aloud for voice recording includes: a text storing unit for storing first text; a presenting unit for presenting the first text; a determination unit for determining whether or not the first text needs to be replaced, on the basis of a speaker's input for the first text presented; a preliminary text storing unit for storing preliminary text; a select unit configured to select, if it is determined that the first text needs to be replaced, second text to replace the first text from among the preliminary text, the selecting being performed on the basis of attribute information describing an attribute of the first text and on the basis of at least one of attribute information describing pronunciation of the first text and attribute information describing a stress type of the first text; and a control unit configured to control the presenting unit so that the presenting unit presents the second text.
Type: Application
Filed: August 11, 2011
Publication date: March 15, 2012
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kentaro Tachibana, Gou Hirabayashi, Takehiko Kagoshima