Patents by Inventor Masatsune Tamura
Masatsune Tamura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9280967
Abstract: According to one embodiment, an apparatus for supporting reading of a document includes a model storage unit, a document acquisition unit, a feature information extraction unit, and an utterance style estimation unit. The model storage unit is configured to store a model trained on a correspondence relationship between first feature information and an utterance style. The first feature information is extracted from a plurality of sentences in a training document. The document acquisition unit is configured to acquire a document to be read. The feature information extraction unit is configured to extract second feature information from each sentence in the document to be read. The utterance style estimation unit is configured to compare the second feature information of a plurality of sentences in the document to be read with the model, and to estimate an utterance style of each sentence of the document to be read.
Type: Grant
Filed: September 14, 2011
Date of Patent: March 8, 2016
Assignee: Kabushiki Kaisha Toshiba
Inventors: Kosei Fume, Masaru Suzuki, Masahiro Morita, Kentaro Tachibana, Kouichirou Mori, Yuji Shimizu, Takehiko Kagoshima, Masatsune Tamura, Tomohiro Yamasaki
-
Publication number: 20160012035
Abstract: According to an embodiment, a device includes a table creator, an estimator, and a dictionary creator. The table creator is configured to create a table based on similarity between distributions of nodes of speech synthesis dictionaries of a specific speaker in respective first and second languages. The estimator is configured to estimate a matrix to transform the speech synthesis dictionary of the specific speaker in the first language to a speech synthesis dictionary of a target speaker in the first language, based on speech and a recorded text of the target speaker in the first language and the speech synthesis dictionary of the specific speaker in the first language. The dictionary creator is configured to create a speech synthesis dictionary of the target speaker in the second language, based on the table, the matrix, and the speech synthesis dictionary of the specific speaker in the second language.
Type: Application
Filed: July 9, 2015
Publication date: January 14, 2016
Inventors: Kentaro Tachibana, Masatsune Tamura, Yamato Ohtani
-
Publication number: 20150325232
Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.
Type: Application
Filed: July 16, 2015
Publication date: November 12, 2015
Applicant: Kabushiki Kaisha Toshiba
Inventors: Kentaro Tachibana, Takehiko Kagoshima, Masatsune Tamura, Masahiro Morita
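The abstract above pairs a pulse source with phase modulation keyed to audio watermarking information. The patent does not disclose the exact modulation or detection rule, so the sketch below is only one illustrative reading: `embed_bit_in_pulse`, `detect_bit`, and the ±π/2 phase rotation are all assumptions, not the claimed method.

```python
import numpy as np

def embed_bit_in_pulse(pulse, bit, shift=np.pi / 2):
    """Rotate the phase spectrum of one pitch-mark pulse to carry a watermark bit.

    The Fourier phase is rotated by +shift for bit 1 and -shift for bit 0; the
    magnitude spectrum (and hence the later vocal tract filtering) is untouched.
    """
    spec = np.fft.rfft(pulse)
    spec = spec * np.exp(1j * (shift if bit else -shift))
    # Force the DC and Nyquist bins real so the inverse transform stays real-valued.
    spec[0] = spec[0].real
    spec[-1] = spec[-1].real
    return np.fft.irfft(spec, n=len(pulse))

def detect_bit(original, received):
    """Read the bit back from the sign of the magnitude-weighted phase shift."""
    corr = np.sum(np.conj(np.fft.rfft(original)) * np.fft.rfft(received))
    return int(corr.imag > 0)

# One pitch period of a crude glottal-pulse stand-in.
pulse = np.hanning(64) * np.sin(np.linspace(0, np.pi, 64))
marked = embed_bit_in_pulse(pulse, 1)
```

Because only the phase is altered, the watermark is inaudible in the magnitude spectrum yet detectable by comparing against an unmarked pulse.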
-
Patent number: 9135910
Abstract: According to an embodiment, a speech synthesis device includes a first storage, a second storage, a first generator, a second generator, a third generator, and a fourth generator. The first storage is configured to store therein first information obtained from a target uttered voice. The second storage is configured to store therein second information obtained from an arbitrary uttered voice. The first generator is configured to generate third information by converting the second information so as to be close to a target voice quality or prosody. The second generator is configured to generate an information set including the first information and the third information. The third generator is configured to generate fourth information used to generate a synthesized speech, based on the information set. The fourth generator is configured to generate the synthesized speech corresponding to input text using the fourth information.
Type: Grant
Filed: February 12, 2013
Date of Patent: September 15, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Masahiro Morita
-
Patent number: 9110887
Abstract: According to one embodiment, a speech synthesis apparatus includes a language analyzer, statistical model storage, model selector, parameter generator, basis model storage, and filter processor. The language analyzer analyzes text data and outputs language information data that represents linguistic information of the text data. The statistical model storage stores statistical models prepared by statistically modeling acoustic information included in speech. The model selector selects a statistical model from the models based on the language information data. The parameter generator generates speech parameter sequences using the statistical model selected by the model selector. The basis model storage stores a basis model including basis vectors, each of which expresses speech information for a limited frequency range. The filter processor outputs synthetic speech by executing filter processing of the speech parameter sequences and the basis model.
Type: Grant
Filed: December 26, 2012
Date of Patent: August 18, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Yamato Ohtani, Masatsune Tamura, Masahiro Morita
-
Patent number: 9058807
Abstract: According to one embodiment, a first storage unit stores n band noise signals obtained by applying n band-pass filters to a noise signal. A second storage unit stores n band pulse signals. A parameter input unit inputs a fundamental frequency, n band noise intensities, and a spectrum parameter. An extraction unit extracts, for each pitch mark, the n band noise signals while shifting. An amplitude control unit changes amplitudes of the extracted band noise signals and band pulse signals in accordance with the band noise intensities. A generation unit generates a mixed sound source signal by adding the n band noise signals and the n band pulse signals, and superimposes the mixed sound source signal based on the pitch marks. A vocal tract filter unit generates a speech waveform by applying a vocal tract filter using the spectrum parameter to the generated mixed sound source signal.
Type: Grant
Filed: March 18, 2011
Date of Patent: June 16, 2015
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
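The core band-mixing step can be sketched as a weighted sum of pre-filtered band signals. This is a minimal illustration, assuming band noise intensities in [0, 1] and already-extracted band signals; the function name and weighting convention are illustrative, not from the patent.

```python
import numpy as np

def mixed_excitation(band_noise, band_pulse, noise_intensity):
    """Mix n band noise signals and n band pulse signals into one source signal.

    band_noise, band_pulse: arrays of shape (n, T), one band-limited signal per row.
    noise_intensity: length-n weights in [0, 1]; 1 makes a band pure noise,
    0 makes it a pure pulse, following the band-noise-intensity idea above.
    """
    w = np.asarray(noise_intensity, dtype=float)[:, None]
    # Scale each band's noise up and its pulse down, then sum across bands.
    return np.sum(w * band_noise + (1.0 - w) * band_pulse, axis=0)

rng = np.random.default_rng(0)
n_bands, T = 4, 200
band_noise = rng.standard_normal((n_bands, T)) * 0.1
band_pulse = np.zeros((n_bands, T))
band_pulse[:, ::50] = 1.0          # pulses at every pitch mark (period 50 samples)
source = mixed_excitation(band_noise, band_pulse, [0.9, 0.6, 0.3, 0.0])
```

Low noise intensities keep low bands pulse-dominated (voiced) while high intensities make high bands noisy, which is the usual motivation for mixed excitation.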
-
Publication number: 20150081306
Abstract: According to an embodiment, a prosody editing device includes an approximate contour generator, a setter, a display controller, an operation receiver, and an updater. The approximate contour generator approximates a contour representing a time series of prosody information with a parametric curve including a control point to generate an approximate contour. The setter sets, on the approximate contour, an operation point corresponding to the control point. The display controller displays, on a display device, an operation screen including the approximate contour on which the operation point is shown. The operation receiver receives an operation to move an operation point arbitrarily selected on the operation screen. The updater calculates a position of the control point from the movement amount of the operation point and updates the approximate contour.
Type: Application
Filed: September 2, 2014
Publication date: March 19, 2015
Inventors: Kouichirou Mori, Yu Nasu, Masatsune Tamura, Masahiro Morita
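The edit loop described above can be sketched with a quadratic Bézier curve standing in for the parametric curve, simplified so the operation point coincides directly with a control point (the patent leaves the curve family and the point mapping unspecified; every name here is hypothetical).

```python
import numpy as np

def bezier_contour(control_points, num=100):
    """Evaluate a quadratic Bezier curve approximating an F0 contour.

    control_points: three (time, value) pairs; returns an array of shape (num, 2).
    """
    p = np.asarray(control_points, dtype=float)
    t = np.linspace(0.0, 1.0, num)[:, None]
    return (1 - t) ** 2 * p[0] + 2 * (1 - t) * t * p[1] + t ** 2 * p[2]

# Approximate a rise-fall pitch contour, then "drag" the middle operation
# point: moving it updates the control point and regenerates the curve.
ctrl = [(0.0, 120.0), (0.5, 180.0), (1.0, 110.0)]
contour = bezier_contour(ctrl)
ctrl[1] = (0.5, 200.0)             # user drags the mid operation point upward
updated = bezier_contour(ctrl)
```

Editing a handful of control points instead of every frame of the prosody time series is exactly what makes this kind of contour editor tractable.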
-
Publication number: 20140257816
Abstract: According to an embodiment, a speech synthesis dictionary modification device includes an extracting unit, a display unit, an acquiring unit, a modification unit, and an updating unit. The extracting unit extracts synthesis information containing a feature sequence of a synthetic speech from the synthetic speech generated by using a speech synthesis dictionary containing probability distributions of speech features. The display unit displays an image prompting the user to modify a probability distribution contained in the speech synthesis dictionary, on the basis of the synthesis information extracted by the extracting unit. The acquiring unit acquires an instruction to modify the probability distribution contained in the speech synthesis dictionary. The modification unit modifies the probability distribution contained in the speech synthesis dictionary according to the instruction.
Type: Application
Filed: January 31, 2014
Publication date: September 11, 2014
Applicant: Kabushiki Kaisha Toshiba
Inventors: Ryo Morinaka, Masatsune Tamura, Masahiro Morita
-
Publication number: 20130262087
Abstract: According to one embodiment, a speech synthesis apparatus includes a language analyzer, statistical model storage, model selector, parameter generator, basis model storage, and filter processor. The language analyzer analyzes text data and outputs language information data that represents linguistic information of the text data. The statistical model storage stores statistical models prepared by statistically modeling acoustic information included in speech. The model selector selects a statistical model from the models based on the language information data. The parameter generator generates speech parameter sequences using the statistical model selected by the model selector. The basis model storage stores a basis model including basis vectors, each of which expresses speech information for a limited frequency range. The filter processor outputs synthetic speech by executing filter processing of the speech parameter sequences and the basis model.
Type: Application
Filed: December 26, 2012
Publication date: October 3, 2013
Applicant: Kabushiki Kaisha Toshiba
Inventors: Yamato Ohtani, Masatsune Tamura, Masahiro Morita
-
Patent number: 8438033
Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech, and stores, in a voice conversion rule memory, a voice conversion rule for converting the voice quality of source speech into that of the target speech. The apparatus extracts a source speech spectral parameter from input source speech, converts the extracted parameter into a first conversion spectral parameter by using the voice conversion rule, selects a target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory, generates an aperiodic component spectral parameter from the selected target speech spectral parameter, mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter to obtain a second conversion spectral parameter, and generates a speech waveform from the second conversion spectral parameter.
Type: Grant
Filed: July 20, 2009
Date of Patent: May 7, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
-
Publication number: 20120323569
Abstract: According to one embodiment, a speech processing apparatus includes a histogram calculation unit, a cumulative frequency calculation unit, and a filter production unit. The histogram calculation unit is configured to calculate a first histogram from a first speech feature extracted from speech data, and to calculate a second histogram from a second speech feature different from the first speech feature. The cumulative frequency calculation unit is configured to calculate a first cumulative frequency by accumulating the frequencies of the first histogram, and to calculate a second cumulative frequency by accumulating the frequencies of the second histogram. The filter production unit is configured to produce a filter having a characteristic that brings the second cumulative frequency close to the first cumulative frequency.
Type: Application
Filed: March 15, 2012
Publication date: December 20, 2012
Applicant: Kabushiki Kaisha Toshiba
Inventors: Yamato Ohtani, Masatsune Tamura, Masahiro Morita
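Matching one cumulative histogram to another is quantile (CDF) matching. The sketch below illustrates that reading of the abstract; the bin count, helper names, and NumPy interpolation details are assumptions rather than anything the application specifies.

```python
import numpy as np

def cdf_matching_map(source_feature, target_feature, n_bins=64):
    """Build a mapping that pushes the source feature's distribution toward
    the target's by aligning the two cumulative histograms."""
    lo = min(source_feature.min(), target_feature.min())
    hi = max(source_feature.max(), target_feature.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    src_hist, _ = np.histogram(source_feature, bins=edges)
    tgt_hist, _ = np.histogram(target_feature, bins=edges)
    src_cdf = np.cumsum(src_hist) / src_hist.sum()
    tgt_cdf = np.cumsum(tgt_hist) / tgt_hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])

    def mapper(x):
        # source value -> its quantile -> target value at the same quantile
        q = np.interp(x, centers, src_cdf)
        return np.interp(q, tgt_cdf, centers)

    return mapper

rng = np.random.default_rng(1)
src = rng.normal(0.0, 1.0, 5000)    # e.g. a spectral feature of one voice
tgt = rng.normal(3.0, 0.5, 5000)    # the feature whose distribution we match
mapped = cdf_matching_map(src, tgt)(src)
```

After mapping, the source samples take on roughly the target's mean and spread, which is the property a filter "bringing one cumulative frequency close to the other" would provide.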
-
Patent number: 8321208
Abstract: An information extraction unit extracts L-dimensional spectral envelope information from each frame of speech data by discrete Fourier transform. The spectral envelope information is represented by L points. A basis storage unit stores N bases (L > N > 1). Each basis has its maximum at a different peak frequency within a frequency band of the L-dimensional spectral domain, and its value is zero for frequencies outside that band along the frequency axis. Two frequency bands whose peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes the distortion, over the L points of the spectral envelope information, between the spectral envelope information and a linear combination of the bases by changing the coefficient of each basis, and sets the distortion-minimizing coefficients as the spectral envelope parameter of the spectral envelope information.
Type: Grant
Filed: December 3, 2008
Date of Patent: November 27, 2012
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Katsumi Tsuchiya, Takehiko Kagoshima
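The decomposition described above can be sketched with overlapping triangular bases and an ordinary least-squares fit. The patent fixes neither the basis shape nor the optimizer, so the triangles, the even peak spacing, and the helper names below are all assumptions.

```python
import numpy as np

def triangular_bases(n_bases, n_points):
    """N overlapping triangular bases on an L-point frequency axis.

    Each basis peaks at a different frequency, is zero outside its band, and
    bands with adjacent peaks overlap by half their width, as in the abstract.
    """
    peaks = np.linspace(0, n_points - 1, n_bases)
    half = peaks[1] - peaks[0]                  # half-width = peak spacing
    freq = np.arange(n_points)
    bases = np.zeros((n_bases, n_points))
    for i, p in enumerate(peaks):
        bases[i] = np.clip(1.0 - np.abs(freq - p) / half, 0.0, None)
    return bases

L, N = 64, 9
bases = triangular_bases(N, L)
# A smooth spectral-envelope-like target curve over the L points.
envelope = 1.0 + 0.5 * np.cos(np.linspace(0, 2 * np.pi, L))
# Least squares: coefficients minimizing || envelope - coeffs @ bases ||.
coeffs, *_ = np.linalg.lstsq(bases.T, envelope, rcond=None)
recon = coeffs @ bases
```

The N coefficients then serve as a compact spectral envelope parameter: 9 numbers reconstruct the 64-point envelope closely because the overlapping bases tile the frequency axis.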
-
Publication number: 20120239390
Abstract: According to one embodiment, an apparatus for supporting reading of a document includes a model storage unit, a document acquisition unit, a feature information extraction unit, and an utterance style estimation unit. The model storage unit is configured to store a model trained on a correspondence relationship between first feature information and an utterance style. The first feature information is extracted from a plurality of sentences in a training document. The document acquisition unit is configured to acquire a document to be read. The feature information extraction unit is configured to extract second feature information from each sentence in the document to be read. The utterance style estimation unit is configured to compare the second feature information of a plurality of sentences in the document to be read with the model, and to estimate an utterance style of each sentence of the document to be read.
Type: Application
Filed: September 14, 2011
Publication date: September 20, 2012
Applicant: Kabushiki Kaisha Toshiba
Inventors: Kosei Fume, Masaru Suzuki, Masahiro Morita, Kentaro Tachibana, Kouichirou Mori, Yuji Shimizu, Takehiko Kagoshima, Masatsune Tamura, Tomohiro Yamasaki
-
Patent number: 8175881
Abstract: A phoneme sequence corresponding to a target speech is divided into a plurality of segments. A plurality of speech units for each segment is selected from a speech unit memory that stores speech units having at least one frame. The plurality of speech units has prosodic features accordant with or similar to those of the target speech. A formant parameter having at least one formant frequency is generated for each frame of the plurality of speech units. A fused formant parameter of each frame is generated from the formant parameters of each frame of the plurality of speech units. A fused speech unit of each segment is generated from the fused formant parameter of each frame. A synthesized speech is generated by concatenating the fused speech unit of each segment.
Type: Grant
Filed: August 14, 2008
Date of Patent: May 8, 2012
Assignee: Kabushiki Kaisha Toshiba
Inventors: Ryo Morinaka, Masatsune Tamura, Takehiko Kagoshima
-
Publication number: 20120053933
Abstract: According to one embodiment, a first storage unit stores n band noise signals obtained by applying n band-pass filters to a noise signal. A second storage unit stores n band pulse signals. A parameter input unit inputs a fundamental frequency, n band noise intensities, and a spectrum parameter. An extraction unit extracts, for each pitch mark, the n band noise signals while shifting. An amplitude control unit changes amplitudes of the extracted band noise signals and band pulse signals in accordance with the band noise intensities. A generation unit generates a mixed sound source signal by adding the n band noise signals and the n band pulse signals, and superimposes the mixed sound source signal based on the pitch marks. A vocal tract filter unit generates a speech waveform by applying a vocal tract filter using the spectrum parameter to the generated mixed sound source signal.
Type: Application
Filed: March 18, 2011
Publication date: March 1, 2012
Applicant: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
-
Patent number: 8010362
Abstract: A voice conversion rule and a rule selection parameter are stored. The voice conversion rule converts a spectral parameter vector of a source speaker to a spectral parameter vector of a target speaker. The rule selection parameter represents the spectral parameter vector of the source speaker. A first voice conversion rule for the start time and a second voice conversion rule for the end time of a speech unit of the source speaker are selected by the spectral parameter vectors of the start time and the end time. An interpolation coefficient corresponding to the spectral parameter vector of each time in the speech unit is calculated from the first voice conversion rule and the second voice conversion rule. A third voice conversion rule corresponding to the spectral parameter vector of each time in the speech unit is calculated by interpolating the first voice conversion rule and the second voice conversion rule with the interpolation coefficient.
Type: Grant
Filed: January 22, 2008
Date of Patent: August 30, 2011
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Takehiko Kagoshima
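One concrete reading of the interpolation step treats each rule as a linear transform (a matrix W and bias b) and lets the coefficient run linearly over the unit's frames. Both of those choices, and all names below, are assumptions made for illustration.

```python
import numpy as np

def interpolated_rule(rule_start, rule_end, alpha):
    """Blend the start-time and end-time conversion rules with coefficient alpha.

    Each rule is a pair (W, b) applied to a spectral parameter vector as W @ x + b;
    alpha runs from 0 at the unit's start to 1 at its end.
    """
    W = (1.0 - alpha) * rule_start[0] + alpha * rule_end[0]
    b = (1.0 - alpha) * rule_start[1] + alpha * rule_end[1]
    return W, b

def convert_unit(frames, rule_start, rule_end):
    """Convert every frame of a speech unit with the time-varying rule."""
    out = []
    for t, x in enumerate(frames):
        alpha = t / max(len(frames) - 1, 1)
        W, b = interpolated_rule(rule_start, rule_end, alpha)
        out.append(W @ x + b)
    return np.array(out)

dim = 3
rule_a = (np.eye(dim) * 1.2, np.zeros(dim))       # rule selected at unit start
rule_b = (np.eye(dim) * 0.8, np.ones(dim))        # rule selected at unit end
frames = np.ones((5, dim))
converted = convert_unit(frames, rule_a, rule_b)
```

Interpolating the rule rather than switching abruptly avoids discontinuities in the converted spectrum at unit boundaries, which is the motivation the abstract implies.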
-
Publication number: 20100049522
Abstract: A voice conversion apparatus stores, in a parameter memory, target speech spectral parameters of target speech, and stores, in a voice conversion rule memory, a voice conversion rule for converting the voice quality of source speech into that of the target speech. The apparatus extracts a source speech spectral parameter from input source speech, converts the extracted parameter into a first conversion spectral parameter by using the voice conversion rule, selects a target speech spectral parameter similar to the first conversion spectral parameter from the parameter memory, generates an aperiodic component spectral parameter from the selected target speech spectral parameter, mixes a periodic component spectral parameter included in the first conversion spectral parameter with the aperiodic component spectral parameter to obtain a second conversion spectral parameter, and generates a speech waveform from the second conversion spectral parameter.
Type: Application
Filed: July 20, 2009
Publication date: February 25, 2010
Inventors: Masatsune Tamura, Masahiro Morita, Takehiko Kagoshima
-
Patent number: 7630896
Abstract: A speech synthesis system in a preferred embodiment includes a speech unit storage section, a phonetic environment storage section, a phonetic sequence/prosodic information input section, a plural-speech-unit selection section, a fused-speech-unit sequence generation section, and a fused-speech-unit modification/concatenation section. A fused speech unit is generated by fusing a plurality of selected speech units in the fused-speech-unit sequence generation section. In that section, the average power information is calculated for the selected M speech units, N speech units are fused together, and the power information of the fused speech unit is corrected to equal the average power information of the M speech units.
Type: Grant
Filed: September 23, 2005
Date of Patent: December 8, 2009
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Gou Hirabayashi, Takehiko Kagoshima
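The power correction described above reduces to a single scaling. A minimal sketch, assuming "power" means mean squared amplitude and that fusion is simple averaging (both assumptions; the patent uses formant-level fusion elsewhere in this listing):

```python
import numpy as np

def equalize_power(fused_unit, selected_units):
    """Scale a fused speech unit so its power equals the average power
    of the M originally selected units."""
    target_power = np.mean([np.mean(u ** 2) for u in selected_units])
    fused_power = np.mean(fused_unit ** 2)
    return fused_unit * np.sqrt(target_power / fused_power)

rng = np.random.default_rng(2)
units = [rng.standard_normal(80) * s for s in (0.5, 1.0, 1.5)]  # M = 3 units
fused = np.mean(units, axis=0)     # N units fused by simple averaging
corrected = equalize_power(fused, units)
```

Averaging uncorrelated units lowers the waveform's energy, so without this correction fused units would sound quieter than the units they came from; the scaling restores the original loudness.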
-
Patent number: 7580839
Abstract: A speech processing apparatus according to an embodiment of the invention includes a conversion-source-speaker speech-unit database, a voice-conversion-rule-learning-data generating means, and a voice-conversion-rule learning means, with which it makes voice conversion rules. The voice-conversion-rule-learning-data generating means includes a conversion-target-speaker speech-unit extracting means, an attribute-information generating means, a conversion-source-speaker speech-unit database, and a conversion-source-speaker speech-unit selection means.
Type: Grant
Filed: September 19, 2006
Date of Patent: August 25, 2009
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Takehiko Kagoshima
-
Publication number: 20090144053
Abstract: An information extraction unit extracts L-dimensional spectral envelope information from each frame of speech data. The spectral envelope information does not have a spectral fine structure. A basis storage unit stores N bases (L > N > 1). Each basis has its maximum at a different peak frequency within a frequency band of the L-dimensional spectral domain, and its value is zero for frequencies outside that band along the frequency axis. Two frequency bands whose peak frequencies are adjacent along the frequency axis partially overlap. A parameter calculation unit minimizes the distortion between the spectral envelope information and a linear combination of the bases by changing the coefficient of each basis, and sets the distortion-minimizing coefficients as the spectral envelope parameter of the spectral envelope information.
Type: Application
Filed: December 3, 2008
Publication date: June 4, 2009
Applicant: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Katsumi Tsuchiya, Takehiko Kagoshima