Patents by Inventor Masatsune Tamura
Masatsune Tamura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11423874
Abstract: A speech synthesis model training device includes one or more hardware processors configured to perform the following. Storing, in a speech corpus storing unit, speech data, and pitch mark information and context information of the speech data. From the speech data, analyzing acoustic feature parameters at each pitch mark timing in pitch mark information. From the acoustic feature parameters analyzed, training a statistical model which has a plurality of states and which includes an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution based on timing parameters.
Type: Grant
Filed: July 29, 2020
Date of Patent: August 23, 2022
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune Tamura, Masahiro Morita
-
Patent number: 11348569
Abstract: A speech processing device includes a hardware processor configured to receive input speech and extract speech frames from the input speech. The hardware processor is configured to calculate a spectrum parameter for each of the speech frames, calculate a first phase spectrum for each of the speech frames, calculate a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum, calculate a band group delay parameter in a predetermined frequency band from the group delay spectrum, and calculate a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum. The hardware processor is configured to generate a speech waveform based on the spectrum parameter, the band group delay parameter, and the band group delay compensation parameter.
Type: Grant
Filed: April 7, 2020
Date of Patent: May 31, 2022
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune Tamura, Masahiro Morita
-
Patent number: 11170756
Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.
Type: Grant
Filed: April 7, 2020
Date of Patent: November 9, 2021
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune Tamura, Masahiro Morita
-
Patent number: 10878801
Abstract: A speech synthesis device of an embodiment includes a memory unit, a creating unit, a deciding unit, a generating unit and a waveform generating unit. The memory unit stores, as statistical model information of a statistical model, an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution. The creating unit creates a statistical model sequence from context information and the statistical model information. The deciding unit decides a pitch-cycle waveform count of each state using a duration based on the duration distribution of each state of each statistical model in the statistical model sequence, and pitch information based on the output distribution of the pitch feature parameters. The generating unit generates an output distribution sequence based on the pitch-cycle waveform count, and acoustic feature parameters based on the output distribution sequence.
Type: Grant
Filed: February 14, 2018
Date of Patent: December 29, 2020
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune Tamura, Masahiro Morita
-
Publication number: 20200357381
Abstract: A speech synthesis device of an embodiment includes a memory unit, a creating unit, a deciding unit, a generating unit and a waveform generating unit. The memory unit stores, as statistical model information of a statistical model, an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution. The creating unit creates a statistical model sequence from context information and the statistical model information. The deciding unit decides a pitch-cycle waveform count of each state using a duration based on the duration distribution of each state of each statistical model in the statistical model sequence, and pitch information based on the output distribution of the pitch feature parameters. The generating unit generates an output distribution sequence based on the pitch-cycle waveform count, and acoustic feature parameters based on the output distribution sequence.
Type: Application
Filed: July 29, 2020
Publication date: November 12, 2020
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune TAMURA, Masahiro MORITA
-
Publication number: 20200234692
Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.
Type: Application
Filed: April 7, 2020
Publication date: July 23, 2020
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune TAMURA, Masahiro MORITA
-
Publication number: 20200234691
Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.
Type: Application
Filed: April 7, 2020
Publication date: July 23, 2020
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune TAMURA, Masahiro MORITA
-
Patent number: 10650800
Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.
Type: Grant
Filed: February 16, 2018
Date of Patent: May 12, 2020
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune Tamura, Masahiro Morita
-
Patent number: 10529314
Abstract: A speech synthesizer includes a statistical-model sequence generator, a multiple-acoustic feature parameter sequence generator, and a waveform generator. The statistical-model sequence generator generates, based on context information corresponding to an input text, a statistical model sequence that comprises a first sequence of a statistical model comprising a plurality of states. The multiple-acoustic feature parameter sequence generator, for each speech section corresponding to each state of the statistical model sequence, selects a first plurality of acoustic feature parameters from a first set of acoustic feature parameters extracted from a first speech waveform stored in a speech database and generates a multiple-acoustic feature parameter sequence that comprises a sequence of the first plurality of acoustic feature parameters.
Type: Grant
Filed: February 16, 2017
Date of Patent: January 7, 2020
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masatsune Tamura, Masahiro Morita
-
Patent number: 10347237
Abstract: According to an embodiment, a device includes a table creator, an estimator, and a dictionary creator. The table creator is configured to create a table based on similarity between distributions of nodes of speech synthesis dictionaries of a specific speaker in respective first and second languages. The estimator is configured to estimate a matrix to transform the speech synthesis dictionary of the specific speaker in the first language to a speech synthesis dictionary of a target speaker in the first language, based on speech and a recorded text of the target speaker in the first language and the speech synthesis dictionary of the specific speaker in the first language. The dictionary creator is configured to create a speech synthesis dictionary of the target speaker in the second language, based on the table, the matrix, and the speech synthesis dictionary of the specific speaker in the second language.
Type: Grant
Filed: July 9, 2015
Date of Patent: July 9, 2019
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Kentaro Tachibana, Masatsune Tamura, Yamato Ohtani
-
Patent number: 10157608
Abstract: According to an embodiment, a voice processing device includes an interface system, a determining processor, and a predicting processor. The interface system configured to receive neutral voice data representing audio in a neutral voice of a user. The determining processor configured to determine a predictive parameter based at least in part on the neutral voice data. The predicting processor configured to predict a voice conversion model for converting the neutral voice of the speaker to a target voice using at least the predictive parameter.
Type: Grant
Filed: February 15, 2017
Date of Patent: December 18, 2018
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Yamato Ohtani, Yu Nasu, Masatsune Tamura, Masahiro Morita
-
Patent number: 10109286
Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.
Type: Grant
Filed: September 14, 2017
Date of Patent: October 23, 2018
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Kentaro Tachibana, Takehiko Kagoshima, Masatsune Tamura, Masahiro Morita
-
Publication number: 20180174570
Abstract: A speech synthesis device of an embodiment includes a memory unit, a creating unit, a deciding unit, a generating unit and a waveform generating unit. The memory unit stores, as statistical model information of a statistical model, an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution. The creating unit creates a statistical model sequence from context information and the statistical model information. The deciding unit decides a pitch-cycle waveform count of each state using a duration based on the duration distribution of each state of each statistical model in the statistical model sequence, and pitch information based on the output distribution of the pitch feature parameters. The generating unit generates an output distribution sequence based on the pitch-cycle waveform count, and acoustic feature parameters based on the output distribution sequence.
Type: Application
Filed: February 14, 2018
Publication date: June 21, 2018
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune TAMURA, Masahiro MORITA
-
Publication number: 20180174571
Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.
Type: Application
Filed: February 16, 2018
Publication date: June 21, 2018
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Masatsune TAMURA, Masahiro MORITA
-
Patent number: 9870779
Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.
Type: Grant
Filed: July 16, 2015
Date of Patent: January 16, 2018
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Kentaro Tachibana, Takehiko Kagoshima, Masatsune Tamura, Masahiro Morita
-
Publication number: 20180005637
Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.
Type: Application
Filed: September 14, 2017
Publication date: January 4, 2018
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kentaro TACHIBANA, Takehiko KAGOSHIMA, Masatsune TAMURA, Masahiro MORITA
-
Patent number: 9830904
Abstract: According to an embodiment, a text-to-speech device includes a context acquirer, an acoustic model parameter acquirer, a conversion parameter acquirer, a converter, and a waveform generator. The context acquirer is configured to acquire a context sequence affecting fluctuations in voice. The acoustic model parameter acquirer is configured to acquire an acoustic model parameter sequence that corresponds to the context sequence and represents an acoustic model in a standard speaking style of a target speaker. The conversion parameter acquirer is configured to acquire a conversion parameter sequence corresponding to the context sequence to convert an acoustic model parameter in the standard speaking style into one in a different speaking style. The converter is configured to convert the acoustic model parameter sequence using the conversion parameter sequence. The waveform generator is configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.
Type: Grant
Filed: June 17, 2016
Date of Patent: November 28, 2017
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Yu Nasu, Masatsune Tamura, Ryo Morinaka, Masahiro Morita
-
Publication number: 20170162186
Abstract: A speech synthesizer includes a statistical-model sequence generator, a multiple-acoustic feature parameter sequence generator, and a waveform generator. The statistical-model sequence generator generates, based on context information corresponding to an input text, a statistical model sequence that comprises a first sequence of a statistical model comprising a plurality of states. The multiple-acoustic feature parameter sequence generator, for each speech section corresponding to each state of the statistical model sequence, selects a first plurality of acoustic feature parameters from a first set of acoustic feature parameters extracted from a first speech waveform stored in a speech database and generates a multiple-acoustic feature parameter sequence that comprises a sequence of the first plurality of acoustic feature parameters.
Type: Application
Filed: February 16, 2017
Publication date: June 8, 2017
Inventors: Masatsune TAMURA, Masahiro MORITA
-
Publication number: 20170162187
Abstract: According to an embodiment, a voice processing device includes an interface system, a determining processor, and a predicting processor. The interface system configured to receive neutral voice data representing audio in a neutral voice of a user. The determining processor configured to determine a predictive parameter based at least in part on the neutral voice data. The predicting processor configured to predict a voice conversion model for converting the neutral voice of the speaker to a target voice using at least the predictive parameter.
Type: Application
Filed: February 15, 2017
Publication date: June 8, 2017
Inventors: Yamato OHTANI, Yu NASU, Masatsune TAMURA, Masahiro MORITA
-
Publication number: 20160300564
Abstract: According to an embodiment, a text-to-speech device includes a context acquirer, an acoustic model parameter acquirer, a conversion parameter acquirer, a converter, and a waveform generator. The context acquirer is configured to acquire a context sequence affecting fluctuations in voice. The acoustic model parameter acquirer is configured to acquire an acoustic model parameter sequence that corresponds to the context sequence and represents an acoustic model in a standard speaking style of a target speaker. The conversion parameter acquirer is configured to acquire a conversion parameter sequence corresponding to the context sequence to convert an acoustic model parameter in the standard speaking style into one in a different speaking style. The converter is configured to convert the acoustic model parameter sequence using the conversion parameter sequence. The waveform generator is configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.
Type: Application
Filed: June 17, 2016
Publication date: October 13, 2016
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Yu NASU, Masatsune Tamura, Ryo Morinaka, Masahiro Morita