Patents by Inventor Masatsune Tamura

Masatsune Tamura has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SPEECH SYNTHESIS DEVICE, SPEECH SYNTHESIS METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20250006176

Abstract: The response time till generation of waveforms is improved, and detailed editing for a metrical-feature quantity based on the entire input is performable before generation of waveforms. A speech synthesis device includes an analyzing unit, a first-processing unit, and a second-processing unit. The analyzing unit analyzes an input text and generates a language feature quantity sequence including one or more vectors indicating a language feature quantity. The first-processing unit includes an encoder that converts the language feature quantity sequence into an intermediate expression sequence including one or more vectors indicating a latent variable, using a first neural network; and includes a metrical-feature quantity decoder that generates a metrical-feature quantity from the intermediate expression sequence using a second neural network.

Type: Application

Filed: September 13, 2024

Publication date: January 2, 2025

Applicants: KABUSHIKI KAISHA TOSHIBA, TOSHIBA DIGITAL SOLUTIONS CORPORATION

Inventors: Yoshiki HIRUTA, Masatsune TAMURA

Speech synthesis statistical model training device, speech synthesis statistical model training method, and computer program product

Patent number: 11423874

Abstract: A speech synthesis model training device includes one or more hardware processors configured to perform the following. Storing, in a speech corpus storing unit, speech data, and pitch mark information and context information of the speech data. From the speech data, analyzing acoustic feature parameters at each pitch mark timing in pitch mark information. From the acoustic feature parameters analyzed, training a statistical model which has a plurality of states and which includes an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution based on timing parameters.

Type: Grant

Filed: July 29, 2020

Date of Patent: August 23, 2022

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune Tamura, Masahiro Morita

Speech processing device, speech processing method, and computer program product using compensation parameters

Patent number: 11348569

Abstract: A speech processing device includes a hardware processor configured to receive input speech and extract speech frames from the input speech. The hardware processor is configured to calculate a spectrum parameter for each of the speech frames, calculate a first phase spectrum for each of the speech frames, calculate a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum, calculate a band group delay parameter in a predetermined frequency band from the group delay spectrum, and calculate a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum. The hardware processor is configured to generate a speech waveform based on the spectrum parameter, the band group delay parameter, and the band group delay compensation parameter.

Type: Grant

Filed: April 7, 2020

Date of Patent: May 31, 2022

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune Tamura, Masahiro Morita
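As a rough illustration of the group delay steps named in the abstract above (not the patented method itself), the following Python sketch takes the phase spectrum of one frame, derives a group delay spectrum as the negative slope of the unwrapped phase over frequency, and averages it per band. The function names, the naive DFT, the band edges, the toy input frame, and the use of a plain mean per band are all assumptions made for this example; the compensation parameter step is not covered.

```python
import cmath
import math


def unwrapped_phase_spectrum(frame):
    """Return the unwrapped phase of bins 0..N/2 of a naive DFT of `frame`."""
    n = len(frame)
    phases = []
    for k in range(n // 2 + 1):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                  for t, s in enumerate(frame))
        ph = cmath.phase(x_k)
        if phases:
            # Shift by multiples of 2*pi so neighbouring bins differ by at most pi.
            while ph - phases[-1] > math.pi:
                ph -= 2 * math.pi
            while ph - phases[-1] < -math.pi:
                ph += 2 * math.pi
        phases.append(ph)
    return phases


def group_delay_spectrum(phases, n):
    """Group delay in samples: negative phase slope over angular frequency."""
    bin_width = 2 * math.pi / n  # angular frequency step between adjacent bins
    return [-(phases[k + 1] - phases[k]) / bin_width
            for k in range(len(phases) - 1)]


def band_group_delay(group_delay, band_edges):
    """Average the group delay inside each [lo, hi) range of bin indices."""
    return [sum(group_delay[lo:hi]) / (hi - lo)
            for lo, hi in zip(band_edges[:-1], band_edges[1:])]


# Toy frame (assumed): a unit impulse delayed by 5 samples.
frame = [0.0] * 64
frame[5] = 1.0
phases = unwrapped_phase_spectrum(frame)
delays = group_delay_spectrum(phases, len(frame))
bands = band_group_delay(delays, [0, 8, 16, 32])  # hypothetical band edges
print(bands)
```

For a pure delay the phase is linear in frequency, so every band value printed here should be close to the 5-sample delay of the toy frame.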

Speech processing device, speech processing method, and computer program product

Patent number: 11170756

Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.

Type: Grant

Filed: April 7, 2020

Date of Patent: November 9, 2021

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune Tamura, Masahiro Morita

Statistical speech synthesis device, method, and computer program product using pitch-cycle counts based on state durations

Patent number: 10878801

Abstract: A speech synthesis device of an embodiment includes a memory unit, a creating unit, a deciding unit, a generating unit and a waveform generating unit. The memory unit stores, as statistical model information of a statistical model, an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution. The creating unit creates a statistical model sequence from context information and the statistical model information. The deciding unit decides a pitch-cycle waveform count of each state using a duration based on the duration distribution of each state of each statistical model in the statistical model sequence, and pitch information based on the output distribution of the pitch feature parameters. The generating unit generates an output distribution sequence based on the pitch-cycle waveform count, and acoustic feature parameters based on the output distribution sequence.

Type: Grant

Filed: February 14, 2018

Date of Patent: December 29, 2020

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune Tamura, Masahiro Morita
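The abstract above says only that a pitch-cycle waveform count per state is decided from a state duration and pitch information. One plausible reading, sketched below in Python, is that a state lasting `duration` seconds at fundamental frequency `f0` holds roughly `duration * f0` pitch cycles. That relation, the function name, the minimum count of 1, and the sample values are assumptions for illustration and are not taken from the patent; unvoiced states are ignored.

```python
def pitch_cycle_counts(state_durations_sec, state_f0_hz):
    """Estimate how many pitch-cycle waveforms fit in each model state."""
    counts = []
    for duration, f0 in zip(state_durations_sec, state_f0_hz):
        # One pitch cycle lasts 1 / f0 seconds, so about duration * f0 cycles fit.
        # Assumed rule: round to the nearest integer and keep at least one cycle.
        counts.append(max(1, round(duration * f0)))
    return counts


# Hypothetical mean durations (seconds) and mean F0 values (Hz) for three states.
durations = [0.05, 0.10, 0.08]
f0_values = [200.0, 150.0, 125.0]
counts = pitch_cycle_counts(durations, f0_values)
print(counts)
```

Under this reading, a higher pitch or a longer state both increase the number of pitch-cycle waveforms generated for that state.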

SPEECH SYNTHESIS DEVICE, SPEECH SYNTHESIS METHOD, SPEECH SYNTHESIS MODEL TRAINING DEVICE, SPEECH SYNTHESIS MODEL TRAINING METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20200357381

Abstract: A speech synthesis device of an embodiment includes a memory unit, a creating unit, a deciding unit, a generating unit and a waveform generating unit. The memory unit stores, as statistical model information of a statistical model, an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution. The creating unit creates a statistical model sequence from context information and the statistical model information. The deciding unit decides a pitch-cycle waveform count of each state using a duration based on the duration distribution of each state of each statistical model in the statistical model sequence, and pitch information based on the output distribution of the pitch feature parameters. The generating unit generates an output distribution sequence based on the pitch-cycle waveform count, and acoustic feature parameters based on the output distribution sequence.

Type: Application

Filed: July 29, 2020

Publication date: November 12, 2020

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune TAMURA, Masahiro MORITA

SPEECH PROCESSING DEVICE, SPEECH PROCESSING METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20200234692

Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.

Type: Application

Filed: April 7, 2020

Publication date: July 23, 2020

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune TAMURA, Masahiro MORITA

SPEECH PROCESSING DEVICE, SPEECH PROCESSING METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20200234691

Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.

Type: Application

Filed: April 7, 2020

Publication date: July 23, 2020

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune TAMURA, Masahiro MORITA

Speech processing device, speech processing method, and computer program product

Patent number: 10650800

Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.

Type: Grant

Filed: February 16, 2018

Date of Patent: May 12, 2020

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune Tamura, Masahiro Morita

Speech synthesizer, and speech synthesis method and computer program product utilizing multiple-acoustic feature parameters selection

Patent number: 10529314

Abstract: A speech synthesizer includes a statistical-model sequence generator, a multiple-acoustic feature parameter sequence generator, and a waveform generator. The statistical-model sequence generator generates, based on context information corresponding to an input text, a statistical model sequence that comprises a first sequence of a statistical model comprising a plurality of states. The multiple-acoustic feature parameter sequence generator, for each speech section corresponding to each state of the statistical model sequence, selects a first plurality of acoustic feature parameters from a first set of acoustic feature parameters extracted from a first speech waveform stored in a speech database and generates a multiple-acoustic feature parameter sequence that comprises a sequence of the first plurality of acoustic feature parameters.

Type: Grant

Filed: February 16, 2017

Date of Patent: January 7, 2020

Assignee: Kabushiki Kaisha Toshiba

Inventors: Masatsune Tamura, Masahiro Morita

Speech synthesis dictionary creation device, speech synthesizer, speech synthesis dictionary creation method, and computer program product

Patent number: 10347237

Abstract: According to an embodiment, a device includes a table creator, an estimator, and a dictionary creator. The table creator is configured to create a table based on similarity between distributions of nodes of speech synthesis dictionaries of a specific speaker in respective first and second languages. The estimator is configured to estimate a matrix to transform the speech synthesis dictionary of the specific speaker in the first language to a speech synthesis dictionary of a target speaker in the first language, based on speech and a recorded text of the target speaker in the first language and the speech synthesis dictionary of the specific speaker in the first language. The dictionary creator is configured to create a speech synthesis dictionary of the target speaker in the second language, based on the table, the matrix, and the speech synthesis dictionary of the specific speaker in the second language.

Type: Grant

Filed: July 9, 2015

Date of Patent: July 9, 2019

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Kentaro Tachibana, Masatsune Tamura, Yamato Ohtani

Device for predicting voice conversion model, method of predicting voice conversion model, and computer program product

Patent number: 10157608

Abstract: According to an embodiment, a voice processing device includes an interface system, a determining processor, and a predicting processor. The interface system is configured to receive neutral voice data representing audio in a neutral voice of a user. The determining processor is configured to determine a predictive parameter based at least in part on the neutral voice data. The predicting processor is configured to predict a voice conversion model for converting the neutral voice of the speaker to a target voice using at least the predictive parameter.

Type: Grant

Filed: February 15, 2017

Date of Patent: December 18, 2018

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Yamato Ohtani, Yu Nasu, Masatsune Tamura, Masahiro Morita

Speech synthesizer, audio watermarking information detection apparatus, speech synthesizing method, audio watermarking information detection method, and computer program product

Patent number: 10109286

Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.

Type: Grant

Filed: September 14, 2017

Date of Patent: October 23, 2018

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Kentaro Tachibana, Takehiko Kagoshima, Masatsune Tamura, Masahiro Morita

SPEECH SYNTHESIS DEVICE, SPEECH SYNTHESIS METHOD, SPEECH SYNTHESIS MODEL TRAINING DEVICE, SPEECH SYNTHESIS MODEL TRAINING METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20180174570

Abstract: A speech synthesis device of an embodiment includes a memory unit, a creating unit, a deciding unit, a generating unit and a waveform generating unit. The memory unit stores, as statistical model information of a statistical model, an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution. The creating unit creates a statistical model sequence from context information and the statistical model information. The deciding unit decides a pitch-cycle waveform count of each state using a duration based on the duration distribution of each state of each statistical model in the statistical model sequence, and pitch information based on the output distribution of the pitch feature parameters. The generating unit generates an output distribution sequence based on the pitch-cycle waveform count, and acoustic feature parameters based on the output distribution sequence.

Type: Application

Filed: February 14, 2018

Publication date: June 21, 2018

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune TAMURA, Masahiro MORITA

SPEECH PROCESSING DEVICE, SPEECH PROCESSING METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20180174571

Abstract: A speech processing device of an embodiment includes a spectrum parameter calculation unit, a phase spectrum calculation unit, a group delay spectrum calculation unit, a band group delay parameter calculation unit, and a band group delay compensation parameter calculation unit. The spectrum parameter calculation unit calculates a spectrum parameter. The phase spectrum calculation unit calculates a first phase spectrum. The group delay spectrum calculation unit calculates a group delay spectrum from the first phase spectrum based on a frequency component of the first phase spectrum. The band group delay parameter calculation unit calculates a band group delay parameter in a predetermined frequency band from a group delay spectrum. The band group delay compensation parameter calculation unit calculates a band group delay compensation parameter to compensate a difference between a second phase spectrum reconstructed from the band group delay parameter and the first phase spectrum.

Type: Application

Filed: February 16, 2018

Publication date: June 21, 2018

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masatsune TAMURA, Masahiro MORITA

Speech synthesizer, audio watermarking information detection apparatus, speech synthesizing method, audio watermarking information detection method, and computer program product

Patent number: 9870779

Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.

Type: Grant

Filed: July 16, 2015

Date of Patent: January 16, 2018

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Kentaro Tachibana, Takehiko Kagoshima, Masatsune Tamura, Masahiro Morita

SPEECH SYNTHESIZER, AUDIO WATERMARKING INFORMATION DETECTION APPARATUS, SPEECH SYNTHESIZING METHOD, AUDIO WATERMARKING INFORMATION DETECTION METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20180005637

Abstract: According to an embodiment, a speech synthesizer includes a source generator, a phase modulator, and a vocal tract filter unit. The source generator generates a source signal by using a fundamental frequency sequence and a pulse signal. The phase modulator modulates, with respect to the source signal generated by the source generator, a phase of the pulse signal at each pitch mark based on audio watermarking information. The vocal tract filter unit generates a speech signal by using a spectrum parameter sequence with respect to the source signal in which the phase of the pulse signal is modulated by the phase modulator.

Type: Application

Filed: September 14, 2017

Publication date: January 4, 2018

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Kentaro TACHIBANA, Takehiko KAGOSHIMA, Masatsune TAMURA, Masahiro MORITA

Text-to-speech device, text-to-speech method, and computer program product

Patent number: 9830904

Abstract: According to an embodiment, a text-to-speech device includes a context acquirer, an acoustic model parameter acquirer, a conversion parameter acquirer, a converter, and a waveform generator. The context acquirer is configured to acquire a context sequence affecting fluctuations in voice. The acoustic model parameter acquirer is configured to acquire an acoustic model parameter sequence that corresponds to the context sequence and represents an acoustic model in a standard speaking style of a target speaker. The conversion parameter acquirer is configured to acquire a conversion parameter sequence corresponding to the context sequence to convert an acoustic model parameter in the standard speaking style into one in a different speaking style. The converter is configured to convert the acoustic model parameter sequence using the conversion parameter sequence. The waveform generator is configured to generate a voice signal based on the acoustic model parameter sequence acquired after conversion.

Type: Grant

Filed: June 17, 2016

Date of Patent: November 28, 2017

Assignee: KABUSHIKI KAISHA TOSHIBA

Inventors: Yu Nasu, Masatsune Tamura, Ryo Morinaka, Masahiro Morita

VOICE PROCESSING DEVICE, VOICE PROCESSING METHOD, AND COMPUTER PROGRAM PRODUCT

Publication number: 20170162187

Abstract: According to an embodiment, a voice processing device includes an interface system, a determining processor, and a predicting processor. The interface system is configured to receive neutral voice data representing audio in a neutral voice of a user. The determining processor is configured to determine a predictive parameter based at least in part on the neutral voice data. The predicting processor is configured to predict a voice conversion model for converting the neutral voice of the speaker to a target voice using at least the predictive parameter.

Type: Application

Filed: February 15, 2017

Publication date: June 8, 2017

Inventors: Yamato OHTANI, Yu NASU, Masatsune TAMURA, Masahiro MORITA

SPEECH SYNTHESIZER, AND SPEECH SYNTHESIS METHOD AND COMPUTER PROGRAM PRODUCT

Publication number: 20170162186

Abstract: A speech synthesizer includes a statistical-model sequence generator, a multiple-acoustic feature parameter sequence generator, and a waveform generator. The statistical-model sequence generator generates, based on context information corresponding to an input text, a statistical model sequence that comprises a first sequence of a statistical model comprising a plurality of states. The multiple-acoustic feature parameter sequence generator, for each speech section corresponding to each state of the statistical model sequence, selects a first plurality of acoustic feature parameters from a first set of acoustic feature parameters extracted from a first speech waveform stored in a speech database and generates a multiple-acoustic feature parameter sequence that comprises a sequence of the first plurality of acoustic feature parameters.

Type: Application

Filed: February 16, 2017

Publication date: June 8, 2017

Inventors: Masatsune TAMURA, Masahiro MORITA
