Patents by Inventor Masami Akamine

Masami Akamine has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11417319
    Abstract: According to one embodiment, a dialogue system includes a setting apparatus and a processing apparatus. The setting apparatus sets in advance a plurality of words that are in impossible combination relationships with each other. The processing apparatus acquires speech of a user and, when a speech recognition result of an object included in the speech contains a combination of words from that plurality, outputs a notification to the user that processing of the object cannot be carried out.
    Type: Grant
    Filed: February 20, 2018
    Date of Patent: August 16, 2022
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takami Yoshida, Kenji Iwata, Yuka Kobayashi, Masami Akamine
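The claimed check can be sketched as a lookup against preconfigured incompatible word pairs. The pair set, word lists, and function names below are illustrative assumptions, not taken from the patent:

```python
# Hypothetical sketch: word pairs in "impossible combination" relationships
# are configured in advance; a recognized object is rejected when it
# contains any such pair.
from itertools import combinations

IMPOSSIBLE_PAIRS = {  # illustrative pairs, not from the patent
    frozenset({"deposit", "withdrawal"}),
    frozenset({"open", "close"}),
}

def process_object(recognized_words):
    """Return a notification if the recognized object mixes incompatible words."""
    for a, b in combinations(set(recognized_words), 2):
        if frozenset({a, b}) in IMPOSSIBLE_PAIRS:
            return "Cannot process: the requested words cannot be combined."
    return "OK"
```

The frozenset representation makes the pair check order-independent, matching the symmetric "relationships with each other" wording of the abstract.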
  • Patent number: 11270683
    Abstract: According to one embodiment, an interactive system includes the following units. The knowledge reference unit refers to question-answering knowledge, based on a result of analyzing an input sentence, to acquire a candidate for an answer to the input sentence. The unknown keyword detection unit detects an unknown keyword from the input sentence. The related keyword estimation unit, in response to the detection of the unknown keyword, acquires from predetermined keywords one or more candidates for a related keyword whose meaning is close to the unknown keyword. The response generation unit generates a response to the input sentence based on the one or more candidates for the related keyword when the unknown keyword is detected.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: March 8, 2022
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kenji Iwata, Hiroshi Fujimura, Yuka Kobayashi, Takami Yoshida, Masami Akamine
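One plausible reading of the related-keyword estimation step is a nearest-neighbor search over word embeddings. The toy vectors, keyword list, and function names below are illustrative assumptions:

```python
import math

# Toy 2-dim embeddings; a real system would use trained word vectors.
EMBEDDINGS = {
    "latte": [0.9, 0.1], "coffee": [0.8, 0.2], "tea": [0.2, 0.9],
}
KNOWN_KEYWORDS = ["coffee", "tea"]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def related_keywords(unknown, k=1):
    """Rank known keywords by closeness of meaning to the unknown keyword."""
    scored = sorted(KNOWN_KEYWORDS,
                    key=lambda w: cosine(EMBEDDINGS[unknown], EMBEDDINGS[w]),
                    reverse=True)
    return scored[:k]
```

With these placeholder vectors, an unrecognized "latte" would surface "coffee" as the closest predetermined keyword, which the response generator could then use.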
  • Patent number: 10847151
    Abstract: According to an embodiment, a dialogue system includes a satisfaction estimator, a dialogue state estimator, and a behavior determiner. The satisfaction estimator estimates the satisfaction of a user based on speech input from the user. The dialogue state estimator estimates a dialogue state with the user based on the speech input and the estimated satisfaction. The behavior determiner determines a behavior towards the user based on the estimated dialogue state.
    Type: Grant
    Filed: February 20, 2018
    Date of Patent: November 24, 2020
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masami Akamine, Takami Yoshida
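The three-stage pipeline above (satisfaction, then state, then behavior) can be sketched as chained functions. Every heuristic, threshold, and name below is a placeholder assumption; the patent does not specify these details:

```python
def estimate_satisfaction(speech):
    # Placeholder heuristic: a real estimator would use acoustic and
    # lexical cues; here a repetition request signals low satisfaction.
    return 0.2 if "again" in speech else 0.8

def estimate_dialogue_state(speech, satisfaction):
    # The claimed state estimator conditions on the satisfaction estimate.
    return {"utterance": speech, "satisfaction": satisfaction}

def determine_behavior(state):
    # Switch to a recovery strategy when estimated satisfaction is low.
    return "apologize_and_rephrase" if state["satisfaction"] < 0.5 else "proceed"

def dialogue_turn(speech):
    """One turn: satisfaction -> dialogue state -> behavior."""
    sat = estimate_satisfaction(speech)
    return determine_behavior(estimate_dialogue_state(speech, sat))
```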
  • Publication number: 20200143792
    Abstract: According to one embodiment, an interactive system includes the following units. The knowledge reference unit refers to question-answering knowledge, based on a result of analyzing an input sentence, to acquire a candidate for an answer to the input sentence. The unknown keyword detection unit detects an unknown keyword from the input sentence. The related keyword estimation unit, in response to the detection of the unknown keyword, acquires from predetermined keywords one or more candidates for a related keyword whose meaning is close to the unknown keyword. The response generation unit generates a response to the input sentence based on the one or more candidates for the related keyword when the unknown keyword is detected.
    Type: Application
    Filed: August 30, 2019
    Publication date: May 7, 2020
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kenji Iwata, Hiroshi Fujimura, Yuka Kobayashi, Takami Yoshida, Masami Akamine
  • Publication number: 20190139537
    Abstract: According to an embodiment, a dialogue system includes a satisfaction estimator, a dialogue state estimator, and a behavior determiner. The satisfaction estimator estimates the satisfaction of a user based on speech input from the user. The dialogue state estimator estimates a dialogue state with the user based on the speech input and the estimated satisfaction. The behavior determiner determines a behavior towards the user based on the estimated dialogue state.
    Type: Application
    Filed: February 20, 2018
    Publication date: May 9, 2019
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masami Akamine, Takami Yoshida
  • Publication number: 20190088252
    Abstract: According to one embodiment, a dialogue system includes a setting apparatus and a processing apparatus. The setting apparatus sets in advance a plurality of words that are in impossible combination relationships with each other. The processing apparatus acquires speech of a user and, when a speech recognition result of an object included in the speech contains a combination of words from that plurality, outputs a notification to the user that processing of the object cannot be carried out.
    Type: Application
    Filed: February 20, 2018
    Publication date: March 21, 2019
    Inventors: Takami Yoshida, Kenji Iwata, Yuka Kobayashi, Masami Akamine
  • Patent number: 9454963
    Abstract: A text-to-speech method for simulating a plurality of different voice characteristics includes dividing inputted text into a sequence of acoustic units; selecting voice characteristics for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model having a plurality of model parameters provided in clusters each having at least one sub-cluster and describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio with the selected voice characteristics. A parameter of a predetermined type of each probability distribution is expressed as a weighted sum of parameters of the same type using voice characteristic dependent weighting. In converting the sequence of acoustic units to a sequence of speech vectors, the voice characteristic dependent weights for the selected voice characteristics are retrieved for each cluster such that there is one weight per sub-cluster.
    Type: Grant
    Filed: March 13, 2013
    Date of Patent: September 27, 2016
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine, Byung Ha Chung
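The "weighted sum of parameters of the same type using voice characteristic dependent weighting" can be illustrated with a toy cluster-adaptive model, where each voice characteristic carries one interpolation weight per cluster. All values and names below are made-up placeholders:

```python
# Hypothetical sketch of cluster-adaptive interpolation: a distribution's
# mean is a weighted sum of per-cluster means, with weights that depend on
# the selected voice characteristic.
cluster_means = {  # toy 2-dim mean vectors for three clusters
    "c1": [1.0, 0.0], "c2": [0.0, 1.0], "c3": [0.5, 0.5],
}
voice_weights = {  # one weight per cluster for each voice characteristic
    "calm":  {"c1": 0.7, "c2": 0.2, "c3": 0.1},
    "angry": {"c1": 0.1, "c2": 0.6, "c3": 0.3},
}

def interpolated_mean(voice):
    """Mean vector for a voice characteristic as a weighted sum of cluster means."""
    w = voice_weights[voice]
    dims = len(next(iter(cluster_means.values())))
    return [sum(w[c] * m[d] for c, m in cluster_means.items())
            for d in range(dims)]
```

Selecting a different voice characteristic changes only the weight vector, not the cluster parameters, which is what lets one model simulate many voice characteristics.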
  • Patent number: 9269347
    Abstract: A text-to-speech method configured to output speech having a selected speaker voice and a selected speaker attribute, including: inputting text; dividing the inputted text into a sequence of acoustic units; selecting a speaker for the inputted text; selecting a speaker attribute for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model; and outputting the sequence of speech vectors as audio with the selected speaker voice and a selected speaker attribute. The acoustic model includes a first set of parameters relating to speaker voice and a second set of parameters relating to speaker attributes, which parameters do not overlap. The selecting a speaker voice includes selecting parameters from the first set of parameters and the selecting the speaker attribute includes selecting the parameters from the second set of parameters.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: February 23, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine
  • Publication number: 20130262109
    Abstract: A text-to-speech method for simulating a plurality of different voice characteristics includes dividing inputted text into a sequence of acoustic units; selecting voice characteristics for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model having a plurality of model parameters provided in clusters each having at least one sub-cluster and describing probability distributions which relate an acoustic unit to a speech vector; and outputting the sequence of speech vectors as audio with the selected voice characteristics. A parameter of a predetermined type of each probability distribution is expressed as a weighted sum of parameters of the same type using voice characteristic dependent weighting. In converting the sequence of acoustic units to a sequence of speech vectors, the voice characteristic dependent weights for the selected voice characteristics are retrieved for each cluster such that there is one weight per sub-cluster.
    Type: Application
    Filed: March 13, 2013
    Publication date: October 3, 2013
    Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine, Byung Ha Chung
  • Publication number: 20130262119
    Abstract: A text-to-speech method configured to output speech having a selected speaker voice and a selected speaker attribute, including: inputting text; dividing the inputted text into a sequence of acoustic units; selecting a speaker for the inputted text; selecting a speaker attribute for the inputted text; converting the sequence of acoustic units to a sequence of speech vectors using an acoustic model; and outputting the sequence of speech vectors as audio with the selected speaker voice and a selected speaker attribute. The acoustic model includes a first set of parameters relating to speaker voice and a second set of parameters relating to speaker attributes, which parameters do not overlap. The selecting a speaker voice includes selecting parameters from the first set of parameters and the selecting the speaker attribute includes selecting the parameters from the second set of parameters.
    Type: Application
    Filed: March 15, 2013
    Publication date: October 3, 2013
    Applicant: Kabushiki Kaisha Toshiba
    Inventors: Javier Latorre-Martinez, Vincent Ping Leung Wan, Kean Kheong Chin, Mark John Francis Gales, Katherine Mary Knill, Masami Akamine
  • Patent number: 8494856
    Abstract: According to one embodiment, a speech synthesizer includes an analyzer, a first estimator, a selector, a generator, a second estimator, and a synthesizer. The analyzer analyzes text and extracts a linguistic feature. The first estimator selects a first prosody model adapted to the linguistic feature and estimates prosody information that maximizes a first likelihood representing probability of the selected first prosody model. The selector selects speech units that minimize a cost function determined in accordance with the prosody information. The generator generates a second prosody model that is a model of the prosody information of the speech units. The second estimator estimates prosody information that maximizes a third likelihood calculated on the basis of the first likelihood and a second likelihood representing probability of the second prosody model. The synthesizer generates synthetic speech by concatenating the speech units on the basis of the prosody information estimated by the second estimator.
    Type: Grant
    Filed: October 12, 2011
    Date of Patent: July 23, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Javier Latorre, Masami Akamine
  • Patent number: 8407053
    Abstract: A speech processing apparatus, including a segmenting unit to divide a fundamental frequency signal of a speech signal corresponding to an input text into pitch segments, based on an alignment between samples of at least one given linguistic level included in the input text and the speech signal. Character strings of the input text are divided into the samples based on each linguistic level. A parameterizing unit generates a parametric representation of the pitch segments using a predetermined invertible operator and generates a group of first parameters in correspondence with each linguistic level. A descriptor generating unit generates, for each linguistic level, a descriptor that includes a set of features describing each sample in the input text and a model learning unit classifies the first parameters of each linguistic level of all speech signals in a memory into clusters based on the descriptor corresponding to the linguistic level.
    Type: Grant
    Filed: March 17, 2009
    Date of Patent: March 26, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Javier Latorre, Masami Akamine
  • Patent number: 8380500
    Abstract: A spectrum calculating unit calculates a spectrum for each frame by performing a frequency analysis on an acoustic signal. An estimating unit estimates a noise spectrum. An energy calculating unit calculates an energy characteristic amount. An entropy calculating unit calculates a normalized spectral entropy value. A generating unit generates a characteristic vector based on the energy characteristic amounts and the normalized spectral entropy values calculated for a plurality of frames. A likelihood calculating unit calculates a speech likelihood value of the target frame corresponding to the characteristic vector. If the speech likelihood value is larger than a threshold value, a judging unit judges the target frame to be a speech frame.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: February 19, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Koichi Yamamoto, Masami Akamine
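Normalized spectral entropy, the key feature in this abstract, treats the power spectrum as a probability distribution: flat (noise-like) spectra score near 1, peaked (speech-like) spectra score lower. The thresholds and function names below are illustrative assumptions:

```python
import math

def normalized_spectral_entropy(spectrum):
    """Entropy of the power spectrum viewed as a probability distribution,
    normalized to [0, 1] by dividing by log(number of bins)."""
    total = sum(spectrum)
    probs = [s / total for s in spectrum]
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(spectrum))

def is_speech_frame(spectrum, energy, energy_thr=1.0, entropy_thr=0.9):
    # Speech tends to have structured (low-entropy) spectra and high energy.
    # A real system stacks both features over several frames into a
    # characteristic vector and applies a learned likelihood model.
    return energy > energy_thr and normalized_spectral_entropy(spectrum) < entropy_thr
```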
  • Patent number: 8370139
    Abstract: A noise-environment storing unit stores a compensation vector for compensating a feature vector of a speech. A feature-vector extracting unit extracts the feature vector of the speech in each of a plurality of frames. A noise-environment-series estimating unit estimates a noise-environment series based on a feature-vector series and a degree of similarity. A calculating unit obtains a compensation vector corresponding to each noise environment in the estimated noise-environment series based on the compensation vectors in the noise-environment storing unit. A compensating unit compensates the extracted feature vector of the speech based on the obtained compensation vector.
    Type: Grant
    Filed: March 19, 2007
    Date of Patent: February 5, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masami Akamine, Takashi Masuko, Daniel Barreda, Remco Teunen
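The per-environment compensation step can be sketched as a lookup-and-add over frames. The environment labels and vectors below are illustrative assumptions:

```python
# Hypothetical sketch: a compensation vector is looked up for the estimated
# noise environment of each frame and added to that frame's feature vector.
COMPENSATION = {"car": [0.5, -0.2], "office": [0.1, 0.3]}  # toy values

def compensate(features, env_series):
    """Add the environment-specific compensation vector to each frame."""
    return [[f + c for f, c in zip(frame, COMPENSATION[env])]
            for frame, env in zip(features, env_series)]
```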
  • Publication number: 20120089402
    Abstract: According to one embodiment, a speech synthesizer includes an analyzer, a first estimator, a selector, a generator, a second estimator, and a synthesizer. The analyzer analyzes text and extracts a linguistic feature. The first estimator selects a first prosody model adapted to the linguistic feature and estimates prosody information that maximizes a first likelihood representing probability of the selected first prosody model. The selector selects speech units that minimize a cost function determined in accordance with the prosody information. The generator generates a second prosody model that is a model of the prosody information of the speech units. The second estimator estimates prosody information that maximizes a third likelihood calculated on the basis of the first likelihood and a second likelihood representing probability of the second prosody model. The synthesizer generates synthetic speech by concatenating the speech units on the basis of the prosody information estimated by the second estimator.
    Type: Application
    Filed: October 12, 2011
    Publication date: April 12, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Javier Latorre, Masami Akamine
  • Publication number: 20120065961
    Abstract: According to one embodiment, a speech model generating apparatus includes a spectrum analyzer, a chunker, a parameterizer, a clustering unit, and a model training unit. The spectrum analyzer acquires a speech signal corresponding to text information and calculates a set of spectral coefficients. The chunker acquires boundary information indicating a beginning and an end of linguistic units and chunks the speech signal into linguistic units. The parameterizer calculates a set of spectral trajectory parameters for a trajectory of the spectral trajectory parameters of the linguistic unit on the basis of the spectral coefficients. The clustering unit clusters the spectral trajectory parameters calculated for each of the linguistic units into clusters on the basis of linguistic information. The model training unit obtains a trained spectral trajectory model indicating a characteristic of a cluster based on the spectral trajectory parameters belonging to the same cluster.
    Type: Application
    Filed: September 21, 2011
    Publication date: March 15, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Javier Latorre, Masami Akamine
  • Patent number: 8078462
    Abstract: A transformation-parameter calculating unit calculates a first model parameter, a parameter of a speaker model, that maximizes a first likelihood for a clean feature, and calculates a transformation parameter that maximizes the first likelihood. The transformation parameter transforms, for each of the speakers, a distribution of the clean feature corresponding to the identification information of the speaker into a distribution represented by the speaker model of the first model parameter. A model-parameter calculating unit transforms a noisy feature corresponding to identification information for each of the speakers by using the transformation parameter, and calculates a second model parameter, a parameter of the speaker model, that maximizes a second likelihood for the transformed noisy feature.
    Type: Grant
    Filed: October 2, 2008
    Date of Patent: December 13, 2011
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Yusuke Shinohara, Masami Akamine
  • Patent number: 8046225
    Abstract: Normalization parameters are generated at a normalization-parameter generating unit by calculating the mean values and the standard deviations of an initial prosody pattern and a prosody pattern of a training sentence of a speech corpus. Then, the variance range or variance width of the initial prosody pattern is normalized at the prosody-pattern normalizing unit in accordance with the normalization parameters. As a result, a prosody pattern similar to speech of human beings and improved in naturalness can be generated with a small amount of calculation.
    Type: Grant
    Filed: February 8, 2008
    Date of Patent: October 25, 2011
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Takashi Masuko, Masami Akamine
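The normalization described above rescales the initial pattern so that its mean and standard deviation match those measured on training speech. A minimal sketch, with function and variable names assumed for illustration:

```python
import statistics

def normalize_prosody(initial_pattern, corpus_pattern):
    """Rescale the initial prosody pattern so its mean and standard
    deviation match those of a training sentence from the speech corpus."""
    mu_i = statistics.mean(initial_pattern)
    sd_i = statistics.pstdev(initial_pattern)
    mu_c = statistics.mean(corpus_pattern)
    sd_c = statistics.pstdev(corpus_pattern)
    # z-score against the initial statistics, then map onto the corpus statistics
    return [(x - mu_i) / sd_i * sd_c + mu_c for x in initial_pattern]
```

Only two statistics per pattern are needed, which is consistent with the abstract's claim of improved naturalness at a small computational cost.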
  • Publication number: 20100169094
    Abstract: A speaker adaptation apparatus includes an acquiring unit configured to acquire an acoustic model including HMMs and decision trees for estimating which phoneme or word is represented by a feature value used for speech recognition, the HMMs having a plurality of states on a phoneme-by-phoneme or word-by-word basis, and the decision trees being configured to answer questions relating to the feature value and output likelihoods for the respective states of the HMMs, and a speaker adaptation unit configured to adapt the decision trees to a speaker using speaker adaptation data vocalized by the speaker of an input speech.
    Type: Application
    Filed: September 17, 2009
    Publication date: July 1, 2010
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masami Akamine, Jitendra Ajmera, Partha Lal
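The decision-tree state model can be sketched as a tree that answers yes/no questions about the feature value and emits a likelihood at its leaf; adaptation would re-estimate the leaf values from the speaker's data. The tree, question, and values below are toy assumptions:

```python
# Hypothetical sketch: internal nodes ask questions about the feature
# vector; leaves hold the likelihood of one HMM state.
def state_likelihood(feature, tree):
    """Walk the tree by answering its questions; return the leaf likelihood."""
    node = tree
    while isinstance(node, dict):
        node = node["yes"] if node["question"](feature) else node["no"]
    return node

# A one-question toy tree for a single HMM state.
tree = {"question": lambda f: f["energy"] > 0.5,
        "yes": 0.9, "no": 0.1}
```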
  • Publication number: 20100076759
    Abstract: A noisy vector is extracted from a noisy speech, which is a clean speech on which noise is superimposed. A noise parameter of the noise is estimated from the noisy vector. A prior distribution parameter of a clean vector of the clean speech is stored in advance. A joint Gaussian distribution parameter between the clean vector and the noisy vector is calculated by the unscented transformation from the noise parameter and the prior distribution parameter. A posterior distribution parameter of the clean vector is calculated from the noisy vector using the joint Gaussian distribution parameter. By comparing the posterior distribution parameter with a previously stored standard pattern of each word, a word sequence of the noisy speech is output.
    Type: Application
    Filed: September 8, 2009
    Publication date: March 25, 2010
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Yusuke Shinohara, Masami Akamine
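The unscented transformation propagates a Gaussian through a nonlinearity via deterministically chosen sigma points rather than linearization. A minimal one-dimensional sketch (function name and the kappa choice are assumptions for illustration; the patent applies the idea to joint clean/noisy distributions):

```python
import math

def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Propagate a 1-D Gaussian (mean, var) through nonlinearity f using
    sigma points; return the transformed mean and variance."""
    n = 1  # dimension
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    w0 = kappa / (n + kappa)
    wi = 1.0 / (2 * (n + kappa))
    weights = [w0, wi, wi]
    ys = [f(x) for x in points]
    m = sum(w * y for w, y in zip(weights, ys))
    v = sum(w * (y - m) ** 2 for w, y in zip(weights, ys))
    return m, v
```

For a linear function the transform is exact, which is a standard sanity check; its value in this setting is that it also captures nonlinear noise-mixing functions to second order.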