Patents by Inventor Ryuki Tachibana

Ryuki Tachibana has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PRONUNCIATION ACCURACY IN SPEECH RECOGNITION

Publication number: 20160210964

Abstract: A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.

Type: Application

Filed: March 28, 2016

Publication date: July 21, 2016

Inventors: Gakuto Kurata, Masafumi Nishimura, Ryuki Tachibana
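
The candidate selection described in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the function name `select_candidate`, the `reading_of` callback, the choice to sum recognition scores per reading, and the linear `weight` used to combine the two scores are all assumptions made for illustration.

```python
from collections import defaultdict


def select_candidate(candidates, reading_of, weight=0.5):
    """Pick one candidate word string from speech recognition results.

    candidates: list of (word_string, recognition_score) pairs.
    reading_of: hypothetical callback mapping a word string to its reading.
    weight:     assumed interpolation weight between the two scores.
    """
    # Reading score (assumption): sum of the recognition scores of all
    # candidates that share the same reading.
    reading_scores = defaultdict(float)
    for words, score in candidates:
        reading_scores[reading_of(words)] += score

    def combined(item):
        words, score = item
        return weight * reading_scores[reading_of(words)] + (1 - weight) * score

    # Select the candidate with the best combined score.
    return max(candidates, key=combined)[0]


# Toy data: B1 and B2 are written differently but share the reading "b".
readings = {"A1": "a", "B1": "b", "B2": "b"}
candidates = [("A1", 0.40), ("B1", 0.35), ("B2", 0.30)]
best = select_candidate(candidates, readings.get)
```

In this toy case the top recognition score belongs to `A1`, but the two candidates sharing reading "b" jointly outweigh it, so `B1` is selected.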
Pronunciation accuracy in speech recognition

Patent number: 9384730

Abstract: A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.

Type: Grant

Filed: April 14, 2014

Date of Patent: July 5, 2016

Assignee: International Business Machines Corporation

Inventors: Gakuto Kurata, Masafumi Nishimura, Ryuki Tachibana
Speech Recognition Model Construction Method, Speech Recognition Method, Computer System, Speech Recognition Apparatus, Program, and Recording Medium

Publication number: 20160086599

Abstract: A construction method for a speech recognition model, in which a computer system includes: a step of acquiring alignment between speech of each of a plurality of speakers and a transcript of the speaker; a step of joining transcripts of the respective ones of the plurality of speakers along a time axis, creating a transcript of speech of mixed speakers obtained from synthesized speech of the speakers, and replacing predetermined transcribed portions of the plurality of speakers overlapping on the time axis with a unit which represents a simultaneous speech segment; and a step of constructing at least one of an acoustic model and a language model which make up a speech recognition model, based on the transcript of the speech of the mixed speakers.

Type: Application

Filed: September 23, 2015

Publication date: March 24, 2016

Inventors: Gakuto Kurata, Toru Nagano, Masayuki Suzuki, Ryuki Tachibana
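
The transcript-joining step in the abstract above can be sketched roughly as follows. This is an illustrative simplification, not the method claimed in the application: the word-level tuple format, the `<overlap>` token, and the rule that any time overlap between different speakers triggers replacement are assumptions.

```python
def mixed_transcript(aligned_words, overlap_token="<overlap>"):
    """Join per-speaker transcripts along the time axis.

    aligned_words: list of (speaker, word, start, end) tuples, i.e. an
    assumed word-level alignment between speech and transcript.
    Words that overlap in time with a word from another speaker are
    replaced by a single unit representing simultaneous speech.
    """
    words = sorted(aligned_words, key=lambda w: w[2])  # order by start time
    out = []
    for speaker, word, start, end in words:
        overlaps = any(
            other != speaker and o_start < end and start < o_end
            for other, _, o_start, o_end in words
        )
        token = overlap_token if overlaps else word
        # Collapse consecutive overlap units into one.
        if token == overlap_token and out and out[-1] == overlap_token:
            continue
        out.append(token)
    return out


alignment = [
    ("A", "hello", 0.0, 0.5),
    ("A", "there", 0.5, 1.0),
    ("B", "yes", 0.8, 1.2),
    ("B", "right", 1.3, 1.6),
]
joined = mixed_transcript(alignment)  # ['hello', '<overlap>', 'right']
```

A transcript produced this way could then serve as training text for a language model that treats the overlap unit as an ordinary token.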
Speech synthesis system, speech synthesis program product, and speech synthesis method

Patent number: 9275631

Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of two paths of the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that can increase the likelihood of absolute values or variations of the prosody to the statistical model as high as possible with minimum modification values.

Type: Grant

Filed: December 31, 2012

Date of Patent: March 1, 2016

Assignee: Nuance Communications, Inc.

Inventors: Ryuki Tachibana, Masafumi Nishimura
Information processing method for determining weight of each feature in subjective hierarchical clustering

Patent number: 8972407

Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.

Type: Grant

Filed: September 6, 2012

Date of Patent: March 3, 2015

Assignee: International Business Machines Corporation

Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
Information processing apparatus, method and program for determining weight of each feature in subjective hierarchical clustering

Patent number: 8918396

Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.

Type: Grant

Filed: June 28, 2012

Date of Patent: December 23, 2014

Assignee: International Business Machines Corporation

Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
PRONUNCIATION ACCURACY IN SPEECH RECOGNITION

Publication number: 20140358533

Abstract: A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.

Type: Application

Filed: April 14, 2014

Publication date: December 4, 2014

Applicant: International Business Machines Corporation

Inventors: Gakuto Kurata, Masafumi Nishimura, Ryuki Tachibana
Speaker-adaptive synthesized voice

Patent number: 8744853

Abstract: An objective is to provide a technique for accurately reproducing features of a fundamental frequency of a target-speaker's voice on the basis of only a small amount of learning data. A learning apparatus learns shift amounts from a reference source F0 pattern to a target F0 pattern of a target-speaker's voice. The learning apparatus associates a source F0 pattern of a learning text to a target F0 pattern of the same learning text by associating their peaks and troughs. For each of points on the target F0 pattern, the learning apparatus obtains shift amounts in a time-axis direction and in a frequency-axis direction from a corresponding point on the source F0 pattern in reference to a result of the association, and learns a decision tree using, as an input feature vector, linguistic information obtained by parsing the learning text, and using, as an output feature vector, the calculated shift amounts.

Type: Grant

Filed: March 16, 2010

Date of Patent: June 3, 2014

Assignee: International Business Machines Corporation

Inventors: Masafumi Nishimura, Ryuki Tachibana
SPEECH SYNTHESIS SYSTEM, SPEECH SYNTHESIS PROGRAM PRODUCT, AND SPEECH SYNTHESIS METHOD

Publication number: 20130268275

Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of two paths of the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that can increase the likelihood of absolute values or variations of the prosody to the statistical model as high as possible with minimum modification values.

Type: Application

Filed: December 31, 2012

Publication date: October 10, 2013

Inventors: Ryuki Tachibana, Masafumi Nishimura
Speech synthesis system, speech synthesis program product, and speech synthesis method

Patent number: 8370149

Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of two paths of the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that can increase the likelihood of absolute values or variations of the prosody to the statistical model as high as possible with minimum modification values.

Type: Grant

Filed: August 15, 2008

Date of Patent: February 5, 2013

Assignee: Nuance Communications, Inc.

Inventors: Ryuki Tachibana, Masafumi Nishimura
INFORMATION PROCESSING APPARATUS, METHOD AND PROGRAM FOR DETERMINING WEIGHT OF EACH FEATURE IN SUBJECTIVE HIERARCHICAL CLUSTERING

Publication number: 20130006991

Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.

Type: Application

Filed: June 28, 2012

Publication date: January 3, 2013

Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
INFORMATION PROCESSING METHOD FOR DETERMINING WEIGHT OF EACH FEATURE IN SUBJECTIVE HIERARCHICAL CLUSTERING

Publication number: 20120330957

Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.

Type: Application

Filed: September 6, 2012

Publication date: December 27, 2012

Applicant: International Business Machines Corporation

Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, INFORMATION PROCESSING SYSTEM, AND PROGRAM

Publication number: 20120316880

Abstract: An information processing apparatus, information processing method, and computer readable non-transitory storage medium for analyzing words reflecting information that is not explicitly recognized verbally. An information processing method includes the steps of: extracting speech data and sound data used for recognizing phonemes included in the speech data as words; identifying a section surrounded by pauses within a speech spectrum of the speech data; performing sound analysis on the identified section to identify a word in the section; generating prosodic feature values for the words; acquiring frequencies of occurrence of the word within the speech data; calculating a degree of fluctuation within the speech data for the prosodic feature values of high frequency words where the high frequency words are any words whose frequency of occurrence meets a threshold; and determining a key phrase based on the degree of fluctuation.

Type: Application

Filed: August 22, 2012

Publication date: December 13, 2012

Applicant: International Business Machines Corporation

Inventors: Tohru Nagano, Masafumi Nishimura, Ryuki Tachibana
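
The final steps of the method in the abstract above (frequency threshold, degree of fluctuation, key phrase) can be sketched as below. The sketch assumes that words and one prosodic feature value per occurrence have already been extracted; using population variance as the "degree of fluctuation" and picking the word with the largest variance are assumptions, as are the names `key_phrase` and `min_count`.

```python
from collections import defaultdict
from statistics import pvariance


def key_phrase(observations, min_count=2):
    """Return the high-frequency word whose prosody fluctuates most.

    observations: list of (word, prosodic_value) pairs, one per occurrence
    of the word in the speech data (e.g. a pitch value; hypothetical input).
    min_count:    assumed frequency threshold for "high frequency" words.
    """
    values = defaultdict(list)
    for word, value in observations:
        values[word].append(value)

    # Keep only words whose frequency of occurrence meets the threshold.
    frequent = {w: v for w, v in values.items() if len(v) >= min_count}
    if not frequent:
        return None

    # Degree of fluctuation (assumption): population variance of the feature.
    return max(frequent, key=lambda w: pvariance(frequent[w]))


observations = [
    ("price", 100.0), ("price", 180.0), ("price", 140.0),
    ("the", 120.0), ("the", 121.0),
    ("refund", 300.0),
]
phrase = key_phrase(observations)  # 'price'
```

Here "refund" is excluded because it occurs only once, and "price" wins over "the" because its feature values vary far more.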
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, INFORMATION PROCESSING SYSTEM, AND PROGRAM

Publication number: 20120197644

Abstract: An information processing apparatus, information processing method, and computer readable non-transitory storage medium for analyzing words reflecting information that is not explicitly recognized verbally. An information processing method includes the steps of: extracting speech data and sound data used for recognizing phonemes included in the speech data as words; identifying a section surrounded by pauses within a speech spectrum of the speech data; performing sound analysis on the identified section to identify a word in the section; generating prosodic feature values for the words; acquiring frequencies of occurrence of the word within the speech data; calculating a degree of fluctuation within the speech data for the prosodic feature values of high frequency words where the high frequency words are any words whose frequency of occurrence meets a threshold; and determining a key phrase based on the degree of fluctuation.

Type: Application

Filed: January 30, 2012

Publication date: August 2, 2012

Applicant: International Business Machines Corporation

Inventors: Tohru Nagano, Masafumi Nishimura, Ryuki Tachibana
SPEAKER-ADAPTIVE SYNTHESIZED VOICE

Publication number: 20120059654

Abstract: An objective is to provide a technique for accurately reproducing features of a fundamental frequency of a target-speaker's voice on the basis of only a small amount of learning data. A learning apparatus learns shift amounts from a reference source F0 pattern to a target F0 pattern of a target-speaker's voice. The learning apparatus associates a source F0 pattern of a learning text to a target F0 pattern of the same learning text by associating their peaks and troughs. For each of points on the target F0 pattern, the learning apparatus obtains shift amounts in a time-axis direction and in a frequency-axis direction from a corresponding point on the source F0 pattern in reference to a result of the association, and learns a decision tree using, as an input feature vector, linguistic information obtained by parsing the learning text, and using, as an output feature vector, the calculated shift amounts.

Type: Application

Filed: March 16, 2010

Publication date: March 8, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Masafumi Nishimura, Ryuki Tachibana
Audio content digital watermark detection

Patent number: 8055505

Abstract: Digital watermark detection apparatus including detection units which calculate detected values of watermark signals by use of keys for PCM data of channels of audio content, a plurality of units which add the detected values corresponding to each of the channels and each of the keys for each possible combination of the respective channels and the respective keys, and a unit which selects and outputs one adding result from the respective adding results by the plurality of detected value adding units. Moreover, it includes units which accumulate the detected values in accumulation cycles different from one another to restore messages embedded as digital watermarks from the accumulated detected values, and perform boundary detection of the audio contents to detect the audio contents in which the digital watermarks are embedded, and a detection result output unit which synthesizes and outputs respective processing results by the message restoration units.

Type: Grant

Filed: June 17, 2008

Date of Patent: November 8, 2011

Assignee: International Business Machines Corporation

Inventors: Ryuki Tachibana, Norishige Morimoto
Generating objectively evaluated sufficiently natural synthetic speech from text by using selective paraphrases

Patent number: 8015011

Abstract: A synthetic speech system includes a phoneme segment storage section for storing multiple phoneme segment data pieces; a synthesis section for generating voice data from text by reading phoneme segment data pieces representing the pronunciation of an inputted text from the phoneme segment storage section and connecting the phoneme segment data pieces to each other; a computing section for computing a score indicating the unnaturalness of the voice data representing the synthetic speech of the text; a paraphrase storage section for storing multiple paraphrases of the multiple first phrases; a replacement section for searching the text and replacing with appropriate paraphrases; and a judgment section for outputting generated voice data on condition that the computed score is smaller than a reference value and for inputting the text after the replacement to the synthesis section to cause the synthesis section to further generate voice data for the text.

Type: Grant

Filed: January 30, 2008

Date of Patent: September 6, 2011

Assignee: Nuance Communications, Inc.

Inventors: Tohru Nagano, Masafumi Nishimura, Ryuki Tachibana
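
The synthesize, score, and paraphrase loop described in the abstract above can be sketched as follows. The `synthesize` and `unnaturalness` callables stand in for the synthesis and computing sections and are hypothetical stubs; the one-replacement-per-round policy and the `max_rounds` cap are assumptions.

```python
def synthesize_with_paraphrases(text, synthesize, unnaturalness,
                                paraphrases, threshold, max_rounds=5):
    """Resynthesize text with paraphrases until it sounds natural enough.

    synthesize:    hypothetical callable, text -> voice data.
    unnaturalness: hypothetical callable, voice data -> score (lower is better).
    paraphrases:   dict mapping a phrase to a replacement phrase.
    """
    for _ in range(max_rounds):
        voice = synthesize(text)
        # Output the voice data once the score is below the reference value.
        if unnaturalness(voice) < threshold:
            return text, voice
        # Otherwise replace one phrase with its paraphrase and try again.
        replaced = text
        for phrase, alternative in paraphrases.items():
            if phrase in replaced:
                replaced = replaced.replace(phrase, alternative, 1)
                break
        if replaced == text:
            break  # nothing left to paraphrase
        text = replaced
    return text, synthesize(text)


# Toy stubs: "voice data" is upper-cased text, and the word UTILIZE is
# pretended to be hard to synthesize naturally.
final_text, final_voice = synthesize_with_paraphrases(
    "please utilize this",
    synthesize=str.upper,
    unnaturalness=lambda voice: voice.count("UTILIZE"),
    paraphrases={"utilize": "use"},
    threshold=1,
)
```

With these stubs the first attempt scores 1, which is not below the reference value, so "utilize" is replaced by "use" and the second attempt is accepted.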
System and method for supporting text-to-speech

Patent number: 7921014

Abstract: A system for generating high-quality synthesized text-to-speech includes a learning data generating unit, a frequency data generating unit, and a setting unit. The learning data generating unit recognizes inputted speech, and then generates first learning data in which wordings of phrases are associated with readings thereof. The frequency data generating unit generates, based on the first learning data, frequency data indicating appearance frequencies of both wordings and readings of phrases. The setting unit sets the thus generated frequency data for a language processing unit in order to approximate outputted speech of text-to-speech to the inputted speech. Furthermore, the language processing unit generates, from a wording of text, a reading corresponding to the wording, on the basis of the appearance frequencies.

Type: Grant

Filed: July 9, 2007

Date of Patent: April 5, 2011

Assignee: Nuance Communications, Inc.

Inventors: Gakuto Kurata, Toru Nagano, Masafumi Nishimura, Ryuki Tachibana
Watermark signal generating apparatus

Patent number: 7797542

Abstract: An apparatus 10 for generating watermark signals to be embedded as a digital watermark in real-time contents includes: input means 12 for inputting the real-time contents; an input buffer 14 for storing the real-time contents; generation means for generating watermark signals corresponding to predicted intensities of the real-time contents from divided real-time contents; and an output buffer 18 for storing the generated watermark signals to be outputted. The generation means is configured by including prediction means 16 for predicting intensities of the watermark signals; control means 20 for controlling embedding by use of a message to be embedded as the digital watermark in the divided real-time contents; and means 22 for generating the watermark signals to be outputted.

Type: Grant

Filed: July 28, 2009

Date of Patent: September 14, 2010

Assignee: International Business Machines Corporation

Inventors: Ryuki Tachibana, Ryo Subihara
STOCHASTIC PHONEME AND ACCENT GENERATION USING ACCENT CLASS

Publication number: 20100125459

Abstract: Exemplary embodiments provide for determining a sequence of words in a TTS system. An input text is analyzed using two models, a word n-gram model and an accent class n-gram model. A list of all possible words for each word in the input is generated for each model. Each word in each list for each model is given a score based on the probability that the word is the correct word in the sequence, based on the particular model. The two lists are combined and the two scores are combined for each word. A set of sequences of words is generated. Each sequence of words comprises a unique combination of an attribute and associated word for each word in the input. The combined scores of the words in each sequence are then combined into a score for the sequence. The sequence of words having the highest score is selected and presented to a user.

Type: Application

Filed: July 1, 2009

Publication date: May 20, 2010

Applicant: Nuance Communications, Inc.

Inventors: Nobuyasu Itoh, Tohru Nagano, Masafumi Nishimura, Ryuki Tachibana
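
The score combination described in the abstract above can be sketched as below. This is a deliberately simplified illustration: each model is represented as per-position candidate probabilities rather than a real n-gram model with context, scores are combined by multiplication, and a small floor is used for candidates missing from one list. All of these, and the name `best_sequence`, are assumptions.

```python
from itertools import product
from math import prod


def best_sequence(word_model, accent_model, floor=1e-6):
    """Choose the highest-scoring sequence of candidates.

    word_model, accent_model: one dict per input position, mapping each
    candidate (word plus attribute) to a probability under that model.
    """
    combined = []
    for word_scores, accent_scores in zip(word_model, accent_model):
        # Merge the two candidate lists and combine the two scores.
        candidates = set(word_scores) | set(accent_scores)
        combined.append({
            c: word_scores.get(c, floor) * accent_scores.get(c, floor)
            for c in candidates
        })

    # Enumerate every sequence and score it as the product of its parts.
    sequences = product(*(position.items() for position in combined))
    best = max(sequences, key=lambda seq: prod(score for _, score in seq))
    return [candidate for candidate, _ in best]


# Hypothetical candidates: same spelling, two accent classes.
word_model = [{"hashi/accent-1": 0.6, "hashi/accent-2": 0.4}, {"wo": 1.0}]
accent_model = [{"hashi/accent-1": 0.3, "hashi/accent-2": 0.7}, {"wo": 1.0}]
chosen = best_sequence(word_model, accent_model)
```

In this toy input the word model alone would prefer the first accent class, but the combined score favors the second. Exhaustive enumeration is only practical for short inputs; a dynamic-programming search would be the usual alternative.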
