Patents by Inventor Ryuki Tachibana

Ryuki Tachibana has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20160210964
    Abstract: A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.
    Type: Application
    Filed: March 28, 2016
    Publication date: July 21, 2016
    Inventors: Gakuto Kurata, Masafumi Nishimura, Ryuki Tachibana
  • Patent number: 9384730
    Abstract: A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.
    Type: Grant
    Filed: April 14, 2014
    Date of Patent: July 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Masafumi Nishimura, Ryuki Tachibana
  • Publication number: 20160086599
    Abstract: A construction method for a speech recognition model, in which a computer system includes: a step of acquiring alignment between speech of each of a plurality of speakers and a transcript of the speaker; a step of joining transcripts of the respective ones of the plurality of speakers along a time axis, creating a transcript of speech of mixed speakers obtained from synthesized speech of the speakers, and replacing predetermined transcribed portions of the plurality of speakers overlapping on the time axis with a unit which represents a simultaneous speech segment; and a step of constructing at least one of an acoustic model and a language model which make up a speech recognition model, based on the transcript of the speech of the mixed speakers.
    Type: Application
    Filed: September 23, 2015
    Publication date: March 24, 2016
    Inventors: Gakuto Kurata, Toru Nagano, Masayuki Suzuki, Ryuki Tachibana
  • Patent number: 9275631
    Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of two paths of the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that can increase the likelihood of absolute values or variations of the prosody to the statistical model as high as possible with minimum modification values.
    Type: Grant
    Filed: December 31, 2012
    Date of Patent: March 1, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Ryuki Tachibana, Masafumi Nishimura
  • Patent number: 8972407
    Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.
    Type: Grant
    Filed: September 6, 2012
    Date of Patent: March 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
  • Patent number: 8918396
    Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.
    Type: Grant
    Filed: June 28, 2012
    Date of Patent: December 23, 2014
    Assignee: International Business Machines Corporation
    Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
  • Publication number: 20140358533
    Abstract: A reading accuracy-improving system includes: a reading conversion unit for retrieving a plurality of candidate word strings from speech recognition results to determine the reading of each candidate word string; a reading score calculating unit for determining the speech recognition score for each of one or more candidate word strings with the same reading to determine a reading score; and a candidate word string selection unit for selecting a candidate to output from the plurality of candidate word strings on the basis of the reading score and speech recognition score corresponding to each candidate word string.
    Type: Application
    Filed: April 14, 2014
    Publication date: December 4, 2014
    Applicant: International Business Machines Corporation
    Inventors: Gakuto Kurata, Masafumi Nishimura, Ryuki Tachibana
  • Patent number: 8744853
    Abstract: An objective is to provide a technique for accurately reproducing features of a fundamental frequency of a target-speaker's voice on the basis of only a small amount of learning data. A learning apparatus learns shift amounts from a reference source F0 pattern to a target F0 pattern of a target-speaker's voice. The learning apparatus associates a source F0 pattern of a learning text to a target F0 pattern of the same learning text by associating their peaks and troughs. For each of points on the target F0 pattern, the learning apparatus obtains shift amounts in a time-axis direction and in a frequency-axis direction from a corresponding point on the source F0 pattern in reference to a result of the association, and learns a decision tree using, as an input feature vector, linguistic information obtained by parsing the learning text, and using, as an output feature vector, the calculated shift amounts.
    Type: Grant
    Filed: March 16, 2010
    Date of Patent: June 3, 2014
    Assignee: International Business Machines Corporation
    Inventors: Masafumi Nishimura, Ryuki Tachibana
  • Publication number: 20130268275
    Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of two paths of the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that can increase the likelihood of absolute values or variations of the prosody to the statistical model as high as possible with minimum modification values.
    Type: Application
    Filed: December 31, 2012
    Publication date: October 10, 2013
    Inventors: Ryuki Tachibana, Masafumi Nishimura
  • Patent number: 8370149
    Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of two paths of the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that can increase the likelihood of absolute values or variations of the prosody to the statistical model as high as possible with minimum modification values.
    Type: Grant
    Filed: August 15, 2008
    Date of Patent: February 5, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Ryuki Tachibana, Masafumi Nishimura
  • Publication number: 20130006991
    Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.
    Type: Application
    Filed: June 28, 2012
    Publication date: January 3, 2013
    Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
  • Publication number: 20120330957
    Abstract: An information processing apparatus determines a weight of each physical feature for hierarchical clustering by acquiring training data of multiple pieces of content in triplets with label information indicating a pair specified by a user as having a highest degree of similarity among three contents of the triplet and executing hierarchical clustering using a feature vector of each piece of content of the training data and the weight of each feature to determine the hierarchical structure of the training data. The information processing apparatus updates the weight of each feature so that the degree of agreement between a pair combined first as being the same clusters among three contents of the triplet in a determined hierarchical structure and a pair indicated by label information corresponding to the triplet increases.
    Type: Application
    Filed: September 6, 2012
    Publication date: December 27, 2012
    Applicant: International Business Machines Corporation
    Inventors: Toru Nagano, Masafumi Nishimura, Takashima Ryoichi, Ryuki Tachibana
  • Publication number: 20120316880
    Abstract: An information processing apparatus, information processing method, and computer readable non-transitory storage medium for analyzing words reflecting information that is not explicitly recognized verbally. An information processing method includes the steps of: extracting speech data and sound data used for recognizing phonemes included in the speech data as words; identifying a section surrounded by pauses within a speech spectrum of the speech data; performing sound analysis on the identified section to identify a word in the section; generating prosodic feature values for the words; acquiring frequencies of occurrence of the word within the speech data; calculating a degree of fluctuation within the speech data for the prosodic feature values of high frequency words where the high frequency words are any words whose frequency of occurrence meets a threshold; and determining a key phrase based on the degree of fluctuation.
    Type: Application
    Filed: August 22, 2012
    Publication date: December 13, 2012
    Applicant: International Business Machines Corporation
    Inventors: Tohru Nagano, Masafumi Nishimura, Ryuki Tachibana
  • Publication number: 20120197644
    Abstract: An information processing apparatus, information processing method, and computer readable non-transitory storage medium for analyzing words reflecting information that is not explicitly recognized verbally. An information processing method includes the steps of: extracting speech data and sound data used for recognizing phonemes included in the speech data as words; identifying a section surrounded by pauses within a speech spectrum of the speech data; performing sound analysis on the identified section to identify a word in the section; generating prosodic feature values for the words; acquiring frequencies of occurrence of the word within the speech data; calculating a degree of fluctuation within the speech data for the prosodic feature values of high frequency words where the high frequency words are any words whose frequency of occurrence meets a threshold; and determining a key phrase based on the degree of fluctuation.
    Type: Application
    Filed: January 30, 2012
    Publication date: August 2, 2012
    Applicant: International Business Machines Corporation
    Inventors: Tohru Nagano, Masafumi Nishimura, Ryuki Tachibana
  • Publication number: 20120059654
    Abstract: An objective is to provide a technique for accurately reproducing features of a fundamental frequency of a target-speaker's voice on the basis of only a small amount of learning data. A learning apparatus learns shift amounts from a reference source F0 pattern to a target F0 pattern of a target-speaker's voice. The learning apparatus associates a source F0 pattern of a learning text to a target F0 pattern of the same learning text by associating their peaks and troughs. For each of points on the target F0 pattern, the learning apparatus obtains shift amounts in a time-axis direction and in a frequency-axis direction from a corresponding point on the source F0 pattern in reference to a result of the association, and learns a decision tree using, as an input feature vector, linguistic information obtained by parsing the learning text, and using, as an output feature vector, the calculated shift amounts.
    Type: Application
    Filed: March 16, 2010
    Publication date: March 8, 2012
    Applicant: International Business Machines Corporation
    Inventors: Masafumi Nishimura, Ryuki Tachibana
  • Patent number: 8055505
    Abstract: Digital watermark detection apparatus including detection units which calculate detected values of watermark signals by use of keys for PCM data of channels of audio content, a plurality of units which add the detected values corresponding to each of the channels and each of the keys for each possible combination of the respective channels and the respective keys, and a unit which selects and outputs one adding result from the respective adding results by the plurality of detected value adding units. Moreover, it includes units which accumulate the detected values in accumulation cycles different from one another to restore messages embedded as digital watermarks from the accumulated detected values, and perform boundary detection of the audio contents to detect the audio contents in which the digital watermarks are embedded, and a detection result output unit which synthesizes and outputs respective processing results by the message restoration units.
    Type: Grant
    Filed: June 17, 2008
    Date of Patent: November 8, 2011
    Assignee: International Business Machines Corporation
    Inventors: Ryuki Tachibana, Norishige Morimoto
  • Patent number: 8015011
    Abstract: A synthetic speech system includes a phoneme segment storage section for storing multiple phoneme segment data pieces; a synthesis section for generating voice data from text by reading phoneme segment data pieces representing the pronunciation of an inputted text from the phoneme segment storage section and connecting the phoneme segment data pieces to each other; a computing section for computing a score indicating the unnaturalness of the voice data representing the synthetic speech of the text; a paraphrase storage section for storing multiple paraphrases of the multiple first phrases; a replacement section for searching the text and replacing with appropriate paraphrases; and a judgment section for outputting generated voice data on condition that the computed score is smaller than a reference value and for inputting the text after the replacement to the synthesis section to cause the synthesis section to further generate voice data for the text.
    Type: Grant
    Filed: January 30, 2008
    Date of Patent: September 6, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Tohru Nagano, Masafumi Nishimura, Ryuki Tachibana
  • Patent number: 7921014
    Abstract: A system for generating high-quality synthesized text-to-speech includes a learning data generating unit, a frequency data generating unit, and a setting unit. The learning data generating unit recognizes inputted speech, and then generates first learning data in which wordings of phrases are associated with readings thereof. The frequency data generating unit generates, based on the first learning data, frequency data indicating appearance frequencies of both wordings and readings of phrases. The setting unit sets the thus generated frequency data for a language processing unit in order to approximate outputted speech of text-to-speech to the inputted speech. Furthermore, the language processing unit generates, from a wording of text, a reading corresponding to the wording, on the basis of the appearance frequencies.
    Type: Grant
    Filed: July 9, 2007
    Date of Patent: April 5, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Gakuto Kurata, Toru Nagano, Masafumi Nishimura, Ryuki Tachibana
  • Patent number: 7797542
    Abstract: An apparatus 10 for generating watermark signals to be embedded as a digital watermark in real-time contents includes: input means 12 for inputting the real-time contents; an input buffer 14 for storing the real-time contents; generation means for generating watermark signals corresponding to predicted intensities of the real-time contents from divided real-time contents; and an output buffer 18 for storing the generated watermark signals to be outputted. The generation means is configured by including prediction means 16 for predicting intensities of the watermark signals; control means 20 for controlling embedding by use of a message to be embedded as the digital watermark in the divided real-time contents; and means 22 for generating the watermark signals to be outputted.
    Type: Grant
    Filed: July 28, 2009
    Date of Patent: September 14, 2010
    Assignee: International Business Machines Corporation
    Inventors: Ryuki Tachibana, Ryo Subihara
  • Publication number: 20100125459
    Abstract: Exemplary embodiments provide for determining a sequence of words in a TTS system. An input text is analyzed using two models, a word n-gram model and an accent class n-gram model. A list of all possible words for each word in the input is generated for each model. Each word in each list for each model is given a score based on the probability that the word is the correct word in the sequence, based on the particular model. The two lists are combined and the two scores are combined for each word. A set of sequences of words is generated. Each sequence of words comprises a unique combination of an attribute and associated word for each word in the input. The combined score of each word in the sequence of words is combined. A sequence of words having the highest score is selected and presented to a user.
    Type: Application
    Filed: July 1, 2009
    Publication date: May 20, 2010
    Applicant: Nuance Communications, Inc.
    Inventors: Nobuyasu Itoh, Tohru Nagano, Masafumi Nishimura, Ryuki Tachibana
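
The score-combining step described in the abstract of publication 20100125459 above can be illustrated with a minimal sketch. It is not the patented implementation: the function name best_sequence, the FLOOR constant, the romanized candidates, and the accent labels are all hypothetical, and each model is assumed to hand back one log-probability per candidate at each input position rather than true context-dependent n-gram scores.

```python
from itertools import product

# Assumed penalty for a candidate that only one of the two models proposed.
FLOOR = -20.0

def best_sequence(word_scores, accent_scores):
    """Return the candidate sequence with the highest combined score.

    word_scores and accent_scores each hold one dict per input position,
    mapping a hypothetical (word, attribute) candidate to a log-probability
    from the word model and the accent class model respectively.
    """
    combined = []
    for w, a in zip(word_scores, accent_scores):
        # Merge the two candidate lists and add the two scores per candidate.
        candidates = set(w) | set(a)
        combined.append({c: w.get(c, FLOOR) + a.get(c, FLOOR) for c in candidates})

    best, best_score = None, float("-inf")
    # Enumerate every sequence: one candidate chosen per input position.
    for seq in product(*[sorted(pos.items()) for pos in combined]):
        total = sum(score for _, score in seq)
        if total > best_score:
            best, best_score = [cand for cand, _ in seq], total
    return best, best_score

# Hypothetical scores for a two-word input.
word_scores = [
    {("kyou", "LH"): -0.5, ("kyou", "HL"): -1.25},
    {("wa", "L"): -0.125},
]
accent_scores = [
    {("kyou", "LH"): -1.5, ("kyou", "HL"): -0.25},
    {("wa", "L"): -0.25},
]
best, score = best_sequence(word_scores, accent_scores)
print(best, score)
```

Because the scores in this sketch do not depend on neighboring words, the exhaustive enumeration gives the same result as picking the best candidate at each position; with real n-gram scores that depend on context, a dynamic-programming search would normally replace the enumeration.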