Patents by Inventor Takafumi Koshinaka

Takafumi Koshinaka has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150348571
    Abstract: A data processing device, method, and non-transitory computer-readable storage medium are disclosed. The data processing device may include a memory storing instructions and at least one processor configured to process the instructions to divide first speech data into first segments based on a data structure of the first speech data, classify the first segments into first clusters through clustering, generate a first segment speech model for each of the first clusters, and calculate a similarity between the first segment speech models and second speech data.
    Type: Application
    Filed: May 27, 2015
    Publication date: December 3, 2015
    Inventors: Takafumi KOSHINAKA, Takayuki SUZUKI
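To make the flow in publication 20150348571 above concrete, here is a minimal, hypothetical sketch: split speech features into fixed-length segments, cluster them, fit a simple Gaussian per cluster as a "segment speech model", and score second speech data against those models. The features, clustering, and scoring are toy stand-ins, not the patented method.

```python
import math
import random
from statistics import mean, pstdev

def split_segments(features, seg_len=10):
    """Divide a feature sequence into fixed-length segments."""
    return [features[i:i + seg_len] for i in range(0, len(features), seg_len)]

def cluster_segments(segments, k=2, iters=10):
    """Tiny 1-D k-means on segment means (a placeholder for real clustering)."""
    points = [mean(seg) for seg in segments]
    centers = random.sample(points, k)

    def nearest(p):
        return min(range(k), key=lambda c: abs(p - centers[c]))

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p)].append(p)
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    clusters = [[] for _ in range(k)]
    for seg, p in zip(segments, points):
        clusters[nearest(p)].append(seg)
    return clusters

def segment_speech_model(cluster):
    """One-dimensional Gaussian per cluster as a toy 'segment speech model'."""
    values = [v for seg in cluster for v in seg]
    return mean(values), max(pstdev(values), 1e-3)

def log_likelihood(features, model):
    mu, sigma = model
    return sum(-0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) for x in features)

first_speech = [random.gauss(0, 1) for _ in range(100)] + [random.gauss(5, 1) for _ in range(100)]
second_speech = [random.gauss(5, 1) for _ in range(50)]
models = [segment_speech_model(c) for c in cluster_segments(split_segments(first_speech)) if c]
similarity = max(log_likelihood(second_speech, m) for m in models)
print(f"similarity between cluster models and second speech: {similarity:.1f}")
```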
  • Publication number: 20150278194
    Abstract: An information processing device according to the present invention includes: a global context extraction unit which identifies a word, a character, or a word string included in data as a specific word, and extracts a set of words included in at least a predetermined range extending from the specific word as a global context; a context classification unit which classifies the global context based on a predetermined viewpoint, and outputs a result of classification; and a language model generation unit which generates a language model for calculating a generation probability of the specific word by using the result of the classification.
    Type: Application
    Filed: November 7, 2013
    Publication date: October 1, 2015
    Applicant: NEC Corporation
    Inventors: Makoto Terao, Takafumi Koshinaka
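A toy illustration of the idea in publication 20150278194 above: collect the words within a window around a specific word as its "global context", classify that context from a simple keyword viewpoint, and estimate a class-conditional generation probability for the specific word. The corpus, keyword classes, and probability estimate are invented for illustration.

```python
from collections import Counter

SPECIFIC_WORD = "bank"
TOPIC_KEYWORDS = {"finance": {"loan", "money"}, "river": {"water", "fish"}}

def global_context(tokens, target, window=5):
    """Collect the words within +/- window positions of each target occurrence."""
    contexts = []
    for i, tok in enumerate(tokens):
        if tok == target:
            contexts.append(set(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]))
    return contexts

def classify(context):
    """Classify a global context by a simple keyword-overlap viewpoint."""
    scores = {label: len(context & kws) for label, kws in TOPIC_KEYWORDS.items()}
    return max(scores, key=scores.get)

corpus = "the loan from the bank needs money , the fish swam near the bank in the water".split()
class_counts = Counter(classify(ctx) for ctx in global_context(corpus, SPECIFIC_WORD))
total = sum(class_counts.values())
# Class-conditional generation probability of the specific word (toy language model).
for label, count in class_counts.items():
    print(f"P({SPECIFIC_WORD} | context={label}) proportional to {count / total:.2f}")
```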
  • Patent number: 9053751
    Abstract: A sound segment sorting unit (103) sorts the sound segments of a video. An image segment sorting unit (104) sorts the image segments of the video. A multiple sorting result generation unit (105) generates a plurality of sound segment sorting results and/or a plurality of image segment sorting results. A sorting result pair generation unit (106) generates a plurality of sorting result pairs of the sorting results as the candidates of the optimum segment sorting result of the video. A sorting result output unit (108) compares the sorting result comparative scores of the sorting result pairs calculated by a sorting result comparative score calculation unit (107) and thus outputs a sound segment sorting result and an image segment sorting result having good correspondence. This allows a plurality of sound segments and a plurality of image segments contained in the video to be sorted accurately, for each object, without adjusting parameters in advance.
    Type: Grant
    Filed: November 5, 2010
    Date of Patent: June 9, 2015
    Assignee: NEC CORPORATION
    Inventors: Makoto Terao, Takafumi Koshinaka
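A simplified sketch of the selection step in patent 9053751 above: generate several candidate groupings of sound segments and of image segments, score how well each (sound, image) pair of groupings agrees, and keep the best pair. A Rand-index-style agreement is used here as a stand-in for the patent's sorting result comparative score, and segments are reduced to per-frame labels.

```python
from itertools import product

def agreement(sound_labels, image_labels):
    """Rand-index-style score: do the two groupings co-assign frames consistently?"""
    n = len(sound_labels)
    same = sum((sound_labels[i] == sound_labels[j]) == (image_labels[i] == image_labels[j])
               for i in range(n) for j in range(i + 1, n))
    return same / (n * (n - 1) / 2)

# Candidate "sorting results": frame-wise cluster labels from different settings.
sound_candidates = [
    ["A", "A", "B", "B", "A", "B"],
    ["A", "A", "A", "B", "B", "B"],
]
image_candidates = [
    ["x", "x", "y", "y", "x", "y"],
    ["x", "y", "x", "y", "x", "y"],
]

best = max(product(sound_candidates, image_candidates),
           key=lambda pair: agreement(*pair))
print("best sound sorting:", best[0])
print("best image sorting:", best[1])
```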
  • Patent number: 8954327
    Abstract: A voice data analyzing device comprises speaker model deriving means which derives speaker models as models each specifying character of voice of each speaker from voice data including a plurality of utterances to each of which a speaker label as information for identifying a speaker has been assigned and speaker co-occurrence model deriving means which derives a speaker co-occurrence model as a model representing the strength of co-occurrence relationship among the speakers from session data obtained by segmenting the voice data in units of sequences of conversation by use of the speaker models derived by the speaker model deriving means.
    Type: Grant
    Filed: June 3, 2010
    Date of Patent: February 10, 2015
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
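A minimal sketch, under toy assumptions, of the two models in patent 8954327 above: a per-speaker voice model learned from labeled utterances, and a co-occurrence model counting which speakers appear together in the same session. Real systems would use rich acoustic features and probabilistic models rather than scalar means.

```python
from collections import Counter, defaultdict
from itertools import combinations
from statistics import mean

# (speaker_label, toy acoustic feature) per utterance, grouped into sessions.
sessions = [
    [("alice", 1.1), ("bob", 4.9), ("alice", 0.9)],
    [("alice", 1.0), ("carol", 8.2)],
    [("bob", 5.1), ("carol", 7.9)],
]

# Speaker model: mean feature per speaker (stand-in for a real voice model).
features = defaultdict(list)
for session in sessions:
    for speaker, feat in session:
        features[speaker].append(feat)
speaker_model = {spk: mean(vals) for spk, vals in features.items()}

# Co-occurrence model: how often each speaker pair shares a session.
cooccurrence = Counter()
for session in sessions:
    speakers = sorted({spk for spk, _ in session})
    cooccurrence.update(combinations(speakers, 2))

print("speaker models:", speaker_model)
print("co-occurrence strengths:", dict(cooccurrence))
```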
  • Patent number: 8892435
    Abstract: Provided are a text data processing apparatus, method, and program for adding a symbol at an appropriate position. The apparatus according to this embodiment is a text data processing apparatus that edits symbols in input text, the apparatus including symbol edit determination means 52 that determines whether symbol editing is necessary based on a frequency of symbol insertion in a block consisting of a plurality of divided texts; and symbol edit position calculation means 53 that, when the symbol edit determination means determines that symbol editing is necessary, calculates a likelihood of the symbol edit based on a likelihood of symbol insertion for a word and a distance between symbols, and calculates a symbol edit position in the block in accordance with the likelihood of the symbol edit or a word in the block.
    Type: Grant
    Filed: February 13, 2009
    Date of Patent: November 18, 2014
    Assignee: NEC Corporation
    Inventors: Tasuku Kitade, Takafumi Koshinaka
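A toy rendering of the two decisions in patent 8892435 above: decide from the block's symbol-insertion frequency whether editing is needed at all, then score candidate positions by a per-word insertion likelihood and the distance from the previous symbol. The rate threshold and scoring rule are assumptions, not the patented formulas.

```python
words = ["so", "we", "reviewed", "the", "results", "then", "we", "planned", "next", "steps"]
insertion_likelihood = [0.3, 0.0, 0.1, 0.0, 0.4, 0.5, 0.0, 0.2, 0.0, 0.6]
existing_symbol_positions = []      # the block currently has no punctuation

EXPECTED_SYMBOL_RATE = 0.15         # assumed typical symbols-per-word rate

def edit_needed(words, symbol_positions):
    """Symbol edit is deemed necessary when the block is under-punctuated."""
    return len(symbol_positions) / len(words) < EXPECTED_SYMBOL_RATE

def edit_position(likelihoods, symbol_positions):
    """Prefer likely positions that are also far from the previous symbol."""
    last = symbol_positions[-1] if symbol_positions else -1
    scores = [lik * (i - last) for i, lik in enumerate(likelihoods)]
    return max(range(len(scores)), key=scores.__getitem__)

if edit_needed(words, existing_symbol_positions):
    pos = edit_position(insertion_likelihood, existing_symbol_positions)
    print(f"insert a symbol after word {pos}: {words[pos]!r}")
```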
  • Patent number: 8788266
    Abstract: The present invention uses a language model creation device 200 that creates a new language model using a standard language model created from standard language text. The language model creation device 200 includes a transformation rule storage section 201 that stores transformation rules used for transforming dialect-containing word strings into standard language word strings, and a dialect language model creation section 203 that creates dialect-containing n-grams by applying the transformation rules to word n-grams in the standard language model and, furthermore, creates the new language model (dialect language model) by adding the created dialect-containing n-grams to the word n-grams.
    Type: Grant
    Filed: March 16, 2010
    Date of Patent: July 22, 2014
    Assignee: NEC Corporation
    Inventors: Tasuku Kitade, Takafumi Koshinaka, Yoshifumi Onishi
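A hedged sketch of the mechanism in patent 8788266 above: apply dialect transformation rules to word n-grams from a standard-language model and add the transformed n-grams back, yielding a dialect language model. The rules, bigrams, and probability discount below are invented for illustration.

```python
standard_bigrams = {
    ("going", "to"): 0.020,
    ("want", "to"): 0.015,
}
# Transformation rule: standard word -> dialect variant.
transformation_rules = {"going": "goin'", "to": "ta"}
DISCOUNT = 0.5  # assumed probability discount for the added dialect n-grams

dialect_bigrams = dict(standard_bigrams)
for (w1, w2), prob in standard_bigrams.items():
    new = (transformation_rules.get(w1, w1), transformation_rules.get(w2, w2))
    if new != (w1, w2):
        dialect_bigrams[new] = prob * DISCOUNT

for ngram, prob in sorted(dialect_bigrams.items(), key=lambda kv: -kv[1]):
    print(ngram, prob)
```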
  • Patent number: 8751227
    Abstract: Parameters of a first variation model, a second variation model and an environment-independent acoustic model are estimated in such a way that an integrated degree of fitness obtained by integrating a degree of fitness of the first variation model to the sample speech data, a degree of fitness of the second variation model to the sample speech data, and a degree of fitness of the environment-independent acoustic model to the sample speech data becomes the maximum. Therefore, when constructing an acoustic model by using sample speech data affected by a plurality of acoustic environments, the effect on the speech caused by each of the acoustic environments can be extracted with high accuracy.
    Type: Grant
    Filed: February 10, 2009
    Date of Patent: June 10, 2014
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
  • Patent number: 8719021
    Abstract: A speech recognition dictionary compilation assisting system can create and update a speech recognition dictionary and language models efficiently so as to reduce speech recognition errors by utilizing text data available at a low cost. The system includes speech recognition dictionary storage section 105, language model storage section 106, and acoustic model storage section 107. A virtual speech recognition processing section 102 processes analyzed text data generated by the text analyzing section 101, making reference to the recognition dictionary, language models, and acoustic models so as to generate virtual text data resulting from speech recognition, and compares that virtual text data with the analyzed text data. The update processing section 103 updates the recognition dictionary and language models so as to reduce the differences between the two sets of text data.
    Type: Grant
    Filed: February 2, 2007
    Date of Patent: May 6, 2014
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
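A rough sketch of the loop in patent 8719021 above: run "virtual" recognition over analyzed text, compare the virtual result with the original, and update the dictionary and language-model counts to remove the differences. Virtual recognition is simulated here with a toy confusion table rather than real acoustic and language models.

```python
from collections import Counter

analyzed_text = "the new model recognizes rare terminology".split()
dictionary = {"the", "new", "model", "recognizes", "rare"}
confusions = {"terminology": "term in ology"}  # toy out-of-vocabulary error

def virtual_recognize(words):
    """Words outside the dictionary are 'misrecognized' via the confusion table."""
    out = []
    for w in words:
        out.extend((w if w in dictionary else confusions.get(w, w)).split())
    return out

virtual_result = virtual_recognize(analyzed_text)
differences = [w for w in analyzed_text if w not in virtual_result]

# Update step: add the differing words to the dictionary and count them for a
# unigram language-model boost.
dictionary.update(differences)
unigram_boost = Counter(differences)
print("virtual recognition result:", virtual_result)
print("words added to dictionary:", sorted(differences))
print("language-model counts to raise:", dict(unigram_boost))
```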
  • Patent number: 8630853
    Abstract: A speech classification apparatus includes a speech classification probability calculation unit that calculates the probability (probability of classification into each cluster) that the latest of the speech signals (speech data) belongs to each cluster, based on a generative model, which is a probability model, and a parameter updating unit that successively estimates the parameters that define the generative model based on the probability of classification of the speech data into each cluster calculated by the speech classification probability calculation unit.
    Type: Grant
    Filed: March 13, 2008
    Date of Patent: January 14, 2014
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
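A minimal online-clustering sketch in the spirit of patent 8630853 above: for each incoming speech feature, compute the probability that it belongs to each cluster under a simple Gaussian generative model, then nudge the cluster parameters using those probabilities. The one-dimensional model and learning rate are illustrative only.

```python
import math
import random

clusters = [{"mean": 0.0, "weight": 0.5}, {"mean": 5.0, "weight": 0.5}]
SIGMA, LEARNING_RATE = 1.0, 0.1

def classification_probabilities(x):
    """Posterior probability that feature x belongs to each cluster."""
    likes = [c["weight"] * math.exp(-0.5 * ((x - c["mean"]) / SIGMA) ** 2) for c in clusters]
    total = sum(likes)
    return [l / total for l in likes]

def update_parameters(x, probs):
    """Sequentially re-estimate means and weights from the soft assignments."""
    for c, p in zip(clusters, probs):
        c["mean"] += LEARNING_RATE * p * (x - c["mean"])
        c["weight"] += LEARNING_RATE * (p - c["weight"])

stream = [random.gauss(0.3, 1) for _ in range(50)] + [random.gauss(4.5, 1) for _ in range(50)]
for x in stream:
    update_parameters(x, classification_probabilities(x))
print(clusters)
```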
  • Patent number: 8606574
    Abstract: The present invention provides a speech recognition processing system in which speech recognition processing is executed parallelly by plural speech recognizing units. Before text data as the speech recognition result is output from each of the speech recognizing units, information indicating each speaker is parallelly displayed on a display in emission order of each speech. When the text data is output from each of the speech recognizing units, the text data is associated with the information indicating each speaker and the text data is displayed.
    Type: Grant
    Filed: March 25, 2010
    Date of Patent: December 10, 2013
    Assignee: NEC Corporation
    Inventors: Takafumi Koshinaka, Masahiko Hamanaka
  • Publication number: 20130317822
    Abstract: A model adaptation device includes a recognition unit which creates a recognition result by recognizing data that complies with a target domain, i.e., an assumed condition of the recognition target data, based on at least two models and a candidate of a weighting factor indicating the weight of each model in the recognition process. A weighting factor determination unit determines the weighting factor so as to assign a smaller weight to a model having higher reliability. A model update unit updates at least one of the models, using the recognition result as the truth label.
    Type: Application
    Filed: January 31, 2012
    Publication date: November 28, 2013
    Inventor: Takafumi Koshinaka
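A loose, toy rendering of the adaptation loop in publication 20130317822 above: combine two models' scores with a candidate weighting factor, choose the factor that (as the abstract states) gives the smaller weight to the more reliable model, decode a pseudo "truth label", and update one model with it. All scores, reliabilities, and the update rule are made up.

```python
LABELS = ["yes", "no"]
acoustic_scores = {"yes": 0.7, "no": 0.3}   # model A (assumed lower reliability)
language_scores = {"yes": 0.4, "no": 0.6}   # model B (assumed higher reliability)
candidate_weights = [0.3, 0.5, 0.7]          # candidate weights applied to model B

reliability = {"A": 0.5, "B": 0.9}           # assumed reliability estimates
# Per the abstract: the more reliable model gets the smaller weight.
weight_b = min(candidate_weights) if reliability["B"] > reliability["A"] else max(candidate_weights)

combined = {lab: (1 - weight_b) * acoustic_scores[lab] + weight_b * language_scores[lab]
            for lab in LABELS}
truth_label = max(combined, key=combined.get)

# Model update step: move model A toward the pseudo truth label.
acoustic_scores[truth_label] = min(1.0, acoustic_scores[truth_label] + 0.05)
print("chosen weight for model B:", weight_b)
print("pseudo truth label:", truth_label, "| updated model A:", acoustic_scores)
```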
  • Patent number: 8595004
    Abstract: A problem to be solved is to robustly detect a pronunciation variation example and acquire a pronunciation variation rule having a high generalization property, with less effort. The problem can be solved by a pronunciation variation rule extraction apparatus including a speech data storage unit, a base form pronunciation storage unit, a sub word language model generation unit, a speech recognition unit, and a difference extraction unit. The speech data storage unit stores speech data. The base form pronunciation storage unit stores base form pronunciation data representing base form pronunciation of the speech data. The sub word language model generation unit generates a sub word language model from the base form pronunciation data. The speech recognition unit recognizes the speech data by using the sub word language model.
    Type: Grant
    Filed: November 27, 2008
    Date of Patent: November 26, 2013
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
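A rough sketch of the difference-extraction step in patent 8595004 above: align each base-form pronunciation with a recognition result and collect the substitutions as pronunciation-variation rule candidates. The recognition results are supplied by hand here; the patent obtains them from a recognizer driven by a sub-word language model built from the base forms.

```python
from collections import Counter
from difflib import SequenceMatcher

# (base-form pronunciation, recognized sub-word sequence) pairs, as toy data.
examples = [
    ("k o N n i ch i w a".split(), "k o N ch i w a".split()),
    ("a r i g a t o u".split(), "a z a a s u".split()),
    ("k o N n i ch i w a".split(), "k o N n ch i w a".split()),
]

rules = Counter()
for base, recognized in examples:
    matcher = SequenceMatcher(a=base, b=recognized)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            rules[(tuple(base[i1:i2]), tuple(recognized[j1:j2]))] += 1

for (base_sub, variant), count in rules.most_common():
    print(" ".join(base_sub) or "(none)", "->", " ".join(variant) or "(none)", f"x{count}")
```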
  • Patent number: 8577679
    Abstract: The invention enables symbol insertion evaluation that takes into account differences in speaking style features between speakers. For a word sequence transcribing voice information, the symbol insertion likelihood calculation means 113 obtains a symbol insertion likelihood for each of a plurality of symbol insertion models supplied for different speaking style features. The speaking style feature similarity calculation means 112 obtains a similarity between the speaking style feature of the word sequence and each of the plurality of speaking style feature models. The symbol insertion evaluation means 114 weights the symbol insertion likelihood obtained for the word sequence by each of the plurality of symbol insertion models according to the similarity between the speaking style feature of the word sequence and the plurality of speaking style feature models and the relevance between each symbol insertion model and speaking style feature model, and performs symbol insertion evaluation on the word sequence.
    Type: Grant
    Filed: January 19, 2009
    Date of Patent: November 5, 2013
    Assignee: NEC Corporation
    Inventors: Tasuku Kitade, Takafumi Koshinaka
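A toy rendering of the weighting scheme in patent 8577679 above: combine the punctuation-insertion likelihoods of several symbol-insertion models, weighting each by the similarity between the speaker's style and that model's style model and by an assumed model-to-style relevance. The models, similarities, and threshold are invented for illustration.

```python
words = ["well", "so", "we", "met", "yesterday", "and", "talked"]

# Likelihood of inserting a comma after each word, per symbol-insertion model.
insertion_likelihood = {
    "formal_model": [0.1, 0.2, 0.0, 0.1, 0.6, 0.1, 0.0],
    "casual_model": [0.7, 0.5, 0.0, 0.1, 0.4, 0.2, 0.0],
}
style_similarity = {"formal_style": 0.2, "casual_style": 0.8}   # from style models
relevance = {("formal_model", "formal_style"): 1.0, ("casual_model", "casual_style"): 1.0,
             ("formal_model", "casual_style"): 0.0, ("casual_model", "formal_style"): 0.0}

THRESHOLD = 0.4
output = []
for i, word in enumerate(words):
    # Weighted sum over models and style classes, as a stand-in for the patent's scheme.
    score = sum(style_similarity[s] * relevance[(m, s)] * insertion_likelihood[m][i]
                for m in insertion_likelihood for s in style_similarity)
    output.append(word + ("," if score >= THRESHOLD else ""))
print(" ".join(output))
```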
  • Publication number: 20130231929
    Abstract: The present invention can increase the types of noise that can be dealt with well enough to enable speech recognition with a highly accurate recognition rate.
    Type: Application
    Filed: November 10, 2011
    Publication date: September 5, 2013
    Applicant: NEC CORPORATION
    Inventors: Shuji Komeji, Takayuki Arakawa, Takafumi Koshinaka
  • Patent number: 8422787
    Abstract: There is provided an apparatus including a model-based topic segmentation section that segments a text using a topic model representing semantic coherence, a parameter estimation section that estimates a control parameter used in segmenting the text based on detection of a change point of word distribution in the text, using the result of segmentation by the model-based topic segmentation section as training data, and a change point detection topic segmentation section that segments the text, based on detection of the change point of word distribution in the text, using the parameter estimated by the parameter estimation section.
    Type: Grant
    Filed: December 25, 2008
    Date of Patent: April 16, 2013
    Assignee: NEC Corporation
    Inventors: Makoto Terao, Takafumi Koshinaka
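A compact sketch of the two-stage idea in patent 8422787 above: treat the boundaries proposed by a (here, stubbed) topic-model segmenter as training data, tune a change-point detector's threshold to reproduce them, and then segment with the tuned detector. The text, stub result, and detector are toy stand-ins.

```python
from collections import Counter

tokens = "cats cats dogs cats dogs dogs stocks bonds stocks stocks bonds bonds".split()
model_based_boundaries = {6}          # stand-in for the topic-model segmentation result

def distribution_change(tokens, i, window=3):
    """How different are the word distributions just before and after position i?"""
    left, right = Counter(tokens[max(0, i - window):i]), Counter(tokens[i:i + window])
    vocab = set(left) | set(right)
    return sum(abs(left[w] / window - right[w] / window) for w in vocab) / 2

def segment(tokens, threshold):
    return {i for i in range(1, len(tokens)) if distribution_change(tokens, i) >= threshold}

def overlap(pred, gold):
    """F1-like overlap between boundary sets, used as a toy tuning objective."""
    return 2 * len(pred & gold) / (len(pred) + len(gold) or 1)

# Parameter estimation: pick the threshold whose boundaries best match the
# model-based segmentation result.
best_threshold = max((t / 10 for t in range(1, 11)),
                     key=lambda t: overlap(segment(tokens, t), model_based_boundaries))
print("estimated threshold:", best_threshold)
print("final change-point segmentation:", sorted(segment(tokens, best_threshold)))
```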
  • Publication number: 20120239400
    Abstract: A speaker or a set of speakers can be recognized with high accuracy even when multiple speakers and the relationships between speakers change over time. A device comprises a speaker model derivation means for deriving a speaker model defining a voice property per speaker from speech data made of multiple utterances to which speaker labels, as information for identifying a speaker, are given; a speaker co-occurrence model derivation means for, by use of the speaker model derived by the speaker model derivation means, deriving a speaker co-occurrence model indicating the strength of a co-occurrence relationship between the speakers from session data, which is the speech data divided into units of a series of conversation; and a model structure update means for, with reference to a session of newly-added speech data, detecting predefined events and, when a predefined event is detected, updating the structure of at least one of the speaker model and the speaker co-occurrence model.
    Type: Application
    Filed: October 21, 2010
    Publication date: September 20, 2012
    Applicant: NEC Corporation
    Inventor: Takafumi Koshinaka
  • Publication number: 20120233168
    Abstract: A sound segment sorting unit (103) sorts the sound segments of a video. An image segment sorting unit (104) sorts the image segments of the video. A multiple sorting result generation unit (105) generates a plurality of sound segment sorting results and/or a plurality of image segment sorting results. A sorting result pair generation unit (106) generates a plurality of sorting result pairs of the sorting results as the candidates of the optimum segment sorting result of the video. A sorting result output unit (108) compares the sorting result comparative scores of the sorting result pairs calculated by a sorting result comparative score calculation unit (107) and thus outputs a sound segment sorting result and an image segment sorting result having good correspondence. This allows a plurality of sound segments and a plurality of image segments contained in the video to be sorted accurately, for each object, without adjusting parameters in advance.
    Type: Application
    Filed: November 5, 2010
    Publication date: September 13, 2012
    Applicant: NEC CORPORATION
    Inventors: Makoto Terao, Takafumi Koshinaka
  • Publication number: 20120116763
    Abstract: A voice data analyzing device comprises speaker model deriving means which derives speaker models as models each specifying character of voice of each speaker from voice data including a plurality of utterances to each of which a speaker label as information for identifying a speaker has been assigned and speaker co-occurrence model deriving means which derives a speaker co-occurrence model as a model representing the strength of co-occurrence relationship among the speakers from session data obtained by segmenting the voice data in units of sequences of conversation by use of the speaker models derived by the speaker model deriving means.
    Type: Application
    Filed: June 3, 2010
    Publication date: May 10, 2012
    Applicant: NEC CORPORATION
    Inventor: Takafumi Koshinaka
  • Patent number: 8140530
    Abstract: The problem addressed is to accurately calculate the similarity between media data and a query even if the media data or its metadata has an error. To solve this, a similarity calculation device includes: a single score calculation device, used when calculating similarity between first media data and a query, which calculates a single score that shows the similarity between the query and second media data different from the first media data; an inter-media similarity calculation device which calculates an inter-media similarity that shows the similarity between the second media data and the first media data; and a query similarity calculation device which obtains the similarity between the first media data and the query by using the inter-media similarity of the second media data and the single score.
    Type: Grant
    Filed: August 2, 2007
    Date of Patent: March 20, 2012
    Assignee: NEC Corporation
    Inventors: Makoto Terao, Takafumi Koshinaka, Shinichi Ando, Yoshifumi Onishi
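A small illustrative sketch of the scoring idea in patent 8140530 above: when the first media item's own metadata may be unreliable, estimate its similarity to a query by propagating the query scores of other media items through inter-media similarities. The scores, similarities, and combination weights below are made up.

```python
query_scores = {"video_B": 0.9, "video_C": 0.2}           # single scores vs. the query
inter_media_similarity = {"video_B": 0.8, "video_C": 0.1}  # similarity of each item to video_A

def query_similarity(target_own_score=0.0):
    """Blend the target's own (possibly erroneous) score with propagated scores."""
    propagated = sum(query_scores[m] * inter_media_similarity[m] for m in query_scores)
    normalizer = sum(inter_media_similarity.values()) or 1.0
    return 0.5 * target_own_score + 0.5 * propagated / normalizer

print(f"similarity(video_A, query) ~= {query_similarity(target_own_score=0.3):.2f}")
```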
  • Publication number: 20120035915
    Abstract: The present invention uses a language model creation device 200 that creates a new language model using a standard language model created from standard language text. The language model creation device 200 includes a transformation rule storage section 201 that stores transformation rules used for transforming dialect-containing word strings into standard language word strings, and a dialect language model creation section 203 that creates dialect-containing n-grams by applying the transformation rules to word n-grams in the standard language model and, furthermore, creates the new language model (dialect language model) by adding the created dialect-containing n-grams to the word n-grams.
    Type: Application
    Filed: March 16, 2010
    Publication date: February 9, 2012
    Inventors: Tasuku Kitade, Takafumi Koshinaka, Yoshifumi Onishi