Patents by Inventor Takafumi Koshinaka

Takafumi Koshinaka has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20110301952
    Abstract: The present invention provides a speech recognition processing system in which speech recognition processing is executed in parallel by plural speech recognizing units. Before text data as the speech recognition result is output from each of the speech recognizing units, information indicating each speaker is displayed in parallel on a display in emission order of each speech. When the text data is output from each of the speech recognizing units, the text data is associated with the information indicating each speaker and the text data is displayed.
    Type: Application
    Filed: March 25, 2010
    Publication date: December 8, 2011
    Applicant: NEC CORPORATION
    Inventors: Takafumi Koshinaka, Masahiko Hamanaka
  • Publication number: 20110202487
    Abstract: A statistical model learning device is provided to efficiently select data effective in improving the quality of statistical models. A data classification means 601 refers to structural information 611 generally possessed by a data which is a learning object, and extracts a plurality of subsets 613 from the training data 612. A statistical model learning means 602 utilizes the plurality of subsets 613 to create statistical models 614 respectively. A data recognition means 603 utilizes the respective statistical models 614 to recognize other data 615 different from the training data 612 and acquires each recognition result 616. An information amount calculation means 604 calculates information amounts of the other data 615 from a degree of discrepancy among the statistical models of the recognition results. A data selection means 605 selects the data with a large information amount and adds the same to the training data 612.
    Type: Application
    Filed: July 22, 2009
    Publication date: August 18, 2011
    Applicant: NEC CORPORATION
    Inventor: Takafumi Koshinaka
  • Publication number: 20110046952
    Abstract: Parameters of a first variation model, a second variation model and an environment-independent acoustic model are estimated in such a way that an integrated degree of fitness obtained by integrating a degree of fitness of the first variation model to the sample speech data, a degree of fitness of the second variation model to the sample speech data, and a degree of fitness of the environment-independent acoustic model to the sample speech data becomes the maximum. Therefore, when constructing an acoustic model by using sample speech data affected by a plurality of acoustic environments, the effect on a speech which is caused by each of the acoustic environments can be extracted with high accuracy.
    Type: Application
    Filed: February 10, 2009
    Publication date: February 24, 2011
    Inventor: Takafumi Koshinaka
  • Publication number: 20110010175
    Abstract: Provided is a text data processing apparatus, method and program to add a symbol at an appropriate position. The apparatus according to this embodiment is a text data processing apparatus that executes edit of a symbol in input text, the apparatus including symbol edit determination means 52 that determines whether symbol edit is necessary or not based on a frequency of symbol insertion in a block consisting of a plurality of divided text; and symbol edit position calculation means 53 that calculates likelihood of the symbol edit based on likelihood of symbol insertion for a word and a distance between the symbols and calculates a symbol edit position in the block in accordance with the likelihood of symbol edit or a word in the block when the symbol edit determination means determines that the symbol edit is necessary.
    Type: Application
    Filed: February 13, 2009
    Publication date: January 13, 2011
    Inventors: Tasuku Kitade, Takafumi Koshinaka
  • Publication number: 20100292989
    Abstract: Enables symbol insertion evaluation in consideration of a difference in speaking style features between speakers. For a word sequence transcribing voice information, the symbol insertion likelihood calculation means 113 obtains a symbol insertion likelihood for each of a plurality of symbol insertion models supplied for different speaking style features. The speaking style feature similarity calculation means 112 obtains a similarity between the speaking style feature of the word sequence and the plurality of speaking style feature models. The symbol insertion evaluation means 114 weights the symbol insertion likelihood obtained for the word sequence by each of the plurality of symbol insertion models according to the similarity between the speaking style feature of the word sequence and the plurality of speaking style feature models and the relevance between the symbol insertion model and the speaking style feature model, and performs symbol insertion evaluation to the word sequence.
    Type: Application
    Filed: January 19, 2009
    Publication date: November 18, 2010
    Inventors: Tasuku Kitade, Takafumi Koshinaka
  • Publication number: 20100278428
    Abstract: There is provided an apparatus including a model based topic segmentation section that segments a text using a topic model representing semantic coherence, a parameter estimation section that estimates a control parameter used in segmenting the text based on detection of a change point of word distribution in the text, using the result of segmentation by the model based topic segmentation unit as training data, and a change point detection topic segmentation section that segments the text, based on detection of the change point of word distribution in the text, using the parameter estimated by the parameter estimation section (FIG. 1).
    Type: Application
    Filed: December 25, 2008
    Publication date: November 4, 2010
    Inventors: Makoto Terao, Takafumi Koshinaka
  • Publication number: 20100268535
    Abstract: A problem to be solved is to robustly detect a pronunciation variation example and acquire a pronunciation variation rule having a high generalization property, with less effort. The problem can be solved by a pronunciation variation rule extraction apparatus including a speech data storage unit, a base form pronunciation storage unit, a sub word language model generation unit, a speech recognition unit, and a difference extraction unit. The speech data storage unit stores speech data. The base form pronunciation storage unit stores base form pronunciation data representing base form pronunciation of the speech data. The sub word language model generation unit generates a sub word language model from the base form pronunciation data. The speech recognition unit recognizes the speech data by using the sub word language model.
    Type: Application
    Filed: November 27, 2008
    Publication date: October 21, 2010
    Inventor: Takafumi Koshinaka
  • Publication number: 20100138223
    Abstract: An object of the present invention is to allow classification of sequentially input speech signals with good accuracy based on similarity of speakers and environments by using a realistic memory use amount, a realistic processing speed, and an on-line operation. A speech classification probability calculation means 103 calculates a probability (probability of classification into each cluster) that a latest one of the speech signals (speech data) belongs to each cluster based on a generative model which is a probability model. A parameter updating means 107 successively estimates parameters that define the generative model based on the probability of classification of the speech data into each cluster calculated by the speech classification probability calculation means 103 (in FIG. 1).
    Type: Application
    Filed: March 13, 2008
    Publication date: June 3, 2010
    Inventor: Takafumi Koshinaka
  • Publication number: 20090319513
    Abstract: [Problems] To accurately calculate similarity between media data and a query even if the media data or its meta data has an error. [Means for Solving the Problems] A similarity calculation device includes: a single score calculation device used when calculating similarity between first media data and a query, which calculates a single score that shows similarity between second media data different from the first media data and the query; an inter-media similarity calculation device which calculates inter-media similarity that shows the similarity between the second media data and the first media data; and a query similarity calculation device which obtains similarity between the first media data and the query by using the inter-media similarity of the second media data and the single score.
    Type: Application
    Filed: August 2, 2007
    Publication date: December 24, 2009
    Applicant: NEC Corporation
    Inventors: Makoto Terao, Takafumi Koshinaka, Shinichi Ando, Yoshifumi Onishi
  • Publication number: 20090271195
    Abstract: A speech recognition apparatus capable of attaining high recognition accuracy within practical processing time using a computing machine having standard performance by appropriately adapting a language model to a speech about a certain topic, irrespectively of a degree of detail and diversity of the topic and irrespectively of a confidence score of an initial speech recognition result is provided.
    Type: Application
    Filed: July 6, 2007
    Publication date: October 29, 2009
    Applicant: NEC Corporation
    Inventors: Tasuku Kitade, Takafumi Koshinaka
  • Publication number: 20090024392
    Abstract: A speech recognition dictionary making supporting system for efficiently making/updating a speech recognition dictionary/language model with reduced speech recognition errors by using text data available at low cost. The speech recognition dictionary making supporting system comprises a recognition dictionary storage section (105), a language model storage section (106), and a sound model storage section (107). A virtual speech recognizing section (102) creates virtual speech recognition result text data in regard to an analyzed text data created by a text analyzing section (101) with reference to a recognition dictionary, language model, and sound model, and compares the virtual speech recognition result text data with the original analyzed text data. An updating section (103) updates the recognition dictionary and language model so that the different portions in both the text data may be lessened.
    Type: Application
    Filed: February 2, 2007
    Publication date: January 22, 2009
    Applicant: NEC CORPORATION
    Inventor: Takafumi Koshinaka
  • Publication number: 20070162272
    Abstract: A temporary model generating unit (103) generates a probability model which is estimated to generate a text document as a processing target and in which information indicating which word of the text document to which topic is made to correspond to a latent variable, and each word is made to correspond to an observable variable. A model parameter estimating unit (105) estimates model parameters defining a probability model on the basis of the text document as the processing target. When a plurality of probability models are generated, a model selecting unit (107) selects an optimal probability model on the basis of the estimation result for each probability model. A text segmentation result output unit (108) segments the text document as the processing target for each topic on the basis of the estimation result on the optimal probability model.
    Type: Application
    Filed: January 17, 2005
    Publication date: July 12, 2007
    Applicant: NEC CORPORATION
    Inventor: Takafumi Koshinaka
  • Patent number: 6671417
    Abstract: A reference line detecting step 4 assumes two reference lines dividing a character row image into three, i.e., upper, intermediate and lower areas to be quadratic curves independent of one another, and obtains parameters determining the two reference lines such as to best separate the length distributions of white runs in the three areas from one another. A reference line correcting step 5 corrects the character row image from the parameters and the same image such as to obtain two horizontal reference lines and predetermined values of area (i.e., height) ratios of the three areas, and feeds the corrected image back to a preprocessing step 2. A character row reading step 3 executes character segmentation, feature extraction and character recognition with the character row image from the preprocessing step 2. The step 3 executes selection of the character recognition results, and outputs a most likely read-out result.
    Type: Grant
    Filed: May 19, 2000
    Date of Patent: December 30, 2003
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
  • Patent number: 6115506
    Abstract: The invention provides a character recognition apparatus wherein wrong correction in slant correction processing of a character string is minimized to minimize erroneous recognition. A character slant estimation section receives an image, calculates slant angle candidates and evaluation values of them, and calculates a slant angle estimated value based on the evaluation values. An estimated value evaluation section receives the evaluation values, calculates an information amount of the evaluation values or the like, and outputs it as a validity of the slant angle estimated value. A slant correction section receives and normalizes the validity to a value from 0 to 1 and determines the normalized value as an execution coefficient for slant correction.
    Type: Grant
    Filed: May 4, 1998
    Date of Patent: September 5, 2000
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
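
The abstract of patent 6115506 above describes turning the "validity" of a slant angle estimate into an execution coefficient between 0 and 1. The following is a minimal sketch of that idea, not the patented method. It assumes the evaluation values are non-negative scores, takes the best-scoring candidate as the estimate, measures validity as one minus the normalized entropy of the scores, and applies the coefficient by scaling the correction angle. The function names `estimate_slant` and `correction_angle` are hypothetical, and none of these specific choices are stated in the abstract.

```python
import math


def estimate_slant(candidates, scores):
    """Return (estimated_angle, validity) from candidate angles and their scores.

    Assumptions (not from the abstract): scores are non-negative, the estimate
    is the best-scoring candidate, and validity is 1 - normalized entropy.
    """
    if len(candidates) != len(scores) or len(candidates) < 2:
        raise ValueError("need at least two candidates with matching scores")
    total = sum(scores)
    if total <= 0 or any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative with a positive sum")

    # Treat the scores as a probability distribution over candidates.
    probs = [s / total for s in scores]

    # Estimate: the candidate angle with the highest evaluation value.
    best = max(range(len(scores)), key=lambda i: scores[i])
    estimate = candidates[best]

    # Flat scores (high entropy) mean the estimate is unreliable;
    # a single sharp peak (low entropy) means it is trustworthy.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    validity = 1.0 - entropy / math.log(len(probs))
    return estimate, validity


def correction_angle(estimate, validity):
    """Scale the correction by an execution coefficient clamped to [0, 1]."""
    coefficient = min(1.0, max(0.0, validity))
    return coefficient * estimate
```

With flat scores the coefficient is close to 0 and the character string is left almost untouched; with one dominant candidate the coefficient approaches 1 and the full estimated angle is applied. This mirrors the stated goal of reducing wrong corrections when the estimate is doubtful.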