Patents by Inventor Yoshifumi Onishi

Yoshifumi Onishi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8140530
    Abstract: [PROBLEMS] To accurately calculate similarity between media data and a query even if the media data or its meta data has an error. [MEANS FOR SOLVING THE PROBLEMS] A similarity calculation device includes: a single score calculation device used when calculating similarity between first media data and a query, which calculates a single score that shows similarity between second media data different from the first media data and the query; an inter-media similarity calculation device which calculates inter-media similarity that shows the similarity between the second media data and the first media data; and a query similarity calculation device which obtains similarity between the first media data and the query by using the inter-media similarity of the second media data and the single score.
    Type: Grant
    Filed: August 2, 2007
    Date of Patent: March 20, 2012
    Assignee: NEC Corporation
    Inventors: Makoto Terao, Takafumi Koshinaka, Shinichi Ando, Yoshifumi Onishi
  • Publication number: 20120046940
    Abstract: A method for processing multichannel acoustic signals, whereby input signals of a plurality of channels including the voices of a plurality of speaking persons are processed. The method is characterized by comprising: calculating the first feature quantity of the input signals of the multichannels for each channel; calculating similarity of the first feature quantity of each channel between the channels; selecting channels having high similarity; separating signals using the input signals of the selected channels; inputting the input signals of the channels having low similarity and the signals after the signal separation; and detecting a voice section of each speaking person or each channel.
    Type: Application
    Filed: February 8, 2010
    Publication date: February 23, 2012
    Applicant: NEC CORPORATION
    Inventors: Masanori Tsujikawa, Tadashi Emori, Yoshifumi Onishi, Ryosuke Isotani
  • Publication number: 20120035915
    Abstract: The present invention uses a language model creation device 200 that creates a new language model using a standard language model created from standard language text. The language model creation device 200 includes a transformation rule storage section 201 that stores transformation rules used for transforming dialect-containing word strings into standard language word strings, and a dialect language model creation section 203 that creates dialect-containing n-grams by applying the transformation rules to word n-grams in the standard language model and, furthermore, creates the new language model (dialect language model) by adding the created dialect-containing n-grams to the word n-grams.
    Type: Application
    Filed: March 16, 2010
    Publication date: February 9, 2012
    Inventors: Tasuku Kitade, Takafumi Koshinaka, Yoshifumi Onishi
  • Publication number: 20120029915
    Abstract: A method for processing multichannel acoustic signals which processes input signals of a plurality of channels including the voices of a plurality of speaking persons. The method is characterized by detecting the voice section of each speaking person or each channel, detecting overlapped sections wherein the detected voice sections are common between channels, determining a channel to be subjected to crosstalk removal and the section thereof by use of at least voice sections not including the detected overlapped sections, and removing crosstalk in the sections of the channel to be subjected to the crosstalk removal.
    Type: Application
    Filed: February 8, 2010
    Publication date: February 2, 2012
    Applicant: NEC CORPORATION
    Inventors: Masanori Tsujikawa, Ryosuke Isotani, Tadashi Emori, Yoshifumi Onishi
  • Publication number: 20120029916
    Abstract: A method for processing multichannel acoustic signals which is characterized by calculating the feature quantity of each channel from the input signals of a plurality of channels, calculating similarity between the channels in the feature quantity of each channel, selecting channels having high similarity, and separating signals using the input signals of the selected channels.
    Type: Application
    Filed: February 8, 2010
    Publication date: February 2, 2012
    Applicant: NEC CORPORATION
    Inventors: Masanori Tsujikawa, Tadashi Emori, Yoshifumi Onishi
  • Publication number: 20110224985
    Abstract: A model adaptation device includes a text database that stores a plurality of sentences containing predetermined phonemes; a sentence list that includes a plurality of sentences that describe the contents of the input voice; an input unit to which the input voice is input; a model adaptation unit that performs the model adaptation using the input voice and the sentence list and outputs adapting characteristic information, which is for making the model approximate to the input voice; a statistic database that stores the adapting characteristic information; a distance calculation unit that outputs a value of an acoustic distance between the adapting characteristic information and the model for each phoneme; a phoneme detection unit that outputs a distance value, among the distance values, which is greater than a threshold value as a detection result; and a label generation unit that extracts from the text database a sentence containing a phoneme associated with the detection result and outputs the sentence.
    Type: Application
    Filed: October 23, 2009
    Publication date: September 15, 2011
    Inventors: Ken Hanazawa, Yoshifumi Onishi
  • Publication number: 20100324897
    Abstract: Acoustic models and language models are learned according to a speaking length, which indicates the length of a speaking section in speech data, and a speech recognition process is performed by using the learned acoustic models and language models. A speech recognition apparatus includes means (103) for detecting a speaking section in speech data (101) and for generating section information which indicates the detected speaking section, means (104) for recognizing a data part corresponding to the section information in the speech data as well as text data (102) written from the speech data and for classifying the data part based on a speaking length thereof, and means (106) for learning acoustic models and language models (107) by using the classified data part (105).
    Type: Application
    Filed: December 7, 2007
    Publication date: December 23, 2010
    Inventors: Tadashi Emori, Yoshifumi Onishi
  • Publication number: 20100318358
    Abstract: A speech recognition apparatus (110) selects an optimum recognition result from recognition results output from a set of speech recognizers (s1-sM) based on a majority decision. The decision takes into account weight values for the set of speech recognizers, learned by a learning apparatus (100). The learning apparatus includes a unit (103) selecting speech recognizers corresponding to characteristics of speech for learning (101), a unit (104) finding recognition results of the speech for learning by using the selected speech recognizers, a unit (105) unifying the recognition results and generating a word string network, and a unit (106) finding weight values concerning a set of the speech recognizers by implementing learning processing.
    Type: Application
    Filed: January 18, 2008
    Publication date: December 16, 2010
    Inventors: Yoshifumi Onishi, Tadashi Emori
  • Publication number: 20100114572
    Abstract: To enable selection of a speaker whose acoustic feature value is similar to that of an utterance speaker, with accuracy and stability, while adapting to changes even when the acoustic feature value of the speaker changes from moment to moment. A speaker score calculating means (22) calculates a long-time speaker score (log likelihood of each of a plurality of speaker models stored in a speaker model storage section (31) with respect to the acoustic feature value) based on an arbitrary number of utterances, for example, and calculates a short-time speaker score based on a short-time utterance, for example. A long-time speaker selecting means (23) selects speakers corresponding to a predetermined number of speaker models having a high long-time speaker score.
    Type: Application
    Filed: February 29, 2008
    Publication date: May 6, 2010
    Inventors: Masahiro Tani, Tadashi Emori, Yoshifumi Onishi
  • Publication number: 20100094629
    Abstract: A weighting factor learning system includes an audio recognition section that recognizes learning audio data and outputs the recognition result; a weighting factor updating section that updates a weighting factor applied to a score obtained from an acoustic model and a language model so that the difference between a correct-answer score calculated with the use of a correct-answer text of the learning audio data and a score of the recognition result becomes large; a convergence determination section that determines, with the use of the score after updating, whether to return to the weighting factor updating section to update the weighting factor again; and a weighting factor convergence determination section that determines, with the use of the score after updating, whether to return to the audio recognition section to perform the process again and update the weighting factor using the weighting factor updating section.
    Type: Application
    Filed: February 19, 2008
    Publication date: April 15, 2010
    Inventors: Tadashi Emori, Yoshifumi Onishi
  • Publication number: 20100023329
    Abstract: Speech recognition of even a speaker who uses a speech recognition system is enabled by using an extended recognition dictionary suited to the speaker without requiring any previous learning using an utterance label corresponding to the speech of the speaker.
    Type: Application
    Filed: January 15, 2008
    Publication date: January 28, 2010
    Applicant: NEC CORPORATION
    Inventor: Yoshifumi Onishi
  • Publication number: 20090319513
    Abstract: [Problems] To accurately calculate similarity between media data and a query even if the media data or its meta data has an error. [Means for Solving the Problems] A similarity calculation device includes: a single score calculation device used when calculating similarity between first media data and a query, which calculates a single score that shows similarity between second media data different from the first media data and the query; an inter-media similarity calculation device which calculates inter-media similarity that shows the similarity between the second media data and the first media data; and a query similarity calculation device which obtains similarity between the first media data and the query by using the inter-media similarity of the second media data and the single score.
    Type: Application
    Filed: August 2, 2007
    Publication date: December 24, 2009
    Applicant: NEC Corporation
    Inventors: Makoto Terao, Takafumi Koshinaka, Shinichi Ando, Yoshifumi Onishi
  • Publication number: 20090012791
    Abstract: A method and apparatus for carrying out adaptation using input speech data information even when the recognition performance of the reference pattern is low. A reference pattern adaptation device 2 includes a speech recognition section 18, an adaptation data calculating section 19 and a reference pattern adaptation section 20. The speech recognition section 18 calculates a recognition result teacher label from the input speech data and the reference pattern. The adaptation data calculating section 19 calculates adaptation data composed of a teacher label and speech data. The adaptation data is composed of the input speech data and the recognition result teacher label corrected for adaptation by the recognition error knowledge which is the statistical information of the tendency towards recognition errors of the reference pattern. The reference pattern adaptation section 20 adapts the reference pattern using the adaptation data to generate an adaptation pattern.
    Type: Application
    Filed: February 16, 2007
    Publication date: January 8, 2009
    Applicant: NEC Corporation
    Inventor: Yoshifumi Onishi
  • Publication number: 20070055530
    Abstract: To provide a speaker verifying apparatus and the like capable of updating the identifier of a registering speaker at a low cost, considering that voices change over time. An update data generating apparatus comprises functions of: inputting registering speaker's voice feature value data to the speaker identifier of the registering speaker to obtain hypothesis scores, and generating a registering speaker score vector string constituted with a plurality of vectors having the hypothesis scores as the elements; inputting background speaker's voice feature value data to the speaker identifier of the registering speaker to obtain hypothesis scores, and generating a background speaker score vector string constituted with a plurality of vectors having the hypothesis scores as the elements; and storing the registering speaker score vector string and the background speaker score vector string in a storage device.
    Type: Application
    Filed: August 17, 2006
    Publication date: March 8, 2007
    Inventor: Yoshifumi Onishi
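
Several abstracts in this listing describe combining scores from multiple sources. As a rough illustration only, the weighted majority decision mentioned in publication 20100318358 might be sketched as follows. This is a simplified assumption, not the method claimed in the application: it votes on whole recognition results rather than on a word string network, and the function name `weighted_majority`, the example utterances, and the weight values are all hypothetical.

```python
from collections import defaultdict


def weighted_majority(results, weights):
    """Pick the recognition result with the largest total weight.

    results: one hypothesis string per recognizer (hypothetical input format).
    weights: one non-negative weight per recognizer; in the abstract these
             would be learned, here they are simply given.
    """
    if len(results) != len(weights):
        raise ValueError("need exactly one weight per recognizer")
    if not results:
        raise ValueError("need at least one recognition result")

    # Sum the weights of all recognizers that produced the same hypothesis.
    totals = defaultdict(float)
    for hypothesis, weight in zip(results, weights):
        totals[hypothesis] += weight

    # The hypothesis with the highest accumulated weight wins.
    return max(totals, key=totals.get)


# Three hypothetical recognizers; two of them agree.
outputs = ["turn on the light", "turn on the right", "turn on the light"]

# With roughly equal weights the majority wins.
print(weighted_majority(outputs, [0.5, 0.9, 0.6]))

# If the dissenting recognizer is trusted much more, it can outvote the others.
print(weighted_majority(outputs, [0.2, 0.9, 0.3]))
```

The second call shows why the weights matter: a plain majority vote would always return the result that two recognizers agree on, whereas a heavily weighted recognizer can override that agreement.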