Patents by Inventor Katsuki Minamino

Katsuki Minamino has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20060177802
    Abstract: In a conventional voice dialogue system, it can be difficult to conduct a natural dialogue with the user. Therefore, the system is designed to perform speech recognition on the user's utterance, to control a dialogue with the user according to a previously given scenario, to generate, based on the speech recognition result, an answering sentence corresponding to the contents of the user's utterance as the occasion demands, and to perform voice synthesis processing on a sentence in the reproduced scenario or on the generated answering sentence.
    Type: Application
    Filed: March 16, 2004
    Publication date: August 10, 2006
    Inventors: Atsuo Hiroe, Hideki Shimomura, Helmut Lucke, Katsuki Minamino, Haru Kato
  • Patent number: 7088853
    Abstract: A plural number of letters or characters inferred from the results of letter/character recognition of an image photographed by a CCD camera (20), a plural number of kana readings inferred from those letters or characters, and the pronunciations corresponding to the kana readings are generated in a pronunciation information generating unit (150). The plural readings obtained are matched against the user's pronunciation, acquired by a microphone (23), to specify one kana reading and one pronunciation (reading) from among the plural generated candidates.
    Type: Grant
    Filed: December 31, 2002
    Date of Patent: August 8, 2006
    Assignee: Sony Corporation
    Inventors: Atsuo Hiroe, Katsuki Minamino, Kenta Kawamoto, Kohtaro Sabe, Takeshi Ohashi
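The matching step in the abstract above can be illustrated with a small sketch: several candidate readings are generated from character recognition, and the one closest to the user's spoken pronunciation is selected. The names, phoneme spellings, and the per-position overlap score below are all invented stand-ins for the patented acoustic matching.

```python
# Illustrative sketch (names and phonemes invented) of picking one kana
# reading from several candidates by closeness to the spoken input.

def overlap(a, b):
    """Crude similarity: fraction of phonemes matching at the same position."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))

def pick_reading(candidates, spoken):
    """candidates: mapping reading -> phoneme list; spoken: phoneme list."""
    return max(candidates, key=lambda r: overlap(candidates[r], spoken))

candidates = {
    "koube": ["k", "o", "u", "b", "e"],  # one reading inferred from the characters
    "kanbe": ["k", "a", "n", "b", "e"],  # an alternative reading
}
print(pick_reading(candidates, ["k", "o", "u", "b", "e"]))  # koube
```

A real system would replace `overlap` with acoustic matching against the microphone signal; the selection-among-candidates structure is the point here.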
  • Patent number: 7013277
    Abstract: A preliminary word-selecting section selects one or more words following words which have been obtained in a word string serving as a candidate for a result of speech recognition; and a matching section calculates acoustic or linguistic scores for the selected words, and forms a word string serving as a candidate for a result of speech recognition according to the scores. A control section generates word-connection relationships between words in the word string serving as a candidate for a result of speech recognition, sends them to a word-connection-information storage section, and stores them in it. A re-evaluation section corrects the word-connection relationships stored in the word-connection-information storage section (16), and the control section determines a word string serving as the result of speech recognition according to the corrected word-connection relationships.
    Type: Grant
    Filed: February 26, 2001
    Date of Patent: March 14, 2006
    Assignee: Sony Corporation
    Inventors: Katsuki Minamino, Yasuharu Asano, Hiroaki Ogawa, Helmut Lucke
  • Patent number: 6961701
    Abstract: An extended-word selecting section calculates a score for a phoneme string formed of one or more phonemes, corresponding to a user's speech, and searches a large-vocabulary dictionary for a word having one or more phonemes equal to or similar to those of a phoneme string having a score equal to or higher than a predetermined value. A matching section calculates scores for the word searched for by the extended-word selecting section in addition to a word preliminarily selected by a preliminary word-selecting section. A control section determines a word string as the result of recognition of the speech uttered by the user.
    Type: Grant
    Filed: March 3, 2001
    Date of Patent: November 1, 2005
    Assignee: Sony Corporation
    Inventors: Hiroaki Ogawa, Katsuki Minamino, Yasuharu Asano, Helmut Lucke
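The dictionary search described above, finding words whose phonemes are "equal to or similar to" a recognized phoneme string, can be illustrated with plain edit distance as the similarity measure. This is not the patented method itself; the dictionary entries and phoneme spellings are invented for the example.

```python
# Illustrative sketch: search a dictionary for words within a small
# edit distance of a recognized phoneme string.

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance over phoneme lists."""
    dp = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
          for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return dp[len(a)][len(b)]

def extended_word_search(phonemes, dictionary, max_dist=1):
    """Return dictionary words equal or similar to the recognized phonemes."""
    return [w for w, p in dictionary.items()
            if edit_distance(phonemes, p) <= max_dist]

dictionary = {"cat": ["k", "ae", "t"], "cut": ["k", "ah", "t"],
              "dog": ["d", "ao", "g"]}
print(extended_word_search(["k", "ae", "t"], dictionary))  # ['cat', 'cut']
```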
  • Publication number: 20050075877
    Abstract: A speech recognizing device performs processing efficiently while keeping high speech recognition performance. A matching unit (14) computes the score of a word preliminarily selected by a word preliminary selection unit (13) and determines candidates for the speech recognition result on the basis of the score. A control unit (11) creates word connection relations between the words of a word sequence that is a candidate for the speech recognition result and stores them in a word connection information storage unit (16). A re-evaluation unit (15) corrects the word connection relations serially, and the control unit (11) determines the speech recognition result on the basis of the corrected word connection relations. A word connection relation managing unit (21) limits the times corresponding to the boundaries of words expressed by the word connection relations, and a word connection relation managing unit (22) limits the starting times of the words preliminarily selected by the word preliminary selection unit (13).
    Type: Application
    Filed: November 7, 2001
    Publication date: April 7, 2005
    Inventors: Katsuki Minamino, Yasuharu Asano, Hiroaki Ogawa, Helmut Lucke
  • Patent number: 6862497
    Abstract: There is proposed a method that may be universally used for controlling a man-machine interface unit. A learning sample is used in order at least to derive and/or initialize a target action (t) to be carried out and to lead the user from an optional current status (ec) to an optional desired target status (et) as the final status (ef). This learning sample (l) is formed by a data triple made up of an initial status (ei) before an optional action (a) carried out by the user, a final status (ef) after the action has taken place, and the action taken place (a).
    Type: Grant
    Filed: June 3, 2002
    Date of Patent: March 1, 2005
    Assignees: Sony Corporation, Sony International (Europe) GmbH
    Inventors: Thomas Kemp, Ralf Kompe, Raquel Tato, Masahiro Fujita, Katsuki Minamino, Kenta Kawamoto, Rika Horinaka
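The learning-sample structure described above, a triple of initial status, final status, and action, can be sketched as a small data structure that is queried to derive an action leading from the current status toward a desired target status. All statuses and actions below are invented placeholders.

```python
# Minimal sketch of the (initial status, final status, action) triple
# and of deriving an action from stored samples.
from collections import namedtuple

Sample = namedtuple("Sample", ["initial", "final", "action"])

samples = [
    Sample("idle", "listening", "wake"),
    Sample("listening", "answering", "ask"),
    Sample("answering", "idle", "dismiss"),
]

def derive_action(current, target):
    """Pick an action previously observed to move from current to target."""
    for s in samples:
        if s.initial == current and s.final == target:
            return s.action
    return None  # no sample covers this transition

print(derive_action("idle", "listening"))  # wake
```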
  • Publication number: 20040167779
    Abstract: In order to prevent degradation of speech recognition accuracy due to an unknown word, a dictionary database stores a word dictionary in which are stored, in addition to the words that are the objects of speech recognition, suffixes (sound elements and sound element sequences that form the unknown word) for classifying the unknown word by its part of speech. Based on such a word dictionary, a matching section connects the acoustic models of a sound model database and calculates the score, using the series of features output by a feature extraction section, on the basis of the connected acoustic models. The matching section then selects the series of words representing the speech recognition result on the basis of the score.
    Type: Application
    Filed: February 24, 2004
    Publication date: August 26, 2004
    Applicant: SONY CORPORATION
    Inventors: Helmut Lucke, Katsuki Minamino, Yasuharu Asano, Hiroaki Ogawa
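The dictionary idea above, holding suffix entries alongside ordinary words so that an out-of-vocabulary item can still be classified by its part of speech, can be sketched as follows. The entries and suffixes are invented for illustration; the patent operates on phoneme sequences, whereas this toy works on spellings.

```python
# Hypothetical sketch: known words use the word dictionary; unknown
# words fall back to the longest matching suffix entry for a
# part-of-speech classification.

word_dict = {"run": "verb", "apple": "noun"}
suffix_dict = {"ing": "verb", "ness": "noun", "ly": "adverb"}

def classify(token):
    """Classify a token by dictionary lookup, else by suffix."""
    if token in word_dict:
        return word_dict[token]
    for suffix in sorted(suffix_dict, key=len, reverse=True):  # longest first
        if token.endswith(suffix):
            return suffix_dict[suffix]
    return "unknown"

print(classify("apple"))     # noun (in the word dictionary)
print(classify("blorfing"))  # verb (unknown word, -ing suffix)
print(classify("qix"))       # unknown (no match at all)
```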
  • Publication number: 20040078198
    Abstract: A system and method for an automatic set-up of speech recognition engines may include a speech recognizer configured to perform speech recognition procedures to identify input speech data according to one or more operating parameters. A merit manager may be utilized to automatically calculate merit values corresponding to the foregoing recognition procedures. These merit values may incorporate recognition accuracy information, recognition speed information, and a user-specified weighting factor that shifts the relative effect of the recognition accuracy information and the recognition speed information on the merit values. The merit manager may then automatically perform a merit value optimization procedure to select operating parameters that correspond to an optimal one of the merit values.
    Type: Application
    Filed: March 31, 2003
    Publication date: April 22, 2004
    Inventors: Gustavo Hernandez-Abrego, Xavier Menendez-Pidal, Thomas Kemp, Katsuki Minamino, Helmut Lucke
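The merit calculation described above, a single figure of merit combining recognition accuracy and speed under a user-specified weight, can be sketched with a linear combination. The linear form, the parameter names, and the trial values are assumptions for illustration; the abstract does not fix a particular formula.

```python
# Minimal sketch of a weighted merit value and parameter selection.

def merit(accuracy, speed, weight):
    """weight=1.0 -> accuracy only; weight=0.0 -> speed only."""
    return weight * accuracy + (1.0 - weight) * speed

def best_parameters(results, weight):
    """Pick the operating-parameter set with the highest merit value."""
    return max(results, key=lambda r: merit(r["accuracy"], r["speed"], weight))

trials = [
    {"params": "wide-beam", "accuracy": 0.95, "speed": 0.40},
    {"params": "narrow-beam", "accuracy": 0.88, "speed": 0.90},
]
print(best_parameters(trials, weight=0.9)["params"])  # accuracy-heavy: wide-beam
print(best_parameters(trials, weight=0.2)["params"])  # speed-heavy: narrow-beam
```

Shifting the weight flips which parameter set wins, which is exactly the trade-off the user-specified weighting factor is meant to expose.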
  • Patent number: 6718232
    Abstract: A robot apparatus changes the emotion in a feeling part (130) based on the information acquired by a perception part (120), so as to manifest information-acquisition behavior as autonomous behavior. The robot apparatus includes a behavior control part for causing the robot apparatus to manifest language acquisition behavior, and a meaning acquisition part. The robot apparatus also includes a control part for performing behavior control that points at its object of learning. The robot apparatus causes changes in internal states, which are ascribable to the object, to be stored in a memory part in association with the object.
    Type: Grant
    Filed: September 24, 2002
    Date of Patent: April 6, 2004
    Assignee: Sony Corporation
    Inventors: Masahiro Fujita, Tsuyoshi Takagi, Rika Horinaka, Jun Yokono, Gabriel Costa, Hideki Shimomura, Katsuki Minamino
  • Publication number: 20040039483
    Abstract: There is proposed a method that may be universally used for controlling a man-machine interface unit. A learning sample is used in order at least to derive and/or initialize a target action (t) to be carried out and to lead the user from an optional current status (ec) to an optional desired target status (et) as the final status (ef). This learning sample (l) is formed by a data triple made up of an initial status (ei) before an optional action (a) carried out by the user, a final status (ef) after the action has taken place, and the action taken place (a).
    Type: Application
    Filed: June 16, 2003
    Publication date: February 26, 2004
    Inventors: Thomas Kemp, Ralf Kompe, Raquel Tato, Masahiro Fujita, Katsuki Minamino, Kenta Kawamoto, Rika Horinaka
  • Publication number: 20030152261
    Abstract: A plural number of letters or characters inferred from the results of letter/character recognition of an image photographed by a CCD camera (20), a plural number of kana readings inferred from those letters or characters, and the pronunciations corresponding to the kana readings are generated in a pronunciation information generating unit (150). The plural readings obtained are matched against the user's pronunciation, acquired by a microphone (23), to specify one kana reading and one pronunciation (reading) from among the plural generated candidates.
    Type: Application
    Filed: December 31, 2002
    Publication date: August 14, 2003
    Inventors: Atsuo Hiroe, Katsuki Minamino, Kenta Kawamoto, Kohtaro Sabe, Takeshi Ohashi
  • Publication number: 20030060930
    Abstract: A robot apparatus changes the emotion in a feeling part (130) based on the information acquired by a perception part (120), so as to manifest information-acquisition behavior as autonomous behavior. The robot apparatus includes a behavior control part for causing the robot apparatus to manifest language acquisition behavior, and a meaning acquisition part. The robot apparatus also includes a control part for performing behavior control that points at its object of learning. The robot apparatus causes changes in internal states, which are ascribable to the object, to be stored in a memory part in association with the object.
    Type: Application
    Filed: September 24, 2002
    Publication date: March 27, 2003
    Inventors: Masahiro Fujita, Tsuyoshi Takagi, Rika Horinaka, Jun Yokono, Gabriel Costa, Hideki Shimomura, Katsuki Minamino
  • Publication number: 20020173958
    Abstract: A speech recognition apparatus in which the accuracy of speech recognition is improved while resource usage is prevented from increasing. A word that is probable as the result of the speech recognition is selected on the basis of an acoustic score and a linguistic score, while word selection is also performed on the basis of measures different from the acoustic score, such as the number of phonemes being small, the part of speech being a pre-set one, inclusion in past results of speech recognition, or the linguistic score being not less than a pre-set value. The words so selected are subjected to matching processing.
    Type: Application
    Filed: May 10, 2002
    Publication date: November 21, 2002
    Inventors: Yasuharu Asano, Katsuki Minamino, Hiroaki Ogawa, Helmut Lucke
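The augmented selection described above, passing words either on combined acoustic and linguistic score or on one of the alternate measures (few phonemes, a whitelisted part of speech, appearance in past recognition results), can be sketched as a simple filter. Thresholds, lexicon entries, and scores are all invented for illustration.

```python
# Sketch of word selection by score OR by alternate measures.

def select(words, past_results, score_threshold=1.0,
           max_phonemes=2, allowed_pos={"particle"}):
    """Keep words passing any of the selection criteria."""
    kept = []
    for w in words:
        by_score = w["acoustic"] + w["linguistic"] >= score_threshold
        few_phonemes = len(w["phonemes"]) <= max_phonemes
        pos_ok = w["pos"] in allowed_pos
        seen_before = w["text"] in past_results
        if by_score or few_phonemes or pos_ok or seen_before:
            kept.append(w["text"])
    return kept

words = [
    {"text": "recognition", "phonemes": list("rekagnishan"), "pos": "noun",
     "acoustic": 0.7, "linguistic": 0.5},   # passes on combined score
    {"text": "wa", "phonemes": ["w", "a"], "pos": "particle",
     "acoustic": 0.2, "linguistic": 0.1},   # short particle: kept despite low score
    {"text": "banana", "phonemes": ["b", "ah", "n", "ae", "n", "ah"],
     "pos": "noun", "acoustic": 0.3, "linguistic": 0.2},  # kept only if seen before
]
print(select(words, past_results=set()))        # ['recognition', 'wa']
print(select(words, past_results={"banana"}))   # ['recognition', 'wa', 'banana']
```

The point of the alternate measures is visible in `wa` and `banana`: both would be dropped by score alone, yet survive via a non-acoustic criterion.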
  • Publication number: 20010053974
    Abstract: In order to prevent degradation of speech recognition accuracy due to an unknown word, a dictionary database stores a word dictionary in which are stored, in addition to the words that are the objects of speech recognition, suffixes (sound elements and sound element sequences that form the unknown word) for classifying the unknown word by its part of speech. Based on such a word dictionary, a matching section connects the acoustic models of a sound model database and calculates the score, using the series of features output by a feature extraction section, on the basis of the connected acoustic models. The matching section then selects the series of words representing the speech recognition result on the basis of the score.
    Type: Application
    Filed: March 12, 2001
    Publication date: December 20, 2001
    Inventors: Helmut Lucke, Katsuki Minamino, Yasuharu Asano, Hiroaki Ogawa
  • Publication number: 20010037200
    Abstract: An extended-word selecting section calculates a score for a phoneme string formed of one or more phonemes, corresponding to the user's voice, and searches a large-vocabulary dictionary for a word having one or more phonemes equal to or similar to those of a phoneme string having a score equal to or higher than a predetermined value. A matching section calculates scores for the word searched for by the extended-word selecting section in addition to a word preliminarily selected by a preliminary word-selecting section. A control section determines a word string serving as the result of recognition of the voice.
    Type: Application
    Filed: March 3, 2001
    Publication date: November 1, 2001
    Inventors: Hiroaki Ogawa, Katsuki Minamino, Yasuharu Asano, Helmut Lucke
  • Publication number: 20010021909
    Abstract: A conversation processing apparatus and method determines whether to change the topic. If the determination is affirmative, the degree of association between a present topic being discussed and a candidate topic stored in a memory is computed with reference to a degree of association table. Based on the computation result, a topic with the highest degree of association is selected as a subsequent topic. The topic is changed from the present topic to the subsequent topic. The degree of association table used to select the subsequent topic is updated.
    Type: Application
    Filed: December 27, 2000
    Publication date: September 13, 2001
    Inventors: Hideki Shimomura, Takashi Toyoda, Katsuki Minamino, Osamu Hanagata, Hiroki Saijo, Toshiya Ogura
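The topic-selection step described above can be sketched directly: given the present topic and a degree-of-association table, choose the candidate with the highest association, then update the table. Topics, association values, and the additive update rule are invented for the example; the abstract does not specify how the table is updated.

```python
# Hypothetical sketch of association-based topic change with a table update.

association = {
    ("weather", "travel"): 0.8,
    ("weather", "sports"): 0.5,
    ("weather", "cooking"): 0.2,
}

def next_topic(present, candidates):
    """Select the candidate most associated with the present topic."""
    return max(candidates, key=lambda t: association.get((present, t), 0.0))

def update_table(present, chosen, bonus=0.1):
    """Strengthen the association of the transition actually taken."""
    key = (present, chosen)
    association[key] = association.get(key, 0.0) + bonus

topic = next_topic("weather", ["travel", "sports", "cooking"])
update_table("weather", topic)
print(topic)  # travel
```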
  • Publication number: 20010020226
    Abstract: A preliminary word-selecting section selects one or more words following words which have been obtained in a word string serving as a candidate for a result of voice recognition; and a matching section calculates acoustic or linguistic scores for the selected words, and forms a word string serving as a candidate for a result of voice recognition according to the scores. A control section generates word-connection relationships between words in the word string serving as a candidate for a result of voice recognition, sends them to a word-connection-information storage section, and stores them in it. A re-evaluation section corrects the word-connection relationships stored in the word-connection-information storage section (16), and the control section determines a word string serving as the result of voice recognition according to the corrected word-connection relationships.
    Type: Application
    Filed: February 26, 2001
    Publication date: September 6, 2001
    Inventors: Katsuki Minamino, Yasuharu Asano, Hiroaki Ogawa, Helmut Lucke
  • Patent number: 6253174
    Abstract: A speech recognition system for use in an automobile navigation system performs speech processing for recognizing speech or spoken words that correspond to a name of a place and a word designating a desired operation of the navigation system. When a new audio signal is input during speech recognition processing of a previously input audio signal, by holding a talk switch down for a fixed time period, the processing of the previously input audio signal is canceled, and the new audio signal immediately undergoes speech recognition processing without requiring any continuation of the processing of the previously input audio signal. The speech recognition system also determines whether an input audio signal has been reinputted within a predetermined amount of time from when the audio signal was previously inputted.
    Type: Grant
    Filed: July 1, 1998
    Date of Patent: June 26, 2001
    Assignee: Sony Corporation
    Inventors: Kazuo Ishii, Eiji Yamamoto, Miyuki Tanaka, Hiroshi Kakuda, Yasuharu Asano, Hiroaki Ogawa, Masanori Omote, Katsuki Minamino
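The cancellation behavior described above, where a new utterance discards a still-pending recognition job and is processed immediately, can be sketched with a toy recognizer. The class, its methods, and the utterances are invented placeholders; real cancellation would interrupt an in-progress decoding thread.

```python
# Toy sketch: new input arriving mid-recognition cancels the pending job.

class Recognizer:
    def __init__(self):
        self.pending = None
        self.log = []

    def push_audio(self, utterance):
        """Accept new audio; discard any job still being processed."""
        if self.pending is not None:
            self.log.append(f"cancelled: {self.pending}")
        self.pending = utterance

    def finish(self):
        """Complete recognition of whatever job is still pending."""
        self.log.append(f"recognized: {self.pending}")
        self.pending = None

r = Recognizer()
r.push_audio("Tokyo")
r.push_audio("Kyoto")  # arrives before "Tokyo" finishes, so "Tokyo" is cancelled
r.finish()
print(r.log)  # ['cancelled: Tokyo', 'recognized: Kyoto']
```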
  • Patent number: 6161093
    Abstract: A book database stores at least phonetic signal information, including phoneme information and rhythm information, as document data. A central system transmits phonetic signal information stored in the book database to a terminal; the terminal receives the phonetic signal information, speech synthesis is then carried out at the terminal, and the document is recited via synthesized sounds.
    Type: Grant
    Filed: October 1, 1998
    Date of Patent: December 12, 2000
    Assignee: Sony Corporation
    Inventors: Masao Watari, Makoto Akabane, Tetsuya Kagami, Kazuo Ishii, Yusuke Iwahashi, Yasuhiko Kato, Hiroaki Ogawa, Masanori Omote, Kazuo Watanabe, Katsuki Minamino, Yasuharu Asano
  • Patent number: 6067521
    Abstract: A speech recognition apparatus and method for use in a car navigation system performs speech processing for recognizing speech or spoken words referring to a specified region. An input audio signal or vocalized speech undergoes speech processing to determine and recognize the region specified in the speech. Data corresponding to the specified region is converted to coordinate position data for the region, and a map of the vicinity of the converted coordinate position data is displayed. When a new audio signal is input during speech recognition processing of a previously-input audio signal, the processing of the previously-input audio signal is interrupted and the new audio signal undergoes speech recognition processing. Accordingly, a high-efficiency operation of the car navigation system may be performed without interfering with the driving of the car.
    Type: Grant
    Filed: October 10, 1996
    Date of Patent: May 23, 2000
    Assignee: Sony Corporation
    Inventors: Kazuo Ishii, Eiji Yamamoto, Miyuki Tanaka, Hiroshi Kakuda, Yasuharu Asano, Hiroaki Ogawa, Masanori Omote, Katsuki Minamino