Assessment Or Evaluation Of Speech Recognition Systems (EPO) Patents (Class 704/E15.002)
  • Patent number: 11947629
    Abstract: A computer system includes processor hardware configured to execute instructions that include joining at least a portion of multiple call transcription data entries with at least a portion of multiple agent call log data entries according to timestamps associated with the entries to generate a set of joined call data entries, and validating each joined call data entry by determining whether a transcribed entity name matches entity identifier information associated with the agent call log data entry. The instructions also include preprocessing the joined call data entry according to word confidence score data entries associated with the call transcription data entry to generate preprocessed text, performing natural language processing vectorization on the preprocessed text to generate an input vector, and supplying the input vector to an unsupervised machine learning model to assign one of the model's output topic classifications to the joined call data entry associated with the input vector.
    Type: Grant
    Filed: September 1, 2021
    Date of Patent: April 2, 2024
    Assignee: Evernorth Strategic Development, Inc.
    Inventors: Akash Dwivedi, Christopher R. Markson, Pritesh J. Shah
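A minimal sketch of the pipeline described in the abstract of 11947629 above: join transcripts with agent call logs by timestamp, validate the entity name, drop low-confidence words, vectorize, and assign a topic with an unsupervised model. The field names, the timestamp and confidence thresholds, and the choice of TF-IDF plus KMeans are assumptions for illustration, not the patent's actual implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

transcripts = [  # call transcription data entries (hypothetical)
    {"ts": 1000, "entity": "acme pharmacy", "words": [("refill", 0.93), ("order", 0.41), ("status", 0.88)]},
    {"ts": 2000, "entity": "acme pharmacy", "words": [("billing", 0.95), ("question", 0.90)]},
]
call_logs = [  # agent call log data entries (hypothetical)
    {"ts": 1002, "entity_id": "ACME PHARMACY"},
    {"ts": 2001, "entity_id": "ACME PHARMACY"},
]

def join_and_validate(transcripts, call_logs, max_skew=5):
    joined = []
    for t in transcripts:
        for log in call_logs:
            if abs(t["ts"] - log["ts"]) <= max_skew:                    # join on timestamps
                if t["entity"].lower() == log["entity_id"].lower():     # validate entity match
                    joined.append({**t, **log})
    return joined

def preprocess(entry, min_conf=0.5):
    # keep only words whose confidence score clears a threshold
    return " ".join(w for w, c in entry["words"] if c >= min_conf)

joined = join_and_validate(transcripts, call_logs)
texts = [preprocess(e) for e in joined]
vectors = TfidfVectorizer().fit_transform(texts)                        # NLP vectorization
topics = KMeans(n_clusters=2, n_init=10).fit_predict(vectors)           # unsupervised topic assignment
for entry, topic in zip(joined, topics):
    print(entry["ts"], "-> topic", topic)
```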
  • Patent number: 11763834
    Abstract: Features are extracted from an observed speech signal that includes speech of multiple speakers, among them a target speaker. A mask for extracting the target speaker's speech is calculated based on the features of the observed speech signal and on a speech signal of the target speaker serving as adaptation data. The speech signal of the target speaker is then calculated from the observed speech signal based on the mask. In this way, speech of the target speaker can be extracted from observed speech that includes speech of multiple speakers.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: September 19, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Takuya Higuchi, Tomohiro Nakatani
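A toy numpy sketch of the mask-and-apply idea in the abstract above: a time-frequency mask is estimated from the observed mixture and adaptation audio of the target speaker, then applied to the mixture. The similarity-based mask below is a crude stand-in for the learned mask estimator the abstract implies; window size and hop are arbitrary assumptions.

```python
import numpy as np

def spectrogram(x, n_fft=256, hop=128):
    frames = [x[i:i + n_fft] for i in range(0, len(x) - n_fft, hop)]
    return np.abs(np.fft.rfft(np.array(frames) * np.hanning(n_fft), axis=1))

rng = np.random.default_rng(0)
mixture = rng.standard_normal(8000)        # observed speech of multiple speakers
adaptation = rng.standard_normal(8000)     # enrollment speech of the target speaker

mix_spec = spectrogram(mixture)
target_profile = spectrogram(adaptation).mean(axis=0)   # average target spectrum

# Mask: per time-frequency bin, how much the mixture resembles the target profile.
mask = np.clip(target_profile / (mix_spec + 1e-8), 0.0, 1.0)

extracted_spec = mask * mix_spec           # masked spectrogram of the target speaker
print(extracted_spec.shape)
```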
  • Publication number: 20130197916
    Abstract: According to one embodiment, a terminal device with a main body includes: a sound input module configured to receive a voice, convert the voice into a digital signal, and output the digital signal; a state detecting module having an acceleration sensor, configured to detect a movement and/or a state of the main body and output a detection result; and an executing module, capable of executing plural speech recognition response processes, configured to execute one of the speech recognition response processes on the digital signal according to the detection result from the state detecting module.
    Type: Application
    Filed: November 7, 2012
    Publication date: August 1, 2013
    Inventor: Motonobu Sugiura
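An illustrative sketch of the selection step in the entry above: an acceleration reading is mapped to a device state, which then picks one of several response processes. The threshold and the two response processes are purely hypothetical.

```python
def detect_state(accel_xyz):
    x, y, z = accel_xyz
    magnitude = (x * x + y * y + z * z) ** 0.5
    if magnitude > 15.0:          # hypothetical threshold: device is moving
        return "in_motion"
    return "stationary"

def respond(digital_signal, state):
    # Each branch stands in for a different "speech recognition response process".
    if state == "in_motion":
        return f"short confirmation for: {digital_signal}"
    return f"full dialogue response for: {digital_signal}"

print(respond("play music", detect_state((0.1, 9.8, 12.0))))
```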
  • Publication number: 20130117020
    Abstract: Disclosed are a personalized advertisement device and a personalized advertisement exposure method based on speech recognition SMS services. The invention maximizes the effect of advertisement by grasping the user's intention, emotional state, and positional information from speech data uttered by the user during the process of providing speech recognition SMS services, configuring advertisements based thereon, and exposing the configured advertisements to the user.
    Type: Application
    Filed: September 5, 2012
    Publication date: May 9, 2013
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Hoon CHUNG, Jeon Gue Park, Hyung Bae Jeon, Ki Young Park, Yun Keun Lee, Sang Kyu Park
  • Publication number: 20120179460
    Abstract: A method for testing an automated interactive media system. The method can include establishing a communication session with the automated interactive media system. In response to receiving control and/or media information from the automated interactive media system, pre-recorded control and/or media information can be propagated to the automated interactive media system. The pre-recorded control and/or media information can be recorded in real time.
    Type: Application
    Filed: March 17, 2012
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: WILLIAM V. DA PALMA, BRIEN H. MUSCHETT
  • Publication number: 20120078622
    Abstract: According to one embodiment, a spoken dialogue apparatus includes a detection unit configured to detect speech of a user; a recognition unit configured to recognize the speech; an output unit configured to output a response voice corresponding to the result of speech recognition; an estimation unit configured to estimate the probability variation of a barge-in utterance, that is, the time variation of the probability that the user will interrupt with a barge-in utterance while the response voice is being output; and a control unit configured to determine whether to adopt the barge-in utterance based on the probability variation.
    Type: Application
    Filed: March 18, 2011
    Publication date: March 29, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kenji Iwata, Takehide Yano
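A hedged sketch of the decision in the abstract above: a time-varying probability that a genuine barge-in occurs while the response voice is playing, consulted when deciding whether an interrupting utterance should be adopted. The linear ramp and the threshold are assumptions for illustration only.

```python
def barge_in_probability(t, response_duration):
    # Assume barge-ins grow more likely as the prompt plays out.
    return min(1.0, t / response_duration)

def accept_barge_in(t, response_duration, threshold=0.4):
    return barge_in_probability(t, response_duration) >= threshold

for t in (0.5, 2.0, 4.5):
    print(t, accept_barge_in(t, response_duration=5.0))
```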
  • Publication number: 20110191105
    Abstract: Computer-implemented systems and methods are provided for identifying language that would be considered obscene or otherwise offensive to a user or proprietor of a system. A plurality of offensive words are received, where each offensive word is associated with a severity score identifying the offensiveness of that word. A string of words is received. A distance between a candidate word and each offensive word in the plurality of offensive words is calculated, and a plurality of offensiveness scores for the candidate word are calculated, each offensiveness score based on the calculated distance between the candidate word and the offensive word and the severity score of the offensive word. A determination is made as to whether the candidate word is an offender word, where the candidate word is deemed to be an offender word when the highest offensiveness score in the plurality of offensiveness scores exceeds an offensiveness threshold value.
    Type: Application
    Filed: January 29, 2010
    Publication date: August 4, 2011
    Inventor: Joseph L. Spears
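A small sketch of the scoring scheme in the abstract above, assuming Levenshtein distance and a score of severity / (1 + distance); the word list, threshold, and exact formula are illustrative assumptions, not the patent's.

```python
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

offensive_words = {"darn": 0.3, "heck": 0.2}   # word -> severity score (hypothetical)
THRESHOLD = 0.1                                 # offensiveness threshold (hypothetical)

def is_offender(candidate):
    # Highest offensiveness score across the offensive-word list decides the outcome.
    scores = [sev / (1 + levenshtein(candidate, w)) for w, sev in offensive_words.items()]
    return max(scores) > THRESHOLD

for word in "that darn recognizer".split():
    print(word, is_offender(word))
```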
  • Publication number: 20110099012
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for estimating reliability of alternate speech recognition hypotheses. A system configured to practice the method receives an N-best list of speech recognition hypotheses and features describing the N-best list, determines a first probability of correctness for each hypothesis in the N-best list based on the received features, determines a second probability that the N-best list does not contain a correct hypothesis, and uses the first probability and the second probability in a spoken dialog. The features can describe properties of at least one of a lattice, a word confusion network, and a garbage model. In one aspect, the N-best lists are not reordered according to reranking scores. The determination of the first probability of correctness can include a first stage of training a probabilistic model and a second stage of distributing mass over items in a tail of the N-best list.
    Type: Application
    Filed: October 23, 2009
    Publication date: April 28, 2011
    Applicant: AT&T Intellectual Property I, L.P.
    Inventors: Jason WILLIAMS, Suhrid BALAKRISHNAN
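A toy sketch of the two quantities named in the abstract above: a probability of correctness for each N-best hypothesis and a probability that no hypothesis is correct. A softmax over recognizer scores with an extra "none correct" slot stands in for the trained probabilistic model; the scores are invented.

```python
import math

def nbest_reliability(scored_hypotheses, none_score=0.0):
    labels = [h for h, _ in scored_hypotheses] + ["<none correct>"]
    scores = [s for _, s in scored_hypotheses] + [none_score]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}

nbest = [("call john smith", 2.1), ("call jon smith", 1.4), ("hall john smith", -0.3)]
for hyp, p in nbest_reliability(nbest).items():
    print(f"{hyp}: {p:.2f}")
```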
  • Publication number: 20100292988
    Abstract: A speech recognition system samples speech signals having the same meaning, and obtains frequency spectrum images of the speech signals. Training objects are obtained by modifying the frequency spectrum images to be the same width. The speech recognition system obtains specific data of the speech signals by analyzing the training objects. The specific data is linked with the meaning of the speech signals. The specific data may include probability values representing probabilities that the training objects appear at different points in an image area of the training objects. A speech command may be sampled, and a frequency spectrum image of the speech command is modified to be the same width as the training objects. The speech recognition system can determine a meaning of a speech command by determining a matching degree of the modified frequency spectrum image of the speech command and the specific data of the speech signals.
    Type: Application
    Filed: August 10, 2009
    Publication date: November 18, 2010
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventors: HOU-HSIEN LEE, CHANG-JUNG LEE, CHIH-PING LO
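A rough numpy sketch of the width-normalized spectrum-image matching described above: training images are stretched to a common width, a per-point probability map ("specific data") is accumulated, and a command image is scored against it. The interpolation, the image sizes, and the matching-degree formula are assumptions.

```python
import numpy as np

def normalize_width(spec, width=20):
    # Stretch/compress each frequency row to a fixed number of time columns.
    idx = np.linspace(0, spec.shape[1] - 1, width)
    return np.stack([np.interp(idx, np.arange(spec.shape[1]), row) for row in spec])

rng = np.random.default_rng(1)
training = [normalize_width(rng.random((32, n)) > 0.5) for n in (18, 25, 22)]

# "Specific data": probability that each point is active across the training objects.
specific_data = np.mean(training, axis=0)

command = normalize_width(rng.random((32, 21)) > 0.5)
matching_degree = np.mean(command * specific_data + (1 - command) * (1 - specific_data))
print(f"matching degree: {matching_degree:.2f}")
```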
  • Publication number: 20100094626
    Abstract: It is an object of the present invention to provide a method and apparatus for locating a keyword in speech, and a speech recognition system. The method includes the steps of: extracting feature parameters from the frames constituting the recognition target speech to form a feature parameter vector sequence that represents the recognition target speech; normalizing the feature parameter vector sequence with a codebook containing a plurality of codebook vectors to obtain a feature trace of the recognition target speech in a vector space; and specifying the position of a keyword by matching prestored keyword template traces against the feature trace. According to the present invention, a keyword template trace and the feature space trace of a target speech are drawn in accordance with an identical codebook, which makes resampling unnecessary when performing linear movement matching of speech wave frames having similar phonological feature structures.
    Type: Application
    Filed: September 27, 2007
    Publication date: April 15, 2010
    Inventors: Fengqin Li, Yadong Wu, Qinqtao Yang, Chen Chen
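An illustrative numpy sketch of the codebook-normalized trace matching above: frame features from the target speech and from the keyword template are both mapped onto the same codebook, and the keyword position is the offset with the smallest trace distance. The codebook size, feature dimension, and Euclidean distance are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
codebook = rng.standard_normal((16, 8))          # 16 codebook vectors, 8-dim features

def trace(features):
    # Replace every frame feature by its nearest codebook vector.
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return codebook[np.argmin(dists, axis=1)]

speech = rng.standard_normal((50, 8))             # feature parameter vector sequence
keyword_template = speech[20:28]                  # pretend the keyword spans frames 20-27

speech_trace = trace(speech)
template_trace = trace(keyword_template)

best = min(range(len(speech_trace) - len(template_trace) + 1),
           key=lambda i: np.linalg.norm(speech_trace[i:i + len(template_trace)] - template_trace))
print("keyword located at frame", best)
```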
  • Publication number: 20100030400
    Abstract: A system and method that implement automatic speech recognition (ASR) and text-to-speech (TTS) programs to permit pilots, co-pilots, and other persons to more quickly and easily perform control and monitoring tasks on aircraft. The system may be used to automatically change the frequency of an aircraft radio when a pilot or co-pilot is instructed to do so by air traffic control (ATC).
    Type: Application
    Filed: July 13, 2006
    Publication date: February 4, 2010
    Applicant: GARMIN INTERNATIONAL, INC.
    Inventors: Joseph L. Komer, Joseph E. Gepner, Charles Gregory Sherwood
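A hypothetical sketch of the radio-frequency use case above: recognized ATC text is scanned for a frequency instruction and the standby frequency is updated. The regex, the phrase format, and the radio interface are illustrative assumptions only.

```python
import re

def handle_atc_text(text, radio):
    match = re.search(r"contact .*? on (\d{3}\.\d{1,3})", text, re.IGNORECASE)
    if match:
        radio["standby_mhz"] = float(match.group(1))   # tune the standby frequency
    return radio

radio = {"active_mhz": 121.5, "standby_mhz": 119.1}
print(handle_atc_text("Contact Wichita approach on 125.15", radio))
```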
  • Publication number: 20100004931
    Abstract: An apparatus is provided for speech utterance verification. The apparatus is configured to compare a first prosody component from a recorded speech utterance with a second prosody component for a reference speech utterance, and to determine a prosodic verification evaluation for the recorded speech utterance based on the comparison.
    Type: Application
    Filed: September 15, 2006
    Publication date: January 7, 2010
    Inventors: Bin Ma, Haizhou Li, Minghui Dong
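A toy sketch of the prosodic comparison above: a prosody component (here a pitch contour) of the recorded utterance is compared against the reference and the comparison is turned into a score. Correlation as the metric and the contour values are assumptions.

```python
import numpy as np

def prosody_score(recorded_f0, reference_f0):
    n = min(len(recorded_f0), len(reference_f0))
    a, b = np.asarray(recorded_f0[:n]), np.asarray(reference_f0[:n])
    return float(np.corrcoef(a, b)[0, 1])        # 1.0 = identical contour shape

reference = [120, 135, 150, 140, 125, 110]        # reference pitch contour (Hz)
recorded = [118, 133, 152, 143, 122, 112]
print(f"prosodic verification score: {prosody_score(recorded, reference):.2f}")
```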
  • Publication number: 20090070111
    Abstract: A method, system, and computer program product for spoken language grammar evaluation are provided. The method includes playing a recorded question to a candidate, recording a spoken answer from the candidate, and converting the spoken answer into text. The method further includes comparing the text to a grammar database, calculating a spoken language grammar evaluation score based on the comparison, and outputting the spoken language grammar evaluation score.
    Type: Application
    Filed: March 26, 2008
    Publication date: March 12, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Rajni Bajaj, Sreeram V. Balakrishnan, Mridula Bhandari, Lyndon J. D'Silva, Sandeep Jindal, Pooja Kumar, Nitendra Rajput, Ashish Verma
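A minimal sketch of the evaluation loop in the abstract above: the candidate's transcribed answer is checked against a grammar database and a score is derived from the number of violations. The tiny rule list and the scoring formula are hypothetical.

```python
GRAMMAR_RULES = [
    ("i has", "i have"),        # (incorrect pattern, correction) - illustrative only
    ("he go ", "he goes "),
    ("a apples", "apples"),
]

def grammar_score(answer_text):
    text = " " + answer_text.lower() + " "
    violations = sum(1 for bad, _ in GRAMMAR_RULES if bad in text)
    # Normalize by answer length so longer answers are not penalized unfairly.
    return max(0.0, 1.0 - violations / max(len(answer_text.split()), 1))

print(grammar_score("I has visited the museum and he go there too"))
```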
  • Publication number: 20080249773
    Abstract: A method and system for automatically generating a scoring model for scoring a speech sample are disclosed. One or more training speech samples are received in response to a prompt. One or more speech features are determined for each of the training speech samples. A scoring model is then generated based on the speech features. At least one of the training speech samples may be a high entropy speech sample. An evaluation speech sample is received and a score is assigned to the evaluation speech sample using the scoring model. The evaluation speech sample may be a high entropy speech sample.
    Type: Application
    Filed: June 16, 2008
    Publication date: October 9, 2008
    Inventors: Isaac Bejar, Klaus Zechner
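A hedged sketch of "generate a scoring model from features of training samples, then score an evaluation sample": a least-squares linear model over made-up features (speaking rate, pause ratio) stands in for the patent's scoring model.

```python
import numpy as np

# Hypothetical training data: [speaking_rate, pause_ratio] -> human-assigned score
features = np.array([[3.2, 0.10], [2.1, 0.30], [4.0, 0.05], [1.8, 0.40]])
scores = np.array([4.0, 2.5, 4.5, 2.0])

X = np.hstack([features, np.ones((len(features), 1))])        # add bias term
weights, *_ = np.linalg.lstsq(X, scores, rcond=None)           # the "scoring model"

evaluation_sample = np.array([3.5, 0.12, 1.0])                 # new sample + bias
print(f"predicted score: {evaluation_sample @ weights:.2f}")
```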
  • Publication number: 20080215320
    Abstract: Disclosed are an apparatus and method for reducing recognition errors through context relations among multiple dialogue turns. The apparatus includes a rule set storage unit having a rule set containing one or more rules, an evolutionary rule generation module connected to the rule set storage unit, and a rule trigger unit connected to the rule set storage unit. The rule set uses the dialogue turn as the unit of information described by each rule. The method analyzes a dialogue history through an evolutionary massive parallelism approach to obtain a rule set describing the context relations among dialogue turns. Based on the rule set and the recognition result of an ASR system, it reevaluates the recognition result and measures the confidence of the reevaluated result. After each successful dialogue turn, the rule set is dynamically adapted.
    Type: Application
    Filed: August 1, 2007
    Publication date: September 4, 2008
    Inventors: Hsu-Chih Wu, Ching-Hsien Lee
  • Publication number: 20080120104
    Abstract: A method of transmitting end-of-speech marks in a distributed speech recognition system operating in a discontinuous transmission mode, in which system speech segments (30, 40) are transmitted, followed by periods (34) of silence, each speech segment (30, 40) terminating with an end-of-speech mark (31, 41). The end-of-speech mark (31) is retransmitted continually (31a, 31b, 31c, 31d) throughout the duration of the period of silence (34) following said speech segment (30).
    Type: Application
    Filed: December 28, 2005
    Publication date: May 22, 2008
    Inventor: Alexandre Ferrieux
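A small sketch of the retransmission scheme above: after a speech segment ends, its end-of-speech mark is resent at intervals for as long as the silence lasts, so a receiver that missed the original mark still sees one. The frame granularity and resend interval are assumptions.

```python
def transmit(speech_frames, silence_frames, resend_every=4):
    sent = list(speech_frames) + ["EOS"]            # segment terminated by the mark
    for i in range(silence_frames):
        if i % resend_every == 0:
            sent.append("EOS")                       # retransmit during the silence period
        else:
            sent.append(None)                        # discontinuous transmission: nothing sent
    return sent

print(transmit(["f1", "f2", "f3"], silence_frames=10))
```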
  • Publication number: 20080010065
    Abstract: A method and apparatus for speaker recognition are provided. In one embodiment, a method determines whether a given speech signal is produced by an alleged speaker for whom a plurality of statistical models (including at least one support vector machine) have been produced based on a previous speech signal received from that speaker. The method includes receiving the given speech signal, which represents an utterance made by a speaker claiming to be the alleged speaker; scoring the given speech signal using at least two modeling systems, at least one of which is a support vector machine; combining the scores produced by the modeling systems with equal weights to produce a final score; and determining, in accordance with the final score, whether the speaker is likely the alleged speaker.
    Type: Application
    Filed: June 5, 2007
    Publication date: January 10, 2008
    Inventors: Harry BRATT, Luciana Ferrer, Martin Graciarena, Sachin Kajarekar, Elizabeth Shriberg, Mustafa Sonmez, Andreas Stolcke, Gokhan Tur, Anand Venkataraman
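A sketch of the equal-weight score fusion described in the abstract above: two modeling systems each score the utterance, the scores are averaged, and the claim is accepted if the fused score clears a threshold. The individual scorers and the threshold are placeholders, not the SVM and other models the patent describes.

```python
def svm_like_score(features):           # stand-in for the support-vector-machine system
    return 0.8 * features["pitch_match"] - 0.2

def gmm_like_score(features):           # stand-in for a second modeling system
    return 1.1 * features["spectral_match"] - 0.4

def verify(features, threshold=0.5):
    # Combine the two scores with equal weights to produce the final score.
    fused = 0.5 * svm_like_score(features) + 0.5 * gmm_like_score(features)
    return fused, fused >= threshold

print(verify({"pitch_match": 0.9, "spectral_match": 0.85}))
```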