Assessment Or Evaluation Of Speech Recognition Systems (EPO) Patents (Class 704/E15.002)
-
Patent number: 12112097
Abstract: A framework for generating and presenting verbal command suggestions to facilitate discoverability of commands capable of being understood and to support users in exploring available commands. A target associated with a direct-manipulation input is received from a user via a multimodal user interface. A set of operations relevant to the target is selected, and verbal command suggestions relevant to the selected set of operations and the determined target are generated. At least a portion of the generated verbal command suggestions is provided for presentation in association with the multimodal user interface in one of three interface variants: one that presents command suggestions as a list, one that presents command suggestions using contextual overlay windows, and one that presents command suggestions embedded within the interface. Each of the proposed interface variants facilitates user awareness of verbal commands that are capable of being executed and teaches users how available verbal commands can be invoked.
Type: Grant
Filed: January 9, 2023
Date of Patent: October 8, 2024
Assignee: Adobe Inc.
Inventors: Lubomira Dontcheva, Arjun Srinivasan, Seth John Walker, Eytan Adar
-
Patent number: 11947629
Abstract: A computer system includes processor hardware configured to execute instructions that include joining at least a portion of multiple call transcription data entries with at least a portion of multiple agent call log data entries according to timestamps associated with the entries to generate a set of joined call data entries, and validating the joined call data entry by determining whether a transcribed entity name matches with entity identifier information associated with the agent call log data entry. The instructions include preprocessing the joined call data entry according to word confidence score data entries associated with the call transcription data entry to generate preprocessed text, performing natural language processing vectorization on the preprocessed text to generate an input vector, and supplying the input vector to an unsupervised machine learning model to assign an output topic classification of the model to the joined call data entry associated with the input vector.
Type: Grant
Filed: September 1, 2021
Date of Patent: April 2, 2024
Assignee: Evernorth Strategic Development, Inc.
Inventors: Akash Dwivedi, Christopher R. Markson, Pritesh J. Shah
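A minimal sketch of the timestamp-based join this abstract describes: each transcription entry is paired with the agent log entry whose timestamp is closest, within a tolerance. The field names (`ts`, `text`, `entity`) and the tolerance rule are illustrative assumptions, not taken from the patent.

```python
# Sketch of joining call transcriptions to agent log entries by timestamp.
# Assumptions (not from the abstract): dicts with a 'ts' key in epoch
# seconds, nearest-neighbour matching within a fixed tolerance.

def join_by_timestamp(transcripts, agent_logs, tolerance_s=60):
    """Pair each transcript with the closest agent log entry in time."""
    joined = []
    for t in transcripts:
        best = min(agent_logs, key=lambda a: abs(a["ts"] - t["ts"]),
                   default=None)
        if best is not None and abs(best["ts"] - t["ts"]) <= tolerance_s:
            joined.append({**t, **best})   # merged joined call data entry
    return joined

transcripts = [{"ts": 100, "text": "hello acme corp"}]
agent_logs = [{"ts": 130, "entity": "Acme Corp"},
              {"ts": 500, "entity": "Other"}]
print(join_by_timestamp(transcripts, agent_logs))
```

The abstract's validation step (checking the transcribed entity name against the log's entity identifier) would then run over each merged entry.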
-
Patent number: 11763834
Abstract: Features are extracted from an observed speech signal including at least speech of multiple speakers including a target speaker. A mask is calculated for extracting speech of the target speaker based on the features of the observed speech signal and a speech signal of the target speaker serving as adaptation data of the target speaker. The signal of the speech of the target speaker is calculated from the observed speech signal based on the mask. Speech of the target speaker can be extracted from observed speech that includes speech of multiple speakers.
Type: Grant
Filed: July 18, 2018
Date of Patent: September 19, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Takuya Higuchi, Tomohiro Nakatani
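A minimal sketch of how such a mask is applied. The patent computes the mask from the mixture's features and the target speaker's adaptation data; here the mask is simply given, to show the extraction step itself: an elementwise product over a time-frequency representation. The toy values are illustrative only.

```python
# Sketch of mask-based target-speaker extraction. The mask estimation
# (the patent's contribution) is assumed done; a mask value near 1 says
# that time-frequency bin belongs to the target speaker.

def apply_mask(mixture_spec, mask):
    """mixture_spec, mask: lists of frames, each a list of per-bin values."""
    return [[m * x for m, x in zip(mask_frame, mix_frame)]
            for mask_frame, mix_frame in zip(mask, mixture_spec)]

mixture = [[1.0, 2.0, 0.5], [0.8, 0.1, 3.0]]   # toy magnitude spectrogram
mask    = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]   # toy target-speaker mask
print(apply_mask(mixture, mask))  # [[1.0, 0.0, 0.25], [0.0, 0.1, 1.5]]
```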
-
Publication number: 20130197916
Abstract: According to one embodiment, a terminal device including a main body includes: a sound input module configured to receive a voice, convert the voice into a digital signal, and output the digital signal; a state detecting module having an acceleration sensor, configured to detect one or both of a movement and a state of the main body and output a detection result; and an executing module, capable of executing plural speech recognition response processes, configured to execute one of the speech recognition response processes on the digital signal according to the detection result detected by the state detecting module.
Type: Application
Filed: November 7, 2012
Publication date: August 1, 2013
Inventor: Motonobu Sugiura
-
Publication number: 20130117020
Abstract: Disclosed are a personalized advertisement device based on speech recognition SMS services and a personalized advertisement exposure method based on speech recognition SMS services. The present invention provides a device and method capable of maximizing the effect of advertisement by identifying a user's intention, emotional state, and positional information from speech data uttered by the user during the process of providing speech recognition SMS services, configuring advertisements based thereon, and exposing the configured advertisements to the user.
Type: Application
Filed: September 5, 2012
Publication date: May 9, 2013
Applicant: Electronics and Telecommunications Research Institute
Inventors: Hoon Chung, Jeon Gue Park, Hyung Bae Jeon, Ki Young Park, Yun Keun Lee, Sang Kyu Park
-
Publication number: 20120179460
Abstract: A method for testing an automated interactive media system. The method can include establishing a communication session with the automated interactive media system. In response to receiving control and/or media information from the automated interactive media system, pre-recorded control and/or media information can be propagated to the automated interactive media system. The pre-recorded control and/or media information can be recorded in real time.
Type: Application
Filed: March 17, 2012
Publication date: July 12, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: William V. Da Palma, Brien H. Muschett
-
Publication number: 20120078622
Abstract: According to one embodiment, a spoken dialogue apparatus includes a detection unit configured to detect speech of a user; a recognition unit configured to recognize the speech; an output unit configured to output a response voice corresponding to the result of speech recognition; an estimation unit configured to estimate the probability variation of a barge-in utterance, that is, the time variation of the probability that a barge-in utterance by the user will arise while the response voice is being output; and a control unit configured to determine whether to adopt the barge-in utterance based on the probability variation of the barge-in utterance.
Type: Application
Filed: March 18, 2011
Publication date: March 29, 2012
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kenji Iwata, Takehide Yano
-
Publication number: 20110191105
Abstract: Computer-implemented systems and methods are provided for identifying language that would be considered obscene or otherwise offensive to a user or proprietor of a system. A plurality of offensive words are received, where each offensive word is associated with a severity score identifying the offensiveness of that word. A string of words is received. A distance between a candidate word and each offensive word in the plurality of offensive words is calculated, and a plurality of offensiveness scores for the candidate word are calculated, each offensiveness score based on the calculated distance between the candidate word and the offensive word and the severity score of the offensive word. A determination is made as to whether the candidate word is an offender word, where the candidate word is deemed to be an offender word when the highest offensiveness score in the plurality of offensiveness scores exceeds an offensiveness threshold value.
Type: Application
Filed: January 29, 2010
Publication date: August 4, 2011
Inventor: Joseph L. Spears
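A minimal sketch of the scoring scheme this abstract describes. The abstract does not specify the distance measure or how distance and severity combine, so this sketch assumes Levenshtein edit distance and a `severity / (1 + distance)` combining rule; both are illustrative choices.

```python
# Sketch of offender-word detection: distance to each lexicon word plus
# that word's severity score yields an offensiveness score; the candidate
# is flagged when the highest score exceeds a threshold.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def is_offender(candidate: str, lexicon: dict, threshold: float) -> bool:
    """lexicon maps offensive word -> severity score (higher = worse)."""
    scores = [sev / (1 + levenshtein(candidate, word))
              for word, sev in lexicon.items()]
    return max(scores) > threshold

lexicon = {"badword": 0.9, "awful": 0.5}     # hypothetical severity scores
print(is_offender("badw0rd", lexicon, 0.4))  # near-miss spelling still flagged
```

The distance term is what lets deliberately misspelled variants still score close to the lexicon entry they imitate.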
-
Publication number: 20110099012
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for estimating reliability of alternate speech recognition hypotheses. A system configured to practice the method receives an N-best list of speech recognition hypotheses and features describing the N-best list, determines a first probability of correctness for each hypothesis in the N-best list based on the received features, determines a second probability that the N-best list does not contain a correct hypothesis, and uses the first probability and the second probability in a spoken dialog. The features can describe properties of at least one of a lattice, a word confusion network, and a garbage model. In one aspect, the N-best lists are not reordered according to reranking scores. The determination of the first probability of correctness can include a first stage of training a probabilistic model and a second stage of distributing mass over items in a tail of the N-best list.
Type: Application
Filed: October 23, 2009
Publication date: April 28, 2011
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Jason Williams, Suhrid Balakrishnan
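A minimal sketch of the two quantities the abstract describes: a probability of correctness for each N-best hypothesis, plus the probability that no hypothesis in the list is correct. A softmax over raw scores with an extra "none correct" slot stands in for the trained probabilistic model; that substitution is an assumption, not the patent's method.

```python
# Sketch: turn raw recognizer scores into per-hypothesis correctness
# probabilities plus a "list contains no correct hypothesis" probability,
# all summing to 1. The list order is preserved (no reranking).

import math

def nbest_probabilities(scores, none_bias=0.0):
    """scores: raw recognizer scores, best-first.
    Returns (per-hypothesis probabilities in original order,
             probability that no hypothesis is correct)."""
    exps = [math.exp(s) for s in scores] + [math.exp(none_bias)]
    z = sum(exps)
    probs = [e / z for e in exps]
    return probs[:-1], probs[-1]

per_hyp, p_none = nbest_probabilities([2.0, 0.5, -1.0], none_bias=-0.5)
# per_hyp follows the original N-best order; p_none covers the "reject" case
```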
-
Publication number: 20100292988
Abstract: A speech recognition system samples speech signals having the same meaning, and obtains frequency spectrum images of the speech signals. Training objects are obtained by modifying the frequency spectrum images to be the same width. The speech recognition system obtains specific data of the speech signals by analyzing the training objects. The specific data is linked with the meaning of the speech signals. The specific data may include probability values representing probabilities that the training objects appear at different points in an image area of the training objects. A speech command may be sampled, and a frequency spectrum image of the speech command is modified to be the same width as the training objects. The speech recognition system can determine a meaning of a speech command by determining a matching degree of the modified frequency spectrum image of the speech command and the specific data of the speech signals.
Type: Application
Filed: August 10, 2009
Publication date: November 18, 2010
Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
Inventors: Hou-Hsien Lee, Chang-Jung Lee, Chih-Ping Lo
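A minimal sketch of the width normalization step this abstract relies on: a frequency spectrum "image" (one column of per-frequency values per frame) is stretched or squeezed to a fixed number of columns so that commands of different durations can be compared cell by cell. Nearest-neighbour resampling is an assumed implementation detail; the patent does not specify the resampling method.

```python
# Sketch: resample a spectrum image (list of columns) to a fixed width so
# utterances of different lengths become directly comparable.

def normalize_width(spectrum, target_width):
    """spectrum: list of columns, each a list of per-frequency values."""
    n = len(spectrum)
    return [spectrum[min(n - 1, int(i * n / target_width))]
            for i in range(target_width)]

short = [[0.1, 0.9], [0.8, 0.2]]          # 2-frame toy spectrum image
print(normalize_width(short, 4))          # stretched to 4 columns
```

After normalization, the "matching degree" against the stored specific data reduces to a cell-wise comparison of two equal-sized grids.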
-
Publication number: 20100094626
Abstract: It is an object of the present invention to provide a method and apparatus for locating a keyword of a speech and a speech recognition system. The method includes the steps of: by extracting feature parameters from frames constituting the recognition target speech, forming a feature parameter vector sequence that represents the recognition target speech; by normalizing the feature parameter vector sequence with use of a codebook containing a plurality of codebook vectors, obtaining a feature trace of the recognition target speech in a vector space; and specifying the position of a keyword by matching prestored keyword template traces with the feature trace. According to the present invention, a keyword template trace and a feature space trace of a target speech are drawn in accordance with an identical codebook. This makes resampling unnecessary when performing linear movement matching of speech wave frames having similar phonological feature structures.
Type: Application
Filed: September 27, 2007
Publication date: April 15, 2010
Inventors: Fengqin Li, Yadong Wu, Qinqtao Yang, Chen Chen
-
Publication number: 20100030400
Abstract: A system and method which implement automatic speech recognition (ASR) and text-to-speech (TTS) programs to permit pilots, co-pilots, and other persons to more quickly and easily perform control and monitoring tasks on aircraft. The system may be used to automatically change the frequency of an aircraft radio when a pilot or co-pilot is instructed to do so by air traffic control (ATC).
Type: Application
Filed: July 13, 2006
Publication date: February 4, 2010
Applicant: GARMIN INTERNATIONAL, INC.
Inventors: Joseph L. Komer, Joseph E. Gepner, Charles Gregory Sherwood
-
Publication number: 20100004931
Abstract: An apparatus is provided for speech utterance verification. The apparatus is configured to compare a first prosody component from a recorded speech with a second prosody component for a reference speech. The apparatus determines a prosodic verification evaluation for the recorded speech utterance based on the comparison.
Type: Application
Filed: September 15, 2006
Publication date: January 7, 2010
Inventors: Bin Ma, Haizhou Li, Minghui Dong
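A minimal sketch of one way such a prosodic comparison could work. The abstract does not name the prosody component or the comparison; this sketch assumes the component is a pitch (F0) contour and compares contours by Pearson correlation against a threshold, purely as an illustration.

```python
# Sketch: verify an utterance by how well its pitch contour tracks the
# reference contour. Correlation and the 0.8 threshold are assumptions.

import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def prosody_verified(recorded_f0, reference_f0, threshold=0.8):
    return pearson(recorded_f0, reference_f0) >= threshold

ref = [120, 130, 150, 140, 110]    # toy F0 contours (Hz)
rec = [118, 132, 149, 141, 112]
print(prosody_verified(rec, ref))  # True: the contours move together
```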
-
Publication number: 20090070111
Abstract: A method, system, and computer program product for spoken language grammar evaluation are provided. The method includes playing a recorded question to a candidate, recording a spoken answer from the candidate, and converting the spoken answer into text. The method further includes comparing the text to a grammar database, calculating a spoken language grammar evaluation score based on the comparison, and outputting the spoken language grammar evaluation score.
Type: Application
Filed: March 26, 2008
Publication date: March 12, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Rajni Bajaj, Sreeram V. Balakrishnan, Mridula Bhandari, Lyndon J. D'Silva, Sandeep Jindal, Pooja Kumar, Nitendra Rajput, Ashish Verma
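A minimal sketch of the scoring step this abstract describes. The abstract does not say what the grammar database contains or how the comparison works; this sketch assumes a database of accepted word bigrams and scores the answer by the fraction of its bigrams found there, purely as an illustration.

```python
# Sketch: score a transcribed answer by the fraction of its word bigrams
# that appear in a (hypothetical) database of accepted bigrams.

def grammar_score(text, grammar_bigrams):
    words = text.lower().split()
    bigrams = list(zip(words, words[1:]))
    if not bigrams:
        return 0.0
    hits = sum(1 for bg in bigrams if bg in grammar_bigrams)
    return hits / len(bigrams)

db = {("i", "am"), ("am", "going"), ("going", "home")}
print(grammar_score("I am going home", db))  # 1.0: every bigram accepted
```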
-
Publication number: 20080249773
Abstract: A method and system for automatically generating a scoring model for scoring a speech sample are disclosed. One or more training speech samples are received in response to a prompt. One or more speech features are determined for each of the training speech samples. A scoring model is then generated based on the speech features. At least one of the training speech samples may be a high entropy speech sample. An evaluation speech sample is received and a score is assigned to the evaluation speech sample using the scoring model. The evaluation speech sample may be a high entropy speech sample.
Type: Application
Filed: June 16, 2008
Publication date: October 9, 2008
Inventors: Isaac Bejar, Klaus Zechner
-
Publication number: 20080215320
Abstract: Disclosed are an apparatus and method to reduce recognition errors through context relations among multiple dialogue turns. The apparatus includes a rule set storage unit having a rule set containing one or more rules, an evolutionary rule generation module connected to the rule set storage unit, and a rule trigger unit connected to the rule set storage unit. The rule set uses the dialogue turn as the unit for the information described by each rule. The method analyzes a dialogue history through an evolutionary massive parallelism approach to get a rule set describing the context relations among dialogue turns. Based on the rule set and the recognition result of an ASR system, it reevaluates the recognition result and measures the confidence measure of the reevaluated recognition result. After each successful dialogue turn, the rule set is dynamically adapted.
Type: Application
Filed: August 1, 2007
Publication date: September 4, 2008
Inventors: Hsu-Chih Wu, Ching-Hsien Lee
-
Publication number: 20080120104
Abstract: A method of transmitting end-of-speech marks in a distributed speech recognition system operating in a discontinuous transmission mode, in which system speech segments (30, 40) are transmitted, followed by periods (34) of silence, each speech segment (30, 40) terminating with an end-of-speech mark (31, 41). The end-of-speech mark (31) is retransmitted continually (31a, 31b, 31c, 31d) throughout the duration of the period of silence (34) following said speech segment (30).
Type: Application
Filed: December 28, 2005
Publication date: May 22, 2008
Inventor: Alexandre Ferrieux
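A minimal sketch of the transmission pattern this abstract describes: rather than sending the end-of-speech mark once, the transmitter repeats it at intervals for the whole silence period, so a single lost mark frame cannot leave the recognizer waiting for more speech. Frame labels and the repeat interval are illustrative assumptions.

```python
# Sketch: frames put on the wire for one speech segment under
# discontinuous transmission, with the end-of-speech mark repeated
# throughout the following silence period.

def dtx_stream(speech_frames, silence_s, mark_interval_s=1.0):
    """Yield the frames transmitted for one segment plus its silence."""
    yield from speech_frames
    yield "EOS"                          # mark terminating the segment
    t = mark_interval_s
    while t <= silence_s:                # keep retransmitting during silence
        yield "EOS"
        t += mark_interval_s

frames = list(dtx_stream(["f1", "f2"], silence_s=3.0))
print(frames)  # ['f1', 'f2', 'EOS', 'EOS', 'EOS', 'EOS']
```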
-
Publication number: 20080010065
Abstract: A method and apparatus for speaker recognition is provided. One embodiment of a method for determining whether a given speech signal is produced by an alleged speaker, where a plurality of statistical models (including at least one support vector machine) have been produced for the alleged speaker based on a previous speech signal received from the alleged speaker, includes receiving the given speech signal, the speech signal representing an utterance made by a speaker claiming to be the alleged speaker, scoring the given speech signal using at least two modeling systems, where at least one of the modeling systems is a support vector machine, combining scores produced by the modeling systems, with equal weights, to produce a final score, and determining, in accordance with the final score, whether the speaker is likely the alleged speaker.
Type: Application
Filed: June 5, 2007
Publication date: January 10, 2008
Inventors: Harry Bratt, Luciana Ferrer, Martin Graciarena, Sachin Kajarekar, Elizabeth Shriberg, Mustafa Sonmez, Andreas Stolcke, Gokhan Tur, Anand Venkataraman