Assessment Or Evaluation Of Speech Recognition Systems (EPO) Patents (Class 704/E15.002)
-
Patent number: 12112097
Abstract: A framework for generating and presenting verbal command suggestions to facilitate discoverability of commands capable of being understood and to support users in exploring available commands. A target associated with a direct-manipulation input is received from a user via a multimodal user interface. A set of operations relevant to the target is selected, and verbal command suggestions relevant to the selected set of operations and the determined target are generated. At least a portion of the generated verbal command suggestions is provided for presentation in association with the multimodal user interface in one of three interface variants: one that presents command suggestions as a list, one that presents command suggestions using contextual overlay windows, and one that presents command suggestions embedded within the interface. Each of the proposed interface variants facilitates user awareness of verbal commands that are capable of being executed and teaches users how available verbal commands can be invoked.
Type: Grant
Filed: January 9, 2023
Date of Patent: October 8, 2024
Assignee: Adobe Inc.
Inventors: Lubomira Dontcheva, Arjun Srinivasan, Seth John Walker, Eytan Adar
-
Patent number: 11947629
Abstract: A computer system includes processor hardware configured to execute instructions that include joining at least a portion of multiple call transcription data entries with at least a portion of multiple agent call log data entries according to timestamps associated with the entries to generate a set of joined call data entries, and validating the joined call data entry by determining whether a transcribed entity name matches with entity identifier information associated with the agent call log data entry. The instructions include preprocessing the joined call data entry according to word confidence score data entries associated with the call transcription data entry to generate preprocessed text, performing natural language processing vectorization on the preprocessed text to generate an input vector, and supplying the input vector to an unsupervised machine learning model to assign an output topic classification of the model to the joined call data entry associated with the input vector.
Type: Grant
Filed: September 1, 2021
Date of Patent: April 2, 2024
Assignee: Evernorth Strategic Development, Inc.
Inventors: Akash Dwivedi, Christopher R. Markson, Pritesh J. Shah
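A minimal sketch of the timestamp-based join this abstract describes: each transcription entry is paired with the agent log entry whose timestamp is closest, within a tolerance. The field names (`ts`, `text`, `entity`) and the tolerance rule are illustrative assumptions, not taken from the patent.

```python
# Sketch of joining call transcriptions to agent log entries by timestamp.
# Assumptions (not from the abstract): dicts with a 'ts' key in epoch
# seconds, nearest-neighbour matching within a fixed tolerance.

def join_by_timestamp(transcripts, agent_logs, tolerance_s=60):
    """Pair each transcript with the closest agent log entry in time."""
    joined = []
    for t in transcripts:
        best = min(agent_logs, key=lambda a: abs(a["ts"] - t["ts"]),
                   default=None)
        if best is not None and abs(best["ts"] - t["ts"]) <= tolerance_s:
            joined.append({**t, **best})   # merged joined call data entry
    return joined

transcripts = [{"ts": 100, "text": "hello acme corp"}]
agent_logs = [{"ts": 130, "entity": "Acme Corp"},
              {"ts": 500, "entity": "Other"}]
print(join_by_timestamp(transcripts, agent_logs))
```

The abstract's validation step (checking the transcribed entity name against the log's entity identifier) would then run over each merged entry.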
-
Patent number: 11763834
Abstract: Features are extracted from an observed speech signal including at least speech of multiple speakers including a target speaker. A mask is calculated for extracting speech of the target speaker based on the features of the observed speech signal and a speech signal of the target speaker serving as adaptation data of the target speaker. The signal of the speech of the target speaker is calculated from the observed speech signal based on the mask. Speech of the target speaker can be extracted from observed speech that includes speech of multiple speakers.
Type: Grant
Filed: July 18, 2018
Date of Patent: September 19, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Marc Delcroix, Keisuke Kinoshita, Atsunori Ogawa, Takuya Higuchi, Tomohiro Nakatani
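A minimal sketch of how such a mask is applied. The patent computes the mask from the mixture's features and the target speaker's adaptation data; here the mask is simply given, to show the extraction step itself: an elementwise product over a time-frequency representation. The toy values are illustrative only.

```python
# Sketch of mask-based target-speaker extraction. The mask estimation
# (the patent's contribution) is assumed done; a mask value near 1 says
# that time-frequency bin belongs to the target speaker.

def apply_mask(mixture_spec, mask):
    """mixture_spec, mask: lists of frames, each a list of per-bin values."""
    return [[m * x for m, x in zip(mask_frame, mix_frame)]
            for mask_frame, mix_frame in zip(mask, mixture_spec)]

mixture = [[1.0, 2.0, 0.5], [0.8, 0.1, 3.0]]   # toy magnitude spectrogram
mask    = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]   # toy target-speaker mask
print(apply_mask(mixture, mask))  # [[1.0, 0.0, 0.25], [0.0, 0.1, 1.5]]
```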
-
Publication number: 20130197916
Abstract: According to one embodiment, a terminal device including a main body includes: a sound input module configured to receive a voice, convert the voice into a digital signal, and output the digital signal; a state detecting module having an acceleration sensor, configured to detect one or both of a movement and a state of the main body and output a detection result; and an executing module, capable of executing plural speech recognition response processes, configured to execute one of the speech recognition response processes on the digital signal according to the detection result detected by the state detecting module.
Type: Application
Filed: November 7, 2012
Publication date: August 1, 2013
Inventor: Motonobu Sugiura
-
Publication number: 20130117020
Abstract: Disclosed are a personalized advertisement device based on speech recognition SMS services and a personalized advertisement exposure method based on speech recognition SMS services. The present invention provides a device and method capable of maximizing the effect of advertisement by identifying a user's intention, emotional state, and positional information from speech data uttered by the user during the process of providing speech recognition SMS services, configuring advertisements based thereon, and exposing the configured advertisements to the user.
Type: Application
Filed: September 5, 2012
Publication date: May 9, 2013
Applicant: Electronics and Telecommunications Research Institute
Inventors: Hoon Chung, Jeon Gue Park, Hyung Bae Jeon, Ki Young Park, Yun Keun Lee, Sang Kyu Park
-
Publication number: 20120179460
Abstract: A method for testing an automated interactive media system. The method can include establishing a communication session with the automated interactive media system. In response to receiving control and/or media information from the automated interactive media system, pre-recorded control and/or media information can be propagated to the automated interactive media system. The pre-recorded control and/or media information can be recorded in real time.
Type: Application
Filed: March 17, 2012
Publication date: July 12, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: William V. Da Palma, Brien H. Muschett
-
Publication number: 20120078622
Abstract: According to one embodiment, a spoken dialogue apparatus includes a detection unit configured to detect speech of a user; a recognition unit configured to recognize the speech; an output unit configured to output a response voice corresponding to the result of speech recognition; an estimation unit configured to estimate the probability variation of a barge-in utterance, that is, the time variation of the probability that a barge-in utterance by the user will arise while the response voice is being output; and a control unit configured to determine whether to adopt the barge-in utterance based on the probability variation of the barge-in utterance.
Type: Application
Filed: March 18, 2011
Publication date: March 29, 2012
Applicant: KABUSHIKI KAISHA TOSHIBA
Inventors: Kenji Iwata, Takehide Yano
-
Publication number: 20110191105
Abstract: Computer-implemented systems and methods are provided for identifying language that would be considered obscene or otherwise offensive to a user or proprietor of a system. A plurality of offensive words are received, where each offensive word is associated with a severity score identifying the offensiveness of that word. A string of words is received. A distance between a candidate word and each offensive word in the plurality of offensive words is calculated, and a plurality of offensiveness scores for the candidate word are calculated, each offensiveness score based on the calculated distance between the candidate word and the offensive word and the severity score of the offensive word. A determination is made as to whether the candidate word is an offender word, where the candidate word is deemed to be an offender word when the highest offensiveness score in the plurality of offensiveness scores exceeds an offensiveness threshold value.
Type: Application
Filed: January 29, 2010
Publication date: August 4, 2011
Inventor: Joseph L. Spears
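A minimal sketch of the scoring scheme this abstract describes. The abstract does not specify the distance measure or how distance and severity combine, so this sketch assumes Levenshtein edit distance and a `severity / (1 + distance)` combining rule; both are illustrative choices.

```python
# Sketch of offender-word detection: distance to each lexicon word plus
# that word's severity score yields an offensiveness score; the candidate
# is flagged when the highest score exceeds a threshold.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def is_offender(candidate: str, lexicon: dict, threshold: float) -> bool:
    """lexicon maps offensive word -> severity score (higher = worse)."""
    scores = [sev / (1 + levenshtein(candidate, word))
              for word, sev in lexicon.items()]
    return max(scores) > threshold

lexicon = {"badword": 0.9, "awful": 0.5}     # hypothetical severity scores
print(is_offender("badw0rd", lexicon, 0.4))  # near-miss spelling still flagged
```

The distance term is what lets deliberately misspelled variants still score close to the lexicon entry they imitate.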
-
Publication number: 20110099012
Abstract: Disclosed herein are systems, methods, and computer-readable storage media for estimating reliability of alternate speech recognition hypotheses. A system configured to practice the method receives an N-best list of speech recognition hypotheses and features describing the N-best list, determines a first probability of correctness for each hypothesis in the N-best list based on the received features, determines a second probability that the N-best list does not contain a correct hypothesis, and uses the first probability and the second probability in a spoken dialog. The features can describe properties of at least one of a lattice, a word confusion network, and a garbage model. In one aspect, the N-best lists are not reordered according to reranking scores. The determination of the first probability of correctness can include a first stage of training a probabilistic model and a second stage of distributing mass over items in a tail of the N-best list.
Type: Application
Filed: October 23, 2009
Publication date: April 28, 2011
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Jason Williams, Suhrid Balakrishnan
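A minimal sketch of the two quantities the abstract describes: a probability of correctness for each N-best hypothesis, plus the probability that no hypothesis in the list is correct. A softmax over raw scores with an extra "none correct" slot stands in for the trained probabilistic model; that substitution is an assumption, not the patent's method.

```python
# Sketch: turn raw recognizer scores into per-hypothesis correctness
# probabilities plus a "list contains no correct hypothesis" probability,
# all summing to 1. The list order is preserved (no reranking).

import math

def nbest_probabilities(scores, none_bias=0.0):
    """scores: raw recognizer scores, best-first.
    Returns (per-hypothesis probabilities in original order,
             probability that no hypothesis is correct)."""
    exps = [math.exp(s) for s in scores] + [math.exp(none_bias)]
    z = sum(exps)
    probs = [e / z for e in exps]
    return probs[:-1], probs[-1]

per_hyp, p_none = nbest_probabilities([2.0, 0.5, -1.0], none_bias=-0.5)
# per_hyp follows the original N-best order; p_none covers the "reject" case
```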
-
Publication number: 20100292988
Abstract: A speech recognition system samples speech signals having the same meaning, and obtains frequency spectrum images of the speech signals. Training objects are obtained by modifying the frequency spectrum images to be the same width. The speech recognition system obtains specific data of the speech signals by analyzing the training objects. The specific data is linked with the meaning of the speech signals. The specific data may include probability values representing probabilities that the training objects appear at different points in an image area of the training objects. A speech command may be sampled, and a frequency spectrum image of the speech command is modified to be the same width as the training objects. The speech recognition system can determine a meaning of a speech command by determining a matching degree of the modified frequency spectrum image of the speech command and the specific data of the speech signals.
Type: Application
Filed: August 10, 2009
Publication date: November 18, 2010
Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
Inventors: Hou-Hsien Lee, Chang-Jung Lee, Chih-Ping Lo
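A minimal sketch of the width normalization step this abstract relies on: a frequency spectrum "image" (one column of per-frequency values per frame) is stretched or squeezed to a fixed number of columns so that commands of different durations can be compared cell by cell. Nearest-neighbour resampling is an assumed implementation detail; the patent does not specify the resampling method.

```python
# Sketch: resample a spectrum image (list of columns) to a fixed width so
# utterances of different lengths become directly comparable.

def normalize_width(spectrum, target_width):
    """spectrum: list of columns, each a list of per-frequency values."""
    n = len(spectrum)
    return [spectrum[min(n - 1, int(i * n / target_width))]
            for i in range(target_width)]

short = [[0.1, 0.9], [0.8, 0.2]]          # 2-frame toy spectrum image
print(normalize_width(short, 4))          # stretched to 4 columns
```

After normalization, the "matching degree" against the stored specific data reduces to a cell-wise comparison of two equal-sized grids.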
-
Publication number: 20100094626
Abstract: It is an object of the present invention to provide a method and apparatus for locating a keyword of a speech and a speech recognition system. The method includes the steps of: by extracting feature parameters from frames constituting the recognition target speech, forming a feature parameter vector sequence that represents the recognition target speech; by normalizing the feature parameter vector sequence with use of a codebook containing a plurality of codebook vectors, obtaining a feature trace of the recognition target speech in a vector space; and specifying the position of a keyword by matching prestored keyword template traces with the feature trace. According to the present invention, a keyword template trace and a feature space trace of a target speech are drawn in accordance with an identical codebook. This makes resampling unnecessary when performing linear movement matching of speech wave frames having similar phonological feature structures.
Type: Application
Filed: September 27, 2007
Publication date: April 15, 2010
Inventors: Fengqin Li, Yadong Wu, Qinqtao Yang, Chen Chen
-
Publication number: 20100030400
Abstract: A system and method which implement automatic speech recognition (ASR) and text-to-speech (TTS) programs to permit pilots, co-pilots, and other persons to more quickly and easily perform control and monitoring tasks on aircraft. The system may be used to automatically change the frequency of an aircraft radio when a pilot or co-pilot is instructed to do so by air traffic control (ATC).
Type: Application
Filed: July 13, 2006
Publication date: February 4, 2010
Applicant: GARMIN INTERNATIONAL, INC.
Inventors: Joseph L. Komer, Joseph E. Gepner, Charles Gregory Sherwood
-
Publication number: 20100004931
Abstract: An apparatus is provided for speech utterance verification. The apparatus is configured to compare a first prosody component from a recorded speech with a second prosody component for a reference speech. The apparatus determines a prosodic verification evaluation for the recorded speech utterance based on the comparison.
Type: Application
Filed: September 15, 2006
Publication date: January 7, 2010
Inventors: Bin Ma, Haizhou Li, Minghui Dong
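A minimal sketch of one way such a prosodic comparison could work. The abstract does not name the prosody component or the comparison; this sketch assumes the component is a pitch (F0) contour and compares contours by Pearson correlation against a threshold, purely as an illustration.

```python
# Sketch: verify an utterance by how well its pitch contour tracks the
# reference contour. Correlation and the 0.8 threshold are assumptions.

import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def prosody_verified(recorded_f0, reference_f0, threshold=0.8):
    return pearson(recorded_f0, reference_f0) >= threshold

ref = [120, 130, 150, 140, 110]    # toy F0 contours (Hz)
rec = [118, 132, 149, 141, 112]
print(prosody_verified(rec, ref))  # True: the contours move together
```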
-
Publication number: 20090070111
Abstract: A method, system, and computer program product for spoken language grammar evaluation are provided. The method includes playing a recorded question to a candidate, recording a spoken answer from the candidate, and converting the spoken answer into text. The method further includes comparing the text to a grammar database, calculating a spoken language grammar evaluation score based on the comparison, and outputting the spoken language grammar evaluation score.
Type: Application
Filed: March 26, 2008
Publication date: March 12, 2009
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Rajni Bajaj, Sreeram V. Balakrishnan, Mridula Bhandari, Lyndon J. D'Silva, Sandeep Jindal, Pooja Kumar, Nitendra Rajput, Ashish Verma
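A minimal sketch of the scoring step this abstract describes. The abstract does not say what the grammar database contains or how the comparison works; this sketch assumes a database of accepted word bigrams and scores the answer by the fraction of its bigrams found there, purely as an illustration.

```python
# Sketch: score a transcribed answer by the fraction of its word bigrams
# that appear in a (hypothetical) database of accepted bigrams.

def grammar_score(text, grammar_bigrams):
    words = text.lower().split()
    bigrams = list(zip(words, words[1:]))
    if not bigrams:
        return 0.0
    hits = sum(1 for bg in bigrams if bg in grammar_bigrams)
    return hits / len(bigrams)

db = {("i", "am"), ("am", "going"), ("going", "home")}
print(grammar_score("I am going home", db))  # 1.0: every bigram accepted
```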
-
Publication number: 20080249773
Abstract: A method and system for automatically generating a scoring model for scoring a speech sample are disclosed. One or more training speech samples are received in response to a prompt. One or more speech features are determined for each of the training speech samples. A scoring model is then generated based on the speech features. At least one of the training speech samples may be a high entropy speech sample. An evaluation speech sample is received and a score is assigned to the evaluation speech sample using the scoring model. The evaluation speech sample may be a high entropy speech sample.
Type: Application
Filed: June 16, 2008
Publication date: October 9, 2008
Inventors: Isaac Bejar, Klaus Zechner
-
Publication number: 20080215320
Abstract: Disclosed are an apparatus and method to reduce recognition errors through context relations among multiple dialogue turns. The apparatus includes a rule set storage unit having a rule set containing one or more rules, an evolutionary rule generation module connected to the rule set storage unit, and a rule trigger unit connected to the rule set storage unit. The rule set uses the dialogue turn as the unit for the information described by each rule. The method analyzes a dialogue history through an evolutionary massive parallelism approach to get a rule set describing the context relations among dialogue turns. Based on the rule set and the recognition result of an ASR system, it reevaluates the recognition result and measures the confidence measure of the reevaluated recognition result. After each successful dialogue turn, the rule set is dynamically adapted.
Type: Application
Filed: August 1, 2007
Publication date: September 4, 2008
Inventors: Hsu-Chih Wu, Ching-Hsien Lee
-
Publication number: 20080120104
Abstract: A method of transmitting end-of-speech marks in a distributed speech recognition system operating in a discontinuous transmission mode, in which system speech segments (30, 40) are transmitted, followed by periods (34) of silence, each speech segment (30, 40) terminating with an end-of-speech mark (31, 41). The end-of-speech mark (31) is retransmitted continually (31a, 31b, 31c, 31d) throughout the duration of the period of silence (34) following said speech segment (30).
Type: Application
Filed: December 28, 2005
Publication date: May 22, 2008
Inventor: Alexandre Ferrieux
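A minimal sketch of the transmission pattern this abstract describes: rather than sending the end-of-speech mark once, the transmitter repeats it at intervals for the whole silence period, so a single lost mark frame cannot leave the recognizer waiting for more speech. Frame labels and the repeat interval are illustrative assumptions.

```python
# Sketch: frames put on the wire for one speech segment under
# discontinuous transmission, with the end-of-speech mark repeated
# throughout the following silence period.

def dtx_stream(speech_frames, silence_s, mark_interval_s=1.0):
    """Yield the frames transmitted for one segment plus its silence."""
    yield from speech_frames
    yield "EOS"                          # mark terminating the segment
    t = mark_interval_s
    while t <= silence_s:                # keep retransmitting during silence
        yield "EOS"
        t += mark_interval_s

frames = list(dtx_stream(["f1", "f2"], silence_s=3.0))
print(frames)  # ['f1', 'f2', 'EOS', 'EOS', 'EOS', 'EOS']
```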
-
Publication number: 20080010065
Abstract: A method and apparatus for speaker recognition is provided. One embodiment of a method for determining whether a given speech signal is produced by an alleged speaker, where a plurality of statistical models (including at least one support vector machine) have been produced for the alleged speaker based on a previous speech signal received from the alleged speaker, includes receiving the given speech signal, the speech signal representing an utterance made by a speaker claiming to be the alleged speaker, scoring the given speech signal using at least two modeling systems, where at least one of the modeling systems is a support vector machine, combining scores produced by the modeling systems, with equal weights, to produce a final score, and determining, in accordance with the final score, whether the speaker is likely the alleged speaker.
Type: Application
Filed: June 5, 2007
Publication date: January 10, 2008
Inventors: Harry Bratt, Luciana Ferrer, Martin Graciarena, Sachin Kajarekar, Elizabeth Shriberg, Mustafa Sonmez, Andreas Stolcke, Gokhan Tur, Anand Venkataraman