Constructional Details Of Speech Recognition Systems (epo) Patents (Class 704/E15.046)
  • Patent number: 11935516
    Abstract: A speech recognition method and apparatus are disclosed. The speech recognition method includes determining a first score of candidate texts based on an input speech, determining a weight for an output of a language model based on the input speech, applying the weight to a second score of the candidate texts output from the language model to obtain a weighted second score, selecting a target candidate text from among the candidate texts based on the first score and the weighted second score corresponding to the target candidate text, and determining the target candidate text to correspond to a portion of the input speech.
    Type: Grant
    Filed: July 20, 2021
    Date of Patent: March 19, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Jihyun Lee
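The weighted rescoring this abstract describes can be sketched in a few lines. This is an illustrative toy, not the patented method: the score dictionaries, additive log-score combination, and weight values are assumptions.

```python
def select_candidate(acoustic_scores, lm_scores, weight):
    """Pick the candidate text maximizing the first (acoustic) score
    plus the weighted second (language-model) score."""
    combined = {cand: acoustic_scores[cand] + weight * lm_scores[cand]
                for cand in acoustic_scores}
    return max(combined, key=combined.get)
```

With a small weight the acoustic score dominates; an input-dependent weight lets the recognizer lean on the language model only when the audio itself is ambiguous.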
  • Patent number: 11914099
    Abstract: Systems and methods include a method for predicting geological formation tops. First well log data associated with a key master well is received. Formation data identifying tops of formations confirmed in the key master well is received. Merged key master well and formation data is generated in a dynamic time warping (DTW)-readable format by merging the first well log data with the formation data. Second well log data associated with a training well located in geographic proximity to the key master well is received. The second well log data is formatted into the DTW-readable format. A DTW function is executed to generate indices associated with the formation tops. The DTW function uses the merged key master well and formation data and the formatted second well log data as DTW function inputs. Geological formation tops for the training well are then predicted using the generated indices.
    Type: Grant
    Filed: November 1, 2019
    Date of Patent: February 27, 2024
    Assignee: Saudi Arabian Oil Company
    Inventors: Matter J. Alshammery, Nazih F. Najjar
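The DTW function at the core of this method aligns two sequences by minimizing cumulative distance; the indices of the warping path are what map formation tops from the key master well onto the training well. A minimal textbook DTW over 1-D log values (not the patent's implementation) looks like:

```python
import math

def dtw(a, b):
    """Classic dynamic time warping between two 1-D sequences.
    Returns the total alignment cost and the warping path as
    (index_in_a, index_in_b) pairs."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    # Backtrack from the corner to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, (i, j) = min((cost[i - 1][j - 1], (i - 1, j - 1)),
                        (cost[i - 1][j], (i - 1, j)),
                        (cost[i][j - 1], (i, j - 1)))
    return cost[n][m], path[::-1]
```

A formation-top index in the key master well can then be read off the path: the training-well depth paired with that index is the predicted top.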
  • Patent number: 11508370
    Abstract: An on-board agent system includes: a plurality of agent functional units, each of the plurality of agent functional units being configured to provide a service including outputting a response using voice to an output unit according to an utterance of an occupant of a vehicle; and a common operator configured to be shared by the plurality of agent functional units and provided in the vehicle, wherein, when an operation is executed on the common operator with an operation pattern set to correspond to each of the plurality of agent functional units, an agent functional unit corresponding to the operation pattern of the executed operation is activated.
    Type: Grant
    Filed: March 3, 2020
    Date of Patent: November 22, 2022
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Sawako Furuya, Yoshifumi Wagatsuma, Hiroki Nakayama, Kengo Naiki, Yusuke Oi
  • Publication number: 20140025379
    Abstract: A system and method are presented for real-time speech analytics in the speech analytics field. Real-time audio is fed, along with a keyword model, into a recognition engine. The recognition engine computes the probability of the audio stream data matching keywords in the keyword model. The probability is compared to a threshold to determine whether the keyword has been spotted. Empirical metrics are computed and any false alarms are identified and rejected. The keyword may be reported as found when it passes the detection threshold and is deemed not to be a false alarm.
    Type: Application
    Filed: July 20, 2012
    Publication date: January 23, 2014
    Applicant: INTERACTIVE INTELLIGENCE, INC.
    Inventors: Aravind Ganapathiraju, Ananth Nagaraja Iyer
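The threshold comparison plus false-alarm rejection can be illustrated with a toy detector. The per-frame probabilities and the minimum-run heuristic are assumptions for illustration; the patent's empirical metrics are not specified here.

```python
def spot_keyword(frame_probs, threshold, min_consecutive=3):
    """Flag a keyword as spotted when its per-frame match probability
    stays above the threshold for min_consecutive frames; a single
    above-threshold frame is rejected as a likely false alarm."""
    run = 0
    for p in frame_probs:
        run = run + 1 if p >= threshold else 0
        if run >= min_consecutive:
            return True
    return False
```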
  • Publication number: 20130124209
    Abstract: An information processing apparatus includes: a plurality of information input units; an event detection unit that generates event information including estimated position information and estimated identification information of users present in the real space based on analysis of the information from the information input unit; and an information integration processing unit that inputs the event information, and generates target information including a position of each user and user identification information based on the input event information, and signal information representing a probability value of the event generation source, wherein the information integration processing unit includes an utterance source probability calculation unit, and wherein the utterance source probability calculation unit performs a process of calculating an utterance source score as an index value representing an utterance source probability of each target by multiplying weights based on utterance situations by a plurality of d
    Type: Application
    Filed: November 6, 2012
    Publication date: May 16, 2013
    Applicant: Sony Corporation
    Inventor: Sony Corporation
  • Publication number: 20120315957
    Abstract: There is provided an electronic device that can execute a function using characters inputted during a telephone call, and a control method and a control program thereof. An application control unit inputs a character as an input character using an input control unit in a state where a call with a predetermined communication counterpart is continuing using a communication unit. When a predetermined function is selected after the input character is inputted with the input control unit, the application control unit executes a predetermined function in a state where the input character is inputted.
    Type: Application
    Filed: June 8, 2012
    Publication date: December 13, 2012
    Applicant: KYOCERA Corporation
    Inventor: Hiroshi KAMIKUBO
  • Publication number: 20120245942
    Abstract: Systems and methods are provided for scoring speech. A speech sample is received, where the speech sample is associated with a script. The speech sample is aligned with the script. An event recognition metric of the speech sample is extracted, and locations of prosodic events are detected in the speech sample based on the event recognition metric. The locations of the detected prosodic events are compared with locations of model prosodic events, where the locations of model prosodic events identify expected locations of prosodic events of a fluent, native speaker speaking the script. A prosodic event metric is calculated based on the comparison, and the speech sample is scored using a scoring model based upon the prosodic event metric.
    Type: Application
    Filed: March 20, 2012
    Publication date: September 27, 2012
    Inventors: Klaus Zechner, Xiaoming Xi
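The comparison of detected prosodic event locations against the fluent-speaker model might be reduced to a recall-style metric like the following sketch (the tolerance window and the scoring formula are illustrative assumptions):

```python
def prosodic_event_score(detected, model, tolerance=1):
    """Fraction of model (fluent-speaker) prosodic event locations
    that have a detected event within `tolerance` positions."""
    if not model:
        return 1.0
    hits = sum(1 for m in model
               if any(abs(m - d) <= tolerance for d in detected))
    return hits / len(model)
```

A scoring model would then combine this prosodic event metric with other features to produce the final speech score.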
  • Publication number: 20120232894
    Abstract: The invention provides a portable device for recognizing and/or reconstructing speech by ultrasound probing of the vocal apparatus, the device including at least one ultrasound transducer (20) for generating an ultrasound wave and for receiving a wave reflected by the user's vocal apparatus, and analysis means for analyzing a signal generated by the ultrasound transducer, wherein the device includes locating means (21, 23) for determining the position of the ultrasound transducer relative to the skull of the user.
    Type: Application
    Filed: September 15, 2010
    Publication date: September 13, 2012
    Applicants: CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE, UNIVERSITE PIERRE ET MARIE CURIE (PARIS 6)
    Inventors: Thomas Hueber, Bruce Denby, Gérard Dreyfus, Rémi Dubois, Pierre Roussel
  • Publication number: 20120221337
    Abstract: The invention comprises a method and apparatus for predicting word accuracy. Specifically, the method comprises obtaining an utterance in speech data where the utterance comprises an actual word string, processing the utterance for generating an interpretation of the actual word string, processing the utterance to identify at least one utterance frame, and predicting a word accuracy associated with the interpretation according to at least one stationary signal-to-noise ratio and at least one non-stationary signal to noise ratio, wherein the at least one stationary signal-to-noise ratio and the at least one non-stationary signal to noise ratio are determined according to a frame energy associated with each of the at least one utterance frame.
    Type: Application
    Filed: May 7, 2012
    Publication date: August 30, 2012
    Inventors: Mazin Gilbert, Hong Kook Kim
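Both SNR quantities are derived from per-frame energies. One plausible reading (the quantile-based noise floor is an assumption, not the patent's estimator): the stationary SNR uses a single global noise estimate, while the non-stationary SNR averages frame-by-frame ratios and so reflects energy fluctuation.

```python
import math

def snr_estimates(frame_energies, noise_quantile=0.2):
    """Rough stationary and non-stationary SNR estimates (in dB) from
    frame energies: the noise floor is the mean of the lowest-energy
    quantile of frames."""
    e = sorted(frame_energies)
    k = max(1, int(len(e) * noise_quantile))
    noise = sum(e[:k]) / k
    signal = sum(frame_energies) / len(frame_energies)
    stationary = 10 * math.log10(signal / noise)
    nonstationary = sum(10 * math.log10(f / noise)
                        for f in frame_energies) / len(frame_energies)
    return stationary, nonstationary
```

A word-accuracy predictor would take both values as features: clean stationary noise hurts recognition far less than bursty, non-stationary noise at the same average level.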
  • Publication number: 20120191456
    Abstract: A representation of a speech signal is received and is decoded to identify a sequence of position-dependent phonetic tokens wherein each token comprises a phone and a position indicator that indicates the position of the phone within a syllable.
    Type: Application
    Filed: February 1, 2012
    Publication date: July 26, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Peng Liu, Yu Shi, Frank Kao-ping Soong
  • Publication number: 20120095766
    Abstract: A speech recognition apparatus is provided. The speech recognition apparatus includes a primary speech recognition unit configured to perform speech recognition on input speech and thus to generate word lattice information, a word string generation unit configured to generate one or more word strings based on the word lattice information, a language model score calculation unit configured to calculate bidirectional language model scores of the generated word strings selectively using forward and backward language models for each of words in each of the generated word strings, and a sentence output unit configured to output one or more of the generated word strings with high scores as results of the speech recognition of the input speech based on the calculated bidirectional language model scores.
    Type: Application
    Filed: May 24, 2011
    Publication date: April 19, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ick-Sang Han, Chi-Youn Park, Jeong-Su Kim, Jeong-Mi Cho
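The bidirectional scoring idea, combining a left-to-right and a right-to-left language model over the same word string, can be sketched with toy bigram tables (the dict-based LMs, floor value, and interpolation weight are illustrative assumptions):

```python
def bidirectional_score(words, forward_lm, backward_lm, alpha=0.5):
    """Interpolate forward and backward bigram log-probabilities for a
    word string. Each LM is a dict mapping (history, word) -> log-prob;
    unseen pairs fall back to a floor value."""
    FLOOR = -10.0
    fwd = sum(forward_lm.get((h, w), FLOOR)
              for h, w in zip(["<s>"] + words[:-1], words))
    rev = list(reversed(words))
    bwd = sum(backward_lm.get((h, w), FLOOR)
              for h, w in zip(["</s>"] + rev[:-1], rev))
    return alpha * fwd + (1 - alpha) * bwd
```

Word strings built from the lattice would be ranked by this combined score, and the highest-scoring strings output as recognition results.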
  • Publication number: 20120041764
    Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic model and the language model.
    Type: Application
    Filed: August 10, 2011
    Publication date: February 16, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Haitian XU, Kean Kheong Chin, Mark John Francis Gales
  • Publication number: 20120016672
    Abstract: Computer-implemented systems and methods are provided for assessing non-native speech proficiency. A non-native speech sample is processed to identify a plurality of vowel sound boundaries in the non-native speech sample. Portions of the non-native speech sample are analyzed within the vowel sound boundaries to extract vowel characteristics. The vowel characteristics are used to identify a plurality of vowel space metrics for the non-native speech sample, and the vowel space metrics are used to determine a non-native speech proficiency score for the non-native speech sample.
    Type: Application
    Filed: July 14, 2011
    Publication date: January 19, 2012
    Inventors: Lei Chen, Keelan Evanini, Xie Sun
  • Publication number: 20110144987
    Abstract: A method of automated speech recognition in a vehicle. The method includes receiving audio in the vehicle, pre-processing the received audio to generate acoustic feature vectors, decoding the generated acoustic feature vectors to produce at least one speech hypothesis, and post-processing the at least one speech hypothesis using pitch to improve speech recognition accuracy. The speech hypothesis can be accepted as recognized speech during post-processing if pitch is present in the received audio. Alternatively, a pitch count for the received audio can be determined, N-best speech hypotheses can be post-processed by comparing the pitch count to syllable counts associated with the speech hypotheses, and the speech hypothesis having a syllable count equal to the pitch count can be accepted as recognized speech.
    Type: Application
    Filed: December 10, 2009
    Publication date: June 16, 2011
    Applicant: GENERAL MOTORS LLC
    Inventors: Xufang Zhao, Uma Arun
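The N-best post-processing step in the second alternative reduces to a simple filter: keep the hypothesis whose syllable count agrees with the pitch count detected in the audio. A sketch under that reading (the fallback to the top hypothesis is my assumption):

```python
def pick_by_pitch_count(hypotheses, syllable_counts, pitch_count):
    """From N-best hypotheses (best first), accept the first whose
    syllable count equals the pitch count detected in the audio;
    fall back to the top hypothesis when none matches."""
    for hyp in hypotheses:
        if syllable_counts[hyp] == pitch_count:
            return hyp
    return hypotheses[0]
```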
  • Publication number: 20100235168
    Abstract: A communication system comprises a terminal configured for being able to communicate with a computer and to operate according to at least one operational parameter. A peripheral device for use with the terminal has a characterizing parameter associated therewith. The terminal is operable for reading the characterizing parameter from the peripheral device when the device is coupled to the terminal. The terminal is further operable for configuring itself to operate according to an operational parameter associated with the characterizing parameter of the peripheral device.
    Type: Application
    Filed: May 21, 2010
    Publication date: September 16, 2010
    Inventors: Mark David Murawski, Ryan Anthony Zoschg, James Randall Logan, Roger Graham Byford, Lawrence R. Sweeney, Douglas Mark Zatezalo
  • Publication number: 20100185447
    Abstract: Embodiments are provided for selecting and utilizing multiple recognizers to process an utterance based on a markup language document. The markup language document and an utterance are received in a computing device. One or more recognizers are selected from among the multiple recognizers for returning a results set for the utterance based on markup language in the markup language document. The results set is received from the one or more selected recognizers in a format determined by a processing method specified in the markup language document. An event is then executed on the computing device in response to receiving the results set.
    Type: Application
    Filed: January 22, 2009
    Publication date: July 22, 2010
    Applicant: Microsoft Corporation
    Inventors: Andrew K. Krumel, Pierre-Alexandre F. Masse, Joseph A. Ruff
  • Publication number: 20100179812
    Abstract: Provided are an apparatus and method for recognizing voice commands, the apparatus including: a voice command recognition unit which recognizes an input voice command; a voice command recognition learning unit which learns a recognition-targeted voice command; and a controller which controls the voice command recognition unit to recognize the recognition-targeted voice command from an input voice command, controls the voice command recognition learning unit to learn the input voice command if the voice command recognition is unsuccessful, and performs a particular operation corresponding to the recognized voice command if the voice command recognition is successful.
    Type: Application
    Filed: September 2, 2009
    Publication date: July 15, 2010
    Applicant: Samsung Electronics Co., Ltd.
    Inventors: Jong-hyuk Jang, Seung-kwon Park, Jong-ho Lea
  • Publication number: 20100076763
    Abstract: A voice recognition search apparatus includes: a dictionary create unit creating a first voice recognition dictionary from a search subject data; a voice acquisition unit acquiring first and second voices; a voice recognition unit creating first and second text data by recognizing the first and second voices using the first and second voice recognition dictionaries; a first search unit searching the search subject data by the first text data; and a second search unit searching a search result of the first search unit by the second text data.
    Type: Application
    Filed: September 15, 2009
    Publication date: March 25, 2010
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Kazushige Ouchi, Miwako Doi
  • Publication number: 20100070278
    Abstract: A transformation can be derived which would represent that processing required to convert a male speech model to a female speech model. That transformation is subjected to a predetermined modification, and the modified transformation is applied to a female speech model to produce a synthetic children's speech model. The male and female models can be expressed in terms of a vector representing key values defining each speech model, and the derived transformation can be in the form of a matrix that would transform the vector of the male model to the vector of the female model. The modification to the derived matrix comprises applying an exponent p which has a value greater than zero and less than 1.
    Type: Application
    Filed: September 12, 2008
    Publication date: March 18, 2010
    Inventors: Andreas Hagen, Bryan Pellom, Kadri Hacioglu
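The exponent trick can be shown with a toy per-dimension stand-in for the full matrix transform: derive the scaling that maps the male model vector to the female one, soften it with an exponent 0 < p < 1, and apply the softened scaling to the female vector. This diagonal simplification is my assumption; the patent operates on a full transformation matrix.

```python
def modified_transform(male_vec, female_vec, p=0.5):
    """Apply a softened male->female scaling to the female model
    vector to approximate a synthetic children's model vector."""
    scale = [f / m for m, f in zip(male_vec, female_vec)]  # per-dim transform
    return [f * s ** p for f, s in zip(female_vec, scale)]  # exponent p in (0, 1)
```

The intuition: children's vocal tracts continue the male-to-female shortening trend, so applying a fraction of that shift again pushes the female model toward a child-like one.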
  • Publication number: 20090222265
    Abstract: A voice recognition apparatus 10 includes a voice recognition means 12 for performing voice recognition, and a control means for controlling receipt of a voice input to the voice recognition means, and for performing recognition according to a result of the voice recognition acquired by the voice recognition means. In this voice recognition apparatus, the control means controls the receipt of a voice according to a timeout time which defines the end of the receipt of a voice. The voice recognition apparatus further includes an environmental condition detecting means 18 for detecting an environmental condition, and a timeout time control means 16 for changing the timeout time according to the environmental condition detected by the environmental condition detection means.
    Type: Application
    Filed: September 13, 2006
    Publication date: September 3, 2009
    Inventors: Ryo Iwamiya, Reiko Okada
  • Publication number: 20090216528
    Abstract: A method of adapting a neural network of an automatic speech recognition device, includes the steps of: providing a neural network including an input stage, an intermediate stage and an output stage, the output stage outputting phoneme probabilities; providing a linear stage in the neural network; and training the linear stage by means of an adaptation set; wherein the step of providing the linear stage includes the step of providing the linear stage after the intermediate stage.
    Type: Application
    Filed: June 1, 2005
    Publication date: August 27, 2009
    Inventors: Roberto Gemello, Franco Mana
  • Publication number: 20090177472
    Abstract: A node initializing unit generates a root node including inputted phonemic models. A candidate generating unit generates candidates of a pair of child sets by partitioning a set of phonemic models included in a node having no child node into two. A candidate deleting unit deletes candidates each including only phonemic models attached with determination information indicating that at least one of the child sets has a small amount of speech data for training. A similarity calculating unit calculates a sum of similarities among the phonemic models included in the child sets. A candidate selecting unit selects one of the candidates having a largest sum. A node generating unit generates two nodes including the two child sets included in the selected candidate, respectively. A clustering unit clusters the phonemic models in units of phonemic model sets each included in a node.
    Type: Application
    Filed: September 22, 2008
    Publication date: July 9, 2009
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Masaru Sakai
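The candidate generation, deletion, and selection steps can be sketched as one exhaustive search over two-way partitions (feasible only for small phoneme sets; the similarity callback and the "data-poor" set are illustrative stand-ins for the patent's determination information):

```python
from itertools import combinations

def best_split(phonemes, similarity, data_poor):
    """Enumerate two-way partitions of a phoneme set, delete candidates
    where either side consists only of data-poor phonemes, and select
    the split maximizing the sum of within-set similarities."""
    best, best_score = None, float("-inf")
    phonemes = list(phonemes)
    for r in range(1, len(phonemes)):
        for left in combinations(phonemes, r):
            right = tuple(p for p in phonemes if p not in left)
            if all(p in data_poor for p in left) or \
               all(p in data_poor for p in right):
                continue  # candidate deleted: a child set lacks training data
            score = sum(similarity(a, b) for side in (left, right)
                        for a, b in combinations(side, 2))
            if score > best_score:
                best, best_score = (left, right), score
    return best
```

The selected pair becomes two child nodes, and the process recurses to build the clustering tree.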
  • Publication number: 20090157403
    Abstract: A speech recognition apparatus generates a feature vector series corresponding to a speech signal, and recognizes a phoneme series corresponding to the feature vector series using sounds corresponding to phonemes and a phoneme language model. In addition, the speech recognition apparatus recognizes vocabulary that corresponds to the recognized phoneme series. At this time, the phoneme language model represents connection relationships between the phonemes, and is modeled according to time-variant characteristics of the phonemes.
    Type: Application
    Filed: December 12, 2008
    Publication date: June 18, 2009
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Hoon CHUNG, Yunkeun Lee
  • Publication number: 20090132233
    Abstract: A translation graph is created using a plurality of reference sources that include translations between a plurality of different languages. Each entry in a source is used to create a wordsense entry, and each new word in a source is used to create a wordnode entry. A pair of wordnode and wordsense entries corresponds to a translation. In addition, a probability is determined for each wordsense entry and is decreased for each translation entry that includes more than a predefined number of translations into the same language. Bilingual translation entries are removed if subsumed by a multilingual translation entry. Triangulation is employed to identify pairs of common wordsense translations between a first, second, and third language. Translations not found in reference sources can also be inferred from the data comprising the translation graph. The translation graph can then be used for searches of a data collection in different languages.
    Type: Application
    Filed: November 21, 2007
    Publication date: May 21, 2009
    Applicant: University of Washington
    Inventors: Oren Etzioni, Kobi Reiter, Marcus Sammer, Michael Schmitz, Stephen Soderland
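The triangulation step, inferring an A-to-C translation whenever A-to-B and B-to-C pairs share a pivot word, can be sketched as a join over translation pairs (a minimal illustration; the patent's probability weighting is omitted):

```python
def triangulate(pairs_ab, pairs_bc):
    """Infer A->C translation pairs not found directly: if (a, b) and
    (b, c) exist, propose (a, c). Inputs are sets of (source, target)."""
    by_pivot = {}
    for a, b in pairs_ab:
        by_pivot.setdefault(b, set()).add(a)
    inferred = set()
    for b, c in pairs_bc:
        for a in by_pivot.get(b, ()):
            inferred.add((a, c))
    return inferred
```

In the full graph, multiple independent pivots supporting the same (a, c) pair would raise confidence in the inferred translation.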
  • Publication number: 20090112590
    Abstract: Disclosed are systems and methods for dynamically interacting with a user through a spoken dialogue system. A method includes the steps of (1) receiving a user utterance, (2) analyzing the user utterance for a threshold determination of dialect, (3) generating a response that reflects an incremental implementation of the dialect, (4) further varying the perceived implementation of the dialect in subsequent responses by a process of: (a) receiving a subsequent user utterance, (b) determining a modified level of confidence in the dialect based at least in part from the subsequent utterance, (c) generating a subsequent response that implements an incremental variation according to the modified level of confidence.
    Type: Application
    Filed: October 30, 2007
    Publication date: April 30, 2009
    Applicant: AT&T Corp.
    Inventors: Gregory Pulz, Harry E. Blanchard, Steven H. Lewis, Lan Zhang
  • Publication number: 20090043573
    Abstract: A method and apparatus for identifying a speaker within a captured audio signal from a collection of known speakers. The method and apparatus receive or generate voice representations for each known speaker and tag the representations according to meta data related to the known speaker or to the voice. The representations are grouped into one or more groups according to the tags. When a voice to be recognized is introduced, characteristics are determined according to which the groups are prioritized, so that the representations participating in only part of the groups are matched against the voice to be identified, thus reducing identification time and improving the statistical significance.
    Type: Application
    Filed: August 9, 2007
    Publication date: February 12, 2009
    Applicant: NICE SYSTEMS LTD.
    Inventors: Adam WEINBERG, Irit OPHER, Eyal BENAROYA, Renan GUTMAN
  • Publication number: 20080319750
    Abstract: Monitoring a spoken-word audio stream for a relevant concept is disclosed. A speech recognition engine may recognize a plurality of words from the audio stream. Function words that do not indicate content may be removed from the plurality of words. A concept may be determined from at least one word recognized from the audio stream. The concept may be determined via a morphological normalization of the plurality of words. The concept may be associated with a time related to when the at least one word was spoken. A relevance metric may be computed for the concept. Computing the relevance metric may include assessing the temporal frequency of the concept within the audio stream. The relevance metric for the concept may be based on respective confidence scores of the at least one word. The concept, time, and relevance metric may be displayed in a graphical display.
    Type: Application
    Filed: June 20, 2007
    Publication date: December 25, 2008
    Applicant: Microsoft Corporation
    Inventors: Stephen Frederick Potter, Tal Saraf, David Gareth Ollason, Steve Sung-Nam Chang
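A relevance metric based on temporal frequency and word confidences might look like the following toy: count recent mentions of each concept and scale by mean confidence. The window semantics and the count-times-confidence formula are assumptions, not the patented computation.

```python
from collections import defaultdict

def concept_relevance(mentions, window):
    """mentions: list of (concept, time, confidence) tuples. Relevance
    of a concept = number of mentions within `window` seconds of its
    last mention, scaled by their mean confidence."""
    by_concept = defaultdict(list)
    for concept, t, conf in mentions:
        by_concept[concept].append((t, conf))
    scores = {}
    for concept, hits in by_concept.items():
        last = max(t for t, _ in hits)
        recent = [(t, c) for t, c in hits if t >= last - window]
        mean_conf = sum(c for _, c in recent) / len(recent)
        scores[concept] = len(recent) * mean_conf
    return scores
```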
  • Publication number: 20080300881
    Abstract: [Object] To provide recognition of natural speech for a speech application using a grammar method with little effort and cost.
    Type: Application
    Filed: June 20, 2008
    Publication date: December 4, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hiroaki Kashima, Yoshinori Tahara, Daisuke Tomoda
  • Publication number: 20080249778
    Abstract: Communications between users of different modalities are enabled by a single integrated platform that allows both the input of voice (from a telephone, for example) to be realized as text (such as an interactive text message) and allows the input of text (from the interactive text messaging application, for example) to be realized as voice (on the telephone). Real-time communication may be enabled between any permutation of any number of text devices (desktop, PDA, mobile telephone) and voice devices (mobile telephone, regular telephone, etc.). A call to a text device user may be initiated by a voice device user or vice versa.
    Type: Application
    Filed: April 3, 2007
    Publication date: October 9, 2008
    Applicant: Microsoft Corporation
    Inventors: William F. Barton, Francisco M. Galanes, Lawrence M. Ockene, Anand Ramakrishna, Tal Saraf
  • Publication number: 20080215329
    Abstract: Techniques are provided for generating improved language modeling. Such improved modeling is achieved by conditioning a language model on a state of a dialog for which the language model is employed. For example, the techniques of the invention may improve modeling of language for use in a speech recognizer of an automatic natural language based dialog system. Improved usability of the dialog system arises from better recognition of a user's utterances by a speech recognizer, associated with the dialog system, using the dialog state-conditioned language models. By way of example, the state of the dialog may be quantified as: (i) the internal state of the natural language understanding part of the dialog system; or (ii) words in the prompt that the dialog system played to the user.
    Type: Application
    Filed: March 28, 2008
    Publication date: September 4, 2008
    Applicant: International Business Machines Corporation
    Inventors: Satyanarayana Dharanipragada, Michael Daniel Monkowski, Harry W. Printz, Karthik Visweswariah
  • Publication number: 20080126094
    Abstract: A method, system, and computer program for generating a recognition model set. A technique is described that uses the log likelihood of real data to compute cross entropy, measuring the mismatch between training data and a model derived from that training data, and compares such mismatches between class-dependent models and a class-independent model as evidence for model replacement. By using the change in cross entropies when deciding whether to add class-independent Gaussian Mixture Models (GMMs), the good performance of class-dependent models is largely retained while decreasing the size and complexity of the model.
    Type: Application
    Filed: July 10, 2007
    Publication date: May 29, 2008
    Inventors: Eric W. Janke, Bin Jia
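The cross-entropy comparison driving the replacement decision can be illustrated with discrete distributions (the margin parameter and dict-based models are simplifying assumptions; the patent works with Gaussian mixture models):

```python
import math

def cross_entropy(data, model_probs):
    """Average negative log-likelihood of the data under a model given
    as a symbol -> probability dict."""
    return -sum(math.log(model_probs[x]) for x in data) / len(data)

def prefer_independent(data, dep_probs, indep_probs, margin=0.0):
    """Replace the class-dependent model with the class-independent one
    when the increase in cross entropy stays within `margin` nats."""
    return (cross_entropy(data, indep_probs)
            - cross_entropy(data, dep_probs)) <= margin
```

A small cross-entropy increase signals that the class-independent model fits nearly as well, so swapping it in shrinks the model set at little accuracy cost.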