Voice Recognition Patents (Class 704/246)
  • Patent number: 8775454
    Abstract: A system and method for collecting data may include a data collection device to obtain the data from a user, an apparatus for obtaining metadata for each word of the data from the user, an apparatus for obtaining a searchable transcript of the data and a device to store the searchable transcript. The metadata may be date data, time data, name data or location data and the data collection device may include a speech recognition engine to translate speech into searchable words. The speech recognition engine may provide a confidence level corresponding to the translation of the speech into searchable words.
    Type: Grant
    Filed: July 29, 2008
    Date of Patent: July 8, 2014
    Inventor: James L. Geer
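The combination of searchable words, per-word metadata, and a recognition confidence level described above maps naturally onto a small data structure. A minimal Python sketch (class names, fields, and the search interface are illustrative, not taken from the patent):

```python
from dataclasses import dataclass, field

@dataclass
class TranscriptWord:
    """One recognized word plus per-word metadata and recognition confidence."""
    text: str
    confidence: float                              # recognition confidence, 0.0-1.0
    metadata: dict = field(default_factory=dict)   # e.g. date, time, name, location

class SearchableTranscript:
    """Stores recognized words and supports simple keyword search."""
    def __init__(self):
        self.words = []

    def add(self, word: TranscriptWord):
        self.words.append(word)

    def search(self, term: str, min_confidence: float = 0.0):
        """Return stored words matching `term` whose confidence meets the threshold."""
        term = term.lower()
        return [w for w in self.words
                if w.text.lower() == term and w.confidence >= min_confidence]
```

The `min_confidence` filter reflects the abstract's note that the engine may attach a confidence level to each translated word.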
  • Patent number: 8775179
    Abstract: The illustrative embodiments described herein provide systems and methods for authenticating a speaker. In one embodiment, a method includes receiving reference speech input including a reference passphrase to form a reference recording, and receiving test speech input including a test passphrase to form a test recording. The method includes determining whether the test passphrase matches the reference passphrase, and determining whether one or more voice features of the speaker of the test passphrase match one or more voice features of the speaker of the reference passphrase. The method authenticates the speaker of the test speech input in response to determining that the reference passphrase matches the test passphrase and that one or more voice features of the speaker of the test passphrase match one or more voice features of the speaker of the reference passphrase.
    Type: Grant
    Filed: May 6, 2010
    Date of Patent: July 8, 2014
    Assignee: Senam Consulting, Inc.
    Inventor: Serge Olegovich Seyfetdinov
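The two-factor check in this abstract (passphrase text match AND voice-feature match) can be sketched as follows; the cosine-similarity comparison and the 0.8 threshold are illustrative stand-ins for a real voice-feature matcher:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def authenticate(ref_phrase, ref_features, test_phrase, test_features,
                 threshold=0.8):
    """Accept only if BOTH the passphrase text and the voice features match,
    mirroring the two determinations the abstract describes."""
    phrase_ok = ref_phrase.strip().lower() == test_phrase.strip().lower()
    voice_ok = cosine_similarity(ref_features, test_features) >= threshold
    return phrase_ok and voice_ok
```

Either failure alone rejects the speaker, matching the abstract's conjunctive condition.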
  • Patent number: 8775181
    Abstract: Interpretation from a first language to a second language via one or more communication devices is performed through a communication network (e.g. phone network or the internet) using a server for performing recognition and interpretation tasks, comprising the steps of: receiving an input speech utterance in a first language on a first mobile communication device; conditioning said input speech utterance; first transmitting said conditioned input speech utterance to a server; recognizing said first transmitted speech utterance to generate one or more recognition results; interpreting said recognition results to generate one or more interpretation results in an interlingua; mapping the interlingua to a second language in a first selected format; second transmitting said interpretation results in the first selected format to a second mobile communication device; and presenting said interpretation results in a second selected format on said second communication device.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: July 8, 2014
    Assignee: Fluential, LLC
    Inventors: Farzad Ehsani, Demitrios Master, Elaine Drom Zuber
  • Patent number: 8775189
    Abstract: A wireless communication device is disclosed that accepts recorded audio data from an end-user. The audio data can be in the form of a command requesting user action. Likewise, the audio data can be converted into a text file. The audio data is reduced to a digital file in a format that is supported by the device hardware, such as a .wav, .mp3, .vnf file, or the like. The digital file is sent via secured or unsecured wireless communication to one or more server computers for further processing. In accordance with an important aspect of the invention, the system evaluates the confidence level of the speech recognition process. If the confidence level is high, the system automatically builds the application command or creates the text file for transmission to the communication device.
    Type: Grant
    Filed: August 9, 2006
    Date of Patent: July 8, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Stephen S. Burns, Mickey W. Kowitz
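The confidence-gated routing the abstract describes might look like the following sketch: when the recognizer's confidence is high, automatically build the application command or the text file; otherwise defer (for example, to a human transcriber). The 0.85 threshold and return shape are illustrative assumptions:

```python
def handle_recognition(text, confidence, is_command, high=0.85):
    """Route a recognition result on its confidence level:
    high confidence -> build the command or text file automatically;
    low confidence  -> defer for further processing."""
    if confidence >= high:
        return ("command" if is_command else "text_file", text)
    return ("defer", text)
```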
  • Patent number: 8775166
    Abstract: An encoding method includes extracting core layer characteristic parameters and enhancement layer characteristic parameters of a background noise signal, and encoding the core layer characteristic parameters and enhancement layer characteristic parameters to obtain a core layer codestream and an enhancement layer codestream. The disclosure also provides an encoding device, a decoding device and method, an encapsulating method, a reconstructing method, an encoding-decoding system and an encoding-decoding method. By describing the background noise signal with the enhancement layer characteristic parameters, the background noise signal can be processed using a more accurate encoding and decoding method, so as to improve the quality of encoding and decoding the background noise signal.
    Type: Grant
    Filed: August 14, 2009
    Date of Patent: July 8, 2014
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Hualin Wan, Libin Zhang
  • Patent number: 8775180
    Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
    Type: Grant
    Filed: November 26, 2012
    Date of Patent: July 8, 2014
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Fonda J. Narke
  • Patent number: 8775178
    Abstract: Updating a voice template for recognizing a speaker on the basis of a voice uttered by the speaker is disclosed. Stored voice templates indicate distinctive characteristics of utterances from speakers. Distinctive characteristics are extracted for a specific speaker based on a voice message utterance received from that speaker. The distinctive characteristics are compared to the characteristics indicated by the stored voice templates to select a template that matches within a predetermined threshold. The selected template is updated on the basis of the extracted characteristics.
    Type: Grant
    Filed: October 27, 2009
    Date of Patent: July 8, 2014
    Assignee: International Business Machines Corporation
    Inventors: Yukari Miki, Masami Noguchi
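The select-then-update loop the abstract describes (find the closest stored template within a threshold, then adapt it toward the new utterance) might look like this; the Euclidean distance, threshold, and update rate are illustrative:

```python
def match_and_update(templates, features, threshold=0.25, rate=0.1):
    """Pick the stored template closest to the extracted features; if it lies
    within `threshold`, nudge it toward the new utterance (running average,
    mutating `templates` in place) and return the matched speaker name."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    best = min(templates, key=lambda name: dist(templates[name], features))
    if dist(templates[best], features) <= threshold:
        templates[best] = [t + rate * (f - t)
                           for t, f in zip(templates[best], features)]
        return best
    return None          # no template matched within the threshold
```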
  • Patent number: 8774392
    Abstract: A system and method for processing calls in a call center are described. A call session from a caller via a session manager and including incoming text messages of a verbal speech stream is assigned. The incoming text messages are progressively visually presented throughout the call session to a live agent on an agent console operatively coupled to the session manager. The incoming text messages are progressively processed through a customer support scenario interactively monitored and controlled by the live agent via the agent console. The incoming text messages are processed through automated script execution in concert with the live agent. Outgoing text messages are converted into a synthesized speech stream. The synthesized speech stream is sent via the agent console to the caller.
    Type: Grant
    Filed: June 17, 2013
    Date of Patent: July 8, 2014
    Assignee: Intellisist, Inc.
    Inventors: Gilad Odinak, Alastair Sutherland, William A. Tolhurst
  • Publication number: 20140188468
    Abstract: An apparatus, system and method for calculating passphrase variability are disclosed. The passphrase variability value can then be used for generating phonetically rich passwords in text-dependent speaker recognition systems, or for estimating the variability of the input passphrase in a text-independent system during the enrollment process in a speech recognition security system.
    Type: Application
    Filed: December 28, 2012
    Publication date: July 3, 2014
    Inventors: Dmitry Dyrmovskiy, Mikhail Khitrov
  • Publication number: 20140188471
    Abstract: This is directed to processing voice inputs received by an electronic device. In particular, this is directed to receiving a voice input and identifying the user providing the voice input. The voice input can be processed using a subset of words from a library used to identify the words or phrases of the voice input. The particular subset can be selected such that voice inputs provided by the user are more likely to include words from the subset. The subset of the library can be selected using any suitable approach, including for example based on the user's interests and words that relate to those interests. For example, the subset can include one or more words related to media items selected by the user for storage on the electronic device, names of the user's contacts, applications or processes used by the user, or any other words relating to the user's interactions with the device.
    Type: Application
    Filed: March 4, 2014
    Publication date: July 3, 2014
    Applicant: Apple Inc.
    Inventor: Allen P. HAUGHAY
  • Patent number: 8768711
    Abstract: A method of voice-enabling an application for command and control and content navigation can include the application dynamically generating a markup language fragment specifying a command and control and content navigation grammar for the application, instantiating an interpreter from a voice library, and providing the markup language fragment to the interpreter. The method also can include the interpreter processing a speech input using the command and control and content navigation grammar specified by the markup language fragment and providing an event to the application indicating an instruction representative of the speech input.
    Type: Grant
    Filed: June 17, 2004
    Date of Patent: July 1, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
  • Publication number: 20140180689
    Abstract: An apparatus and method for recognizing voice using multiple acoustic models are disclosed. An apparatus for recognizing voice using multiple acoustic models includes a voice data database (DB) configured to store voice data collected in various noise environments; a model generating means configured to perform classification for each speaker and environment based on the collected voice data, and to generate an acoustic model of a binary tree structure as the classification result; and a voice recognizing means configured to extract feature data of voice data when the voice data is received from a user, to select multiple models from the generated acoustic model based on the extracted feature data, to recognize the voice data in parallel based on the selected multiple models, and to output a word string corresponding to the voice data as the recognition result.
    Type: Application
    Filed: March 18, 2013
    Publication date: June 26, 2014
    Applicant: Electronics and Telecommunications Research Institute
    Inventor: Electronics and Telecommunications Research Institute
  • Patent number: 8762138
    Abstract: The present invention relates to a method as well as to a computing device (20) for editing a noise-database (13) containing noise information, said noise information being derived from noise signals within an audio stream (19). In order to enhance the possibilities for creating and utilizing context information that emerges from tracking noise signals in an audio stream, for example a telephone call, the above method is characterized by the following steps: A) in a localizing step (14), determining geographical data of the location the noise signals originate from; B) in an analyzing step (15), analyzing the noise signals with reference to the noise content; C) in a linking step, linking the analyzed noise signals to said geographical data to create noise information; D) in a storing step, storing said noise information within said noise-database (13).
    Type: Grant
    Filed: August 30, 2010
    Date of Patent: June 24, 2014
    Assignee: Vodafone Holding GmbH
    Inventors: Stefan Holtel, Jad Noueihed
  • Patent number: 8762156
    Abstract: A speech control system that can recognize a spoken command and associated words (such as “call mom at home”) and can cause a selected application (such as a telephone dialer) to execute the command to cause a data processing system, such as a smartphone, to perform an operation based on the command (such as look up mom's phone number at home and dial it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription which is provided to the selected application.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: June 24, 2014
    Assignee: Apple Inc.
    Inventor: Lik Harry Chen
  • Patent number: 8762147
    Abstract: A signal portion is extracted from an input signal for each frame having a specific duration to generate a per-frame input signal. The per-frame input signal in a time domain is converted into a per-frame input signal in a frequency domain, thereby generating a spectral pattern. Subband average energy is derived in each of subbands adjacent one another in the spectral pattern. The subband average energy is compared in at least one subband pair of a first subband and a second subband that is a higher frequency band than the first subband, the first and second subbands being consecutive subbands in the spectral pattern. It is determined that the per-frame input signal includes a consonant segment if the subband average energy of the second subband is higher than the subband average energy of the first subband.
    Type: Grant
    Filed: February 1, 2012
    Date of Patent: June 24, 2014
    Assignee: JVC KENWOOD Corporation
    Inventors: Akiko Akechi, Takaaki Yamabe
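The subband comparison the abstract describes can be sketched directly: split a frame's spectral pattern into subbands, compute each subband's average energy, and flag a consonant segment when a higher-frequency subband carries more average energy than the one below it. The subband count is illustrative, and the frame's spectrum is assumed already computed:

```python
def is_consonant_frame(spectrum, n_subbands=4):
    """Return True when any higher-frequency subband has greater average
    energy than the consecutive subband below it, per the abstract's test.
    `spectrum` is a list of spectral magnitudes for one frame (length
    assumed to be at least `n_subbands`)."""
    step = len(spectrum) // n_subbands
    energies = [sum(x * x for x in spectrum[i * step:(i + 1) * step]) / step
                for i in range(n_subbands)]
    # Compare each pair of consecutive subbands (first vs. second).
    return any(hi > lo for lo, hi in zip(energies, energies[1:]))
```

Consonants tend to concentrate energy at higher frequencies than vowels, which is the intuition behind the comparison.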
  • Patent number: 8762149
    Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice, comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify (10) that the speaker's voice corresponds to the speaker whose identity is to be verified, based on the received voice utterance; and c) verifying (12, 13) that the received voice utterance is not falsified, preferably after having verified the speaker's voice; d) accepting (16) the speaker's identity to be verified in case both verification steps give a positive result, and not accepting (15) the speaker's identity to be verified if either of the verification steps gives a negative result. The invention further refers to a corresponding computer readable medium and a computer.
    Type: Grant
    Filed: December 10, 2008
    Date of Patent: June 24, 2014
    Inventors: Marta Sánchez Asenjo, Alfredo Gutiérrez Navarro, Alberto Martín de los Santos de las Heras, Marta García Gomar
  • Patent number: 8756057
    Abstract: A speech analysis system and method for analyzing speech. The system includes: a voice recognition system for converting inputted speech to text; an analytics system for generating feedback information by analyzing the inputted speech and text; and a feedback system for outputting the feedback information.
    Type: Grant
    Filed: November 2, 2005
    Date of Patent: June 17, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Steven Michael Miller, Anne R. Sand
  • Publication number: 20140163984
    Abstract: A method of voice recognition and an electronic apparatus applying the method are described. The method includes taking i=1 and detecting corresponding i-th voice sub-information at a moment Ti when the electronic apparatus detects that a user starts to talk at a moment T0, wherein the i-th voice sub-information is corresponding voice information from the moment T0 to the moment Ti, the i-th voice sub-information is partial voice information of voice information with integral semantics corresponding to a moment Tj after the moment T0 to the moment Ti, and i is an integer greater than or equal to 1; and analyzing the i-th voice sub-information to obtain M results of analysis, M being an integer greater than or equal to 1.
    Type: Application
    Filed: December 10, 2013
    Publication date: June 12, 2014
    Applicants: Lenovo (Beijing) Co., Ltd., Beijing Lenovo Software Ltd.
    Inventors: Haisheng Dai, Qianying Wang, Hao Wang
  • Publication number: 20140163985
    Abstract: A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique.
    Type: Application
    Filed: February 17, 2014
    Publication date: June 12, 2014
    Applicant: Google Inc.
    Inventors: Petar Aleksic, Xin Lei
  • Patent number: 8751241
    Abstract: The current invention provides a method and system for enabling a device function of a vehicle. A speech input stream is received at a telematics unit. A speech input context is determined for the received speech input stream. The received speech input stream is processed based on the determination and the device function of the vehicle is enabled responsive to the processed speech input stream. A vehicle device in control of the enabled device function of the vehicle is directed based on the processed speech input stream. A computer usable medium with suitable computer program code is employed for enabling a device function of a vehicle.
    Type: Grant
    Filed: April 10, 2008
    Date of Patent: June 10, 2014
    Assignee: General Motors LLC
    Inventors: Christopher L. Oesterling, William E. Mazzara, Jr., Jeffrey M. Stefan
  • Patent number: 8751145
    Abstract: A voice recognition method for finding a street uses a database including information about a plurality of streets. The streets are characterized by respective street names and street types. A user provides a voice input for the street that the user tries to find. The voice input includes a street name and a street type. The street type is recognized by processing the voice input. Streets having the recognized street type are then selected from the database, and a street name of at least one of the streets selected from the database is recognized by processing the voice input.
    Type: Grant
    Filed: November 30, 2005
    Date of Patent: June 10, 2014
    Assignees: Volkswagen of America, Inc., Audi AG
    Inventors: Ramon Eduardo Prieto, Carsten Bergmann, William B. Lathrop, M. Kashif Imam, Gerd Gruchalski, Markus Möhrle
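The two-pass lookup the abstract describes (recognize the street type first, then match the name only among streets of that type) can be sketched with plain substring matching standing in for the real recognizer; the `streets` mapping is an illustrative stand-in for the database:

```python
def find_street(utterance, streets):
    """Two-pass street lookup: recognize the street type from the utterance,
    restrict candidates to streets of that type, then recognize the name
    among the reduced candidate set. `streets` maps name -> type."""
    types = set(streets.values())
    spoken_type = next((t for t in types if t in utterance.lower()), None)
    if spoken_type is None:
        return None
    candidates = [n for n, t in streets.items() if t == spoken_type]
    return next((n for n in candidates if n.lower() in utterance.lower()), None)
```

Restricting the name search to one street type shrinks the recognition vocabulary, which is the accuracy benefit the patent targets.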
  • Patent number: 8751240
    Abstract: A combination and a method are provided. Automatic speech recognition is performed on a received utterance. A meaning of the utterance is determined based, at least in part, on the recognized speech. At least one query is formed based, at least in part, on the determined meaning of the utterance. The at least one query is sent to at least one searching mechanism to search for an address of at least one web page that satisfies the at least one query.
    Type: Grant
    Filed: May 13, 2005
    Date of Patent: June 10, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Steven Hart Lewis, Kenneth H. Rosen
  • Patent number: 8750489
    Abstract: A system and method for automatic call segmentation including steps and means for automatically detecting boundaries between utterances in the call transcripts; automatically classifying utterances into target call sections; automatically partitioning the call transcript into call segments; and outputting a segmented call transcript. A training method and apparatus for training the system to perform automatic call segmentation includes steps and means for providing at least one training transcript with annotated call sections; normalizing the at least one training transcript; and performing statistical analysis on the at least one training transcript.
    Type: Grant
    Filed: October 23, 2008
    Date of Patent: June 10, 2014
    Assignee: International Business Machines Corporation
    Inventor: Youngja Park
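The detect/classify/partition flow in this abstract might be sketched as follows, with a pluggable classifier standing in for the statistically trained models the patent describes:

```python
def segment_call(utterances, classify):
    """Partition a call transcript into contiguous segments by merging
    consecutive utterances that `classify` assigns the same section label.
    Returns a list of (label, utterances) pairs in call order."""
    segments = []
    for utt in utterances:
        label = classify(utt)
        if segments and segments[-1][0] == label:
            segments[-1][1].append(utt)      # same section: extend segment
        else:
            segments.append((label, [utt]))  # boundary: start a new segment
    return segments
```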
  • Patent number: 8751233
    Abstract: A speaker-verification digital signature system is disclosed that provides greater confidence in communications having digital signatures because a signing party may be prompted to speak a text-phrase that may be different for each digital signature, thus making it difficult for anyone other than the legitimate signing party to provide a valid signature.
    Type: Grant
    Filed: July 31, 2012
    Date of Patent: June 10, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Pradeep K. Bansal, Lee Begeja, Carroll W. Creswell, Jeffrey Farah, Benjamin J. Stern, Jay Wilpon
  • Publication number: 20140156276
    Abstract: A dialogue system which correctly identifies an utterance directed to a dialogue system by using various pieces of information including information other than a voice recognition result without requiring a special signal is provided. A dialogue system includes an utterance detection/voice recognition unit that detects an utterance and recognizes a voice and an utterance feature extraction unit that extracts features of an utterance. The utterance feature extraction unit determines whether or not a target utterance is directed to the dialogue system based on features including a length of the target utterance, time relation between the target utterance and a previous utterance, and a system state.
    Type: Application
    Filed: May 23, 2013
    Publication date: June 5, 2014
    Applicant: Honda Motor Co., Ltd.
    Inventors: Mikio NAKANO, Kazunori KOMATANI, Akira HIRANO
  • Patent number: 8744850
    Abstract: Challenge items for an audible based electronic challenge system are generated using a variety of techniques to identify optimal candidates. The challenge items are intended for use in a computing system that discriminates between humans and text-to-speech (TTS) systems.
    Type: Grant
    Filed: January 14, 2013
    Date of Patent: June 3, 2014
    Assignee: John Nicholas and Kristin Gross
    Inventor: John Nicholas Gross
  • Patent number: 8738376
    Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: May 27, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
  • Patent number: 8731926
    Abstract: In a spoken term detection apparatus, processing performed by a processor includes: a feature extraction process extracting an acoustic feature from speech data accumulated in an accumulation part and storing the extracted acoustic feature in an acoustic feature storage part; a first calculation process calculating a standard score from a similarity between an acoustic feature stored in the acoustic feature storage part and an acoustic model stored in the acoustic model storage part; a second calculation process comparing an acoustic model corresponding to an input keyword with the acoustic feature stored in the acoustic feature storage part to calculate a score of the keyword; and a retrieval process retrieving speech data including the keyword from the speech data accumulated in the accumulation part, based on the score of the keyword calculated by the second calculation process and the standard score stored in the standard score storage part.
    Type: Grant
    Filed: March 3, 2011
    Date of Patent: May 20, 2014
    Assignee: Fujitsu Limited
    Inventors: Nobuyuki Washio, Shouji Harada
  • Patent number: 8731925
    Abstract: The present invention can include a speech enrollment system including an ordered stack of grammars and a recognition engine. The ordered stack of grammars can include an application grammars layer, a confusable grammar layer, a personal grammar layer, a phrase enrolled grammar layer, and an enrollment grammar layer. The recognition engine can return recognition results for speech input by processing the input using the ordered stack of grammars. The processing can occur from the topmost layer in the stack to the bottommost layer in the stack. Each layer in the stack can include exit criteria based upon a defined condition. When the exit criteria are satisfied, a result can be returned based upon that layer, and lower layers of the ordered stack can be ignored.
    Type: Grant
    Filed: December 22, 2006
    Date of Patent: May 20, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: William V. Da Palma, Brien H. Muschett
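The top-to-bottom walk over the ordered grammar stack might look like this; representing each layer as a (match, exit_criterion) pair of callables is an assumption made for illustration:

```python
def recognize_with_stack(speech, layers):
    """Process `speech` through an ordered stack of grammar layers, topmost
    first. Each layer is a (match, exit_criterion) pair; when a layer's exit
    criterion is satisfied by its result, that result is returned and all
    lower layers are ignored, per the abstract."""
    for match, exit_criterion in layers:
        result = match(speech)
        if exit_criterion(result):
            return result
    return None  # no layer's exit criteria were satisfied
```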
  • Patent number: 8731935
    Abstract: A method, system, and computer program product for issuing an alert in response to detecting a content of interest in a conference. A listening logic comprising multiple conference engines monitors speakers, topics, and words spoken during a conference. A speech-to-text engine monitors the conference and records a transcription. A word emphasis engine monitors the transcription for key words. A voice identification engine monitors the live conversation and the recorded transcript, in real time, for a particular individual to begin speaking. An outline engine may create an outline of the transcription. The listening device may issue an alert upon detecting a content of interest in the conference. The listening device may additionally display an outline or a selected portion of the transcript regarding a particular content of interest to inform a user of the listening device of a portion of content of the conference that may have been missed.
    Type: Grant
    Filed: September 10, 2009
    Date of Patent: May 20, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Timothy R. Chavez, Jacob Daniel Eisinger, Jennifer Elizabeth King, William R. Reichert
  • Patent number: 8731929
    Abstract: Systems and methods for receiving natural language queries and/or commands and executing the queries and/or commands. The systems and methods overcome the deficiencies of prior art speech query and response systems through the application of a complete speech-based information query, retrieval, presentation and command environment. This environment makes significant use of context, prior information, domain knowledge, and user-specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created. The systems and methods create, store and use extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command.
    Type: Grant
    Filed: February 4, 2009
    Date of Patent: May 20, 2014
    Assignee: VoiceBox Technologies Corporation
    Inventors: Robert A. Kennewick, David Locke, Michael R. Kennewick, Sr., Michael R. Kennewick, Jr., Richard Kennewick, Tom Freeman
  • Patent number: 8731940
    Abstract: A method of controlling a system which includes the steps of obtaining at least one signal representative of information communicated by a user via an input device in an environment of the user, wherein a signal from a first source is available in a perceptible form in the environment; estimating at least a point in time when a transition between information flowing from the first source and information flowing from the user is expected to occur; and timing the performance of a function by the system in relation to the estimated time.
    Type: Grant
    Filed: September 11, 2009
    Date of Patent: May 20, 2014
    Assignee: Koninklijke Philips N.V.
    Inventor: Aki Sakari Harma
  • Publication number: 20140136203
    Abstract: Some implementations provide a method for identifying a speaker. The method determines position and orientation of a second device based on data from a first device that is for capturing the position and orientation of the second device. The second device includes several microphones for capturing sound. The second device has movable position and movable orientation. The method assigns an object as a representation of a known user. The object has a moveable position. The method receives a position of the object. The position of the object corresponds to a position of the known user. The method processes the captured sound to identify a sound originating from the direction of the object. The direction of the object is relative to the position and the orientation of the second device. The method identifies the sound originating from the direction of the object as belonging to the known user.
    Type: Application
    Filed: December 21, 2012
    Publication date: May 15, 2014
    Applicant: QUALCOMM Incorporated
    Inventors: Kexi Liu, Pei Xiang
  • Publication number: 20140136194
    Abstract: The methods, apparatus, and systems described herein are designed to identify fraudulent callers. A voice print of a call is created and compared to known voice prints to determine if it matches one or more of the known voice prints. The methods include a pre-processing step to separate speech from non-speech, selecting a number of elements that affect the voice print the most, and/or computing an adjustment factor based on the scores of each received voice print against known voice prints.
    Type: Application
    Filed: November 9, 2012
    Publication date: May 15, 2014
    Applicant: Mattersight Corporation
    Inventors: Roger Warford, Douglas Brown, Christopher Danson, David Gustafson
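The adjustment-factor idea (normalizing each comparison score by how the incoming voice print scores against all known prints) can be sketched as follows; using the mean score as the adjustment factor and a fixed margin are illustrative assumptions, not the patent's actual computation:

```python
def flag_fraud(scores, margin=0.1):
    """Given `scores` mapping known-fraudster voice print name -> similarity
    score for one incoming call, compute an adjustment factor (the mean score)
    and flag prints that exceed it by `margin`. This normalizes away callers
    whose voices score uniformly high or low against every print."""
    if not scores:
        return []
    adjustment = sum(scores.values()) / len(scores)
    return [name for name, s in scores.items() if s - adjustment >= margin]
```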
  • Patent number: 8725508
    Abstract: A computer-implemented method and apparatus for searching for an element sequence, the method comprising: receiving a signal; determining an initial segment of the signal; inputting the initial segment into an element extraction engine to obtain a first element sequence; determining one or more second segments, each of the second segments at least partially overlapping with the initial segment; inputting the second segments into the element extraction engine to obtain at least one second element sequence; and searching for an element subsequence common to at least a predetermined number of sequences of the first element sequence and the second element sequences.
    Type: Grant
    Filed: March 27, 2012
    Date of Patent: May 13, 2014
    Assignee: Novospeech
    Inventor: Yossef Ben-Ezra
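The search flow above (initial segment, overlapping shifted segments, element extraction, then a subsequence common to at least a predetermined number of the resulting sequences) can be sketched as follows. The `extract` callable stands in for the element extraction engine, and restricting the search to contiguous subsequences of the first sequence is a simplifying assumption:

```python
def common_subsequence(sequences, min_count):
    """Longest contiguous subsequence of the first sequence that occurs in
    at least min_count of the given sequences (the first included)."""
    first = sequences[0]

    def occurs(sub, seq):
        m = len(sub)
        return any(seq[i:i + m] == sub for i in range(len(seq) - m + 1))

    # Try longer candidates first so the first hit is the longest.
    for length in range(len(first), 0, -1):
        for start in range(len(first) - length + 1):
            sub = first[start:start + length]
            if sum(1 for seq in sequences if occurs(sub, seq)) >= min_count:
                return sub
    return []

def search_signal(signal, extract, seg_len, shifts, min_count):
    """Extract element sequences from an initial segment and from shifted,
    partially overlapping segments, then search for an element subsequence
    common to at least min_count of them."""
    segments = [signal[s:s + seg_len] for s in [0] + shifts]
    sequences = [extract(seg) for seg in segments]
    return common_subsequence(sequences, min_count)
```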
  • Publication number: 20140129223
    Abstract: A method and apparatus for voice recognition are disclosed. The apparatus includes: a voice receiver which receives a user's voice signal; a first voice recognition engine which receives the voice signal and recognizes voice based on it; a communicator which receives the voice signal and transmits it to an external second voice recognition engine; and a controller. The controller transmits the voice signal from the voice receiver to the first voice recognition engine. If the first voice recognition engine can recognize voice from the voice signal, the controller outputs that engine's recognition results; if it cannot, the controller transmits the voice signal to the second voice recognition engine through the communicator.
    Type: Application
    Filed: October 3, 2013
    Publication date: May 8, 2014
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Eun-sang BAK
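The controller logic in the abstract above, local engine first with a server-side fallback, reduces to a few lines. The engine interfaces (callables returning a result or `None`) are an assumption for the sketch:

```python
def recognize(voice_signal, local_engine, remote_engine):
    """Hybrid recognition controller sketch: try the embedded engine first,
    and fall back to the external server-side engine only when the local
    one cannot produce a result. Returns (result, which_engine)."""
    result = local_engine(voice_signal)
    if result is not None:  # local engine recognized the utterance
        return result, "local"
    return remote_engine(voice_signal), "remote"
```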
  • Patent number: 8719017
    Abstract: Speech recognition models are dynamically re-configurable based on user information, background information such as background noise, and transducer information such as transducer response characteristics, to provide users with alternate input modes to keyboard text entry. The techniques of dynamically re-configurable speech recognition allow speech recognition to be deployed on small devices such as mobile phones and personal digital assistants, as well as in environments such as the office, home, or vehicle, while maintaining recognition accuracy.
    Type: Grant
    Filed: May 15, 2008
    Date of Patent: May 6, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Richard C Rose, Bojana Gajic
  • Patent number: 8719019
    Abstract: Speaker identification techniques are described. In one or more implementations, sample data is received at a computing device of one or more user utterances captured using a microphone. The sample data is processed by the computing device to identify a speaker of the one or more user utterances. The processing involves use of a feature set that includes features obtained using a filterbank having filters spaced linearly at higher frequencies and logarithmically at lower frequencies, features that model the speaker's vocal tract transfer function, and features that indicate the vibration rate of the vocal folds of the speaker of the sample data.
    Type: Grant
    Filed: April 25, 2011
    Date of Patent: May 6, 2014
    Assignee: Microsoft Corporation
    Inventors: Hoang T. Do, Ivan J. Tashev, Alejandro Acero, Jason S. Flaks, Robert N. Heitkamp, Molly R. Suver
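One of the three feature families named above, the vibration rate of the vocal folds (pitch), can be estimated with a plain autocorrelation search. A minimal sketch; real systems add windowing, score normalization, and a voicing decision:

```python
import math

def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate vocal-fold vibration rate (pitch, in Hz) by finding the lag
    that maximizes the frame's autocorrelation within the plausible
    pitch range [fmin, fmax]."""
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, min(lag_max, len(frame) - 1) + 1):
        corr = sum(frame[i] * frame[i - lag] for i in range(lag, len(frame)))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return sample_rate / best_lag if best_lag else 0.0
```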
  • Patent number: 8719020
    Abstract: Embodiments of the present invention provide systems, methods, and computer-readable media for generating a voice characteristic profile based on detected sound components. In embodiments, a call is initiated between a first caller and a second caller. Information communicated during the call is monitored to determine that sound components have been spoken by the first caller. The sound components are determined to be associated with a language dialect. Further, the sound components are stored in association with the first caller. In particular, the sound components are stored in association with the first caller in a voice characteristic profile of the first caller.
    Type: Grant
    Filed: January 7, 2013
    Date of Patent: May 6, 2014
    Assignee: Sprint Communications Company L.P.
    Inventors: Mark D. Peden, Simon Youngs, Gary D. Koller, Piyush Jethwa
  • Patent number: 8719023
    Abstract: An apparatus to improve robustness to environmental changes of a context dependent speech recognizer for an application, that includes a training database to store sounds for speech recognition training, a dictionary to store words supported by the speech recognizer, and a speech recognizer training module to train a set of one or more multiple state Hidden Markov Models (HMMs) with use of the training database and the dictionary. The speech recognizer training module performs a non-uniform state clustering process on each of the states of each HMM, which includes using a different non-uniform cluster threshold for at least some of the states of each HMM to more heavily cluster and correspondingly reduce a number of observation distributions for those of the states of each HMM that are less empirically affected by one or more contextual dependencies.
    Type: Grant
    Filed: May 21, 2010
    Date of Patent: May 6, 2014
    Assignee: Sony Computer Entertainment Inc.
    Inventors: Xavier Menendez-Pidal, Ruxin Chen
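The non-uniform clustering idea above, a per-state threshold that clusters more heavily (fewer observation distributions) for states less affected by context, can be sketched in one dimension. The greedy merge rule and the threshold-scaling formula are illustrative assumptions, not the patented procedure:

```python
def cluster_distributions(means, threshold):
    """Greedy 1-D agglomerative clustering: merge distributions whose means
    lie within `threshold` of a neighbor, returning the cluster centers."""
    clusters = []
    for m in sorted(means):
        if clusters and m - clusters[-1][-1] <= threshold:
            clusters[-1].append(m)
        else:
            clusters.append([m])
    return [sum(c) / len(c) for c in clusters]

def nonuniform_state_clustering(hmm_states, context_sensitivity, base=0.5):
    """Give each HMM state its own cluster threshold: larger (heavier
    clustering, fewer distributions) for states that are less empirically
    affected by contextual dependencies.

    hmm_states: state -> list of observation-distribution means.
    context_sensitivity: state -> value in (0, 1]; low means context-insensitive.
    """
    clustered = {}
    for state, means in hmm_states.items():
        threshold = base / context_sensitivity[state]  # low sensitivity => big threshold
        clustered[state] = cluster_distributions(means, threshold)
    return clustered
```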
  • Patent number: 8719016
    Abstract: A method for converting speech to text in a speech analytics system is provided. The method includes receiving audio data containing speech made up of sounds from an audio source, processing the sounds with a phonetic module resulting in symbols corresponding to the sounds, and processing the symbols with a language module and occurrence table resulting in text. The method also includes determining a probability of correct translation for each word in the text, comparing the probability of correct translation for each word in the text to the occurrence table, and adjusting the occurrence table based on the probability of correct translation for each word in the text.
    Type: Grant
    Filed: April 7, 2010
    Date of Patent: May 6, 2014
    Assignee: Verint Americas Inc.
    Inventors: Omer Ziv, Ran Achituv, Ido Shapira
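The feedback loop above, comparing per-word translation confidence to the occurrence table and adjusting the table, can be sketched as a running update. The table layout (word to weight) and the exponential-style update rule are assumptions:

```python
def adjust_occurrence_table(transcript, occurrence, learning_rate=0.1):
    """Nudge each word's weight in the occurrence table toward its observed
    probability of correct translation, so words that keep decoding with
    high confidence become more likely in future decoding.

    transcript: list of (word, probability_correct) pairs.
    occurrence: dict mapping word -> relative frequency weight (mutated).
    """
    for word, prob in transcript:
        current = occurrence.get(word, 0.0)
        # Move the stored weight part of the way toward the observed confidence.
        occurrence[word] = current + learning_rate * (prob - current)
    return occurrence
```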
  • Patent number: 8719018
    Abstract: A biometric speaker-identification apparatus is disclosed that generates ordered speaker-identity candidates for a probe based on prototypes. Probe match scores are clustered, and templates that correspond to clusters having top M probe match scores are compared with the prototypes to obtain template-prototype match scores. The probe is also compared with the prototypes, and those templates corresponding to template-prototype match scores that are nearest to probe-prototype match scores are selected as speaker-identity candidates. The speaker-identity candidates are ordered based on their similarity to the probe.
    Type: Grant
    Filed: October 25, 2010
    Date of Patent: May 6, 2014
    Assignee: Lockheed Martin Corporation
    Inventor: Jonathan J. Dinerstein
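The prototype idea above, representing both probe and templates by their match scores against a small prototype set and picking templates whose score vectors lie nearest the probe's, can be sketched directly. The `similarity` scorer and the Euclidean nearness measure in prototype space are assumptions:

```python
import math

def prototype_candidates(probe, templates, prototypes, similarity, top_k=3):
    """Return ordered speaker-identity candidates for the probe.

    Instead of exhaustively re-scoring the probe against every template,
    each voice is summarized by its vector of match scores against the
    prototypes; candidates are templates nearest the probe in that space.
    `similarity(a, b)` is any scorer where higher means more alike.
    """
    probe_vec = [similarity(probe, p) for p in prototypes]
    ranked = []
    for name, tmpl in templates.items():
        tmpl_vec = [similarity(tmpl, p) for p in prototypes]
        dist = math.dist(probe_vec, tmpl_vec)  # nearness in prototype space
        ranked.append((dist, name))
    ranked.sort()
    return [name for _, name in ranked[:top_k]]  # ordered by similarity to probe
```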
  • Publication number: 20140122075
    Abstract: A voice recognition apparatus is provided. The voice recognition apparatus comprises: a voice receiver which receives a user's voice signal; a first voice recognition engine which receives the voice signal and performs a voice recognition process; a communication unit which receives the voice signal and transmits the voice signal to an external second voice recognition engine; and a controller which transmits the voice signal received through the voice receiver to at least one of the first voice recognition engine and the communication unit.
    Type: Application
    Filed: August 1, 2013
    Publication date: May 1, 2014
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Eun-sang BAK, Myung-jae KIM, Yu LIU, Geo-geun PARK
  • Publication number: 20140122076
    Abstract: A voice command system for a stitcher includes a tablet device in operative communication with the stitcher; the tablet device further comprising a display screen; a memory; a microprocessor; a communication module; and a microphone; and a speech recognition algorithm operatively communicating with said tablet device. An associated method includes the steps of digitizing a user's spoken command; transmitting the digitized spoken command to the speech recognition algorithm; producing a list of words possibly comprising the spoken command; parsing the list of possible words to identify the spoken command; and initiating execution of the spoken command.
    Type: Application
    Filed: October 25, 2013
    Publication date: May 1, 2014
    Applicant: GAMMILL, INC.
    Inventors: Theodore Stokes, Joseph W. Bauman
  • Publication number: 20140118472
    Abstract: In one embodiment, a method includes receiving requests to join a conference from a plurality of user devices proximate a first endpoint. The requests include a username. The method also includes receiving an audio signal for the conference from the first endpoint. The first endpoint is operable to capture audio proximate the first endpoint. The method also includes transmitting the audio signal to a second endpoint, remote from the first endpoint. The method also includes identifying, by a processor, an active speaker proximate the first endpoint based on information received from the plurality of user devices.
    Type: Application
    Filed: October 31, 2012
    Publication date: May 1, 2014
    Inventors: Yanghua Liu, Weidong Chen, Biren Gandhi, Raghurama Bhat, Joseph Fouad Khouri, John Joseph Houston, Brian Thomas Toombs
  • Publication number: 20140122074
    Abstract: In one exemplary embodiment, a computer-implemented method includes the step of determining an age group of a first user. Media content available to the first user is identified, and it is determined whether the user has permission to listen to the media content. When the user does not have permission, the media content is jammed with a sound wave at a frequency that can be heard by the user. Optionally, a voice age-recognition algorithm is used to determine the age group of the first user. An age group of a second user can also be determined; the first user and the second user may be proximate to a media player providing the ambient sound stream.
    Type: Application
    Filed: October 29, 2012
    Publication date: May 1, 2014
    Inventors: Amit V. Karmarkar, Richard Ross Peters
  • Patent number: 8713542
    Abstract: Pausing a VoiceXML dialog of a multimodal application, including: generating, by the multimodal application, a pause event; responsive to the pause event, temporarily pausing the dialog by the VoiceXML interpreter; generating, by the multimodal application, a resume event; and responsive to the resume event, resuming the dialog. Embodiments are implemented with the multimodal application operating on a multimodal device supporting multiple modes of interaction, including a voice mode and one or more non-voice modes; the multimodal application is operatively coupled to a VoiceXML interpreter, and the VoiceXML interpreter is interpreting the VoiceXML dialog to be paused.
    Type: Grant
    Filed: February 27, 2007
    Date of Patent: April 29, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., David Jaramillo, Gerald M. McCobb
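The pause/resume control flow above, where the application posts events and the interpreter suspends or continues the dialog in response, can be sketched as a small event-driven interpreter. Class and method names are illustrative, not Nuance's API:

```python
class DialogInterpreter:
    """Minimal dialog interpreter that plays prompts in order and honors
    pause/resume events posted by the (multimodal) application."""

    def __init__(self, prompts):
        self.prompts = list(prompts)
        self.position = 0
        self.paused = False

    def handle_event(self, event):
        if event == "pause":
            self.paused = True   # temporarily suspend the dialog
        elif event == "resume":
            self.paused = False  # continue from where we stopped

    def run_step(self):
        """Play the next prompt unless the dialog is paused or finished."""
        if self.paused or self.position >= len(self.prompts):
            return None
        prompt = self.prompts[self.position]
        self.position += 1
        return prompt
```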
  • Patent number: 8706493
    Abstract: In one embodiment of a controllable prosody re-estimation system, a TTS/STS engine consists of a prosody prediction/estimation module, a prosody re-estimation module, and a speech synthesis module. The prosody prediction/estimation module generates predicted or estimated prosody information. The prosody re-estimation module then re-estimates that prosody information and produces new prosody information, according to a set of controllable parameters provided by a controllable prosody parameter interface. The new prosody information is provided to the speech synthesis module to produce synthesized speech.
    Type: Grant
    Filed: July 11, 2011
    Date of Patent: April 22, 2014
    Assignee: Industrial Technology Research Institute
    Inventors: Cheng-Yuan Lin, Chien-Hung Huang, Chih-Chung Kuo
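The re-estimation step above, remapping predicted prosody under controllable parameters, can be sketched with a linear mapping around the utterance mean. Treating prosody as a per-syllable pitch contour and the specific parameter names are assumptions:

```python
def re_estimate_prosody(predicted, params):
    """Re-estimate a predicted pitch contour (Hz values) under controllable
    parameters: scale the excursion around the utterance mean, then apply a
    global shift. >1 scale exaggerates intonation; shift raises overall pitch."""
    mean = sum(predicted) / len(predicted)
    scale = params.get("pitch_range_scale", 1.0)
    shift = params.get("pitch_shift", 0.0)
    return [mean + scale * (f - mean) + shift for f in predicted]
```

With scale 2.0 and shift 10.0, a flat-ish contour becomes noticeably more expressive while keeping its shape, which is the point of exposing these knobs through a parameter interface.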
  • Patent number: 8706485
    Abstract: The present invention pertains to a method and a communication device (100) for associating a contact record pertaining to a remote speaker (220) with a mnemonic image (191) based on attributes of the speaker (220). The method comprises receiving voice data of the speaker (220) in a communication session with a source device (200). A source determination representing the speaker (220) is registered, and the received voice data is then analyzed so that voice data characteristics can be extracted. Based on these characteristics, a mnemonic image (191) can be selected and associated with the contact record in which the source determination is stored. The mnemonic image (191) may be selected from among images previously stored in the device, or derived by editing such images.
    Type: Grant
    Filed: May 17, 2011
    Date of Patent: April 22, 2014
    Assignees: Sony Corporation, Sony Mobile Communications AB
    Inventor: Joakim Martensson
  • Patent number: 8706488
    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
    Type: Grant
    Filed: February 27, 2013
    Date of Patent: April 22, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Michael D. Edgington, Laurence Gillick, Jordan R. Cohen