Voice Recognition Patents (Class 704/246)
  • Patent number: 8510116
    Abstract: Methods and systems are disclosed for capturing consumer voice signatures. The methods and systems synchronize voice signatures with display of the terms and conditions of a transaction to the consumer in real time during a phone conversation. In one implementation, mobile devices with text display capability may be used to display the terms/conditions of the transaction while the consumer is talking to a customer service representative or an interactive voice response system. The terms/conditions may be displayed as a scrollable document on the mobile device to which the consumer may then agree during the phone conversation. The consumer may then “voice sign” by reading the displayed terms/conditions, or some portion thereof, during the phone conversation to manifest his/her knowing consent. Such an arrangement helps promote the use of voice signatures by consumers in a manner that complies with the requirements of the federal E-Sign Act.
    Type: Grant
    Filed: October 26, 2007
    Date of Patent: August 13, 2013
    Assignee: United Services Automobile Association (USAA)
    Inventors: Curt Wayne Moy, Sakina Hassonjee, Amy Irene Forsythe, Linda Giessel King, Sarah Brooke Severson
  • Patent number: 8509832
    Abstract: Methods, systems, devices, and computer program products route communication based on an urgency priority associated with a sender of the communication. The method involves receiving incoming communication, identifying the sender, determining an urgency priority designation associated with the communication, and routing the incoming communication according to routing instructions associated with the urgency priority designation. Prior to receiving the incoming communication, the method may further involve receiving a selection to configure routing of communication based on one or more urgency priority designations, rendering urgency priority options and routing options that provide routing instructions, receiving routing instructions associated with each urgency priority designation, and receiving and recording the urgency priority designation associated with the sender.
    Type: Grant
    Filed: October 1, 2010
    Date of Patent: August 13, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Brian Daigle
  • Patent number: 8509396
    Abstract: A call routing system is created by receiving a set of initial target classes and a corresponding set of topic descriptions. Non-overlapping semantic tokens in the set of topic descriptions are identified. A set of clear target classes from the non-overlapping semantic tokens and the initial target classes is identified. Overlapping semantic tokens from the set of topic descriptions are identified. A set of vague target classes is identified from the overlapping semantic tokens and the initial target classes. A set of disambiguation dialogues and a set of grammar prompts are generated according to the overlapping and non-overlapping semantic tokens. The call routing system is then created based on the set of clear target classes, the set of vague target classes, and the set of disambiguation dialogues.
    Type: Grant
    Filed: September 24, 2009
    Date of Patent: August 13, 2013
    Assignee: International Business Machines Corporation
    Inventors: Ea-Ee Jan, Hong-Kwang Jeff Kuo, David M. Lubensky
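The clear/vague partition this abstract describes can be illustrated by counting how many topic descriptions each semantic token appears in. This is a minimal sketch of the idea, not the patented implementation; the whitespace tokenizer and the sample topics are illustrative assumptions.

```python
from collections import Counter

def partition_tokens(topic_descriptions):
    """Split tokens into non-overlapping (appear in exactly one topic
    description) and overlapping (appear in several)."""
    counts = Counter()
    for desc in topic_descriptions:
        for token in set(desc.lower().split()):
            counts[token] += 1
    clear = {t for t, c in counts.items() if c == 1}   # non-overlapping
    vague = {t for t, c in counts.items() if c > 1}    # overlapping
    return clear, vague

topics = ["billing payment invoice",
          "technical support payment",
          "account password reset"]
clear, vague = partition_tokens(topics)
# "payment" appears in two topics, so it would seed a vague class
# and a disambiguation dialogue; the rest route directly.
```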
  • Patent number: 8510110
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Grant
    Filed: July 11, 2012
    Date of Patent: August 13, 2013
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Patent number: 8510103
    Abstract: Systems and methods are operable to associate each of a plurality of stored audio patterns with at least one of a plurality of digital tokens, identify a user based on user identification input, access a plurality of stored audio patterns associated with a user based on the user identification input, receive from a user at least one audio input from a custom language made up of custom language elements wherein the elements include at least one monosyllabic representation of a number, letter or word, select one of the plurality of stored audio patterns associated with the identified user, in the case that the audio input received from the identified user corresponds with one of the plurality of stored audio patterns, determine the digital token associated with the selected one of the plurality of stored audio patterns, and generate the output signal for use in a device based on the determined digital token.
    Type: Grant
    Filed: October 15, 2010
    Date of Patent: August 13, 2013
    Inventor: Paul Angott
  • Publication number: 20130204620
    Abstract: Establishing a multimodal personality for a multimodal application, including evaluating, by the multimodal application, attributes of a user's interaction with the multimodal application; selecting, by the multimodal application, a vocal demeanor in dependence upon the values of the attributes of the user's interaction with the multimodal application; and incorporating, by the multimodal application, the vocal demeanor into the multimodal application.
    Type: Application
    Filed: January 23, 2013
    Publication date: August 8, 2013
    Applicant: Nuance Communications, Inc.
  • Publication number: 20130204607
    Abstract: A system implements voice detection using a receiver, a voice analyzer, and a voice identifier. The receiver receives a transmission from a transmission channel associated with a channel identification. The transmission includes a voice input. The voice analyzer analyzes the voice input and generates a plurality of voice metrics according to a plurality of analysis parameters. The voice identifier compares the voice metrics to one or more stored sets of voice metrics. Each set of voice metrics corresponds to a voice identification associated with the channel identification. The voice identifier also identifies a match between the voice metrics from the voice analyzer and at least one of the stored sets of voice metrics.
    Type: Application
    Filed: March 15, 2013
    Publication date: August 8, 2013
    Applicant: Forrest S. Baker III Trust
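The voice-identifier step above can be sketched as a nearest-neighbor lookup over stored metric vectors. A minimal sketch, assuming Euclidean distance and a fixed tolerance; the actual metrics and matching rule are not specified in the abstract.

```python
import math

def match_voice(metrics, stored, tolerance=5.0):
    """Return the voice identification whose stored metric vector is
    closest to the observed metrics, if within tolerance; else None."""
    best_id, best_dist = None, float("inf")
    for voice_id, ref in stored.items():
        dist = math.dist(metrics, ref)  # Euclidean distance
        if dist < best_dist:
            best_id, best_dist = voice_id, dist
    return best_id if best_dist <= tolerance else None

stored = {"agent-1": [120.0, 0.8], "agent-2": [210.0, 0.3]}
match_voice([118.5, 0.75], stored)  # matches "agent-1"
```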
  • Publication number: 20130204618
    Abstract: Automated delivery and filing of transcribed material prepared from dictated audio files into a central record-keeping system are presented. A user dictates information from any location, uploads that audio file to a transcriptionist to be transcribed, and the transcribed material is automatically delivered into a central record-keeping system, filed with the appropriate client or matter file, and the data is stored in the designated appropriate fields within those client or matter files. Also described is the recordation of meetings from multiple sources using mobile devices and the detection of the active or most prominent speaker at given intervals in the meeting. Further, text boxes on websites are completed using an audio recording application and offsite transcription.
    Type: Application
    Filed: January 22, 2013
    Publication date: August 8, 2013
    Applicant: SpeakWrite, LLC
  • Patent number: 8504365
    Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: August 6, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst Schroeter
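The variance test in this abstract can be illustrated as follows: repeated samples of the same phrase that are identical or nearly identical are rejected as likely replays or synthesized speech, since live voices never repeat exactly. The feature representation and threshold are illustrative assumptions, not the patented values.

```python
import statistics

def verify_samples(samples, min_variance=1e-4):
    """Reject verification when repeated samples of the same phrase show
    too little variation. Each sample is a list of feature values."""
    if any(samples[0] == s for s in samples[1:]):
        return False  # bit-identical samples: almost certainly replayed
    # per-dimension variance across samples, averaged
    dims = zip(*samples)
    avg_var = statistics.mean(statistics.pvariance(d) for d in dims)
    return avg_var >= min_variance

live = [[1.0, 2.1], [1.1, 2.0], [0.9, 2.2]]    # natural variation
replay = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]  # suspiciously identical
```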
  • Patent number: 8504369
    Abstract: A device, for use by a transcriptionist in a transcription editing system for editing transcriptions dictated by speakers, includes, in combination, a monitor configured to display visual text of transcribed dictations, an audio mechanism configured to cause playback of portions of an audio file associated with a dictation, and a cursor-control module coupled to the audio mechanism and to the monitor and configured to cause the monitor to display multiple cursors in the text.
    Type: Grant
    Filed: June 2, 2004
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Benjamin Chigier, Edward A. Brody, Daniel Edward Chernin, Roger S. Zimmerman
  • Patent number: 8504366
    Abstract: Method, system, and computer program product are provided for Joint Factor Analysis (JFA) scoring in speech processing systems. The method includes: carrying out an enrollment session offline to enroll a speaker model in a speech processing system using JFA, including: extracting speaker factors from the enrollment session; estimating first components of channel factors from the enrollment session. The method further includes: carrying out a test session including: calculating second components of channel factors strongly dependent on the test session; and generating a score based on speaker factors, channel factors, and test session Gaussian mixture model sufficient statistics to provide a log-likelihood ratio for a test session.
    Type: Grant
    Filed: November 16, 2011
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Hagai Aronowitz, Oren Barkan
  • Patent number: 8504364
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: William K. Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Thomas James Watson, Daniel Mark Schumacher
  • Publication number: 20130195285
    Abstract: A speech from a speaker proximate to one or more microphones within an environment can be received. Each microphone can be directional or omni-directional. The speech can be processed to produce an utterance to determine the identity of the speaker. The identity of the speaker can be associated with a voiceprint. The identity can be associated with a user's credentials of a computing system. The credentials can uniquely identify the user within the computing system. The utterance can be analyzed to establish a zone in which the speaker is present. The zone can be a bounded region within the environment. The zone can be mapped within the environment to determine a location of the speaker. The location can be a relative or an absolute location.
    Type: Application
    Filed: January 30, 2012
    Publication date: August 1, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: STEPHANIE DE LA FUENTE, GREGORY S. JONES, JOHN S. PANNELL
  • Publication number: 20130197912
    Abstract: A specific call detecting device includes: an utterance period detecting unit which detects at least a first utterance period in which the first speaker speaks in a call between a first speaker and a second speaker; an utterance ratio calculating unit which calculates utterance ratio of the first speaker in the call; a voice recognition execution determining unit which determines whether at least one of the first voice of the first speaker and second voice of the second speaker becomes a target of voice recognition or not on the basis of the utterance ratio of the first speaker; a voice recognizing unit which detects a keyword related to a specific call from the voice determined as a target of voice recognition among the first and second voices; and a determining unit which determines whether the call is the specific call or not on the basis of the detected keyword.
    Type: Application
    Filed: December 7, 2012
    Publication date: August 1, 2013
    Applicant: FUJITSU LIMITED
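The utterance-ratio gate described above can be sketched as: compute the first speaker's share of total talk time, then run recognition only on the dominant side when one speaker monopolizes the call (as in a scripted fraud pitch). The threshold and the (start, end) period format are assumptions.

```python
def utterance_ratio(first_periods, second_periods):
    """Fraction of total talk time occupied by the first speaker.
    Periods are (start, end) tuples in seconds."""
    first = sum(e - s for s, e in first_periods)
    second = sum(e - s for s, e in second_periods)
    total = first + second
    return first / total if total else 0.0

def recognition_targets(first_periods, second_periods, threshold=0.7):
    """Pick which side(s) become the target of voice recognition,
    based on the first speaker's utterance ratio."""
    ratio = utterance_ratio(first_periods, second_periods)
    if ratio >= threshold:
        return ["first"]
    if ratio <= 1 - threshold:
        return ["second"]
    return ["first", "second"]
```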
  • Publication number: 20130191127
    Abstract: A voice analyzer includes a plate-shaped body, a plurality of first voice acquisition units that are placed on both surfaces of the plate-shaped body and that acquire a voice of a speaker, a sound pressure comparison unit that compares sound pressure of a voice acquired by the first voice acquisition unit placed on one surface of the plate-shaped body with sound pressure of a voice acquired by the first voice acquisition unit placed on the other surface and determines a larger sound pressure, and a voice signal selection unit that selects information regarding a voice signal which is associated with the larger sound pressure and is determined by the sound pressure comparison unit.
    Type: Application
    Filed: July 20, 2012
    Publication date: July 25, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Kiyoshi IIDA, Haruo HARADA, Hirohito YONEYAMA, Kei SHIMOTANI, Akira FUJII, Yohei NISHINO
  • Patent number: 8494852
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
    Type: Grant
    Filed: October 27, 2010
    Date of Patent: July 23, 2013
    Assignee: Google Inc.
    Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti Kristjansson
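Presenting alternates from a word lattice, as described above, reduces at each position to ranking the hypotheses by score and offering everything except the displayed best word. A sketch under the assumption that the lattice is a map from position to (word, score) hypotheses; real lattices carry timing and arc structure omitted here.

```python
def alternates(lattice, position, top_n=3):
    """Return the highest-scoring alternate words at a lattice position,
    excluding the word currently displayed (the best hypothesis)."""
    hyps = sorted(lattice[position], key=lambda ws: ws[1], reverse=True)
    shown = hyps[0][0]  # best-scoring word is what the user sees
    return [w for w, _ in hyps[1:] if w != shown][:top_n]

lattice = {0: [("recognize", 0.90), ("wreck a nice", 0.05), ("recognise", 0.04)]}
# Tapping the displayed word "recognize" would offer the others to swap in.
```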
  • Patent number: 8494854
    Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance using optimized challenge items selected for their discrimination capability to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: July 23, 2013
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Publication number: 20130185072
    Abstract: A vehicle based system and method for receiving voice inputs and determining whether to perform a voice recognition analysis using in-vehicle resources or resources external to the vehicle.
    Type: Application
    Filed: June 24, 2011
    Publication date: July 18, 2013
    Applicant: Honda Motor Co., Ltd.
    Inventors: Ritchie Winson Huang, Pedram Vaghefinazari, Stuart Yamamoto
  • Patent number: 8489397
    Abstract: A machine-readable medium and a network device are provided for speech-to-text translation. Speech packets are received at a broadband telephony interface and stored in a buffer. The speech packets are processed and textual representations thereof are displayed as words on a display device. Speech processing is activated and deactivated in response to a command from a subscriber.
    Type: Grant
    Filed: September 11, 2012
    Date of Patent: July 16, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Charles David Caldwell, John Bruce Harlow, Robert J. Sayko, Norman Shaye
  • Patent number: 8489399
    Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: July 16, 2013
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Publication number: 20130179167
    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
    Type: Application
    Filed: February 27, 2013
    Publication date: July 11, 2013
    Applicant: Nuance Communications, Inc.
  • Publication number: 20130179168
    Abstract: Provided are an image display apparatus and a method of controlling the same. The image display apparatus enabling voice recognition includes: a first voice inputter which receives a user-side audio signal; an audio outputter which outputs an audio signal processed by the image display apparatus; a first voice recognizer which recognizes the user-side audio signal received through the first voice inputter; and a controller which decreases a volume of the audio signal output through the audio outputter to a predetermined level if a voice recognition start command is received.
    Type: Application
    Filed: January 9, 2013
    Publication date: July 11, 2013
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
  • Patent number: 8484022
    Abstract: A method and system for adaptive auto-encoders is disclosed. An input audio training signal may be transformed into a sequence of feature vectors, each bearing quantitative measures of acoustic properties of the input audio training signal. An auto-encoder may process the feature vectors to generate an encoded form of the quantitative measures, and a recovered form of the quantitative measures based on an inverse operation by the auto-encoder on the encoded form of the quantitative measures. A duplicate copy of the sequence of feature vectors may be normalized to form a normalized signal in which supra-phonetic acoustic properties are reduced in comparison with phonetic acoustic properties of the input audio training signal. The auto-encoder may then be trained to compensate for supra-phonetic features by reducing the magnitude of an error signal corresponding to a difference between the normalized signal and the recovered form of the quantitative measures.
    Type: Grant
    Filed: July 27, 2012
    Date of Patent: July 9, 2013
    Assignee: Google Inc.
    Inventor: Vincent Vanhoucke
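The normalization step this abstract relies on (reducing supra-phonetic properties in the target signal before training the auto-encoder) might be approximated by per-utterance mean/variance normalization of the feature vectors, which suppresses channel, loudness, and speaker-baseline effects while keeping frame-to-frame phonetic variation. That specific choice is an assumption, not the patent's stated method.

```python
import statistics

def normalize_features(frames):
    """Per-utterance mean/variance normalization: subtract each feature
    dimension's mean over the utterance and divide by its standard
    deviation (guarding against zero-variance dimensions)."""
    dims = list(zip(*frames))
    means = [statistics.mean(d) for d in dims]
    stds = [statistics.pstdev(d) or 1.0 for d in dims]
    return [[(x - m) / s for x, m, s in zip(frame, means, stds)]
            for frame in frames]
```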
  • Patent number: 8484035
    Abstract: A method of altering a social signaling characteristic of a speech signal. A statistically large number of speech samples created by different speakers in different tones of voice are evaluated to determine one or more relationships that exist between a selected social signaling characteristic and one or more measurable parameters of the speech samples. An input audio voice signal is then processed in accordance with these relationships to modify one or more of controllable parameters of input audio voice signal to produce a modified output audio voice signal in which said selected social signaling characteristic is modified. In a specific illustrative embodiment, a two-level hidden Markov model is used to identify voiced and unvoiced speech segments and selected controllable characteristics of these speech segments are modified to alter the desired social signaling characteristic.
    Type: Grant
    Filed: September 6, 2007
    Date of Patent: July 9, 2013
    Assignee: Massachusetts Institute of Technology
    Inventor: Alex Paul Pentland
  • Patent number: 8478598
    Abstract: An apparatus, system, and method to transcribe a voice chat session initiated from a text chat session. The system includes a chat server, a voice server, and a transcription engine. The chat server is configured to facilitate a text chat session between multiple instant messaging clients. The voice server is coupled to the chat server and configured to facilitate a transition from the text chat session to a voice chat session between the multiple instant messaging clients. The transcription engine is coupled to the voice server and configured to generate a voice transcription of the voice chat session. The voice transcription may be aggregated into a text chat history.
    Type: Grant
    Filed: August 17, 2007
    Date of Patent: July 2, 2013
    Assignee: International Business Machines Corporation
    Inventors: Erik J. Burckart, Steve R. Campbell, Andrew Ivory, Aaron K. Shook
  • Patent number: 8478590
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: July 2, 2013
    Assignee: Google Inc.
    Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti Kristjansson
  • Patent number: 8478597
    Abstract: The present disclosure presents a useful metric for assessing the relative difficulty which non-native speakers face in pronouncing a given utterance and a method and systems for using such a metric in the evaluation and assessment of the utterances of non-native speakers. In an embodiment, the metric may be based on both known sources of difficulty for language learners and a corpus-based measure of cross-language sound differences. The method may be applied to speakers who primarily speak a first language speaking utterances in any non-native second language.
    Type: Grant
    Filed: January 10, 2006
    Date of Patent: July 2, 2013
    Assignee: Educational Testing Service
    Inventors: Derrick Higgins, Klaus Zechner, Yoko Futagi, Rene Lawless
  • Patent number: 8478578
    Abstract: Interpretation from a first language to a second language via one or more communication devices is performed through a communication network (e.g. phone network or the internet) using a server for performing recognition and interpretation tasks, comprising the steps of: receiving an input speech utterance in a first language on a first mobile communication device; conditioning said input speech utterance; first transmitting said conditioned input speech utterance to a server; recognizing said first transmitted speech utterance to generate one or more recognition results; interpreting said recognition results to generate one or more interpretation results in an interlingua; mapping the interlingua to a second language in a first selected format; second transmitting said interpretation results in the first selected format to a second mobile communication device; and presenting said interpretation results in a second selected format on said second communication device.
    Type: Grant
    Filed: January 9, 2009
    Date of Patent: July 2, 2013
    Assignee: Fluential, LLC
    Inventors: Farzad Ehsani, Demitrios Master, Elaine Drom Zuber
  • Publication number: 20130166299
    Abstract: A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body and is used to hang the apparatus body from a neck of a user, a first voice acquisition unit provided in the strap or the apparatus body, a second voice acquisition unit provided at a position where a distance of a sound wave propagation path from a mouth of the user is smaller than a distance of a sound wave propagation path from the mouth of the user to the first voice acquisition unit, and an identification unit that identifies a sound, in which first sound pressure acquired by the first voice acquisition unit is larger by a predetermined value or more than second sound pressure acquired by the second voice acquisition unit, on the basis of a result of comparison between the first sound pressure and the second sound pressure.
    Type: Application
    Filed: May 18, 2012
    Publication date: June 27, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Kei SHIMOTANI, Yohei NISHINO, Hirohito YONEYAMA, Kiyoshi IIDA, Akira FUJII, Haruo HARADA
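The sound-pressure comparison used by this family of wearable voice analyzers can be sketched as a per-frame decibel-gap test between the two acquisition units: sound pressure falls off quickly with distance, so the wearer's own voice shows a large near/far gap while a distant speaker does not. Which gap direction marks the wearer, the 3 dB margin, and the frame format are illustrative assumptions.

```python
def classify_frames(near_db, far_db, margin=3.0):
    """Label each frame 'wearer' when the acquisition unit nearer the
    mouth reads at least `margin` dB louder than the farther one."""
    return ["wearer" if n - f >= margin else "other"
            for n, f in zip(near_db, far_db)]

classify_frames([72.0, 60.0], [64.0, 59.0])
# frame 1: 8 dB gap -> wearer's own voice; frame 2: 1 dB gap -> other speaker
```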
  • Publication number: 20130166300
    Abstract: An electronic device includes a voice recognition analyzing module, a manipulation identification module, and a manipulating module. The voice recognition analyzing module is configured to recognize and analyze a voice of a user. The manipulation identification module is configured to, using the analyzed voice, identify an object on a screen and identify a requested manipulation associated with the object. The manipulating module is configured to perform the requested manipulation.
    Type: Application
    Filed: September 12, 2012
    Publication date: June 27, 2013
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Sachie Yokoyama, Hideki Tsutsui
  • Publication number: 20130166279
    Abstract: An automatic speech recognition system for recognizing a user voice command in noisy environments, including: matching means for matching elements retrieved from speech units forming said command with templates in a template library; characterized by processing means including a MultiLayer Perceptron for computing posterior templates (P(O_template(q))) stored as said templates in said template library; and means for retrieving posterior vectors (P(O_test(q))) from said speech units, said posterior vectors being used as said elements. The present invention also relates to a method for recognizing a user voice command in noisy environments.
    Type: Application
    Filed: February 21, 2013
    Publication date: June 27, 2013
    Applicant: VEOVOX SA
  • Publication number: 20130166298
    Abstract: A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body to make the apparatus body hung from a neck of a wearer, a first voice acquisition unit that acquires a voice of a speaker and is disposed in either a left or right strap when viewed from the wearer, a second voice acquisition unit that acquires the voice of the speaker and is disposed in the opposite strap in which the first voice acquisition unit is disposed, and an arrangement recognition unit that recognizes arrangements of the first and second voice acquisition units, when viewed from the wearer, by comparing a voice signal of the voice acquired by the first voice acquisition unit with sound pressure of a heart sound of the wearer acquired by the second voice acquisition unit.
    Type: Application
    Filed: April 20, 2012
    Publication date: June 27, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Haruo HARADA, Hirohito YONEYAMA, Kei SHIMOTANI, Akira FUJII, Yohei NISHINO, Kiyoshi IIDA
  • Patent number: 8473294
    Abstract: Techniques for notifying at least one entity of an occurrence of an event in an audio signal are provided. At least one preference is obtained from the at least one entity. An occurrence of an event in the audio signal is determined. The event is related to at least one of at least one speaker and at least one topic. The at least one entity is notified of the occurrence of the event in the audio signal, in accordance with the at least one preference.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: June 25, 2013
    Assignee: International Business Machines Corporation
    Inventors: Hagai Aronowitz, Itzhack Goldberg, Ron Hoory
  • Patent number: 8468012
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
    Type: Grant
    Filed: May 26, 2010
    Date of Patent: June 18, 2013
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
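Adapting recognition to a device's location, at its simplest, can be sketched as choosing the acoustic model whose region is nearest the reported coordinates (so a Texan accent model serves a Dallas caller). The squared-degree distance, the model names, and the centroid table are assumptions for illustration, not the method claimed above.

```python
def pick_acoustic_model(lat, lon, models):
    """Choose the acoustic model whose region centroid is nearest to the
    device's reported location. `models` maps a model name to a
    (lat, lon) centroid; squared degrees suffice for a nearest pick."""
    def dist2(centroid):
        return (lat - centroid[0]) ** 2 + (lon - centroid[1]) ** 2
    return min(models, key=lambda name: dist2(models[name]))

models = {"us-south": (32.8, -96.8), "us-northeast": (40.7, -74.0)}
pick_acoustic_model(33.7, -84.4, models)  # Atlanta-area device -> "us-south"
```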
  • Patent number: 8467519
    Abstract: A system and method for processing calls in a call center are described. A call session from a caller via a session manager and including incoming text messages of a verbal speech stream is assigned. The incoming text messages are progressively visually presented throughout the call session to a live agent on an agent console operatively coupled to the session manager. The incoming text messages are progressively processed through a customer support scenario interactively monitored and controlled by the live agent via the agent console. The incoming text messages are processed through automated script execution in concert with the live agent. Outgoing text messages are converted into a synthesized speech stream. The synthesized speech stream is sent via the agent console to the caller.
    Type: Grant
    Filed: June 23, 2008
    Date of Patent: June 18, 2013
    Assignee: Intellisist, Inc.
    Inventors: Gilad Odinak, Alastair Sutherland, William A. Tolhurst
  • Patent number: 8463705
    Abstract: Embodiments of the invention broadly contemplate systems, methods, apparatuses and program products that leverage the mobile web, especially the spoken (telecom) web, to handle transactions. According to embodiments of the invention, in essence, a mobile device such as a phone is used as a terminal, remote authentication is employed, and a challenge response using a per transaction audio based code is used as a confirmation. Embodiments of the invention also provide further protection against repudiation, and greater trust in the transaction, by employing witnesses.
    Type: Grant
    Filed: February 28, 2010
    Date of Patent: June 11, 2013
    Assignee: International Business Machines Corporation
    Inventors: Anupam Joshi, Srinivas G. Narayana, Aaditeshwar Seth
  • Patent number: 8463608
    Abstract: A method and apparatus for updating a speech model on a multi-user speech recognition system with a personal speech model for a single user. A speech recognition system, for instance in a car, can include a generic speech model for comparison with the user speech input. A way of identifying a personal speech model, for instance in a mobile phone, is connected to the system. A mechanism is included for receiving personal speech model components, for instance a BLUETOOTH connection. The generic speech model is updated using the received personal speech model components. Speech recognition can then be performed on user speech using the updated generic speech model.
    Type: Grant
    Filed: March 12, 2012
    Date of Patent: June 11, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Barry Neil Dow, Eric William Janke, Daniel Lee Yuk Cheung, Benjamin Terrick Staniford
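The update step, in which received personal speech model components refine the generic model, could look like the sketch below: linear interpolation where both models define a unit, and plain adoption of units only the personal model knows. The interpolation weight and the flat per-unit representation are assumptions; the patent does not fix a combination rule.

```python
def update_generic_model(generic, personal, weight=0.3):
    """Blend per-unit parameters from a personal speech model into a
    generic one. `weight` is the trust placed in the personal components
    (an illustrative tuning constant)."""
    updated = dict(generic)
    for unit, p_val in personal.items():
        if unit in updated:
            # Unit known to both models: interpolate.
            updated[unit] = (1 - weight) * updated[unit] + weight * p_val
        else:
            # Unit only the personal model knows: adopt it outright.
            updated[unit] = p_val
    return updated
```

Recognition would then run against `updated` rather than the original generic model.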
  • Patent number: 8462969
    Abstract: Own voice recognition (OVR) for hearing aids detects time instances where the person wearing the device is speaking. Classification of the own voice is performed dependent on a fixed or adaptive detection threshold. Automatic tuning in a real-time system depends on general noise statistics in the input signals. The noise is removed from the received signal and is characterized by signal-to-noise ratio and noise color. An optimal detection threshold for own voice recognition is determined based on the noise characteristics. A noise detection model is created by smoothed Voronoi tessellation. Own voice detection is performed by a processor.
    Type: Grant
    Filed: April 19, 2011
    Date of Patent: June 11, 2013
    Assignee: Siemens Audiologische Technik GmbH
    Inventors: Heiko Claussen, Michael T. Loiacono, Henning Puder, Justinian Rosca
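The adaptive-threshold idea, where the detection threshold is derived from noise statistics in the input, can be sketched as follows. The quantile-based noise-floor estimate and the 6 dB margin are illustrative assumptions; the patent derives its threshold from richer noise characteristics (SNR and noise color).

```python
def estimate_noise_floor(frame_energies, quantile=0.1):
    """Crude noise-floor estimate: a low quantile of the frame energies."""
    ordered = sorted(frame_energies)
    idx = max(0, int(quantile * len(ordered)) - 1)
    return ordered[idx]

def adaptive_threshold(frame_energies, margin_db=6.0):
    """Detection threshold a fixed margin above the estimated noise floor.
    `margin_db` is an illustrative tuning constant, not from the patent."""
    floor = estimate_noise_floor(frame_energies)
    return floor * (10 ** (margin_db / 10.0))

def own_voice_frames(frame_energies):
    """Flag frames as own-voice when their energy exceeds the threshold."""
    thr = adaptive_threshold(frame_energies)
    return [e > thr for e in frame_energies]
```

Because the threshold tracks the noise floor, the same code adapts automatically to quiet and loud environments.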
  • Publication number: 20130144620
    Abstract: Various embodiments of the present invention for validating the authenticity of a website are provided. An example of a method according to the present invention comprises providing a website having an artifact, receiving a communication from a user, at a service provider, for validating the website associated with the service provider, inquiring from the user a description of the artifact, comparing the artifact on the website with the description given by the user, and generating an indication to the user based upon the comparison. The communication is over a first communication channel and the website is accessed over a second communication channel. The first communication channel is different from the second. The artifact can be displayed after a user session is identified.
    Type: Application
    Filed: December 6, 2011
    Publication date: June 6, 2013
    Applicant: TELCORDIA TECHNOLOGIES, INC.
    Inventors: Richard J. Lipton, Shoshana K. Loeb, Thimios Panagos
  • Publication number: 20130144619
    Abstract: Techniques for ability enhancement are described. Some embodiments provide an ability enhancement facilitator system (“AEFS”) configured to enhance voice conferencing among multiple speakers. In one embodiment, the AEFS receives data that represents utterances of multiple speakers who are engaging in a voice conference with one another. The AEFS then determines speaker-related information, such as by identifying a current speaker, locating an information item (e.g., an email message, document) associated with the speaker, or the like. The AEFS then informs a user of the speaker-related information, such as by presenting the speaker-related information on a display of a conferencing device associated with the user.
    Type: Application
    Filed: January 23, 2012
    Publication date: June 6, 2013
    Inventors: Richard T. Lord, Robert W. Lord, Nathan P. Myhrvold, Clarence T. Tegreene, Roderick A. Hyde, Lowell L. Wood, JR., Muriel Y. Ishikawa, Victoria Y.H. Wood, Charles Whitmer, Paramvir Bahl, Douglas C. Burger, Ranveer Chandra, William H. Gates, III, Paul Holman, Jordin T. Kare, Craig J. Mundie, Tim Paek, Desney S. Tan, Lin Zhong, Matthew G. Dyor
  • Publication number: 20130144621
    Abstract: Computer-implemented systems and methods are provided for assessing non-native spontaneous speech pronunciation. Speech recognition on digitized speech is performed using a non-native acoustic model trained with non-native speech to generate word hypotheses for the digitized speech. Time alignment is performed between the digitized speech and the word hypotheses using a reference acoustic model trained with native-quality speech. Statistics are calculated regarding individual words and phonemes in the word hypotheses based on the alignment. A plurality of features for use in assessing pronunciation of the speech are calculated based on the statistics, an assessment score is calculated based on one or more of the calculated features, and the assessment score is stored in a computer-readable memory.
    Type: Application
    Filed: January 31, 2013
    Publication date: June 6, 2013
    Applicant: EDUCATIONAL TESTING SERVICE
    Inventor: Educational Testing Service
  • Patent number: 8457963
    Abstract: Audio input to a user device is captured in a buffer and played back to the user while being sent to and recognized by an automatic speech recognition (ASR) system. Overlapping the playback with the speech recognition processing masks a portion of the true latency of the ASR system thus improving the user's perception of the ASR system's responsiveness. Further, upon hearing the playback, the user is intuitively guided to self-correct for any defects in the captured audio.
    Type: Grant
    Filed: March 30, 2010
    Date of Patent: June 4, 2013
    Assignee: Promptu Systems Corporation
    Inventor: Laurent Charriere
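The latency-masking idea, running playback of the captured buffer concurrently with the ASR round trip so the user hears something while recognition is in flight, reduces to two concurrent tasks. The sketch below stands in for real audio I/O and a real ASR call with timed sleeps; everything here is an illustrative stand-in, not the patented implementation.

```python
import threading
import time

def play_back(audio_chunks):
    """Stand-in for audio playback: a timed sleep per buffered chunk."""
    for _ in audio_chunks:
        time.sleep(0.01)

def recognize(audio_chunks, result):
    """Stand-in for the ASR round trip: a delay, then a transcript."""
    time.sleep(0.05)
    result.append("hello world")

def capture_and_recognize(audio_chunks):
    """Start playback and recognition concurrently so playback masks a
    portion of the ASR latency perceived by the user."""
    result = []
    t_play = threading.Thread(target=play_back, args=(audio_chunks,))
    t_asr = threading.Thread(target=recognize, args=(audio_chunks, result))
    t_play.start()
    t_asr.start()
    t_play.join()
    t_asr.join()
    return result[0]
```

The perceived wait is roughly `max(playback, asr)` instead of `playback + asr`, which is the responsiveness gain the abstract describes.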
  • Patent number: 8457964
    Abstract: A method and system for determining and communicating biometrics of a recorded speaker in a voice transcription process. An interactive voice response system receives a request from a user for a transcription of a voice file. A profile associated with the requesting user is obtained, wherein the profile comprises biometric parameters and preferences defined by the user. The requested voice file is analyzed for biometric elements according to the parameters specified in the user's profile. Responsive to detecting biometric elements in the voice file that conform to the parameters specified in the user's profile, a transcription output of the voice file is modified according to the preferences specified in the user's profile for the detected biometric elements to form a modified transcription output file. The modified transcription output file may then be provided to the requesting user.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: June 4, 2013
    Assignee: International Business Machines Corporation
    Inventor: Peeyush Jaiswal
  • Patent number: 8457974
    Abstract: Methods and system for authenticating a user are disclosed. The present invention includes accessing a collection of personal information related to the user. The present invention also includes performing an authentication operation that is based on the collection of personal information. The authentication operation incorporates at least one dynamic component and prompts the user to give an audible utterance. The audible utterance is compared to a stored voiceprint.
    Type: Grant
    Filed: July 26, 2012
    Date of Patent: June 4, 2013
    Assignee: Microsoft Corporation
    Inventor: Kuansan Wang
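The combination of a dynamic challenge drawn from personal information with a voiceprint comparison can be sketched as below. The prompt wording, the cosine-similarity voiceprint match, and the 0.9 threshold are all illustrative assumptions; the patent specifies neither the feature representation nor the matching rule.

```python
import math
import random

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def authenticate(personal_info, stored_voiceprint, capture_utterance,
                 threshold=0.9, rng=random):
    """Dynamic component: a randomly chosen personal fact drives the
    prompt, so a replayed recording of a previous session will not match.
    `capture_utterance` returns (recognized_text, voice_features)."""
    field, expected = rng.choice(sorted(personal_info.items()))
    prompt = f"Please say your {field}"
    spoken_text, voice_features = capture_utterance(prompt)
    return (spoken_text == expected
            and cosine_similarity(voice_features, stored_voiceprint) >= threshold)
```

Both checks must pass: the content of the utterance must match the randomly selected fact, and the voice itself must match the stored voiceprint.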
  • Patent number: 8457946
    Abstract: Architecture for correcting incorrect recognition results in an Asian language speech recognition system. A spelling mode can be launched in response to receiving speech input, the spelling mode for correcting incorrect spelling of the recognition results or generating new words. Correction can be obtained using speech and/or manual selection and entry. The architecture facilitates correction in a single pass, rather than multiples times as in conventional systems. Words corrected using the spelling mode are corrected as a unit and treated as a word. The spelling mode applies to languages of at least the Asian continent, such as Simplified Chinese, Traditional Chinese, and/or other Asian languages such as Japanese.
    Type: Grant
    Filed: April 26, 2007
    Date of Patent: June 4, 2013
    Assignee: Microsoft Corporation
    Inventors: Shiun-Zu Kuo, Kevin E. Feige, Yifan Gong, Taro Miwa, Arun Chitrapu
  • Patent number: 8457973
    Abstract: A method and a processing device for managing an interactive speech recognition system are provided. The system determines whether a voice input relates, at least partially, to the expected input of any one of a group of menus different from the current menu. If the voice input relates to the expected input of one of those menus, the system skips to that menu. The group of menus different from the current menu includes menus at multiple hierarchical levels.
    Type: Grant
    Filed: March 4, 2006
    Date of Patent: June 4, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Harry Edward Blanchard
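The cross-menu skip can be sketched as matching the utterance against the expected inputs of every menu, not just the current one, and jumping to whichever menu matches. The menu names and phrases below are hypothetical; substring matching stands in for whatever grammar-based matching a real IVR would use.

```python
# Hypothetical hierarchical IVR menus and their expected inputs.
MENUS = {
    "main":         {"billing", "support", "account"},
    "main/billing": {"pay bill", "balance", "autopay"},
    "main/support": {"reset password", "outage"},
}

def route_utterance(current_menu, utterance):
    """Return the menu whose expected inputs the utterance (at least
    partially) matches, searching all menus so the caller can skip
    across hierarchy levels; stay put when nothing else matches."""
    text = utterance.lower()
    for menu, expected in MENUS.items():
        if menu == current_menu:
            continue
        if any(phrase in text for phrase in expected):
            return menu
    return current_menu
```

So a caller sitting in the top-level menu who says "I want to pay my bill, pay bill" lands directly in the billing submenu without traversing the intermediate level.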
  • Patent number: 8452593
    Abstract: A projection apparatus with speech indication and a control method thereof are provided. The projection apparatus comprises a storage unit, a transmission interface, a process unit, and an output unit. The storage unit is configured to store a plurality of speech data. The transmission interface is configured to connect to an external apparatus for accessing the storage unit. The process unit is configured to select at least one of the speech data according to the present state of the projection apparatus. The output unit is configured to output the selected speech datum to broadcast the speech indication.
    Type: Grant
    Filed: August 31, 2007
    Date of Patent: May 28, 2013
    Assignee: Delta Electronics, Inc.
    Inventors: Yi-Hsiang Huang, Yuan Ming Hsu, Jimmy Su
  • Patent number: 8447614
    Abstract: System and process for audio authentication of an individual or speaker, including a processor for decomposing an audio signal received at the sensor into vectors representative of the speaker to be authenticated; for transforming the super-vector V of the speaker, resulting from the concatenation of the vectors associated with the said speaker, into binary data 1001100 . . . 0 by taking as an input the mean super-vector M and comparing the super-vector V of the speaker with the mean super-vector M; the said binary data thus obtained being transmitted to a module for extracting the speaker authentication, taking as an input the public keys Kpub(l), in order to authenticate the speaker and/or to generate a cryptographic key associated with the speaker.
    Type: Grant
    Filed: December 22, 2009
    Date of Patent: May 21, 2013
    Assignee: Thales
    Inventors: François Capman, Sandra Marcello, Jean Martinelli
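The binarization step, turning the speaker super-vector V into binary data by comparison with the mean super-vector M, can be sketched as a component-wise comparison. The specific rule (1 where the speaker component exceeds the mean, 0 otherwise) is an assumption; the abstract only states that V is compared with M.

```python
def binarize_supervector(v, m):
    """Map the speaker super-vector V to a binary string by comparing
    each component against the mean super-vector M: '1' where the
    speaker lies above the mean, '0' otherwise."""
    assert len(v) == len(m), "super-vectors must have equal dimension"
    return "".join("1" if vi > mi else "0" for vi, mi in zip(v, m))
```

The resulting bit string is what would then feed the key-extraction module described in the abstract.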
  • Patent number: RE44248
    Abstract: This invention combines methodologies that enhance voice recognition dictation. It describes features for moving speaker voice files, eliminating redundant training of speech recognition dictation applications. It defines how to create synthetic voice models, reducing speaker dependency. It combines accuracy and performance into a single measure called RAP Rate. Moreover, the invention describes enhancing voice recognition applications and systems by measuring/adjusting hardware and software features for optimal voice recognition dictation, incorporating methodical processes based on RAP Rate. Using these approaches and tools, the invention includes a method for constructing a handheld transcriber that immediately translates audio speech into text with real-time display. The invention describes a method for applying RAP Rate and synthetic voice models to applications like voice mail to text. With the ability to move and translate voice models, the invention describes new services that could be provided for a fee.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: May 28, 2013
    Inventor: Darrell A. Poirier
  • Patent number: RE44326
    Abstract: A method and system of speech recognition presented by a back channel from multiple user sites within a network supporting cable television and/or video delivery is disclosed.
    Type: Grant
    Filed: November 3, 2011
    Date of Patent: June 25, 2013
    Assignee: Promptu Systems Corporation
    Inventors: Theodore Calderone, Paul M. Cook, Mark J. Foster