Voice Recognition Patents (Class 704/246)
  • Patent number: 8510116
    Abstract: Methods and systems are disclosed for capturing consumer voice signatures. The methods and systems synchronize voice signatures with display of the terms and conditions of a transaction to the consumer in real time during a phone conversation. In one implementation, mobile devices with text display capability may be used to display the terms/conditions of the transaction while the consumer is talking to a customer service representative or an interactive voice response system. The terms/conditions may be displayed as a scrollable document on the mobile device to which the consumer may then agree during the phone conversation. The consumer may then “voice sign” by reading the displayed terms/conditions, or some portion thereof, during the phone conversation to manifest his/her knowing consent. Such an arrangement helps promote the use of voice signatures by consumers in a manner that complies with the requirements of the federal E-Sign Act.
    Type: Grant
    Filed: October 26, 2007
    Date of Patent: August 13, 2013
    Assignee: United Services Automobile Association (USAA)
    Inventors: Curt Wayne Moy, Sakina Hassonjee, Amy Irene Forsythe, Linda Giessel King, Sarah Brooke Severson
  • Patent number: 8509832
    Abstract: Methods, systems, devices, and computer program products route communication based on an urgency priority associated with a sender of the communication. The method involves receiving incoming communication, identifying the sender, determining an urgency priority designation associated with the communication, and routing the incoming communication according to routing instructions associated with the urgency priority designation. Prior to receiving the incoming communication, the method may further involve receiving a selection to configure routing of communication based on one or more urgency priority designations, rendering urgency priority options and routing options that provide routing instructions, receiving routing instructions associated with each urgency priority designation, and receiving and recording the urgency priority designation associated with the sender.
    Type: Grant
    Filed: October 1, 2010
    Date of Patent: August 13, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Brian Daigle
  • Patent number: 8509396
    Abstract: A call routing system is created by receiving a set of initial target classes and a corresponding set of topic descriptions. Non-overlapping semantic tokens in the set of topic descriptions are identified. A set of clear target classes from the non-overlapping semantic tokens and the initial target classes is identified. Overlapping semantic tokens from the set of topic descriptions are identified. A set of vague target classes is identified from the overlapping semantic tokens and the initial target classes. A set of disambiguation dialogues and a set of grammar prompts are generated according to the overlapping and non-overlapping semantic tokens. The call routing system is then created based on the set of clear target classes, the set of vague target classes, and the set of disambiguation dialogues.
    Type: Grant
    Filed: September 24, 2009
    Date of Patent: August 13, 2013
    Assignee: International Business Machines Corporation
    Inventors: Ea-Ee Jan, Hong-Kwang Jeff Kuo, David M. Lubensky
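The clear/vague partition this abstract describes can be illustrated by counting how many topic descriptions each semantic token appears in. This is a minimal sketch of the idea, not the patented implementation; the whitespace tokenizer and the sample topics are illustrative assumptions.

```python
from collections import Counter

def partition_tokens(topic_descriptions):
    """Split tokens into non-overlapping (appear in exactly one topic
    description) and overlapping (appear in several)."""
    counts = Counter()
    for desc in topic_descriptions:
        for token in set(desc.lower().split()):
            counts[token] += 1
    clear = {t for t, c in counts.items() if c == 1}   # non-overlapping
    vague = {t for t, c in counts.items() if c > 1}    # overlapping
    return clear, vague

topics = ["billing payment invoice",
          "technical support payment",
          "account password reset"]
clear, vague = partition_tokens(topics)
# "payment" appears in two topics, so it would seed a vague class
# and a disambiguation dialogue; the rest route directly.
```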
  • Patent number: 8510110
    Abstract: Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers.
    Type: Grant
    Filed: July 11, 2012
    Date of Patent: August 13, 2013
    Assignee: Microsoft Corporation
    Inventors: Cha Zhang, Paul A. Viola, Pei Yin, Ross G. Cutler, Xinding Sun, Yong Rui
  • Patent number: 8510103
    Abstract: Systems and methods are operable to associate each of a plurality of stored audio patterns with at least one of a plurality of digital tokens, identify a user based on user identification input, access a plurality of stored audio patterns associated with a user based on the user identification input, receive from a user at least one audio input from a custom language made up of custom language elements wherein the elements include at least one monosyllabic representation of a number, letter or word, select one of the plurality of stored audio patterns associated with the identified user, in the case that the audio input received from the identified user corresponds with one of the plurality of stored audio patterns, determine the digital token associated with the selected one of the plurality of stored audio patterns, and generate the output signal for use in a device based on the determined digital token.
    Type: Grant
    Filed: October 15, 2010
    Date of Patent: August 13, 2013
    Inventor: Paul Angott
  • Publication number: 20130204620
    Abstract: Establishing a multimodal personality for a multimodal application, including evaluating, by the multimodal application, attributes of a user's interaction with the multimodal application; selecting, by the multimodal application, a vocal demeanor in dependence upon the values of the attributes of the user's interaction with the multimodal application; and incorporating, by the multimodal application, the vocal demeanor into the multimodal application.
    Type: Application
    Filed: January 23, 2013
    Publication date: August 8, 2013
    Applicant: Nuance Communications, Inc.
  • Publication number: 20130204607
    Abstract: A system implements voice detection using a receiver, a voice analyzer, and a voice identifier. The receiver receives a transmission from a transmission channel associated with a channel identification. The transmission includes a voice input. The voice analyzer analyzes the voice input and generates a plurality of voice metrics according to a plurality of analysis parameters. The voice identifier compares the voice metrics to one or more stored sets of voice metrics. Each set of voice metrics corresponds to a voice identification associated with the channel identification. The voice identifier also identifies a match between the voice metrics from the voice analyzer and at least one of the stored sets of voice metrics.
    Type: Application
    Filed: March 15, 2013
    Publication date: August 8, 2013
    Applicant: Forrest S. Baker III Trust
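The voice-identifier step above can be sketched as a nearest-neighbor lookup over stored metric vectors. A minimal sketch, assuming Euclidean distance and a fixed tolerance; the actual metrics and matching rule are not specified in the abstract.

```python
import math

def match_voice(metrics, stored, tolerance=5.0):
    """Return the voice identification whose stored metric vector is
    closest to the observed metrics, if within tolerance; else None."""
    best_id, best_dist = None, float("inf")
    for voice_id, ref in stored.items():
        dist = math.dist(metrics, ref)  # Euclidean distance
        if dist < best_dist:
            best_id, best_dist = voice_id, dist
    return best_id if best_dist <= tolerance else None

stored = {"agent-1": [120.0, 0.8], "agent-2": [210.0, 0.3]}
match_voice([118.5, 0.75], stored)  # matches "agent-1"
```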
  • Publication number: 20130204618
    Abstract: Automated delivery and filing of transcribed material prepared from dictated audio files into a central record-keeping system are presented. A user dictates information from any location, uploads that audio file to a transcriptionist to be transcribed, and the transcribed material is automatically delivered into a central record-keeping system, filed with the appropriate client or matter file, and the data is stored in the designated appropriate fields within those client or matter files. Also described is the recordation of meetings from multiple sources using mobile devices and the detection of the active or most prominent speaker at given intervals in the meeting. Further, text boxes on websites are completed using an audio recording application and offsite transcription.
    Type: Application
    Filed: January 22, 2013
    Publication date: August 8, 2013
    Applicant: SpeakWrite, LLC
  • Patent number: 8504365
    Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: August 6, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Horst Schroeter
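The variance test in this abstract can be illustrated as follows: repeated samples of the same phrase that are identical or nearly identical are rejected as likely replays or synthesized speech, since live voices never repeat exactly. The feature representation and threshold are illustrative assumptions, not the patented values.

```python
import statistics

def verify_samples(samples, min_variance=1e-4):
    """Reject verification when repeated samples of the same phrase show
    too little variation. Each sample is a list of feature values."""
    if any(samples[0] == s for s in samples[1:]):
        return False  # bit-identical samples: almost certainly replayed
    # per-dimension variance across samples, averaged
    dims = zip(*samples)
    avg_var = statistics.mean(statistics.pvariance(d) for d in dims)
    return avg_var >= min_variance

live = [[1.0, 2.1], [1.1, 2.0], [0.9, 2.2]]    # natural variation
replay = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]  # suspiciously identical
```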
  • Patent number: 8504369
    Abstract: A device, for use by a transcriptionist in a transcription editing system for editing transcriptions dictated by speakers, includes, in combination, a monitor configured to display visual text of transcribed dictations, an audio mechanism configured to cause playback of portions of an audio file associated with a dictation, and a cursor-control module coupled to the audio mechanism and to the monitor and configured to cause the monitor to display multiple cursors in the text.
    Type: Grant
    Filed: June 2, 2004
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Benjamin Chigier, Edward A. Brody, Daniel Edward Chernin, Roger S. Zimmerman
  • Patent number: 8504366
    Abstract: Method, system, and computer program product are provided for Joint Factor Analysis (JFA) scoring in speech processing systems. The method includes: carrying out an enrollment session offline to enroll a speaker model in a speech processing system using JFA, including: extracting speaker factors from the enrollment session; estimating first components of channel factors from the enrollment session. The method further includes: carrying out a test session including: calculating second components of channel factors strongly dependent on the test session; and generating a score based on speaker factors, channel factors, and test session Gaussian mixture model sufficient statistics to provide a log-likelihood ratio for a test session.
    Type: Grant
    Filed: November 16, 2011
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Hagai Aronowitz, Oren Barkan
  • Patent number: 8504364
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: August 6, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: William K. Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Thomas James Watson, Daniel Mark Schumacher
  • Publication number: 20130195285
    Abstract: A speech from a speaker proximate to one or more microphones within an environment can be received. Each microphone can be directional or omni-directional. The speech can be processed to produce an utterance to determine the identity of the speaker. The identity of the speaker can be associated with a voiceprint. The identity can be associated with a user's credentials of a computing system. The credentials can uniquely identify the user within the computing system. The utterance can be analyzed to establish a zone in which the speaker is present. The zone can be a bounded region within the environment. The zone can be mapped within the environment to determine a location of the speaker. The location can be a relative or an absolute location.
    Type: Application
    Filed: January 30, 2012
    Publication date: August 1, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: STEPHANIE DE LA FUENTE, GREGORY S. JONES, JOHN S. PANNELL
  • Publication number: 20130197912
    Abstract: A specific call detecting device includes: an utterance period detecting unit which detects at least a first utterance period in which the first speaker speaks in a call between a first speaker and a second speaker; an utterance ratio calculating unit which calculates utterance ratio of the first speaker in the call; a voice recognition execution determining unit which determines whether at least one of the first voice of the first speaker and second voice of the second speaker becomes a target of voice recognition or not on the basis of the utterance ratio of the first speaker; a voice recognizing unit which detects a keyword related to a specific call from the voice determined as a target of voice recognition among the first and second voices; and a determining unit which determines whether the call is the specific call or not on the basis of the detected keyword.
    Type: Application
    Filed: December 7, 2012
    Publication date: August 1, 2013
    Applicant: FUJITSU LIMITED
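The utterance-ratio gate described above can be sketched as: compute the first speaker's share of total talk time, then run recognition only on the dominant side when one speaker monopolizes the call (as in a scripted fraud pitch). The threshold and the (start, end) period format are assumptions.

```python
def utterance_ratio(first_periods, second_periods):
    """Fraction of total talk time occupied by the first speaker.
    Periods are (start, end) tuples in seconds."""
    first = sum(e - s for s, e in first_periods)
    second = sum(e - s for s, e in second_periods)
    total = first + second
    return first / total if total else 0.0

def recognition_targets(first_periods, second_periods, threshold=0.7):
    """Pick which side(s) become the target of voice recognition,
    based on the first speaker's utterance ratio."""
    ratio = utterance_ratio(first_periods, second_periods)
    if ratio >= threshold:
        return ["first"]
    if ratio <= 1 - threshold:
        return ["second"]
    return ["first", "second"]
```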
  • Publication number: 20130191127
    Abstract: A voice analyzer includes a plate-shaped body, a plurality of first voice acquisition units that are placed on both surfaces of the plate-shaped body and that acquire a voice of a speaker, a sound pressure comparison unit that compares sound pressure of a voice acquired by the first voice acquisition unit placed on one surface of the plate-shaped body with sound pressure of a voice acquired by the first voice acquisition unit placed on the other surface and determines a larger sound pressure, and a voice signal selection unit that selects information regarding a voice signal which is associated with the larger sound pressure and is determined by the sound pressure comparison unit.
    Type: Application
    Filed: July 20, 2012
    Publication date: July 25, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Kiyoshi IIDA, Haruo HARADA, Hirohito YONEYAMA, Kei SHIMOTANI, Akira FUJII, Yohei NISHINO
  • Patent number: 8494852
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
    Type: Grant
    Filed: October 27, 2010
    Date of Patent: July 23, 2013
    Assignee: Google Inc.
    Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti Kristjansson
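Presenting alternates from a word lattice, as described above, reduces at each position to ranking the hypotheses by score and offering everything except the displayed best word. A sketch under the assumption that the lattice is a map from position to (word, score) hypotheses; real lattices carry timing and arc structure omitted here.

```python
def alternates(lattice, position, top_n=3):
    """Return the highest-scoring alternate words at a lattice position,
    excluding the word currently displayed (the best hypothesis)."""
    hyps = sorted(lattice[position], key=lambda ws: ws[1], reverse=True)
    shown = hyps[0][0]  # best-scoring word is what the user sees
    return [w for w, _ in hyps[1:] if w != shown][:top_n]

lattice = {0: [("recognize", 0.90), ("wreck a nice", 0.05), ("recognise", 0.04)]}
# Tapping the displayed word "recognize" would offer the others to swap in.
```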
  • Patent number: 8494854
    Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance using optimized challenge items selected for their discrimination capability to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: July 23, 2013
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Publication number: 20130185072
    Abstract: A vehicle based system and method for receiving voice inputs and determining whether to perform a voice recognition analysis using in-vehicle resources or resources external to the vehicle.
    Type: Application
    Filed: June 24, 2011
    Publication date: July 18, 2013
    Applicant: Honda Motor Co., Ltd.
    Inventors: Ritchie Winson Huang, Pedram Vaghefinazari, Stuart Yamamoto
  • Patent number: 8489397
    Abstract: A machine-readable medium and a network device are provided for speech-to-text translation. Speech packets are received at a broadband telephony interface and stored in a buffer. The speech packets are processed and textual representations thereof are displayed as words on a display device. Speech processing is activated and deactivated in response to a command from a subscriber.
    Type: Grant
    Filed: September 11, 2012
    Date of Patent: July 16, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Charles David Caldwell, John Bruce Harlow, Robert J. Sayko, Norman Shaye
  • Patent number: 8489399
    Abstract: An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: July 16, 2013
    Assignee: John Nicholas and Kristin Gross Trust
    Inventor: John Nicholas Gross
  • Publication number: 20130179167
    Abstract: In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.
    Type: Application
    Filed: February 27, 2013
    Publication date: July 11, 2013
    Applicant: Nuance Communications, Inc.
  • Publication number: 20130179168
    Abstract: Provided are an image display apparatus and a method of controlling the same. The image display apparatus enabling voice recognition includes: a first voice inputter which receives a user-side audio signal; an audio outputter which outputs an audio signal processed by the image display apparatus; a first voice recognizer which recognizes the user-side audio signal received through the first voice inputter; and a controller which decreases a volume of the audio signal output through the audio outputter to a predetermined level if a voice recognition start command is received.
    Type: Application
    Filed: January 9, 2013
    Publication date: July 11, 2013
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
  • Patent number: 8484022
    Abstract: A method and system for adaptive auto-encoders is disclosed. An input audio training signal may be transformed into a sequence of feature vectors, each bearing quantitative measures of acoustic properties of the input audio training signal. An auto-encoder may process the feature vectors to generate an encoded form of the quantitative measures, and a recovered form of the quantitative measures based on an inverse operation by the auto-encoder on the encoded form of the quantitative measures. A duplicate copy of the sequence of feature vectors may be normalized to form a normalized signal in which supra-phonetic acoustic properties are reduced in comparison with phonetic acoustic properties of the input audio training signal. The auto-encoder may then be trained to compensate for supra-phonetic features by reducing the magnitude of an error signal corresponding to a difference between the normalized signal and the recovered form of the quantitative measures.
    Type: Grant
    Filed: July 27, 2012
    Date of Patent: July 9, 2013
    Assignee: Google Inc.
    Inventor: Vincent Vanhoucke
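The normalization step this abstract relies on (reducing supra-phonetic properties in the target signal before training the auto-encoder) might be approximated by per-utterance mean/variance normalization of the feature vectors, which suppresses channel, loudness, and speaker-baseline effects while keeping frame-to-frame phonetic variation. That specific choice is an assumption, not the patent's stated method.

```python
import statistics

def normalize_features(frames):
    """Per-utterance mean/variance normalization: subtract each feature
    dimension's mean over the utterance and divide by its standard
    deviation (guarding against zero-variance dimensions)."""
    dims = list(zip(*frames))
    means = [statistics.mean(d) for d in dims]
    stds = [statistics.pstdev(d) or 1.0 for d in dims]
    return [[(x - m) / s for x, m, s in zip(frame, means, stds)]
            for frame in frames]
```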
  • Patent number: 8484035
    Abstract: A method of altering a social signaling characteristic of a speech signal. A statistically large number of speech samples created by different speakers in different tones of voice are evaluated to determine one or more relationships that exist between a selected social signaling characteristic and one or more measurable parameters of the speech samples. An input audio voice signal is then processed in accordance with these relationships to modify one or more of controllable parameters of input audio voice signal to produce a modified output audio voice signal in which said selected social signaling characteristic is modified. In a specific illustrative embodiment, a two-level hidden Markov model is used to identify voiced and unvoiced speech segments and selected controllable characteristics of these speech segments are modified to alter the desired social signaling characteristic.
    Type: Grant
    Filed: September 6, 2007
    Date of Patent: July 9, 2013
    Assignee: Massachusetts Institute of Technology
    Inventor: Alex Paul Pentland
  • Patent number: 8478598
    Abstract: An apparatus, system, and method to transcribe a voice chat session initiated from a text chat session. The system includes a chat server, a voice server, and a transcription engine. The chat server is configured to facilitate a text chat session between multiple instant messaging clients. The voice server is coupled to the chat server and configured to facilitate a transition from the text chat session to a voice chat session between the multiple instant messaging clients. The transcription engine is coupled to the voice server and configured to generate a voice transcription of the voice chat session. The voice transcription may be aggregated into a text chat history.
    Type: Grant
    Filed: August 17, 2007
    Date of Patent: July 2, 2013
    Assignee: International Business Machines Corporation
    Inventors: Erik J. Burckart, Steve R. Campbell, Andrew Ivory, Aaron K. Shook
  • Patent number: 8478590
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: July 2, 2013
    Assignee: Google Inc.
    Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti Kristjansson
  • Patent number: 8478597
    Abstract: The present disclosure presents a useful metric for assessing the relative difficulty which non-native speakers face in pronouncing a given utterance and a method and systems for using such a metric in the evaluation and assessment of the utterances of non-native speakers. In an embodiment, the metric may be based on both known sources of difficulty for language learners and a corpus-based measure of cross-language sound differences. The method may be applied to speakers who primarily speak a first language speaking utterances in any non-native second language.
    Type: Grant
    Filed: January 10, 2006
    Date of Patent: July 2, 2013
    Assignee: Educational Testing Service
    Inventors: Derrick Higgins, Klaus Zechner, Yoko Futagi, Rene Lawless
  • Patent number: 8478578
    Abstract: Interpretation from a first language to a second language via one or more communication devices is performed through a communication network (e.g. phone network or the internet) using a server for performing recognition and interpretation tasks, comprising the steps of: receiving an input speech utterance in a first language on a first mobile communication device; conditioning said input speech utterance; first transmitting said conditioned input speech utterance to a server; recognizing said first transmitted speech utterance to generate one or more recognition results; interpreting said recognition results to generate one or more interpretation results in an interlingua; mapping the interlingua to a second language in a first selected format; second transmitting said interpretation results in the first selected format to a second mobile communication device; and presenting said interpretation results in a second selected format on said second communication device.
    Type: Grant
    Filed: January 9, 2009
    Date of Patent: July 2, 2013
    Assignee: Fluential, LLC
    Inventors: Farzad Ehsani, Demitrios Master, Elaine Drom Zuber
  • Publication number: 20130166299
    Abstract: A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body and is used to hang the apparatus body from a neck of a user, a first voice acquisition unit provided in the strap or the apparatus body, a second voice acquisition unit provided at a position where a distance of a sound wave propagation path from a mouth of the user is smaller than a distance of a sound wave propagation path from the mouth of the user to the first voice acquisition unit, and an identification unit that identifies a sound, in which first sound pressure acquired by the first voice acquisition unit is larger by a predetermined value or more than second sound pressure acquired by the second voice acquisition unit, on the basis of a result of comparison between the first sound pressure and the second sound pressure.
    Type: Application
    Filed: May 18, 2012
    Publication date: June 27, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Kei SHIMOTANI, Yohei NISHINO, Hirohito YONEYAMA, Kiyoshi IIDA, Akira FUJII, Haruo HARADA
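The sound-pressure comparison used by this family of wearable voice analyzers can be sketched as a per-frame decibel-gap test between the two acquisition units: sound pressure falls off quickly with distance, so the wearer's own voice shows a large near/far gap while a distant speaker does not. Which gap direction marks the wearer, the 3 dB margin, and the frame format are illustrative assumptions.

```python
def classify_frames(near_db, far_db, margin=3.0):
    """Label each frame 'wearer' when the acquisition unit nearer the
    mouth reads at least `margin` dB louder than the farther one."""
    return ["wearer" if n - f >= margin else "other"
            for n, f in zip(near_db, far_db)]

classify_frames([72.0, 60.0], [64.0, 59.0])
# frame 1: 8 dB gap -> wearer's own voice; frame 2: 1 dB gap -> other speaker
```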
  • Publication number: 20130166300
    Abstract: An electronic device includes a voice recognition analyzing module, a manipulation identification module, and a manipulating module. The voice recognition analyzing module is configured to recognize and analyze a voice of a user. The manipulation identification module is configured to, using the analyzed voice, identify an object on a screen and identify a requested manipulation associated with the object. The manipulating module is configured to perform the requested manipulation.
    Type: Application
    Filed: September 12, 2012
    Publication date: June 27, 2013
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Sachie Yokoyama, Hideki Tsutsui
  • Publication number: 20130166279
    Abstract: An automatic speech recognition system for recognizing a user voice command in noisy environments, including: matching means for matching elements retrieved from speech units forming said command with templates in a template library; characterized by processing means including a MultiLayer Perceptron for computing posterior templates (P(O_template(q))) stored as said templates in said template library; and means for retrieving posterior vectors (P(O_test(q))) from said speech units, said posterior vectors being used as said elements. The present invention also relates to a method for recognizing a user voice command in noisy environments.
    Type: Application
    Filed: February 21, 2013
    Publication date: June 27, 2013
    Applicant: VEOVOX SA
  • Publication number: 20130166298
    Abstract: A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body to make the apparatus body hung from a neck of a wearer, a first voice acquisition unit that acquires a voice of a speaker and is disposed in either a left or right strap when viewed from the wearer, a second voice acquisition unit that acquires the voice of the speaker and is disposed in the opposite strap in which the first voice acquisition unit is disposed, and an arrangement recognition unit that recognizes arrangements of the first and second voice acquisition units, when viewed from the wearer, by comparing a voice signal of the voice acquired by the first voice acquisition unit with sound pressure of a heart sound of the wearer acquired by the second voice acquisition unit.
    Type: Application
    Filed: April 20, 2012
    Publication date: June 27, 2013
    Applicant: FUJI XEROX CO., LTD.
    Inventors: Haruo HARADA, Hirohito YONEYAMA, Kei SHIMOTANI, Akira FUJII, Yohei NISHINO, Kiyoshi IIDA
  • Patent number: 8473294
    Abstract: Techniques for notifying at least one entity of an occurrence of an event in an audio signal are provided. At least one preference is obtained from the at least one entity. An occurrence of an event in the audio signal is determined. The event is related to at least one of at least one speaker and at least one topic. The at least one entity is notified of the occurrence of the event in the audio signal, in accordance with the at least one preference.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: June 25, 2013
    Assignee: International Business Machines Corporation
    Inventors: Hagai Aronowitz, Itzhack Goldberg, Ron Hoory
  • Patent number: 8468012
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
    Type: Grant
    Filed: May 26, 2010
    Date of Patent: June 18, 2013
    Assignee: Google Inc.
    Inventors: Matthew I. Lloyd, Trausti Kristjansson
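Adapting recognition to a device's location, at its simplest, can be sketched as choosing the acoustic model whose region is nearest the reported coordinates (so a Texan accent model serves a Dallas caller). The squared-degree distance, the model names, and the centroid table are assumptions for illustration, not the method claimed above.

```python
def pick_acoustic_model(lat, lon, models):
    """Choose the acoustic model whose region centroid is nearest to the
    device's reported location. `models` maps a model name to a
    (lat, lon) centroid; squared degrees suffice for a nearest pick."""
    def dist2(centroid):
        return (lat - centroid[0]) ** 2 + (lon - centroid[1]) ** 2
    return min(models, key=lambda name: dist2(models[name]))

models = {"us-south": (32.8, -96.8), "us-northeast": (40.7, -74.0)}
pick_acoustic_model(33.7, -84.4, models)  # Atlanta-area device -> "us-south"
```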
  • Patent number: 8467519
    Abstract: A system and method for processing calls in a call center are described. A call session from a caller via a session manager and including incoming text messages of a verbal speech stream is assigned. The incoming text messages are progressively visually presented throughout the call session to a live agent on an agent console operatively coupled to the session manager. The incoming text messages are progressively processed through a customer support scenario interactively monitored and controlled by the live agent via the agent console. The incoming text messages are processed through automated script execution in concert with the live agent. Outgoing text messages are converted into a synthesized speech stream. The synthesized speech stream is sent via the agent console to the caller.
    Type: Grant
    Filed: June 23, 2008
    Date of Patent: June 18, 2013
    Assignee: Intellisist, Inc.
    Inventors: Gilad Odinak, Alastair Sutherland, William A. Tolhurst
  • Patent number: 8463705
    Abstract: Embodiments of the invention broadly contemplate systems, methods, apparatuses and program products that leverage the mobile web, especially the spoken (telecom) web, to handle transactions. According to embodiments of the invention, in essence, a mobile device such as a phone is used as a terminal, remote authentication is employed, and a challenge response using a per transaction audio based code is used as a confirmation. Embodiments of the invention also provide further protection against repudiation, and greater trust in the transaction, by employing witnesses.
    Type: Grant
    Filed: February 28, 2010
    Date of Patent: June 11, 2013
    Assignee: International Business Machines Corporation
    Inventors: Anupam Joshi, Srinivas G. Narayana, Aaditeshwar Seth
  • Patent number: 8463608
    Abstract: A method and apparatus for updating a speech model on a multi-user speech recognition system with a personal speech model for a single user. A speech recognition system, for instance in a car, can include a generic speech model for comparison with the user speech input. A way of identifying a personal speech model, for instance in a mobile phone, is connected to the system. A mechanism is included for receiving personal speech model components, for instance a BLUETOOTH connection. The generic speech model is updated using the received personal speech model components. Speech recognition can then be performed on user speech using the updated generic speech model.
    Type: Grant
    Filed: March 12, 2012
    Date of Patent: June 11, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Barry Neil Dow, Eric William Janke, Daniel Lee Yuk Cheung, Benjamin Terrick Staniford
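The update step, in which received personal speech model components refine the generic model, could look like the sketch below: linear interpolation where both models define a unit, and plain adoption of units only the personal model knows. The interpolation weight and the flat per-unit representation are assumptions; the patent does not fix a combination rule.

```python
def update_generic_model(generic, personal, weight=0.3):
    """Blend per-unit parameters from a personal speech model into a
    generic one. `weight` is the trust placed in the personal components
    (an illustrative tuning constant)."""
    updated = dict(generic)
    for unit, p_val in personal.items():
        if unit in updated:
            # Unit known to both models: interpolate.
            updated[unit] = (1 - weight) * updated[unit] + weight * p_val
        else:
            # Unit only the personal model knows: adopt it outright.
            updated[unit] = p_val
    return updated
```

Recognition would then run against `updated` rather than the original generic model.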
  • Patent number: 8462969
    Abstract: Own voice recognition (OVR) for hearing aids detects time instances where the person wearing the device is speaking. Classification of the own voice is performed dependent on a fixed or adaptive detection threshold. Automatic tuning in a real-time system depends on general noise statistics in the input signals. The noise is removed from the received signal and is characterized by signal-to-noise ratio and noise color. An optimal detection threshold for own voice recognition is determined based on the noise characteristics. A noise detection model is created by smoothed Voronoi tessellation. Own voice detection is performed by a processor.
    Type: Grant
    Filed: April 19, 2011
    Date of Patent: June 11, 2013
    Assignee: Siemens Audiologische Technik GmbH
    Inventors: Heiko Claussen, Michael T. Loiacono, Henning Puder, Justinian Rosca
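The adaptive-threshold idea, where the detection threshold is derived from noise statistics in the input, can be sketched as follows. The quantile-based noise-floor estimate and the 6 dB margin are illustrative assumptions; the patent derives its threshold from richer noise characteristics (SNR and noise color).

```python
def estimate_noise_floor(frame_energies, quantile=0.1):
    """Crude noise-floor estimate: a low quantile of the frame energies."""
    ordered = sorted(frame_energies)
    idx = max(0, int(quantile * len(ordered)) - 1)
    return ordered[idx]

def adaptive_threshold(frame_energies, margin_db=6.0):
    """Detection threshold a fixed margin above the estimated noise floor.
    `margin_db` is an illustrative tuning constant, not from the patent."""
    floor = estimate_noise_floor(frame_energies)
    return floor * (10 ** (margin_db / 10.0))

def own_voice_frames(frame_energies):
    """Flag frames as own-voice when their energy exceeds the threshold."""
    thr = adaptive_threshold(frame_energies)
    return [e > thr for e in frame_energies]
```

Because the threshold tracks the noise floor, the same code adapts automatically to quiet and loud environments.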
  • Publication number: 20130144620
    Abstract: Various embodiments of the present invention for validating the authenticity of a website are provided. An example of a method according to the present invention comprises providing a website having an artifact, receiving a communication from a user, at a service provider, for validating the website associated with the service provider, inquiring from the user a description of the artifact, comparing the artifact on the website with the description given by the user, and generating an indication to the user based upon the comparison. The communication is over a first communication channel and the website is accessed over a second communication channel. The first communication channel is different from the second. The artifact can be displayed after a user session is identified.
    Type: Application
    Filed: December 6, 2011
    Publication date: June 6, 2013
    Applicant: TELCORDIA TECHNOLOGIES, INC.
    Inventors: Richard J. Lipton, Shoshana K. Loeb, Thimios Panagos
  • Publication number: 20130144619
    Abstract: Techniques for ability enhancement are described. Some embodiments provide an ability enhancement facilitator system (“AEFS”) configured to enhance voice conferencing among multiple speakers. In one embodiment, the AEFS receives data that represents utterances of multiple speakers who are engaging in a voice conference with one another. The AEFS then determines speaker-related information, such as by identifying a current speaker, locating an information item (e.g., an email message, document) associated with the speaker, or the like. The AEFS then informs a user of the speaker-related information, such as by presenting the speaker-related information on a display of a conferencing device associated with the user.
    Type: Application
    Filed: January 23, 2012
    Publication date: June 6, 2013
    Inventors: Richard T. Lord, Robert W. Lord, Nathan P. Myhrvold, Clarence T. Tegreene, Roderick A. Hyde, Lowell L. Wood, JR., Muriel Y. Ishikawa, Victoria Y.H. Wood, Charles Whitmer, Paramvir Bahl, Douglas C. Burger, Ranveer Chandra, William H. Gates, III, Paul Holman, Jordin T. Kare, Craig J. Mundie, Tim Paek, Desney S. Tan, Lin Zhong, Matthew G. Dyor
  • Publication number: 20130144621
    Abstract: Computer-implemented systems and methods are provided for assessing non-native spontaneous speech pronunciation. Speech recognition on digitized speech is performed using a non-native acoustic model trained with non-native speech to generate word hypotheses for the digitized speech. Time alignment is performed between the digitized speech and the word hypotheses using a reference acoustic model trained with native-quality speech. Statistics are calculated regarding individual words and phonemes in the word hypotheses based on the alignment. A plurality of features for use in assessing pronunciation of the speech are calculated based on the statistics, an assessment score is calculated based on one or more of the calculated features, and the assessment score is stored in a computer-readable memory.
    Type: Application
    Filed: January 31, 2013
    Publication date: June 6, 2013
    Applicant: EDUCATIONAL TESTING SERVICE
    Inventor: Educational Testing Service
  • Patent number: 8457963
    Abstract: Audio input to a user device is captured in a buffer and played back to the user while being sent to and recognized by an automatic speech recognition (ASR) system. Overlapping the playback with the speech recognition processing masks a portion of the true latency of the ASR system thus improving the user's perception of the ASR system's responsiveness. Further, upon hearing the playback, the user is intuitively guided to self-correct for any defects in the captured audio.
    Type: Grant
    Filed: March 30, 2010
    Date of Patent: June 4, 2013
    Assignee: Promptu Systems Corporation
    Inventor: Laurent Charriere
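The latency-masking idea, running playback of the captured buffer concurrently with the ASR round trip so the user hears something while recognition is in flight, reduces to two concurrent tasks. The sketch below stands in for real audio I/O and a real ASR call with timed sleeps; everything here is an illustrative stand-in, not the patented implementation.

```python
import threading
import time

def play_back(audio_chunks):
    """Stand-in for audio playback: a timed sleep per buffered chunk."""
    for _ in audio_chunks:
        time.sleep(0.01)

def recognize(audio_chunks, result):
    """Stand-in for the ASR round trip: a delay, then a transcript."""
    time.sleep(0.05)
    result.append("hello world")

def capture_and_recognize(audio_chunks):
    """Start playback and recognition concurrently so playback masks a
    portion of the ASR latency perceived by the user."""
    result = []
    t_play = threading.Thread(target=play_back, args=(audio_chunks,))
    t_asr = threading.Thread(target=recognize, args=(audio_chunks, result))
    t_play.start()
    t_asr.start()
    t_play.join()
    t_asr.join()
    return result[0]
```

The perceived wait is roughly `max(playback, asr)` instead of `playback + asr`, which is the responsiveness gain the abstract describes.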
  • Patent number: 8457964
    Abstract: A method and system for determining and communicating biometrics of a recorded speaker in a voice transcription process. An interactive voice response system receives a request from a user for a transcription of a voice file. A profile associated with the requesting user is obtained, wherein the profile comprises biometric parameters and preferences defined by the user. The requested voice file is analyzed for biometric elements according to the parameters specified in the user's profile. Responsive to detecting biometric elements in the voice file that conform to the parameters specified in the user's profile, a transcription output of the voice file is modified according to the preferences specified in the user's profile for the detected biometric elements to form a modified transcription output file. The modified transcription output file may then be provided to the requesting user.
    Type: Grant
    Filed: September 5, 2012
    Date of Patent: June 4, 2013
    Assignee: International Business Machines Corporation
    Inventor: Peeyush Jaiswal
  • Patent number: 8457974
    Abstract: Methods and system for authenticating a user are disclosed. The present invention includes accessing a collection of personal information related to the user. The present invention also includes performing an authentication operation that is based on the collection of personal information. The authentication operation incorporates at least one dynamic component and prompts the user to give an audible utterance. The audible utterance is compared to a stored voiceprint.
    Type: Grant
    Filed: July 26, 2012
    Date of Patent: June 4, 2013
    Assignee: Microsoft Corporation
    Inventor: Kuansan Wang
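The combination of a dynamic challenge drawn from personal information with a voiceprint comparison can be sketched as below. The prompt wording, the cosine-similarity voiceprint match, and the 0.9 threshold are all illustrative assumptions; the patent specifies neither the feature representation nor the matching rule.

```python
import math
import random

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def authenticate(personal_info, stored_voiceprint, capture_utterance,
                 threshold=0.9, rng=random):
    """Dynamic component: a randomly chosen personal fact drives the
    prompt, so a replayed recording of a previous session will not match.
    `capture_utterance` returns (recognized_text, voice_features)."""
    field, expected = rng.choice(sorted(personal_info.items()))
    prompt = f"Please say your {field}"
    spoken_text, voice_features = capture_utterance(prompt)
    return (spoken_text == expected
            and cosine_similarity(voice_features, stored_voiceprint) >= threshold)
```

Both checks must pass: the content of the utterance must match the randomly selected fact, and the voice itself must match the stored voiceprint.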
  • Patent number: 8457946
    Abstract: Architecture for correcting incorrect recognition results in an Asian language speech recognition system. A spelling mode can be launched in response to receiving speech input, the spelling mode for correcting incorrect spelling of the recognition results or generating new words. Correction can be obtained using speech and/or manual selection and entry. The architecture facilitates correction in a single pass, rather than multiples times as in conventional systems. Words corrected using the spelling mode are corrected as a unit and treated as a word. The spelling mode applies to languages of at least the Asian continent, such as Simplified Chinese, Traditional Chinese, and/or other Asian languages such as Japanese.
    Type: Grant
    Filed: April 26, 2007
    Date of Patent: June 4, 2013
    Assignee: Microsoft Corporation
    Inventors: Shiun-Zu Kuo, Kevin E. Feige, Yifan Gong, Taro Miwa, Arun Chitrapu
  • Patent number: 8457973
    Abstract: A method and a processing device for managing an interactive speech recognition system are provided. The system determines whether a voice input relates, at least partially, to the expected input of any one of a group of menus different from the current menu. If the voice input relates to the expected input of one of those menus, the system skips to that menu. The group of menus different from the current menu includes menus at multiple hierarchical levels.
    Type: Grant
    Filed: March 4, 2006
    Date of Patent: June 4, 2013
    Assignee: AT&T Intellectual Property II, L.P.
    Inventor: Harry Edward Blanchard
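The cross-menu skip can be sketched as matching the utterance against the expected inputs of every menu, not just the current one, and jumping to whichever menu matches. The menu names and phrases below are hypothetical; substring matching stands in for whatever grammar-based matching a real IVR would use.

```python
# Hypothetical hierarchical IVR menus and their expected inputs.
MENUS = {
    "main":         {"billing", "support", "account"},
    "main/billing": {"pay bill", "balance", "autopay"},
    "main/support": {"reset password", "outage"},
}

def route_utterance(current_menu, utterance):
    """Return the menu whose expected inputs the utterance (at least
    partially) matches, searching all menus so the caller can skip
    across hierarchy levels; stay put when nothing else matches."""
    text = utterance.lower()
    for menu, expected in MENUS.items():
        if menu == current_menu:
            continue
        if any(phrase in text for phrase in expected):
            return menu
    return current_menu
```

So a caller sitting in the top-level menu who says "I want to pay my bill, pay bill" lands directly in the billing submenu without traversing the intermediate level.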
  • Patent number: 8452593
    Abstract: A projection apparatus with speech indication and a control method thereof are provided. The projection apparatus comprises a storage unit, a transmission interface, a process unit, and an output unit. The storage unit is configured to store a plurality of speech data. The transmission interface is configured to connect to an external apparatus for accessing the storage unit. The process unit is configured to select at least one of the speech data according to the present state of the projection apparatus. The output unit is configured to output the selected speech datum to broadcast the speech indication.
    Type: Grant
    Filed: August 31, 2007
    Date of Patent: May 28, 2013
    Assignee: Delta Electronics, Inc.
    Inventors: Yi-Hsiang Huang, Yuan Ming Hsu, Jimmy Su
  • Patent number: 8447614
    Abstract: System and process for audio authentication of an individual or speaker, including a processor for decomposing an audio signal received at the sensor into vectors representative of the speaker to be authenticated; for transforming the super-vector V of the speaker, resulting from the concatenation of the vectors associated with the said speaker, into binary data 1001100 . . . 0 by taking as an input the mean super-vector M and comparing the super-vector V of the speaker with the mean super-vector M; the said binary data thus obtained being transmitted to a module for extracting the speaker authentication, taking as an input the public keys Kpub(l), in order to authenticate the speaker and/or to generate a cryptographic key associated with the speaker.
    Type: Grant
    Filed: December 22, 2009
    Date of Patent: May 21, 2013
    Assignee: Thales
    Inventors: François Capman, Sandra Marcello, Jean Martinelli
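The binarization step, turning the speaker super-vector V into binary data by comparison with the mean super-vector M, can be sketched as a component-wise comparison. The specific rule (1 where the speaker component exceeds the mean, 0 otherwise) is an assumption; the abstract only states that V is compared with M.

```python
def binarize_supervector(v, m):
    """Map the speaker super-vector V to a binary string by comparing
    each component against the mean super-vector M: '1' where the
    speaker lies above the mean, '0' otherwise."""
    assert len(v) == len(m), "super-vectors must have equal dimension"
    return "".join("1" if vi > mi else "0" for vi, mi in zip(v, m))
```

The resulting bit string is what would then feed the key-extraction module described in the abstract.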
  • Patent number: RE44248
    Abstract: This invention combines methodologies that enhance voice recognition dictation. It describes features for moving speaker voice files, eliminating redundant training of speech recognition dictation applications. It defines how to create synthetic voice models, reducing speaker dependency. It combines accuracy and performance into a single measure called RAP Rate. Moreover, the invention describes enhancing voice recognition applications and systems by measuring/adjusting hardware and software features for optimal voice recognition dictation, incorporating methodical processes based on RAP Rate. Using these approaches and tools, the invention includes a method for constructing a handheld transcriber that immediately translates audio speech into text with real-time display. The invention describes a method for applying RAP Rate and synthetic voice models to applications like voice mail to text. With the ability to move and translate voice models, the invention describes new services that could be provided for a fee.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: May 28, 2013
    Inventor: Darrell A. Poirier
  • Patent number: RE44326
    Abstract: A method and system of speech recognition presented by a back channel from multiple user sites within a network supporting cable television and/or video delivery is disclosed.
    Type: Grant
    Filed: November 3, 2011
    Date of Patent: June 25, 2013
    Assignee: Promptu Systems Corporation
    Inventors: Theodore Calderone, Paul M. Cook, Mark J. Foster