Preliminary Matching Patents (Class 704/247)
-
Patent number: 8976943
Abstract: Provided is a method and a telephone-based system with voice-verification capabilities that enable a user to safely and securely conduct transactions with his or her online financial transaction program account over the phone in a convenient and user-friendly fashion, without having to depend on an internet connection.
Type: Grant
Filed: September 25, 2012
Date of Patent: March 10, 2015
Assignee: Ebay Inc.
Inventor: Will Tonini
-
Patent number: 8977549
Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
Type: Grant
Filed: September 26, 2013
Date of Patent: March 10, 2015
Assignee: Nuance Communications, Inc.
Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
-
Patent number: 8977547
Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold Tl; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
Type: Grant
Filed: October 8, 2009
Date of Patent: March 10, 2015
Assignee: Mitsubishi Electric Corporation
Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
-
Publication number: 20150046162
Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.
Type: Application
Filed: October 27, 2014
Publication date: February 12, 2015
Applicant: Nuance Communications, Inc.
Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit
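The liveness-scoring idea in the abstract above can be sketched in a few lines. This is an illustrative assumption, not the publication's method: the weights and threshold below are invented for the example, and the publication does not specify how the two scores are combined.

```python
# Hypothetical sketch: combine a text-dependent match score (pass-phrase vs.
# enrolled voice-print) with a text-independent match score (free speech)
# into one liveness score. Weights and threshold are illustrative only.

def liveness_score(td_score: float, ti_score: float,
                   td_weight: float = 0.6, ti_weight: float = 0.4) -> float:
    """Weighted combination of the two matching scores (both in [0, 1])."""
    return td_weight * td_score + ti_weight * ti_score

def is_live(td_score: float, ti_score: float, threshold: float = 0.7) -> bool:
    """Accept the speaker as live when the combined score clears a threshold."""
    return liveness_score(td_score, ti_score) >= threshold

print(is_live(0.9, 0.8))  # strong match on both samples -> True
print(is_live(0.9, 0.1))  # replayed pass-phrase, poor free speech -> False
```

A replay attack tends to score well on the fixed pass-phrase but poorly on the free-speech sample, which is why combining both scores is more robust than either alone.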
-
Patent number: 8948350
Abstract: Disclosed is a secure telephone call management system for authenticating users of a telephone system in an institutional facility. Authentication of the users is accomplished by using a personal identification number, preferably in conjunction with speaker independent voice recognition and speaker dependent voice identification. When a user first enters the system, the user speaks his or her name, which is used as a sample voice print. During each subsequent use of the system, the user is required to speak his or her name. Voice identification software is used to verify that the provided speech matches the sample voice print. The secure system includes accounting software to limit access based on funds in a user's account or other related limitations. Management software implements widespread or local changes to the system and can modify or set any number of user account parameters.
Type: Grant
Filed: July 11, 2008
Date of Patent: February 3, 2015
Assignee: Global Tel*Link Corporation
Inventor: Stephen L. Hodge
-
Patent number: 8938382
Abstract: An item of information (212) is transmitted to a distal computer (220), translated to a different sense modality and/or language (222) in substantially real time, and the translation (222) is transmitted back to the location (211) from which the item was sent. The device sending the item is preferably a wireless device, and more preferably a cellular or other telephone (210). The device receiving the translation is also preferably a wireless device, and more preferably a cellular or other telephone, and may advantageously be the same device as the sending device. The item of information (212) preferably comprises a sentence of human speech having at least ten words, and the translation is a written expression of the sentence. All of the steps of transmitting the item of information, executing the program code, and transmitting the translated information preferably occur in less than 60 seconds of elapsed time.
Type: Grant
Filed: March 21, 2012
Date of Patent: January 20, 2015
Assignee: Ulloa Research Limited Liability Company
Inventor: Robert D. Fish
-
Patent number: 8938388
Abstract: Maintaining and supplying a plurality of speech models is provided. A plurality of speech models and metadata for each speech model are stored. A query for a speech model is received from a source. The query includes one or more conditions. The speech model with metadata most closely matching the supplied one or more conditions is determined. The determined speech model is provided to the source. A refined speech model is received from the source, and the refined speech model is stored.
Type: Grant
Filed: July 9, 2012
Date of Patent: January 20, 2015
Assignee: International Business Machines Corporation
Inventors: Bin Jia, Ying Liu, E. Feng Lu, Jia Wu, Zhen Zhang
-
Patent number: 8930191
Abstract: Methods, systems, and computer readable storage medium related to operating an intelligent digital assistant are disclosed. A user request is received, the user request including at least a speech input received from a user. In response to the user request, (1) an echo of the speech input based on a textual interpretation of the speech input, and (2) a paraphrase of the user request based at least in part on a respective semantic interpretation of the speech input are presented to the user.
Type: Grant
Filed: March 4, 2013
Date of Patent: January 6, 2015
Assignee: Apple Inc.
Inventors: Thomas Robert Gruber, Harry Joseph Saddler, Adam John Cheyer, Dag Kittlaus, Christopher Dean Brigham, Richard Donald Giuli, Didier Rene Guzzoni, Marcello Bastea-Forte
-
Patent number: 8930187
Abstract: An apparatus for utilizing textual data and acoustic data corresponding to speech data to detect sentiment may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including evaluating textual data and acoustic data corresponding to voice data associated with captured speech content. The computer program code may further cause the apparatus to analyze the textual data and the acoustic data to detect whether the textual data or the acoustic data includes one or more words indicating at least one sentiment of a user that spoke the speech content. The computer program code may further cause the apparatus to assign at least one predefined sentiment to at least one of the words in response to detecting that the word(s) indicates the sentiment of the user. Corresponding methods and computer program products are also provided.
Type: Grant
Filed: January 3, 2012
Date of Patent: January 6, 2015
Assignee: Nokia Corporation
Inventors: Imre Attila Kiss, Joseph Polifroni, Francois Mairesse, Mark Adler
-
Patent number: 8924214
Abstract: A method for detecting and recognizing speech is provided that remotely detects body motions from a speaker during vocalization with one or more radar sensors. Specifically, the radar sensors include a transmit aperture that transmits one or more waveforms towards the speaker, and each of the waveforms has a distinct wavelength. A receiver aperture is configured to receive the scattered radio frequency energy from the speaker. Doppler signals correlated with the speaker vocalization are extracted with a receiver. Digital signal processors are configured to develop feature vectors utilizing the vocalization Doppler signals, and words associated with the feature vectors are recognized with a word classifier.
Type: Grant
Filed: June 7, 2011
Date of Patent: December 30, 2014
Assignee: The United States of America, as represented by the Secretary of the Navy
Inventors: Jefferson M Willey, Todd Stephenson, Hugh Faust, James P. Hansen, George J Linde, Carol Chang, Justin Nevitt, James A Ballas, Thomas Herne Crystal, Vincent Michael Stanford, Jean W. De Graaf
-
Patent number: 8924197
Abstract: Disclosed are systems, methods, and computer readable media for converting a natural language query into a logical query. The method embodiment comprises receiving a natural language query and converting the natural language query using an extensible engine to generate a logical query, the extensible engine being linked to the toolkit and knowledge base. In one embodiment, a natural language query can be processed in a domain independent method to generate a logical query.
Type: Grant
Filed: October 30, 2007
Date of Patent: December 30, 2014
Assignee: Semantifi, Inc.
Inventors: Sreenivasa Rao Pragada, Viswanath Dasari, Abhijit A Patil
-
Patent number: 8918320
Abstract: An apparatus for generating a review based in part on detected sentiment may include a processor and memory storing executable computer code causing the apparatus to at least perform operations including determining a location(s) of the apparatus and a time(s) that the location(s) was determined responsive to capturing voice data of speech content associated with spoken reviews of entities. The computer program code may further cause the apparatus to analyze textual and acoustic data corresponding to the voice data to detect whether the textual or acoustic data includes words indicating a sentiment(s) of a user speaking the speech content. The computer program code may further cause the apparatus to generate a review of an entity corresponding to a spoken review(s) based on assigning a predefined sentiment to a word(s) responsive to detecting that the word indicates the sentiment of the user. Corresponding methods and computer program products are also provided.
Type: Grant
Filed: January 3, 2012
Date of Patent: December 23, 2014
Assignee: Nokia Corporation
Inventors: Mark Adler, Imre Attila Kiss, Francois Mairesse, Joseph Polifroni
-
Patent number: 8918406
Abstract: A method of processing content files may include receiving the content file, employing processing circuitry to determine an identity score of a source of at least a portion of the content file, to determine a word score for the content file and to determine a metadata score for the content file, determining a composite priority score based on the identity score, the word score and the metadata score, and associating the composite priority score with the content file for electronic provision of the content file together with the composite priority score to a human analyst.
Type: Grant
Filed: December 14, 2012
Date of Patent: December 23, 2014
Assignee: Second Wind Consulting LLC
Inventor: Donna Rober
-
Patent number: 8918316
Abstract: The content of a media program is recognized by analyzing its audio content to extract therefrom prescribed features, which are compared to a database of features associated with identified content. The identity of the content within the database that has features that most closely match the features of the media program being played is supplied as the identity of the program being played. The features are extracted from a frequency domain version of the media program by a) filtering the coefficients to reduce their number, e.g., using triangular filters; b) grouping a number of consecutive outputs of triangular filters into segments; and c) selecting those segments that meet prescribed criteria, such as those segments that have the largest minimum segment energy with prescribed constraints that prevent the segments from being too close to each other. The triangular filters may be log-spaced and their output may be normalized.
Type: Grant
Filed: July 29, 2003
Date of Patent: December 23, 2014
Assignee: Alcatel Lucent
Inventors: Jan I Ben, Christopher J Burges, Madjid Sam Mousavi, Craig R. Nohl
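The feature pipeline named in the abstract above (triangular filtering, segment grouping, largest-minimum-energy selection) can be sketched as follows. This is an illustrative reconstruction, not the patented implementation: the filter centers, width, and segment length are assumptions chosen for the example.

```python
# Sketch of steps a)-c) from the abstract: reduce spectral coefficients with
# triangular filters, group consecutive filter outputs into segments, and
# pick the segment whose minimum energy is largest.

def triangular_filter(center, width, n_bins):
    """Weights of one triangular filter over n_bins spectral bins."""
    return [max(0.0, 1.0 - abs(i - center) / width) for i in range(n_bins)]

def filter_energies(spectrum, centers, width):
    """One energy per triangular filter (dot product with the spectrum)."""
    n = len(spectrum)
    return [sum(w * s for w, s in zip(triangular_filter(c, width, n), spectrum))
            for c in centers]

def best_segment(energies, seg_len):
    """Start index of the run of seg_len consecutive filter outputs whose
    minimum energy is largest -- the selection criterion in the abstract."""
    best_start, best_min = 0, float("-inf")
    for start in range(len(energies) - seg_len + 1):
        seg_min = min(energies[start:start + seg_len])
        if seg_min > best_min:
            best_start, best_min = start, seg_min
    return best_start
```

Maximizing the minimum energy favors segments that are loud throughout, which makes the resulting fingerprint less sensitive to momentary silence; the abstract's spacing constraint between selected segments is omitted here for brevity.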
-
Patent number: 8914290
Abstract: Method and apparatus that dynamically adjusts operational parameters of a text-to-speech engine in a speech-based system. A voice engine or other application of a device provides a mechanism to alter the adjustable operational parameters of the text-to-speech engine. In response to one or more environmental conditions, the adjustable operational parameters of the text-to-speech engine are modified to increase the intelligibility of synthesized speech.
Type: Grant
Filed: May 18, 2012
Date of Patent: December 16, 2014
Assignee: Vocollect, Inc.
Inventors: James Hendrickson, Debra Drylie Scott, Duane Littleton, John Pecorari, Arkadiusz Slusarczyk
-
Patent number: 8903727
Abstract: A machine, system and method for user-guided teaching and modifications of voice commands and actions to be executed by a conversational learning system. The machine includes a system bus for communicating data and control signals received from the conversational learning system to a computer system, a vehicle data and control bus for connecting devices and sensors in the machine, a bridge module for connecting the vehicle data and control bus to the system bus, machine subsystems coupled to the vehicle data and control bus having a respective user interface for receiving a voice command or input signal from a user, a memory coupled to the system bus for storing action command sequences learned for a new voice command and a processing unit coupled to the system bus for automatically executing the action command sequences learned when the new voice command is spoken.
Type: Grant
Filed: March 6, 2013
Date of Patent: December 2, 2014
Assignee: Nuance Communications, Inc.
Inventors: Liam David Comerford, Mahesh Viswanathan
-
Patent number: 8892446
Abstract: Methods, systems, and computer readable storage medium related to operating an intelligent digital assistant are disclosed. A user request is received, the user request including at least a speech input received from the user. The user request is processed to obtain a representation of user intent, where the representation of user intent associates the user request with a task flow operationalizing a requested task, and the task flow is operable to invoke a plurality of services each supporting functions according to a respective plurality of service parameters. Based on the representation of user intent, one or more relevant task parameters are identified from a plurality of task parameters of the task flow. A subset of the plurality of services are selectively invoked during execution of the task flow, where the selectively invoked subset of the plurality of services support functions according to the identified one or more relevant task parameters.
Type: Grant
Filed: December 21, 2012
Date of Patent: November 18, 2014
Assignee: Apple Inc.
Inventors: Adam John Cheyer, Didier Rene Guzzoni, Thomas Robert Gruber, Christopher Dean Brigham
-
Patent number: 8874442
Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.
Type: Grant
Filed: April 17, 2013
Date of Patent: October 28, 2014
Assignee: Nuance Communications, Inc.
Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit
-
Patent number: 8874440
Abstract: A speech detection apparatus and method are provided. The speech detection apparatus and method determine whether a frame is speech or not using feature information extracted from an input signal. The speech detection apparatus may estimate a situation related to an input frame and determine which feature information is required for speech detection for the input frame in the estimated situation. The speech detection apparatus may detect a speech signal using dynamic feature information that may be more suitable to the situation of a particular frame, instead of using the same feature information for each and every frame.
Type: Grant
Filed: April 16, 2010
Date of Patent: October 28, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventors: Chi-youn Park, Nam-hoon Kim, Jeong-mi Cho
-
Patent number: 8831942
Abstract: A method is provided for identifying a gender of a speaker. The method steps include obtaining speech data of the speaker, extracting vowel-like speech frames from the speech data, analyzing the vowel-like speech frames to generate a feature vector having pitch values corresponding to the vowel-like frames, analyzing the pitch values to generate a most frequent pitch value, determining, in response to the most frequent pitch value being between a first pre-determined threshold and a second pre-determined threshold, an output of a male Gaussian Mixture Model (GMM) and an output of a female GMM using the pitch values as inputs to the male GMM and the female GMM, and identifying the gender of the speaker by comparing the output of the male GMM and the output of the female GMM based on a pre-determined criterion.
Type: Grant
Filed: March 19, 2010
Date of Patent: September 9, 2014
Assignee: Narus, Inc.
Inventor: Antonio Nucci
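The decision procedure in the claim above (gate on the most frequent pitch, then compare male and female model likelihoods) can be sketched with toy one-dimensional mixtures. The thresholds, mixture parameters, and use of single-component models are all assumptions for illustration; the patent does not disclose its model parameters.

```python
# Sketch of the claimed pipeline: find the most frequent pitch value, gate it
# against two thresholds, then compare likelihoods under a male and a female
# pitch model. Thresholds and model parameters below are illustrative only.
import math
from collections import Counter

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def gmm_log_likelihood(pitches, components):
    """Log-likelihood of pitch values under a 1-D mixture of (weight, mean, var)."""
    return sum(math.log(sum(w * gaussian_pdf(p, m, v) for w, m, v in components))
               for p in pitches)

# Toy single-component "GMMs" centered on typical pitch ranges (assumed values).
MALE_GMM = [(1.0, 120.0, 400.0)]    # ~120 Hz fundamental
FEMALE_GMM = [(1.0, 210.0, 400.0)]  # ~210 Hz fundamental

def identify_gender(pitches, lo=60.0, hi=300.0):
    most_frequent = Counter(pitches).most_common(1)[0][0]
    if not (lo <= most_frequent <= hi):  # the claim's two-threshold gate
        return "unknown"
    male = gmm_log_likelihood(pitches, MALE_GMM)
    female = gmm_log_likelihood(pitches, FEMALE_GMM)
    return "male" if male > female else "female"
```

The gate on the most frequent pitch cheaply rejects frames whose pitch tracker output is implausible before the more expensive model comparison runs.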
-
Patent number: 8825482
Abstract: Consumer electronic devices have been developed with enormous information processing capabilities, high quality audio and video outputs, large amounts of memory, and may also include wired and/or wireless networking capabilities. Additionally, relatively unsophisticated and inexpensive sensors, such as microphones, video cameras, GPS or other position sensors, when coupled with devices having these enhanced capabilities, can be used to detect subtle features about users and their environments. A variety of audio, video, simulation and user interface paradigms have been developed to utilize the enhanced capabilities of these devices. These paradigms can be used separately or together in any combination. One paradigm automatically creates user identities using speaker identification. Another paradigm includes a control button with 3-axis pressure sensitivity for use with game controllers and other input devices.
Type: Grant
Filed: September 15, 2006
Date of Patent: September 2, 2014
Assignee: Sony Computer Entertainment Inc.
Inventors: Gustavo Hernandez-Abrego, Xavier Menendez-Pidal, Steven Osman, Ruxin Chen, Rishi Deshpande, Care Michaud-Wideman, Richard Marks, Eric Larsen, Xiaodong Mao
-
Patent number: 8818816
Abstract: A voice recognition device includes a voice input unit 11 for inputting a voice of an uttered button name to convert the voice into an electric signal, a voice recognition processing unit 12 for performing a voice recognition process according to a sound signal sent thereto, as the electric signal, from the voice input unit, a button candidate detecting unit 13 for detecting, as a button candidate, a button having a button name which partially matches a voice recognition result acquired by the voice recognition processing unit, a display control unit 15 for, when a plurality of candidate buttons are detected by the button candidate detecting unit, producing a screen showing a state in which at least one of the plurality of button candidates is selected, and a display unit 16 for displaying the screen produced by the display control unit.
Type: Grant
Filed: April 23, 2009
Date of Patent: August 26, 2014
Assignee: Mitsubishi Electric Corporation
Inventors: Yuzuru Inoue, Takayoshi Chikuri, Yuki Furumoto
-
Patent number: 8818807
Abstract: This invention describes methods for implementing human speech recognition. The methods described here use sub-events, which are sounds between spaces (typically a fully spoken word), that are then compared with a library of sub-events. All sub-events are packaged with their own speech recognition function as individual units. This invention illustrates how this model can be used as a Large Vocabulary Speech Recognition System.
Type: Grant
Filed: May 24, 2010
Date of Patent: August 26, 2014
Inventor: Darrell Poirier
-
Patent number: 8812318
Abstract: One-to-many comparisons of callers' voice prints with known voice prints to identify any matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract at least a portion of the customer's voice to create a customer voice print, and it formats the segmented voice print for network transmission to a server. The server compares the customer's voice print with multiple known voice prints to determine any matches, meaning that the customer's voice print and one of the known voice prints are likely from the same person. The identification of any matches can be used for a variety of purposes, such as determining whether to authorize a transaction requested by the customer.
Type: Grant
Filed: February 6, 2012
Date of Patent: August 19, 2014
Assignee: III Holdings 1, LLC
Inventors: Vicki Broman, Vernon Marshall, Seshasayee Bellamkonda, Marcel Leyva, Cynthia Hanson
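The one-to-many comparison step described above reduces, at its core, to scoring one caller's voice-print vector against every print in a gallery. The sketch below assumes voice prints are fixed-length embedding vectors compared by cosine similarity; the patent does not specify the print representation or the matching metric, so both are assumptions.

```python
# Illustrative one-to-many voice-print matching: score a caller's print
# against every known print and return the labels that clear a threshold.
# Vector representation, cosine similarity, and the 0.9 threshold are assumed.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def find_matches(caller_print, known_prints, threshold=0.9):
    """Return labels of known prints likely from the same speaker."""
    return [label for label, vec in known_prints.items()
            if cosine_similarity(caller_print, vec) >= threshold]
```

In practice the threshold trades off false accepts against false rejects, and a production system would calibrate it on held-out caller data rather than fix it a priori.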
-
Patent number: 8812326
Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.
Type: Grant
Filed: August 6, 2013
Date of Patent: August 19, 2014
Assignee: Promptu Systems Corporation
Inventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
-
Patent number: 8804973
Abstract: In an example signal clustering apparatus, a feature of a signal is divided into segments. A first feature vector of each segment is calculated, the first feature vector having a plurality of elements corresponding to each reference model. A value of an element attenuates when a feature of the segment shifts from a center of a distribution of the reference model corresponding to the element. A similarity between two reference models is calculated. A second feature vector of each segment is calculated, the second feature vector having a plurality of elements corresponding to each reference model. A value of an element is a weighted sum, and segments whose second feature vectors have similar element values are clustered to one class.
Type: Grant
Filed: March 19, 2012
Date of Patent: August 12, 2014
Assignee: Kabushiki Kaisha Toshiba
Inventors: Makoto Hirohata, Kazunori Imoto, Hisashi Aoki
-
Patent number: 8805685
Abstract: Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
Type: Grant
Filed: August 5, 2013
Date of Patent: August 12, 2014
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Horst J. Schroeter
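The variance test described above rests on a simple observation: a human never says the same phrase twice in exactly the same way, while a replayed or synthesized sample can be bit-for-bit identical across attempts. A minimal sketch, assuming samples arrive as per-repetition feature vectors and using an invented variance threshold:

```python
# Illustrative variance check for synthetic-speech detection: reject
# verification when repeated samples of the same phrase show (near-)zero
# variance. The feature representation and threshold are assumptions.

def verify_live_samples(samples, min_variance=1e-3):
    """samples: list of equal-length feature vectors, one per repetition.
    Returns True (verify) when average per-dimension variance is sufficient,
    False (deny) when the repetitions are suspiciously identical."""
    n = len(samples)
    dims = len(samples[0])
    total_var = 0.0
    for d in range(dims):
        vals = [s[d] for s in samples]
        mean = sum(vals) / n
        total_var += sum((v - mean) ** 2 for v in vals) / n
    return total_var / dims > min_variance
```

As the abstract notes, the threshold would be tuned to the required authentication certainty: a stricter deployment raises `min_variance` and asks for more repetitions when the result is inconclusive.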
-
Patent number: 8793130
Abstract: A method of generating a confidence measure generator is provided for use in a voice search system, the voice search system including voice search components comprising a speech recognition system, a dialog manager and a search system. The method includes selecting voice search features, from a plurality of the voice search components, to be considered by the confidence measure generator in generating a voice search confidence measure. The method includes training a model, using a computer processor, to generate the voice search confidence measure based on selected voice search features.
Type: Grant
Filed: March 23, 2012
Date of Patent: July 29, 2014
Assignee: Microsoft Corporation
Inventors: Ye-Yi Wang, Yun-Cheng Ju, Dong Yu
-
Patent number: 8775181
Abstract: Interpretation from a first language to a second language via one or more communication devices is performed through a communication network (e.g. phone network or the internet) using a server for performing recognition and interpretation tasks, comprising the steps of: receiving an input speech utterance in a first language on a first mobile communication device; conditioning said input speech utterance; first transmitting said conditioned input speech utterance to a server; recognizing said first transmitted speech utterance to generate one or more recognition results; interpreting said recognition results to generate one or more interpretation results in an interlingua; mapping the interlingua to a second language in a first selected format; second transmitting said interpretation results in the first selected format to a second mobile communication device; and presenting said interpretation results in a second selected format on said second communication device.
Type: Grant
Filed: July 2, 2013
Date of Patent: July 8, 2014
Assignee: Fluential, LLC
Inventors: Farzad Ehsani, Demitrios Master, Elaine Drom Zuber
-
Patent number: 8775180
Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In yet still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to seek to identify instances of agent non-compliance, of fraud, or of quality-analysis issues.
Type: Grant
Filed: November 26, 2012
Date of Patent: July 8, 2014
Assignee: West Corporation
Inventors: Mark J. Pettay, Fonda J. Narke
-
Publication number: 20140188472
Abstract: A computer-implemented method, system and/or program product update voice prints over time. A receiving computer receives an initial voice print. A determining period of time is calculated for that initial voice print. This determining period of time is a length of time during which an expected degree of change in subsequent voice prints, in comparison to the initial voice print and according to a speaker's subsequent age, is predicted to occur. A new voice print is received after the determining period of time has passed, and the new voice print is compared with the initial voice print. In response to a change to the new voice print falling within the expected degree of change in comparison to the initial voice print, a voice print store is updated with the new voice print.
Type: Application
Filed: March 7, 2014
Publication date: July 3, 2014
Applicant: Nuance Communications, Inc.
Inventors: Sheri G. Daye, Peeyush Jaiswal, Fang Wang
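The update logic in the abstract above combines two checks: enough time has passed (the age-dependent "determining period"), and the observed change still falls inside the expected band. A minimal sketch, where the age brackets, period lengths, and expected-change bound are all invented for illustration (the publication does not disclose them):

```python
# Illustrative voice-print aging policy: pick an age-dependent re-check
# period, then update the stored print only when that period has elapsed
# and the observed change is within the expected degree of change.
from datetime import date, timedelta

def determining_period(age_years: int) -> timedelta:
    """Time over which the expected degree of voice change is predicted
    to occur. Younger voices change faster, so re-check sooner (assumed)."""
    if age_years < 18:
        return timedelta(days=180)
    if age_years < 60:
        return timedelta(days=730)
    return timedelta(days=365)

def should_update(enrolled_on: date, age_years: int, today: date,
                  change: float, expected_change: float = 0.2) -> bool:
    """True when the period has passed and the change is within the expected
    band; a larger change would instead warrant manual re-enrollment."""
    period_passed = today - enrolled_on >= determining_period(age_years)
    return period_passed and change <= expected_change
```

Gating on the expected degree of change matters for security: silently accepting an arbitrarily large drift would let an impostor gradually replace the enrolled print.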
-
Patent number: 8762149
Abstract: The present invention refers to a method for verifying the identity of a speaker based on the speaker's voice, comprising the steps of: a) receiving a voice utterance; b) using biometric voice data to verify (10) that the speaker's voice corresponds to the speaker whose identity is to be verified, based on the received voice utterance; and c) verifying (12, 13) that the received voice utterance is not falsified, preferably after having verified the speaker's voice; d) accepting (16) the speaker's identity to be verified in case both verification steps give a positive result, and not accepting (15) the speaker's identity to be verified if any of the verification steps gives a negative result. The invention further refers to a corresponding computer readable medium and a computer.
Type: Grant
Filed: December 10, 2008
Date of Patent: June 24, 2014
Inventors: Marta Sánchez Asenjo, Alfredo Gutiérrez Navarro, Alberto Martín de los Santos de las Heras, Marta García Gomar
-
Patent number: 8751241
Abstract: The current invention provides a method and system for enabling a device function of a vehicle. A speech input stream is received at a telematics unit. A speech input context is determined for the received speech input stream. The received speech input stream is processed based on the determination and the device function of the vehicle is enabled responsive to the processed speech input stream. A vehicle device in control of the enabled device function of the vehicle is directed based on the processed speech input stream. A computer usable medium with suitable computer program code is employed for enabling a device function of a vehicle.
Type: Grant
Filed: April 10, 2008
Date of Patent: June 10, 2014
Assignee: General Motors LLC
Inventors: Christopher L. Oesterling, William E. Mazzara, Jr., Jeffrey M. Stefan
-
Patent number: 8751233
Abstract: A speaker-verification digital signature system is disclosed that provides greater confidence in communications having digital signatures because a signing party may be prompted to speak a text-phrase that may be different for each digital signature, thus making it difficult for anyone other than the legitimate signing party to provide a valid signature.
Type: Grant
Filed: July 31, 2012
Date of Patent: June 10, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Pradeep K. Bansal, Lee Begeja, Carroll W. Creswell, Jeffrey Farah, Benjamin J. Stern, Jay Wilpon
-
Patent number: 8731928
Abstract: A phonetic vocabulary for a speech recognition system is adapted to a particular speaker's pronunciation. A speaker can be attributed specific pronunciation styles, which can be identified from specific pronunciation examples. Consequently, a phonetic vocabulary can be reduced in size, which can improve recognition accuracy and recognition speed.
Type: Grant
Filed: March 15, 2013
Date of Patent: May 20, 2014
Assignee: Nuance Communications, Inc.
Inventors: Nitendra Rajput, Ashish Verma
-
Patent number: 8719023
Abstract: An apparatus to improve robustness to environmental changes of a context dependent speech recognizer for an application, that includes a training database to store sounds for speech recognition training, a dictionary to store words supported by the speech recognizer, and a speech recognizer training module to train a set of one or more multiple state Hidden Markov Models (HMMs) with use of the training database and the dictionary. The speech recognizer training module performs a non-uniform state clustering process on each of the states of each HMM, which includes using a different non-uniform cluster threshold for at least some of the states of each HMM to more heavily cluster and correspondingly reduce a number of observation distributions for those of the states of each HMM that are less empirically affected by one or more contextual dependencies.
Type: Grant
Filed: May 21, 2010
Date of Patent: May 6, 2014
Assignee: Sony Computer Entertainment Inc.
Inventors: Xavier Menendez-Pidal, Ruxin Chen
-
Patent number: 8719016
Abstract: A method for converting speech to text in a speech analytics system is provided. The method includes receiving audio data containing speech made up of sounds from an audio source, processing the sounds with a phonetic module resulting in symbols corresponding to the sounds, and processing the symbols with a language module and occurrence table resulting in text. The method also includes determining a probability of correct translation for each word in the text, comparing the probability of correct translation for each word in the text to the occurrence table, and adjusting the occurrence table based on the probability of correct translation for each word in the text.
Type: Grant
Filed: April 7, 2010
Date of Patent: May 6, 2014
Assignee: Verint Americas Inc.
Inventors: Omer Ziv, Ran Achituv, Ido Shapira
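A toy version of the occurrence-table feedback loop. The exponential-average update rule below is an assumption for illustration; the abstract does not specify the actual adjustment formula:

```python
occurrence = {"cancel": 10.0, "council": 10.0}  # prior word weights

def adjust(occurrence, decoded, alpha=0.1):
    """Nudge each word's weight toward its per-word confidence so future
    decodes favor words the phonetic pass transcribes reliably."""
    for word, confidence in decoded:
        occurrence[word] = (1 - alpha) * occurrence[word] + alpha * confidence * 100
    return occurrence

# One decode pass reported high confidence for "cancel", low for "council".
adjust(occurrence, [("cancel", 0.9), ("council", 0.2)])
assert occurrence["cancel"] > occurrence["council"]
```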
-
Patent number: 8706503
Abstract: Methods, systems, and computer readable storage medium related to operating an intelligent digital assistant are disclosed. A text string is obtained from a speech input received from a user. Information is derived from a communication event that occurred at the electronic device prior to receipt of the speech input. The text string is interpreted to derive a plurality of candidate interpretations of user intent. One of the candidate user intents is selected based on the information relating to the communication event.
Type: Grant
Filed: December 21, 2012
Date of Patent: April 22, 2014
Assignee: Apple Inc.
Inventors: Adam John Cheyer, Didier Rene Guzzoni, Thomas Robert Gruber, Christopher Dean Brigham
-
Patent number: 8706491
Abstract: One feature of the present invention uses the parsing capabilities of a structured language model in the information extraction process. During training, the structured language model is first initialized with syntactically annotated training data. The model is then trained by generating parses on semantically annotated training data enforcing annotated constituent boundaries. The syntactic labels in the parse trees generated by the parser are then replaced with joint syntactic and semantic labels. The model is then trained by generating parses on the semantically annotated training data enforcing the semantic tags or labels found in the training data. The trained model can then be used to extract information from test data using the parses generated by the model.
Type: Grant
Filed: August 24, 2010
Date of Patent: April 22, 2014
Assignee: Microsoft Corporation
Inventors: Ciprian Chelba, Milind Mahajan
-
Patent number: 8676579
Abstract: A method of authenticating a user of a mobile device having a first microphone and a second microphone, the method comprising receiving voice input from the user at the first and second microphones, determining a position of the user relative to the mobile device based on the voice input received by the first and second microphones, and authenticating the user based on the position of the user.
Type: Grant
Filed: April 30, 2012
Date of Patent: March 18, 2014
Assignee: BlackBerry Limited
Inventor: James Allen Hymel
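The position estimate can be sketched with a standard time-difference-of-arrival calculation between the two microphones. The far-field geometry, tolerance, and function names are illustrative assumptions, not the claimed method:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def arrival_angle(delay_s, mic_spacing_m):
    """Angle of the speaker relative to the mic axis from the inter-mic delay.
    Far-field model: delay = spacing * cos(angle) / c, so angle = acos(delay * c / spacing)."""
    x = max(-1.0, min(1.0, delay_s * SPEED_OF_SOUND / mic_spacing_m))
    return math.degrees(math.acos(x))

def authenticate(delay_s, mic_spacing_m, expected_deg, tolerance_deg=15.0):
    """Accept only if the speaker is roughly where the enrolled user holds the phone."""
    return abs(arrival_angle(delay_s, mic_spacing_m) - expected_deg) <= tolerance_deg

spacing = 0.10  # 10 cm between microphones
# Speaker directly on the mic axis (angle 0 degrees): delay = spacing / c.
assert authenticate(spacing / SPEED_OF_SOUND, spacing, expected_deg=0.0)
# Zero delay means the speaker is broadside (90 degrees), which fails the check.
assert not authenticate(0.0, spacing, expected_deg=0.0)
```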
-
Patent number: 8666743
Abstract: The invention provides a speech recognition method for selecting a combination of list elements via a speech input, wherein a first list element of the combination is part of a first set of list elements and a second list element of the combination is part of a second set of list elements, the method comprising the steps of receiving the speech input, comparing each list element of the first set with the speech input to obtain a first candidate list of best matching list elements, processing the second set using the first candidate list to obtain a subset of the second set, comparing each list element of the subset of the second set with the speech input to obtain a second candidate list of best matching list elements, and selecting a combination of list elements using the first and the second candidate list.
Type: Grant
Filed: June 2, 2010
Date of Patent: March 4, 2014
Assignee: Nuance Communications, Inc.
Inventors: Markus Schwarz, Matthias Schulz, Marc Biedert, Christian Hillebrecht, Franz Gerl, Udo Haiber
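The two-pass candidate-list flow (think city then street in a navigation system) might look like the following sketch. String similarity stands in for acoustic scoring, and the data is invented for illustration:

```python
import difflib

def best_matches(candidates, heard, n=3):
    """Score each list element against the (simulated) speech input, keep top n."""
    scored = [(difflib.SequenceMatcher(None, c, heard).ratio(), c) for c in candidates]
    return [c for _, c in sorted(scored, reverse=True)[:n]]

cities = {"Berlin": ["Main St", "Oak Ave"], "Bergen": ["Fjord Rd"]}
heard_city, heard_street = "Berlin", "Main St"

# First pass: candidate cities. Second pass: only the streets of those cities,
# a subset of the full street set, are compared against the input.
city_candidates = best_matches(list(cities), heard_city, n=2)
street_subset = [s for c in city_candidates for s in cities[c]]
street = best_matches(street_subset, heard_street, n=1)[0]
assert street == "Main St"
```

Restricting the second comparison to the subset is what keeps the search tractable for large list combinations.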
-
Patent number: 8660845
Abstract: Systems and methods for audio editing are provided. In one implementation, a computer-implemented method is provided. The method includes receiving digital audio data including a plurality of distinct vocal components. Each distinct vocal component is automatically identified using one or more attributes that uniquely identify each distinct vocal component. The audio data is separated into two or more individual tracks where each individual track comprises audio data corresponding to one distinct vocal component. The separated individual tracks are then made available for further processing.
Type: Grant
Filed: October 16, 2007
Date of Patent: February 25, 2014
Assignee: Adobe Systems Incorporated
Inventors: Nariman Sodeifi, David E. Johnston
-
Patent number: 8661515
Abstract: An audible authentication of a wireless device for enrollment onto a secure wireless network includes an unauthorized wireless device that audibly emits a uniquely identifying secret code (e.g., a personal identification number (PIN)). In some implementations, the audible code is heard by the user and manually entered via a network-enrollment user interface. In other implementations, a network-authorizing device automatically picks up the audible code and verifies the code. If verified, the wireless device is enrolled onto the wireless network.
Type: Grant
Filed: May 10, 2010
Date of Patent: February 25, 2014
Assignee: Intel Corporation
Inventors: Marc Meylemans, Gary A. Martz, Jr.
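The enrollment handshake, with the audio channel stubbed out, might be sketched like this. The class and method names are invented for illustration; only the verify-then-enroll flow comes from the abstract:

```python
import hmac
import random

class Authorizer:
    """Network-side device that knows the expected PIN for the enrolling gadget."""
    def __init__(self, expected_pin):
        self.expected_pin = expected_pin
        self.enrolled = False

    def hear(self, audible_pin):
        # Constant-time comparison; enroll only on a verified match.
        if hmac.compare_digest(audible_pin, self.expected_pin):
            self.enrolled = True
        return self.enrolled

pin = f"{random.randrange(10**6):06d}"   # the device "speaks" a 6-digit PIN
authorizer = Authorizer(expected_pin=pin)
assert authorizer.hear(pin) is True      # verified -> device enrolled
assert Authorizer("123456").hear("654321") is False
```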
-
Patent number: 8660251
Abstract: A method, system and computer program product for alerting a participant when a topic of interest is being discussed and/or a speaker of interest is speaking during a conference call. A participant to a conference call identifies the topics and/or speakers of interest, which are stored for future use along with the participant's contact information. When a participant's identified topic of interest is being discussed and/or a participant's identified speaker of interest is speaking during a conference call, the participant will be alerted to that fact, such as via the means specified in the participant's contact information.
Type: Grant
Filed: July 12, 2012
Date of Patent: February 25, 2014
Assignee: International Business Machines Corporation
Inventors: Steven M. Miller, Lisa A. Seacat
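A minimal sketch of the alerting logic, assuming a transcript stream and naive word matching (the abstract does not specify how topics are detected; all names and data are illustrative):

```python
# Stored per-participant interests and contact info, as the abstract describes.
interests = {
    "alice@example.com": {"topics": {"budget"}, "speakers": {"Carol"}},
    "bob@example.com": {"topics": {"roadmap"}, "speakers": set()},
}

def alerts(speaker, utterance):
    """Return who to notify when a stored topic is mentioned or a stored
    speaker of interest starts talking."""
    words = set(utterance.lower().split())
    hits = []
    for contact, pref in interests.items():
        if pref["topics"] & words or speaker in pref["speakers"]:
            hits.append(contact)
    return hits

assert alerts("Carol", "moving on to other items") == ["alice@example.com"]
assert set(alerts("Dave", "the budget and roadmap review")) == set(interests)
```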
-
Patent number: 8639507
Abstract: The present invention enables high-speed recognition even when the grammar contains a large amount of garbage. A first voice recognition processing unit runs recognition on the voice features of the input speech using a first grammar, producing a recognition hypothesis graph that represents the hypothesis structure derived from that grammar together with a score for each connection between recognition units. A second voice recognition processing unit then runs recognition using a second grammar, which is specified to accept any section of the input other than keywords as a garbage section; it acquires the structure and scores of the garbage sections from the recognition hypothesis graph and outputs the recognition result from the total score of the hypothesis derived from the second grammar.
Type: Grant
Filed: December 22, 2008
Date of Patent: January 28, 2014
Assignee: NEC Corporation
Inventors: Fumihiro Adachi, Ryosuke Isotani, Ken Hanazawa
-
Patent number: 8634783
Abstract: A communication device includes memory, an input interface, a processing module, and a transmitter. The processing module receives a digital signal from the input interface, wherein the digital signal includes a desired digital signal component and an undesired digital signal component. The processing module identifies one of a plurality of codebooks based on the undesired digital signal component. The processing module then identifies a codebook entry from the one of the plurality of codebooks based on the desired digital signal component to produce a selected codebook entry. The processing module then generates a coded signal based on the selected codebook entry, wherein the coded signal includes a substantially unattenuated representation of the desired digital signal component and an attenuated representation of the undesired digital signal component. The transmitter converts the coded signal into an outbound signal in accordance with a signaling protocol and transmits it.
Type: Grant
Filed: January 31, 2013
Date of Patent: January 21, 2014
Assignee: Broadcom Corporation
Inventor: Nambirajan Seshadri
-
Patent number: 8620654
Abstract: A system in one embodiment includes a server associated with a unified messaging system (UMS). The server records speech of a user as an audio data file, translates the audio data file into a text data file, and maps each word within the text data file to a corresponding segment of audio data in the audio data file. A graphical user interface (GUI) of a message editor running on an endpoint associated with the user displays the text data file on the endpoint and allows the user to identify a portion of the text data file for replacement. The server is further operable to record new speech of the user as new audio data and to replace one or more segments of the audio data file corresponding to the portion of the text with the new audio data.
Type: Grant
Filed: July 20, 2007
Date of Patent: December 31, 2013
Assignee: Cisco Technology, Inc.
Inventors: Joseph F. Khouri, Laurent Philonenko, Mukul Jain, Shmuel Shaffer
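The word-to-audio mapping that makes text-driven splicing possible can be sketched as follows. The sample ranges and the list-based splice are simplified stand-ins for real audio handling:

```python
# Each transcript word maps to its (start, end) sample range in the audio buffer.
audio = list(range(100))          # stand-in for 100 audio samples
words = [("call", 0, 30), ("me", 30, 50), ("tomorrow", 50, 100)]

def replace_word(audio, words, target, new_samples):
    """Splice new audio over the segment mapped to the word selected in the GUI."""
    for word, start, end in words:
        if word == target:
            return audio[:start] + new_samples + audio[end:]
    return audio

edited = replace_word(audio, words, "me", [-1] * 5)
assert len(edited) == 100 - 20 + 5    # 20 old samples out, 5 new samples in
assert edited[30:35] == [-1] * 5
```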
-
Patent number: 8620655
Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic
Type: Grant
Filed: August 10, 2011
Date of Patent: December 31, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
-
Patent number: 8606560
Abstract: An interpretation system that includes an optical or audio acquisition device for acquiring a sentence written or spoken in a source language and a speech recognition device for generating, from an input signal acquired by the acquisition device, a source sentence that is a transcription of the sentence in the source language. The interpretation system further includes a translation device for generating, from the source sentence, a target sentence that is a translation of the source sentence in a target language, and a speech synthesis device for generating, from the target sentence, an output audio signal reproduced by the audio restoration device. The interpretation system includes a smoothing device for calling the recognition, translation and speech synthesis devices in order to produce in real time an interpretation in the target language of the sentence in the source language.
Type: Grant
Filed: November 18, 2008
Date of Patent: December 10, 2013
Inventor: Jean Grenier
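The pipeline the smoothing device coordinates can be sketched with stub stages. The dictionary lookups and return formats are placeholders standing in for real recognition, translation, and synthesis engines:

```python
# Stub stages; a real system would call ASR, MT, and TTS engines here.
def recognize(audio):      return audio["transcript"]           # source sentence
def translate(sentence):   return {"bonjour": "hello"}[sentence]
def synthesize(sentence):  return f"<audio:{sentence}>"

def interpret(audio):
    """Chain recognition -> translation -> synthesis; the real smoothing
    device would stream partial results so output begins before input ends."""
    return synthesize(translate(recognize(audio)))

assert interpret({"transcript": "bonjour"}) == "<audio:hello>"
```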
-
Patent number: 8606568
Abstract: Methods, computer program products, and systems are described for receiving, by a speech recognition engine, audio data that encodes an utterance and determining, by the speech recognition engine, that a transcription of the utterance includes one or more keywords associated with a command, and a pronoun. In addition, the methods, computer program products, and systems described herein pertain to transmitting a disambiguation request to an application, wherein the disambiguation request identifies the pronoun, receiving, by the speech recognition engine, a response to the disambiguation request, wherein the response references an item of content identified by the application, and generating, by the speech recognition engine, the command using the keywords and the response.
Type: Grant
Filed: October 23, 2012
Date of Patent: December 10, 2013
Assignee: Google Inc.
Inventors: Simon Tickner, Richard Z. Cohen
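A toy version of the keyword-plus-pronoun flow, with the application's disambiguation response modeled as a simple value. All names, the command table, and the pronoun list are illustrative assumptions:

```python
COMMANDS = {"share": "share_item"}
PRONOUNS = {"it", "this", "that"}

def handle(transcript, app_current_item):
    """If the transcript has a command keyword plus a pronoun, ask the app
    what the pronoun refers to, then build the concrete command."""
    tokens = transcript.lower().split()
    keyword = next((t for t in tokens if t in COMMANDS), None)
    pronoun = next((t for t in tokens if t in PRONOUNS), None)
    if keyword and pronoun:
        referent = app_current_item  # the app's answer to the disambiguation request
        return f"{COMMANDS[keyword]}({referent!r})"
    return None

assert handle("share this", app_current_item="photo_42") == "share_item('photo_42')"
assert handle("share my photo", app_current_item="photo_42") is None  # no pronoun
```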