Preliminary Matching Patents (Class 704/247)
-
Patent number: 8600751
Abstract: Digital method for authentication of a person by comparing a current voice profile with a previously stored initial voice profile, wherein to determine the relevant voice profile the person speaks at least one speech sample into the system, this speech sample is conveyed to a voice-profile calculation unit and thereby, on the basis of a prespecified voice-profile algorithm, the voice profile is calculated, such that the overall size of the speech sample and/or parameters of its evaluation to determine the relevant voice profile are established dynamically and automatically as the sample is spoken, in response to the result of an evaluation of a first partial speech sample.
Type: Grant
Filed: February 19, 2008
Date of Patent: December 3, 2013
Assignee: Voice.Trust AG
Inventors: Raja Kuppuswamy, Christian Pilz
-
Patent number: 8595007
Abstract: Positive identification of local inhabitants plays an important role in modern military, police and security operations. Since terrorists use all means to masquerade as local inhabitants, the identification of terrorist or hostile suspects becomes an increasingly complicated task. The instant software solution will assist military, police and security forces in the identification of suspects using Voice Print Recognition (VPR) technology. Our VPR software will compare and recognize, or match, specific voice samples with stored, digital voice models (voice prints) for the purpose of establishing or verifying identity. VPR software will support an operator's decision and situational awareness through the verification of a person's identity (for instance: remote access control), but more importantly will assist in the identification of suspect individuals (identifying suspects among a large group of captured individuals).
Type: Grant
Filed: June 5, 2007
Date of Patent: November 26, 2013
Assignee: NITV Federal Services, LLC
Inventor: James A. Kane
-
Patent number: 8571869
Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
Type: Grant
Filed: May 15, 2008
Date of Patent: October 29, 2013
Assignee: Nuance Communications, Inc.
Inventors: Sabine Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
-
Patent number: 8571865
Abstract: Systems, methods performed by data processing apparatus and computer storage media encoded with computer programs for receiving information relating to (i) a communication device that has received an utterance and (ii) a voice associated with the received utterance, comparing the received voice information with voice signatures in a comparison group, the comparison group including one or more individuals identified from one or more connections arising from the received information relating to the communication device, attempting to identify the voice associated with the utterance as matching one of the individuals in the comparison group, and based on a result of the attempt to identify, selectively providing the communication device with access to one or more resources associated with the matched individual.
Type: Grant
Filed: August 10, 2012
Date of Patent: October 29, 2013
Assignee: Google Inc.
Inventor: Philip Hewinson
-
Patent number: 8571858
Abstract: For classifying different segments of a signal which has segments of at least a first type and second type, e.g. audio and speech segments, the signal is short-term classified on the basis of the at least one short-term feature extracted from the signal and a short-term classification result is delivered. The signal is also long-term classified on the basis of the at least one short-term feature and at least one long-term feature extracted from the signal and a long-term classification result is delivered. The short-term classification result and the long-term classification result are combined to provide an output signal indicating whether a segment of the signal is of the first type or of the second type.
Type: Grant
Filed: January 11, 2011
Date of Patent: October 29, 2013
Assignee: Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V.
Inventors: Guillaume Fuchs, Stefan Bayer, Jens Hirschfeld, Juergen Herre, Jeremie Lecomte, Frederik Nagel, Nikolaus Rettelbach, Stefan Wabnik, Yoshikazu Yokotani
-
Patent number: 8566097
Abstract: A lexical acquisition apparatus includes: a phoneme recognition section 2 for preparing a phoneme sequence candidate from an inputted speech; a word matching section 3 for preparing a plurality of word sequences based on the phoneme sequence candidate; a discrimination section 4 for selecting, from among a plurality of word sequences, a word sequence having a high likelihood in a recognition result; an acquisition section 5 for acquiring a new word based on the word sequence selected by the discrimination section 4; a teaching word list 4A used to teach a name; and a probability model 4B of the teaching word and an unknown word, wherein the discrimination section 4 calculates, for each word sequence, a first evaluation value showing how much words in the word sequence correspond to teaching words in the list 4A and a second evaluation value showing a probability at which the words in the word sequence are adjacent to one another and selects a word sequence for which a sum of the first evaluation value and the
Type: Grant
Filed: June 1, 2010
Date of Patent: October 22, 2013
Assignees: Honda Motor Co., Ltd., Advanced Telecommunications Research Institute International
Inventors: Mikio Nakano, Takashi Nose, Ryo Taguchi, Kotaro Funakoshi, Naoto Iwahashi
-
Patent number: 8560327
Abstract: A method for synchronizing sound data and text data, said text data being obtained by manual transcription of said sound data during playback of the latter. The proposed method comprises the steps of repeatedly querying said sound data and said text data to obtain a current time position corresponding to a currently played sound datum and a currently transcribed text datum, respectively, correcting said current time position by applying a time correction value in accordance with a transcription delay, and generating at least one association datum indicative of a synchronization association between said corrected time position and said currently transcribed text datum. Thus, the proposed method achieves cost-effective synchronization of sound and text in connection with the manual transcription of sound data.
Type: Grant
Filed: August 18, 2006
Date of Patent: October 15, 2013
Assignee: Nuance Communications, Inc.
Inventors: Andreas Neubacher, Miklos Papai
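
The delay-correction step in the abstract above can be illustrated with a short sketch. It is not the patented implementation: it assumes the transcription delay is a single constant subtracted from each polled playback position, and the names `Association` and `associate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Association:
    """Hypothetical association datum: corrected time plus transcribed text."""
    time_s: float
    text: str


def associate(samples, delay_s):
    """Pair each polled text datum with a delay-corrected playback time.

    samples: iterable of (playback_time_s, text_datum) pairs, as obtained by
             repeatedly querying the player and the text editor.
    delay_s: assumed constant transcription delay in seconds.
    """
    result = []
    for playback_time_s, text in samples:
        # The typist lags the audio, so the text belongs to an earlier position.
        corrected = max(0.0, playback_time_s - delay_s)
        result.append(Association(corrected, text))
    return result
```

For example, with a one-second delay, a word typed while the player is at 3.0 s is associated with 2.0 s; positions earlier than the delay are clamped to 0.0.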
-
Patent number: 8554563
Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speakers and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined and extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.
Type: Grant
Filed: September 11, 2012
Date of Patent: October 8, 2013
Assignee: Nuance Communications, Inc.
Inventor: Hagai Aronowitz
-
Patent number: 8554566
Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
Type: Grant
Filed: November 29, 2012
Date of Patent: October 8, 2013
Assignee: Morphism LLC
Inventor: James H. Stephens, Jr.
-
Patent number: 8554541
Abstract: A virtual pet system includes: a virtual pet client, adapted to receive a sentence in natural language and send the sentence to a Q&A server; the Q&A server, adapted to receive the sentence, process the sentence through natural language comprehension, generate an answer in natural language based on a result of natural language comprehension and reasoning knowledge, and send the answer in natural language to the virtual pet client. A method for virtual pet chatting includes: receiving a sentence in natural language, performing natural language comprehension for the sentence, and generating an answer in natural language based on a result of natural language comprehension and reasoning knowledge.
Type: Grant
Filed: September 18, 2008
Date of Patent: October 8, 2013
Assignee: Tencent Technology (Shenzhen) Company Ltd.
Inventors: Haisong Yang, Zhiyuan Liu, Yunfeng Liu, Rongling Yu
-
Patent number: 8554562
Abstract: A method and system for speaker diarization are provided. Pre-trained acoustic models of individual speakers and/or groups of speakers are obtained. Speech data with multiple speakers is received and divided into frames. For a frame, an acoustic feature vector is determined and extended to include log-likelihood ratios of the pre-trained models in relation to a background population model. The extended acoustic feature vector is used in segmentation and clustering algorithms.
Type: Grant
Filed: November 15, 2009
Date of Patent: October 8, 2013
Assignee: Nuance Communications, Inc.
Inventor: Hagai Aronowitz
-
Patent number: 8532800
Abstract: Simple, computationally efficient, and robust audio features are applied in a uniform program indexing method for picking up video segments relating to highlight plays in a recorded program worthy of being reviewed. By focusing on certain frequencies in an audio sequence of the program, a computational complexity of the uniform program indexing method is significantly decreased. With the aid of MFCC coefficients and a DFBE coefficient generated from the MFCC coefficients, audio patterns may be utilized for differentiating exciting events in the program from other unnecessary information. Scores corresponding to various audio segments are regarded as standards for picking up video segments in the program worthy of being chosen in a recorded highlight collection. Some low-level-feature parameters, some video segments having highlight-related visual characteristics, and a re-ranking procedure are utilized for enhancing precision of the scores for providing video segments worthy of being reviewed.
Type: Grant
Filed: May 24, 2007
Date of Patent: September 10, 2013
Assignee: MAVs Lab. Inc.
Inventors: Bei Wang, Chia-Hung Yeh, Hsuan-Huei Shih, Chung-Chieh Kuo
-
Patent number: 8527271
Abstract: A method for the voice recognition of a spoken expression to be recognized, comprising a plurality of expression parts that are to be recognized. Partial voice recognition takes place on a first selected expression part, and depending on a selection of hits for the first expression part detected by the partial voice recognition, voice recognition on the first and further expression parts is executed.
Type: Grant
Filed: June 18, 2008
Date of Patent: September 3, 2013
Assignee: Nuance Communications, Inc.
Inventors: Michael Wandinger, Jesus Fernando Guitarte Perez, Bernhard Littel
-
Publication number: 20130226582
Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.
Type: Application
Filed: April 17, 2013
Publication date: August 29, 2013
Applicant: Nuance Communications, Inc.
Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit
-
Patent number: 8521537
Abstract: A computer-driven device assists a user in self-regulating speech control of the device. The device processes an input signal representing human speech to compute acoustic signal quality indicators indicating conditions likely to be problematic to speech recognition, and advises the user of those conditions.
Type: Grant
Filed: April 3, 2007
Date of Patent: August 27, 2013
Assignee: Promptu Systems Corporation
Inventors: Naren Chittar, Vikas Gulati, Matthew Pratt, Harry Printz
-
Patent number: 8521526
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.
Type: Grant
Filed: July 28, 2010
Date of Patent: August 27, 2013
Assignee: Google Inc.
Inventors: Matthew I. Lloyd, Johan Schalkwyk, Pankaj Risbood
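
As a rough illustration of the n-gram frequency step described in this abstract, the sketch below counts word n-grams in a user's past queries and assigns each candidate n-gram a weight. The abstract does not say how the weighting value is derived from the frequency; using relative frequency is an assumption here, and the function names are hypothetical.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word-level n-grams of a string as tuples."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_weights(candidates, history, n=2):
    """Weight each n-gram of each candidate by its frequency in past queries.

    Assumption: weight = count in history / total n-grams in history.
    """
    past = Counter(g for query in history for g in word_ngrams(query, n))
    total = sum(past.values())
    weights = {}
    for candidate in candidates:
        for gram in word_ngrams(candidate, n):
            weights[gram] = past[gram] / total if total else 0.0
    return weights
```

A downstream step could combine these weights with the recognizer's confidence values to rank the candidate transcriptions; that combination is left out because the abstract does not specify it.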
-
Patent number: 8515755
Abstract: A system enables a transcriptionist to replace a first written form (such as an abbreviation) of a concept with a second written form (such as an expanded form) of the same concept. For example, the system may display to the transcriptionist a draft document produced from speech by an automatic speech recognizer. If the transcriptionist recognizes a first written form of a concept that should be replaced with a second written form of the same concept, the transcriptionist may provide the system with a replacement command. In response, the system may identify the second written form of the concept and replace the first written form with the second written form in the draft document.
Type: Grant
Filed: November 1, 2010
Date of Patent: August 20, 2013
Assignee: MModal IP LLC
Inventor: Kjell Schubert
-
Patent number: 8504369
Abstract: A device, for use by a transcriptionist in a transcription editing system for editing transcriptions dictated by speakers, includes, in combination, a monitor configured to display visual text of transcribed dictations, an audio mechanism configured to cause playback of portions of an audio file associated with a dictation, and a cursor-control module coupled to the audio mechanism and to the monitor and configured to cause the monitor to display multiple cursors in the text.
Type: Grant
Filed: June 2, 2004
Date of Patent: August 6, 2013
Assignee: Nuance Communications, Inc.
Inventors: Benjamin Chigier, Edward A. Brody, Daniel Edward Chernin, Roger S. Zimmerman
-
Patent number: 8504365
Abstract: Disclosed herein are systems, methods, and tangible computer-readable media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrates little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.
Type: Grant
Filed: April 11, 2008
Date of Patent: August 6, 2013
Assignee: AT&T Intellectual Property I, L.P.
Inventor: Horst Schroeter
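
The variance check in this abstract can be sketched as follows. This is an illustration only: it assumes each speech sample has already been reduced to a fixed-length feature vector, uses mean pairwise Euclidean distance as the variance measure, and takes the threshold as a caller-supplied value. None of these choices come from the patent text, and the function names are hypothetical.

```python
import math
from itertools import combinations


def mean_pairwise_distance(samples):
    """Average Euclidean distance over all pairs of feature vectors."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two samples to compare")
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def verify(samples, threshold):
    """Deny (False) if the samples vary too little, which suggests replayed
    or synthetic speech; accept (True) if they vary by at least the threshold.
    """
    return mean_pairwise_distance(samples) >= threshold
```

Identical vectors give a mean distance of zero and are rejected for any positive threshold, while natural repetitions of the same phrase would be expected to differ slightly and pass.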
-
Patent number: 8489397
Abstract: A machine-readable medium and a network device are provided for speech-to-text translation. Speech packets are received at a broadband telephony interface and stored in a buffer. The speech packets are processed and textual representations thereof are displayed as words on a display device. Speech processing is activated and deactivated in response to a command from a subscriber.
Type: Grant
Filed: September 11, 2012
Date of Patent: July 16, 2013
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Charles David Caldwell, John Bruce Harlow, Robert J. Sayko, Norman Shaye
-
Patent number: 8484035
Abstract: A method of altering a social signaling characteristic of a speech signal. A statistically large number of speech samples created by different speakers in different tones of voice are evaluated to determine one or more relationships that exist between a selected social signaling characteristic and one or more measurable parameters of the speech samples. An input audio voice signal is then processed in accordance with these relationships to modify one or more controllable parameters of the input audio voice signal to produce a modified output audio voice signal in which said selected social signaling characteristic is modified. In a specific illustrative embodiment, a two-level hidden Markov model is used to identify voiced and unvoiced speech segments and selected controllable characteristics of these speech segments are modified to alter the desired social signaling characteristic.
Type: Grant
Filed: September 6, 2007
Date of Patent: July 9, 2013
Assignee: Massachusetts Institute of Technology
Inventor: Alex Paul Pentland
-
Publication number: 20130166301
Abstract: A computer-implemented method, system and/or program product update voice prints over time. A receiving computer receives an initial voice print. A determining period of time is calculated for that initial voice print. This determining period of time is a length of time during which an expected degree of change in subsequent voice prints, in comparison to the initial voice print and according to a speaker's subsequent age, is predicted to occur. A new voice print is received after the determining period of time has passed, and the new voice print is compared with the initial voice print. In response to a change to the new voice print falling within the expected degree of change in comparison to the initial voice print, a voice print store is updated with the new voice print.
Type: Application
Filed: February 19, 2013
Publication date: June 27, 2013
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: INTERNATIONAL BUSINESS MACHINES CORPORATION
-
Patent number: 8468012
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
Type: Grant
Filed: May 26, 2010
Date of Patent: June 18, 2013
Assignee: Google Inc.
Inventors: Matthew I. Lloyd, Trausti Kristjansson
-
Patent number: 8463608
Abstract: A method and apparatus for updating a speech model on a multi-user speech recognition system with a personal speech model for a single user. A speech recognition system, for instance in a car, can include a generic speech model for comparison with the user speech input. A way of identifying a personal speech model, for instance in a mobile phone, is connected to the system. A mechanism is included for receiving personal speech model components, for instance a BLUETOOTH connection. The generic speech model is updated using the received personal speech model components. Speech recognition can then be performed on user speech using the updated generic speech model.
Type: Grant
Filed: March 12, 2012
Date of Patent: June 11, 2013
Assignee: Nuance Communications, Inc.
Inventors: Barry Neil Dow, Eric William Janke, Daniel Lee Yuk Cheung, Benjamin Terrick Staniford
-
Patent number: 8452596
Abstract: To enable selection of a speaker, the acoustic feature value of which is similar to that of an utterance speaker, with accuracy and stability, while adapting to changes even when the acoustic feature value of the speaker changes every moment, a long-time speaker score is calculated (log likelihood of each of a plurality of speaker models stored in a speaker model storage with respect to the acoustic feature value) based on an arbitrary number of utterances, for example, and a short-time speaker score is calculated based on a short-time utterance, for example. Speakers are selected corresponding to a predetermined number of speaker models having a high long-time speaker score. Speakers are selected corresponding to the speaker models, the number of which is smaller than the predetermined number and the short-time speaker score of which is high, from among the speakers having a high long-time speaker score.
Type: Grant
Filed: February 29, 2008
Date of Patent: May 28, 2013
Assignee: NEC Corporation
Inventors: Masahiro Tani, Tadashi Emori, Yoshifumi Onishi
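
The two-stage selection described in this abstract can be sketched in a few lines. The sketch assumes the long-time and short-time scores (log likelihoods) have already been computed and are supplied as dictionaries keyed by speaker ID; the function name and parameters are hypothetical.

```python
def select_speakers(long_scores, short_scores, n_long, n_short):
    """Two-stage speaker selection.

    Stage 1: keep the n_long speakers with the highest long-time score.
    Stage 2: among those, keep the n_short (< n_long) speakers with the
             highest short-time score.
    """
    if n_short >= n_long:
        raise ValueError("n_short must be smaller than n_long")
    shortlist = sorted(long_scores, key=long_scores.get, reverse=True)[:n_long]
    return sorted(shortlist, key=lambda s: short_scores[s], reverse=True)[:n_short]
```

The first stage gives stability (it reflects many utterances), while the second lets the final choice follow momentary changes in the speaker's voice. A speaker with a high short-time score but a poor long-time score is never selected.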
-
Patent number: 8442825
Abstract: A device for voice identification including a receiver, a segmenter, a resolver, two advancers, a buffer, and a plurality of IIR resonator digital filters where each IIR filter comprises a set of memory locations or functional equivalent to hold filter specifications, a memory location or functional equivalent to hold the arithmetic reciprocal of the filter's gain, a five cell controller array, several multipliers, an adder, a subtractor, and a logical non-shift register. Each cell of the five cell controller array has five logical states, each acting as a five-position single-pole rotating switch that operates in unison with the four others. Additionally, the device also includes an artificial neural network and a display means.
Type: Grant
Filed: August 16, 2011
Date of Patent: May 14, 2013
Assignee: The United States of America as Represented by the Director, National Security Agency
Inventor: Michael Sinutko
-
Patent number: 8442824
Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.
Type: Grant
Filed: November 25, 2009
Date of Patent: May 14, 2013
Assignee: Nuance Communications, Inc.
Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit
-
Patent number: 8433568
Abstract: A method for measuring speech intelligibility includes inputting a speech waveform to a system. At least one acoustic feature is extracted from the waveform. From the acoustic feature, at least one phoneme is segmented. At least one acoustic correlate measure is extracted from the at least one phoneme and at least one intelligibility measure is determined. The at least one acoustic correlate measure is mapped to the at least one intelligibility measure.
Type: Grant
Filed: March 29, 2010
Date of Patent: April 30, 2013
Assignee: Cochlear Limited
Inventors: Lee Krause, Mark Skowranski, Bonny Banerjee
-
Patent number: 8433567
Abstract: A method, system, and computer program product for compensation of intra-speaker variability in speaker diarization are provided. The method includes: dividing a speech session into segments of duration less than an average duration between speaker change; parameterizing each segment by a time dependent probability density function supervector, for example, using a Gaussian Mixture Model; computing a difference between successive segment supervectors; and computing a scatter measure such as a covariance matrix of the difference as an estimate of intra-speaker variability. The method further includes compensating the speech session for intra-speaker variability using the estimate of intra-speaker variability.
Type: Grant
Filed: April 8, 2010
Date of Patent: April 30, 2013
Assignee: International Business Machines Corporation
Inventor: Hagai Aronowitz
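
The estimation step in this abstract (differences of successive segment supervectors, then a covariance matrix of those differences) can be written out as a small sketch. It assumes the supervectors are already available as equal-length lists of floats; how they are obtained from a Gaussian Mixture Model and how the compensation is applied are outside its scope. The function names are hypothetical, and the unbiased (n - 1) normalization is an assumption.

```python
def successive_differences(supervectors):
    """Difference between each segment supervector and the one before it."""
    return [
        [b_i - a_i for a_i, b_i in zip(a, b)]
        for a, b in zip(supervectors, supervectors[1:])
    ]


def covariance(rows):
    """Sample covariance matrix (n - 1 normalization) of a list of vectors."""
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two difference vectors")
    dim = len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def intra_speaker_variability(supervectors):
    """Scatter of successive differences as an intra-speaker variability estimate."""
    return covariance(successive_differences(supervectors))
```

Because the segments are shorter than the typical time between speaker changes, most successive pairs come from the same speaker, so the scatter of their differences mainly reflects within-speaker variation.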
-
Patent number: 8428950
Abstract: A speech recognition apparatus (110) selects an optimum recognition result from recognition results output from a set of speech recognizers (s1-sM) based on a majority decision. This decision is implemented by taking into account weight values, as to the set of the speech recognizers, learned by a learning apparatus (100). The learning apparatus includes a unit (103) selecting speech recognizers corresponding to characteristics of speech for learning (101), a unit (104) finding recognition results of the speech for learning by using the selected speech recognizers, a unit (105) unifying the recognition results and generating a word string network, and a unit (106) finding weight values concerning a set of the speech recognizers by implementing learning processing.
Type: Grant
Filed: January 18, 2008
Date of Patent: April 23, 2013
Assignee: NEC Corporation
Inventors: Yoshifumi Onishi, Tadashi Emori
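
A weighted majority decision of the kind this abstract mentions can be sketched as below. This is a simplification: it votes over whole recognition results rather than over a word string network, and it takes the per-recognizer weights as given instead of learning them. The function name is hypothetical.

```python
from collections import defaultdict


def weighted_vote(results, weights):
    """Pick the recognition result with the largest total recognizer weight.

    results: one hypothesis string per recognizer.
    weights: one non-negative weight per recognizer, in the same order.
    """
    if len(results) != len(weights) or not results:
        raise ValueError("results and weights must be non-empty and equal length")
    totals = defaultdict(float)
    for hypothesis, weight in zip(results, weights):
        totals[hypothesis] += weight
    return max(totals, key=totals.get)
```

With equal weights this reduces to an ordinary majority vote; unequal weights let one reliable recognizer outvote several weaker ones that agree with each other.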
-
Patent number: 8417527
Abstract: A phonetic vocabulary for a speech recognition system is adapted to a particular speaker's pronunciation. A speaker can be attributed specific pronunciation styles, which can be identified from specific pronunciation examples. Consequently, a phonetic vocabulary can be reduced in size, which can improve recognition accuracy and recognition speed.
Type: Grant
Filed: October 13, 2011
Date of Patent: April 9, 2013
Assignee: Nuance Communications, Inc.
Inventors: Nitendra Rajput, Ashish Verma
-
Patent number: 8412524
Abstract: A system enables a transcriptionist to replace a first written form (such as an abbreviation) of a concept with a second written form (such as an expanded form) of the same concept. For example, the system may display to the transcriptionist a draft document produced from speech by an automatic speech recognizer. If the transcriptionist recognizes a first written form of a concept that should be replaced with a second written form of the same concept, the transcriptionist may provide the system with a replacement command. In response, the system may identify the second written form of the concept and replace the first written form with the second written form in the draft document.
Type: Grant
Filed: March 15, 2012
Date of Patent: April 2, 2013
Assignee: MModal IP LLC
Inventor: Kjell Schubert
-
Patent number: 8407057
Abstract: A machine, system and method for user-guided teaching and modifications of voice commands and actions to be executed by a conversational learning system. The machine includes a system bus for communicating data and control signals received from the conversational learning system to a computer system, a vehicle data and control bus for connecting devices and sensors in the machine, a bridge module for connecting the vehicle data and control bus to the system bus, machine subsystems coupled to the vehicle data and control bus having a respective user interface for receiving a voice command or input signal from a user, a memory coupled to the system bus for storing action command sequences learned for a new voice command and a processing unit coupled to the system bus for automatically executing the action command sequences learned when the new voice command is spoken.
Type: Grant
Filed: January 21, 2009
Date of Patent: March 26, 2013
Assignee: Nuance Communications, Inc.
Inventors: Liam D. Comerford, Mahesh Viswanathan
-
Patent number: 8396713
Abstract: A method (and system) of handling out-of-grammar utterances includes building a statistical language model for a dialog state using, generating sentences and semantic interpretations for the sentences using finite state grammar, building a statistical action classifier, receiving user input, carrying out recognition with the finite state grammar, carrying out recognition with the statistical language model, using the statistical action classifier to find semantic interpretations, comparing an output from the finite state grammar and an output from the statistical language model, deciding which output of the output from the finite state grammar and the output from the statistical language model to keep as a final recognition output, selecting the final recognition output, and outputting the final recognition result, wherein the statistical action classifier, the finite state grammar and the statistical language model are used in conjunction to carry out speech recognition and interpretation.
Type: Grant
Filed: April 30, 2007
Date of Patent: March 12, 2013
Assignee: Nuance Communications, Inc.
Inventors: Vaibhava Goel, Ramesh Gopinath, Ea-Ee Jan, Karthik Visweswariah
-
Patent number: 8390574
Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate text input. In response to an ambiguous editing input at a location preceding at least a portion of an output word, the software performs one disambiguation operation with respect to the editing input and another disambiguation operation with respect to the editing input in combination with the at least portion of the output word. The results are output in order of decreasing frequency value, with the results of the one disambiguation operation having the portion of the output word appended thereto.
Type: Grant
Filed: August 10, 2011
Date of Patent: March 5, 2013
Assignee: Research In Motion Limited
Inventors: Michael Elizarov, Vadim Fux, Dan Rubanovich
-
Patent number: 8391807
Abstract: A communication device includes memory, an input interface, a processing module, and a transmitter. The processing module receives a digital signal from the input interface, wherein the digital signal includes a desired digital signal component and an undesired digital signal component. The processing module identifies one of a plurality of codebooks based on the undesired digital signal component. The processing module then identifies a codebook entry from the one of the plurality of codebooks based on the desired digital signal component to produce a selected codebook entry. The processing module then generates a coded signal based on the selected codebook entry, wherein the coded signal includes a substantially unattenuated representation of the desired digital signal component and an attenuated representation of the undesired digital signal component. The transmitter converts the coded signal into an outbound signal in accordance with a signaling protocol and transmits it.
Type: Grant
Filed: August 20, 2012
Date of Patent: March 5, 2013
Assignee: Broadcom Corporation
Inventor: Nambirajan Seshadri
-
Patent number: 8386251
Abstract: A speech recognition system is provided with iteratively refined multiple passes through the received data to enhance the accuracy of the results by introducing constraints and adaptation from initial passes into subsequent recognition operations. The multiple passes are performed on an initial utterance received from a user. The iteratively enhanced subsequent passes are also performed on following utterances received from the user increasing an overall system efficiency and accuracy.
Type: Grant
Filed: June 8, 2009
Date of Patent: February 26, 2013
Assignee: Microsoft Corporation
Inventors: Nikko Strom, Julian Odell, Jon Hamaker
-
Patent number: 8374868
Abstract: A method for recognizing speech involves reciting, into a speech recognition system, an utterance including a numeric sequence that contains a digit string including a plurality of tokens and detecting a co-articulation problem related to at least two potentially co-articulated tokens in the digit string. The numeric sequence may be identified using i) a dynamically generated possible numeric sequence that potentially corresponds with the numeric sequence, and/or ii) at least one supplemental acoustic model. Also disclosed herein is a system for accomplishing the same.
Type: Grant
Filed: August 21, 2009
Date of Patent: February 12, 2013
Assignee: General Motors LLC
Inventors: Uma Arun, Sherri J Voran-Nowak, Rathinavelu Chengalvarayan, Gaurav Talwar
-
Patent number: 8374873
Abstract: Techniques for training and applying prosody models for speech synthesis are provided. A speech recognition engine processes audible speech to produce text annotated with prosody information. A prosody model is trained with this annotated text. After initial training, the model is applied during speech synthesis to generate speech with non-standard prosody from input text. Multiple prosody models can be used to represent different prosody styles.
Type: Grant
Filed: August 11, 2009
Date of Patent: February 12, 2013
Assignee: Morphism, LLC
Inventor: James H. Stephens, Jr.
-
Patent number: 8351706
Abstract: Document data corresponding to each page included in a document is stored, and furthermore, feature data indicative of a feature of the document data and a document index indicating the document are associated with the document data. A document extracting apparatus obtains input document data, calculates feature data from the input document data, judges similarity between the input document data and the document data based on the feature data, obtains a document index associated with document data similar to the input document data, and extracts a plurality of pieces of document data associated with the document index. Thus, document data concerning the document including a page corresponding to the document data similar to the input document data is extracted for a plurality of pages.
Type: Grant
Filed: July 23, 2008
Date of Patent: January 8, 2013
Assignee: Sharp Kabushiki Kaisha
Inventor: Hitoshi Hirohata
-
Patent number: 8346552
Abstract: A game apparatus includes a CPU, and the CPU evaluates a pronunciation of a user with respect to an original sentence (ES). First, envelopes as to a volume of a voice of the original sentence (ES) and a volume of a voice of the user are taken, and the average values of the volumes are made uniform. When the volumes are made uniform to each other, a degree of similarity (scoreA) of distributions of local solutions when the volumes are equal to or more than the average values, a degree of similarity (scoreB) of distributions (timing of concaves/convexes of the waveform) of values of the high or low level indicating whether or not the volume is equal to or more than a value multiplying the average value by a predetermined value, and a degree of similarity (scoreC) of dispersion values (dispersion of concaves/convexes of the waveform) of the envelopes are evaluated by utilizing the respective envelopes.
Type: Grant
Filed: January 21, 2010
Date of Patent: January 1, 2013
Assignee: Nintendo Co., Ltd.
Inventor: Tomokazu Abe
-
Patent number: 8326626
Abstract: Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script.
Type: Grant
Filed: December 22, 2011
Date of Patent: December 4, 2012
Assignee: West Corporation
Inventors: Mark J. Pettay, Fonda J. Narke
-
Patent number: 8316108
Abstract: A method and apparatus for obtaining a real time media stream provided as a plurality of media fragments from a plurality of remote nodes in a communications network is described. Media fragments are requested from the plurality of remote nodes. A series of media fragments is received from at least one of the plurality of remote nodes. A selection criterion is determined for identifying the series of data fragments, and a blocking request is sent to at least one other of the plurality of remote nodes, the blocking request instructing the at least one other node to block the media fragments satisfying the selection criterion from being sent.
Type: Grant
Filed: February 22, 2008
Date of Patent: November 20, 2012
Assignee: Telefonaktiebolaget LM Ericsson (publ)
Inventors: Andreas Ljunggren, Robert Skog
-
Patent number: 8316148
Abstract: A method and apparatus for obtaining a real time media stream provided as a plurality of media fragments from a plurality of remote nodes in a communications network. A first series of media fragments satisfying a first selection criterion is requested from a first remote node and a further series of media fragments satisfying a further different selection criterion is requested from at least one further remote node. When combined, the first series of fragments and the further series of fragments provide the complete media stream.
Type: Grant
Filed: February 22, 2008
Date of Patent: November 20, 2012
Assignee: Telefonaktiebolaget LM Ericsson (publ)
Inventors: Andreas Ljunggren, Robert Skog
-
Patent number: 8315864
Abstract: Provided herein are systems and methods for using context-sensitive speech recognition logic in a computer to create a software program, including context-aware voice entry of instructions that make up a software program, automatic context-sensitive instruction formatting, and automatic context-sensitive insertion-point positioning.
Type: Grant
Filed: April 24, 2012
Date of Patent: November 20, 2012
Inventor: Lunis Orcutt
-
Patent number: 8311814
Abstract: The present invention is directed to a voice activity detector that uses the periodicity of amplitude peaks and valleys to identify signals of substantially fixed power or having periodicity.
Type: Grant
Filed: September 19, 2006
Date of Patent: November 13, 2012
Assignee: Avaya Inc.
Inventors: Mei-Sing Ong, Luke A. Tucker
-
Patent number: 8306814
Abstract: A method for classifying a pair of audio signals into an agent audio signal and a customer audio signal. One embodiment relates to unsupervised training, in which the training corpus comprises a multiplicity of audio signal pairs, wherein each pair comprises an agent signal and a customer signal, and wherein it is unknown for each signal if it is by the agent or by the customer. Training is based on the agent signals being more similar to one another than the customer signals. An agent cluster and a customer cluster are determined. The input signals are associated with the agent or the customer according to the higher score combination of the input signals and the clusters. Another embodiment relates to supervised training, wherein an agent model is generated, and the input signal that yields higher score against the model is the agent signal, while the other is the customer signal.
Type: Grant
Filed: May 11, 2010
Date of Patent: November 6, 2012
Assignee: Nice-Systems Ltd.
Inventors: Gil Dobry, Hila Lam, Moshe Wasserblat
-
Patent number: 8301455
Abstract: A user identification method is described in which, in a first identification procedure, identification data (ID1) of a first type belonging to a target individual to be identified are determined and are compared with previously stored user identification data (ND1) of the first type assigned to an authorized user. In addition, identification data (ID2) of a second type that belong with a certain probability to the same target individual are automatically determined. After a successful confirmation of the identity of the target individual with the authorized user from the identification data (ID1) of the first type, user identification data (ND2) of the second type are stored for the respective authorized user using the determined identification data (ID2) of the second type in order to use said data in a subsequent identification procedure. In addition, a corresponding user identification device is disclosed.
Type: Grant
Filed: December 17, 2002
Date of Patent: October 30, 2012
Assignee: Koninklijke Philips Electronics N.V.
Inventor: Volker Steinbiss
-
Patent number: 8296144
Abstract: Embodiments of an automated dialog system testing method and component are described. This automated testing method and system supplements real human-based testing with simulated user input and incorporates a set of evaluation measures that focus on three basic aspects of task-oriented dialog systems, namely, understanding ability, efficiency, and the appropriateness of system actions. These measures are first applied on a corpus generated between a dialog system and a group of human users to demonstrate the validity of these measures with the human users' satisfaction levels. Results generally show that these measures are significantly correlated with these satisfaction levels. A regression model is then built to predict the user satisfaction scores using these evaluation measures.
Type: Grant
Filed: June 4, 2008
Date of Patent: October 23, 2012
Assignee: Robert Bosch GmbH
Inventors: Fuliang Weng, Hua Ai
-
Patent number: 8295462
Abstract: A method, system and computer program product for alerting a participant when a topic of interest is being discussed and/or a speaker of interest is speaking during a conference call. A participant to a conference call identifies the topics and/or speakers of interest, which are stored for future use along with the participant's contact information. When a participant's identified topic of interest is being discussed and/or a participant's identified speaker of interest is speaking during a conference call, the participant will be alerted to that fact, such as via the means specified in the participant's contact information.
Type: Grant
Filed: March 8, 2008
Date of Patent: October 23, 2012
Assignee: International Business Machines Corporation
Inventors: Steven Michael Miller, Lisa Anne Seacat