Similarity Patents (Class 704/239)
  • Patent number: 9830039
    Abstract: A wizard control panel may be used by a human wizard to adjust the operation of a Natural Language (NL) conversational system during a real-time dialog flow. Input to the wizard control panel is detected and used to interrupt/change an automatic operation of one or more of the NL conversational system components used during the flow. For example, the wizard control panel may be used to adjust results determined by an Automated Speech Recognition (ASR) component, a Natural Language Understanding (NLU) component, a Dialog Manager (DM) component, and a Natural Language Generation (NLG) component before the results are used to perform an automatic operation within the flow. A timeout may also be set such that when the timeout expires, the conversational system performs an automated operation by using the results shown in the wizard control panel (edited/not edited).
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: November 28, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lisa Stifelman, Dilek Hakkani-Tur, Larry Heck, Madhusudan Chinthakunta
  • Patent number: 9799333
    Abstract: A system and method are provided for performing speech processing. A system includes an audio detection system configured to receive a signal including speech and a memory having stored therein a database of keyword models forming an ensemble of filters associated with each keyword in the database. A processor is configured to receive the signal including speech from the audio detection system, decompose the signal including speech into a sparse set of phonetic impulses, and access the database of keywords and convolve the sparse set of phonetic impulses with the ensemble of filters. The processor is further configured to identify keywords within the signal including speech based on a result of the convolution and to control operation of the electronic system based on the keywords identified.
    Type: Grant
    Filed: August 31, 2015
    Date of Patent: October 24, 2017
    Assignee: The Johns Hopkins University
    Inventors: Keith Kintzley, Aren Jansen, Hynek Hermansky, Kenneth Church
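The patent above (9799333) matches keywords by convolving a sparse phonetic impulse representation of the speech with an ensemble of keyword filters. Below is a minimal, illustrative Python sketch of that matched-filter idea on toy data; the phone inventory, filter construction, and scores are invented for illustration and are not taken from the patent.

```python
import numpy as np

# Toy phone inventory and a sparse phonetic impulse train:
# one (frame, phone, weight) impulse per detected phonetic event.
PHONES = ["k", "ae", "t", "d", "ao", "g"]
impulses = [(10, "k", 0.9), (15, "ae", 0.8), (20, "t", 0.7),   # "cat"
            (40, "d", 0.6), (45, "ao", 0.9), (50, "g", 0.8)]   # "dog"

T = 60  # number of frames in the utterance
sparse = np.zeros((len(PHONES), T))
for t, ph, w in impulses:
    sparse[PHONES.index(ph), t] = w

def keyword_filter(phone_seq, frames_per_phone=5):
    """Crude matched filter: the i-th phone of the keyword is expected
    `frames_per_phone` frames after the previous one."""
    span = frames_per_phone * (len(phone_seq) - 1) + 1
    filt = np.zeros((len(PHONES), span))
    for i, ph in enumerate(phone_seq):
        filt[PHONES.index(ph), i * frames_per_phone] = 1.0
    return filt

def keyword_score(sparse_train, filt):
    """Slide the filter over the impulse train and sum the matched weights;
    the peak response over time is the keyword score."""
    span = filt.shape[1]
    responses = np.zeros(sparse_train.shape[1] - span + 1)
    for start in range(len(responses)):
        responses[start] = np.sum(sparse_train[:, start:start + span] * filt)
    return responses.max(), int(responses.argmax())

for word, phones in [("cat", ["k", "ae", "t"]), ("dog", ["d", "ao", "g"])]:
    score, frame = keyword_score(sparse, keyword_filter(phones))
    print(f"{word}: score={score:.2f} near frame {frame}")
```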
  • Patent number: 9685153
    Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include a top recognition result and one or more alternative recognition results) based on an analysis of a speech input are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for the domain. In some embodiments, one or more of the recognition results may be evaluated to determine whether the result(s) include one or more words or phrases that, when included in a result, would change a meaning of the result in a manner that would be significant for the domain.
    Type: Grant
    Filed: May 15, 2015
    Date of Patent: June 20, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
  • Patent number: 9684433
    Abstract: A method includes identifying individuals that are affiliated with a user. The method includes incorporating trusted devices associated with the identified individuals into an event monitor network that is configured to monitor for an occurrence of a monitored event. The method includes identifying a particular input that suggests the occurrence of the monitored event. The method includes communicating to the trusted devices, an input sample that is used for recognition of the particular input from general input that is measured by sensors of the trusted devices. The method includes receiving from at least one of the trusted devices, an event message that indicates the particular input is observed by at least one of the sensors. In response to the event message, the method includes communicating to a user interface of a user device associated with the user, an alarm message that indicates the occurrence of the monitored event.
    Type: Grant
    Filed: December 30, 2014
    Date of Patent: June 20, 2017
    Assignee: EBAY INC.
    Inventor: Kamal Zamer
  • Patent number: 9558335
    Abstract: A method includes receiving, from a user via an electronic device, input representing a password to be utilized for an account; automatically determining, utilizing a processor, a complexity value for the input password; automatically determining, based on the determined complexity value, security settings for the account; receiving, from a user via an electronic device, input representing an attempt to log in to the account, the input including an attempted password; automatically determining that the attempted password does not match the password to be utilized for the account; and determining a course of action to take in response to that determination, the course of action being determined based at least in part on the automatically determined security settings for the account.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: January 31, 2017
    Assignee: ALLSCRIPTS SOFTWARE, LLC
    Inventors: David Thomas Windell, Todd Michael Eischeid, Scott David Bower
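The abstract above (9558335) ties account security settings to an automatically computed password complexity value and then uses those settings to decide what to do after a failed login. The Python sketch below shows one plausible shape of that flow; the scoring formula, thresholds, and lockout values are assumptions for illustration only.

```python
import string

def complexity_value(password: str) -> int:
    """Toy complexity score: length plus one point per character class used."""
    classes = [
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ]
    return len(password) + sum(classes)

def security_settings(complexity: int) -> dict:
    """Map the complexity value to account security settings: weaker
    passwords get fewer allowed attempts and a longer lockout."""
    if complexity >= 16:
        return {"max_attempts": 10, "lockout_minutes": 5}
    if complexity >= 10:
        return {"max_attempts": 5, "lockout_minutes": 15}
    return {"max_attempts": 3, "lockout_minutes": 60}

def on_failed_login(failed_attempts: int, settings: dict) -> str:
    """Choose a course of action after a non-matching password attempt."""
    if failed_attempts >= settings["max_attempts"]:
        return f"lock account for {settings['lockout_minutes']} minutes"
    return "allow retry"

pw = "Tr0ub4dor&3"
settings = security_settings(complexity_value(pw))
print(settings, on_failed_login(3, settings))
```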
  • Patent number: 9476718
    Abstract: A vehicle navigation system may send and receive communications, such as text messages. Speech recognition may generate a text message without affecting a driver's control of the vehicle. A user may audibly control the navigation system and generate a text message through a speech recognition element. A microphone may record a user's voice, which is then transformed into a text message for transmission. The message may be recorded sentence-by-sentence, word-by-word, or letter-by-letter. The recorded text message may be visually or audibly presented to the user before transmission.
    Type: Grant
    Filed: July 10, 2007
    Date of Patent: October 25, 2016
    Assignee: HARMAN BECKER AUTOMOTIVE SYSTEMS GMBH
    Inventor: Mirko Herforth
  • Patent number: 9477753
    Abstract: Systems and methods for processing a query include determining a plurality of sets of match candidates for a query using a processor, each of the plurality of sets of match candidates being independently determined from a plurality of diverse word lattice generation components of different type. The plurality of sets of match candidates is merged by generating a first score for each match candidate to provide a merged set of match candidates. A second score is computed for each match candidate of the merged set based upon features of that match candidate. The first score and the second score are combined to provide a final set of match candidates as matches to the query.
    Type: Grant
    Filed: March 12, 2013
    Date of Patent: October 25, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Jeff Kuo, Lidia Luminita Mangu, Hagen Soltau
  • Patent number: 9449126
    Abstract: A user interface for presenting a set of related pages of an electronic content work for viewing at the same time. The pages are sized according to a target format for presentation of the electronic content work, and may also be formatted according to user-defined font and zoom criteria. Each of the related pages comprises a media object, for example a markup language object. Responsive to user manipulation of a presentation criterion for the set of related pages, the set of related pages is reformatted and presented in near real-time. In some instances, a user may manipulate controls of the user interface to isolate a content object included within the set of related pages, have information regarding that content object presented, and even edit the content object in-line with the present view.
    Type: Grant
    Filed: June 1, 2012
    Date of Patent: September 20, 2016
    Assignee: Inkling Systems, Inc.
    Inventors: Thomas Charles Genoni, Peter S. Cho, Norris Hung, Eric Todd Lovett, Huan Zhao
  • Patent number: 9437187
    Abstract: A search string acquiring unit acquires a search string. A converting unit converts the search string into a phoneme sequence. A time length deriving unit derives the spoken time length of the voice corresponding to the search string. A zone designating unit designates a likelihood acquisition zone in a target voice signal. A likelihood acquiring device acquires a likelihood indicating how likely it is that the likelihood acquisition zone is a zone in which the voice corresponding to the search string is spoken. A repeating unit changes the likelihood acquisition zone designated by the zone designating unit, and repeats the process of the zone designating unit and the likelihood acquiring device. An identifying unit identifies, from the target voice signal, estimated zones in which the voice corresponding to the search string is estimated to be spoken, on the basis of the likelihoods acquired for each of the likelihood acquisition zones.
    Type: Grant
    Filed: January 23, 2015
    Date of Patent: September 6, 2016
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Hiroyasu Ide
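Patents 9437187 and 9431007 (the next entry) both score sliding "likelihood acquisition zones" of a target voice signal against the phoneme sequence of a search string. The toy Python sketch below illustrates that scan, assuming a precomputed frame-by-phoneme log-probability matrix; the data, zone length, and scoring rule are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

PHONEMES = ["a", "k", "i", "s", "o"]
query_phonemes = ["a", "k", "i"]      # phoneme sequence for the search string
frames_per_phoneme = 4                # derived spoken length of the query
zone_len = frames_per_phoneme * len(query_phonemes)

# Toy frame-level log-probabilities of each phoneme in the target signal
# (rows stop being normalized once we plant the query; fine for a sketch).
num_frames = 100
log_probs = np.log(rng.dirichlet(np.ones(len(PHONEMES)), size=num_frames))

# Plant a stretch where the query phonemes are strongly supported.
for i, ph in enumerate(query_phonemes):
    idx = PHONEMES.index(ph)
    log_probs[40 + i * frames_per_phoneme:40 + (i + 1) * frames_per_phoneme, idx] = np.log(0.9)

def zone_likelihood(start: int) -> float:
    """Likelihood that the query is spoken in the zone starting at `start`:
    sum the log-probability of the expected phoneme in every frame."""
    total = 0.0
    for i, ph in enumerate(query_phonemes):
        idx = PHONEMES.index(ph)
        seg = log_probs[start + i * frames_per_phoneme:start + (i + 1) * frames_per_phoneme, idx]
        total += seg.sum()
    return total

scores = [zone_likelihood(s) for s in range(num_frames - zone_len + 1)]
best = int(np.argmax(scores))
print(f"estimated zone: frames {best}..{best + zone_len} (score {scores[best]:.1f})")
```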
  • Patent number: 9438578
    Abstract: A biometric authentication system is disclosed that provides authentication capability using biometric data in connection with a challenge for parties engaging in digital communications such as digital text-oriented, interactive digital communications. End-user systems may be coupled to devices that include biometric data capture devices such as retina scanners, fingerprint recorders, cameras, microphones, ear scanners, DNA profilers, etc., so that biometric data of a communicating party may be captured and used for authentication purposes.
    Type: Grant
    Filed: August 17, 2013
    Date of Patent: September 6, 2016
    Assignee: AT&T INTELLECTUAL PROPERTY II, L.P.
    Inventors: Pradeep K. Bansal, Lee Begeja, Carroll W. Creswell, Jeffrey Farah, Benjamin J. Stern, Jay Wilpon
  • Patent number: 9431007
    Abstract: In a voice search device, a processor acquires a search word, converts the search word into a phoneme sequence, acquires, for each frame, an output probability of a feature quantity of a target voice signal being output from each phoneme included in the phoneme sequence, and executes relative calculation of the output probability acquired from each phoneme, based on an output probability acquired from another phoneme included in the phoneme sequence. In addition, the processor successively designates likelihood acquisition zones, acquires a likelihood indicating how likely it is that a designated likelihood acquisition zone is a zone in which the voice corresponding to the search word is spoken, and identifies from the target voice signal an estimated zone for which the voice corresponding to the search word is estimated to be spoken, based on the acquired likelihood.
    Type: Grant
    Filed: January 15, 2015
    Date of Patent: August 30, 2016
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Hiroki Tomita
  • Patent number: 9330358
    Abstract: A system and method include comparing a context to cases stored in a case base, where the cases include Boolean and non-Boolean independent weight variables and a domain-specific dependency variable. The case and context independent weight variables are normalized and a normalized weight vector is determined for the case base. A match between the received context and each case of the case base is determined using the normalized context and case variables and the normalized weight vector. A skew value is determined for each category of domain-specific dependency variables and the category of domain-specific dependency variables having the minimal skew value is selected. The dependency variable associated with the selected category is then displayed to a user.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: May 3, 2016
    Assignee: THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SECRETARY OF THE NAVY
    Inventor: Stuart H. Rubin
  • Patent number: 9330656
    Abstract: A speech dialogue system generates a response sentence in a way that improves the efficiency of the dialogue with the user, based on a result of estimating an attribute of a proper name in an utterance of a user. The system includes a database attribute estimation unit to estimate the attribute of the input proper name by utilizing a database, and a web attribute estimation unit to estimate an attribute of an input proper name by utilizing information on the web. A reliability integration unit calculates an integrated reliability for each of the possible attributes obtained from the estimation by the two units, by integrating the reliabilities of their respective estimations. A response generation unit generates a response sentence to an input utterance based on the integrated reliabilities of the possible attributes.
    Type: Grant
    Filed: February 26, 2014
    Date of Patent: May 3, 2016
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Mikio Nakano, Kazunori Komatani, Tsugumi Otsuka
  • Patent number: 9311932
    Abstract: A method, system, and computer program product for adaptive pause detection in speech recognition are provided in the illustrative embodiments. A speech stream comprising an audio signal of a speech is received. A first point in the speech stream is marked with a beginning time stamp. After the first point, a pause is detected in the speech stream. The pause is of a duration at least equal to a pause duration threshold. A second point after the pause in the speech stream is marked with an ending time stamp. A portion of the speech stream between the beginning and the ending time stamps forms a first speech segment. A speech rate of the first speech segment is computed using a number of words in the first speech segment, the beginning time stamp, and the ending time stamp. The pause duration threshold is adjusted according to the first speech segment's speech rate.
    Type: Grant
    Filed: January 23, 2014
    Date of Patent: April 12, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: William S. Carter
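The adaptive pause detection described above computes a speech rate for the most recent segment and uses it to adjust the pause duration threshold. A minimal Python sketch of that feedback loop follows; the reference rate and the scaling rule are assumptions, not the patent's formula.

```python
def speech_rate(num_words: int, begin_ts: float, end_ts: float) -> float:
    """Words per second for the segment between the two time stamps."""
    return num_words / max(end_ts - begin_ts, 1e-6)

def adjust_pause_threshold(current_threshold: float, rate: float,
                           reference_rate: float = 2.5) -> float:
    """Scale the pause-duration threshold inversely with speech rate:
    fast talkers get a shorter threshold, slow talkers a longer one."""
    return current_threshold * (reference_rate / max(rate, 1e-6))

# First segment: 12 words between t=0.0s and t=4.0s -> 3.0 words/s.
rate = speech_rate(12, 0.0, 4.0)
threshold = adjust_pause_threshold(0.6, rate)
print(f"rate={rate:.1f} w/s, next pause threshold={threshold:.2f}s")
```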
  • Patent number: 9262693
    Abstract: An object detection apparatus includes a storage section storing a plurality of selection patterns as combinations of one of a plurality of recognition dictionaries and one of a plurality of image recognition algorithms, a specifying means for specifying at least one of a distance from a position at which an input image is taken to a target corresponding to the detection object within the input image and a state of light of the input image, a selection means for selecting one from the plurality of the selection patterns based on at least one of the distance and the state of the light specified by the specifying means, and a detection means for detecting the detection object within the input image by performing an image recognition process using the recognition dictionary and the image recognition algorithm included in the selection pattern selected by the selection means.
    Type: Grant
    Filed: April 16, 2014
    Date of Patent: February 16, 2016
    Assignee: DENSO CORPORATION
    Inventor: Yasunori Kamiya
  • Patent number: 9197416
    Abstract: In a verification apparatus, a biometric information acquisition unit acquires a plurality of biometric information pieces from an object. A first verification unit calculates, as a verification score, the similarity between a biometric information piece and a verification information piece, and compares the calculated verification score with a first determination value to determine whether the biometric information piece matches the verification information piece. When the verification fails, a second verification unit performs verification on the plurality of biometric information pieces having a predetermined relationship, using the verification information piece and a second determination value which defines a less stringent criterion than the first determination value.
    Type: Grant
    Filed: August 8, 2013
    Date of Patent: November 24, 2015
    Assignee: FUJITSU FRONTECH LIMITED
    Inventor: Shinichi Eguchi
  • Patent number: 9147395
    Abstract: The present disclosure relates to a mobile terminal and a voice recognition method thereof. The voice recognition method may include receiving a user's voice; providing the received voice to a first voice recognition engine provided in a server and a second voice recognition engine provided in the mobile terminal; acquiring first voice recognition data as a result of recognizing the received voice by the first voice recognition engine; acquiring second voice recognition data as a result of recognizing the received voice by the second voice recognition engine; estimating a function corresponding to the user's intention based on at least one of the first and the second voice recognition data; calculating a similarity between the first and the second voice recognition data when personal information is required for the estimated function; and selecting either one of the first and the second voice recognition data based on the calculated similarity.
    Type: Grant
    Filed: June 21, 2013
    Date of Patent: September 29, 2015
    Assignee: LG ELECTRONICS INC.
    Inventors: Juhee Kim, Hyunseob Lee, Joonyup Lee, Jungkyu Choi
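The method above runs a server-side and an on-device recognizer in parallel, measures the similarity of their outputs when the estimated function involves personal information, and selects one result. Below is one hedged reading of that selection logic in Python, using a character-level similarity ratio; the threshold and the preference for the on-device result are illustrative assumptions.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio between two transcripts (0..1)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def select_result(server_text: str, device_text: str,
                  needs_personal_info: bool, threshold: float = 0.8) -> str:
    """If the estimated function needs personal information, prefer the
    on-device result when the two transcripts agree closely enough;
    otherwise fall back to the (typically more accurate) server result."""
    if needs_personal_info and similarity(server_text, device_text) >= threshold:
        return device_text
    return server_text

print(select_result("call john smith", "call jon smith", needs_personal_info=True))
```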
  • Patent number: 9142212
    Abstract: An apparatus and method for recognizing a voice command for use in an interactive voice user interface are provided. The apparatus includes a command intention belief generation unit that is configured to recognize a first voice command and that may generate one or more command intention beliefs for the first voice command. The apparatus also includes a command intention belief update unit that is configured to update each of the command intention beliefs based on a system response to the first voice command and a second voice command. The apparatus also includes a command intention belief selection unit that is configured to select one of the updated command intention beliefs for the first voice command. The apparatus also includes an operation signal output unit that is configured to select a final command intention from the selected updated command intention belief and to output an operation signal based on the selected final command intention.
    Type: Grant
    Filed: April 26, 2011
    Date of Patent: September 22, 2015
    Inventors: Chi-Youn Park, Byung-Kwan Kwak, Jeong-Su Kim, Jeong-Mi Cho
  • Patent number: 9064493
    Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include a top recognition result and one or more alternative recognition results) based on an analysis of a speech input are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for the domain. In some embodiments, one or more of the recognition results may be evaluated to determine whether the result(s) include one or more words or phrases that, when included in a result, would change a meaning of the result in a manner that would be significant for the domain.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: June 23, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
  • Patent number: 9043207
    Abstract: The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
    Type: Grant
    Filed: November 12, 2009
    Date of Patent: May 26, 2015
    Assignee: Agnitio S.L.
    Inventors: Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez, Marta Garcia Gomar
  • Publication number: 20150127342
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, an utterance vector that is derived from an utterance is obtained. Hash values are determined for the utterance vector according to multiple different hash functions. A set of speaker vectors from a plurality of hash tables is determined using the hash values, where each speaker vector was derived from one or more utterances of a respective speaker. The speaker vectors in the set are compared with the utterance vector. A speaker vector is selected based on comparing the speaker vectors in the set with the utterance vector.
    Type: Application
    Filed: October 24, 2014
    Publication date: May 7, 2015
    Inventors: Matthew Sharifi, Ignacio Lopez Moreno, Ludwig Schmidt
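The publication above describes locality-sensitive hashing for speaker identification: hash the utterance vector under several hash functions, collect candidate speaker vectors from the corresponding hash tables, and compare. The Python sketch below shows a generic random-hyperplane LSH version of that pipeline; the dimensions, table counts, and cosine comparison are illustrative choices, not details from the application.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, NUM_TABLES, NUM_BITS = 16, 4, 8

# One random-hyperplane hash function per table.
planes = rng.standard_normal((NUM_TABLES, NUM_BITS, DIM))

def hash_vector(v: np.ndarray, table: int) -> int:
    """Sign of the projection onto each hyperplane gives one bit of the key."""
    bits = (planes[table] @ v) > 0
    return int("".join("1" if b else "0" for b in bits), 2)

# Build one hash table per hash function from enrolled speaker vectors.
speakers = {name: rng.standard_normal(DIM) for name in ["alice", "bob", "carol"]}
tables = [dict() for _ in range(NUM_TABLES)]
for name, vec in speakers.items():
    for t in range(NUM_TABLES):
        tables[t].setdefault(hash_vector(vec, t), []).append((name, vec))

def identify(utterance_vec: np.ndarray) -> str:
    """Gather candidates from all tables, then pick the closest by cosine."""
    candidates = {}
    for t in range(NUM_TABLES):
        for name, vec in tables[t].get(hash_vector(utterance_vec, t), []):
            candidates[name] = vec

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    if not candidates:
        return "unknown"
    return max(candidates, key=lambda n: cosine(candidates[n], utterance_vec))

# An utterance vector close to bob's enrolled vector should usually retrieve bob.
query = speakers["bob"] + 0.05 * rng.standard_normal(DIM)
print(identify(query))
```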
  • Patent number: 9026437
    Abstract: A location determination system includes a first mobile terminal and a second mobile terminal. The first mobile terminal includes a first processor to acquire a first sound signal, analyze the first sound signal to obtain a first analysis result, and transmit the first analysis result. The second mobile terminal includes a second processor to acquire a second sound signal, analyze the second sound signal to obtain a second analysis result, receive the first analysis result from the first mobile terminal, compare the second analysis result with the first analysis result to obtain a comparison result, and determine whether the first mobile terminal is located in an area in which the second mobile terminal is located, based on the comparison result.
    Type: Grant
    Filed: March 26, 2012
    Date of Patent: May 5, 2015
    Assignee: Fujitsu Limited
    Inventor: Eiji Hasegawa
  • Patent number: 9020816
    Abstract: A method, system and apparatus are shown for identifying non-language speech sounds in a speech or audio signal. An audio signal is segmented and feature vectors are extracted from the segments of the audio signal. The segment is classified using a hidden Markov model (HMM) that has been trained on sequences of these feature vectors. Post-processing components can be utilized to enhance classification. An embodiment is described in which the hidden Markov model is used to classify a segment as a language speech sound or one of a variety of non-language speech sounds. Another embodiment is described in which the hidden Markov model is trained using discriminative learning.
    Type: Grant
    Filed: August 13, 2009
    Date of Patent: April 28, 2015
    Assignee: 21CT, Inc.
    Inventor: Matthew McClain
  • Patent number: 9015045
    Abstract: A method for refining a search is provided. Embodiments may include receiving a first speech signal corresponding to a first utterance and receiving a second speech signal corresponding to a second utterance, wherein the second utterance is a refinement to the first utterance. Embodiments may also include identifying information associated with the first speech signal as first speech signal information and identifying information associated with the second speech signal as second speech signal information. Embodiments may also include determining a first quantity of search results based upon the first speech signal information and determining a second quantity of search results based upon the second speech signal information.
    Type: Grant
    Filed: March 11, 2013
    Date of Patent: April 21, 2015
    Assignee: Nuance Communications, Inc.
    Inventor: Jean-Francois Lavallee
  • Patent number: 9009042
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating direct speech messages based on voice commands that include indirect speech messages. In one aspect, a method includes receiving a voice input corresponding to an utterance. A determination is made whether a transcription of the utterance includes a command to initiate a communication to a user and a segment that is classified as indirect speech. In response to determining that the transcription of the utterance includes the command and the segment that is classified as indirect speech, the segment that is classified as indirect speech is provided as input to a machine translator. In response to providing the segment that is classified as indirect speech to the machine translator, a direct speech segment is received from the machine translator. A communication is initiated that includes the direct speech segment.
    Type: Grant
    Filed: June 13, 2014
    Date of Patent: April 14, 2015
    Assignee: Google Inc.
    Inventors: Matthias Quasthoff, Simon Tickner
  • Patent number: 9009039
    Abstract: Technologies are described herein for noise adaptive training to achieve robust automatic speech recognition. Through the use of these technologies, a noise adaptive training (NAT) approach may use both clean and corrupted speech for training. The NAT approach may normalize the environmental distortion as part of the model training. A set of underlying “pseudo-clean” model parameters may be estimated directly. This may be done without point estimation of clean speech features as an intermediate step. The pseudo-clean model parameters learned from the NAT technique may be used with a Vector Taylor Series (VTS) adaptation. Such adaptation may support decoding noisy utterances during the operating phase of an automatic voice recognition system.
    Type: Grant
    Filed: June 12, 2009
    Date of Patent: April 14, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Lewis Seltzer, James Garnet Droppo, Ozlem Kalinli, Alejandro Acero
  • Patent number: 8996373
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in a depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 31, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Patent number: 8990086
    Abstract: A recognition confidence measurement method, medium, and system are provided that can more accurately determine whether an input speech signal is in-vocabulary by extracting an optimum number of candidates that match a phone string extracted from the input speech signal and estimating a lexical distance between the extracted candidates. A recognition confidence measurement method includes: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string against phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is in-vocabulary, based on the lexical distance.
    Type: Grant
    Filed: July 31, 2006
    Date of Patent: March 24, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-Bae Jeong, Nam Hoon Kim, Ick Sang Han, In Jeong Choi, Gil Jin Jang, Jae-Hoon Jeong
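The confidence measurement above decides whether an input is in-vocabulary from the lexical distance between the top matching candidates. One plausible reading, sketched in Python below, uses a normalized edit distance over candidate phoneme strings; the acceptance ratio and the decision rule are assumptions for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two phoneme strings."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        row = [i]
        for j, cb in enumerate(b, 1):
            row.append(min(prev_row[j] + 1,                  # deletion
                           row[j - 1] + 1,                   # insertion
                           prev_row[j - 1] + (ca != cb)))    # substitution
        prev_row = row
    return prev_row[-1]

def in_vocabulary(candidates, accept_ratio=0.34):
    """Decide in/out-of-vocabulary from the lexical spread of the matched
    candidates: tightly clustered candidates suggest a genuine dictionary hit."""
    top = candidates[0]
    dists = [edit_distance(top, c) / max(len(top), len(c)) for c in candidates[1:]]
    mean_dist = sum(dists) / len(dists)
    return mean_dist <= accept_ratio, mean_dist

# Candidate phoneme strings matched against the dictionary for one utterance.
accepted, spread = in_vocabulary(["kamera", "kamora", "kimera"])
print(accepted, round(spread, 2))
```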
  • Publication number: 20150081296
    Abstract: A method for activating a voice assistant function in a mobile device is disclosed. The method includes receiving an input sound stream by a sound sensor and determining a context of the mobile device. The method may determine the context based on the input sound stream. For determining the context, the method may also obtain data indicative of the context of the mobile device from at least one of an acceleration sensor, a location sensor, an illumination sensor, a proximity sensor, a clock unit, and a calendar unit in the mobile device. In this method, a threshold for activating the voice assistant function is adjusted based on the context. The method detects a target keyword from the input sound stream based on the adjusted threshold. If the target keyword is detected, the method activates the voice assistant function.
    Type: Application
    Filed: September 17, 2013
    Publication date: March 19, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Minsub Lee, Taesu Kim, Kyu Woong Hwang, Minho Jin
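The application above adjusts the keyword-detection threshold according to the device context before deciding whether to activate the voice assistant. A minimal Python sketch of that idea follows; the context labels, base threshold, and adjustment values are invented for illustration.

```python
def keyword_threshold(base: float, context: str) -> float:
    """Adjust the detection threshold for the activation keyword based on
    the device context: lower it where false rejections hurt most (e.g. in a
    car), raise it where false activations are costly (e.g. in a meeting)."""
    adjustments = {"in_car": -0.10, "meeting": +0.15, "pocket": +0.05, "idle": 0.0}
    return base + adjustments.get(context, 0.0)

def maybe_activate(keyword_score: float, context: str, base: float = 0.70) -> bool:
    """Activate the voice assistant only if the keyword score clears the
    context-adjusted threshold."""
    return keyword_score >= keyword_threshold(base, context)

print(maybe_activate(0.72, "meeting"))   # threshold raised to 0.85 -> False
print(maybe_activate(0.72, "in_car"))    # threshold lowered to 0.60 -> True
```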
  • Patent number: 8983840
    Abstract: Techniques, an apparatus, and an article of manufacture for identifying one or more utterances that are likely to carry the intent of a speaker, from a conversation between two or more parties. A method includes obtaining an input of a set of utterances in chronological order from a conversation between two or more parties, computing an intent confidence value of each utterance by summing intent confidence scores from each of the constituent words of the utterance, wherein intent confidence scores capture each word's influence on the subsequent utterances in the conversation based on (i) the uniqueness of the word in the conversation and (ii) the number of times the word subsequently occurs in the conversation, and generating a ranked order of the utterances from highest to lowest intent confidence value, wherein the highest intent confidence value corresponds to the utterance which is most likely to carry the intent of the speaker.
    Type: Grant
    Filed: June 19, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Sachindra Joshi, Saket Saurabh, Ashish Verma
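The abstract above scores each utterance by summing per-word intent confidence scores built from a word's uniqueness in the conversation and its later recurrence. The short Python sketch below implements one simple version of that scoring on a toy conversation; the exact weighting is an assumption, not the patented formula.

```python
from collections import Counter

conversation = [
    "hello how are you",
    "i want to upgrade my data plan",
    "sure which plan are you on",
    "the basic plan i want more data",
    "ok upgrading your data plan now",
]

def intent_scores(utterances):
    """Score each utterance by summing per-word scores that combine the
    word's rarity in the conversation with how often it recurs later."""
    counts = Counter(w for u in utterances for w in u.split())
    scored = []
    for idx, utt in enumerate(utterances):
        later = Counter(w for u in utterances[idx + 1:] for w in u.split())
        score = sum((1.0 / counts[w]) * later[w] for w in utt.split())
        scored.append((score, utt))
    return sorted(scored, reverse=True)

for score, utt in intent_scores(conversation):
    print(f"{score:.2f}  {utt}")
```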
  • Publication number: 20150073791
    Abstract: An apparatus, a method, and a machine-readable medium are provided for characterizing differences between two language models. A group of utterances from each of a group of time domains is examined. One of a significant word change or a significant word class change within the utterances is determined. A first cluster of utterances including a word or a word class corresponding to the one of the significant word change or the significant word class change is generated from the utterances. A second cluster of utterances not including the word or the word class corresponding to the one of the significant word change or the significant word class change is generated from the utterances.
    Type: Application
    Filed: November 14, 2014
    Publication date: March 12, 2015
    Inventors: Allen Louis GORIN, John Grothendieck, Jeremy Huntley Greet Wright
  • Patent number: 8977547
    Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold T1; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: March 10, 2015
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
  • Patent number: 8965761
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: February 27, 2014
    Date of Patent: February 24, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: William Kress Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Thomas James Watson, Daniel Mark Schumacher
  • Patent number: 8930189
    Abstract: A particular method includes receiving, at a representational state transfer endpoint device, a first user input related to a first speech to text conversion performed by a speech to text transcription service. The method also includes receiving, at the representational state transfer endpoint device, a second user input related to a second speech to text conversion performed by the speech to text transcription service. The method includes processing of the first user input and the second user input at the representational state transfer endpoint device to generate speech to text adjustment information.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: January 6, 2015
    Assignee: Microsoft Corporation
    Inventors: Jeremy Edward Cath, Timothy Edwin Harris, Marc Mercuri, James Oliver Tisdale, III
  • Patent number: 8924204
    Abstract: Unlike sound based pressure waves that go everywhere, air turbulence caused by wind is usually a fairly local event. Therefore, in a system that utilizes two or more spatially separated microphones to pick up sound signals (e.g., speech), wind noise picked up by one of the microphones often will not be picked up (or at least not to the same extent) by the other microphone(s). Embodiments of methods and apparatuses that utilize this fact and others to effectively detect and suppress wind noise using multiple microphones that are spatially separated are described.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: December 30, 2014
    Assignee: Broadcom Corporation
    Inventors: Juin-Hwey Chen, Jes Thyssen, Xianxian Zhang, Huaiyu Zeng
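The wind-noise patent above relies on wind turbulence being local to a single microphone while acoustic sources reach all capsules. The Python sketch below flags frames where the per-channel energies diverge beyond a ratio; the frame length and the 10 dB ratio are illustrative assumptions rather than the patent's detector.

```python
import numpy as np

def frame_energies(signal: np.ndarray, frame_len: int = 256) -> np.ndarray:
    """Mean energy of consecutive non-overlapping frames."""
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].reshape(n, frame_len)
    return (frames ** 2).mean(axis=1)

def wind_noise_frames(mic_a: np.ndarray, mic_b: np.ndarray,
                      ratio_db: float = 10.0) -> np.ndarray:
    """Flag frames where the two channels' energies diverge strongly.
    Acoustic sources reach both microphones with similar levels, whereas
    wind turbulence is local to one capsule."""
    ea, eb = frame_energies(mic_a), frame_energies(mic_b)
    diff_db = 10 * np.log10(np.maximum(ea, 1e-12) / np.maximum(eb, 1e-12))
    return np.abs(diff_db) > ratio_db

rng = np.random.default_rng(0)
speech = rng.standard_normal(4096) * 0.1          # common "speech" in both mics
mic_a, mic_b = speech.copy(), speech.copy()
mic_a[1024:2048] += rng.standard_normal(1024)     # wind burst on mic A only
print(wind_noise_frames(mic_a, mic_b).astype(int))
```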
  • Patent number: 8924211
    Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include two or more recognition results, including a top recognition result and one or more alternative recognition results) based on an analysis of a speech input are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for a domain, such as the medical domain. In some embodiments, words and/or phrases that may be confused by an ASR system may be determined and associated in sets of words and/or phrases. Words and/or phrases that may be determined include those that change a meaning of a phrase or sentence when included in the phrase/sentence.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: December 30, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
  • Patent number: 8909526
    Abstract: In some embodiments, a recognition result produced by a speech processing system based on an analysis of a speech input is evaluated for indications of potential errors. In some embodiments, sets of words/phrases that may be acoustically similar or otherwise confusable, the misrecognition of which can be significant in the domain, may be used together with a language model to evaluate a recognition result to determine whether the recognition result includes such an indication. In some embodiments, a word/phrase of a set that appears in the result is iteratively replaced with each of the other words/phrases of the set. The result of the replacement may be evaluated using a language model to determine a likelihood of the newly-created string of words appearing in a language and/or domain. The likelihood may then be evaluated to determine whether the result of the replacement is sufficiently likely for an alert to be triggered.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: December 9, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
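The patent above iteratively substitutes confusable words into a recognition result and scores each variant with a language model to decide whether to raise an alert. The Python sketch below does this with a toy bigram model; the confusable sets, probabilities, and alert margin are invented for illustration.

```python
# Confusable sets whose misrecognition would matter in a medical domain,
# and toy bigram log-probabilities standing in for a real language model.
CONFUSABLE_SETS = [{"hypotension", "hypertension"}, {"fifteen", "fifty"}]
BIGRAM_LOGPROB = {
    ("patient", "has"): -0.5, ("has", "hypertension"): -1.0,
    ("has", "hypotension"): -1.2, ("with", "fifteen"): -2.0,
}

def lm_score(words, floor=-6.0):
    """Sum of bigram log-probabilities, with a floor for unseen bigrams."""
    return sum(BIGRAM_LOGPROB.get(pair, floor) for pair in zip(words, words[1:]))

def error_alerts(result: str, margin: float = 1.0):
    """For each confusable word in the result, try its alternatives; alert if
    a replacement is nearly as likely (or more likely) under the language model."""
    words = result.split()
    base = lm_score(words)
    alerts = []
    for i, w in enumerate(words):
        for group in CONFUSABLE_SETS:
            if w in group:
                for alt in group - {w}:
                    alt_score = lm_score(words[:i] + [alt] + words[i + 1:])
                    if alt_score >= base - margin:
                        alerts.append((w, alt, alt_score - base))
    return alerts

print(error_alerts("patient has hypotension"))
```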
  • Patent number: 8903725
    Abstract: Method for controlling user access to a service available in a data network and/or to information stored in a user database, in order to protect stored user data from unauthorized access, such that the method comprises the following: input of a user's speech sample to a user data terminal, processing of the user's speech sample in order to obtain a prepared speech sample as well as a current voice profile of the user, comparison of the current voice profile with an initial voice profile stored in an authorization database, and output of an access-control signal to either permit or refuse access, taking into account the result of the comparison step, such that the comparison step includes a quantitative similarity evaluation of the current and the stored voice profiles as well as a threshold-value discrimination of a similarity measure thereby derived, and an access-control signal that initiates permission of access is generated only if a prespecified similarity measure is not exceeded.
    Type: Grant
    Filed: November 25, 2009
    Date of Patent: December 2, 2014
    Assignee: Voice.Trust AG
    Inventor: Christian Pilz
  • Patent number: 8898061
    Abstract: A particular method includes receiving, at a representational state transfer endpoint device, a first user input related to a first speech to text conversion performed by a speech to text transcription service. The method also includes receiving, at the representational state transfer endpoint device, a second user input related to a second speech to text conversion performed by the speech to text transcription service. The method includes processing of the first user input and the second user input at the representational state transfer endpoint device to generate speech to text adjustment information.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: November 25, 2014
    Assignee: Microsoft Corporation
    Inventors: Jeremy Edward Cath, Timothy Edwin Harris, Marc Mercuri, James Oliver Tisdale, III
  • Publication number: 20140337024
    Abstract: A method for speech command detection comprises extracting speech features from a speech signal inputted into a system, converting the speech features into a word sequence, obtaining time durations of speech segments corresponding to the respective non-command words and an acoustic score of each of the command word candidates, calculating rhythm features of the speech signal based on the time durations, and recognizing a speech corresponding to the at least one command word candidate as a speech command directed to the system or a speech not directed to the system based on the acoustic score and the rhythm features. The word sequence comprises at least two successive non-command words and at least one command word candidate. The rhythm features describe a similarity of time durations of speech segments corresponding to the respective non-command words, and/or a similarity of energy variations of the speech segments corresponding to the respective non-command words.
    Type: Application
    Filed: May 9, 2014
    Publication date: November 13, 2014
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Xiang Zuo, Weixiang Hu, Hefei Liu
  • Publication number: 20140337025
    Abstract: The present disclosure discloses a classification method and system for audio files. The classification method includes: constructing a Pitch sequence of the audio files to be classified; calculating eigenvectors of the audio files according to the Pitch sequence of the audio files; and classifying the audio files according to the eigenvectors of the audio files. The present disclosure can achieve automatic classification of the audio files, reduce the cost of classification, and improve the efficiency, flexibility, and intelligence of the classification.
    Type: Application
    Filed: July 25, 2014
    Publication date: November 13, 2014
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Weifeng ZHAO, Shenyuan Li, Liwei Zhang, Jianfeng Chen
  • Patent number: 8886531
    Abstract: An audio fingerprint is generated by transforming an audio sample of a recording to a time-frequency domain and storing each time-frequency pair in a matrix array, detecting a plurality of local maxima for a predetermined number of time slices, selecting a predetermined number of largest-magnitude maxima from the plurality of local maxima detected by said detecting, and generating one or more hash values corresponding to the predetermined number of largest-magnitude maxima.
    Type: Grant
    Filed: January 13, 2010
    Date of Patent: November 11, 2014
    Assignee: Rovi Technologies Corporation
    Inventor: Brian Kenneth Vogel
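The fingerprinting patent above hashes the largest-magnitude time-frequency maxima of each time slice. Below is a simplified Python sketch of that pipeline using an FFT per frame and a hash over the top spectral bins; window sizes, peak counts, and the hashing scheme are illustrative assumptions, not the patented encoding.

```python
import hashlib
import numpy as np

def fingerprint(samples: np.ndarray, win: int = 256, hop: int = 128,
                peaks_per_slice: int = 3) -> list:
    """Hash the strongest spectral peaks of each time slice of a recording."""
    hashes = []
    for start in range(0, len(samples) - win, hop):
        frame = samples[start:start + win] * np.hanning(win)
        mags = np.abs(np.fft.rfft(frame))
        # Keep the largest-magnitude bins of this time slice as its landmarks.
        peaks = np.argsort(mags)[-peaks_per_slice:]
        key = f"{start // hop}:" + ",".join(map(str, sorted(peaks)))
        hashes.append(hashlib.sha1(key.encode()).hexdigest()[:10])
    return hashes

# Two recordings of the same tone mixture should share most hash values.
t = np.arange(8000) / 8000
clean = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)
noisy = clean + 0.01 * np.random.default_rng(0).standard_normal(len(t))
a, b = fingerprint(clean), fingerprint(noisy)
print(f"matching hashes: {sum(x == y for x, y in zip(a, b))}/{len(a)}")
```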
  • Publication number: 20140257809
    Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
    Type: Application
    Filed: May 22, 2014
    Publication date: September 11, 2014
    Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
  • Publication number: 20140249817
    Abstract: Techniques for using both speaker-identification information and other characteristics associated with received voice commands to determine how and whether to respond to the received voice commands. A user may interact with a device through speech by providing voice commands. After beginning an interaction with the user, the device may detect subsequent speech, which may originate from the user, from another user, or from another source. The device may then use speaker-identification information and other characteristics associated with the speech to attempt to determine whether or not the user interacting with the device uttered the speech. The device may then interpret the speech as a valid voice command and may perform a corresponding operation in response to determining that the user did indeed utter the speech. If the device determines that the user did not utter the speech, however, then the device may refrain from taking action on the speech.
    Type: Application
    Filed: March 4, 2013
    Publication date: September 4, 2014
    Applicant: RAWLES LLC
    Inventor: RAWLES LLC
  • Patent number: 8781830
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: July 15, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: William K. Bodin, Michael J. Burkhart, Daniel G. Eisenhauer, Thomas J. Watson, Daniel M. Schumacher
  • Patent number: 8781825
    Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: July 15, 2014
    Assignee: Sensory, Incorporated
    Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
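The validation step above checks whether duration, energy, and pitch are consistent across the component sounds of the spoken utterance before accepting the recognition result. A minimal Python sketch follows, using the coefficient of variation as the consistency test; the statistic and threshold are assumptions for illustration (the patent only requires at least one such parameter).

```python
import statistics

def consistent(values, max_cv: float = 0.5) -> bool:
    """A parameter is consistent if its coefficient of variation across the
    component sounds of the utterance stays below a threshold."""
    mean = statistics.mean(values)
    return (statistics.pstdev(values) / mean) <= max_cv if mean else False

def validate(result: str, durations, energies, pitches) -> bool:
    """Accept the recognizer's hypothesis only if the per-phone durations,
    energies, and pitches all look like a single deliberate utterance."""
    return all(consistent(v) for v in (durations, energies, pitches))

# Per-phone measurements for a hypothesized wake phrase.
print(validate("hello computer",
               durations=[0.09, 0.11, 0.10, 0.12],
               energies=[0.8, 0.7, 0.9, 0.75],
               pitches=[180, 190, 175, 185]))
```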
  • Patent number: 8768687
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating direct speech messages based on voice commands that include indirect speech messages. In one aspect, a method includes receiving a voice input corresponding to an utterance. A determination is made whether a transcription of the utterance includes a command to initiate a communication to a user and a segment that is classified as indirect speech. In response to determining that the transcription of the utterance includes the command and the segment that is classified as indirect speech, the segment that is classified as indirect speech is provided as input to a machine translator. In response to providing the segment that is classified as indirect speech to the machine translator, a direct speech segment is received from the machine translator. A communication is initiated that includes the direct speech segment.
    Type: Grant
    Filed: April 29, 2013
    Date of Patent: July 1, 2014
    Assignee: Google Inc.
    Inventors: Matthias Quasthoff, Simon Tickner
  • Patent number: 8768711
    Abstract: A method of voice-enabling an application for command and control and content navigation can include the application dynamically generating a markup language fragment specifying a command and control and content navigation grammar for the application, instantiating an interpreter from a voice library, and providing the markup language fragment to the interpreter. The method also can include the interpreter processing a speech input using the command and control and content navigation grammar specified by the markup language fragment and providing an event to the application indicating an instruction representative of the speech input.
    Type: Grant
    Filed: June 17, 2004
    Date of Patent: July 1, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
  • Publication number: 20140172427
    Abstract: A method for processing messages pertaining to an event includes receiving a plurality of messages pertaining to the event from electronic communication devices associated with a plurality of observers of the event, generating a first message stream that includes only a portion of the plurality of messages corresponding to a first participant in the event, identifying a first sub-event in the first message stream with reference to a time distribution of messages and content distribution of messages in the first message stream, generating a sub-event summary with reference to a portion of the plurality of messages in the first message stream that are associated with the first sub-event, and transmitting the sub-event summary to a plurality of electronic communication devices associated with a plurality of users who are not observers of the event.
    Type: Application
    Filed: December 13, 2013
    Publication date: June 19, 2014
    Applicant: Robert Bosch GmbH
    Inventors: Fei Liu, Fuliang Weng, Chao Shen, Lin Zhao
  • Patent number: 8744849
    Abstract: A microphone-array-based speech recognition system combines a noise cancelling technique for cancelling noise of input speech signals from an array of microphones, according to at least an inputted threshold. The system receives noise-cancelled speech signals outputted by a noise masking module through at least a speech model and at least a filler model, then computes a confidence measure score with the at least a speech model and the at least a filler model for each threshold and each noise-cancelled speech signal, and adjusts the threshold to continue the noise cancelling for achieving a maximum confidence measure score, thereby outputting a speech recognition result related to the maximum confidence measure score.
    Type: Grant
    Filed: October 12, 2011
    Date of Patent: June 3, 2014
    Assignee: Industrial Technology Research Institute
    Inventor: Hsien-Cheng Liao
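The system above sweeps the noise-cancelling threshold and keeps the recognition result whose speech-model versus filler-model confidence measure is maximal. The Python sketch below shows that outer loop; the recognize() stand-in and its scoring are placeholders, since the actual acoustic models and noise masking are outside the scope of the abstract.

```python
import numpy as np

def confidence_measure(speech_score: float, filler_score: float) -> float:
    """Log-likelihood ratio between the speech model and the filler model."""
    return speech_score - filler_score

def recognize(signal: np.ndarray, threshold: float) -> tuple:
    """Stand-in for noise masking plus recognition: pretend that a moderate
    threshold removes the most noise without clipping speech."""
    speech_score = -abs(threshold - 0.4) * 10.0    # peaks at threshold 0.4
    filler_score = -3.0
    return f"hypothesis@{threshold:.1f}", confidence_measure(speech_score, filler_score)

def best_result(signal: np.ndarray, thresholds=np.linspace(0.0, 1.0, 11)):
    """Sweep the noise-masking threshold and keep the recognition result
    with the maximum confidence measure score."""
    results = [recognize(signal, t) for t in thresholds]
    return max(results, key=lambda r: r[1])

signal = np.zeros(16000)     # placeholder multi-microphone capture
print(best_result(signal))
```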