Similarity Patents (Class 704/239)
  • Patent number: 9830039
    Abstract: A wizard control panel may be used by a human wizard to adjust the operation of a Natural Language (NL) conversational system during a real-time dialog flow. Input to the wizard control panel is detected and used to interrupt/change an automatic operation of one or more of the NL conversational system components used during the flow. For example, the wizard control panel may be used to adjust results determined by an Automated Speech Recognition (ASR) component, a Natural Language Understanding (NLU) component, a Dialog Manager (DM) component, and a Natural Language Generation (NLG) component before the results are used to perform an automatic operation within the flow. A timeout may also be set such that when the timeout expires, the conversational system performs an automated operation by using the results shown in the wizard control panel (edited/not edited).
    Type: Grant
    Filed: March 4, 2013
    Date of Patent: November 28, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lisa Stifelman, Dilek Hakkani-Tur, Larry Heck, Madhusudan Chinthakunta
  • Patent number: 9799333
    Abstract: A system and method are provided for performing speech processing. A system includes an audio detection system configured to receive a signal including speech and a memory having stored therein a database of keyword models forming an ensemble of filters associated with each keyword in the database. A processor is configured to receive the signal including speech from the audio detection system, decompose the signal including speech into a sparse set of phonetic impulses, and access the database of keywords and convolve the sparse set of phonetic impulses with the ensemble of filters. The processor is further configured to identify keywords within the signal including speech based on a result of the convolution and to control operation of the electronic system based on the keywords identified.
    Type: Grant
    Filed: August 31, 2015
    Date of Patent: October 24, 2017
    Assignee: The Johns Hopkins University
    Inventors: Keith Kintzley, Aren Jansen, Hynek Hermansky, Kenneth Church
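The patent above (9799333) matches keywords by convolving a sparse phonetic impulse representation of the speech with an ensemble of keyword filters. Below is a minimal, illustrative Python sketch of that matched-filter idea on toy data; the phone inventory, filter construction, and scores are invented for illustration and are not taken from the patent.

```python
import numpy as np

# Toy phone inventory and a sparse phonetic impulse train:
# one (frame, phone, weight) impulse per detected phonetic event.
PHONES = ["k", "ae", "t", "d", "ao", "g"]
impulses = [(10, "k", 0.9), (15, "ae", 0.8), (20, "t", 0.7),   # "cat"
            (40, "d", 0.6), (45, "ao", 0.9), (50, "g", 0.8)]   # "dog"

T = 60  # number of frames in the utterance
sparse = np.zeros((len(PHONES), T))
for t, ph, w in impulses:
    sparse[PHONES.index(ph), t] = w

def keyword_filter(phone_seq, frames_per_phone=5):
    """Crude matched filter: the i-th phone of the keyword is expected
    `frames_per_phone` frames after the previous one."""
    span = frames_per_phone * (len(phone_seq) - 1) + 1
    filt = np.zeros((len(PHONES), span))
    for i, ph in enumerate(phone_seq):
        filt[PHONES.index(ph), i * frames_per_phone] = 1.0
    return filt

def keyword_score(sparse_train, filt):
    """Slide the filter over the impulse train and sum the matched weights;
    the peak response over time is the keyword score."""
    span = filt.shape[1]
    responses = np.zeros(sparse_train.shape[1] - span + 1)
    for start in range(len(responses)):
        responses[start] = np.sum(sparse_train[:, start:start + span] * filt)
    return responses.max(), int(responses.argmax())

for word, phones in [("cat", ["k", "ae", "t"]), ("dog", ["d", "ao", "g"])]:
    score, frame = keyword_score(sparse, keyword_filter(phones))
    print(f"{word}: score={score:.2f} near frame {frame}")
```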
  • Patent number: 9685153
    Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include a top recognition result and one or more alternative recognition results) based on an analysis of a speech input are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for the domain. In some embodiments, one or more of the recognition results may be evaluated to determine whether the result(s) include one or more words or phrases that, when included in a result, would change a meaning of the result in a manner that would be significant for the domain.
    Type: Grant
    Filed: May 15, 2015
    Date of Patent: June 20, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
  • Patent number: 9684433
    Abstract: A method includes identifying individuals that are affiliated with a user. The method includes incorporating trusted devices associated with the identified individuals into an event monitor network that is configured to monitor for an occurrence of a monitored event. The method includes identifying a particular input that suggests the occurrence of the monitored event. The method includes communicating to the trusted devices, an input sample that is used for recognition of the particular input from general input that is measured by sensors of the trusted devices. The method includes receiving from at least one of the trusted devices, an event message that indicates the particular input is observed by at least one of the sensors. In response to the event message, the method includes communicating to a user interface of a user device associated with the user, an alarm message that indicates the occurrence of the monitored event.
    Type: Grant
    Filed: December 30, 2014
    Date of Patent: June 20, 2017
    Assignee: EBAY INC.
    Inventor: Kamal Zamer
  • Patent number: 9558335
    Abstract: A method includes receiving, from a user via an electronic device, input representing a password to be utilized for an account; automatically determining, utilizing a processor, a complexity value for the input password; automatically determining, based on the determined complexity value, security settings for the account; receiving, from a user via an electronic device, input representing an attempt to log in to the account, the input including an attempted password; automatically determining that the attempted password does not match the password to be utilized for the account; and determining a course of action to take in response to that determination, the course of action being determined based at least in part on the automatically determined security settings for the account.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: January 31, 2017
    Assignee: ALLSCRIPTS SOFTWARE, LLC
    Inventors: David Thomas Windell, Todd Michael Eischeid, Scott David Bower
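The abstract above (9558335) ties account security settings to an automatically computed password complexity value and then uses those settings to decide what to do after a failed login. The Python sketch below shows one plausible shape of that flow; the scoring formula, thresholds, and lockout values are assumptions for illustration only.

```python
import string

def complexity_value(password: str) -> int:
    """Toy complexity score: length plus one point per character class used."""
    classes = [
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ]
    return len(password) + sum(classes)

def security_settings(complexity: int) -> dict:
    """Map the complexity value to account security settings: weaker
    passwords get fewer allowed attempts and a longer lockout."""
    if complexity >= 16:
        return {"max_attempts": 10, "lockout_minutes": 5}
    if complexity >= 10:
        return {"max_attempts": 5, "lockout_minutes": 15}
    return {"max_attempts": 3, "lockout_minutes": 60}

def on_failed_login(failed_attempts: int, settings: dict) -> str:
    """Choose a course of action after a non-matching password attempt."""
    if failed_attempts >= settings["max_attempts"]:
        return f"lock account for {settings['lockout_minutes']} minutes"
    return "allow retry"

pw = "Tr0ub4dor&3"
settings = security_settings(complexity_value(pw))
print(settings, on_failed_login(3, settings))
```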
  • Patent number: 9476718
    Abstract: A vehicle navigation system may send and receive communications, such as text messages. Speech recognition may generate a text message without affecting a driver's control of the vehicle. A user may audibly control the navigation system and generate a text message through a speech recognition element. A microphone may record a user's voice, which is then transformed into a text message for transmission. The message may be recorded sentence-by-sentence, word-by-word, or letter-by-letter. The recorded text message may be visually or audibly presented to the user before transmission.
    Type: Grant
    Filed: July 10, 2007
    Date of Patent: October 25, 2016
    Assignee: HARMAN BECKER AUTOMOTIVE SYSTEMS GMBH
    Inventor: Mirko Herforth
  • Patent number: 9477753
    Abstract: Systems and methods for processing a query include determining a plurality of sets of match candidates for a query using a processor, each of the plurality of sets of match candidates being independently determined from a plurality of diverse word lattice generation components of different type. The plurality of sets of match candidates is merged by generating a first score for each match candidate to provide a merged set of match candidates. A second score is computed for each match candidate of the merged set based upon features of that match candidate. The first score and the second score are combined to provide a final set of match candidates as matches to the query.
    Type: Grant
    Filed: March 12, 2013
    Date of Patent: October 25, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian E. D. Kingsbury, Hong-Kwang Jeff Kuo, Lidia Luminita Mangu, Hagen Soltau
  • Patent number: 9449126
    Abstract: A user interface for presenting a set of related pages of an electronic content work for viewing at the same time. The pages are sized according to a target format for presentation of the electronic content work, and may also be formatted according to user-defined font and zoom criteria. Each of the related pages comprises a media object, for example a markup language object. Responsive to user manipulation of a presentation criterion for the set of related pages, the set of related pages is reformatted and presented in near real-time. In some instances, a user may manipulate controls of the user interface to isolate a content object included within the set of related pages, have information regarding that content object presented, and even edit the content object in-line with the present view.
    Type: Grant
    Filed: June 1, 2012
    Date of Patent: September 20, 2016
    Assignee: Inkling Systems, Inc.
    Inventors: Thomas Charles Genoni, Peter S. Cho, Norris Hung, Eric Todd Lovett, Huan Zhao
  • Patent number: 9437187
    Abstract: A search string acquiring unit acquires a search string. A converting unit converts the search string into a phoneme sequence. A time length deriving unit derives the spoken time length of the voice corresponding to the search string. A zone designating unit designates a likelihood acquisition zone in a target voice signal. A likelihood acquiring device acquires a likelihood indicating how likely it is that the likelihood acquisition zone is a zone in which the voice corresponding to the search string is spoken. A repeating unit changes the likelihood acquisition zone designated by the zone designating unit, and repeats the process of the zone designating unit and the likelihood acquiring device. An identifying unit identifies, from the target voice signal, estimated zones in which the voice corresponding to the search string is estimated to be spoken, on the basis of the likelihoods acquired for each of the likelihood acquisition zones.
    Type: Grant
    Filed: January 23, 2015
    Date of Patent: September 6, 2016
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Hiroyasu Ide
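Patents 9437187 and 9431007 (the next entry) both score sliding "likelihood acquisition zones" of a target voice signal against the phoneme sequence of a search string. The toy Python sketch below illustrates that scan, assuming a precomputed frame-by-phoneme log-probability matrix; the data, zone length, and scoring rule are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

PHONEMES = ["a", "k", "i", "s", "o"]
query_phonemes = ["a", "k", "i"]      # phoneme sequence for the search string
frames_per_phoneme = 4                # derived spoken length of the query
zone_len = frames_per_phoneme * len(query_phonemes)

# Toy frame-level log-probabilities of each phoneme in the target signal
# (rows stop being normalized once we plant the query; fine for a sketch).
num_frames = 100
log_probs = np.log(rng.dirichlet(np.ones(len(PHONEMES)), size=num_frames))

# Plant a stretch where the query phonemes are strongly supported.
for i, ph in enumerate(query_phonemes):
    idx = PHONEMES.index(ph)
    log_probs[40 + i * frames_per_phoneme:40 + (i + 1) * frames_per_phoneme, idx] = np.log(0.9)

def zone_likelihood(start: int) -> float:
    """Likelihood that the query is spoken in the zone starting at `start`:
    sum the log-probability of the expected phoneme in every frame."""
    total = 0.0
    for i, ph in enumerate(query_phonemes):
        idx = PHONEMES.index(ph)
        seg = log_probs[start + i * frames_per_phoneme:start + (i + 1) * frames_per_phoneme, idx]
        total += seg.sum()
    return total

scores = [zone_likelihood(s) for s in range(num_frames - zone_len + 1)]
best = int(np.argmax(scores))
print(f"estimated zone: frames {best}..{best + zone_len} (score {scores[best]:.1f})")
```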
  • Patent number: 9438578
    Abstract: A biometric authentication system is disclosed that provides authentication capability using biometric data in connection with a challenge for parties engaging in digital communications such as digital text-oriented, interactive digital communications. End-user systems may be coupled to devices that include biometric data capture devices such as retina scanners, fingerprint recorders, cameras, microphones, ear scanners, DNA profilers, etc., so that biometric data of a communicating party may be captured and used for authentication purposes.
    Type: Grant
    Filed: August 17, 2013
    Date of Patent: September 6, 2016
    Assignee: AT&T INTELLECTUAL PROPERTY II, L.P.
    Inventors: Pradeep K. Bansal, Lee Begeja, Carroll W. Creswell, Jeffrey Farah, Benjamin J. Stern, Jay Wilpon
  • Patent number: 9431007
    Abstract: In a voice search device, a processor acquires a search word, converts the search word into a phoneme sequence, acquires, for each frame, an output probability of a feature quantity of a target voice signal being output from each phoneme included in the phoneme sequence, and executes relative calculation of the output probability acquired from each phoneme, based on an output probability acquired from another phoneme included in the phoneme sequence. In addition, the processor successively designates likelihood acquisition zones, acquires a likelihood indicating how likely it is that a designated likelihood acquisition zone is a zone in which the voice corresponding to the search word is spoken, and identifies from the target voice signal an estimated zone for which the voice corresponding to the search word is estimated to be spoken, based on the acquired likelihood.
    Type: Grant
    Filed: January 15, 2015
    Date of Patent: August 30, 2016
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Hiroki Tomita
  • Patent number: 9330358
    Abstract: A system and method include comparing a context to cases stored in a case base, where the cases include Boolean and non-Boolean independent weight variables and a domain-specific dependency variable. The case and context independent weight variables are normalized and a normalized weight vector is determined for the case base. A match between the received context and each case of the case base is determined using the normalized context and case variables and the normalized weight vector. A skew value is determined for each category of domain-specific dependency variables and the category of domain-specific dependency variables having the minimal skew value is selected. The dependency variable associated with the selected category is then displayed to a user.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: May 3, 2016
    Assignee: THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SECRETARY OF THE NAVY
    Inventor: Stuart H. Rubin
  • Patent number: 9330656
    Abstract: A speech dialogue system generates a response sentence in a way that improves the efficiency of the dialogue with the user, based on a result of estimating an attribute of a proper name in an utterance of a user. The system includes a database attribute estimation unit to estimate the attribute of the input proper name by utilizing a database, and a web attribute estimation unit to estimate an attribute of an input proper name by utilizing information on the web. A reliability integration unit calculates an integrated reliability for each of the possible attributes obtained from the estimation by the two units, by integrating the reliabilities of their respective estimations. A response generation unit generates a response sentence to an input utterance based on the integrated reliabilities of the possible attributes.
    Type: Grant
    Filed: February 26, 2014
    Date of Patent: May 3, 2016
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Mikio Nakano, Kazunori Komatani, Tsugumi Otsuka
  • Patent number: 9311932
    Abstract: A method, system, and computer program product for adaptive pause detection in speech recognition are provided in the illustrative embodiments. A speech stream comprising an audio signal of a speech is received. A first point in the speech stream is marked with a beginning time stamp. After the first point, a pause is detected in the speech stream. The pause is of a duration at least equal to a pause duration threshold. A second point after the pause in the speech stream is marked with an ending time stamp. A portion of the speech stream between the beginning and the ending time stamps forms a first speech segment. A speech rate of the first speech segment is computed using a number of words in the first speech segment, the beginning time stamp, and the ending time stamp. The pause duration threshold is adjusted according to the first speech segment's speech rate.
    Type: Grant
    Filed: January 23, 2014
    Date of Patent: April 12, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: William S. Carter
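The adaptive pause detection described above computes a speech rate for the most recent segment and uses it to adjust the pause duration threshold. A minimal Python sketch of that feedback loop follows; the reference rate and the scaling rule are assumptions, not the patent's formula.

```python
def speech_rate(num_words: int, begin_ts: float, end_ts: float) -> float:
    """Words per second for the segment between the two time stamps."""
    return num_words / max(end_ts - begin_ts, 1e-6)

def adjust_pause_threshold(current_threshold: float, rate: float,
                           reference_rate: float = 2.5) -> float:
    """Scale the pause-duration threshold inversely with speech rate:
    fast talkers get a shorter threshold, slow talkers a longer one."""
    return current_threshold * (reference_rate / max(rate, 1e-6))

# First segment: 12 words between t=0.0s and t=4.0s -> 3.0 words/s.
rate = speech_rate(12, 0.0, 4.0)
threshold = adjust_pause_threshold(0.6, rate)
print(f"rate={rate:.1f} w/s, next pause threshold={threshold:.2f}s")
```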
  • Patent number: 9262693
    Abstract: An object detection apparatus includes a storage section storing a plurality of selection patterns as combinations of one of a plurality of recognition dictionaries and one of a plurality of image recognition algorithms, a specifying means for specifying at least one of a distance from a position at which an input image is taken to a target corresponding to the detection object within the input image and a state of light of the input image, a selection means for selecting one from the plurality of the selection patterns based on at least one of the distance and the state of the light specified by the specifying means, and a detection means for detecting the detection object within the input image by performing an image recognition process using the recognition dictionary and the image recognition algorithm included in the selection pattern selected by the selection means.
    Type: Grant
    Filed: April 16, 2014
    Date of Patent: February 16, 2016
    Assignee: DENSO CORPORATION
    Inventor: Yasunori Kamiya
  • Patent number: 9197416
    Abstract: In a verification apparatus, a biometric information acquisition unit acquires a plurality of biometric information pieces from an object. A first verification unit calculates, as a verification score, the similarity between a biometric information piece and a verification information piece, and compares the calculated verification score with a first determination value to determine whether the biometric information piece matches the verification information piece. When the verification fails, a second verification unit performs verification on the plurality of biometric information pieces having a predetermined relationship, using the verification information piece and a second determination value which defines a less stringent criterion than the first determination value.
    Type: Grant
    Filed: August 8, 2013
    Date of Patent: November 24, 2015
    Assignee: FUJITSU FRONTECH LIMITED
    Inventor: Shinichi Eguchi
  • Patent number: 9147395
    Abstract: The present disclosure relates to a mobile terminal and a voice recognition method thereof. The voice recognition method may include receiving a user's voice; providing the received voice to a first voice recognition engine provided in a server and a second voice recognition engine provided in the mobile terminal; acquiring first voice recognition data as a result of recognizing the received voice by the first voice recognition engine; acquiring second voice recognition data as a result of recognizing the received voice by the second voice recognition engine; estimating a function corresponding to the user's intention based on at least one of the first and the second voice recognition data; calculating a similarity between the first and the second voice recognition data when personal information is required for the estimated function; and selecting either one of the first and the second voice recognition data based on the calculated similarity.
    Type: Grant
    Filed: June 21, 2013
    Date of Patent: September 29, 2015
    Assignee: LG ELECTRONICS INC.
    Inventors: Juhee Kim, Hyunseob Lee, Joonyup Lee, Jungkyu Choi
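The method above runs a server-side and an on-device recognizer in parallel, measures the similarity of their outputs when the estimated function involves personal information, and selects one result. Below is one hedged reading of that selection logic in Python, using a character-level similarity ratio; the threshold and the preference for the on-device result are illustrative assumptions.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio between two transcripts (0..1)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def select_result(server_text: str, device_text: str,
                  needs_personal_info: bool, threshold: float = 0.8) -> str:
    """If the estimated function needs personal information, prefer the
    on-device result when the two transcripts agree closely enough;
    otherwise fall back to the (typically more accurate) server result."""
    if needs_personal_info and similarity(server_text, device_text) >= threshold:
        return device_text
    return server_text

print(select_result("call john smith", "call jon smith", needs_personal_info=True))
```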
  • Patent number: 9142212
    Abstract: An apparatus and method for recognizing a voice command for use in an interactive voice user interface are provided. The apparatus includes a command intention belief generation unit that is configured to recognize a first voice command and that may generate one or more command intention beliefs for the first voice command. The apparatus also includes a command intention belief update unit that is configured to update each of the command intention beliefs based on a system response to the first voice command and a second voice command. The apparatus also includes a command intention belief selection unit that is configured to select one of the updated command intention beliefs for the first voice command. The apparatus also includes an operation signal output unit that is configured to select a final command intention from the selected updated command intention belief and to output an operation signal based on the selected final command intention.
    Type: Grant
    Filed: April 26, 2011
    Date of Patent: September 22, 2015
    Inventors: Chi-Youn Park, Byung-Kwan Kwak, Jeong-Su Kim, Jeong-Mi Cho
  • Patent number: 9064493
    Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include a top recognition result and one or more alternative recognition results) based on an analysis of a speech input are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for the domain. In some embodiments, one or more of the recognition results may be evaluated to determine whether the result(s) include one or more words or phrases that, when included in a result, would change a meaning of the result in a manner that would be significant for the domain.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: June 23, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
  • Patent number: 9043207
    Abstract: The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
    Type: Grant
    Filed: November 12, 2009
    Date of Patent: May 26, 2015
    Assignee: Agnitio S.L.
    Inventors: Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez, Marta Garcia Gomar
  • Publication number: 20150127342
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, an utterance vector that is derived from an utterance is obtained. Hash values are determined for the utterance vector according to multiple different hash functions. A set of speaker vectors from a plurality of hash tables is determined using the hash values, where each speaker vector was derived from one or more utterances of a respective speaker. The speaker vectors in the set are compared with the utterance vector. A speaker vector is selected based on comparing the speaker vectors in the set with the utterance vector.
    Type: Application
    Filed: October 24, 2014
    Publication date: May 7, 2015
    Inventors: Matthew Sharifi, Ignacio Lopez Moreno, Ludwig Schmidt
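The publication above describes locality-sensitive hashing for speaker identification: hash the utterance vector under several hash functions, collect candidate speaker vectors from the corresponding hash tables, and compare. The Python sketch below shows a generic random-hyperplane LSH version of that pipeline; the dimensions, table counts, and cosine comparison are illustrative choices, not details from the application.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, NUM_TABLES, NUM_BITS = 16, 4, 8

# One random-hyperplane hash function per table.
planes = rng.standard_normal((NUM_TABLES, NUM_BITS, DIM))

def hash_vector(v: np.ndarray, table: int) -> int:
    """Sign of the projection onto each hyperplane gives one bit of the key."""
    bits = (planes[table] @ v) > 0
    return int("".join("1" if b else "0" for b in bits), 2)

# Build one hash table per hash function from enrolled speaker vectors.
speakers = {name: rng.standard_normal(DIM) for name in ["alice", "bob", "carol"]}
tables = [dict() for _ in range(NUM_TABLES)]
for name, vec in speakers.items():
    for t in range(NUM_TABLES):
        tables[t].setdefault(hash_vector(vec, t), []).append((name, vec))

def identify(utterance_vec: np.ndarray) -> str:
    """Gather candidates from all tables, then pick the closest by cosine."""
    candidates = {}
    for t in range(NUM_TABLES):
        for name, vec in tables[t].get(hash_vector(utterance_vec, t), []):
            candidates[name] = vec

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    if not candidates:
        return "unknown"
    return max(candidates, key=lambda n: cosine(candidates[n], utterance_vec))

# An utterance vector close to bob's enrolled vector should usually retrieve bob.
query = speakers["bob"] + 0.05 * rng.standard_normal(DIM)
print(identify(query))
```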
  • Patent number: 9026437
    Abstract: A location determination system includes a first mobile terminal and a second mobile terminal. The first mobile terminal includes a first processor to acquire a first sound signal, analyze the first sound signal to obtain a first analysis result, and transmit the first analysis result. The second mobile terminal includes a second processor to acquire a second sound signal, analyze the second sound signal to obtain a second analysis result, receive the first analysis result from the first mobile terminal, compare the second analysis result with the first analysis result to obtain a comparison result, and determine whether the first mobile terminal is located in an area in which the second mobile terminal is located, based on the comparison result.
    Type: Grant
    Filed: March 26, 2012
    Date of Patent: May 5, 2015
    Assignee: Fujitsu Limited
    Inventor: Eiji Hasegawa
  • Patent number: 9020816
    Abstract: A method, system and apparatus are shown for identifying non-language speech sounds in a speech or audio signal. An audio signal is segmented and feature vectors are extracted from the segments of the audio signal. The segment is classified using a hidden Markov model (HMM) that has been trained on sequences of these feature vectors. Post-processing components can be utilized to enhance classification. An embodiment is described in which the hidden Markov model is used to classify a segment as a language speech sound or one of a variety of non-language speech sounds. Another embodiment is described in which the hidden Markov model is trained using discriminative learning.
    Type: Grant
    Filed: August 13, 2009
    Date of Patent: April 28, 2015
    Assignee: 21CT, Inc.
    Inventor: Matthew McClain
  • Patent number: 9015045
    Abstract: A method for refining a search is provided. Embodiments may include receiving a first speech signal corresponding to a first utterance and receiving a second speech signal corresponding to a second utterance, wherein the second utterance is a refinement to the first utterance. Embodiments may also include identifying information associated with the first speech signal as first speech signal information and identifying information associated with the second speech signal as second speech signal information. Embodiments may also include determining a first quantity of search results based upon the first speech signal information and determining a second quantity of search results based upon the second speech signal information.
    Type: Grant
    Filed: March 11, 2013
    Date of Patent: April 21, 2015
    Assignee: Nuance Communications, Inc.
    Inventor: Jean-Francois Lavallee
  • Patent number: 9009042
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating direct speech messages based on voice commands that include indirect speech messages. In one aspect, a method includes receiving a voice input corresponding to an utterance. A determination is made whether a transcription of the utterance includes a command to initiate a communication to a user and a segment that is classified as indirect speech. In response to determining that the transcription of the utterance includes the command and the segment that is classified as indirect speech, the segment that is classified as indirect speech is provided as input to a machine translator. In response to providing the segment that is classified as indirect speech to the machine translator, a direct speech segment is received from the machine translator. A communication is initiated that includes the direct speech segment.
    Type: Grant
    Filed: June 13, 2014
    Date of Patent: April 14, 2015
    Assignee: Google Inc.
    Inventors: Matthias Quasthoff, Simon Tickner
  • Patent number: 9009039
    Abstract: Technologies are described herein for noise adaptive training to achieve robust automatic speech recognition. Through the use of these technologies, a noise adaptive training (NAT) approach may use both clean and corrupted speech for training. The NAT approach may normalize the environmental distortion as part of the model training. A set of underlying “pseudo-clean” model parameters may be estimated directly. This may be done without point estimation of clean speech features as an intermediate step. The pseudo-clean model parameters learned from the NAT technique may be used with a Vector Taylor Series (VTS) adaptation. Such adaptation may support decoding noisy utterances during the operating phase of an automatic voice recognition system.
    Type: Grant
    Filed: June 12, 2009
    Date of Patent: April 14, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Lewis Seltzer, James Garnet Droppo, Ozlem Kalinli, Alejandro Acero
  • Patent number: 8996373
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in a depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 31, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
  • Patent number: 8990086
    Abstract: A recognition confidence measurement method, medium, and system are provided that can more accurately determine whether an input speech signal is in-vocabulary by extracting an optimum number of candidates that match a phone string extracted from the input speech signal and estimating a lexical distance between the extracted candidates. A recognition confidence measurement method includes: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string against phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is in-vocabulary, based on the lexical distance.
    Type: Grant
    Filed: July 31, 2006
    Date of Patent: March 24, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-Bae Jeong, Nam Hoon Kim, Ick Sang Han, In Jeong Choi, Gil Jin Jang, Jae-Hoon Jeong
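The confidence measurement above decides whether an input is in-vocabulary from the lexical distance between the top matching candidates. One plausible reading, sketched in Python below, uses a normalized edit distance over candidate phoneme strings; the acceptance ratio and the decision rule are assumptions for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two phoneme strings."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        row = [i]
        for j, cb in enumerate(b, 1):
            row.append(min(prev_row[j] + 1,                  # deletion
                           row[j - 1] + 1,                   # insertion
                           prev_row[j - 1] + (ca != cb)))    # substitution
        prev_row = row
    return prev_row[-1]

def in_vocabulary(candidates, accept_ratio=0.34):
    """Decide in/out-of-vocabulary from the lexical spread of the matched
    candidates: tightly clustered candidates suggest a genuine dictionary hit."""
    top = candidates[0]
    dists = [edit_distance(top, c) / max(len(top), len(c)) for c in candidates[1:]]
    mean_dist = sum(dists) / len(dists)
    return mean_dist <= accept_ratio, mean_dist

# Candidate phoneme strings matched against the dictionary for one utterance.
accepted, spread = in_vocabulary(["kamera", "kamora", "kimera"])
print(accepted, round(spread, 2))
```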
  • Publication number: 20150081296
    Abstract: A method for activating a voice assistant function in a mobile device is disclosed. The method includes receiving an input sound stream by a sound sensor and determining a context of the mobile device. The method may determine the context based on the input sound stream. For determining the context, the method may also obtain data indicative of the context of the mobile device from at least one of an acceleration sensor, a location sensor, an illumination sensor, a proximity sensor, a clock unit, and a calendar unit in the mobile device. In this method, a threshold for activating the voice assistant function is adjusted based on the context. The method detects a target keyword from the input sound stream based on the adjusted threshold. If the target keyword is detected, the method activates the voice assistant function.
    Type: Application
    Filed: September 17, 2013
    Publication date: March 19, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Minsub Lee, Taesu Kim, Kyu Woong Hwang, Minho Jin
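The application above adjusts the keyword-detection threshold according to the device context before deciding whether to activate the voice assistant. A minimal Python sketch of that idea follows; the context labels, base threshold, and adjustment values are invented for illustration.

```python
def keyword_threshold(base: float, context: str) -> float:
    """Adjust the detection threshold for the activation keyword based on
    the device context: lower it where false rejections hurt most (e.g. in a
    car), raise it where false activations are costly (e.g. in a meeting)."""
    adjustments = {"in_car": -0.10, "meeting": +0.15, "pocket": +0.05, "idle": 0.0}
    return base + adjustments.get(context, 0.0)

def maybe_activate(keyword_score: float, context: str, base: float = 0.70) -> bool:
    """Activate the voice assistant only if the keyword score clears the
    context-adjusted threshold."""
    return keyword_score >= keyword_threshold(base, context)

print(maybe_activate(0.72, "meeting"))   # threshold raised to 0.85 -> False
print(maybe_activate(0.72, "in_car"))    # threshold lowered to 0.60 -> True
```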
  • Patent number: 8983840
    Abstract: Techniques, an apparatus, and an article of manufacture for identifying one or more utterances that are likely to carry the intent of a speaker, from a conversation between two or more parties. A method includes obtaining an input of a set of utterances in chronological order from a conversation between two or more parties, computing an intent confidence value of each utterance by summing intent confidence scores from each of the constituent words of the utterance, wherein intent confidence scores capture each word's influence on the subsequent utterances in the conversation based on (i) the uniqueness of the word in the conversation and (ii) the number of times the word subsequently occurs in the conversation, and generating a ranked order of the utterances from highest to lowest intent confidence value, wherein the highest intent confidence value corresponds to the utterance which is most likely to carry the intent of the speaker.
    Type: Grant
    Filed: June 19, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Om D. Deshmukh, Sachindra Joshi, Saket Saurabh, Ashish Verma
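The abstract above scores each utterance by summing per-word intent confidence scores built from a word's uniqueness in the conversation and its later recurrence. The short Python sketch below implements one simple version of that scoring on a toy conversation; the exact weighting is an assumption, not the patented formula.

```python
from collections import Counter

conversation = [
    "hello how are you",
    "i want to upgrade my data plan",
    "sure which plan are you on",
    "the basic plan i want more data",
    "ok upgrading your data plan now",
]

def intent_scores(utterances):
    """Score each utterance by summing per-word scores that combine the
    word's rarity in the conversation with how often it recurs later."""
    counts = Counter(w for u in utterances for w in u.split())
    scored = []
    for idx, utt in enumerate(utterances):
        later = Counter(w for u in utterances[idx + 1:] for w in u.split())
        score = sum((1.0 / counts[w]) * later[w] for w in utt.split())
        scored.append((score, utt))
    return sorted(scored, reverse=True)

for score, utt in intent_scores(conversation):
    print(f"{score:.2f}  {utt}")
```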
  • Publication number: 20150073791
    Abstract: An apparatus, a method, and a machine-readable medium are provided for characterizing differences between two language models. A group of utterances from each of a group of time domains is examined. One of a significant word change or a significant word class change within the utterances is determined. A first cluster of utterances including a word or a word class corresponding to the one of the significant word change or the significant word class change is generated from the utterances. A second cluster of utterances not including the word or the word class corresponding to the one of the significant word change or the significant word class change is generated from the utterances.
    Type: Application
    Filed: November 14, 2014
    Publication date: March 12, 2015
    Inventors: Allen Louis GORIN, John Grothendieck, Jeremy Huntley Greet Wright
  • Patent number: 8977547
    Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold T1; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: March 10, 2015
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
  • Patent number: 8965761
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: February 27, 2014
    Date of Patent: February 24, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: William Kress Bodin, Michael John Burkhart, Daniel G. Eisenhauer, Thomas James Watson, Daniel Mark Schumacher
  • Patent number: 8930189
    Abstract: A particular method includes receiving, at a representational state transfer endpoint device, a first user input related to a first speech to text conversion performed by a speech to text transcription service. The method also includes receiving, at the representational state transfer endpoint device, a second user input related to a second speech to text conversion performed by the speech to text transcription service. The method includes processing of the first user input and the second user input at the representational state transfer endpoint device to generate speech to text adjustment information.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: January 6, 2015
    Assignee: Microsoft Corporation
    Inventors: Jeremy Edward Cath, Timothy Edwin Harris, Marc Mercuri, James Oliver Tisdale, III
  • Patent number: 8924204
    Abstract: Unlike sound based pressure waves that go everywhere, air turbulence caused by wind is usually a fairly local event. Therefore, in a system that utilizes two or more spatially separated microphones to pick up sound signals (e.g., speech), wind noise picked up by one of the microphones often will not be picked up (or at least not to the same extent) by the other microphone(s). Embodiments of methods and apparatuses that utilize this fact and others to effectively detect and suppress wind noise using multiple microphones that are spatially separated are described.
    Type: Grant
    Filed: September 30, 2011
    Date of Patent: December 30, 2014
    Assignee: Broadcom Corporation
    Inventors: Juin-Hwey Chen, Jes Thyssen, Xianxian Zhang, Huaiyu Zeng
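The wind-noise patent above relies on wind turbulence being local to a single microphone while acoustic sources reach all capsules. The Python sketch below flags frames where the per-channel energies diverge beyond a ratio; the frame length and the 10 dB ratio are illustrative assumptions rather than the patent's detector.

```python
import numpy as np

def frame_energies(signal: np.ndarray, frame_len: int = 256) -> np.ndarray:
    """Mean energy of consecutive non-overlapping frames."""
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].reshape(n, frame_len)
    return (frames ** 2).mean(axis=1)

def wind_noise_frames(mic_a: np.ndarray, mic_b: np.ndarray,
                      ratio_db: float = 10.0) -> np.ndarray:
    """Flag frames where the two channels' energies diverge strongly.
    Acoustic sources reach both microphones with similar levels, whereas
    wind turbulence is local to one capsule."""
    ea, eb = frame_energies(mic_a), frame_energies(mic_b)
    diff_db = 10 * np.log10(np.maximum(ea, 1e-12) / np.maximum(eb, 1e-12))
    return np.abs(diff_db) > ratio_db

rng = np.random.default_rng(0)
speech = rng.standard_normal(4096) * 0.1          # common "speech" in both mics
mic_a, mic_b = speech.copy(), speech.copy()
mic_a[1024:2048] += rng.standard_normal(1024)     # wind burst on mic A only
print(wind_noise_frames(mic_a, mic_b).astype(int))
```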
  • Patent number: 8924211
    Abstract: In some embodiments, the recognition results produced by a speech processing system (which may include two or more recognition results, including a top recognition result and one or more alternative recognition results) based on an analysis of a speech input are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for a domain, such as the medical domain. In some embodiments, words and/or phrases that may be confused by an ASR system may be determined and associated in sets of words and/or phrases. Words and/or phrases that may be determined include those that change a meaning of a phrase or sentence when included in the phrase/sentence.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: December 30, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
  • Patent number: 8909526
    Abstract: In some embodiments, a recognition result produced by a speech processing system based on an analysis of a speech input is evaluated for indications of potential errors. In some embodiments, sets of words/phrases that may be acoustically similar or otherwise confusable, the misrecognition of which can be significant in the domain, may be used together with a language model to evaluate a recognition result to determine whether the recognition result includes such an indication. In some embodiments, a word/phrase of a set that appears in the result is iteratively replaced with each of the other words/phrases of the set. The result of the replacement may be evaluated using a language model to determine a likelihood of the newly-created string of words appearing in a language and/or domain. The likelihood may then be evaluated to determine whether the result of the replacement is sufficiently likely for an alert to be triggered.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: December 9, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
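The patent above iteratively substitutes confusable words into a recognition result and scores each variant with a language model to decide whether to raise an alert. The Python sketch below does this with a toy bigram model; the confusable sets, probabilities, and alert margin are invented for illustration.

```python
# Confusable sets whose misrecognition would matter in a medical domain,
# and toy bigram log-probabilities standing in for a real language model.
CONFUSABLE_SETS = [{"hypotension", "hypertension"}, {"fifteen", "fifty"}]
BIGRAM_LOGPROB = {
    ("patient", "has"): -0.5, ("has", "hypertension"): -1.0,
    ("has", "hypotension"): -1.2, ("with", "fifteen"): -2.0,
}

def lm_score(words, floor=-6.0):
    """Sum of bigram log-probabilities, with a floor for unseen bigrams."""
    return sum(BIGRAM_LOGPROB.get(pair, floor) for pair in zip(words, words[1:]))

def error_alerts(result: str, margin: float = 1.0):
    """For each confusable word in the result, try its alternatives; alert if
    a replacement is nearly as likely (or more likely) under the language model."""
    words = result.split()
    base = lm_score(words)
    alerts = []
    for i, w in enumerate(words):
        for group in CONFUSABLE_SETS:
            if w in group:
                for alt in group - {w}:
                    alt_score = lm_score(words[:i] + [alt] + words[i + 1:])
                    if alt_score >= base - margin:
                        alerts.append((w, alt, alt_score - base))
    return alerts

print(error_alerts("patient has hypotension"))
```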
  • Patent number: 8903725
    Abstract: Method for controlling user access to a service available in a data network and/or to information stored in a user database, in order to protect stored user data from unauthorized access, such that the method comprises the following: input of a user's speech sample to a user data terminal, processing of the user's speech sample in order to obtain a prepared speech sample as well as a current voice profile of the user, comparison of the current voice profile with an initial voice profile stored in an authorization database, and output of an access-control signal to either permit or refuse access, taking into account the result of the comparison step, such that the comparison step includes a quantitative similarity evaluation of the current and the stored voice profiles as well as a threshold-value discrimination of a similarity measure thereby derived, and an access-control signal that initiates permission of access is generated only if a prespecified similarity measure is not exceeded.
    Type: Grant
    Filed: November 25, 2009
    Date of Patent: December 2, 2014
    Assignee: Voice.Trust AG
    Inventor: Christian Pilz
  • Patent number: 8898061
    Abstract: A particular method includes receiving, at a representational state transfer endpoint device, a first user input related to a first speech to text conversion performed by a speech to text transcription service. The method also includes receiving, at the representational state transfer endpoint device, a second user input related to a second speech to text conversion performed by the speech to text transcription service. The method includes processing of the first user input and the second user input at the representational state transfer endpoint device to generate speech to text adjustment information.
    Type: Grant
    Filed: October 28, 2011
    Date of Patent: November 25, 2014
    Assignee: Microsoft Corporation
    Inventors: Jeremy Edward Cath, Timothy Edwin Harris, Marc Mercuri, James Oliver Tisdale, III
  • Publication number: 20140337024
    Abstract: A method for speech command detection comprises extracting speech features from a speech signal inputted into a system, converting the speech features into a word sequence, obtaining time durations of speech segments corresponding to the respective non-command words and an acoustic score of each of the command word candidates, calculating rhythm features of the speech signal based on the time durations, and recognizing a speech corresponding to the at least one command word candidate as a speech command directed to the system or a speech not directed to the system based on the acoustic score and the rhythm features. The word sequence comprises at least two successive non-command words and at least one command word candidate. The rhythm features describe a similarity of time durations of speech segments corresponding to the respective non-command words, and/or a similarity of energy variations of the speech segments corresponding to the respective non-command words.
    Type: Application
    Filed: May 9, 2014
    Publication date: November 13, 2014
    Applicant: CANON KABUSHIKI KAISHA
    Inventors: Xiang Zuo, Weixiang Hu, Hefei Liu
  • Publication number: 20140337025
    Abstract: The present disclosure discloses a classification method and system for audio files. The classification method includes: constructing a Pitch sequence of the audio files to be classified; calculating eigenvectors of the audio files according to the Pitch sequence of the audio files; and classifying the audio files according to the eigenvectors of the audio files. The present disclosure can achieve automatic classification of the audio files, reduce the cost of classification, and improve the efficiency, flexibility, and intelligence of the classification.
    Type: Application
    Filed: July 25, 2014
    Publication date: November 13, 2014
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Weifeng ZHAO, Shenyuan Li, Liwei Zhang, Jianfeng Chen
  • Patent number: 8886531
    Abstract: An audio fingerprint is generated by transforming an audio sample of a recording to a time-frequency domain and storing each time-frequency pair in a matrix array, detecting a plurality of local maxima for a predetermined number of time slices, selecting a predetermined number of largest-magnitude maxima from the plurality of local maxima detected by said detecting, and generating one or more hash values corresponding to the predetermined number of largest-magnitude maxima.
    Type: Grant
    Filed: January 13, 2010
    Date of Patent: November 11, 2014
    Assignee: Rovi Technologies Corporation
    Inventor: Brian Kenneth Vogel
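The fingerprinting patent above hashes the largest-magnitude time-frequency maxima of each time slice. Below is a simplified Python sketch of that pipeline using an FFT per frame and a hash over the top spectral bins; window sizes, peak counts, and the hashing scheme are illustrative assumptions, not the patented encoding.

```python
import hashlib
import numpy as np

def fingerprint(samples: np.ndarray, win: int = 256, hop: int = 128,
                peaks_per_slice: int = 3) -> list:
    """Hash the strongest spectral peaks of each time slice of a recording."""
    hashes = []
    for start in range(0, len(samples) - win, hop):
        frame = samples[start:start + win] * np.hanning(win)
        mags = np.abs(np.fft.rfft(frame))
        # Keep the largest-magnitude bins of this time slice as its landmarks.
        peaks = np.argsort(mags)[-peaks_per_slice:]
        key = f"{start // hop}:" + ",".join(map(str, sorted(peaks)))
        hashes.append(hashlib.sha1(key.encode()).hexdigest()[:10])
    return hashes

# Two recordings of the same tone mixture should share most hash values.
t = np.arange(8000) / 8000
clean = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)
noisy = clean + 0.01 * np.random.default_rng(0).standard_normal(len(t))
a, b = fingerprint(clean), fingerprint(noisy)
print(f"matching hashes: {sum(x == y for x, y in zip(a, b))}/{len(a)}")
```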
  • Publication number: 20140257809
    Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
    Type: Application
    Filed: May 22, 2014
    Publication date: September 11, 2014
    Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
  • Publication number: 20140249817
    Abstract: Techniques for using both speaker-identification information and other characteristics associated with received voice commands to determine how and whether to respond to the received voice commands. A user may interact with a device through speech by providing voice commands. After beginning an interaction with the user, the device may detect subsequent speech, which may originate from the user, from another user, or from another source. The device may then use speaker-identification information and other characteristics associated with the speech to attempt to determine whether or not the user interacting with the device uttered the speech. The device may then interpret the speech as a valid voice command and may perform a corresponding operation in response to determining that the user did indeed utter the speech. If the device determines that the user did not utter the speech, however, then the device may refrain from taking action on the speech.
    Type: Application
    Filed: March 4, 2013
    Publication date: September 4, 2014
    Applicant: RAWLES LLC
    Inventor: RAWLES LLC
  • Patent number: 8781830
    Abstract: Differential dynamic content delivery including providing a session document for a presentation, wherein the session document includes a session grammar and a session structured document; selecting from the session structured document a classified structural element in dependence upon user classifications of a user participant in the presentation; presenting the selected structural element to the user; streaming presentation speech to the user including individual speech from at least one user participating in the presentation; converting the presentation speech to text; detecting whether the presentation speech contains simultaneous individual speech from two or more users; and displaying the text if the presentation speech contains simultaneous individual speech from two or more users.
    Type: Grant
    Filed: July 2, 2013
    Date of Patent: July 15, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: William K. Bodin, Michael J. Burkhart, Daniel G. Eisenhauer, Thomas J. Watson, Daniel M. Schumacher
  • Patent number: 8781825
    Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: July 15, 2014
    Assignee: Sensory, Incorporated
    Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
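The validation step above checks whether duration, energy, and pitch are consistent across the component sounds of the spoken utterance before accepting the recognition result. A minimal Python sketch follows, using the coefficient of variation as the consistency test; the statistic and threshold are assumptions for illustration (the patent only requires at least one such parameter).

```python
import statistics

def consistent(values, max_cv: float = 0.5) -> bool:
    """A parameter is consistent if its coefficient of variation across the
    component sounds of the utterance stays below a threshold."""
    mean = statistics.mean(values)
    return (statistics.pstdev(values) / mean) <= max_cv if mean else False

def validate(result: str, durations, energies, pitches) -> bool:
    """Accept the recognizer's hypothesis only if the per-phone durations,
    energies, and pitches all look like a single deliberate utterance."""
    return all(consistent(v) for v in (durations, energies, pitches))

# Per-phone measurements for a hypothesized wake phrase.
print(validate("hello computer",
               durations=[0.09, 0.11, 0.10, 0.12],
               energies=[0.8, 0.7, 0.9, 0.75],
               pitches=[180, 190, 175, 185]))
```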
  • Patent number: 8768687
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating direct speech messages based on voice commands that include indirect speech messages. In one aspect, a method includes receiving a voice input corresponding to an utterance. A determination is made whether a transcription of the utterance includes a command to initiate a communication to a user and a segment that is classified as indirect speech. In response to determining that the transcription of the utterance includes the command and the segment that is classified as indirect speech, the segment that is classified as indirect speech is provided as input to a machine translator. In response to providing the segment that is classified as indirect speech to the machine translator, a direct speech segment is received from the machine translator. A communication is initiated that includes the direct speech segment.
    Type: Grant
    Filed: April 29, 2013
    Date of Patent: July 1, 2014
    Assignee: Google Inc.
    Inventors: Matthias Quasthoff, Simon Tickner
  • Patent number: 8768711
    Abstract: A method of voice-enabling an application for command and control and content navigation can include the application dynamically generating a markup language fragment specifying a command and control and content navigation grammar for the application, instantiating an interpreter from a voice library, and providing the markup language fragment to the interpreter. The method also can include the interpreter processing a speech input using the command and control and content navigation grammar specified by the markup language fragment and providing an event to the application indicating an instruction representative of the speech input.
    Type: Grant
    Filed: June 17, 2004
    Date of Patent: July 1, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
  • Publication number: 20140172427
    Abstract: A method for processing messages pertaining to an event includes receiving a plurality of messages pertaining to the event from electronic communication devices associated with a plurality of observers of the event, generating a first message stream that includes only a portion of the plurality of messages corresponding to a first participant in the event, identifying a first sub-event in the first message stream with reference to a time distribution of messages and content distribution of messages in the first message stream, generating a sub-event summary with reference to a portion of the plurality of messages in the first message stream that are associated with the first sub-event, and transmitting the sub-event summary to a plurality of electronic communication devices associated with a plurality of users who are not observers of the event.
    Type: Application
    Filed: December 13, 2013
    Publication date: June 19, 2014
    Applicant: Robert Bosch GmbH
    Inventors: Fei Liu, Fuliang Weng, Chao Shen, Lin Zhao
  • Patent number: 8744849
    Abstract: A microphone-array-based speech recognition system combines a noise cancelling technique for cancelling noise of input speech signals from an array of microphones, according to at least an inputted threshold. The system receives noise-cancelled speech signals outputted by a noise masking module through at least a speech model and at least a filler model, then computes a confidence measure score with the at least a speech model and the at least a filler model for each threshold and each noise-cancelled speech signal, and adjusts the threshold to continue the noise cancelling for achieving a maximum confidence measure score, thereby outputting a speech recognition result related to the maximum confidence measure score.
    Type: Grant
    Filed: October 12, 2011
    Date of Patent: June 3, 2014
    Assignee: Industrial Technology Research Institute
    Inventor: Hsien-Cheng Liao
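The system above sweeps the noise-cancelling threshold and keeps the recognition result whose speech-model versus filler-model confidence measure is maximal. The Python sketch below shows that outer loop; the recognize() stand-in and its scoring are placeholders, since the actual acoustic models and noise masking are outside the scope of the abstract.

```python
import numpy as np

def confidence_measure(speech_score: float, filler_score: float) -> float:
    """Log-likelihood ratio between the speech model and the filler model."""
    return speech_score - filler_score

def recognize(signal: np.ndarray, threshold: float) -> tuple:
    """Stand-in for noise masking plus recognition: pretend that a moderate
    threshold removes the most noise without clipping speech."""
    speech_score = -abs(threshold - 0.4) * 10.0    # peaks at threshold 0.4
    filler_score = -3.0
    return f"hypothesis@{threshold:.1f}", confidence_measure(speech_score, filler_score)

def best_result(signal: np.ndarray, thresholds=np.linspace(0.0, 1.0, 11)):
    """Sweep the noise-masking threshold and keep the recognition result
    with the maximum confidence measure score."""
    results = [recognize(signal, t) for t in thresholds]
    return max(results, key=lambda r: r[1])

signal = np.zeros(16000)     # placeholder multi-microphone capture
print(best_result(signal))
```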