Preliminary Matching Patents (Class 704/247)
  • Patent number: 7864987
    Abstract: An access system in one embodiment that first determines that someone has correct credentials by using a non-biometric authentication method such as typing in a password, presenting a Smart card containing a cryptographic secret, or having a valid digital signature. Once the credentials are authenticated, then the user must take at least two biometric tests, which can be chosen randomly. In one approach, the biometric tests need only check a template generated from the user who desires access with the stored templates matching the holder of the credentials authenticated by the non-biometric test. Access desirably will be allowed when both biometric tests are passed.
    Type: Grant
    Filed: April 18, 2006
    Date of Patent: January 4, 2011
    Assignee: Infosys Technologies Ltd.
    Inventors: Kumar Balepur Venkatanna, Rajat Moona, S V Subrahmanya
  • Patent number: 7843364
    Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate text input. In response to an ambiguous editing input at a location preceding at least a portion of an output word, the software performs one disambiguation operation with respect to the editing input and another disambiguation operation with respect to the editing input in combination with the at least portion of the output word. The results are output in order of decreasing frequency value, with the results of the one disambiguation operation having the portion of the output word appended thereto.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: November 30, 2010
    Assignee: Research In Motion Limited
    Inventors: Michael G. Elizarov, Vadim Fux, Dan Rubanovich
  • Patent number: 7831423
    Abstract: A system enables a transcriptionist to replace a first written form (such as an abbreviation) of a concept with a second written form (such as an expanded form) of the same concept. For example, the system may display to the transcriptionist a draft document produced from speech by an automatic speech recognizer. If the transcriptionist recognizes a first written form of a concept that should be replaced with a second written form of the same concept, the transcriptionist may provide the system with a replacement command. In response, the system may identify the second written form of the concept and replace the first written form with the second written form in the draft document.
    Type: Grant
    Filed: May 25, 2006
    Date of Patent: November 9, 2010
    Assignee: Multimodal Technologies, Inc.
    Inventor: Kjell Schubert
  • Patent number: 7831424
    Abstract: A method is presented which reduces data flow and thereby increases processing capacity while preserving a high level of accuracy in a distributed speech processing environment for speaker detection. The method and system of the present invention includes filtering out data based on a target speaker specific subset of labels using data filters. The method preserves accuracy and passes only a fraction of the data by optimizing target specific performance measures. Therefore, a high level of speaker recognition accuracy is maintained while utilizing existing processing capabilities.
    Type: Grant
    Filed: April 2, 2008
    Date of Patent: November 9, 2010
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Juan M. Huerta, Ganesh N. Ramaswamy, Olivier Verscheure
  • Patent number: 7813927
    Abstract: There is provided an apparatus for providing a Text Independent (TI) speaker recognition mode in a Text Dependent (TD) Hidden Markov Model (HMM) speaker recognition system and/or a Text Constrained (TC) HMM speaker recognition system. The apparatus includes a Gaussian Mixture Model (GMM) generator and a Gaussian weight normalizer. The GMM generator is for creating a GMM by pooling Gaussians from a plurality of HMM states. The Gaussian weight normalizer is for normalizing Gaussian weights with respect to the plurality of HMM states.
    Type: Grant
    Filed: June 4, 2008
    Date of Patent: October 12, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Jiri Navratil, James H. Nealand, Jason W. Pelecanos, Ganesh N. Ramaswamy, Ran D. Zilca
  • Patent number: 7788097
    Abstract: A method, system and article of manufacture of recognizing a voice command. One embodiment of the invention comprises: receiving a voice input; using the number of sound fragments, determining a number of sound fragments to be processed in a first set of sound fragments; determining whether the first set of sound fragments of the voice input matches with the first set of sound fragments of a voice command; and if the first set of sound fragments matches with the first set of sound fragments of the voice command, then determining whether one or more remaining sound fragments matches with one or more remaining sound fragments of the voice command.
    Type: Grant
    Filed: October 31, 2006
    Date of Patent: August 31, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Joseph H. McIntyre, Victor S. Moore
  • Patent number: 7778832
    Abstract: One-to-many comparisons of callers' voice prints with known voice prints to identify any matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract at least a portion of the customer's voice to create a customer voice print, and it formats the segmented voice print for network transmission to a server. The server compares the customer's voice print with multiple known voice prints to determine any matches, meaning that the customer's voice print and one of the known voice prints are likely from the same person. The identification of any matches can be used for a variety of purposes, such as determining whether to authorize a transaction requested by the customer.
    Type: Grant
    Filed: September 26, 2007
    Date of Patent: August 17, 2010
    Assignee: American Express Travel Related Services Company, Inc.
    Inventors: Vicki Broman, Vernon Marshall, Seshasayee Bellamkonda, Marcel Leyva, Cynthia Hanson
  • Patent number: 7769583
    Abstract: A system, method and computer program product for classification of an analog electrical signal using statistical models of training data. A technique is described to quantize the analog electrical signal in a manner which maximizes the compression of the signal while simultaneously minimizing the diminution in the ability to classify the compressed signal. These goals are achieved by utilizing a quantizer designed to minimize the loss in a power of the log-likelihood ratio. A further technique is described to enhance the quantization process by optimally allocating a number of bits for each dimension of the quantized feature vector subject to a maximum number of bits available across all dimensions.
    Type: Grant
    Filed: May 13, 2006
    Date of Patent: August 3, 2010
    Assignee: International Business Machines Corporation
    Inventors: Upendra V. Chaudhari, Hsin I. Tseng, Deepak S. Turaga, Olivier Verscheure
  • Patent number: 7761296
    Abstract: A system and method for rescoring the N-best hypotheses from an automatic speech recognition system by comparing an original speech waveform to synthetic speech waveforms that are generated for each text sequence of the N-best hypotheses. A distance is calculated from the original speech waveform to each of the synthesized waveforms, and the text associated with the synthesized waveform that is determined to be closest to the original waveform is selected as the final hypothesis. The original waveform and each synthesized waveform are aligned to a corresponding text sequence on a phoneme level. The mean of the feature vectors which align to each phoneme is computed for the original waveform as well as for each of the synthesized hypotheses.
    Type: Grant
    Filed: April 2, 1999
    Date of Patent: July 20, 2010
    Assignee: International Business Machines Corporation
    Inventors: Raimo Bakis, Ellen M. Eide
  • Publication number: 20100145697
    Abstract: Disclosed herein is a similar speaker recognition method and system using nonlinear analysis. The recognition method extracts a nonlinear feature of a sound signal through nonlinear analysis of the sound signal and combines the nonlinear feature with a linear feature such as spectrum. The method transforms sound data in a time domain into status vectors in a phase domain and uses a nonlinear time series analysis method capable of representing nonlinear features of the status vectors to extract nonlinear information of a sound. The method can overcome technical limitations of conventional linear algorithms. The recognition method can be applied to sound-related application systems other than speaker recognition systems.
    Type: Application
    Filed: October 28, 2009
    Publication date: June 10, 2010
    Applicant: IUCF-HYU INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY
    Inventors: Young-Hun Kwon, Kun-Sang Lee, Sung-IL Yang, Sung-Wook Chang, Jung-Pa Seo, Min-Su Kim, In-Chan Baek
  • Publication number: 20100131273
    Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.
    Type: Application
    Filed: November 25, 2009
    Publication date: May 27, 2010
    Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit
  • Patent number: 7716048
    Abstract: A method and apparatus for segmenting an audio interaction, by locating anchor segment from each side of the interaction, iteratively classifying additional segments into one of the two sides, and scoring the resulting segmentation. If the score result is below a threshold, the process is repeated until the segmentation score is satisfactory or until a stopping criterion is met. The anchoring and the scoring steps comprise using additional data associated with the interaction, a speaker thereof, internal or external information related to the interaction or to a speaker thereof or the like.
    Type: Grant
    Filed: January 25, 2006
    Date of Patent: May 11, 2010
    Assignee: Nice Systems, Ltd.
    Inventors: Oren Pereg, Moshe Waserblat
  • Publication number: 20100114572
    Abstract: To enable selection of a speaker, the acoustic feature value of which is similar to that of an utterance speaker, with accuracy and stability, while adapting to changes even when the acoustic feature value of the speaker changes every moment. A speaker score calculating means (22) calculates a long-time speaker score (log likelihood of each of a plurality of speaker models stored in a speaker model storage section (31) with respect to the acoustic feature value) based on an arbitrary number of utterances, for example, and calculates a short-time speaker score based on a short-time utterance, for example. A long-time speaker selecting means (23) selects speakers corresponding to a predetermined number of speaker models having a high long-time speaker score.
    Type: Application
    Filed: February 29, 2008
    Publication date: May 6, 2010
    Inventors: Masahiro Tani, Tadashi Emori, Yoshifumi Onishi
  • Patent number: 7711560
    Abstract: A speech recognition apparatus equipped with the garbage acoustic model storage unit storing the garbage acoustic model which learned the collection of unnecessary words. A feature value calculation unit calculates the feature parameter necessary for recognition by acoustically analyzing the unidentified input speech including the non-language speech per frame which is a unit for speech analysis. A garbage acoustic score calculation unit calculates the garbage acoustic score by comparing the feature parameter and the garbage acoustic model, and a garbage acoustic score correction unit corrects the garbage acoustic score calculated by the garbage acoustic score calculation unit so as to raise it in the frame where the non-language speech is inputted.
    Type: Grant
    Filed: February 4, 2004
    Date of Patent: May 4, 2010
    Assignee: Panasonic Corporation
    Inventors: Maki Yamada, Makoto Nishizaki, Yoshihisa Nakatoh, Shinichi Yoshizawa
  • Patent number: 7636661
    Abstract: A method and arrangement for improved speech recognition in a telephonically challenging speakerphone in-car environment. The method includes receiving a signal from a microphone representative of speech to be recognized, performing detection of a transition in the signal indicative of switch on of the microphone, and, in response to the detection, performing speech recognition on the signal with reduced contribution from an initial portion thereof. The initial portion may be treated as optional speech, the speech recognition may be performed with a predetermined redundant sound, and a user may be requested to speak the predetermined redundant sound when speech recognition has fallen below a predetermined threshold. Thus, recognition may be made possible when otherwise it would not be possible, recognition match scoring will be increased as the low weighting given by deleted initial sounds will be eliminated and therefore confusion of the recognized phrase will be reduced.
    Type: Grant
    Filed: June 30, 2005
    Date of Patent: December 22, 2009
    Assignee: Nuance Communications, Inc.
    Inventors: Adam Pieter De Leeuw, Steven Groeger, Stuart John Hayton
  • Patent number: 7624012
    Abstract: The invention enables the generation of a general function (4) which can operate on an input signal (Sx) to extract from the latter a value (DVex) of a global characteristic value expressing a feature (De) of the information conveyed by that signal. It operates by: generating at least one compound function (CF1-CFn), said compound function being generated from at least one of a set of elementary functions (EF1, EF2, . . .
    Type: Grant
    Filed: December 16, 2003
    Date of Patent: November 24, 2009
    Assignee: Sony France S.A.
    Inventors: François Pachet, Aymeric Zils
  • Patent number: 7603274
    Abstract: A method and apparatus for determining the possibility of pattern recognition of time series signal independent of a pattern recognition ratio is provided. The method for determining the possibility of pattern recognition of time series signal includes extracting a time forward feature and a time reversed feature from an input signal having a time series pattern, generating time forward alignment and time reversed alignment by using the time forward feature and the time reversed feature, comparing the time forward alignment with the time reversed alignment to compute a likelihood of pattern recognition, and determining that the input signal can be recognized if the likelihood is larger than a predetermined threshold value.
    Type: Grant
    Filed: November 2, 2005
    Date of Patent: October 13, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Kwangil Hwang
  • Patent number: 7590537
    Abstract: A speech recognition method and apparatus perform speaker clustering and speaker adaptation using average model variation information over speakers while analyzing the quantity variation amount and the directional variation amount. In the speaker clustering method, a speaker group model variation is generated based on the model variation between a speaker-independent model and a training speaker ML model. In the speaker adaptation method, the model in which the model variation between a test speaker ML model and a speaker group ML model to which the test speaker belongs which is most similar to a training speaker group model variation is found, and speaker adaptation is performed on the found model. Herein, the model variation in the speaker clustering and the speaker adaptation are calculated while analyzing both the quantity variation amount and the directional variation amount. The present invention may be applied to any speaker adaptation algorithm of MLLR and MAP.
    Type: Grant
    Filed: December 27, 2004
    Date of Patent: September 15, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Namhoon Kim, Injeong Choi, Yoonkyung Song
  • Patent number: 7586423
    Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate text input. In response to an ambiguous editing input at a location preceding at least a portion of an output word, the software performs one disambiguation operation with respect to the editing input and another disambiguation operation with respect to the editing input in combination with the at least portion of the output word. The results are output in order of decreasing frequency value, with the results of the one disambiguation operation having the portion of the output word appended thereto.
    Type: Grant
    Filed: June 30, 2006
    Date of Patent: September 8, 2009
    Assignee: Research In Motion Limited
    Inventors: Michael G. Elizarov, Vadim Fux, Dan Rubanovich
  • Patent number: 7567901
    Abstract: Systems and methods for bio-phonetic multi-phrase speaker identity verification are disclosed. Generally, a speaker identity verification engine generates a dynamic phrase including at least one dynamically-generated word. The speaker identity verification engine prompts a user to speak the dynamic phrase and receives a dynamic phrase utterance. The speaker identity verification engine extracts at least one voice characteristic from the dynamic phrase utterance and compares the at least one voice characteristic with a voice profile to generate a score. The speaker identity verification engine then determines whether to accept a speaker identity claim based on the score.
    Type: Grant
    Filed: April 13, 2007
    Date of Patent: July 28, 2009
    Assignee: AT&T Intellectual Property I, L.P.
    Inventor: Hisao M. Chang
  • Patent number: 7545290
    Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound text input. The device is able to assemble language objects in the memory to generate compound language solutions. The device is able to prioritize compound language solutions according to various criteria, including the degree of completeness of the text components of a compound language solution.
    Type: Grant
    Filed: January 13, 2006
    Date of Patent: June 9, 2009
    Assignee: Research In Motion Limited
    Inventors: Vadim Fux, Michael Elizarov
  • Patent number: 7471775
    Abstract: A method and apparatus (100) for updating a voice tag comprising N stored voice tag phoneme sequences includes a function (110) for determining (205) an accepted stored voice tag phoneme sequence for an utterance, a function (140) for extracting (210) a current set of M phoneme sequences having highest likelihoods of representing the utterance, a function (160) for updating (215) a reference histogram associated with the accepted voice tag, and a function (160) for updating (225) the voice tag with N selected phoneme sequences that are selected from the current set of M phoneme sequences and the set of N voice tag phoneme sequences, wherein the N selected phoneme sequences have phoneme histograms most closely matching the reference histogram. The method and apparatus (100) also generates a voice tag using some functions (110, 140, 160) that are common with the method and apparatus to update the voice tag, such as the extracting (410) of the current set of M phoneme sequences.
    Type: Grant
    Filed: June 30, 2005
    Date of Patent: December 30, 2008
    Assignee: Motorola, Inc.
    Inventor: Yan Ming Cheng
  • Patent number: 7447632
    Abstract: A voice authentication system includes: a standard template storage part 17 in which a standard template that is generated from a registered voice of an authorized user and featured with a voice characteristic of the registered voice is stored preliminarily in a state of being associated with a personal ID of the authorized user; an identifier input part 15 that allows a user who intends to be authenticated to input a personal ID; a voice input part 11 that allows the user to input a voice; a standard template/registered voice selection part 16 that selects a standard template and a registered voice corresponding to the inputted identifier; a determination part 14 that refers to the selected standard template and determines whether or not the inputted voice is a voice of the authorized user him/herself and whether or not presentation-use information is to be outputted by referring to a predetermined determination reference; a presentation-use information extraction part 19 that extracts information regarding
    Type: Grant
    Filed: September 29, 2005
    Date of Patent: November 4, 2008
    Assignee: Fujitsu Limited
    Inventor: Taisuke Itou
  • Patent number: 7447633
    Abstract: There is provided an apparatus for providing a Text Independent (TI) speaker recognition mode in a Text Dependent (TD) Hidden Markov Model (HMM) speaker recognition system and/or a Text Constrained (TC) HMM speaker recognition system. The apparatus includes a Gaussian Mixture Model (GMM) generator and a Gaussian weight normalizer. The GMM generator is for creating a GMM by pooling Gaussians from a plurality of HMM states. The Gaussian weight normalizer is for normalizing Gaussian weights with respect to the plurality of HMM states.
    Type: Grant
    Filed: November 22, 2004
    Date of Patent: November 4, 2008
    Assignee: International Business Machines Corporation
    Inventors: Jiri Navratil, James H. Nealand, Jason W. Pelecanos, Ganesh N. Ramaswamy, Ran D. Zilca
  • Publication number: 20080243504
    Abstract: An interactive speech recognition training process and system is disclosed. A speech recognition process is applied to a received speaker utterance. Utterance data are matched by the system with data in a grammar database and the speaker is requested to confirm a determined match. If the system determines from the speaker's response that the match is not confirmed, a negative score is assigned to the utterance data. If the match is determined by the system to be confirmed, a positive score is assigned to the utterance data. Scores for a plurality of such speaker utterances are accumulated in a log file, the accumulated scores used to adjust acoustic models for the grammar database.
    Type: Application
    Filed: March 30, 2007
    Publication date: October 2, 2008
    Applicant: Verizon Data Services, Inc.
    Inventor: Parind Poi
  • Publication number: 20080228481
    Abstract: Embodiments of the present invention improve content selection systems and methods using speech recognition. In one embodiment, the present invention includes a speech recognition method comprising storing content on an electronic device, wherein the content is associated with a plurality of content attribute values, adding the content attribute values to a first recognition set of a speech recognizer, receiving a speech input signal in said speech recognizer, generating a plurality of likelihood values in response to the speech input signal, wherein each likelihood value is associated with one content attribute value in the recognition set; and accessing the stored content based on the likelihood values.
    Type: Application
    Filed: March 13, 2007
    Publication date: September 18, 2008
    Applicant: Sensory, Incorporated
    Inventor: Todd F. Mozer
  • Patent number: 7424425
    Abstract: In detection systems, such as speaker verification systems, for a given operating point range, with an associated detection “cost”, the detection cost is preferably reduced by essentially trading off the system error in the area of interest with areas essentially “outside” that interest. Among the advantages achieved thereby are higher optimization gain and better generalization. From a measurable Detection Error Tradeoff (DET) curve of the given detection system, a criterion is preferably derived, such that its minimization provably leads to detection cost reduction in the area of interest. The criterion allows for selective access to the slope and offset of the DET curve (a line in case of normally distributed detection scores, a curve approximated by mixture of Gaussians in case of other distributions). By modifying the slope of the DET curve, the behavior of the detection system is changed favorably with respect to the given area of interest.
    Type: Grant
    Filed: May 19, 2002
    Date of Patent: September 9, 2008
    Assignee: International Business Machines Corporation
    Inventors: Jiri Navratil, Ganesh N. Ramaswamy
  • Patent number: 7421387
    Abstract: A method for reducing recognition errors. The method includes receiving an N-best list associated with an input of a computer based recognition system. The N-best list includes one or more hypotheses and associated confidence values. The input is classified in response to the N-best list, resulting in a classification. A re-scoring algorithm that is tuned for the classification is selected. The re-scoring algorithm is applied to the N-best list to create a re-scored N-best list. A hypothesis for the value of the input is selected based on the re-scored N-best list.
    Type: Grant
    Filed: May 18, 2004
    Date of Patent: September 2, 2008
    Assignee: General Motors Corporation
    Inventor: Kurt S. Godden
  • Patent number: 7401017
    Abstract: Method and apparatus for multi-pass speech recognition. An input device receives spoken input. A processor performs a first pass speech recognition technique on the spoken input and forms first pass results. The first pass results include a number of alternative speech expressions, each having an assigned score related to the certainty that the corresponding expression correctly matches the spoken input. The processor selectively performs a second pass speech recognition technique on the spoken input according to the first pass results. Preferably, the second pass attempts to correctly match the spoken input to only those expressions which were identified during the first pass. Otherwise, if one of the expressions identified by the first pass is assigned a score higher than a predetermined threshold (e.g., 95%), the second pass is not performed.
    Type: Grant
    Filed: April 4, 2006
    Date of Patent: July 15, 2008
    Assignee: Nuance Communications
    Inventors: Hy Murveit, Ashvin Kannan, Ben Shahshahani, Chris Leggetter, Katherine Knill
  • Patent number: 7386448
    Abstract: A system and method enrolls a speaker with an enrollment utterance and authenticates a user with a biometric analysis of an authentication utterance, without the need for a PIN (Personal Identification Number). During authentication, the system uses the same authentication utterance to identify who a speaker claims to be with speaker recognition, and verify whether the speaker is actually the claimed person. Thus, it is not necessary for the speaker to identify biometric data using a PIN. The biometric analysis includes a neural tree network to determine unique aspects of the authentication utterances for comparison to the enrollment authentication. The biometric analysis leverages a statistical analysis using Hidden Markov Models before authorizing the speaker.
    Type: Grant
    Filed: June 24, 2004
    Date of Patent: June 10, 2008
    Assignee: T-Netix, Inc.
    Inventors: John C. Poss, Dag Boye, Mark W. Mobley
  • Patent number: 7376434
    Abstract: Improved approaches for users of electronic devices to communicate with one another are disclosed. The electronic devices have audio and/or textual output capabilities. The improved approaches can enable users to communicate in different ways depending on device configuration, user preferences, prior history, etc. In one embodiment, the communication between users is achieved by short audio or textual messages.
    Type: Grant
    Filed: August 2, 2006
    Date of Patent: May 20, 2008
    Assignee: IpVenture, Inc.
    Inventors: C. Douglass Thomas, Peter P. Tong
  • Patent number: 7312726
    Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device. The device enables editing during text entry and also provides a learning function that allows the disambiguation function to adapt to provide a customized experience for the user. The disambiguation function can be selectively disabled and an alternate keystroke interpretation system provided.
    Type: Grant
    Filed: June 2, 2004
    Date of Patent: December 25, 2007
    Assignee: Research In Motion Limited
    Inventors: Vadim Fux, Michael G. Elizarov, Sergey V. Kolomiets
  • Patent number: 7177808
    Abstract: Method for improving speaker identification by determining usable speech. Degraded speech is preprocessed in a speaker identification (SID) process to produce SID usable and SID unusable segments. Features are extracted and analyzed so as to produce a matrix of optimum classifiers for the detection of SID usable and SID unusable speech segments. Optimum classifiers possess a minimum distance from a speaker model. A decision tree based upon fixed thresholds indicates the presence of a speech feature in a given speech segment. Following preprocessing, degraded speech is measured in one or more time, frequency, cepstral or SID usable/unusable domains. The results of the measurements are multiplied by a weighting factor whose value is proportional to the reliability of the corresponding time, frequency, or cepstral measurements performed. The measurements are fused as information, and usable speech segments are extracted for further processing.
    Type: Grant
    Filed: August 18, 2004
    Date of Patent: February 13, 2007
    Assignee: The United States of America as represented by the Secretary of the Air Force
    Inventors: Robert E. Yantorno, Daniel S. Benincasa, Stanley J. Wenndt, Brett Y. Smolenski
  • Patent number: 7146315
    Abstract: A multichannel source activity detection system, e.g., a voice activity detection (VAD) system, and method that exploits spatial localization of a target audio source is provided. The method includes the steps of receiving a mixed sound signal by at least two microphones; Fast Fourier transforming each received mixed sound signal into the frequency domain; filtering the transformed signals to output a signal corresponding to a spatial signature of a source; summing an absolute value squared of the filtered signal over a predetermined range of frequencies; and comparing the sum to a threshold to determine if a voice is present. Additionally, the filtering step includes multiplying the transformed signals by an inverse of a noise spectral power matrix, a vector of channel transfer function ratios, and a source signal spectral power.
    Type: Grant
    Filed: August 30, 2002
    Date of Patent: December 5, 2006
    Assignee: Siemens Corporate Research, Inc.
    Inventors: Radu Victor Balan, Justinian Rosca, Christophe Beaugeant
  • Patent number: 7054812
    Abstract: A system is provided for determining a sequence of sub-word units representative of at least two words output by a word recognition unit in response to an input word to be recognized. In a preferred embodiment, the word alternatives output by the recognition unit are converted into sequences of phonemes. An optimum alignment between these sequences is then determined using a dynamic programming alignment technique. The sequence of phonemes representative of the input sequences is then determined using this optimum alignment.
    Type: Grant
    Filed: April 25, 2001
    Date of Patent: May 30, 2006
    Assignee: Canon Kabushiki Kaisha
    Inventors: Jason Peter Andrew Charlesworth, Philip Neil Garner
  • Patent number: 7054811
    Abstract: A system for verifying and enabling user access, which includes a voice registration unit for providing a substantially unique and initial identification of each of a plurality of the speaker/users by finding the speaker/user's voice parameters in a voice registration sample and storing same in a database. The system also includes a voice authenticating unit for substantially absolute verification of an identity of one of said plurality of users. The voice authenticating unit includes a recognition unit for providing a voice authentication sample, and being operative with the database. The voice authenticating unit also includes a decision unit operative with the recognition unit and the database to decide whether the user is the same as the person of the same identity registered with the system, such that the identity of one of the plurality of users is substantially absolutely verified.
    Type: Grant
    Filed: October 6, 2004
    Date of Patent: May 30, 2006
    Assignee: Cellmax Systems Ltd.
    Inventor: Ziv Barzilay
  • Patent number: 6937702
    Abstract: Method, apparatus, and computer-readable media for minimizing the risk of fraudulent access to call center resources. The invention described herein provides a method of minimizing fraudulent access to call center resources, with the method including at least the following. One or more authenticated biometric samples are associated with at least one person. The person then submits at least one test biometric sample during a login process to obtain authorization to access call center resources, for example, to process telephone calls or to receive training. This test biometric sample is captured and the differences between the test biometric sample and the one or more authenticated biometric samples are quantified. Depending on the degree of difference between the at least one authenticated biometric sample and the test biometric sample, the person's request for authorization to access call center resources is dispositioned.
    Type: Grant
    Filed: June 24, 2002
    Date of Patent: August 30, 2005
    Assignee: West Corporation
    Inventors: Jill M. Vacek, Mark J. Pettay, Hendryanto Rilantono, Mahmood S. Akhwand, Gary L. West
  • Patent number: 6925154
    Abstract: Techniques for providing an automated conversational name dialing system for placing a call in response to an input by a user. One technique begins with the step of analyzing an input from a user, wherein the input includes information directed to identifying an intended recipient of a telephone call from the user. At least one candidate for the intended recipient is identified in response to the input, wherein the at least one candidate represents at least one potential match between the intended recipient and a predetermined vocabulary. A confidence measure indicative of a likelihood that the at least one candidate is the intended recipient is determined, and additional information is obtained from the user to increase the likelihood that the at least one candidate is the intended recipient, based on the determined confidence measure.
    Type: Grant
    Filed: May 3, 2002
    Date of Patent: August 2, 2005
    Assignee: International Business Machines Corporation
    Inventors: Yuqing Gao, Bhuvana Ramabhadran, Chengjun Julian Chen, Hakan Erdogan, Michael A. Picheny
  • Publication number: 20040260549
    Abstract: A voice recognition system includes an adaptive filter and a subtractor. The adaptive filter generates a simulated talk-back voice y(n) by setting a filter coefficient simulating a transfer system in which an input voice corresponding to a voice command and a talk-back voice output from a speaker are input into a microphone and by filtering a talk-back voice x(n). The subtractor extracts the input voice by subtracting the simulated talk-back voice y(n) from mixed sound input into the microphone. With this configuration, the talk-back voice is attenuated from the mixed sound including the input voice and the talk-back voice input into the microphone, and then, the mixed sound is supplied to a voice recognition engine. Accordingly, the user can input his/her voice during a talk-back operation without the need to interrupt it by pressing a speech button every time the user wishes to input the voice. The voice recognition operation time can be thus reduced.
    Type: Application
    Filed: April 30, 2004
    Publication date: December 23, 2004
    Inventors: Shuichi Matsumoto, Toru Marumoto
  • Publication number: 20040162726
    Abstract: A speaker identity claim (SIC) utterance is received and recognized. The SIC utterance is compared with a voice profile registered under the SIC, and a first verification decision is based thereon. A first dynamic phrase (FDP) is generated, and a user is prompted to speak same. An FDP utterance is received, and compared with the voice profile registered under the SIC to make a second verification decision. If the second verification decision indicates a high or low confidence level, the speaker identity claim is accepted or rejected, respectively. If the verification decision indicates a medium confidence level, a second dynamic phrase (SDP) is generated, and the user is prompted to speak same. An SDP utterance is received, and compared with the voice profile registered under the SIC to make a third verification decision. The speaker identity claim is accepted or rejected based on the third verification decision.
    Type: Application
    Filed: February 13, 2003
    Publication date: August 19, 2004
    Inventor: Hisao M. Chang
  • Publication number: 20040128131
    Abstract: An audible command can be utilized to both permit identification of the speaker and to permit subsequent actions that comprise a corresponding response to the audible command when the identity of the speaker correlates with that of a previously authorized individual. Such identification can be supplemented with other identification mechanisms. Hierarchical levels of permission can be utilized, with or without confidence level thresholds, to further protect the device against unauthorized access and/or manipulation.
    Type: Application
    Filed: December 26, 2002
    Publication date: July 1, 2004
    Applicant: Motorola, Inc.
    Inventors: William Campbell, Robert Gardner, Charles Broun
  • Patent number: 6741962
    Abstract: A speech recognition system for recognizing an input voice of a narrow frequency band. The speech recognition system includes: a frequency band converting unit for converting the input voice of the narrow frequency band into a pseudo voice of a wide frequency band which covers an entirety of the narrow frequency band and which is wider than the narrow frequency band.
    Type: Grant
    Filed: March 7, 2002
    Date of Patent: May 25, 2004
    Assignee: NEC Corporation
    Inventor: Kenichi Iso
  • Patent number: 6697778
    Abstract: Client speaker locations in a speaker space are used to generate speech models for comparison with test speaker data or test speaker speech models. The speaker space can be constructed using training speakers that are entirely separate from the population of client speakers, or from client speakers, or from a mix of training and client speakers. Reestimation of the speaker space based on client environment information is also provided to improve the likelihood that the client data will fall within the speaker space. During enrollment of the clients into the speaker space, additional client speech can be obtained when predetermined conditions are met. The speaker distribution can also be used in the client enrollment step.
    Type: Grant
    Filed: July 5, 2000
    Date of Patent: February 24, 2004
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Roland Kuhn, Olivier Thyes, Patrick Nguyen, Jean-Claude Junqua, Robert Boman
  • Publication number: 20030229492
    Abstract: In one embodiment the present invention provides a method for identity verification comprising the steps of: (a) comparing at least one first spoken voice print of a user speaking at least one piece of personal data against a first stored voice print of the user speaking said at least one piece of personal data; (b) comparing at least one second spoken voice print of the user speaking at least one piece of travel data against a second stored voice print of the user speaking said piece of travel data; and (c) determining if the user is a given individual based on the results of step (a) and step (b).
    Type: Application
    Filed: September 13, 2002
    Publication date: December 11, 2003
    Inventor: Marc Edward Nolan
  • Publication number: 20030125947
    Abstract: A voice model database server determines the identity of a speaker through a network over which the voice model database server provides to one or more speech-recognition systems output data regarding a person with access to the speech-recognition system receiving the output data. The voice model database server attempts to locate, based on the identity of the speaker, a voice model for the speaker. Finally, the voice model database server retrieves from a storage area the voice model for the speaker, if the voice model database server located a voice model for the speaker.
    Type: Application
    Filed: January 3, 2002
    Publication date: July 3, 2003
    Inventor: Michael Allen Yudkowsky
  • Patent number: 6563911
    Abstract: The present invention is a speech enabled automatic telephone dialer device, system, and method using a spoken name corresponding to name-telephone number data of computer-based address book programs. The invention includes user telephones connected to a PBX-type telephony mechanism, which is connected to a telephony board of a name dialer device. User computer workstations containing loaded address book programs with name-telephone number data are connected to the name dialer device. The name dialer device includes a host computer in a network; a telephony board for controlling the PBX for dialing; memory within the host computer for storing software and name-telephone number data; and, software to access computer-based address book programs, to receive voice inputs from the PBX-type telephony mechanism, to create converted phonemes from names to match voice inputs with specific name-telephone number data from the computer-based address book programs for initiating an automatic dialing.
    Type: Grant
    Filed: January 23, 2001
    Date of Patent: May 13, 2003
    Assignee: iVoice, Inc.
    Inventor: Jerome R. Mahoney
  • Publication number: 20030004720
    Abstract: A system and method for extracting acoustic features and speech activity on a device and transmitting them in a distributed voice recognition system. The distributed voice recognition system includes a local VR engine in a subscriber unit and a server VR engine on a server. The local VR engine comprises a feature extraction (FE) module that extracts features from a speech signal, and a voice activity detection module (VAD) that detects voice activity within a speech signal. The system includes filters, framing and windowing modules, power spectrum analyzers, a neural network, a nonlinear element, and other components to selectively provide an advanced front end vector including predetermined portions of the voice activity detection indication and extracted features from the subscriber unit to the server.
    Type: Application
    Filed: January 28, 2002
    Publication date: January 2, 2003
    Inventors: Harinath Garudadri, Hynek Hermansky, Lukas Burget, Pratibha Jain, Sachin Kajarekar, Sunil Sivadas, Stephane N. Dupont, Maria Carmen Benitez Ortuzar, Nelson H. Morgan
  • Publication number: 20020193991
    Abstract: A method and system for utilizing multiple speech recognizers. The speech system includes a port through which an input audio stream may be received, at least two recognizers that may convert the input stream to text or commands, and a combiner able to combine lists of possible results from each recognizer into a combined list. The method includes receiving an input audio stream, routing the stream to one or more recognizers, receiving a list of possible results from each of the recognizers, combining the lists into a combined list and returning at least a subset of the list to the application.
    Type: Application
    Filed: June 13, 2001
    Publication date: December 19, 2002
    Applicant: Intel Corporation
    Inventors: Steven M. Bennett, Andrew V. Anderson
  • Publication number: 20020184022
    Abstract: A system that identifies recognized words from a voice recognition system that have the lowest possibility of being correct, and flagging those words on a user interface, to help with proofreading.
    Type: Application
    Filed: June 5, 2001
    Publication date: December 5, 2002
    Inventor: Gary F. Davenport
  • Publication number: 20020133344
    Abstract: A system, method and computer program product are provided for speech recognition. During operation, a database of words is maintained. Initially, a probability is assigned to each of the words which indicates a prevalency of use of the word. Further, an utterance is received for speech recognition purposes. Such utterance is matched with one of the words in the database based at least in part on the probability.
    Type: Application
    Filed: January 24, 2001
    Publication date: September 19, 2002
    Inventor: Bertrand A. Damiba
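
    Illustrative note on publication 20020133344: the abstract says an utterance is matched to a database word based at least in part on a per-word usage probability. One common way to do this, shown here only as a minimal sketch and not as the method claimed in the application, is to add the log of each word's prior to a per-word acoustic log-likelihood and pick the highest total. The function name, the scores, and the example words are all hypothetical.

```python
import math


def match_utterance(acoustic_log_likelihoods, word_priors):
    """Return the word with the best combined score, or None if no word qualifies.

    acoustic_log_likelihoods: word -> log-likelihood of the utterance given the word
                              (assumed to come from some recognizer; hypothetical values).
    word_priors: word -> probability reflecting how often the word is used.
    """
    best_word, best_score = None, -math.inf
    for word, log_likelihood in acoustic_log_likelihoods.items():
        prior = word_priors.get(word, 0.0)
        if prior <= 0.0:
            # Words with no usable prior are skipped in this sketch.
            continue
        score = log_likelihood + math.log(prior)
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# Hypothetical example: "cull" fits the audio slightly better,
# but "call" is used far more often, so the prior tips the decision.
priors = {"call": 0.6, "cull": 0.1, "coal": 0.3}
acoustic = {"call": -4.1, "cull": -3.9, "coal": -5.0}
result = match_utterance(acoustic, priors)
```

    In this sketch the prior acts as a tie-breaker between acoustically similar candidates; how heavily it should count relative to the acoustic score is a design choice the abstract does not specify.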