Preliminary Matching Patents (Class 704/247)
-
Patent number: 7864987
Abstract: An access system in one embodiment that first determines that someone has correct credentials by using a non-biometric authentication method such as typing in a password, presenting a Smart card containing a cryptographic secret, or having a valid digital signature. Once the credentials are authenticated, then the user must take at least two biometric tests, which can be chosen randomly. In one approach, the biometric tests need only check a template generated from the user who desires access with the stored templates matching the holder of the credentials authenticated by the non-biometric test. Access desirably will be allowed when both biometric tests are passed.
Type: Grant
Filed: April 18, 2006
Date of Patent: January 4, 2011
Assignee: Infosys Technologies Ltd.
Inventors: Kumar Balepur Venkatanna, Rajat Moona, S V Subrahmanya
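The two-stage flow in the abstract above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the dictionary layout, and the exact-match stand-in for a biometric matcher are all assumptions made for the example.

```python
import random


def exact_match(sample, template):
    # Stand-in matcher (assumption): a real system would compare
    # biometric templates with some tolerance, not by equality.
    return sample == template


def grant_access(credentials_ok, tests, samples, templates, num_tests=2, rng=None):
    """Allow access only if the non-biometric check passed and every
    randomly chosen biometric test passes against the stored template."""
    rng = rng or random.Random()
    if not credentials_ok:
        # The non-biometric step (password, smart card, signature) comes first.
        return False
    # Pick the biometric tests at random, as the abstract allows.
    chosen = rng.sample(sorted(tests), num_tests)
    return all(tests[name](samples[name], templates[name]) for name in chosen)


# Hypothetical registry of available biometric tests.
TESTS = {"fingerprint": exact_match, "voice": exact_match, "iris": exact_match}
```

Templates are looked up only for the holder of the already authenticated credentials, so each biometric check is a one-to-one comparison rather than a search over all enrolled users.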
-
Patent number: 7843364
Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate text input. In response to an ambiguous editing input at a location preceding at least a portion of an output word, the software performs one disambiguation operation with respect to the editing input and another disambiguation operation with respect to the editing input in combination with the at least portion of the output word. The results are output in order of decreasing frequency value, with the results of the one disambiguation operation having the portion of the output word appended thereto.
Type: Grant
Filed: December 30, 2008
Date of Patent: November 30, 2010
Assignee: Research In Motion Limited
Inventors: Michael G. Elizarov, Vadim Fux, Dan Rubanovich
-
Patent number: 7831423
Abstract: A system enables a transcriptionist to replace a first written form (such as an abbreviation) of a concept with a second written form (such as an expanded form) of the same concept. For example, the system may display to the transcriptionist a draft document produced from speech by an automatic speech recognizer. If the transcriptionist recognizes a first written form of a concept that should be replaced with a second written form of the same concept, the transcriptionist may provide the system with a replacement command. In response, the system may identify the second written form of the concept and replace the first written form with the second written form in the draft document.
Type: Grant
Filed: May 25, 2006
Date of Patent: November 9, 2010
Assignee: Multimodal Technologies, Inc.
Inventor: Kjell Schubert
-
Patent number: 7831424
Abstract: A method is presented which reduces data flow and thereby increases processing capacity while preserving a high level of accuracy in a distributed speech processing environment for speaker detection. The method and system of the present invention includes filtering out data based on a target speaker specific subset of labels using data filters. The method preserves accuracy and passes only a fraction of the data by optimizing target specific performance measures. Therefore, a high level of speaker recognition accuracy is maintained while utilizing existing processing capabilities.
Type: Grant
Filed: April 2, 2008
Date of Patent: November 9, 2010
Assignee: International Business Machines Corporation
Inventors: Upendra V. Chaudhari, Juan M. Huerta, Ganesh N. Ramaswamy, Olivier Verscheure
-
Patent number: 7813927
Abstract: There is provided an apparatus for providing a Text Independent (TI) speaker recognition mode in a Text Dependent (TD) Hidden Markov Model (HMM) speaker recognition system and/or a Text Constrained (TC) HMM speaker recognition system. The apparatus includes a Gaussian Mixture Model (GMM) generator and a Gaussian weight normalizer. The GMM generator is for creating a GMM by pooling Gaussians from a plurality of HMM states. The Gaussian weight normalizer is for normalizing Gaussian weights with respect to the plurality of HMM states.
Type: Grant
Filed: June 4, 2008
Date of Patent: October 12, 2010
Assignee: Nuance Communications, Inc.
Inventors: Jiri Navratil, James H. Nealand, Jason W. Pelecanos, Ganesh N. Ramaswamy, Ran D. Zilca
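The pooling and weight normalization named in the abstract above can be illustrated with a short sketch. The abstract does not give the normalization formula, so this example assumes the simplest choice: rescale the pooled mixture weights so they sum to one. The tuple layout (weight, mean, variance) and the function name are likewise assumptions.

```python
def pool_gmm(hmm_states):
    """Pool the Gaussians of several HMM states into one GMM.

    hmm_states: list of states; each state is a list of
    (weight, mean, variance) tuples whose weights sum to 1 within the state.
    Returns one flat list of (weight, mean, variance) whose weights sum to 1.
    """
    # Pooling: collect every Gaussian from every state.
    pooled = [gaussian for state in hmm_states for gaussian in state]
    # Normalization (assumed form): divide by the total pooled weight,
    # which equals the number of states when each state sums to 1.
    total = sum(weight for weight, _, _ in pooled)
    return [(weight / total, mean, var) for weight, mean, var in pooled]
```

With two states, one holding two Gaussians of weight 0.5 and the other a single Gaussian of weight 1.0, the pooled weights become 0.25, 0.25, and 0.5.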
-
Patent number: 7788097
Abstract: A method, system and article of manufacture of recognizing a voice command. One embodiment of the invention comprises: receiving a voice input; using the number of sound fragments, determining a number of sound fragments to be processed in a first set of sound fragments; determining whether the first set of sound fragments of the voice input matches with the first set of sound fragments of a voice command; and if the first set of sound fragments matches with the first set of sound fragments of the voice command, then determining whether one or more remaining sound fragments matches with one or more remaining sound fragments of the voice command.
Type: Grant
Filed: October 31, 2006
Date of Patent: August 31, 2010
Assignee: Nuance Communications, Inc.
Inventors: Joseph H. McIntyre, Victor S. Moore
-
Patent number: 7778832
Abstract: One-to-many comparisons of callers' voice prints with known voice prints to identify any matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract at least a portion of the customer's voice to create a customer voice print, and it formats the segmented voice print for network transmission to a server. The server compares the customer's voice print with multiple known voice prints to determine any matches, meaning that the customer's voice print and one of the known voice prints are likely from the same person. The identification of any matches can be used for a variety of purposes, such as determining whether to authorize a transaction requested by the customer.
Type: Grant
Filed: September 26, 2007
Date of Patent: August 17, 2010
Assignee: American Express Travel Related Services Company, Inc.
Inventors: Vicki Broman, Vernon Marshall, Seshasayee Bellamkonda, Marcel Leyva, Cynthia Hanson
-
Patent number: 7769583
Abstract: A system, method and computer program product for classification of an analog electrical signal using statistical models of training data. A technique is described to quantize the analog electrical signal in a manner which maximizes the compression of the signal while simultaneously minimizing the diminution in the ability to classify the compressed signal. These goals are achieved by utilizing a quantizer designed to minimize the loss in a power of the log-likelihood ratio. A further technique is described to enhance the quantization process by optimally allocating a number of bits for each dimension of the quantized feature vector subject to a maximum number of bits available across all dimensions.
Type: Grant
Filed: May 13, 2006
Date of Patent: August 3, 2010
Assignee: International Business Machines Corporation
Inventors: Upendra V. Chaudhari, Hsin I. Tseng, Deepak S. Turaga, Olivier Verscheure
-
Patent number: 7761296
Abstract: A system and method for rescoring the N-best hypotheses from an automatic speech recognition system by comparing an original speech waveform to synthetic speech waveforms that are generated for each text sequence of the N-best hypotheses. A distance is calculated from the original speech waveform to each of the synthesized waveforms, and the text associated with the synthesized waveform that is determined to be closest to the original waveform is selected as the final hypothesis. The original waveform and each synthesized waveform are aligned to a corresponding text sequence on a phoneme level. The mean of the feature vectors which align to each phoneme is computed for the original waveform as well as for each of the synthesized hypotheses.
Type: Grant
Filed: April 2, 1999
Date of Patent: July 20, 2010
Assignee: International Business Machines Corporation
Inventors: Raimo Bakis, Ellen M. Eide
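The rescoring step in the abstract above can be sketched roughly as follows. The example assumes the phoneme-level alignment has already been done, so each hypothesis arrives with its feature frames grouped per phoneme. The Euclidean distance between per-phoneme mean vectors, the data layout, and the function names are assumptions; the abstract does not specify the distance measure.

```python
import math


def mean_vector(frames):
    """Mean of the feature vectors aligned to one phoneme."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def hypothesis_distance(original_segments, synthetic_segments):
    """Sum, over phonemes, of the distance between the mean feature vector
    of the original speech and that of the synthesized speech."""
    return sum(
        math.dist(mean_vector(orig), mean_vector(synth))
        for orig, synth in zip(original_segments, synthetic_segments)
    )


def rescore(hypotheses):
    """hypotheses: list of (text, original_segments, synthetic_segments),
    where each *_segments is a list with one list of frames per phoneme.
    Returns the text whose synthesized speech is closest to the original."""
    best = min(hypotheses, key=lambda h: hypothesis_distance(h[1], h[2]))
    return best[0]
```

Averaging the frames per phoneme makes the comparison insensitive to duration differences between natural and synthesized speech, which is presumably why the abstract computes means rather than comparing frame by frame.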
-
Publication number: 20100145697
Abstract: Disclosed herein is a similar speaker recognition method and system using nonlinear analysis. The recognition method extracts a nonlinear feature of a sound signal through nonlinear analysis of the sound signal and combines the nonlinear feature with a linear feature such as spectrum. The method transforms sound data in a time domain into status vectors in a phase domain and uses a nonlinear time series analysis method capable of representing nonlinear features of the status vectors to extract nonlinear information of a sound. The method can overcome technical limitations of conventional linear algorithms. The recognition method can be applied to sound-related application systems other than speaker recognition systems.
Type: Application
Filed: October 28, 2009
Publication date: June 10, 2010
Applicant: IUCF-HYU INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY
Inventors: Young-Hun Kwon, Kun-Sang Lee, Sung-IL Yang, Sung-Wook Chang, Jung-Pa Seo, Min-Su Kim, In-Chan Baek
-
Publication number: 20100131273
Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.
Type: Application
Filed: November 25, 2009
Publication date: May 27, 2010
Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit
-
Patent number: 7716048
Abstract: A method and apparatus for segmenting an audio interaction, by locating an anchor segment from each side of the interaction, iteratively classifying additional segments into one of the two sides, and scoring the resulting segmentation. If the score result is below a threshold, the process is repeated until the segmentation score is satisfactory or until a stopping criterion is met. The anchoring and the scoring steps comprise using additional data associated with the interaction, a speaker thereof, internal or external information related to the interaction or to a speaker thereof or the like.
Type: Grant
Filed: January 25, 2006
Date of Patent: May 11, 2010
Assignee: Nice Systems, Ltd.
Inventors: Oren Pereg, Moshe Waserblat
-
Publication number: 20100114572
Abstract: To enable selection of a speaker, the acoustic feature value of which is similar to that of an utterance speaker, with accuracy and stability, while adapting to changes even when the acoustic feature value of the speaker changes every moment. A speaker score calculating means (22) calculates a long-time speaker score (log likelihood of each of a plurality of speaker models stored in a speaker model storage section (31) with respect to the acoustic feature value) based on an arbitrary number of utterances, for example, and calculates a short-time speaker score based on a short-time utterance, for example. A long-time speaker selecting means 23 selects speakers corresponding to a predetermined number of speaker models having a high long-time speaker score.
Type: Application
Filed: February 29, 2008
Publication date: May 6, 2010
Inventors: Masahiro Tani, Tadashi Emori, Yoshifumi Onishi
-
Patent number: 7711560
Abstract: A speech recognition apparatus equipped with the garbage acoustic model storage unit storing the garbage acoustic model which learned the collection of unnecessary words. A feature value calculation unit calculates the feature parameter necessary for recognition by acoustically analyzing the unidentified input speech including the non-language speech per frame which is a unit for speech analysis. A garbage acoustic score calculation unit calculates the garbage acoustic score by comparing the feature parameter and the garbage acoustic model, and a garbage acoustic score correction unit corrects the garbage acoustic score calculated by the garbage acoustic score calculation unit so as to raise it in the frame where the non-language speech is inputted.
Type: Grant
Filed: February 4, 2004
Date of Patent: May 4, 2010
Assignee: Panasonic Corporation
Inventors: Maki Yamada, Makoto Nishizaki, Yoshihisa Nakatoh, Shinichi Yoshizawa
-
Patent number: 7636661
Abstract: A method and arrangement for improved speech recognition in a telephonically challenging speakerphone in-car environment. The method includes receiving a signal from a microphone representative of speech to be recognized, performing detection of a transition in the signal indicative of switch on of the microphone, and, in response to the detection, performing speech recognition on the signal with reduced contribution from an initial portion thereof. The initial portion may be treated as optional speech, the speech recognition may be performed with a predetermined redundant sound, and a user may be requested to speak the predetermined redundant sound when speech recognition has fallen below a predetermined threshold. Thus, recognition may be made possible when otherwise it would not be possible, recognition match scoring will be increased as the low weighting given by deleted initial sounds will be eliminated and therefore confusion of the recognized phrase will be reduced.
Type: Grant
Filed: June 30, 2005
Date of Patent: December 22, 2009
Assignee: Nuance Communications, Inc.
Inventors: Adam Pieter De Leeuw, Steven Groeger, Stuart John Hayton
-
Patent number: 7624012
Abstract: The invention enables to generate a general function (4) which can operate on an input signal (Sx) to extract from the latter a value (DVex) of a global characteristic value expressing a feature (De) of the information conveyed by that signal. It operates by: generating at least one compound function (CF1-CFn), said compound function being generated from at least one of a set of elementary functions (EF1, EF2, . . .
Type: Grant
Filed: December 16, 2003
Date of Patent: November 24, 2009
Assignee: Sony France S.A.
Inventors: François Pachet, Aymeric Zils
-
Patent number: 7603274
Abstract: A method and apparatus for determining the possibility of pattern recognition of time series signal independent of a pattern recognition ratio is provided. The method for determining the possibility of pattern recognition of time series signal includes extracting a time forward feature and a time reversed feature from an input signal having a time series pattern, generating time forward alignment and time reversed alignment by using the time forward feature and the time reversed feature, comparing the time forward alignment with the time reversed alignment to compute a likelihood of pattern recognition, and determining that the input signal can be recognized if the likelihood is larger than a predetermined threshold value.
Type: Grant
Filed: November 2, 2005
Date of Patent: October 13, 2009
Assignee: Samsung Electronics Co., Ltd.
Inventor: Kwangil Hwang
-
Patent number: 7590537
Abstract: A speech recognition method and apparatus perform speaker clustering and speaker adaptation using average model variation information over speakers while analyzing the quantity variation amount and the directional variation amount. In the speaker clustering method, a speaker group model variation is generated based on the model variation between a speaker-independent model and a training speaker ML model. In the speaker adaptation method, the model in which the model variation between a test speaker ML model and a speaker group ML model to which the test speaker belongs which is most similar to a training speaker group model variation is found, and speaker adaptation is performed on the found model. Herein, the model variation in the speaker clustering and the speaker adaptation are calculated while analyzing both the quantity variation amount and the directional variation amount. The present invention may be applied to any speaker adaptation algorithm of MLLR and MAP.
Type: Grant
Filed: December 27, 2004
Date of Patent: September 15, 2009
Assignee: Samsung Electronics Co., Ltd.
Inventors: Namhoon Kim, Injeong Choi, Yoonkyung Song
-
Patent number: 7586423
Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate text input. In response to an ambiguous editing input at a location preceding at least a portion of an output word, the software performs one disambiguation operation with respect to the editing input and another disambiguation operation with respect to the editing input in combination with the at least portion of the output word. The results are output in order of decreasing frequency value, with the results of the one disambiguation operation having the portion of the output word appended thereto.
Type: Grant
Filed: June 30, 2006
Date of Patent: September 8, 2009
Assignee: Research In Motion Limited
Inventors: Michael G. Elizarov, Vadim Fux, Dan Rubanovich
-
Patent number: 7567901
Abstract: Systems and methods for bio-phonetic multi-phrase speaker identity verification are disclosed. Generally, a speaker identity verification engine generates a dynamic phrase including at least one dynamically-generated word. The speaker identity verification engine prompts a user to speak the dynamic phrase and receives a dynamic phrase utterance. The speaker identity verification engine extracts at least one voice characteristic from the dynamic phrase utterance and compares the at least one voice characteristic with a voice profile to generate a score. The speaker identity verification engine then determines whether to accept a speaker identity claim based on the score.
Type: Grant
Filed: April 13, 2007
Date of Patent: July 28, 2009
Assignee: AT&T Intellectual Property 1, L.P.
Inventor: Hisao M. Chang
-
Patent number: 7545290
Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software that is operable to disambiguate compound text input. The device is able to assemble language objects in the memory to generate compound language solutions. The device is able to prioritize compound language solutions according to various criteria, including the degree of completeness of the text components of a compound language solution.
Type: Grant
Filed: January 13, 2006
Date of Patent: June 9, 2009
Assignee: Research In Motion Limited
Inventors: Vadim Fux, Michael Elizarov
-
Patent number: 7471775
Abstract: A method and apparatus (100) for updating a voice tag comprising N stored voice tag phoneme sequences includes a function (110) for determining (205) an accepted stored voice tag phoneme sequence for an utterance, a function (140) for extracting (210) a current set of M phoneme sequences having highest likelihoods of representing the utterance, a function (160) for updating (215) a reference histogram associated with the accepted voice tag, and a function (160) for updating (225) the voice tag with N selected phoneme sequences that are selected from the current set of M phoneme sequences and the set of N voice tag phoneme sequences, wherein the N selected phoneme sequences have phoneme histograms most closely matching the reference histogram. The method and apparatus (100) also generates a voice tag using some functions (110, 140, 160) that are common with the method and apparatus to update the voice tag, such as the extracting (410) of the current set of M phoneme sequences.
Type: Grant
Filed: June 30, 2005
Date of Patent: December 30, 2008
Assignee: Motorola, Inc.
Inventor: Yan Ming Cheng
-
Patent number: 7447632
Abstract: A voice authentication system includes: a standard template storage part 17 in which a standard template that is generated from a registered voice of an authorized user and featured with a voice characteristic of the registered voice is stored preliminarily in a state of being associated with a personal ID of the authorized user; an identifier input part 15 that allows a user who intends to be authenticated to input a personal ID; a voice input part 11 that allows the user to input a voice; a standard template/registered voice selection part 16 that selects a standard template and a registered voice corresponding to the inputted identifier; a determination part 14 that refers to the selected standard template and determines whether or not the inputted voice is a voice of the authorized user him/herself and whether or not presentation-use information is to be outputted by referring to a predetermined determination reference; a presentation-use information extraction part 19 that extracts information regarding
Type: Grant
Filed: September 29, 2005
Date of Patent: November 4, 2008
Assignee: Fujitsu Limited
Inventor: Taisuke Itou
-
Patent number: 7447633
Abstract: There is provided an apparatus for providing a Text Independent (TI) speaker recognition mode in a Text Dependent (TD) Hidden Markov Model (HMM) speaker recognition system and/or a Text Constrained (TC) HMM speaker recognition system. The apparatus includes a Gaussian Mixture Model (GMM) generator and a Gaussian weight normalizer. The GMM generator is for creating a GMM by pooling Gaussians from a plurality of HMM states. The Gaussian weight normalizer is for normalizing Gaussian weights with respect to the plurality of HMM states.
Type: Grant
Filed: November 22, 2004
Date of Patent: November 4, 2008
Assignee: International Business Machines Corporation
Inventors: Jiri Navratil, James H. Nealand, Jason W. Pelecanos, Ganesh N. Ramaswamy, Ran D. Zilca
-
Publication number: 20080243504
Abstract: An interactive speech recognition training process and system is disclosed. A speech recognition process is applied to a received speaker utterance. Utterance data are matched by the system with data in a grammar database and the speaker is requested to confirm a determined match. If the system determines from the speaker's response that the match is not confirmed, a negative score is assigned to the utterance data. If the match is determined by the system to be confirmed, a positive score is assigned to the utterance data. Scores for a plurality of such speaker utterances are accumulated in a log file, the accumulated scores used to adjust acoustic models for the grammar database.
Type: Application
Filed: March 30, 2007
Publication date: October 2, 2008
Applicant: Verizon Data Services, Inc.
Inventor: Parind Poi
-
Publication number: 20080228481
Abstract: Embodiments of the present invention improve content selection systems and methods using speech recognition. In one embodiment, the present invention includes a speech recognition method comprising storing content on an electronic device, wherein the content is associated with a plurality of content attribute values, adding the content attribute values to a first recognition set of a speech recognizer, receiving a speech input signal in said speech recognizer, generating a plurality of likelihood values in response to the speech input signal, wherein each likelihood value is associated with one content attribute value in the recognition set; and accessing the stored content based on the likelihood values.
Type: Application
Filed: March 13, 2007
Publication date: September 18, 2008
Applicant: Sensory, Incorporated
Inventor: Todd F. Mozer
-
Patent number: 7424425
Abstract: In detection systems, such as speaker verification systems, for a given operating point range, with an associated detection “cost”, the detection cost is preferably reduced by essentially trading off the system error in the area of interest with areas essentially “outside” that interest. Among the advantages achieved thereby are higher optimization gain and better generalization. From a measurable Detection Error Tradeoff (DET) curve of the given detection system, a criterion is preferably derived, such that its minimization provably leads to detection cost reduction in the area of interest. The criterion allows for selective access to the slope and offset of the DET curve (a line in case of normally distributed detection scores, a curve approximated by mixture of Gaussians in case of other distributions). By modifying the slope of the DET curve, the behavior of the detection system is changed favorably with respect to the given area of interest.
Type: Grant
Filed: May 19, 2002
Date of Patent: September 9, 2008
Assignee: International Business Machines Corporation
Inventors: Jiri Navratil, Ganesh N. Ramaswamy
-
Patent number: 7421387
Abstract: A method for reducing recognition errors. The method includes receiving an N-best list associated with an input of a computer based recognition system. The N-best list includes one or more hypotheses and associated confidence values. The input is classified in response to the N-best list, resulting in a classification. A re-scoring algorithm that is tuned for the classification is selected. The re-scoring algorithm is applied to the N-best list to create a re-scored N-best list. A hypothesis for the value of the input is selected based on the re-scored N-best list.
Type: Grant
Filed: May 18, 2004
Date of Patent: September 2, 2008
Assignee: General Motors Corporation
Inventor: Kurt S. Godden
-
Patent number: 7401017
Abstract: Method and apparatus for multi-pass speech recognition. An input device receives spoken input. A processor performs a first pass speech recognition technique on the spoken input and forms first pass results. The first pass results include a number of alternative speech expressions, each having an assigned score related to the certainty that the corresponding expression correctly matches the spoken input. The processor selectively performs a second pass speech recognition technique on the spoken input according to the first pass results. Preferably, the second pass attempts to correctly match the spoken input to only those expressions which were identified during the first pass. Otherwise, if one of the expressions identified by the first pass is assigned a score higher than a predetermined threshold (e.g., 95%), the second pass is not performed.
Type: Grant
Filed: April 4, 2006
Date of Patent: July 15, 2008
Assignee: Nuance Communications
Inventors: Hy Murveit, Ashvin Kannan, Ben Shahshahani, Chris Leggetter, Katherine Knill
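The selective second pass described in the abstract above can be sketched as below. The recognizers are passed in as plain callables, which is an assumption made to keep the example self-contained; the 0.95 default mirrors the 95% example in the abstract, and all names are hypothetical.

```python
def recognize(audio, first_pass, second_pass, threshold=0.95):
    """Run the first pass; skip the second pass when one candidate is
    already scored above the threshold.

    first_pass(audio) -> list of (expression, score) alternatives.
    second_pass(audio, expressions) -> the chosen expression, restricted
    to the expressions the first pass identified.
    """
    candidates = first_pass(audio)
    best_expression, best_score = max(candidates, key=lambda c: c[1])
    if best_score > threshold:
        # Confident enough: the second pass is not performed.
        return best_expression
    # Otherwise re-examine only the first-pass alternatives.
    return second_pass(audio, [expression for expression, _ in candidates])
```

Restricting the second pass to the first-pass alternatives keeps the more expensive recognizer's search space small, which is the practical benefit of the two-pass arrangement.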
-
Patent number: 7386448
Abstract: A system and method enrolls a speaker with an enrollment utterance and authenticates a user with a biometric analysis of an authentication utterance, without the need for a PIN (Personal Identification Number). During authentication, the system uses the same authentication utterance to identify who a speaker claims to be with speaker recognition, and verify whether the speaker is actually the claimed person. Thus, it is not necessary for the speaker to identify biometric data using a PIN. The biometric analysis includes a neural tree network to determine unique aspects of the authentication utterances for comparison to the enrollment authentication. The biometric analysis leverages a statistical analysis using Hidden Markov Models before authorizing the speaker.
Type: Grant
Filed: June 24, 2004
Date of Patent: June 10, 2008
Assignee: T-Netix, Inc.
Inventors: John C. Poss, Dag Boye, Mark W. Mobley
-
Patent number: 7376434
Abstract: Improved approaches for users of electronic devices to communicate with one another are disclosed. The electronic devices have audio and/or textual output capabilities. The improved approaches can enable users to communicate in different ways depending on device configuration, user preferences, prior history, etc. In one embodiment, the communication between users is achieved by short audio or textual messages.
Type: Grant
Filed: August 2, 2006
Date of Patent: May 20, 2008
Assignee: IpVenture, Inc.
Inventors: C. Douglass Thomas, Peter P. Tong
-
Patent number: 7312726
Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device. The device enables editing during text entry and also provides a learning function that allows the disambiguation function to adapt to provide a customized experience for the user. The disambiguation function can be selectively disabled and an alternate keystroke interpretation system provided.
Type: Grant
Filed: June 2, 2004
Date of Patent: December 25, 2007
Assignee: Research In Motion Limited
Inventors: Vadim Fux, Michael G. Elizarov, Sergey V. Kolomiets
-
Patent number: 7177808
Abstract: Method for improving speaker identification by determining usable speech. Degraded speech is preprocessed in a speaker identification (SID) process to produce SID usable and SID unusable segments. Features are extracted and analyzed so as to produce a matrix of optimum classifiers for the detection of SID usable and SID unusable speech segments. Optimum classifiers possess a minimum distance from a speaker model. A decision tree based upon fixed thresholds indicates the presence of a speech feature in a given speech segment. Following preprocessing, degraded speech is measured in one or more time, frequency, cepstral or SID usable/unusable domains. The results of the measurements are multiplied by a weighting factor whose value is proportional to the reliability of the corresponding time, frequency, or cepstral measurements performed. The measurements are fused as information, and usable speech segments are extracted for further processing.
Type: Grant
Filed: August 18, 2004
Date of Patent: February 13, 2007
Assignee: The United States of America as represented by the Secretary of the Air Force
Inventors: Robert E. Yantorno, Daniel S. Benincasa, Stanley J. Wenndt, Brett Y. Smolenski
-
Patent number: 7146315
Abstract: A multichannel source activity detection system, e.g., a voice activity detection (VAD) system, and method that exploits spatial localization of a target audio source is provided. The method includes the steps of receiving a mixed sound signal by at least two microphones; Fast Fourier transforming each received mixed sound signal into the frequency domain; filtering the transformed signals to output a signal corresponding to a spatial signature of a source; summing an absolute value squared of the filtered signal over a predetermined range of frequencies; and comparing the sum to a threshold to determine if a voice is present. Additionally, the filtering step includes multiplying the transformed signals by an inverse of a noise spectral power matrix, a vector of channel transfer function ratios, and a source signal spectral power.
Type: Grant
Filed: August 30, 2002
Date of Patent: December 5, 2006
Assignee: Siemens Corporate Research, Inc.
Inventors: Radu Victor Balan, Justinian Rosca, Christophe Beaugeant
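The last two steps of the method in the abstract above (summing the squared magnitude of the filtered signal over a frequency range, then comparing against a threshold) can be sketched as follows. The sketch assumes the FFT and spatial filtering have already produced one complex value per frequency bin; the function names, the half-open bin range, and the strict "greater than" comparison are assumptions.

```python
def band_energy(filtered_spectrum, band):
    """Sum |X[k]|^2 over the frequency bins lo <= k < hi."""
    lo, hi = band
    return sum(abs(value) ** 2 for value in filtered_spectrum[lo:hi])


def voice_present(filtered_spectrum, band, threshold):
    """Declare a voice present when the in-band energy exceeds the threshold."""
    return band_energy(filtered_spectrum, band) > threshold
```

For a filtered spectrum of [0, 3+4j, 1, 10] and bins 1 to 2, the energy is 25 + 1 = 26, so a threshold of 20 reports a voice and a threshold of 30 does not.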
-
Patent number: 7054812
Abstract: A system is provided for determining a sequence of sub-word units representative of at least two words output by a word recognition unit in response to an input word to be recognized. In a preferred embodiment, the word alternatives output by the recognition unit are converted into sequences of phonemes. An optimum alignment between these sequences is then determined using a dynamic programming alignment technique. The sequence of phonemes representative of the input sequences is then determined using this optimum alignment.
Type: Grant
Filed: April 25, 2001
Date of Patent: May 30, 2006
Assignee: Canon Kabushiki Kaisha
Inventors: Jason Peter Andrew Charlesworth, Philip Neil Garner
-
Patent number: 7054811
Abstract: A system for verifying and enabling user access, which includes a voice registration unit for providing a substantially unique and initial identification of each of a plurality of the speaker/users by finding the speaker/user's voice parameters in a voice registration sample and storing same in a database. The system also includes a voice authenticating unit for substantially absolute verification of an identity of one of said plurality of users. The voice authenticating unit includes a recognition unit for providing a voice authentication sample, and being operative with the database. The voice authenticating unit also includes a decision unit operative with the recognition unit and the database to decide whether the user is the same as the person of the same identity registered with the system, such that the identity of one of the plurality of users is substantially absolutely verified.
Type: Grant
Filed: October 6, 2004
Date of Patent: May 30, 2006
Assignee: Cellmax Systems Ltd.
Inventor: Ziv Barzilay
-
Patent number: 6937702
Abstract: Method, apparatus, and computer-readable media for minimizing the risk of fraudulent access to call center resources. The invention described herein provides a method of minimizing fraudulent access to call center resources, with the method including at least the following. One or more authenticated biometric samples are associated with at least one person. The person then submits at least one test biometric sample during a login process to obtain authorization to access call center resources, for example, to process telephone calls or to receive training. This test biometric sample is captured and the differences between the test biometric sample and the one or more authenticated biometric samples are quantified. Depending on the degree of difference between the at least one authenticated biometric sample and the test biometric sample, the person's request for authorization to access call center resources is dispositioned.
Type: Grant
Filed: June 24, 2002
Date of Patent: August 30, 2005
Assignee: West Corporation
Inventors: Jill M. Vacek, Mark J. Pettay, Hendryanto Rilantono, Mahmood S. Akhwand, Gary L. West
-
Patent number: 6925154
Abstract: Techniques for providing an automated conversational name dialing system for placing a call in response to an input by a user. One technique begins with the step of analyzing an input from a user, wherein the input includes information directed to identifying an intended recipient of a telephone call from the user. At least one candidate for the intended recipient is identified in response to the input, wherein the at least one candidate represents at least one potential match between the intended recipient and a predetermined vocabulary. A confidence measure indicative of a likelihood that the at least one candidate is the intended recipient is determined, and additional information is obtained from the user to increase the likelihood that the at least one candidate is the intended recipient, based on the determined confidence measure.
Type: Grant
Filed: May 3, 2002
Date of Patent: August 2, 2005
Assignee: International Business Machines Corporation
Inventors: Yuqing Gao, Bhuvana Ramabhadran, Chengjun Julian Chen, Hakan Erdogan, Michael A. Picheny
-
Publication number: 20040260549
Abstract: A voice recognition system includes an adaptive filter and a subtractor. The adaptive filter generates a simulated talk-back voice y(n) by setting a filter coefficient simulating a transfer system in which an input voice corresponding to a voice command and a talk-back voice output from a speaker are input into a microphone and by filtering a talk-back voice x(n). The subtractor extracts the input voice by subtracting the simulated talk-back voice y(n) from mixed sound input into the microphone. With this configuration, the talk-back voice is attenuated from the mixed sound including the input voice and the talk-back voice input into the microphone, and then, the mixed sound is supplied to a voice recognition engine. Accordingly, the user can input his/her voice during a talk-back operation without the need to interrupt it by pressing a speech button every time the user wishes to input the voice. The voice recognition operation time can be thus reduced.
Type: Application
Filed: April 30, 2004
Publication date: December 23, 2004
Inventors: Shuichi Matsumoto, Toru Marumoto
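The adaptive filter and subtractor arrangement in the abstract of 20040260549 can be sketched as follows. The abstract does not say how the filter coefficients are set, so the normalized LMS update used here is an assumption, and the function name `cancel_talkback` and the parameters `taps`, `mu`, and `eps` are hypothetical.

```python
def cancel_talkback(x, mic, taps=4, mu=0.5, eps=1e-8):
    """Attenuate the talk-back voice x(n) in the microphone signal.

    x:   talk-back samples sent to the speaker
    mic: mixed sound picked up by the microphone
    Returns (residual signal, final filter coefficients).
    The normalized LMS update is an illustrative assumption.
    """
    w = [0.0] * taps        # coefficients simulating the speaker-to-microphone path
    history = [0.0] * taps  # most recent talk-back samples, newest first
    out = []
    for x_n, d_n in zip(x, mic):
        history = [x_n] + history[:-1]
        # Simulated talk-back voice y(n): the filter applied to x(n).
        y_n = sum(w_k * h_k for w_k, h_k in zip(w, history))
        # Subtractor: mixed sound minus simulated talk-back.
        e_n = d_n - y_n
        # Normalized LMS coefficient update (assumed adaptation rule).
        power = sum(h_k * h_k for h_k in history) + eps
        w = [w_k + mu * e_n * h_k / power for w_k, h_k in zip(w, history)]
        out.append(e_n)
    return out, w
```

In this sketch the residual `out` is what would be passed on to the recognition engine. When the microphone picks up only the talk-back voice, the residual decays toward zero as the coefficients approach the actual transfer path.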
-
Publication number: 20040162726
Abstract: A speaker identity claim (SIC) utterance is received and recognized. The SIC utterance is compared with a voice profile registered under the SIC, and a first verification decision is based thereon. A first dynamic phrase (FDP) is generated, and a user is prompted to speak same. An FDP utterance is received, and compared with the voice profile registered under the SIC to make a second verification decision. If the second verification decision indicates a high or low confidence level, the speaker identity claim is accepted or rejected, respectively. If the verification decision indicates a medium confidence level, a second dynamic phrase (SDP) is generated, and the user is prompted to speak same. An SDP utterance is received, and compared with the voice profile registered under the SIC to make a third verification decision. The speaker identity claim is accepted or rejected based on the third verification decision.
Type: Application
Filed: February 13, 2003
Publication date: August 19, 2004
Inventor: Hisao M. Chang
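The accept, reject, or re-prompt logic in the abstract of 20040162726 can be outlined in a short sketch. The numeric thresholds, the function name `decide`, and the idea of a single similarity score per utterance are all assumptions for illustration. The abstract does not say how the first (SIC) decision feeds into the outcome, so it is omitted.

```python
def decide(fdp_score, get_sdp_score, low=0.4, high=0.8, final=0.6):
    """Return "accept" or "reject" for a speaker identity claim.

    fdp_score:     similarity of the first dynamic phrase (FDP) utterance
                   to the registered voice profile
    get_sdp_score: callable that prompts for a second dynamic phrase (SDP)
                   and returns its similarity score; called only when needed
    low, high, final: hypothetical thresholds, not taken from the source
    """
    if fdp_score >= high:   # high confidence: accept the claim
        return "accept"
    if fdp_score < low:     # low confidence: reject the claim
        return "reject"
    # Medium confidence: prompt for a second phrase and decide on that.
    sdp_score = get_sdp_score()
    return "accept" if sdp_score >= final else "reject"
```

Passing the second score as a callable keeps the second prompt lazy, so the user is asked for another phrase only in the medium-confidence case.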
-
Publication number: 20040128131
Abstract: An audible command can be utilized to both permit identification of the speaker and to permit subsequent actions that comprise a corresponding response to the audible command when the identity of the speaker correlates with that of a previously authorized individual. Such identification can be supplemented with other identification mechanisms. Hierarchical levels of permission can be utilized, with or without confidence level thresholds, to further protect the device against unauthorized access and/or manipulation.
Type: Application
Filed: December 26, 2002
Publication date: July 1, 2004
Applicant: Motorola, Inc.
Inventors: William Campbell, Robert Gardner, Charles Broun
-
Patent number: 6741962
Abstract: A speech recognition system for recognizing an input voice of a narrow frequency band. The speech recognition system includes: a frequency band converting unit for converting the input voice of the narrow frequency band into a pseudo voice of a wide frequency band which covers an entirety of the narrow frequency band and which is wider than the narrow frequency band.
Type: Grant
Filed: March 7, 2002
Date of Patent: May 25, 2004
Assignee: NEC Corporation
Inventor: Kenichi Iso
-
Patent number: 6697778
Abstract: Client speaker locations in a speaker space are used to generate speech models for comparison with test speaker data or test speaker speech models. The speaker space can be constructed using training speakers that are entirely separate from the population of client speakers, or from client speakers, or from a mix of training and client speakers. Reestimation of the speaker space based on client environment information is also provided to improve the likelihood that the client data will fall within the speaker space. During enrollment of the clients into the speaker space, additional client speech can be obtained when predetermined conditions are met. The speaker distribution can also be used in the client enrollment step.
Type: Grant
Filed: July 5, 2000
Date of Patent: February 24, 2004
Assignee: Matsushita Electric Industrial Co., Ltd.
Inventors: Roland Kuhn, Olivier Thyes, Patrick Nguyen, Jean-Claude Junqua, Robert Boman
-
Publication number: 20030229492
Abstract: In one embodiment the present invention provides a method for identity verification comprising the steps of: (a) comparing at least one first spoken voice print of a user speaking at least one piece of personal data against a first stored voice print of the user speaking said at least one piece of personal data; (b) comparing at least one second spoken voice print of the user speaking at least one piece of travel data against a second stored voice print of the user speaking said piece of travel data; and (c) determining if the user is a given individual based on the results of step (a) and step (b).
Type: Application
Filed: September 13, 2002
Publication date: December 11, 2003
Inventor: Marc Edward Nolan
-
Publication number: 20030125947
Abstract: A voice model database server determines the identity of a speaker through a network over which the voice model database server provides to one or more speech-recognition systems output data regarding a person with access to the speech-recognition system receiving the output data. The voice model database server attempts to locate, based on the identity of the speaker, a voice model for the speaker. Finally, the voice model database server retrieves from a storage area the voice model for the speaker, if the voice model database server located a voice model for the speaker.
Type: Application
Filed: January 3, 2002
Publication date: July 3, 2003
Inventor: Michael Allen Yudkowsky
-
Patent number: 6563911
Abstract: The present invention is a speech enabled automatic telephone dialer device, system, and method using a spoken name corresponding to name-telephone number data of computer-based address book programs. The invention includes user telephones connected to a PBX-type telephony mechanism, which is connected to a telephony board of a name dialer device. User computer workstations containing loaded address book programs with name-telephone number data are connected to the name dialer device. The name dialer device includes a host computer in a network; a telephony board for controlling the PBX for dialing; memory within the host computer for storing software and name-telephone number data; and, software to access computer-based address book programs, to receive voice inputs from the PBX-type telephony mechanism, to create converted phonemes from names to match voice inputs with specific name-telephone number data from the computer-based address book programs for initiating an automatic dialing.
Type: Grant
Filed: January 23, 2001
Date of Patent: May 13, 2003
Assignee: iVoice, Inc.
Inventor: Jerome R. Mahoney
-
Publication number: 20030004720
Abstract: A system and method for extracting acoustic features and speech activity on a device and transmitting them in a distributed voice recognition system. The distributed voice recognition system includes a local VR engine in a subscriber unit and a server VR engine on a server. The local VR engine comprises a feature extraction (FE) module that extracts features from a speech signal, and a voice activity detection module (VAD) that detects voice activity within a speech signal. The system includes filters, framing and windowing modules, power spectrum analyzers, a neural network, a nonlinear element, and other components to selectively provide an advanced front end vector including predetermined portions of the voice activity detection indication and extracted features from the subscriber unit to the server.
Type: Application
Filed: January 28, 2002
Publication date: January 2, 2003
Inventors: Harinath Garudadri, Hynek Hermansky, Lukas Burget, Pratibha Jain, Sachin Kajarekar, Sunil Sivadas, Stephane N. Dupont, Maria Carmen Benitez Ortuzar, Nelson H. Morgan
-
Publication number: 20020193991
Abstract: A method and system for utilizing multiple speech recognizers. The speech system includes a port through which an input audio stream may be received, at least two recognizers that may convert the input stream to text or commands, and a combiner able to combine lists of possible results from each recognizer into a combined list. The method includes receiving an input audio stream, routing the stream to one or more recognizers, receiving a list of possible results from each of the recognizers, combining the lists into a combined list and returning at least a subset of the list to the application.
Type: Application
Filed: June 13, 2001
Publication date: December 19, 2002
Applicant: Intel Corporation
Inventors: Steven M. Bennett, Andrew V. Anderson
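The combiner step in the abstract of 20020193991 might look like the sketch below. The abstract does not state how the lists are merged, so summing per-recognizer scores is an assumption, as are the function name `combine_results` and the `(text, score)` representation of each possible result.

```python
def combine_results(result_lists, top_n=None):
    """Merge per-recognizer result lists into one ranked list.

    result_lists: one list per recognizer, each holding (text, score) pairs
    top_n:        how many combined results to return (None returns all)
    Summing scores across recognizers is an illustrative assumption.
    """
    totals = {}
    for results in result_lists:
        for text, score in results:
            totals[text] = totals.get(text, 0.0) + score
    # Highest combined score first; ties broken alphabetically for stability.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]
```

Calling `combine_results(lists, top_n=3)` corresponds to returning a subset of the combined list to the application.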
-
Publication number: 20020184022
Abstract: A system that identifies recognized words from a voice recognition system that have the lowest possibility of being correct, and flags those words on a user interface, to help with proofreading.
Type: Application
Filed: June 5, 2001
Publication date: December 5, 2002
Inventor: Gary F. Davenport
-
Publication number: 20020133344
Abstract: A system, method and computer program product are provided for speech recognition. During operation, a database of words is maintained. Initially, a probability is assigned to each of the words which indicates a prevalency of use of the word. Further, an utterance is received for speech recognition purposes. Such utterance is matched with one of the words in the database based at least in part on the probability.
Type: Application
Filed: January 24, 2001
Publication date: September 19, 2002
Inventor: Bertrand A. Damiba
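The probability-weighted matching in the abstract of 20020133344 can be pictured with a small sketch. The abstract says only that the match is based at least in part on each word's usage probability, so multiplying an acoustic match score by that probability is an assumption, and the names `best_word`, `acoustic_scores`, and `priors` are hypothetical.

```python
def best_word(acoustic_scores, priors):
    """Pick the database word that best matches an utterance.

    acoustic_scores: word -> how well the utterance matches that word
    priors:          word -> probability reflecting how often the word is used
    The product of the two is an assumed scoring rule, not the patent's.
    """
    return max(
        acoustic_scores,
        key=lambda word: acoustic_scores[word] * priors.get(word, 0.0),
    )
```

With this rule, a commonly used word can win over a rarely used one even when the rare word has a slightly better acoustic match.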