Distance Patents (Class 704/238)
  • Patent number: 9959850
    Abstract: It is inter alia disclosed a method comprising: determining a divergence measure between a statistical distribution of audio features of a first audio track and a statistical distribution of audio features of at least one further audio track; determining a divergence measure threshold value from at least the divergence measure between the statistical distribution of audio features of a first audio track and the statistical distribution of audio features of the at least one further audio track; and comparing the divergence measure with the divergence measure threshold value.
    Type: Grant
    Filed: June 9, 2014
    Date of Patent: May 1, 2018
    Assignee: Nokia Technologies Oy
    Inventors: Antti Eronen, Jussi Leppänen
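    Illustration (not taken from the patent): the abstract above compares a divergence measure between two feature distributions against a threshold derived from the divergences themselves. A minimal sketch of that idea, assuming each track is summarized by a diagonal-covariance Gaussian and that the threshold is a scaled mean of the observed divergences. The function names, the symmetric KL divergence, and the `scale` parameter are all assumptions made for this example.

```python
import math


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) for two Gaussians with diagonal covariance."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_kl(p, q):
    """Symmetrized KL divergence; p and q are (means, variances) pairs."""
    (mu_p, var_p), (mu_q, var_q) = p, q
    return (kl_diag_gauss(mu_p, var_p, mu_q, var_q)
            + kl_diag_gauss(mu_q, var_q, mu_p, var_p))


def compare_tracks(first, others, scale=1.0):
    """Compare each divergence with a threshold derived from the divergences.

    The threshold rule (scaled mean) is a hypothetical choice for this sketch.
    Returns ([(divergence, is_within_threshold), ...], threshold).
    """
    divergences = [symmetric_kl(first, other) for other in others]
    threshold = scale * sum(divergences) / len(divergences)
    return [(d, d <= threshold) for d in divergences], threshold
```

    With per-track means and variances computed from any frame-level features, `compare_tracks(track_a, [track_b, track_c])` returns each divergence with a flag saying whether it falls at or below the derived threshold.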
  • Patent number: 9928839
    Abstract: Methods and systems for authenticating a user are described. In some embodiments, a one-time token and a recording of the one-time token is read aloud by the user. The voice characteristics derived from the recording of the one-time token are compared with voice characteristics derived from samples of the user's voice. The user may be authenticated when the one-time token is verified and when a match of the voice characteristics derived from the recording of the one-time token and the voice characteristics derived from the samples of the user's voice meet or exceed a threshold.
    Type: Grant
    Filed: April 16, 2014
    Date of Patent: March 27, 2018
    Assignee: United Services Automobile Association (USAA)
    Inventors: Michael Wayne Lester, Debra Randall Casillas, Sudarshan Rangarajan, John Shelton, Maland Keith Mortensen
  • Patent number: 9911410
    Abstract: A method, computer program product, and system for adapting speech recognition of a user's speech is provided. The method includes receiving a first utterance from a user having a duration below a predetermined threshold, identifying at least one further utterance from the user that provides additional information, generating a concatenated utterance by concatenating the first utterance with the at least one further utterance, transmitting the concatenated utterance to a speech recognition server, receiving a transcription of the concatenated utterance from the speech recognition server that includes a transcription of the first utterance, and extracting the transcription of the first utterance from the transcription of the concatenated utterance. The transcription of the first utterance is based on the additional information provided by the at least one further utterance.
    Type: Grant
    Filed: August 19, 2015
    Date of Patent: March 6, 2018
    Assignee: International Business Machines Corporation
    Inventor: Shay Ben-David
  • Patent number: 9886957
    Abstract: A voice recognition system and method are provided. The voice recognition system includes a voice input unit configured to receive learning voice data and a target label including consonant and vowel (letter) information representing the learning voice data, and divide the learning voice data into windows having a predetermined size; a first voice recognition unit configured to learn features of the divided windows using a first neural network model and the target label; a second voice recognition unit configured to learn a time-series pattern of the extracted features using a second neural network model; and a text output unit configured to convert target voice data input to the voice input unit into a text based on learning results of the first voice recognition unit and the second voice recognition unit, and output the text.
    Type: Grant
    Filed: December 29, 2015
    Date of Patent: February 6, 2018
    Assignee: SAMSUNG SDS CO., LTD.
    Inventors: Ji-Hyeon Seo, Jae-Young Lee, Byung-Wuek Lee, Kyung-Jun An
  • Patent number: 9870774
    Abstract: A vehicular apparatus is provided. In the vehicular apparatus, a first controller is disposed on a first board and a second controller is disposed on a second board which is exchangeable with respect to the first board. An AD converter performs A/D conversion of first and second speech data. A switch disposed on the first board is switchable between a first connection state in which the switch outputs the first speech data inputted from the A/D converter and a second connection state in which the switch outputs a sound data different from each of the first and second speech data. A switch controller controls the switch so that the switch is in the second connection state when the second controller performs speech recognition of the second speech data.
    Type: Grant
    Filed: August 25, 2014
    Date of Patent: January 16, 2018
    Assignee: DENSO CORPORATION
    Inventors: Takayuki Nishiwaki, Takahiro Enomoto
  • Patent number: 9837072
    Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.
    Type: Grant
    Filed: May 15, 2017
    Date of Patent: December 5, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie
  • Patent number: 9826090
    Abstract: Merchant/consumer calls may be recorded and evaluated according to a variety of criteria. The call recordings and analyses thereof, as well as consumer tracking information, may be displayed in a user interface of a web-based online portal for convenience in evaluating the use and efficacy of marketing channels as well as the quality of merchant/consumer interactions. In an aspect, the user interface provides call visualization in the form of audio data from a telephone call displayed as a waveform on a call timeline. The call may be (automatically or manually) annotated with various business-value-specific keywords spoken during the telephone call, and markers for these keywords can be presented on the call timeline to visually indicate the keyword and the time during the call when the keyword was spoken. A business value for the call may be determined based at least in part on keywords spoken during the call.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: November 21, 2017
    Assignee: Patient Prism LLC
    Inventors: Michael G. Spiessbach, Amol Nirgudkar
  • Patent number: 9817889
    Abstract: The present invention relates to a searching device, searching method, and program whereby searching for a word string corresponding to input voice can be performed in a robust manner. A voice recognition unit 11 subjects an input voice to voice recognition. A matching unit 16 performs matching, for each of multiple word strings for search results which are word strings that are to be search results for word strings corresponding to the input voice, of a pronunciation symbol string for search results, which is an array of pronunciation symbols expressing pronunciation of the word string search result, and a recognition result pronunciation symbol string which is an array of pronunciation symbols expressing pronunciation of the voice recognition results of the input voice.
    Type: Grant
    Filed: December 2, 2010
    Date of Patent: November 14, 2017
    Assignee: SONY CORPORATION
    Inventors: Hitoshi Honda, Yoshinori Maeda, Satoshi Asakawa
  • Patent number: 9805712
    Abstract: A method for recognizing a voice and a device for recognizing a voice are provided. The method includes: collecting voice information input by a user; extracting characteristics from the voice information to obtain characteristic information; decoding the characteristic information according to an acoustic model and a language model obtained in advance to obtain recognized voice information, wherein the acoustic model is obtained by data compression in advance.
    Type: Grant
    Filed: December 18, 2014
    Date of Patent: October 31, 2017
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Bo Li, Zhiqian Wang, Na Hu, Xiangyu Mu, Lei Jia, Wei Wei
  • Patent number: 9754603
    Abstract: According to one embodiment, a speech feature extraction apparatus includes an extraction unit and a calculation unit. The extraction unit extracts speech segments over a predetermined period at intervals of a unit time from either an input speech signal or a plurality of subband input speech signals obtained by extracting signal components of a plurality of frequency bands from the input speech signal, to generate either a unit speech signal or a plurality of subband unit speech signals. The calculation unit calculates either each average time of the unit speech signal in each of the plurality of frequency bands or each average time of each of the plurality of subband unit speech signals to obtain a speech feature.
    Type: Grant
    Filed: December 27, 2012
    Date of Patent: September 5, 2017
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Masanobu Nakamura, Takashi Masuko
  • Patent number: 9721564
    Abstract: Systems and methods for performing ASR in the presence of heterographs are provided. Verbal input is received from the user that includes a plurality of utterances. A first of the plurality of utterances is matched to a first word. It is determined that a second utterance in the plurality of utterances matches a plurality of words that is in a same heterograph set. It is identified which one of the plurality of words is associated with a context of the first word. A function is performed based on the first word and the identified one of the plurality of words.
    Type: Grant
    Filed: July 31, 2014
    Date of Patent: August 1, 2017
    Assignee: Rovi Guides, Inc.
    Inventors: Akshat Agarwal, Rakesh Barve
  • Patent number: 9697827
    Abstract: Features are disclosed for reducing errors in speech recognition processing. Methods for reducing errors can include receiving multiple speech recognition hypotheses based on an utterance indicative of a command or query of a user and determining a command or query within a grammar having a least amount of difference from one of the speech recognition hypotheses. The determination of the least amount of difference may be based at least in part on a comparison of individual subword units along at least some of the sequence paths of the speech recognition hypotheses and the grammar. For example, the comparison may be performed on the phoneme level instead of the word level.
    Type: Grant
    Filed: December 11, 2012
    Date of Patent: July 4, 2017
    Assignee: Amazon Technologies, Inc.
    Inventors: Jeffrey Paul Lilly, Ryan Paul Thomas, Jeffrey Penrod Adams
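    Illustration (not taken from the patent): the abstract above selects the grammar entry with the least difference from a recognition hypothesis, comparing at the phoneme level rather than the word level. A minimal sketch, assuming plain Levenshtein distance over phoneme sequences and a grammar given as a mapping from command text to phonemes. `edit_distance`, `closest_command`, and the phoneme labels are hypothetical names for this example; the patent's comparison over sequence paths may differ.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_command(hypotheses, grammar):
    """Return (command, distance) for the grammar entry nearest to any hypothesis.

    hypotheses: list of phoneme sequences from the recognizer.
    grammar: dict mapping command text to its phoneme sequence.
    """
    best = None
    for command, phonemes in grammar.items():
        for hyp in hypotheses:
            d = edit_distance(hyp, phonemes)
            if best is None or d < best[1]:
                best = (command, d)
    return best
```

    Comparing phonemes instead of words means a misrecognized word that sounds close to a grammar word still scores a small distance.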
  • Patent number: 9659564
    Abstract: The present invention relates to a non-standard speech detection system and method whereby a speech is analyzed based on models that are trained using personalized speech for each individual. The model is stored in a database and used to analyze a speech in real time to determine the content and behavior of an individual who is a party to a conversation that produces the speech. The results of the analysis can be used to determine if a conversation takes place under normal circumstances or under extraneous circumstances.
    Type: Grant
    Filed: October 22, 2015
    Date of Patent: May 23, 2017
    Assignee: SESTEK SES VE ILETISIM BILGISAYAR TEKNOLOJILERI SANAYI TICARET ANONIM SIRKETI
    Inventor: Mustafa Levent Arslan
  • Patent number: 9552037
    Abstract: Systems and methods for switching a computing device from a low-power state to a high-power state are provided. In some aspects, a method, implemented on a power management processing unit of the computing device, includes receiving, while the computing device is in the low-power state, a first audio signal. The method also includes verifying the first audio signal based on an audio signal key. The method also includes providing, in response to verifying the first audio signal, instructions for switching the computing device from the low-power state to the high-power state.
    Type: Grant
    Filed: April 23, 2012
    Date of Patent: January 24, 2017
    Assignee: Google Inc.
    Inventor: Leng Ooi
  • Patent number: 9536525
    Abstract: A speaker indexing device extracts a plurality of features from a speech signal on a frame-by-frame basis, models a distribution of first feature sets by a mixture distribution containing as many probability distributions as there are speakers, selects for each probability distribution either first feature sets located within a predetermined distance from the center of the probability distribution or a predetermined number of first feature sets in sequence starting from the first feature set closest to the center of the probability distribution, selects a second feature for the frame corresponding to the selected first feature sets as first training data for the speaker corresponding to the probability distribution and, using the first training data, trains a speaker model to be used to append to each frame identification information for identifying the speaker speaking in the frame.
    Type: Grant
    Filed: August 13, 2015
    Date of Patent: January 3, 2017
    Assignee: FUJITSU LIMITED
    Inventor: Shoji Hayakawa
  • Patent number: 9520125
    Abstract: There are provided a speech synthesis device, a speech synthesis method and a speech synthesis program which can represent a phoneme as a duration shorter than a duration upon modeling according to a statistical method. A speech synthesis device 80 according to the present invention includes a phoneme boundary updating means 81 which, by using a voiced utterance likelihood index which is an index indicating a degree of voiced utterance likelihood of each state which represents a phoneme modeled by a statistical method, updates a phoneme boundary position which is a boundary with other phonemes neighboring to the phoneme.
    Type: Grant
    Filed: June 8, 2012
    Date of Patent: December 13, 2016
    Assignee: NEC Corporation
    Inventors: Yasuyuki Mitsui, Masanori Kato, Reishi Kondo
  • Patent number: 9514747
    Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to reduce a latency of returning speech results to a user. The latency may be determined by comparing a time stamp of an utterance in process to a current time. Latency may also be estimated based on an endpoint of the utterance or other considerations such as how difficult the utterance may be to process. To improve latency the ASR system may be configured to adjust various processing parameters, such as graph pruning factors, path weights, ASR models, etc. Latency checks and corrections may occur dynamically for a particular utterance while it is being processed, thus allowing the ASR system to adjust to rapidly changing latency conditions.
    Type: Grant
    Filed: August 28, 2013
    Date of Patent: December 6, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Michael Maximilian Emanuel Bisani, Hugh Evan Secker-Walker, Kenneth John Basye, Alexander David Rosen
  • Patent number: 9495956
    Abstract: In embodiments of the present disclosure, capabilities are described for interacting with a mobile communication facility, which may include receiving a switch activation from a user to initiate a speech recognition recording session, recording the speech recognition recording session using a mobile communication facility resident capture facility, recognizing a portion of the voice command as an indication that user speech for recognition will begin following the end of the portion of the voice command, recognizing the recorded speech using a speech recognition facility to produce an external output, and using the selected output to perform a function on the mobile communication facility. The speech recognition recording session may include a voice command from the user followed by the speech to be recognized from the user.
    Type: Grant
    Filed: November 10, 2014
    Date of Patent: November 15, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: William S. Meisel, Michael S. Phillips, John N. Nguyen
  • Patent number: 9479895
    Abstract: A location of a first mobile device associated with a first user is determined, and a location of a second mobile device associated with a second user is determined. A relationship between the first user and the second user is determined, and a proximity of the first mobile device relative to the second mobile device is determined. A location-oriented data service is provided to at least one of the first mobile device and the second mobile device.
    Type: Grant
    Filed: April 23, 2009
    Date of Patent: October 25, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Edith Helen Stern, Patrick Joseph O'Sullivan, Robert Cameron Weir, Barry E. Willner
  • Patent number: 9460710
    Abstract: In embodiments of the present disclosure, capabilities are described for interacting with a mobile communication facility, which may include receiving a switch activation from a user to initiate a speech recognition recording session, recording the speech recognition recording session using a mobile communication facility resident capture facility, recognizing a portion of the voice command as an indication that user speech for recognition will begin following the end of the portion of the voice command, recognizing the recorded speech using a speech recognition facility to produce an external output, and using the selected output to perform a function on the mobile communication facility. The speech recognition recording session may include a voice command from the user followed by the speech to be recognized from the user.
    Type: Grant
    Filed: November 10, 2014
    Date of Patent: October 4, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: William S. Meisel, Michael S. Phillips, John N. Nguyen
  • Patent number: 9350532
    Abstract: A method, apparatus and system for secure forensic investigation of a target machine by a client machine over a communications network. In one aspect the method comprises establishing secure communication with a server over a communications network, establishing secure communication with the target machine over the communications network, wherein establishing secure communication with the target machine includes establishing secure communication between the server and the target machine, installing a servelet on the target machine, transmitting a secure command to the servelet over the communications network, executing the secure command in the servelet, transmitting data, by the target machine, in response to a servelet instruction, and receiving the data from the target machine over the communication network.
    Type: Grant
    Filed: January 10, 2011
    Date of Patent: May 24, 2016
    Assignee: GUIDANCE SOFTWARE, INC.
    Inventors: Shawn McCreight, Dominik Weber, Matthew Garrett
  • Patent number: 9343068
    Abstract: A method for controlling access to a plurality of applications in an electronic device includes receiving a voice command from a speaker for accessing a target application among the plurality of applications, and verifying whether the voice command is indicative of a user authorized to access the applications based on a speaker model of the authorized user. In this method, each application is associated with a security level having a threshold value. The method further includes updating the speaker model with the voice command if the voice command is verified to be indicative of the user, and adjusting at least one of the threshold values based on the updated speaker model.
    Type: Grant
    Filed: September 16, 2013
    Date of Patent: May 17, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Sungrack Yun, Taesu Kim, Jun-Cheol Cho, Min-Kyu Park, Kyu Woong Hwang
  • Patent number: 9336781
    Abstract: A content-aware speaker recognition system includes technologies to, among other things, analyze phonetic content of a speech sample, incorporate phonetic content of the speech sample into a speaker model, and use the phonetically-aware speaker model for speaker recognition.
    Type: Grant
    Filed: April 29, 2014
    Date of Patent: May 10, 2016
    Assignee: SRI INTERNATIONAL
    Inventors: Nicolas Scheffer, Yun Lei
  • Patent number: 9262471
    Abstract: A record is received including a token without a corresponding predetermined weight. Information pertaining to the received token is retrieved from at least one of external reference information and historic statistics. A token with a predetermined weight closest to the received token is determined based on the retrieved information. The predetermined weight of the closest token is assigned to the received token and data is matched based on the assigned weight of the received token.
    Type: Grant
    Filed: August 6, 2013
    Date of Patent: February 16, 2016
    Assignee: International Business Machines Corporation
    Inventors: Karl J. Weinmeister, Yinle Zhou
  • Patent number: 9076441
    Abstract: A speech recognition circuit comprising a circuit for providing state identifiers which identify states corresponding to nodes or groups of adjacent nodes in a lexical tree, and for providing scores corresponding to said state identifiers, the lexical tree comprising a model of words; a memory structure for receiving and storing state identifiers identified by a node identifier identifying a node or group of adjacent nodes, said memory structure being adapted to allow lookup to identify particular state identifiers, reading of the scores corresponding to the state identifiers, and writing back of the scores to the memory structure after modification of the scores; an accumulator for receiving score updates corresponding to particular state identifiers from a score update generating circuit which generates the score updates using audio input, for receiving scores from the memory structure, and for modifying said scores by adding said score updates to said scores; and a selector circuit for selecting at least o
    Type: Grant
    Filed: January 7, 2013
    Date of Patent: July 7, 2015
    Assignee: Zentian Limited
    Inventors: Guy Larri, Mark Catchpole, Damian Kelly Harris-Dowsett, Timothy Brian Reynolds
  • Patent number: 9009044
    Abstract: Methods and apparatus related to speech recognition performed by a speech recognition device are disclosed. The speech recognition device can receive a plurality of samples corresponding to an utterance and generate a feature vector z from the plurality of samples. The speech recognition device can select a first frame y0 from the feature vector z, and can generate a second frame y1, where y0 and y1 differ. The speech recognition device can generate a modified frame x? based on the first frame y0 and the second frame y1 and then recognize speech related to the utterance based on the modified frame x?. The recognized speech can be output by the speech recognition device.
    Type: Grant
    Filed: February 7, 2013
    Date of Patent: April 14, 2015
    Assignee: Google Inc.
    Inventor: Andrew William Senior
  • Patent number: 9009041
    Abstract: A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
    Type: Grant
    Filed: July 26, 2011
    Date of Patent: April 14, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: George Zavaliagkos, William F. Ganong, III, Uwe H. Jost, Shreedhar Madhavapeddi, Gary B. Clayton
  • Patent number: 8990086
    Abstract: A recognition confidence measurement method, medium and system which can more accurately determine whether an input speech signal is an in-vocabulary, by extracting an optimum number of candidates that match a phone string extracted from the input speech signal and estimating a lexical distance between the extracted candidates is provided. A recognition confidence measurement method includes: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string and phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is an in-vocabulary, based on the lexical distance.
    Type: Grant
    Filed: July 31, 2006
    Date of Patent: March 24, 2015
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-Bae Jeong, Nam Hoon Kim, Ick Sang Han, In Jeong Choi, Gil Jin Jang, Jae-Hoon Jeong
  • Patent number: 8977547
    Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold Tl; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: March 10, 2015
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
  • Patent number: 8948298
    Abstract: Disclosed is a multiple-input multiple-output (MIMO) system including a transmitting end and a receiving end, wherein the transmitting end includes: a hierarchical codebook in which at least one base codebook is designated as the upper matrix and a child codebook generated based on a chordal distance between respective codewords configuring the base codebook is designated as the lower matrix; a scheduler configured to receive channel state information from the receiving end and select precoding matrices from the hierarchical codebook based on the channel state information; and a precoder configured to apply the precoding matrices selected in the scheduler to data to be transmitted to the receiving end and transmit the selected precoding matrices through a plurality of antennas.
    Type: Grant
    Filed: March 3, 2012
    Date of Patent: February 3, 2015
    Assignee: Seoul National University Industry Foundation
    Inventors: Jung Woo Lee, Kyeong Jun Ko, Sung Kyu Jung
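    Illustration (not taken from the patent): the abstract above builds a child codebook from the chordal distance between codewords. A minimal sketch for the rank-1 case, assuming each codeword is a single complex vector, where the chordal distance reduces to sqrt(1 - |<u, v>|^2) for unit-norm vectors. The function name and the restriction to vectors are assumptions for this example; the patent's codewords may be matrices.

```python
import math


def chordal_distance(u, v):
    """Chordal distance between the lines spanned by two complex vectors.

    Vectors are normalized internally, so the result lies in [0, 1] and
    is unaffected by scaling or by a common phase rotation.
    """
    norm_u = math.sqrt(sum(abs(x) ** 2 for x in u))
    norm_v = math.sqrt(sum(abs(y) ** 2 for y in v))
    inner = sum(complex(x).conjugate() * y for x, y in zip(u, v))
    c = abs(inner) / (norm_u * norm_v)
    # Clamp to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - c * c))
```

    Codewords that differ only by a phase factor are at distance 0, and orthogonal codewords are at distance 1.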
  • Patent number: 8918319
    Abstract: In a speech recognition device and a speech recognition method, a key phrase containing at least one key word is received. The speech recognition method comprises steps: receiving a sound source signal of a key word and generating a plurality of audio signals; transforming the audio signals into a plurality of frequency signals; receiving the frequency signals to obtain a space-frequency spectrum and an angular estimation value thereof; receiving the space-frequency spectrum to define and output at least one spatial eigenparameter, and using the angular estimation value and the frequency signals to perform spotting and evaluation and outputting a Bhattacharyya distance; and receiving the spatial eigenparameter and the Bhattacharyya distance and using corresponding thresholds to determine correctness of the key phrase. Thereby this invention robustly achieves high speech recognition rate under very low SNR conditions.
    Type: Grant
    Filed: July 7, 2011
    Date of Patent: December 23, 2014
    Assignee: National Chiao Tung University
    Inventors: Jwu-Sheng Hu, Ming-Tang Lee, Ting-Chao Wang, Chia Hsin Yang
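    Illustration (not taken from the patent): the abstract above outputs a Bhattacharyya distance and checks it against a threshold. A minimal sketch of the standard discrete form, D = -ln(sum_i sqrt(p_i * q_i)), assuming the two inputs are non-negative histograms of equal length. Which distributions the patent actually compares is not stated in the abstract, so the inputs here are placeholders.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two non-negative histograms.

    Both histograms are normalized to sum to 1 before comparison.
    Returns math.inf when the histograms do not overlap at all.
    """
    total_p, total_q = sum(p), sum(q)
    coefficient = sum(math.sqrt((a / total_p) * (b / total_q))
                      for a, b in zip(p, q))
    if coefficient <= 0.0:
        return math.inf
    # Clamp at 1.0 so rounding cannot produce a negative distance.
    return -math.log(min(coefficient, 1.0))
```

    A small distance means strong overlap between the two distributions, so a detector could accept a keyword when the distance falls below a tuned threshold.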
  • Patent number: 8913761
    Abstract: Disclosed herein is a sound source recording apparatus and method adaptable to an operating environment, which can record a target sound source at a predetermined level without being affected by characteristics of the sound source or ambient noise. A target sound source is separated from a sound source signal received through an array of microphones and a recording sound pressure level and a gain are estimated using a reference sound pressure level and a reference distance for the target sound source, thereby controlling or adjusting the gain of the microphones.
    Type: Grant
    Filed: October 25, 2010
    Date of Patent: December 16, 2014
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Ki Hoon Shin
  • Patent number: 8914285
    Abstract: A computerized method for sales optimization including receiving at a computer server a digital representation of a portion of an interaction between a customer and an organization representative, the portion of an interaction comprises a speech signal of the customer and a speech signal of the organization representative; analyzing the speech signal of the organization representative; analyzing the speech signal of the customer; determining a distance vector between the speech signal of the organization representative and the speech signal of the customer; and predicting a sale success probability score for the captured speech signal portion.
    Type: Grant
    Filed: July 17, 2012
    Date of Patent: December 16, 2014
    Assignee: Nice-Systems Ltd
    Inventors: Moshe Wasserblat, Dan Eylon, Ezra Daya, Tzach Ashkenazi, Oren Pereg, Ohad Pollak, Moshe Avlagon
  • Patent number: 8903725
    Abstract: Method for controlling user access to a service available in a data network and/or to information stored in a user database, in order to protect stored user data from unauthorized access, such that the method comprises the following: input of a user's speech sample to a user data terminal, processing of the user's speech sample in order to obtain a prepared speech sample as well as a current voice profile of the user, comparison of the current voice profile with an initial voice profile stored in an authorization database, and output of an access-control signal to either permit or refuse access, taking into account the result of the comparison step, such that the comparison step includes a quantitative similarity evaluation of the current and the stored voice profiles as well as a threshold-value discrimination of a similarity measure thereby derived, and an access-control signal that initiates permission of access is generated only if a prespecified similarity measure is not exceeded.
    Type: Grant
    Filed: November 25, 2009
    Date of Patent: December 2, 2014
    Assignee: Voice.Trust AG
    Inventor: Christian Pilz
  • Publication number: 20140330564
    Abstract: A frame erasure concealment technique for a bitstream-based feature extractor in a speech recognition system, particularly suited for use in a wireless communication system, operates to “delete” each frame in which an erasure is declared. The deletions thus reduce the length of the observation sequence, but have been found to provide for sufficient speech recognition based on both single word and “string” tests of the deletion technique.
    Type: Application
    Filed: May 19, 2014
    Publication date: November 6, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventors: Richard Vandervoort Cox, Hong Kook Kim
  • Publication number: 20140330565
    Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.
    Type: Application
    Filed: May 20, 2014
    Publication date: November 6, 2014
    Applicant: AT&T Intellectual Property II, L.P.
    Inventor: Gokhan Tur
  • Patent number: 8838452
    Abstract: A method (400) and system (200) for classifying an audio signal are described. The method (400) operates by first receiving a sequence of audio frame feature data, each of the frame feature data characterising an audio frame along the audio segment. In response to receipt of each of the audio frame feature data, statistical data characterising the audio segment is updated with the received frame feature data. The received frame feature data is then discarded. A preliminary classification for the audio segment may be determined from the statistical data. Upon receipt of a notification of an end boundary of the audio segment, the audio segment is classified (410) based on the statistical data.
    Type: Grant
    Filed: June 6, 2005
    Date of Patent: September 16, 2014
    Assignee: Canon Kabushiki Kaisha
    Inventors: Reuben Kan, Dmitri Katchalov, Muhammad Majid, George Politis, Timothy John Wark
  • Patent number: 8831208
    Abstract: A dialog manager for a spoken dialog system. A decision module selects a path from a plurality of alternative paths for a given call, wherein each path implements one of a plurality of strategies for a call flow. A weighting module weights the path selection decision and is connected to a probability estimator for estimating the probability value that a given one of the plurality of paths is the best-performing path.
    Type: Grant
    Filed: September 23, 2011
    Date of Patent: September 9, 2014
    Assignee: Synchronoss Technologies, Inc.
    Inventors: David Suendermann, Jackson Liscombe, Jonathan Bloom, Grace Li, Roberto Pieraccini
  • Patent number: 8804178
    Abstract: A method for routing a confirmation of receipt of a facsimile or portion thereof according to one embodiment of the present invention includes analyzing text of a facsimile for at least one of a meaning and a context of the text; and routing one or more confirmations to one or more destinations based on the analysis.
    Type: Grant
    Filed: February 25, 2013
    Date of Patent: August 12, 2014
    Assignee: Kofax, Inc.
    Inventors: Roland G. Borrey, Roy Couchman
  • Patent number: 8781825
    Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: July 15, 2014
    Assignee: Sensory, Incorporated
    Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
  • Patent number: 8743998
    Abstract: A device for generating a transmission codebook in a communication system including a multi-input multi-output (MIMO) antenna according to an embodiment of the present invention includes: a frequency determiner that determines a frequency to allow the transmission codebook to have an optimal characteristic; a precoding matrix generator that generates a precoding matrix on the basis of the frequency; and a codebook generator that generates a retransmission codebook to be used for retransmission on the basis of the precoding matrix and generates the transmission codebook on the basis of the retransmission codebook.
    Type: Grant
    Filed: September 1, 2009
    Date of Patent: June 3, 2014
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: DongSeung Kwon, Byung-Jae Kwak, Choongil Yeh, Young Seog Song, Ji Hyung Kim, Wooram Shin, Chung Gu Kang, Jin-Woo Kim
  • Patent number: 8738367
    Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.
    Type: Grant
    Filed: February 18, 2010
    Date of Patent: May 27, 2014
    Assignee: NEC Corporation
    Inventor: Tadashi Emori
  • Patent number: 8694317
    Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: April 8, 2014
    Assignee: Aurix Limited
    Inventors: Adrian I Skilling, Howard A K Wright
  • Patent number: 8676858
    Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position in a multi-dimensional space. This position relative to another file's position reveals distances between the files. Closest files can be grouped together. When contemplating voluminous numbers of files for digital spectrums, various methods include: concatenating all such files together to get a single key useful for creating a file's spectrum; or compressing files individually and combining their collective dictionaries into a single dictionary that defines the digital spectrum. Each provides advantage over the other. The latter consumes considerably less run time because each compression event can be distributed to a separate processor. Method two provides better spectrums because it is more “informationally” valid than is method one.
    Type: Grant
    Filed: January 8, 2010
    Date of Patent: March 18, 2014
    Assignee: Novell, Inc.
    Inventor: Craig N. Teerlink
  • Publication number: 20140025376
    Abstract: The subject matter discloses a computerized method for sales optimization comprising: receiving at a computer server a digital representation of a portion of an interaction between a customer and an organization representative, the portion of an interaction comprises a speech signal of the customer and a speech signal of the organization representative; analyzing the speech signal of the organization representative; analyzing the speech signal of the customer; determining a distance vector between the speech signal of the organization representative and the speech signal of the customer; and predicting a sale success probability score for the captured speech signal portion.
    Type: Application
    Filed: July 17, 2012
    Publication date: January 23, 2014
    Applicant: NICE-SYSTEMS LTD
    Inventors: Moshe WASSERBLAT, Dan EYLON, Ezra DAYA, Tzach ASHKENAZI, Oren PEREG, Ohad POLLAK, Moshe AVLAGON
  • Patent number: 8620655
    Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic
    Type: Grant
    Filed: August 10, 2011
    Date of Patent: December 31, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
  • Patent number: 8612225
    Abstract: A voice recognition device that recognizes a voice of an input voice signal, comprises a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit.
    Type: Grant
    Filed: February 26, 2008
    Date of Patent: December 17, 2013
    Assignee: NEC Corporation
    Inventors: Takayuki Arakawa, Ken Hanazawa, Masanori Tsujikawa
  • Patent number: 8606580
    Abstract: To provide a data process unit and data process unit control program that are suitable for generating acoustic models for unspecified speakers taking distribution of diversifying feature parameters into consideration under such specific conditions as the type of speaker, speech lexicons, speech styles, and speech environment and that are suitable for providing acoustic models intended for unspecified speakers and adapted to speech of a specific person. The data process unit comprises a data classification section, data storing section, pattern model generating section, data control section, mathematical distance calculating section, pattern model converting section, pattern model display section, region dividing section, division changing section, region selecting section, and specific pattern model generating section.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: December 10, 2013
    Assignee: Asahi Kasei Kabushiki Kaisha
    Inventors: Makoto Shozakai, Goshu Nagino
  • Publication number: 20130325468
    Abstract: Disclosed are a conversation management method and a device for executing the same. The device includes: a calculation unit for calculating the importance of an utterance intention, the similarity between utterance intentions, and the relative distance between utterance intentions using at least one of a plurality of utterance intentions in a corpus and an utterance intention in a sequence relationship with the at least one utterance intention; a similarity calculating unit for calculating the similarity between conversation flows by comparing a conversation flow obtained from a corpus and a conversation flow obtained from a user utterance by means of the importance and similarity of an utterance intention; and an utterance intention verifying unit for calculating an evaluation score of an utterance intention by evaluating a user utterance according to the relative distance between utterance intentions.
    Type: Application
    Filed: October 21, 2011
    Publication date: December 5, 2013
    Applicant: POSTECH ACADEMY - INDUSTRY FOUNDATION
    Inventors: Geun-Bae Lee, Sung-Jin Lee, Hyung-Jong Noh, Kyu-Song Lee
  • Patent number: 8599419
    Abstract: A method for routing a facsimile according to one embodiment includes receiving or generating text of a facsimile in a computer-readable format; routing the facsimile or text thereof to an intended recipient identified by recognizing at least one of a name, an email address and contact information of the intended recipient in the facsimile; analyzing the text of the facsimile for at least one of a meaning and a context of the text; and routing the facsimile or text thereof to one or more other destinations based on the analysis. A method according to another embodiment includes analyzing a pattern of light and dark areas of a facsimile in a computer-readable format; correlating the pattern to one or more forms; and routing the facsimile to one or more destinations based on the correlation, with the proviso that the analyzing, correlating and routing are performed without optical character recognition.
    Type: Grant
    Filed: August 30, 2012
    Date of Patent: December 3, 2013
    Assignee: Kofax, Inc.
    Inventor: Roy Couchman