Distance Patents (Class 704/238)

Acoustic music similarity determiner

Patent number: 9959850

Abstract: It is inter alia disclosed a method comprising: determining a divergence measure between a statistical distribution of audio features of a first audio track and a statistical distribution of audio features of at least one further audio track; determining a divergence measure threshold value from at least the divergence measure between the statistical distribution of audio features of a first audio track and the statistical distribution of audio features of the at least one further audio track; and comparing the divergence measure with the divergence measure threshold value.

Type: Grant

Filed: June 9, 2014

Date of Patent: May 1, 2018

Assignee: Nokia Technologies Oy

Inventors: Antti Eronen, Jussi Leppänen
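
The comparison described in the abstract above can be sketched as follows. This is only an illustration, not the patented method: it assumes each track is summarised by a diagonal-covariance Gaussian over its audio features, uses a symmetrised Kullback-Leibler divergence as the divergence measure, and derives the threshold as a scaled mean of the observed divergences. The names `symmetric_kl`, `similar_tracks`, and `scale` are hypothetical.

```python
import math


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_kl(p, q):
    """Symmetrised divergence; p and q are (means, variances) tuples."""
    return kl_diag_gauss(*p, *q) + kl_diag_gauss(*q, *p)


def similar_tracks(seed, candidates, scale=1.0):
    """Return (indices of candidates within the threshold, threshold).

    Assumption: the threshold is scale * mean divergence over the candidates.
    """
    divergences = [symmetric_kl(seed, c) for c in candidates]
    threshold = scale * sum(divergences) / len(divergences)
    kept = [i for i, d in enumerate(divergences) if d <= threshold]
    return kept, threshold
```

Deriving the threshold from the divergences themselves means it adapts to how spread out a given music collection is, which is the general idea the abstract points at.
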
Systems and methods for authentication using voice biometrics and device verification

Patent number: 9928839

Abstract: Methods and systems for authenticating a user are described. In some embodiments, a one-time token and a recording of the one-time token is read aloud by the user. The voice characteristics derived from the recording of the one-time token are compared with voice characteristics derived from samples of the user's voice. The user may be authenticated when the one-time token is verified and when a match of the voice characteristics derived from the recording of the one-time token and the voice characteristics derived from the samples of the user's voice meet or exceed a threshold.

Type: Grant

Filed: April 16, 2014

Date of Patent: March 27, 2018

Assignee: United Services Automobile Association (USAA)

Inventors: Michael Wayne Lester, Debra Randall Casillas, Sudarshan Rangarajan, John Shelton, Maland Keith Mortensen
Adaptation of speech recognition

Patent number: 9911410

Abstract: A method, computer program product, and system for adapting speech recognition of a user's speech is provided. The method includes receiving a first utterance from a user having a duration below a predetermined threshold, identifying at least one further utterance from the user that provides additional information, generating a concatenated utterance by concatenating the first utterance with the at least one further utterance, transmitting the concatenated utterance to a speech recognition server, receiving a transcription of the concatenated utterance from the speech recognition server that includes a transcription of the first utterance, and extracting the transcription of the first utterance from the transcription of the concatenated utterance. The transcription of the first utterance is based on the additional information provided by the at least one further utterance.

Type: Grant

Filed: August 19, 2015

Date of Patent: March 6, 2018

Assignee: International Business Machines Corporation

Inventor: Shay Ben-David
System and method for voice recognition

Patent number: 9886957

Abstract: A voice recognition system and method are provided. The voice recognition system includes a voice input unit configured to receive learning voice data and a target label including consonant and vowel (letter) information representing the learning voice data, and divide the learning voice data into windows having a predetermined size; a first voice recognition unit configured to learn features of the divided windows using a first neural network model and the target label; a second voice recognition unit configured to learn a time-series pattern of the extracted features using a second neural network model; and a text output unit configured to convert target voice data input to the voice input unit into a text based on learning results of the first voice recognition unit and the second voice recognition unit, and output the text.

Type: Grant

Filed: December 29, 2015

Date of Patent: February 6, 2018

Assignee: SAMSUNG SDS CO., LTD.

Inventors: Ji-Hyeon Seo, Jae-Young Lee, Byung-Wuek Lee, Kyung-Jun An
Vehicular apparatus and speech switchover control program

Patent number: 9870774

Abstract: A vehicular apparatus is provided. In the vehicular apparatus, a first controller is disposed on a first board and a second controller is disposed on a second board which is exchangeable with respect to the first board. An A/D converter performs A/D conversion of first and second speech data. A switch disposed on the first board is switchable between a first connection state in which the switch outputs the first speech data inputted from the A/D converter and a second connection state in which the switch outputs sound data different from each of the first and second speech data. A switch controller controls the switch so that the switch is in the second connection state when the second controller performs speech recognition of the second speech data.

Type: Grant

Filed: August 25, 2014

Date of Patent: January 16, 2018

Assignee: DENSO CORPORATION

Inventors: Takayuki Nishiwaki, Takahiro Enomoto
System and method for personalization of acoustic models for automatic speech recognition

Patent number: 9837072

Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.

Type: Grant

Filed: May 15, 2017

Date of Patent: December 5, 2017

Assignee: Nuance Communications, Inc.

Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie
Call visualization

Patent number: 9826090

Abstract: Merchant/consumer calls may be recorded and evaluated according to a variety of criteria. The call recordings and analyses thereof, as well as consumer tracking information, may be displayed in a user interface of a web-based online portal for convenience in evaluating the use and efficacy of marketing channels as well as the quality of merchant/consumer interactions. In an aspect, the user interface provides call visualization in the form of audio data from a telephone call displayed as a waveform on a call timeline. The call may be (automatically or manually) annotated with various business-value-specific keywords spoken during the telephone call, and markers for these keywords can be presented on the call timeline to visually indicate the keyword and the time during the call when the keyword was spoken. A business value for the call may be determined based at least in part on keywords spoken during the call.

Type: Grant

Filed: December 13, 2016

Date of Patent: November 21, 2017

Assignee: Patient Prism LLC

Inventors: Michael G. Spiessbach, Amol Nirgudkar
Speech-based pronunciation symbol searching device, method and program using correction distance

Patent number: 9817889

Abstract: The present invention relates to a searching device, searching method, and program whereby searching for a word string corresponding to input voice can be performed in a robust manner. A voice recognition unit 11 subjects an input voice to voice recognition. A matching unit 16 performs matching, for each of multiple word strings for search results which are word strings that are to be search results for word strings corresponding to the input voice, of a pronunciation symbol string for search results, which is an array of pronunciation symbols expressing pronunciation of the word string search result, and a recognition result pronunciation symbol string which is an array of pronunciation symbols expressing pronunciation of the voice recognition results of the input voice.

Type: Grant

Filed: December 2, 2010

Date of Patent: November 14, 2017

Assignee: SONY CORPORATION

Inventors: Hitoshi Honda, Yoshinori Maeda, Satoshi Asakawa
Method and device for recognizing voice

Patent number: 9805712

Abstract: A method for recognizing a voice and a device for recognizing a voice are provided. The method includes: collecting voice information input by a user; extracting characteristics from the voice information to obtain characteristic information; decoding the characteristic information according to an acoustic model and a language model obtained in advance to obtain recognized voice information, wherein the acoustic model is obtained by data compression in advance.

Type: Grant

Filed: December 18, 2014

Date of Patent: October 31, 2017

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Bo Li, Zhiqian Wang, Na Hu, Xiangyu Mu, Lei Jia, Wei Wei
Speech feature extraction apparatus and speech feature extraction method

Patent number: 9754603

Abstract: According to one embodiment, a speech feature extraction apparatus includes an extraction unit and a calculation unit. The extraction unit extracts speech segments over a predetermined period at intervals of a unit time from either an input speech signal or a plurality of subband input speech signals obtained by extracting signal components of a plurality of frequency bands from the input speech signal, to generate either a unit speech signal or a plurality of subband unit speech signals. The calculation unit calculates either each average time of the unit speech signal in each of the plurality of frequency bands or each average time of each of the plurality of subband unit speech signals to obtain a speech feature.

Type: Grant

Filed: December 27, 2012

Date of Patent: September 5, 2017

Assignee: Kabushiki Kaisha Toshiba

Inventors: Masanobu Nakamura, Takashi Masuko
Systems and methods for performing ASR in the presence of heterographs

Patent number: 9721564

Abstract: Systems and methods for performing ASR in the presence of heterographs are provided. Verbal input is received from the user that includes a plurality of utterances. A first of the plurality of utterances is matched to a first word. It is determined that a second utterance in the plurality of utterances matches a plurality of words that is in a same heterograph set. It is identified which one of the plurality of words is associated with a context of the first word. A function is performed based on the first word and the identified one of the plurality of words.

Type: Grant

Filed: July 31, 2014

Date of Patent: August 1, 2017

Assignee: Rovi Guides, Inc.

Inventors: Akshat Agarwal, Rakesh Barve
Error reduction in speech processing

Patent number: 9697827

Abstract: Features are disclosed for reducing errors in speech recognition processing. Methods for reducing errors can include receiving multiple speech recognition hypotheses based on an utterance indicative of a command or query of a user and determining a command or query within a grammar having a least amount of difference from one of the speech recognition hypotheses. The determination of the least amount of difference may be based at least in part on a comparison of individual subword units along at least some of the sequence paths of the speech recognition hypotheses and the grammar. For example, the comparison may be performed on the phoneme level instead of the word level.

Type: Grant

Filed: December 11, 2012

Date of Patent: July 4, 2017

Assignee: Amazon Technologies, Inc.

Inventors: Jeffrey Paul Lilly, Ryan Paul Thomas, Jeffrey Penrod Adams
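
A minimal sketch of the phoneme-level comparison mentioned in the abstract above, assuming that "least amount of difference" is measured with a plain Levenshtein edit distance over phoneme sequences. The abstract does not name the metric, so that choice, the function names, and the grammar layout (command mapped to phoneme list) are all assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_command(hypotheses, grammar):
    """Pick the grammar command nearest to any recognition hypothesis.

    hypotheses: list of phoneme sequences.
    grammar: dict mapping a command string to its phoneme sequence.
    Returns (command, distance).
    """
    distance, command = min(
        (edit_distance(h, phones), cmd)
        for h in hypotheses
        for cmd, phones in grammar.items()
    )
    return command, distance
```

Comparing phonemes rather than words lets a hypothesis with one misrecognised sound still land on the intended command at a small distance.
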
Speaker verification based on acoustic behavioral characteristics of the speaker

Patent number: 9659564

Abstract: The present invention relates to a non-standard speech detection system and method whereby a speech is analyzed based on models that are trained using personalized speech for each individual. The model is stored in a database and used to analyze a speech in real time to determine the content and behavior of an individual who is a party to a conversation that produces the speech. The results of the analysis can be used to determine if a conversation takes place under normal circumstances or under extraneous circumstances.

Type: Grant

Filed: October 22, 2015

Date of Patent: May 23, 2017

Assignee: SESTEK SES VE ILETISIM BILGISAYAR TEKNOLOJILERI SANAYI TICARET ANONIM SIRKETI

Inventor: Mustafa Levent Arslan
Switching a computing device from a low-power state to a high-power state

Patent number: 9552037

Abstract: Systems and methods for switching a computing device from a low-power state to a high-power state are provided. In some aspects, a method, implemented on a power management processing unit of the computing device, includes receiving, while the computing device is in the low-power state, a first audio signal. The method also includes verifying the first audio signal based on an audio signal key. The method also includes providing, in response to verifying the first audio signal, instructions for switching the computing device from the low-power state to the high-power state.

Type: Grant

Filed: April 23, 2012

Date of Patent: January 24, 2017

Assignee: Google Inc.

Inventor: Leng Ooi
Speaker indexing device and speaker indexing method

Patent number: 9536525

Abstract: A speaker indexing device extracts a plurality of features from a speech signal on a frame-by-frame basis, models a distribution of first feature sets by a mixture distribution containing as many probability distributions as there are speakers, selects for each probability distribution either first feature sets located within a predetermined distance from the center of the probability distribution or a predetermined number of first feature sets in sequence starting from the first feature set closest to the center of the probability distribution, selects a second feature for the frame corresponding to the selected first feature sets as first training data for the speaker corresponding to the probability distribution and, using the first training data, trains a speaker model to be used to append to each frame identification information for identifying the speaker speaking in the frame.

Type: Grant

Filed: August 13, 2015

Date of Patent: January 3, 2017

Assignee: FUJITSU LIMITED

Inventor: Shoji Hayakawa
Speech synthesis device, speech synthesis method, and speech synthesis program

Patent number: 9520125

Abstract: There are provided a speech synthesis device, a speech synthesis method and a speech synthesis program which can represent a phoneme as a duration shorter than a duration upon modeling according to a statistical method. A speech synthesis device 80 according to the present invention includes a phoneme boundary updating means 81 which, by using a voiced utterance likelihood index which is an index indicating a degree of voiced utterance likelihood of each state which represents a phoneme modeled by a statistical method, updates a phoneme boundary position which is a boundary with other phonemes neighboring to the phoneme.

Type: Grant

Filed: June 8, 2012

Date of Patent: December 13, 2016

Assignee: NEC Corporation

Inventors: Yasuyuki Mitsui, Masanori Kato, Reishi Kondo
Reducing speech recognition latency

Patent number: 9514747

Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to reduce a latency of returning speech results to a user. The latency may be determined by comparing a time stamp of an utterance in process to a current time. Latency may also be estimated based on an endpoint of the utterance or other considerations such as how difficult the utterance may be to process. To improve latency the ASR system may be configured to adjust various processing parameters, such as graph pruning factors, path weights, ASR models, etc. Latency checks and corrections may occur dynamically for a particular utterance while it is being processed, thus allowing the ASR system to adjust to rapidly changing latency conditions.

Type: Grant

Filed: August 28, 2013

Date of Patent: December 6, 2016

Assignee: Amazon Technologies, Inc.

Inventors: Michael Maximilian Emanuel Bisani, Hugh Evan Secker-Walker, Kenneth John Basye, Alexander David Rosen
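
The latency check in the abstract above can be illustrated with a short sketch. It assumes latency is the difference between the current time and the time stamp of the utterance being processed, and that the only parameter adjusted is a pruning beam width. The target latency, step size, and beam limits are hypothetical values chosen for illustration.

```python
def measure_latency(utterance_timestamp, now):
    """Latency in seconds: current time minus the utterance time stamp."""
    return now - utterance_timestamp


def adjust_beam(beam, latency, target=0.5, min_beam=4.0, max_beam=16.0, step=1.0):
    """Narrow the pruning beam when latency is too high, widen it when low.

    A narrower beam prunes more search paths, trading accuracy for speed.
    All default values are illustrative assumptions.
    """
    if latency > target:
        beam -= step
    elif latency < 0.5 * target:
        beam += step
    return max(min_beam, min(max_beam, beam))
```

Calling `adjust_beam` repeatedly while an utterance is being decoded corresponds to the dynamic, per-utterance correction the abstract describes.
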
Dealing with switch latency in speech recognition

Patent number: 9495956

Abstract: In embodiments of the present disclosure, capabilities are described for interacting with a mobile communication facility, which may include receiving a switch activation from a user to initiate a speech recognition recording session, recording the speech recognition recording session using a mobile communication facility resident capture facility, recognizing a portion of the voice command as an indication that user speech for recognition will begin following the end of the portion of the voice command, recognizing the recorded speech using a speech recognition facility to produce an external output, and using the selected output to perform a function on the mobile communication facility. The speech recognition recording session may include a voice command from the user followed by the speech to be recognized from the user.

Type: Grant

Filed: November 10, 2014

Date of Patent: November 15, 2016

Assignee: Nuance Communications, Inc.

Inventors: William S. Meisel, Michael S. Phillips, John N. Nguyen
Location-oriented services

Patent number: 9479895

Abstract: A location of a first mobile device associated with a first user is determined, and a location of a second mobile device associated with a second user is determined. A relationship between the first user and the second user is determined, and a proximity of the first mobile device relative to the second mobile device is determined. A location-oriented data service is provided to at least one of the first mobile device and the second mobile device.

Type: Grant

Filed: April 23, 2009

Date of Patent: October 25, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Edith Helen Stern, Patrick Joseph O'Sullivan, Robert Cameron Weir, Barry E. Willner
Dealing with switch latency in speech recognition

Patent number: 9460710

Abstract: In embodiments of the present disclosure, capabilities are described for interacting with a mobile communication facility, which may include receiving a switch activation from a user to initiate a speech recognition recording session, recording the speech recognition recording session using a mobile communication facility resident capture facility, recognizing a portion of the voice command as an indication that user speech for recognition will begin following the end of the portion of the voice command, recognizing the recorded speech using a speech recognition facility to produce an external output, and using the selected output to perform a function on the mobile communication facility. The speech recognition recording session may include a voice command from the user followed by the speech to be recognized from the user.

Type: Grant

Filed: November 10, 2014

Date of Patent: October 4, 2016

Assignee: Nuance Communications, Inc.

Inventors: William S. Meisel, Michael S. Phillips, John N. Nguyen
System and method for conducting searches at target devices

Patent number: 9350532

Abstract: A method, apparatus and system for secure forensic investigation of a target machine by a client machine over a communications network. In one aspect the method comprises establishing secure communication with a server over a communications network, establishing secure communication with the target machine over the communications network, wherein establishing secure communication with the target machine includes establishing secure communication between the server and the target machine, installing a servelet on the target machine, transmitting a secure command to the servelet over the communications network, executing the secure command in the servelet, transmitting data, by the target machine, in response to a servelet instruction, and receiving the data from the target machine over the communication network.

Type: Grant

Filed: January 10, 2011

Date of Patent: May 24, 2016

Assignee: GUIDANCE SOFTWARE, INC.

Inventors: Shawn McCreight, Dominik Weber, Matthew Garrett
Method and apparatus for controlling access to applications having different security levels

Patent number: 9343068

Abstract: A method for controlling access to a plurality of applications in an electronic device includes receiving a voice command from a speaker for accessing a target application among the plurality of applications, and verifying whether the voice command is indicative of a user authorized to access the applications based on a speaker model of the authorized user. In this method, each application is associated with a security level having a threshold value. The method further includes updating the speaker model with the voice command if the voice command is verified to be indicative of the user, and adjusting at least one of the threshold values based on the updated speaker model.

Type: Grant

Filed: September 16, 2013

Date of Patent: May 17, 2016

Assignee: QUALCOMM Incorporated

Inventors: Sungrack Yun, Taesu Kim, Jun-Cheol Cho, Min-Kyu Park, Kyu Woong Hwang
Content-aware speaker recognition

Patent number: 9336781

Abstract: A content-aware speaker recognition system includes technologies to, among other things, analyze phonetic content of a speech sample, incorporate phonetic content of the speech sample into a speaker model, and use the phonetically-aware speaker model for speaker recognition.

Type: Grant

Filed: April 29, 2014

Date of Patent: May 10, 2016

Assignee: SRI INTERNATIONAL

Inventors: Nicolas Scheffer, Yun Lei
Weight adjustment in a probabilistic matching system based on external demographic data

Patent number: 9262471

Abstract: A record is received including a token without a corresponding predetermined weight. Information pertaining to the received token is retrieved from at least one of external reference information and historic statistics. A token with a predetermined weight closest to the received token is determined based on the retrieved information. The predetermined weight of the closest token is assigned to the received token and data is matched based on the assigned weight of the received token.

Type: Grant

Filed: August 6, 2013

Date of Patent: February 16, 2016

Assignee: International Business Machines Corporation

Inventors: Karl J. Weinmeister, Yinle Zhou
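
As a rough sketch of the weight assignment in the abstract above: assume the retrieved information is a single numeric statistic per token (for example, a relative frequency taken from external demographic data), and that "closest" means the smallest absolute difference in that statistic. Both assumptions, and the name `assign_weight`, are illustrative and not taken from the patent.

```python
def assign_weight(token_stat, known):
    """Borrow the weight of the known token whose statistic is closest.

    token_stat: statistic retrieved for the unweighted token.
    known: dict mapping token -> (statistic, predetermined weight).
    Returns (closest token, assigned weight).
    """
    closest = min(known, key=lambda t: abs(known[t][0] - token_stat))
    return closest, known[closest][1]
```
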
Speech recognition circuit and method

Patent number: 9076441

Abstract: A speech recognition circuit comprising a circuit for providing state identifiers which identify states corresponding to nodes or groups of adjacent nodes in a lexical tree, and for providing scores corresponding to said state identifiers, the lexical tree comprising a model of words; a memory structure for receiving and storing state identifiers identified by a node identifier identifying a node or group of adjacent nodes, said memory structure being adapted to allow lookup to identify particular state identifiers, reading of the scores corresponding to the state identifiers, and writing back of the scores to the memory structure after modification of the scores; an accumulator for receiving score updates corresponding to particular state identifiers from a score update generating circuit which generates the score updates using audio input, for receiving scores from the memory structure, and for modifying said scores by adding said score updates to said scores; and a selector circuit for selecting at least o

Type: Grant

Filed: January 7, 2013

Date of Patent: July 7, 2015

Assignee: Zentian Limited

Inventors: Guy Larri, Mark Catchpole, Damian Kelly Harris-Dowsett, Timothy Brian Reynolds
Multiple subspace discriminative feature training

Patent number: 9009044

Abstract: Methods and apparatus related to speech recognition performed by a speech recognition device are disclosed. The speech recognition device can receive a plurality of samples corresponding to an utterance and generate a feature vector z from the plurality of samples. The speech recognition device can select a first frame y0 from the feature vector z, and can generate a second frame y1, where y0 and y1 differ. The speech recognition device can generate a modified frame x? based on the first frame y0 and the second frame y1 and then recognize speech related to the utterance based on the modified frame x?. The recognized speech can be output by the speech recognition device.

Type: Grant

Filed: February 7, 2013

Date of Patent: April 14, 2015

Assignee: Google Inc.

Inventor: Andrew William Senior
Systems and methods for improving the accuracy of a transcription using auxiliary data such as personal data

Patent number: 9009041

Abstract: A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.

Type: Grant

Filed: July 26, 2011

Date of Patent: April 14, 2015

Assignee: Nuance Communications, Inc.

Inventors: George Zavaliagkos, William F. Ganong, III, Uwe H. Jost, Shreedhar Madhavapeddi, Gary B. Clayton
Recognition confidence measuring by lexical distance between candidates

Patent number: 8990086

Abstract: A recognition confidence measurement method, medium and system are provided which can more accurately determine whether an input speech signal is in-vocabulary, by extracting an optimum number of candidates that match a phone string extracted from the input speech signal and estimating a lexical distance between the extracted candidates. A recognition confidence measurement method includes: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string and phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is in-vocabulary, based on the lexical distance.

Type: Grant

Filed: July 31, 2006

Date of Patent: March 24, 2015

Assignee: Samsung Electronics Co., Ltd.

Inventors: Sang-Bae Jeong, Nam Hoon Kim, Ick Sang Han, In Jeong Choi, Gil Jin Jang, Jae-Hoon Jeong
Voice recognition system for registration of stable utterances

Patent number: 8977547

Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold Tl; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.

Type: Grant

Filed: October 8, 2009

Date of Patent: March 10, 2015

Assignee: Mitsubishi Electric Corporation

Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
MIMO system and method of generating hierarchical codebook therefor

Patent number: 8948298

Abstract: Disclosed is a multiple-input multiple-output (MIMO) system including a transmitting end and a receiving end, wherein the transmitting end includes: a hierarchical codebook in which at least one base codebook is designated as the upper matrix and a child codebook generated based on a chordal distance between respective codewords configuring the base codebook is designated as the lower matrix; a scheduler configured to receive channel state information from the receiving end and select precoding matrices from the hierarchical codebook based on the channel state information; and a precoder configured to apply the precoding matrices selected in the scheduler to data to be transmitted to the receiving end and transmit the selected precoding matrices through a plurality of antennas.

Type: Grant

Filed: March 3, 2012

Date of Patent: February 3, 2015

Assignee: Seoul National University Industry Foundation

Inventors: Jung Woo Lee, Kyeong Jun Ko, Sung Kyu Jung
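
For reference, the chordal distance named in the abstract above can be computed as in the sketch below. It assumes the codewords are unit-norm complex vectors (rank-1 precoders), for which the chordal distance reduces to sqrt(1 - |a^H b|^2). How the child codebook is built from these distances is not reproduced here, and the helper `min_pairwise_distance` is a hypothetical addition.

```python
import math
from itertools import combinations


def chordal_distance(a, b):
    """Chordal distance between two unit-norm complex vectors."""
    inner = sum(x.conjugate() * y for x, y in zip(a, b))
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))


def min_pairwise_distance(codebook):
    """Smallest chordal distance between any two codewords in a codebook."""
    return min(chordal_distance(a, b) for a, b in combinations(codebook, 2))
```

The distance ignores a common phase rotation of a codeword, which is why it is a natural measure for comparing precoding vectors.
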
Speech recognition device and speech recognition method using space-frequency spectrum

Patent number: 8918319

Abstract: In a speech recognition device and a speech recognition method, a key phrase containing at least one key word is received. The speech recognition method comprises steps: receiving a sound source signal of a key word and generating a plurality of audio signals; transforming the audio signals into a plurality of frequency signals; receiving the frequency signals to obtain a space-frequency spectrum and an angular estimation value thereof; receiving the space-frequency spectrum to define and output at least one spatial eigenparameter, and using the angular estimation value and the frequency signals to perform spotting and evaluation and outputting a Bhattacharyya distance; and receiving the spatial eigenparameter and the Bhattacharyya distance and using corresponding thresholds to determine correctness of the key phrase. Thereby this invention robustly achieves high speech recognition rate under very low SNR conditions.

Type: Grant

Filed: July 7, 2011

Date of Patent: December 23, 2014

Assignee: National Chiao Tung University

Inventors: Jwu-Sheng Hu, Ming-Tang Lee, Ting-Chao Wang, Chia Hsin Yang
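
The Bhattacharyya distance mentioned in the abstract above has a simple form for discrete distributions, sketched below. The abstract does not say which distributions are compared or how the threshold is set, so this shows only the standard definition: the negative logarithm of the Bhattacharyya coefficient. The function name is hypothetical.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete probability distributions.

    p and q are sequences of probabilities over the same outcomes.
    Returns math.inf when the distributions do not overlap at all.
    """
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    if coefficient == 0.0:
        return math.inf
    return -math.log(coefficient)
```

Identical distributions give a distance of zero, and the distance grows as the overlap between the two distributions shrinks.
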
Sound source recording apparatus and method adaptable to operating environment

Patent number: 8913761

Abstract: Disclosed herein is a sound source recording apparatus and method adaptable to an operating environment, which can record a target sound source at a predetermined level without being affected by characteristics of the sound source or ambient noise. A target sound source is separated from a sound source signal received through an array of microphones and a recording sound pressure level and a gain are estimated using a reference sound pressure level and a reference distance for the target sound source, thereby controlling or adjusting the gain of the microphones.

Type: Grant

Filed: October 25, 2010

Date of Patent: December 16, 2014

Assignee: Samsung Electronics Co., Ltd.

Inventor: Ki Hoon Shin
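
One way to picture the gain estimate in the abstract above is the free-field inverse-distance law, under which sound pressure level falls by 20*log10(d / d_ref) dB relative to the level at a reference distance. The patent abstract does not state that it uses this model. The sketch below, including the target recording level and all function names, is an assumption for illustration.

```python
import math


def estimated_level_db(ref_level_db, ref_distance, distance):
    """Sound pressure level at `distance`, by the inverse-distance law."""
    return ref_level_db - 20.0 * math.log10(distance / ref_distance)


def required_gain_db(target_level_db, ref_level_db, ref_distance, distance):
    """Gain in dB needed to record the source at the target level."""
    return target_level_db - estimated_level_db(ref_level_db, ref_distance, distance)
```

For example, a source measured at 94 dB from 1 m away is estimated at 74 dB from 10 m, so reaching an 80 dB recording level would need 6 dB of gain.
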
Predicting a sales success probability score from a distance vector between speech of a customer and speech of an organization representative

Patent number: 8914285

Abstract: A computerized method for sales optimization including receiving at a computer server a digital representation of a portion of an interaction between a customer and an organization representative, the portion of an interaction comprises a speech signal of the customer and a speech signal of the organization representative; analyzing the speech signal of the organization representative; analyzing the speech signal of the customer; determining a distance vector between the speech signal of the organization representative and the speech signal of the customer; and predicting a sale success probability score for the captured speech signal portion.

Type: Grant

Filed: July 17, 2012

Date of Patent: December 16, 2014

Assignee: Nice-Systems Ltd

Inventors: Moshe Wasserblat, Dan Eylon, Ezra Daya, Tzach Ashkenazi, Oren Pereg, Ohad Pollak, Moshe Avlagon
Method and arrangement for controlling user access

Patent number: 8903725

Abstract: Method for controlling user access to a service available in a data network and/or to information stored in a user database, in order to protect stored user data from unauthorized access, such that the method comprises the following: input of a user's speech sample to a user data terminal, processing of the user's speech sample in order to obtain a prepared speech sample as well as a current voice profile of the user, comparison of the current voice profile with an initial voice profile stored in an authorization database, and output of an access-control signal to either permit or refuse access, taking into account the result of the comparison step, such that the comparison step includes a quantitative similarity evaluation of the current and the stored voice profiles as well as a threshold-value discrimination of a similarity measure thereby derived, and an access-control signal that initiates permission of access is generated only if a prespecified similarity measure is not exceeded.

Type: Grant

Filed: November 25, 2009

Date of Patent: December 2, 2014

Assignee: Voice.Trust AG

Inventor: Christian Pilz
FRAME ERASURE CONCEALMENT TECHNIQUE FOR A BITSTREAM-BASED FEATURE EXTRACTOR

Publication number: 20140330564

Abstract: A frame erasure concealment technique for a bitstream-based feature extractor in a speech recognition system particularly suited for use in a wireless communication system operates to “delete” each frame in which an erasure is declared. The deletions thus reduce the length of the observation sequence, but have been found to provide for sufficient speech recognition based on both single word and “string” tests of the deletion technique.

Type: Application

Filed: May 19, 2014

Publication date: November 6, 2014

Applicant: AT&T Intellectual Property II, L.P.

Inventors: Richard Vandervoort Cox, Hong Kook Kim
Apparatus and Method for Model Adaptation for Spoken Language Understanding

Publication number: 20140330565

Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.

Type: Application

Filed: May 20, 2014

Publication date: November 6, 2014

Applicant: AT&T Intellectual Property II, L.P.

Inventor: Gokhan Tur
Effective audio segmentation and classification

Patent number: 8838452

Abstract: A method (400) and system (200) for classifying an audio signal are described. The method (400) operates by first receiving a sequence of audio frame feature data, each of the frame feature data characterising an audio frame along the audio segment. In response to receipt of each of the audio frame feature data, statistical data characterising the audio segment is updated with the received frame feature data. The received frame feature data is then discarded. A preliminary classification for the audio segment may be determined from the statistical data. Upon receipt of a notification of an end boundary of the audio segment, the audio segment is classified (410) based on the statistical data.

Type: Grant

Filed: June 6, 2005

Date of Patent: September 16, 2014

Assignee: Canon Kabushiki Kaisha

Inventors: Reuben Kan, Dmitri Katchalov, Muhammad Majid, George Politis, Timothy John Wark
System and method for optimizing call flows of a spoken dialog system

Patent number: 8831208

Abstract: A dialog manager for a spoken dialog system. A decision module selects a path from a plurality of alternative paths for a given call, wherein each path implements one of a plurality of strategies for a call flow. A weighting module weights the path selection decision and is connected to a probability estimator for estimating the probability value that a given one of the plurality of paths is the best-performing path.

Type: Grant

Filed: September 23, 2011

Date of Patent: September 9, 2014

Assignee: Synchronoss Technologies, Inc.

Inventors: David Suendermann, Jackson Liscombe, Jonathan Bloom, Grace Li, Roberto Pieraccini
Systems and methods for routing a facsimile confirmation based on content

Patent number: 8804178

Abstract: A method for routing a confirmation of receipt of a facsimile or portion thereof according to one embodiment of the present invention includes analyzing text of a facsimile for at least one of a meaning and a context of the text; and routing one or more confirmations to one or more destinations based on the analysis.

Type: Grant

Filed: February 25, 2013

Date of Patent: August 12, 2014

Assignee: Kofax, Inc.

Inventors: Roland G. Borrey, Roy Couchman
Reducing false positives in speech recognition systems

Patent number: 8781825

Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.

Type: Grant

Filed: August 24, 2011

Date of Patent: July 15, 2014

Assignee: Sensory, Incorporated

Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
Device for generating codebook, method for generating codebook, and method for transmitting data

Patent number: 8743998

Abstract: A device for generating a transmission codebook in a communication system including a multi-input multi-output (MIMO) antenna according to an embodiment of the present invention includes: a frequency determiner that determines a frequency to allow the transmission codebook to have an optimal characteristic; a precoding matrix generator that generates a precoding matrix on the basis of the frequency; and a codebook generator that generates a retransmission codebook to be used for retransmission on the basis of the precoding matrix and generates the transmission codebook on the basis of the retransmission codebook.

Type: Grant

Filed: September 1, 2009

Date of Patent: June 3, 2014

Assignee: Electronics and Telecommunications Research Institute

Inventors: DongSeung Kwon, Byung-Jae Kwak, Choongil Yeh, Young Seog Song, Ji Hyung Kim, Wooram Shin, Chung Gu Kang, Jin-Woo Kim
Speech signal processing device

Patent number: 8738367

Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.

Type: Grant

Filed: February 18, 2010

Date of Patent: May 27, 2014

Assignee: NEC Corporation

Inventor: Tadashi Emori
Methods and apparatus relating to searching of spoken audio data

Patent number: 8694317

Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.

Type: Grant

Filed: February 6, 2006

Date of Patent: April 8, 2014

Assignee: Aurix Limited

Inventors: Adrian I Skilling, Howard A K Wright
Grouping and differentiating volumes of files

Patent number: 8676858

Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position in a multi-dimensional space. This position relative to another file's position reveals distances between the files. Closest files can be grouped together. When contemplating voluminous numbers of files for digital spectrums, various methods include: concatenating all such files together to get a single key useful for creating a file's spectrum; or compressing files individually and combining their collective dictionaries into a single dictionary that defines the digital spectrum. Each provides advantage over the other. The latter consumes considerably less run time because each compression event can be distributed to a separate processor. Method two provides better spectrums because it is more “informationally” valid than is method one.

Type: Grant

Filed: January 8, 2010

Date of Patent: March 18, 2014

Assignee: Novell, Inc.

Inventor: Craig N. Teerlink
METHOD AND APPARATUS FOR REAL TIME SALES OPTIMIZATION BASED ON AUDIO INTERACTIONS ANALYSIS

Publication number: 20140025376

Abstract: The subject matter discloses a computerized method for sales optimization comprising: receiving at a computer server a digital representation of a portion of an interaction between a customer and an organization representative, the portion of an interaction comprises a speech signal of the customer and a speech signal of the organization representative; analyzing the speech signal of the organization representative; analyzing the speech signal of the customer; determining a distance vector between the speech signal of the organization representative and the speech signal of the customer; and predicting a sale success probability score for the captured speech signal portion.

Type: Application

Filed: July 17, 2012

Publication date: January 23, 2014

Applicant: NICE-SYSTEMS LTD

Inventors: Moshe WASSERBLAT, Dan EYLON, Ezra DAYA, Tzach ASHKENAZI, Oren PEREG, Ohad POLLAK, Moshe AVLAGON
Speech processing system and method

Patent number: 8620655

Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic

Type: Grant

Filed: August 10, 2011

Date of Patent: December 31, 2013

Assignee: Kabushiki Kaisha Toshiba

Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
Voice recognition device, voice recognition method, and voice recognition program

Patent number: 8612225

Abstract: A voice recognition device that recognizes a voice of an input voice signal, comprises a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit.

Type: Grant

Filed: February 26, 2008

Date of Patent: December 17, 2013

Assignee: NEC Corporation

Inventors: Takayuki Arakawa, Ken Hanazawa, Masanori Tsujikawa
Speech data process unit and speech data process unit control program for speech recognition

Patent number: 8606580

Abstract: To provide a data process unit and data process unit control program that are suitable for generating acoustic models for unspecified speakers taking distribution of diversifying feature parameters into consideration under such specific conditions as the type of speaker, speech lexicons, speech styles, and speech environment and that are suitable for providing acoustic models intended for unspecified speakers and adapted to speech of a specific person. The data process unit comprises a data classification section, data storing section, pattern model generating section, data control section, mathematical distance calculating section, pattern model converting section, pattern model display section, region dividing section, division changing section, region selecting section, and specific pattern model generating section.

Type: Grant

Filed: December 30, 2008

Date of Patent: December 10, 2013

Assignee: Asahi Kasei Kabushiki Kaisha

Inventors: Makoto Shozakai, Goshu Nagino
CONVERSATION MANAGEMENT METHOD, AND DEVICE FOR EXECUTING SAME

Publication number: 20130325468

Abstract: Disclosed are a conversation management method and a device for executing same. The device includes: a calculation unit for calculating the importance of an utterance intention, the similarity between utterance intentions, and the relative distance between utterance intentions using at least one of a plurality of utterance intentions in a corpus and an utterance intention in a sequence relationship with the at least one utterance intention; a similarity calculating unit for calculating the similarity between conversation flows by comparing a conversation flow obtained from a corpus and a conversation flow obtained from a user utterance by means of the importance and similarity of an utterance intention; and an utterance intention verifying unit for calculating an evaluation score of an utterance intention by evaluating a user utterance according to the relative distance between utterance intentions.

Type: Application

Filed: October 21, 2011

Publication date: December 5, 2013

Applicant: POSTECH ACADEMY - INDUSTRY FOUNDATION

Inventors: Geun-Bae Lee, Sung-Jin Lee, Hyung-Jong Noh, Kyu-Song Lee
Systems and methods for routing facsimiles based on content

Patent number: 8599419

Abstract: A method for routing a facsimile according to one embodiment includes receiving or generating text of a facsimile in a computer-readable format; routing the facsimile or text thereof to an intended recipient identified by recognizing at least one of a name, an email address and contact information of the intended recipient in the facsimile; analyzing the text of the facsimile for at least one of a meaning and a context of the text; and routing the facsimile or text thereof to one or more other destinations based on the analysis. A method according to another embodiment includes analyzing a pattern of light and dark areas of a facsimile in a computer-readable format; correlating the pattern to one or more forms; and routing the facsimile to one or more destinations based on the correlation, with the proviso that the analyzing, correlating and routing are performed without optical character recognition.

Type: Grant

Filed: August 30, 2012

Date of Patent: December 3, 2013

Assignee: Kofax, Inc.

Inventor: Roy Couchman
