Distance Patents (Class 704/238)
-
Patent number: 9959850
Abstract: It is inter alia disclosed a method comprising: determining a divergence measure between a statistical distribution of audio features of a first audio track and a statistical distribution of audio features of at least one further audio track; determining a divergence measure threshold value from at least the divergence measure between the statistical distribution of audio features of a first audio track and the statistical distribution of audio features of the at least one further audio track; and comparing the divergence measure with the divergence measure threshold value.
Type: Grant
Filed: June 9, 2014
Date of Patent: May 1, 2018
Assignee: Nokia Technologies Oy
Inventors: Antti Eronen, Jussi Leppänen
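The divergence comparison in this abstract can be illustrated with a small sketch. It is not the patented method: it assumes each track is summarized by a diagonal Gaussian over its feature frames, uses a symmetric Kullback-Leibler divergence as the divergence measure, and takes the mean of the observed divergences as the threshold. All function names are hypothetical.

```python
import math
from statistics import mean


def fit_diag_gaussian(frames):
    """Return per-dimension (means, variances) for a list of feature vectors."""
    dims = list(zip(*frames))
    means = [mean(d) for d in dims]
    # Population variance with a small floor to avoid division by zero.
    variances = [max(sum((x - m) ** 2 for x in d) / len(d), 1e-9)
                 for d, m in zip(dims, means)]
    return means, variances


def kl_diag(p, q):
    """KL(p || q) for diagonal Gaussians given as (means, variances)."""
    (mp, vp), (mq, vq) = p, q
    return 0.5 * sum(math.log(b / a) + (a + (m1 - m2) ** 2) / b - 1.0
                     for m1, a, m2, b in zip(mp, vp, mq, vq))


def symmetric_kl(p, q):
    """Symmetrized KL divergence, used here as the divergence measure."""
    return kl_diag(p, q) + kl_diag(q, p)


def compare_tracks(first, others):
    """Compare a first track against further tracks.

    Returns ([(divergence, is_within_threshold), ...], threshold).
    Assumption: the threshold is the mean of the divergences themselves.
    """
    ref = fit_diag_gaussian(first)
    divs = [symmetric_kl(ref, fit_diag_gaussian(o)) for o in others]
    threshold = mean(divs)
    return [(d, d <= threshold) for d in divs], threshold
```

Deriving the threshold from the measured divergences, rather than fixing it in advance, is the one detail taken from the abstract; the choice of the mean is only an illustration.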
-
Patent number: 9928839
Abstract: Methods and systems for authenticating a user are described. In some embodiments, a one-time token and a recording of the one-time token is read aloud by the user. The voice characteristics derived from the recording of the one-time token are compared with voice characteristics derived from samples of the user's voice. The user may be authenticated when the one-time token is verified and when a match of the voice characteristics derived from the recording of the one-time token and the voice characteristics derived from the samples of the user's voice meet or exceed a threshold.
Type: Grant
Filed: April 16, 2014
Date of Patent: March 27, 2018
Assignee: United Services Automobile Association (USAA)
Inventors: Michael Wayne Lester, Debra Randall Casillas, Sudarshan Rangarajan, John Shelton, Maland Keith Mortensen
-
Patent number: 9911410
Abstract: A method, computer program product, and system for adapting speech recognition of a user's speech is provided. The method includes receiving a first utterance from a user having a duration below a predetermined threshold, identifying at least one further utterance from the user that provides additional information, generating a concatenated utterance by concatenating the first utterance with the at least one further utterance, transmitting the concatenated utterance to a speech recognition server, receiving a transcription of the concatenated utterance from the speech recognition server that includes a transcription of the first utterance, and extracting the transcription of the first utterance from the transcription of the concatenated utterance. The transcription of the first utterance is based on the additional information provided by the at least one further utterance.
Type: Grant
Filed: August 19, 2015
Date of Patent: March 6, 2018
Assignee: International Business Machines Corporation
Inventor: Shay Ben-David
-
Patent number: 9886957
Abstract: A voice recognition system and method are provided. The voice recognition system includes a voice input unit configured to receive learning voice data and a target label including consonant and vowel (letter) information representing the learning voice data, and divide the learning voice data into windows having a predetermined size; a first voice recognition unit configured to learn features of the divided windows using a first neural network model and the target label; a second voice recognition unit configured to learn a time-series pattern of the extracted features using a second neural network model; and a text output unit configured to convert target voice data input to the voice input unit into a text based on learning results of the first voice recognition unit and the second voice recognition unit, and output the text.
Type: Grant
Filed: December 29, 2015
Date of Patent: February 6, 2018
Assignee: SAMSUNG SDS CO., LTD.
Inventors: Ji-Hyeon Seo, Jae-Young Lee, Byung-Wuek Lee, Kyung-Jun An
-
Patent number: 9870774
Abstract: A vehicular apparatus is provided. In the vehicular apparatus, a first controller is disposed on a first board and a second controller is disposed on a second board which is exchangeable with respect to the first board. An AD converter performs A/D conversion of first and second speech data. A switch disposed on the first board is switchable between a first connection state in which the switch outputs the first speech data inputted from the A/D converter and a second connection state in which the switch outputs a sound data different from each of the first and second speech data. A switch controller controls the switch so that the switch is in the second connection state when the second controller performs speech recognition of the second speech data.
Type: Grant
Filed: August 25, 2014
Date of Patent: January 16, 2018
Assignee: DENSO CORPORATION
Inventors: Takayuki Nishiwaki, Takahiro Enomoto
-
Patent number: 9837072
Abstract: Disclosed herein are methods, systems, and computer-readable storage media for automatic speech recognition. The method includes selecting a speaker independent model, and selecting a quantity of speaker dependent models, the quantity of speaker dependent models being based on available computing resources, the selected models including the speaker independent model and the quantity of speaker dependent models. The method also includes recognizing an utterance using each of the selected models in parallel, and selecting a dominant speech model from the selected models based on recognition accuracy using the group of selected models. The system includes a processor and modules configured to control the processor to perform the method. The computer-readable storage medium includes instructions for causing a computing device to perform the steps of the method.
Type: Grant
Filed: May 15, 2017
Date of Patent: December 5, 2017
Assignee: Nuance Communications, Inc.
Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Alistair D. Conkie
-
Patent number: 9826090
Abstract: Merchant/consumer calls may be recorded and evaluated according to a variety of criteria. The call recordings and analyses thereof, as well as consumer tracking information, may be displayed in a user interface of a web-based online portal for convenience in evaluating the use and efficacy of marketing channels as well as the quality of merchant/consumer interactions. In an aspect, the user interface provides call visualization in the form of audio data from a telephone call displayed as a waveform on a call timeline. The call may be (automatically or manually) annotated with various business-value-specific keywords spoken during the telephone call, and markers for these keywords can be presented on the call timeline to visually indicate the keyword and the time during the call when the keyword was spoken. A business value for the call may be determined based at least in part on keywords spoken during the call.
Type: Grant
Filed: December 13, 2016
Date of Patent: November 21, 2017
Assignee: Patient Prism LLC
Inventors: Michael G. Spiessbach, Amol Nirgudkar
-
Patent number: 9817889
Abstract: The present invention relates to a searching device, searching method, and program whereby searching for a word string corresponding to input voice can be performed in a robust manner. A voice recognition unit 11 subjects an input voice to voice recognition. A matching unit 16 performs matching, for each of multiple word strings for search results which are word strings that are to be search results for word strings corresponding to the input voice, of a pronunciation symbol string for search results, which is an array of pronunciation symbols expressing pronunciation of the word string search result, and a recognition result pronunciation symbol string which is an array of pronunciation symbols expressing pronunciation of the voice recognition results of the input voice.
Type: Grant
Filed: December 2, 2010
Date of Patent: November 14, 2017
Assignee: SONY CORPORATION
Inventors: Hitoshi Honda, Yoshinori Maeda, Satoshi Asakawa
-
Patent number: 9805712
Abstract: A method for recognizing a voice and a device for recognizing a voice are provided. The method includes: collecting voice information input by a user; extracting characteristics from the voice information to obtain characteristic information; decoding the characteristic information according to an acoustic model and a language model obtained in advance to obtain recognized voice information, wherein the acoustic model is obtained by data compression in advance.
Type: Grant
Filed: December 18, 2014
Date of Patent: October 31, 2017
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
Inventors: Bo Li, Zhiqian Wang, Na Hu, Xiangyu Mu, Lei Jia, Wei Wei
-
Patent number: 9754603
Abstract: According to one embodiment, a speech feature extraction apparatus includes an extraction unit and a calculation unit. The extraction unit extracts speech segments over a predetermined period at intervals of a unit time from either an input speech signal or a plurality of subband input speech signals obtained by extracting signal components of a plurality of frequency bands from the input speech signal, to generate either a unit speech signal or a plurality of subband unit speech signals. The calculation unit calculates either each average time of the unit speech signal in each of the plurality of frequency bands or each average time of each of the plurality of subband unit speech signals to obtain a speech feature.
Type: Grant
Filed: December 27, 2012
Date of Patent: September 5, 2017
Assignee: Kabushiki Kaisha Toshiba
Inventors: Masanobu Nakamura, Takashi Masuko
-
Patent number: 9721564
Abstract: Systems and methods for performing ASR in the presence of heterographs are provided. Verbal input is received from the user that includes a plurality of utterances. A first of the plurality of utterances is matched to a first word. It is determined that a second utterance in the plurality of utterances matches a plurality of words that is in a same heterograph set. It is identified which one of the plurality of words is associated with a context of the first word. A function is performed based on the first word and the identified one of the plurality of words.
Type: Grant
Filed: July 31, 2014
Date of Patent: August 1, 2017
Assignee: Rovi Guides, Inc.
Inventors: Akshat Agarwal, Rakesh Barve
-
Patent number: 9697827
Abstract: Features are disclosed for reducing errors in speech recognition processing. Methods for reducing errors can include receiving multiple speech recognition hypotheses based on an utterance indicative of a command or query of a user and determining a command or query within a grammar having a least amount of difference from one of the speech recognition hypotheses. The determination of the least amount of difference may be based at least in part on a comparison of individual subword units along at least some of the sequence paths of the speech recognition hypotheses and the grammar. For example, the comparison may be performed on the phoneme level instead of the word level.
Type: Grant
Filed: December 11, 2012
Date of Patent: July 4, 2017
Assignee: Amazon Technologies, Inc.
Inventors: Jeffrey Paul Lilly, Ryan Paul Thomas, Jeffrey Penrod Adams
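The phoneme-level comparison mentioned in this abstract might look roughly like the sketch below. It assumes that "least amount of difference" is measured by Levenshtein edit distance over phoneme sequences, which the abstract does not state; the grammar layout and function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (here: phoneme lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def closest_command(hypotheses, grammar):
    """Pick the grammar command nearest to any recognition hypothesis.

    hypotheses: list of phoneme sequences from the recognizer.
    grammar: dict mapping a command string to its phoneme sequence.
    Returns (command, distance); ties go to the alphabetically first command.
    """
    dist, cmd = min((edit_distance(h, phones), cmd)
                    for h in hypotheses
                    for cmd, phones in grammar.items())
    return cmd, dist
```

Comparing at the phoneme level lets a hypothesis that is wrong as a word sequence still land on the intended command when only one or two sounds differ.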
-
Patent number: 9659564
Abstract: The present invention relates to a non-standard speech detection system and method whereby a speech is analyzed based on models that are trained using personalized speech for each individual. The model is stored in a database and used to analyze a speech in real time to determine the content and behavior of an individual who is a party to a conversation that produces the speech. The results of the analysis can be used to determine if a conversation takes place under normal circumstances or under extraneous circumstances.
Type: Grant
Filed: October 22, 2015
Date of Patent: May 23, 2017
Assignee: SESTEK SES VE ILETISIM BILGISAYAR TEKNOLOJILERI SANAYI TICARET ANONIM SIRKETI
Inventor: Mustafa Levent Arslan
-
Patent number: 9552037
Abstract: Systems and methods for switching a computing device from a low-power state to a high-power state are provided. In some aspects, a method, implemented on a power management processing unit of the computing device, includes receiving, while the computing device is in the low-power state, a first audio signal. The method also includes verifying the first audio signal based on an audio signal key. The method also includes providing, in response to verifying the first audio signal, instructions for switching the computing device from the low-power state to the high-power state.
Type: Grant
Filed: April 23, 2012
Date of Patent: January 24, 2017
Assignee: Google Inc.
Inventor: Leng Ooi
-
Patent number: 9536525
Abstract: A speaker indexing device extracts a plurality of features from a speech signal on a frame-by-frame basis, models a distribution of first feature sets by a mixture distribution containing as many probability distributions as there are speakers, selects for each probability distribution either first feature sets located within a predetermined distance from the center of the probability distribution or a predetermined number of first feature sets in sequence starting from the first feature set closest to the center of the probability distribution, selects a second feature for the frame corresponding to the selected first feature sets as first training data for the speaker corresponding to the probability distribution and, using the first training data, trains a speaker model to be used to append to each frame identification information for identifying the speaker speaking in the frame.
Type: Grant
Filed: August 13, 2015
Date of Patent: January 3, 2017
Assignee: FUJITSU LIMITED
Inventor: Shoji Hayakawa
-
Patent number: 9520125
Abstract: There are provided a speech synthesis device, a speech synthesis method and a speech synthesis program which can represent a phoneme as a duration shorter than a duration upon modeling according to a statistical method. A speech synthesis device 80 according to the present invention includes a phoneme boundary updating means 81 which, by using a voiced utterance likelihood index which is an index indicating a degree of voiced utterance likelihood of each state which represents a phoneme modeled by a statistical method, updates a phoneme boundary position which is a boundary with other phonemes neighboring to the phoneme.
Type: Grant
Filed: June 8, 2012
Date of Patent: December 13, 2016
Assignee: NEC Corporation
Inventors: Yasuyuki Mitsui, Masanori Kato, Reishi Kondo
-
Patent number: 9514747
Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to reduce a latency of returning speech results to a user. The latency may be determined by comparing a time stamp of an utterance in process to a current time. Latency may also be estimated based on an endpoint of the utterance or other considerations such as how difficult the utterance may be to process. To improve latency the ASR system may be configured to adjust various processing parameters, such as graph pruning factors, path weights, ASR models, etc. Latency checks and corrections may occur dynamically for a particular utterance while it is being processed, thus allowing the ASR system to adjust to rapidly changing latency conditions.
Type: Grant
Filed: August 28, 2013
Date of Patent: December 6, 2016
Assignee: Amazon Technologies, Inc.
Inventors: Michael Maximilian Emanuel Bisani, Hugh Evan Secker-Walker, Kenneth John Basye, Alexander David Rosen
-
Patent number: 9495956
Abstract: In embodiments of the present disclosure, capabilities are described for interacting with a mobile communication facility, which may include receiving a switch activation from a user to initiate a speech recognition recording session, recording the speech recognition recording session using a mobile communication facility resident capture facility, recognizing a portion of the voice command as an indication that user speech for recognition will begin following the end of the portion of the voice command, recognizing the recorded speech using a speech recognition facility to produce an external output, and using the selected output to perform a function on the mobile communication facility. The speech recognition recording session may include a voice command from the user followed by the speech to be recognized from the user.
Type: Grant
Filed: November 10, 2014
Date of Patent: November 15, 2016
Assignee: Nuance Communications, Inc.
Inventors: William S. Meisel, Michael S. Phillips, John N. Nguyen
-
Patent number: 9479895
Abstract: A location of a first mobile device associated with a first user is determined, and a location of a second mobile device associated with a second user is determined. A relationship between the first user and the second user is determined, and a proximity of the first mobile device relative to the second mobile device is determined. A location-oriented data service is provided to at least one of the first mobile device and the second mobile device.
Type: Grant
Filed: April 23, 2009
Date of Patent: October 25, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Edith Helen Stern, Patrick Joseph O'Sullivan, Robert Cameron Weir, Barry E. Willner
-
Patent number: 9460710
Abstract: In embodiments of the present disclosure, capabilities are described for interacting with a mobile communication facility, which may include receiving a switch activation from a user to initiate a speech recognition recording session, recording the speech recognition recording session using a mobile communication facility resident capture facility, recognizing a portion of the voice command as an indication that user speech for recognition will begin following the end of the portion of the voice command, recognizing the recorded speech using a speech recognition facility to produce an external output, and using the selected output to perform a function on the mobile communication facility. The speech recognition recording session may include a voice command from the user followed by the speech to be recognized from the user.
Type: Grant
Filed: November 10, 2014
Date of Patent: October 4, 2016
Assignee: Nuance Communications, Inc.
Inventors: William S. Meisel, Michael S. Phillips, John N. Nguyen
-
Patent number: 9350532
Abstract: A method, apparatus and system for secure forensic investigation of a target machine by a client machine over a communications network. In one aspect the method comprises establishing secure communication with a server over a communications network, establishing secure communication with the target machine over the communications network, wherein establishing secure communication with the target machine includes establishing secure communication between the server and the target machine, installing a servelet on the target machine, transmitting a secure command to the servelet over the communications network, executing the secure command in the servelet, transmitting data, by the target machine, in response to a servelet instruction, and receiving the data from the target machine over the communication network.
Type: Grant
Filed: January 10, 2011
Date of Patent: May 24, 2016
Assignee: GUIDANCE SOFTWARE, INC.
Inventors: Shawn McCreight, Dominik Weber, Matthew Garrett
-
Patent number: 9343068
Abstract: A method for controlling access to a plurality of applications in an electronic device includes receiving a voice command from a speaker for accessing a target application among the plurality of applications, and verifying whether the voice command is indicative of a user authorized to access the applications based on a speaker model of the authorized user. In this method, each application is associated with a security level having a threshold value. The method further includes updating the speaker model with the voice command if the voice command is verified to be indicative of the user, and adjusting at least one of the threshold values based on the updated speaker model.
Type: Grant
Filed: September 16, 2013
Date of Patent: May 17, 2016
Assignee: QUALCOMM Incorporated
Inventors: Sungrack Yun, Taesu Kim, Jun-Cheol Cho, Min-Kyu Park, Kyu Woong Hwang
-
Patent number: 9336781
Abstract: A content-aware speaker recognition system includes technologies to, among other things, analyze phonetic content of a speech sample, incorporate phonetic content of the speech sample into a speaker model, and use the phonetically-aware speaker model for speaker recognition.
Type: Grant
Filed: April 29, 2014
Date of Patent: May 10, 2016
Assignee: SRI INTERNATIONAL
Inventors: Nicolas Scheffer, Yun Lei
-
Patent number: 9262471
Abstract: A record is received including a token without a corresponding predetermined weight. Information pertaining to the received token is retrieved from at least one of external reference information and historic statistics. A token with a predetermined weight closest to the received token is determined based on the retrieved information. The predetermined weight of the closest token is assigned to the received token and data is matched based on the assigned weight of the received token.
Type: Grant
Filed: August 6, 2013
Date of Patent: February 16, 2016
Assignee: International Business Machines Corporation
Inventors: Karl J. Weinmeister, Yinle Zhou
-
Patent number: 9076441
Abstract: A speech recognition circuit comprising a circuit for providing state identifiers which identify states corresponding to nodes or groups of adjacent nodes in a lexical tree, and for providing scores corresponding to said state identifiers, the lexical tree comprising a model of words; a memory structure for receiving and storing state identifiers identified by a node identifier identifying a node or group of adjacent nodes, said memory structure being adapted to allow lookup to identify particular state identifiers, reading of the scores corresponding to the state identifiers, and writing back of the scores to the memory structure after modification of the scores; an accumulator for receiving score updates corresponding to particular state identifiers from a score update generating circuit which generates the score updates using audio input, for receiving scores from the memory structure, and for modifying said scores by adding said score updates to said scores; and a selector circuit for selecting at least o
Type: Grant
Filed: January 7, 2013
Date of Patent: July 7, 2015
Assignee: Zentian Limited
Inventors: Guy Larri, Mark Catchpole, Damian Kelly Harris-Dowsett, Timothy Brian Reynolds
-
Patent number: 9009044
Abstract: Methods and apparatus related to speech recognition performed by a speech recognition device are disclosed. The speech recognition device can receive a plurality of samples corresponding to an utterance and generate a feature vector z from the plurality of samples. The speech recognition device can select a first frame y0 from the feature vector z, and can generate a second frame y1, where y0 and y1 differ. The speech recognition device can generate a modified frame x? based on the first frame y0 and the second frame y1 and then recognize speech related to the utterance based on the modified frame x?. The recognized speech can be output by the speech recognition device.
Type: Grant
Filed: February 7, 2013
Date of Patent: April 14, 2015
Assignee: Google Inc.
Inventor: Andrew William Senior
-
Patent number: 9009041
Abstract: A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
Type: Grant
Filed: July 26, 2011
Date of Patent: April 14, 2015
Assignee: Nuance Communications, Inc.
Inventors: George Zavaliagkos, William F. Ganong, III, Uwe H. Jost, Shreedhar Madhavapeddi, Gary B. Clayton
-
Patent number: 8990086
Abstract: A recognition confidence measurement method, medium and system which can more accurately determine whether an input speech signal is an in-vocabulary, by extracting an optimum number of candidates that match a phone string extracted from the input speech signal and estimating a lexical distance between the extracted candidates is provided. A recognition confidence measurement method includes: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string and phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is an in-vocabulary, based on the lexical distance.
Type: Grant
Filed: July 31, 2006
Date of Patent: March 24, 2015
Assignee: Samsung Electronics Co., Ltd.
Inventors: Sang-Bae Jeong, Nam Hoon Kim, Ick Sang Han, In Jeong Choi, Gil Jin Jang, Jae-Hoon Jeong
-
Patent number: 8977547
Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold Tl; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
Type: Grant
Filed: October 8, 2009
Date of Patent: March 10, 2015
Assignee: Mitsubishi Electric Corporation
Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
-
Patent number: 8948298
Abstract: Disclosed is a multiple-input multiple-output (MIMO) system including a transmitting end and a receiving end, wherein the transmitting end includes: a hierarchical codebook in which at least one base codebook is designated as the upper matrix and a child codebook generated based on a chordal distance between respective codewords configuring the base codebook is designated as the lower matrix; a scheduler configured to receive channel state information from the receiving end and select precoding matrices from the hierarchical codebook based on the channel state information; and a precoder configured to apply the precoding matrices selected in the scheduler to data to be transmitted to the receiving end and transmit the selected precoding matrices through a plurality of antennas.
Type: Grant
Filed: March 3, 2012
Date of Patent: February 3, 2015
Assignee: Seoul National University Industry Foundation
Inventors: Jung Woo Lee, Kyeong Jun Ko, Sung Kyu Jung
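For readers unfamiliar with the chordal distance named in this abstract, a minimal sketch follows. It covers only the distance itself for unit-norm vector codewords, d = sqrt(1 - |w1^H w2|^2); how the patent builds the child codebook from these distances is not described in the abstract and is not reproduced here. The helper `closest_pair` is a hypothetical illustration.

```python
import math


def chordal_distance(w1, w2):
    """Chordal distance between two unit-norm complex vector codewords."""
    inner = sum(a.conjugate() * b for a, b in zip(w1, w2))
    # Clamp at zero to absorb floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))


def closest_pair(codebook):
    """Return (distance, i, j) for the two nearest codewords in a codebook."""
    best = None
    for i in range(len(codebook)):
        for j in range(i + 1, len(codebook)):
            d = chordal_distance(codebook[i], codebook[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    return best
```

The distance ignores a common phase rotation, so a codeword and its phase-rotated copy are at distance zero, while orthogonal codewords are at distance one.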
-
Patent number: 8918319
Abstract: In a speech recognition device and a speech recognition method, a key phrase containing at least one key word is received. The speech recognition method comprises steps: receiving a sound source signal of a key word and generating a plurality of audio signals; transforming the audio signals into a plurality of frequency signals; receiving the frequency signals to obtain a space-frequency spectrum and an angular estimation value thereof; receiving the space-frequency spectrum to define and output at least one spatial eigenparameter, and using the angular estimation value and the frequency signals to perform spotting and evaluation and outputting a Bhattacharyya distance; and receiving the spatial eigenparameter and the Bhattacharyya distance and using corresponding thresholds to determine correctness of the key phrase. Thereby this invention robustly achieves high speech recognition rate under very low SNR conditions.
Type: Grant
Filed: July 7, 2011
Date of Patent: December 23, 2014
Assignee: National Chiao Tung University
Inventors: Jwu-Sheng Hu, Ming-Tang Lee, Ting-Chao Wang, Chia Hsin Yang
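The Bhattacharyya distance used in this abstract has a closed form when both distributions are Gaussian. The sketch below assumes diagonal-covariance Gaussians, which the abstract does not specify, and adds a hypothetical threshold check; it does not model the spatial eigenparameter or the spotting step.

```python
import math


def bhattacharyya_diag(mean_p, var_p, mean_q, var_q):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    Per dimension: (mu_p - mu_q)^2 / (4 (v_p + v_q))
                   + 0.5 * ln((v_p + v_q) / (2 sqrt(v_p v_q))).
    Independent dimensions contribute additively.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.25 * (mp - mq) ** 2 / (vp + vq)
        total += 0.5 * math.log((vp + vq) / (2.0 * math.sqrt(vp * vq)))
    return total


def key_word_accepted(distance, max_distance=1.0):
    """Hypothetical decision rule: accept when the distance is small enough."""
    return distance <= max_distance
```

For two unit-variance Gaussians whose means differ by 2, the distance is 0.25 * 4 / 2 = 0.5, since the variance term vanishes when the variances are equal.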
-
Patent number: 8913761
Abstract: Disclosed herein is a sound source recording apparatus and method adaptable to an operating environment, which can record a target sound source at a predetermined level without being affected by characteristics of the sound source or ambient noise. A target sound source is separated from a sound source signal received through an array of microphones and a recording sound pressure level and a gain are estimated using a reference sound pressure level and a reference distance for the target sound source, thereby controlling or adjusting the gain of the microphones.
Type: Grant
Filed: October 25, 2010
Date of Patent: December 16, 2014
Assignee: Samsung Electronics Co., Ltd.
Inventor: Ki Hoon Shin
-
Patent number: 8914285
Abstract: A computerized method for sales optimization including receiving at a computer server a digital representation of a portion of an interaction between a customer and an organization representative, the portion of an interaction comprises a speech signal of the customer and a speech signal of the organization representative; analyzing the speech signal of the organization representative; analyzing the speech signal of the customer; determining a distance vector between the speech signal of the organization representative and the speech signal of the customer; and predicting a sale success probability score for the captured speech signal portion.
Type: Grant
Filed: July 17, 2012
Date of Patent: December 16, 2014
Assignee: Nice-Systems Ltd
Inventors: Moshe Wasserblat, Dan Eylon, Ezra Daya, Tzach Ashkenazi, Oren Pereg, Ohad Pollak, Moshe Avlagon
-
Patent number: 8903725
Abstract: Method for controlling user access to a service available in a data network and/or to information stored in a user database, in order to protect stored user data from unauthorized access, such that the method comprises the following: input of a user's speech sample to a user data terminal, processing of the user's speech sample in order to obtain a prepared speech sample as well as a current voice profile of the user, comparison of the current voice profile with an initial voice profile stored in an authorization database, and output of an access-control signal to either permit or refuse access, taking into account the result of the comparison step, such that the comparison step includes a quantitative similarity evaluation of the current and the stored voice profiles as well as a threshold-value discrimination of a similarity measure thereby derived, and an access-control signal that initiates permission of access is generated only if a prespecified similarity measure is not exceeded.
Type: Grant
Filed: November 25, 2009
Date of Patent: December 2, 2014
Assignee: Voice.Trust AG
Inventor: Christian Pilz
-
Publication number: 20140330564
Abstract: A frame erasure concealment technique for a bitstream-based feature extractor in a speech recognition system particularly suited for use in a wireless communication system operates to “delete” each frame in which an erasure is declared. The deletions thus reduce the length of the observation sequence, but have been found to provide for sufficient speech recognition based on both single word and “string” tests of the deletion technique.
Type: Application
Filed: May 19, 2014
Publication date: November 6, 2014
Applicant: AT&T Intellectual Property II, L.P.
Inventors: Richard Vandervoort Cox, Hong Kook Kim
-
Publication number: 20140330565
Abstract: An apparatus and a method are provided for building a spoken language understanding model. Labeled data may be obtained for a target application. A new classification model may be formed for use with the target application by using the labeled data for adaptation of an existing classification model. In some implementations, the existing classification model may be used to determine the most informative examples to label.
Type: Application
Filed: May 20, 2014
Publication date: November 6, 2014
Applicant: AT&T Intellectual Property II, L.P.
Inventor: Gokhan Tur
-
Patent number: 8838452
Abstract: A method (400) and system (200) for classifying an audio signal are described. The method (400) operates by first receiving a sequence of audio frame feature data, each of the frame feature data characterising an audio frame along the audio segment. In response to receipt of each of the audio frame feature data, statistical data characterising the audio segment is updated with the received frame feature data. The received frame feature data is then discarded. A preliminary classification for the audio segment may be determined from the statistical data. Upon receipt of a notification of an end boundary of the audio segment, the audio segment is classified (410) based on the statistical data.
Type: Grant
Filed: June 6, 2005
Date of Patent: September 16, 2014
Assignee: Canon Kabushiki Kaisha
Inventors: Reuben Kan, Dmitri Katchalov, Muhammad Majid, George Politis, Timothy John Wark
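The update-then-discard pattern in this abstract can be sketched with running statistics. The sketch assumes a single scalar feature per frame and uses Welford's online algorithm for mean and variance; the class `SegmentStats`, the function `classify`, the variance threshold, and the two labels are all hypothetical stand-ins, not the patent's classifier.

```python
class SegmentStats:
    """Running mean and variance of a per-frame feature (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, value):
        """Fold one frame feature into the statistics; the frame itself is not kept."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        return self._m2 / self.count if self.count else 0.0


def classify(stats, variance_threshold=1.0):
    """Hypothetical rule: label the segment by the variance of its feature."""
    return "speech" if stats.variance > variance_threshold else "non-speech"


stats = SegmentStats()
for feature in [1.0, 2.0, 3.0]:   # frames arrive one at a time
    stats.update(feature)          # update statistics, then drop the frame
    preliminary = classify(stats)  # a preliminary label is available at any point
final_label = classify(stats)      # computed once the end boundary is signalled
print(final_label)
```

Because only the count, mean, and squared-deviation sum are stored, memory use stays constant regardless of segment length.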
-
Patent number: 8831208
Abstract: A dialog manager for a spoken dialog system. A decision module selects a path from a plurality of alternative paths for a given call, wherein each path implements one of a plurality of strategies for a call flow. A weighting module weights the path selection decision and is connected to a probability estimator for estimating the probability value that a given one of the plurality of paths is the best-performing path.
Type: Grant
Filed: September 23, 2011
Date of Patent: September 9, 2014
Assignee: Synchronoss Technologies, Inc.
Inventors: David Suendermann, Jackson Liscombe, Jonathan Bloom, Grace Li, Roberto Pieraccini
-
Patent number: 8804178
Abstract: A method for routing a confirmation of receipt of a facsimile or portion thereof according to one embodiment of the present invention includes analyzing text of a facsimile for at least one of a meaning and a context of the text; and routing one or more confirmations to one or more destinations based on the analysis.
Type: Grant
Filed: February 25, 2013
Date of Patent: August 12, 2014
Assignee: Kofax, Inc.
Inventors: Roland G. Borrey, Roy Couchman
-
Patent number: 8781825
Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
Type: Grant
Filed: August 24, 2011
Date of Patent: July 15, 2014
Assignee: Sensory, Incorporated
Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
-
Patent number: 8743998
Abstract: A device for generating a transmission codebook in a communication system including a multi-input multi-output (MIMO) antenna according to an embodiment of the present invention includes: a frequency determiner that determines a frequency to allow the transmission codebook to have an optimal characteristic; a precoding matrix generator that generates a precoding matrix on the basis of the frequency; and a codebook generator that generates a retransmission codebook to be used for retransmission on the basis of the precoding matrix and generates the transmission codebook on the basis of the retransmission codebook.
Type: Grant
Filed: September 1, 2009
Date of Patent: June 3, 2014
Assignee: Electronics and Telecommunications Research Institute
Inventors: DongSeung Kwon, Byung-Jae Kwak, Choongil Yeh, Young Seog Song, Ji Hyung Kim, Wooram Shin, Chung Gu Kang, Jin-Woo Kim
-
Patent number: 8738367
Abstract: A speech signal processing device is equipped with a power acquisition unit, a probability distribution acquisition unit, and a correspondence degree determination unit. The power acquisition unit accepts an inputted speech signal and, based on the accepted speech signal, acquires power representing the intensity of a speech sound represented by the speech signal. The probability distribution acquisition unit acquires a probability distribution using the intensity of the power acquired by the power acquisition unit as a random variable. The correspondence degree determination unit determines whether a correspondence degree representing a degree that power acquired by the power acquisition unit in a case that a predetermined reference speech signal is inputted into the power acquisition unit corresponds with predetermined reference power is higher than a predetermined reference correspondence degree, based on the probability distribution acquired by the probability distribution acquisition unit.
Type: Grant
Filed: February 18, 2010
Date of Patent: May 27, 2014
Assignee: NEC Corporation
Inventor: Tadashi Emori
-
Patent number: 8694317
Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
Type: Grant
Filed: February 6, 2006
Date of Patent: April 8, 2014
Assignee: Aurix Limited
Inventors: Adrian I Skilling, Howard A K Wright
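The index-then-search idea in this abstract can be sketched with a small dynamic program. The sketch assumes the index is a list with one dictionary per frame, mapping every reference phone to a score where lower means a closer match, and that a search term is a sequence of phones each covering one or more consecutive frames. The function `search_index`, the data layout, and the unnormalized total-cost criterion are assumptions for illustration, not the patented search.

```python
def search_index(index, phones):
    """Find the lowest-cost alignment of a phone sequence to consecutive frames.

    index: list of {phone: score} dictionaries, one per frame (lower = closer).
    phones: the search term as a list of phones.
    Returns (best_cost, end_frame) for the best match ending at any frame.
    """
    inf = float("inf")
    n = len(phones)
    prev = [inf] * n  # prev[j]: best cost of being in phone j at the previous frame
    best = (inf, -1)
    for t, frame in enumerate(index):
        cur = [inf] * n
        for j, phone in enumerate(phones):
            if j == 0:
                # Either start a new match at this frame or stay in the first phone.
                enter = min(0.0, prev[0])
            else:
                # Either stay in phone j or advance from phone j - 1.
                enter = min(prev[j], prev[j - 1])
            cur[j] = enter + frame[phone]
        if cur[n - 1] < best[0]:
            best = (cur[n - 1], t)
        prev = cur
    return best


# Hypothetical four-frame index over a two-phone reference set.
index = [
    {"a": 5, "b": 5},
    {"a": 0, "b": 4},
    {"a": 3, "b": 0},
    {"a": 5, "b": 5},
]
print(search_index(index, ["a", "b"]))
```

Because every frame stores a score for every reference phone, any phone sequence can be searched later without re-processing the audio. The total cost used here favours short matches; a practical system would likely normalize by match length.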
-
Patent number: 8676858
Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position in a multi-dimensional space. This position relative to another file's position reveals distances between the files. Closest files can be grouped together. When contemplating voluminous numbers of files for digital spectrums, various methods include: concatenating all such files together to get a single key useful for creating a file's spectrum; or compressing files individually and combining their collective dictionaries into a single dictionary that defines the digital spectrum. Each provides advantage over the other. The latter consumes considerably less run time because each compression event can be distributed to a separate processor. Method two provides better spectrums because it is more “informationally” valid than is method one.
Type: Grant
Filed: January 8, 2010
Date of Patent: March 18, 2014
Assignee: Novell, Inc.
Inventor: Craig N. Teerlink
-
Publication number: 20140025376
Abstract: The subject matter discloses a computerized method for sales optimization comprising: receiving at a computer server a digital representation of a portion of an interaction between a customer and an organization representative, the portion of an interaction comprises a speech signal of the customer and a speech signal of the organization representative; analyzing the speech signal of the organization representative; analyzing the speech signal of the customer; determining a distance vector between the speech signal of the organization representative and the speech signal of the customer; and predicting a sale success probability score for the captured speech signal portion.
Type: Application
Filed: July 17, 2012
Publication date: January 23, 2014
Applicant: NICE-SYSTEMS LTD
Inventors: Moshe WASSERBLAT, Dan EYLON, Ezra DAYA, Tzach ASHKENAZI, Oren PEREG, Ohad POLLAK, Moshe AVLAGON
-
Patent number: 8620655
Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic
Type: Grant
Filed: August 10, 2011
Date of Patent: December 31, 2013
Assignee: Kabushiki Kaisha Toshiba
Inventors: Haitian Xu, Kean Kheong Chin, Mark John Francis Gales
-
Patent number: 8612225
Abstract: A voice recognition device that recognizes a voice of an input voice signal, comprises a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit.
Type: Grant
Filed: February 26, 2008
Date of Patent: December 17, 2013
Assignee: NEC Corporation
Inventors: Takayuki Arakawa, Ken Hanazawa, Masanori Tsujikawa
-
Patent number: 8606580
Abstract: To provide a data process unit and data process unit control program that are suitable for generating acoustic models for unspecified speakers taking distribution of diversifying feature parameters into consideration under such specific conditions as the type of speaker, speech lexicons, speech styles, and speech environment and that are suitable for providing acoustic models intended for unspecified speakers and adapted to speech of a specific person. The data process unit comprises a data classification section, data storing section, pattern model generating section, data control section, mathematical distance calculating section, pattern model converting section, pattern model display section, region dividing section, division changing section, region selecting section, and specific pattern model generating section.
Type: Grant
Filed: December 30, 2008
Date of Patent: December 10, 2013
Assignee: Asahi Kasei Kabushiki Kaisha
Inventors: Makoto Shozakai, Goshu Nagino
-
Publication number: 20130325468
Abstract: Disclosed are a conversation management method and a device for executing the same. The device includes: a calculation unit for calculating the importance of an utterance intention, the similarity between utterance intentions, and the relative distance between utterance intentions using at least one of a plurality of utterance intentions in a corpus and an utterance intention in a sequence relationship with the at least one utterance intention; a similarity calculating unit for calculating the similarity between conversation flows by comparing a conversation flow obtained from a corpus and a conversation flow obtained from a user utterance by means of the importance and similarity of an utterance intention; and an utterance intention verifying unit for calculating an evaluation score of an utterance intention by evaluating a user utterance according to the relative distance between utterance intentions.
Type: Application
Filed: October 21, 2011
Publication date: December 5, 2013
Applicant: POSTECH ACADEMY - INDUSTRY FOUNDATION
Inventors: Geun-Bae Lee, Sung-Jin Lee, Hyung-Jong Noh, Kyu-Song Lee
-
Patent number: 8599419
Abstract: A method for routing a facsimile according to one embodiment includes receiving or generating text of a facsimile in a computer-readable format; routing the facsimile or text thereof to an intended recipient identified by recognizing at least one of a name, an email address and contact information of the intended recipient in the facsimile; analyzing the text of the facsimile for at least one of a meaning and a context of the text; and routing the facsimile or text thereof to one or more other destinations based on the analysis. A method according to another embodiment includes analyzing a pattern of light and dark areas of a facsimile in a computer-readable format; correlating the pattern to one or more forms; and routing the facsimile to one or more destinations based on the correlation, with the proviso that the analyzing, correlating and routing are performed without optical character recognition.
Type: Grant
Filed: August 30, 2012
Date of Patent: December 3, 2013
Assignee: Kofax, Inc.
Inventor: Roy Couchman