Correlation Patents (Class 704/237)
  • Patent number: 10713262
    Abstract: Approaches are described for ranking multiple products or other items, such as products obtained in response to a search request submitted to a server. The ranking system determines a ranking score for the products based on both data available online and item data that must be computed offline due to longer computation time or unavailability of data. The ranking score can be used to rank the products and determine which products are the most relevant to the user. A hybrid boosting method is used to first train an online ranking function to produce an online ranking score for the item. In the second phase, an offline ranking function is trained to produce a second ranking score for the item. The online rank score is combined with the offline rank score at query time to produce a combined rank for the items in the search results.
    Type: Grant
    Filed: October 26, 2016
    Date of Patent: July 14, 2020
    Assignee: A9.com, Inc.
    Inventors: Yue Zhou, Francois Huet
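    A minimal sketch of the query-time score combination described in the abstract above. The linear blend and the alpha weight are illustrative assumptions; the patent trains the two ranking functions with a hybrid boosting method rather than using a fixed blend.
    ```python
    def combined_ranking(items, online_score, offline_scores, alpha=0.5):
        """Blend a query-time score with a precomputed offline score and sort.

        online_score: callable item -> float, computed at query time.
        offline_scores: dict item -> float, computed ahead of time.
        """
        def score(item):
            return alpha * online_score(item) + (1.0 - alpha) * offline_scores.get(item, 0.0)
        return sorted(items, key=score, reverse=True)

    # Example: with equal online relevance, the offline score decides the order.
    print(combined_ranking(["a", "b"], lambda i: 0.5, {"a": 0.9, "b": 0.1}))
    ```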
  • Patent number: 10693905
    Abstract: An invalidity detection electronic control unit connected to a bus used by a plurality of electronic control units (ECUs) to communicate with one another in accordance with the controller area network (CAN) protocol includes a receiving unit that receives a frame for which transmission is started and a transmitting unit that transmits an error frame on the bus before the tail end of the frame is transmitted if the frame received by the receiving unit meets a predetermined condition indicating invalidity, and that transmits a normal frame conforming to the CAN protocol after the error frame is transmitted. Even when a reception error counter of an ECU connected to the bus is incremented due to the impact of the error frame, the reception error counter is decremented by the normal frame.
    Type: Grant
    Filed: January 25, 2018
    Date of Patent: June 23, 2020
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Hiroshi Amano, Toshihisa Nakano, Natsume Matsuzaki, Tomoyuki Haga, Yoshihiro Ujiie, Takeshi Kishikawa
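    A toy simulation of the error-frame / normal-frame behaviour described above. The bus model, the counter step sizes, and the invalidity predicate are simplified assumptions; real CAN error counters follow the CAN specification rather than this one-step model.
    ```python
    class PeerECU:
        """A peer node whose reception error counter reacts to bus traffic."""
        def __init__(self):
            self.reception_error_counter = 0

        def on_error_frame(self):
            self.reception_error_counter += 1          # impacted by the injected error frame

        def on_normal_frame(self):
            self.reception_error_counter = max(0, self.reception_error_counter - 1)

    class InvalidityDetectionECU:
        """Sends an error frame on an invalid frame, then a follow-up normal frame."""
        def __init__(self, peers, is_invalid):
            self.peers = peers
            self.is_invalid = is_invalid               # predicate over a partially received frame

        def on_frame_started(self, frame):
            if self.is_invalid(frame):
                for p in self.peers:                   # error frame before the frame's tail end
                    p.on_error_frame()
                for p in self.peers:                   # normal frame restores the counters
                    p.on_normal_frame()

    peer = PeerECU()
    ecu = InvalidityDetectionECU([peer], is_invalid=lambda f: f["id"] == 0x7FF)
    ecu.on_frame_started({"id": 0x7FF, "data": b"\x00"})
    print(peer.reception_error_counter)                # back to 0 after the normal frame
    ```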
  • Patent number: 10629226
    Abstract: System and method for acoustic signal processing are disclosed. An exemplary device for acoustic signal processing includes a voice activity detector configured to detect a speech of a user. The device includes a microphone configured to receive acoustic signals from the user. The device further includes at least one processor configured to process the acoustic signals in response to detecting the speech of the user. The at least one processor is in an idle state before the speech of the user is detected.
    Type: Grant
    Filed: April 29, 2019
    Date of Patent: April 21, 2020
    Assignee: BESTECHNIC (SHANGHAI) CO., LTD.
    Inventors: Weifeng Tong, Qian Li, Liang Zhang
  • Patent number: 10199053
    Abstract: A method and apparatus for eliminating popping sounds at the beginning of audio includes: examining audio frames within a pre-set time period at the beginning of audio to determine a popping residing section; applying popping elimination to audio frames in the popping residing section; calculating an average value of amplitudes of M audio frames preceding the popping residing section and an average value of amplitudes of K audio frames succeeding the popping residing section; setting the amplitudes of the audio frames in the popping residing section to zero in response to a determination that the two average values are both smaller than a pre-set sound reduction threshold; weakening the amplitudes of the audio frames in the popping residing section in response to a determination that both the two average values are not smaller than a pre-set sound reduction threshold; M and K are integers larger than one.
    Type: Grant
    Filed: September 20, 2017
    Date of Patent: February 5, 2019
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventor: Lingcheng Kong
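    A minimal sketch of the zero-or-attenuate decision in the abstract above. The frame segmentation, the detected popping section, M, K, the reduction threshold, and the attenuation factor are all illustrative assumptions.
    ```python
    import numpy as np

    def suppress_popping(frames, pop_start, pop_end, M=5, K=5,
                         reduction_threshold=0.01, attenuation=0.1):
        """Zero or weaken frames in [pop_start, pop_end) based on surrounding amplitude."""
        frames = [np.asarray(f, dtype=float) for f in frames]
        preceding = frames[max(0, pop_start - M):pop_start]
        succeeding = frames[pop_end:pop_end + K]
        avg_before = np.mean([np.mean(np.abs(f)) for f in preceding]) if preceding else 0.0
        avg_after = np.mean([np.mean(np.abs(f)) for f in succeeding]) if succeeding else 0.0

        for i in range(pop_start, pop_end):
            if avg_before < reduction_threshold and avg_after < reduction_threshold:
                frames[i] = np.zeros_like(frames[i])   # silence surrounds the pop: mute it
            else:
                frames[i] = frames[i] * attenuation    # otherwise only weaken it
        return frames
    ```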
  • Patent number: 10157619
    Abstract: A method and a device for speech-based searching using artificial intelligence are provided. The method includes: recognizing an input speech of a user to determine whether the input speech is child speech; filtering a search result obtained for the input speech to obtain a filtered search result if the input speech is child speech; and feeding the filtered search result back to the user.
    Type: Grant
    Filed: November 28, 2017
    Date of Patent: December 18, 2018
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Chao Li, Xiangang Li, Jue Sun
  • Patent number: 10134425
    Abstract: A system for determining an endpoint of an utterance during automatic speech recognition (ASR) processing that accounts for the direction and duration of the incoming speech. Beamformers of the ASR system may identify a source direction of the audio. The system may track the duration for which speech has been received from that source direction so that, if speech is detected in another direction, the original source speech may be weighted differently for purposes of determining an endpoint of the utterance. Speech from a new direction may be discarded or treated like non-speech for purposes of determining an endpoint of speech from the original direction.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: November 20, 2018
    Assignee: Amazon Technologies, Inc.
    Inventor: Charles Melvin Johnson, Jr.
  • Patent number: 10121471
    Abstract: An automatic speech recognition (ASR) system detects an endpoint of an utterance using the active hypotheses under consideration by a decoder. The ASR system calculates the amount of non-speech detected by a plurality of hypotheses and weights the non-speech duration by the probability of each hypothesis. When the aggregate weighted non-speech exceeds a threshold, an endpoint may be declared.
    Type: Grant
    Filed: June 29, 2015
    Date of Patent: November 6, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Bjorn Hoffmeister, Ariya Rastrow, Baiyang Liu
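    A rough sketch of the probability-weighted non-speech test described above. The hypothesis structure and the frame threshold are illustrative assumptions, not the patented decoder integration.
    ```python
    from dataclasses import dataclass

    @dataclass
    class Hypothesis:
        posterior: float          # probability of this active decoding hypothesis
        nonspeech_frames: int     # trailing non-speech frames under this hypothesis

    def endpoint_detected(hypotheses, threshold_frames=50.0):
        """Declare an endpoint when the weighted non-speech duration is large enough."""
        weighted_nonspeech = sum(h.posterior * h.nonspeech_frames for h in hypotheses)
        return weighted_nonspeech > threshold_frames

    # Example: two active hypotheses that mostly agree speech has stopped.
    print(endpoint_detected([Hypothesis(0.7, 80), Hypothesis(0.3, 10)]))
    ```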
  • Patent number: 9905240
    Abstract: Systems, methods, and devices for intelligent speech recognition and processing are disclosed. According to one embodiment, a method for improving intelligibility of a speech signal may include (1) at least one processor receiving an incoming speech signal comprising a plurality of sound elements; (2) the at least one processor recognizing a sound element in the incoming speech signal to improve the intelligibility thereof; (3) the at least one processor processing the sound element by at least one of modifying and replacing the sound element; and (4) the at least one processor outputting the processed speech signal comprising the processed sound element.
    Type: Grant
    Filed: October 19, 2015
    Date of Patent: February 27, 2018
    Assignee: AUDIMAX, LLC
    Inventor: Harry Levitt
  • Patent number: 9690776
    Abstract: Methods and systems are provided for contextual language understanding. A natural language expression may be received at a single-turn model and a multi-turn model for determining an intent of a user. For example, the single-turn model may determine a first prediction of at least one of a domain classification, intent classification, and slot type of the natural language expression. The multi-turn model may determine a second prediction of at least one of a domain classification, intent classification, and slot type of the natural language expression. The first prediction and the second prediction may be combined to produce a final prediction relative to the intent of the natural language expression. An action may be performed based on the final prediction of the natural language expression.
    Type: Grant
    Filed: December 1, 2014
    Date of Patent: June 27, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ruhi Sarikaya, Puyang Xu, Alexandre Rochette, Asli Celikyilmaz
  • Patent number: 9484022
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a deep neural network. One of the methods includes generating a plurality of feature vectors that each model a different portion of an audio waveform, generating a first posterior probability vector for a first feature vector using a first neural network, determining whether one of the scores in the first posterior probability vector satisfies a first threshold value, generating a second posterior probability vector for each subsequent feature vector using a second neural network, wherein the second neural network is trained to identify the same key words and key phrases and includes more inner layer nodes than the first neural network, and determining whether one of the scores in the second posterior probability vector satisfies a second threshold value.
    Type: Grant
    Filed: May 23, 2014
    Date of Patent: November 1, 2016
    Assignee: Google Inc.
    Inventor: Alexander H. Gruenstein
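    A sketch of the two-stage gating described above, with stand-in networks. The feature shapes, the thresholds, and the toy "networks" are assumptions; only the small-network-gates-large-network flow follows the abstract.
    ```python
    import numpy as np

    def cascade_keyword_detect(feature_vectors, small_net, large_net,
                               first_threshold=0.4, second_threshold=0.8):
        """small_net / large_net: callables mapping a feature vector to keyword posteriors."""
        for i, fv in enumerate(feature_vectors):
            if np.max(small_net(fv)) < first_threshold:
                continue                               # cheap network sees nothing; keep scanning
            for later_fv in feature_vectors[i:]:       # confirm with the larger network
                second_posteriors = large_net(later_fv)
                if np.max(second_posteriors) >= second_threshold:
                    return int(np.argmax(second_posteriors))
        return None

    # Toy example with constant-output "networks".
    fvs = [np.zeros(3)] * 5
    print(cascade_keyword_detect(fvs, lambda f: np.array([0.1, 0.9]),
                                 lambda f: np.array([0.05, 0.95])))
    ```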
  • Patent number: 9247376
    Abstract: A method of recommending an application, capable of selecting and recommending an application with a high likelihood of use, the method including: receiving, in a server, frequencies of use of a plurality of applications that are classified according to the time when each application is executed or the location where each application is executed; selecting an application from among the plurality of applications based on the time and the location of a mobile terminal and on the frequency of use of the application; and transmitting application recommendation information including the selected application from the server to the mobile terminal.
    Type: Grant
    Filed: April 11, 2012
    Date of Patent: January 26, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ji-in Nam, Moon-sang Lee, Min-soo Koo, Seung-hyun Yoon
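    A minimal sketch of ranking apps by usage frequency per time bucket and location, as the abstract describes. The six-hour time buckets, the in-memory counts, and the example names are assumptions.
    ```python
    from collections import defaultdict

    usage_counts = defaultdict(int)   # (app, time_bucket, location) -> number of launches

    def record_use(app, hour, location):
        usage_counts[(app, hour // 6, location)] += 1   # four coarse time buckets per day

    def recommend(hour, location, top_n=3):
        """Rank apps by how often they were used in this time bucket at this location."""
        bucket = hour // 6
        scores = defaultdict(int)
        for (app, b, loc), count in usage_counts.items():
            if b == bucket and loc == location:
                scores[app] += count
        return sorted(scores, key=scores.get, reverse=True)[:top_n]

    record_use("maps", 8, "station")
    record_use("maps", 9, "station")
    record_use("music", 8, "station")
    print(recommend(8, "station"))    # ['maps', 'music']
    ```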
  • Patent number: 9245529
    Abstract: A method of encoding samples in a digital signal is provided that includes receiving a plurality of samples of the digital signal, and encoding the plurality of samples, wherein an output number of bits is adapted for coding efficiency when a value in a range of possible distinct data values of the plurality of samples is not found in the plurality of samples.
    Type: Grant
    Filed: June 18, 2010
    Date of Patent: January 26, 2016
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jacek Piotr Stachurski, Lorin Paul Netsch
  • Patent number: 9123330
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data encoding ambient sounds, identifying media content that matches the audio data, and a timestamp corresponding to a particular portion of the identified media content, identifying a speaker associated with the particular portion of the identified media content corresponding to the timestamp, and providing information identifying the speaker associated with the particular portion of the identified media content for output.
    Type: Grant
    Filed: May 1, 2013
    Date of Patent: September 1, 2015
    Assignee: Google Inc.
    Inventors: Matthew Sharifi, Dominik Roblek
  • Patent number: 9117458
    Abstract: A method of processing an audio signal is disclosed, comprising: receiving, by an audio processing apparatus, spectral data including a current block, and substitution type information indicating whether to apply a shape prediction scheme to the current block; when the substitution type information indicates that the shape prediction scheme is applied to the current block, receiving lag information indicating an interval between the spectral coefficients of the current block and the predictive shape vector of a current frame or a previous frame; and obtaining spectral coefficients by substituting for a spectral hole included in the current block using the predictive shape vector.
    Type: Grant
    Filed: November 2, 2010
    Date of Patent: August 25, 2015
    Assignees: LG Electronics Inc., Industry-Academic Cooperation Foundation, Yonsei University
    Inventors: Hyen-O Oh, Chang Heon Lee, Hong Goo Kang
  • Patent number: 9020817
    Abstract: Methods and apparatus, including computer program products, for using speech-to-text to detect commercials and to align edited episodes with transcripts. A method includes receiving an original video or audio having a transcript, receiving an edited video or audio of the original video or audio, applying a speech-to-text process to the received original video or audio, applying a speech-to-text process to the received edited video or audio, and applying an alignment to determine the locations of the edits.
    Type: Grant
    Filed: January 18, 2013
    Date of Patent: April 28, 2015
    Assignee: Ramp Holdings, Inc.
    Inventors: R Paul Johnson, Raymond Lau
  • Publication number: 20150088508
    Abstract: In embodiments, apparatuses, methods and storage media are described that are associated with training adaptive speech recognition systems (“ASR”) using audio and text obtained from captioned video. In various embodiments, the audio and caption may be aligned for identification, such as according to a start and end time associated with a caption, and the alignment may be adjusted to better fit audio to a given caption. In various embodiments, the aligned audio and caption may then be used for training if an error value associated with the audio and caption demonstrates that the audio and caption will aid in training the ASR. In various embodiments, filters may be used on audio and text prior to training. Such filters may be used to exclude potential training audio and text based on filter criteria. Other embodiments may be described and claimed.
    Type: Application
    Filed: September 25, 2013
    Publication date: March 26, 2015
    Applicant: Verizon Patent and Licensing Inc.
    Inventors: Sujeeth S. Bharadwaj, Suri B. Medapati
  • Patent number: 8942978
    Abstract: Parameters for distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation uses only acoustic data and not any intermediate estimate of hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
    Type: Grant
    Filed: July 14, 2011
    Date of Patent: January 27, 2015
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
  • Publication number: 20150012274
    Abstract: An apparatus for extracting features for speech recognition in accordance with the present invention includes: a frame forming portion configured to separate input speech signals in frame units having a prescribed size; a static feature extracting portion configured to extract a static feature vector for each frame of the speech signals; a dynamic feature extracting portion configured to extract a dynamic feature vector representing a temporal variance of the extracted static feature vector by use of a basis function or a basis vector; and a feature vector combining portion configured to combine the extracted static feature vector with the extracted dynamic feature vector to configure a feature vector stream.
    Type: Application
    Filed: May 15, 2014
    Publication date: January 8, 2015
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Sung-Joo LEE, Byung-Ok Kang, Hoon Chung, Ho-Young Jung, Hwa-Jeon Song, Yoo-Rhee Oh, Yun-Keun Lee
  • Publication number: 20150012273
    Abstract: An apparatus includes a function module, a strength module, and a filter module. The function module compares an input signal, which has a component, to a first delayed version of the input signal and a second delayed version of the input signal to produce a multi-dimensional model. The strength module calculates a strength of each extremum from a plurality of extrema of the multi-dimensional model based on a value of at least one opposite extremum of the multi-dimensional model. The strength module then identifies, from the plurality of extrema, a first extremum that is associated with a pitch of the component of the input signal and whose strength is greater than the strength of the remaining extrema. The filter module extracts the pitch of the component from the input signal based on the strength of the first extremum.
    Type: Application
    Filed: March 3, 2014
    Publication date: January 8, 2015
    Applicant: University Of Maryland, College Park
    Inventors: Carol Espy-Wilson, Srikanth Vishnubhotla
  • Publication number: 20150012275
    Abstract: A semiconductor integrated circuit device for speech recognition includes a scenario setting unit that receives a command designating scenario flow information and selects prescribed speech reproduction data in a speech reproduction data storage and a prescribed conversion list, in accordance with the scenario flow information, a standard pattern extraction unit that extracts a standard pattern corresponding to at least part of individual words or sentences included in the prescribed conversion list from a speech recognition database, a speech signal synthesizer that synthesizes an output speech signal, a signal processor that generates a feature pattern representing the distribution state of the frequency component of an input speech signal, and a match detector that compares the feature pattern with the standard pattern and outputs a speech recognition result.
    Type: Application
    Filed: July 7, 2014
    Publication date: January 8, 2015
    Inventor: Tsutomu NONAKA
  • Patent number: 8924216
    Abstract: A method for synchronizing sound data and text data, said text data being obtained by manual transcription of said sound data during playback of the latter. The proposed method comprises the steps of repeatedly querying said sound data and said text data to obtain a current time position corresponding to a currently played sound datum and a currently transcribed text datum, respectively, correcting said current time position by applying a time correction value in accordance with a transcription delay, and generating at least one association datum indicative of a synchronization association between said corrected time position and said currently transcribed text datum. Thus, the proposed method achieves cost-effective synchronization of sound and text in connection with the manual transcription of sound data.
    Type: Grant
    Filed: September 27, 2013
    Date of Patent: December 30, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Andreas Neubacher, Miklos Papai
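    A minimal sketch of the delay-corrected association step in the abstract above. The query callbacks, the two-second transcription delay, and the polling parameters are hypothetical placeholders.
    ```python
    import time

    def synchronize(get_playback_time, get_current_word,
                    transcription_delay=2.0, poll_interval=0.5, n_polls=20):
        """Repeatedly query the playback position and the text being typed, shift the
        position back by the assumed transcription delay, and record the association."""
        associations = []
        for _ in range(n_polls):
            corrected = get_playback_time() - transcription_delay
            associations.append((corrected, get_current_word()))
            time.sleep(poll_interval)
        return associations
    ```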
  • Patent number: 8892436
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
    Type: Grant
    Filed: October 19, 2011
    Date of Patent: November 18, 2014
    Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
    Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
  • Patent number: 8825480
    Abstract: A system is provided for transmitting information through a speech codec (in-band) such as found in a wireless communication network. A modulator transforms the data into a spectrally noise-like signal based on the mapping of a shaped pulse to predetermined positions within a modulation frame, and the signal is efficiently encoded by a speech codec. A synchronization sequence provides modulation frame timing at the receiver and is detected based on analysis of a correlation peak pattern. A request/response protocol provides reliable transfer of data using message redundancy, retransmission, and/or robust modulation modes dependent on the communication channel conditions.
    Type: Grant
    Filed: June 3, 2009
    Date of Patent: September 2, 2014
    Assignee: Qualcomm Incorporated
    Inventors: Christoph A. Joetten, Christian Sgraja, Georg Frank, Pengjun Huang, Christian Pietsch, Marc W. Werner, Ethan R. Duni, Eugene J. Baik
  • Patent number: 8781825
    Abstract: Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters.
    Type: Grant
    Filed: August 24, 2011
    Date of Patent: July 15, 2014
    Assignee: Sensory, Incorporated
    Inventors: Jonathan Shaw, Pieter Vermeulen, Stephen Sutton, Robert Savoie
  • Patent number: 8694317
    Abstract: Methods for processing audio data containing speech to produce a searchable index file and for subsequently searching such an index file are provided. The processing method uses a phonetic approach and models each frame of the audio data with a set of reference phones. A score for each of the reference phones, representing the difference of the audio from the phone model, is stored in the searchable data file for each of the phones in the reference set. A consequence of storing information regarding each of the reference phones is that the accuracy of searches carried out on the index file is not compromised by the rejection of information about particular phones. A subsequent search method is also provided which uses a simple and efficient dynamic programming search to locate instances of a search term in the audio. The methods of the present invention have particular application to the field of audio data mining.
    Type: Grant
    Filed: February 6, 2006
    Date of Patent: April 8, 2014
    Assignee: Aurix Limited
    Inventors: Adrian I Skilling, Howard A K Wright
  • Patent number: 8655655
    Abstract: A sound event detecting module detects whether a sound event with a repeating characteristic is generated. A sound end recognizing unit recognizes the ends of sounds in a sound signal to generate sound sections and corresponding sets of feature vectors for the sound sections. A storage unit stores at least M sets of feature vectors. A similarity comparing unit compares the at least M sets of feature vectors with each other and correspondingly generates a similarity score matrix, which stores the similarity scores of any two of the at least M sound sections. A correlation arbitrating unit determines the number of sound sections with high correlation to each other according to the similarity score matrix. When the number is greater than a threshold value, the correlation arbitrating unit indicates that a sound event with the repeating characteristic has been generated.
    Type: Grant
    Filed: December 30, 2010
    Date of Patent: February 18, 2014
    Assignee: Industrial Technology Research Institute
    Inventors: Yuh-Ching Wang, Kuo-Yuan Li
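    A rough sketch of the similarity-matrix arbitration described above, assuming fixed-length feature vectors per sound section and Pearson correlation as the similarity score; both thresholds are illustrative.
    ```python
    import numpy as np

    def repeating_event_detected(feature_vectors, similarity_threshold=0.8, count_threshold=2):
        """feature_vectors: one 1-D array per detected sound section (equal lengths)."""
        m = len(feature_vectors)
        sim = np.zeros((m, m))
        for i in range(m):
            for j in range(i + 1, m):
                sim[i, j] = sim[j, i] = np.corrcoef(feature_vectors[i], feature_vectors[j])[0, 1]
        # Sections that correlate strongly with at least one other section.
        correlated = {i for i in range(m) for j in range(m)
                      if i != j and sim[i, j] >= similarity_threshold}
        return len(correlated) > count_threshold
    ```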
  • Patent number: 8589165
    Abstract: The present disclosure provides method and system for converting a free text expression of an identity to a phonetic equivalent code. The conversion follows a set of rules based on phonetic groupings and compresses the expression to a shorter series of characters than the expression. The phonetic equivalent code may be compared to one or more other phonetic equivalent code to establish a correlation between the codes. The phonetic equivalent code of the free text expression may be associated with the code of a known identity. The known identity may be provided to a user for confirmation of the identity. Further, a plurality of expressions stored in a database may be consolidated by converting the expressions to phonetic equivalent codes, comparing the codes to find correlations, and if appropriate reducing the number of expressions or mapping the expressions to a fewer number of expressions.
    Type: Grant
    Filed: January 24, 2012
    Date of Patent: November 19, 2013
    Assignee: United Services Automobile Association (USAA)
    Inventors: Gregory Brian Meyer, James Elden Nicholson
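    The abstract does not give the actual grouping rules, so this sketch borrows Soundex-style consonant groups purely as a stand-in to show the group-compress-compare flow; the groups, code length, and equality test are assumptions.
    ```python
    PHONETIC_GROUPS = {
        **dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
        **dict.fromkeys("DT", "3"), "L": "4",
        **dict.fromkeys("MN", "5"), "R": "6",
    }

    def phonetic_code(expression, max_len=6):
        """Compress a free-text expression into a short phonetic-group code."""
        code, prev = [], None
        for ch in expression.upper():
            digit = PHONETIC_GROUPS.get(ch)
            if digit and digit != prev:       # collapse runs from the same group
                code.append(digit)
            prev = digit
        return "".join(code)[:max_len]

    def correlated(expr_a, expr_b):
        return phonetic_code(expr_a) == phonetic_code(expr_b)

    print(correlated("Jon Smith", "John Smyth"))   # True under these toy rules
    ```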
  • Patent number: 8576961
    Abstract: A method for determining an overlap and add length estimate comprises determining a plurality of correlation values of a plurality of ordered frequency domain samples obtained from a data frame; comparing the correlation values of a first subset of the samples to a first predetermined threshold to determine a first edge sample; comparing the correlation values of a second subset of the samples to a second predetermined threshold to determine a second edge sample; using the first and second edge samples to determine an overlap and add length estimate; and providing the overlap and add length estimate to an overlap and add circuit.
    Type: Grant
    Filed: June 15, 2009
    Date of Patent: November 5, 2013
    Assignee: Olympus Corporation
    Inventors: Haidong Zhu, Dumitru Mihai Ionescu, Abu Amanullah
  • Patent number: 8560327
    Abstract: A method for synchronizing sound data and text data, said text data being obtained by manual transcription of said sound data during playback of the latter. The proposed method comprises the steps of repeatedly querying said sound data and said text data to obtain a current time position corresponding to a currently played sound datum and a currently transcribed text datum, respectively, correcting said current time position by applying a time correction value in accordance with a transcription delay, and generating at least one association datum indicative of a synchronization association between said corrected time position and said currently transcribed text datum. Thus, the proposed method achieves cost-effective synchronization of sound and text in connection with the manual transcription of sound data.
    Type: Grant
    Filed: August 18, 2006
    Date of Patent: October 15, 2013
    Assignee: Nuance Communications, Inc.
    Inventors: Andreas Neubacher, Miklos Papai
  • Patent number: 8515753
    Abstract: The example embodiment of the present invention provides an acoustic model adaptation method for enhancing recognition performance for a non-native speaker's speech. In order to adapt acoustic models, pronunciation variations are first examined by analyzing a non-native speaker's speech. Thereafter, based on the pronunciation variations of the non-native speaker's speech, acoustic models are adapted in a state-tying step during the acoustic model training process. When this adaptation method is combined with a conventional acoustic model adaptation scheme, further-enhanced recognition performance can be obtained. The example embodiment enhances recognition performance for a non-native speaker's speech while reducing the degradation of recognition performance for a native speaker's speech.
    Type: Grant
    Filed: March 30, 2007
    Date of Patent: August 20, 2013
    Assignee: Gwangju Institute of Science and Technology
    Inventors: Hong Kook Kim, Yoo Rhee Oh, Jae Sam Yoon
  • Patent number: 8494847
    Abstract: A weighting factor learning system includes an audio recognition section that recognizes learning audio data and outputs the recognition result; a weighting factor updating section that updates a weighting factor applied to a score obtained from an acoustic model and a language model so that the difference between a correct-answer score calculated with the use of a correct-answer text of the learning audio data and a score of the recognition result becomes large; a convergence determination section that determines, with the use of the score after updating, whether to return to the weighting factor updating section to update the weighting factor again; and a weighting factor convergence determination section that determines, with the use of the score after updating, whether to return to the audio recognition section to perform the process again and update the weighting factor using the weighting factor updating section.
    Type: Grant
    Filed: February 19, 2008
    Date of Patent: July 23, 2013
    Assignee: NEC Corporation
    Inventors: Tadashi Emori, Yoshifumi Onishi
  • Publication number: 20130166291
    Abstract: Mental state of a person is classified in an automated manner by analysing natural speech of the person. A glottal waveform is extracted from a natural speech signal. Pre-determined parameters defining at least one diagnostic class of a class model are retrieved, the parameters determined from selected training glottal waveform features. The selected glottal waveform features are extracted from the signal. Current mental state of the person is classified by comparing extracted glottal waveform features with the parameters and class model. Feature extraction from a glottal waveform or other natural speech signal may involve determining spectral amplitudes of the signal, setting spectral amplitudes below a pre-defined threshold to zero and, for each of a plurality of sub bands, determining an area under the thresholded spectral amplitudes, and deriving signal feature parameters from the determined areas in accordance with a diagnostic class model.
    Type: Application
    Filed: August 23, 2010
    Publication date: June 27, 2013
    Applicant: RMIT UNIVERSITY
    Inventors: Margaret Lech, Nicholas Brian Allen, Ian Shaw Burnett, Ling He
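    A minimal sketch of the thresholded sub-band area features described in the abstract. The number of sub-bands, the relative amplitude threshold, and applying it directly to a raw signal (rather than an extracted glottal waveform) are assumptions.
    ```python
    import numpy as np

    def subband_area_features(signal, n_subbands=8, amp_threshold=0.05):
        """Zero small spectral amplitudes, then take the area under each sub-band."""
        spectrum = np.abs(np.fft.rfft(signal))
        spectrum[spectrum < amp_threshold * spectrum.max()] = 0.0
        bands = np.array_split(spectrum, n_subbands)
        return np.array([band.sum() for band in bands])

    print(subband_area_features(np.sin(np.linspace(0.0, 100.0, 1600))))
    ```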
  • Patent number: 8447605
    Abstract: A game apparatus includes a CPU core for creating an input envelope and a registered envelope. The input envelope has a plurality of envelope values detected from a voice waveform input in real time through a microphone. The registered envelope has a plurality of envelope values detected from a voice waveform previously input. Both of the input envelope and the registered envelope are stored in a RAM. The CPU core evaluates difference of the envelope values between the input envelope and the registered envelope. When an evaluated value satisfies a condition, the CPU core executes a process according to a command assigned to the registered envelope.
    Type: Grant
    Filed: June 3, 2005
    Date of Patent: May 21, 2013
    Assignee: Nintendo Co., Ltd.
    Inventor: Yoji Inagaki
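    A small sketch of the envelope comparison described above. The peak-per-frame envelope, the mean-absolute-difference evaluation, and the threshold are illustrative choices, not the patented evaluation.
    ```python
    import numpy as np

    def envelope(waveform, frame_len=256):
        """Per-frame peak amplitude as a crude envelope."""
        n = len(waveform) // frame_len
        return np.array([np.max(np.abs(waveform[i*frame_len:(i+1)*frame_len])) for i in range(n)])

    def matches_registered(input_env, registered_env, threshold=0.5):
        """A small enough difference triggers the command assigned to the registered envelope."""
        n = min(len(input_env), len(registered_env))
        diff = np.mean(np.abs(input_env[:n] - registered_env[:n]))
        return diff < threshold

    wave = np.sin(np.linspace(0.0, 50.0, 4096))
    print(matches_registered(envelope(wave), envelope(wave * 0.95)))   # True
    ```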
  • Patent number: 8433568
    Abstract: A method for measuring speech intelligibility includes inputting a speech waveform to a system. At least one acoustic feature is extracted from the waveform. From the acoustic feature, at least one phoneme is segmented. At least one acoustic correlate measure is extracted from the at least one phoneme and at least one intelligibility measure is determined. The at least one acoustic correlate measure is mapped to the at least one intelligibility measure.
    Type: Grant
    Filed: March 29, 2010
    Date of Patent: April 30, 2013
    Assignee: Cochlear Limited
    Inventors: Lee Krause, Mark Skowranski, Bonny Banerjee
  • Patent number: 8417518
    Abstract: A voice recognition system comprises: a voice input unit that receives an input signal from a voice input element and outputs it; a voice detection unit that detects an utterance segment in the input signal; a voice recognition unit that performs voice recognition on the utterance segment; and a control unit that outputs a control signal to at least one of the voice input unit and the voice detection unit and suppresses a detection frequency if the detection frequency satisfies a predetermined condition.
    Type: Grant
    Filed: February 27, 2008
    Date of Patent: April 9, 2013
    Assignee: NEC Corporation
    Inventor: Toru Iwasawa
  • Publication number: 20130013308
    Abstract: An approach is provided for determining a user age range. An age estimator causes, at least in part, acquisition of voice data. Next, the age estimator calculates a first set of probability values, wherein each of the probability values represents a probability that the voice data is in a respective one of a plurality of predefined age ranges, and the predefined age ranges are segments of a lifespan. Then, the age estimator derives a second set of probability values by applying a correlation matrix to the first set of probability values, wherein the correlation matrix associates the first set of probability values with probabilities of the voice data matching individual ages over the lifespan. Then, the age estimator, for each of the predefined age ranges, calculates a sum of the probabilities in the second set of probability values corresponding to the individual ages within the respective predefined age ranges.
    Type: Application
    Filed: March 23, 2010
    Publication date: January 10, 2013
    Applicant: NOKIA CORPORATION
    Inventors: Yang Cao, Feng Ding, Jilei Tian
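    An illustrative sketch of the two-stage estimate above: range probabilities are mapped to per-age probabilities through a correlation matrix and then re-summed per range. The age ranges, the lifespan length, and the random matrix are made-up stand-ins.
    ```python
    import numpy as np

    AGE_RANGES = [(0, 12), (13, 19), (20, 39), (40, 59), (60, 100)]
    LIFESPAN = 101   # ages 0..100

    def refine_age_probabilities(range_probs, correlation_matrix):
        """range_probs: shape (5,); correlation_matrix: shape (LIFESPAN, 5)."""
        per_age = correlation_matrix @ range_probs          # second set of probability values
        return np.array([per_age[lo:hi + 1].sum() for lo, hi in AGE_RANGES])

    rng = np.random.default_rng(0)
    corr = rng.random((LIFESPAN, len(AGE_RANGES)))
    corr /= corr.sum(axis=0, keepdims=True)                 # columns act like distributions over ages
    print(refine_age_probabilities(np.array([0.1, 0.2, 0.4, 0.2, 0.1]), corr))
    ```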
  • Patent number: 8296313
    Abstract: A method for controlling a relational database system, in which a query statement composed of keywords is analyzed, with the RTN being formed of independent RTN building blocks. Each RTN building block has an inner, directed decision graph that is defined independently of the inner, directed decision graphs of the other RTN building blocks, with at least one decision position along at least one decision path. In a selection step, the inner decision graphs of all RTN building blocks are run by means of the keywords, and all possible paths of each decision graph are followed until either no match with the respectively selected path is determined by the decision graph and the process is interrupted, or the respectively chosen path is run to the end.
    Type: Grant
    Filed: June 7, 2010
    Date of Patent: October 23, 2012
    Assignee: Mediareif Moestl & Reif Kommunikations-und Informationstechnologien OEG
    Inventor: Matthias Moestl
  • Patent number: 8265932
    Abstract: A system and method for identifying audio command prompts for use in a voice response environment is provided. A signature is generated for audio samples each having preceding audio, reference phrase audio, and trailing audio segments. The trailing segment is removed and each of the preceding and reference phrase segments are divided into buffers. The buffers are transformed into discrete fourier transform buffers. One of the discrete fourier transform buffers from the reference phrase segment that is dissimilar to each of the discrete fourier transform buffers from the preceding segment is selected as the signature. Audio command prompts are processed to generate a discrete fourier transform. Each discrete fourier transform for the audio command prompts is compared with each of the signatures and a correlation value is determined. One such audio command prompt matches one such signature when the correlation value for that audio command prompt satisfies a threshold.
    Type: Grant
    Filed: October 3, 2011
    Date of Patent: September 11, 2012
    Assignee: Intellisist, Inc.
    Inventor: Martin R. M. Dunsmuir
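    A rough sketch of signature selection and prompt matching as described above, assuming fixed-length DFT buffers and Pearson correlation as the similarity and correlation measure; the buffer length and the threshold are illustrative.
    ```python
    import numpy as np

    def dft_buffers(segment, buffer_len=256):
        """Split an audio segment into buffers and take the magnitude DFT of each."""
        n = len(segment) // buffer_len
        return [np.abs(np.fft.rfft(segment[i*buffer_len:(i+1)*buffer_len])) for i in range(n)]

    def select_signature(preceding, reference, buffer_len=256):
        """Pick the reference-phrase DFT buffer least similar to any preceding-segment buffer."""
        pre = dft_buffers(preceding, buffer_len)
        ref = dft_buffers(reference, buffer_len)
        def best_match(buf):
            return max(float(np.corrcoef(buf, p)[0, 1]) for p in pre)
        return min(ref, key=best_match)          # the most distinctive buffer is the signature

    def prompt_matches(prompt, signature, buffer_len=256, threshold=0.9):
        """A prompt matches when any of its DFT buffers correlates strongly with the signature."""
        return any(np.corrcoef(buf, signature)[0, 1] >= threshold
                   for buf in dft_buffers(prompt, buffer_len))

    rng = np.random.default_rng(1)
    sig = select_signature(rng.standard_normal(2048), rng.standard_normal(2048))
    print(prompt_matches(rng.standard_normal(2048), sig))
    ```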
  • Patent number: 8249871
    Abstract: A clustering tool to generate word clusters. In embodiments described, the clustering tool includes a clustering component that generates word clusters for words or word combinations in input data. In illustrated embodiments, the word clusters are used to modify or update a grammar for a closed vocabulary speech recognition application.
    Type: Grant
    Filed: November 18, 2005
    Date of Patent: August 21, 2012
    Assignee: Microsoft Corporation
    Inventor: Kunal Mukerjee
  • Publication number: 20120197642
    Abstract: Embodiments of the present invention relate to a signal identifying method, including: obtaining signal characteristics of a current frame of input signals; deciding, according to the signal characteristics of the current frame and updated signal characteristics of a background signal frame before the current frame, whether the current frame is a background signal frame; detecting whether the current frame serving as a background signal frame is in a first type signal state; and adjusting a signal classification decision threshold according to whether the current frame serving as a background signal frame is in the first type signal state to enhance the speech signal identification capability.
    Type: Application
    Filed: April 12, 2012
    Publication date: August 2, 2012
    Applicant: Huawei Technologies Co., Ltd.
    Inventors: Yuanyuan Liu, Zhe Wang, Eyal Shlomot
  • Patent number: 8175882
    Abstract: A method for task execution improvement includes: generating a baseline model for executing a task; recording a user executing the task; comparing the baseline model to the user's execution of the task; and providing feedback to the user based on the differences between the user's execution and the baseline model.
    Type: Grant
    Filed: January 25, 2008
    Date of Patent: May 8, 2012
    Assignee: International Business Machines Corporation
    Inventors: Sara H. Basson, Dimitiri Kanevsky, Edward E. Kelley, Bhuvana Ramabhadran
  • Publication number: 20120095762
    Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
    Type: Application
    Filed: October 19, 2011
    Publication date: April 19, 2012
    Applicants: SEOUL NATIONAL UNIVERSITY INDUSTRY FOUNDATION, SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki-wan EOM, Chang-woo HAN, Tae-gyoon KANG, Nam-soo KIM, Doo-hwa HONG, Jae-won LEE, Hyung-joon LIM
  • Patent number: 8140329
    Abstract: A method and apparatus are proposed for automatically recognizing observed audio data. An observation vector is created from audio features extracted from the observed audio data, and the observed audio data is recognized from the observation vector. The audio features are selected from a group of three types of features obtained from the observed audio data: (i) ICA features obtained by processing the observed audio data, (ii) first MFCC features obtained by removing the logarithm step from the conventional MFCC process, or (iii) second MFCC features obtained by applying the ICA process to the results of a mel scale filter bank.
    Type: Grant
    Filed: April 5, 2004
    Date of Patent: March 20, 2012
    Assignee: Sony Corporation
    Inventors: Jian Zhang, Wei Lu, Xiaobing Sun
  • Patent number: 8103506
    Abstract: The present disclosure provides method and system for converting a free text expression of an identity to a phonetic equivalent code. The conversion follows a set of rules based on phonetic groupings and compresses the expression to a shorter series of characters than the expression. The phonetic equivalent code may be compared to one or more other phonetic equivalent code to establish a correlation between the codes. The phonetic equivalent code of the free text expression may be associated with the code of a known identity. The known identity may be provided to a user for confirmation of the identity. Further, a plurality of expressions stored in a database may be consolidated by converting the expressions to phonetic equivalent codes, comparing the codes to find correlations, and if appropriate reducing the number of expressions or mapping the expressions to a fewer number of expressions.
    Type: Grant
    Filed: September 20, 2007
    Date of Patent: January 24, 2012
    Assignee: United Services Automobile Association
    Inventors: Gregory Brian Meyer, James Elden Nicholson
  • Publication number: 20110288864
    Abstract: A method for detecting speech using a first microphone adapted to produce a first signal (x), and a second microphone adapted to produce a second signal (x2), the method comprising the steps of: (i) applying gain to the second signal to produce a normalised second signal, which signal is normalised relative to the first signal; (ii) constructing one or more signal components from the first signal and the normalised second signal; (iii) constructing an adaptive differential microphone (ADM) having a constructed microphone response constructed from the one or more signal components, which response has at least one directional null; (iv) producing one or more ADM outputs (yf, yb) from the constructed microphone response in response to detected sound; (v) computing a ratio of a parameter of either a first signal component or a constructed microphone response to a parameter of an output of the ADM; (vi) comparing the ratio to an adaptive threshold value; and (vii) detecting speech if the ratio is greater than or equal to the adaptive threshold value.
    Type: Application
    Filed: November 19, 2010
    Publication date: November 24, 2011
    Applicant: NXP B.V.
    Inventors: Patrick Kechichian, Cornelis Pieter Janse, Rene Martinus Maria Derkx, Wouter Joos Tirry
  • Patent number: 8060365
    Abstract: A dialog processing system which includes a target expression data extraction unit for extracting a plurality of target expression data each including a pattern matching portion which matches an utterance pattern, which are inputted by an utterance pattern input unit and is an utterance structure derived from contents of field-independent general conversations, among a plurality of utterance data which are inputted by an utterance data input unit and obtained by converting contents of a plurality of conversations in one field; a feature extraction unit for retrieving the pattern matching portions, respectively, from the plurality of target expression data extracted, and then for extracting feature quantity common to the plurality of pattern matching portions; and a mandatory data extraction unit for extracting mandatory data in the one field included in the plurality of utterance data by use of the feature quantities extracted.
    Type: Grant
    Filed: July 3, 2008
    Date of Patent: November 15, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Nobuyasu Itoh, Shiho Negishi, Hironori Takeuchi
  • Publication number: 20110276323
    Abstract: The illustrative embodiments described herein provide systems and methods for authenticating a speaker. In one embodiment, a method includes receiving reference speech input including a reference passphrase to form a reference recording, and receiving test speech input including a test passphrase to form a test recording. The method includes determining whether the test passphrase matches the reference passphrase, and determining whether one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase. The method authenticates the speaker of the test speech input in response to determining that the reference passphrase matches the test passphrase and that one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase.
    Type: Application
    Filed: May 6, 2010
    Publication date: November 10, 2011
    Applicant: Senam Consulting, Inc.
    Inventor: Serge Olegovich Seyfetdinov
  • Publication number: 20110231186
    Abstract: A speech detection method is presented, which includes the following steps. A first voice capture device samples a first signal and a second voice capture device samples a second signal. The first voice capture device is closer to a speech signal source than the second voice capture device. A first energy corresponding to the first signal within an interval is calculated, a second energy corresponding to the second signal within the interval is calculated, and a first ratio is calculated from the first energy and the second energy. The first ratio is transformed into a second ratio. A threshold value is set. It is determined whether the speech signal source is detected by comparing the second ratio with the threshold value.
    Type: Application
    Filed: July 30, 2010
    Publication date: September 22, 2011
    Applicant: ISSC TECHNOLOGIES CORP.
    Inventors: Ying Tsung Lin, Yung Chen Ting, Pansop Kim
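    A minimal sketch of the energy-ratio test described above, assuming the "transform" is a conversion to decibels and the threshold is a fixed 6 dB; both are stand-ins for the patented steps.
    ```python
    import numpy as np

    def speech_detected(near_signal, far_signal, threshold_db=6.0):
        """Detect speech when the nearer microphone carries markedly more energy."""
        e_near = float(np.sum(np.square(near_signal)))   # first energy
        e_far = float(np.sum(np.square(far_signal)))     # second energy
        ratio = e_near / max(e_far, 1e-12)               # first ratio
        ratio_db = 10.0 * np.log10(ratio)                # transformed second ratio
        return ratio_db > threshold_db

    t = np.linspace(0.0, 1.0, 8000)
    print(speech_detected(np.sin(2 * np.pi * 200 * t), 0.1 * np.sin(2 * np.pi * 200 * t)))   # True
    ```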
  • Patent number: 8010356
    Abstract: Parameters for distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation uses only acoustic data and not any intermediate estimate of hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
    Type: Grant
    Filed: February 17, 2006
    Date of Patent: August 30, 2011
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
  • Patent number: 7974392
    Abstract: A communication device and method are provided for audibly outputting a received text message to a user, the text message being received from a sender. A text message to present audibly is received. An output voice to present the text message is retrieved, wherein the output voice is synthesized using predefined voice characteristic information to represent the sender's voice. The output voice is used to audibly present the text message to the user.
    Type: Grant
    Filed: March 2, 2010
    Date of Patent: July 5, 2011
    Assignee: Research In Motion Limited
    Inventor: Eric Ng