Hidden Markov Models (HMMs) (EPO) Patents (Class 704/E15.028)
  • Patent number: 11625853
    Abstract: There is disclosed a system for automatically detecting an irregularity on a pipe. The system includes a camera arranged at an external surface of the pipe, the camera being configured to capture a Red, Green and Blue (RGB) image of a region of the pipe. One or more hardware processors are in communication with the camera and are configured to: convert the RGB image to a modified image; split the modified image into a plurality of components; generate a binary image via performing a thresholding operation which utilizes the plurality of components; and detect the irregularity on the pipe via performing a feature extraction process on the binary image. Also disclosed and described is a related method.
    Type: Grant
    Filed: April 12, 2021
    Date of Patent: April 11, 2023
    Assignee: SAUDI ARABIAN OIL COMPANY
    Inventors: Ahmed Alalouni, Abubaker Saeed
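The four processing steps named in the abstract above (convert, split, threshold, extract features) can be illustrated with a minimal Python sketch. Every concrete choice here is an assumption made for illustration and is not taken from the patent: the "modified image" is taken to be the RGB image rescaled to [0, 1], the components are taken to be the three color channels, the threshold tests red dominance, and the extracted features are the foreground fraction and a bounding box. The function name and the `margin` and `min_fraction` parameters are hypothetical.

```python
def detect_irregularity(rgb, margin=0.2, min_fraction=0.01):
    """rgb: list of rows, each row a list of (r, g, b) tuples in 0..255."""
    # Step 1 (assumed): the "modified image" is the input rescaled to [0, 1].
    modified = [[tuple(c / 255.0 for c in px) for px in row] for row in rgb]
    # Step 2 (assumed): split into three per-channel components.
    red = [[px[0] for px in row] for row in modified]
    green = [[px[1] for px in row] for row in modified]
    blue = [[px[2] for px in row] for row in modified]
    # Step 3 (assumed): threshold on red dominance to build a binary image.
    binary = [
        [r - max(g, b) > margin for r, g, b in zip(rr, gr, br)]
        for rr, gr, br in zip(red, green, blue)
    ]
    # Step 4 (assumed): features are the foreground fraction and bounding box.
    hits = [(y, x) for y, row in enumerate(binary) for x, v in enumerate(row) if v]
    total = sum(len(row) for row in binary)
    fraction = len(hits) / total if total else 0.0
    if not hits:
        return {"detected": False, "fraction": fraction, "bbox": None}
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    bbox = (min(ys), min(xs), max(ys), max(xs))  # (top, left, bottom, right)
    return {"detected": fraction >= min_fraction, "fraction": fraction, "bbox": bbox}
```

A real system would likely use a different color conversion and a richer feature extractor; the sketch only shows how the four stages feed one another.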
  • Publication number: 20130006631
    Abstract: Environmental recognition systems may improve recognition accuracy by leveraging local and nonlocal features in a recognition target. A local decoder may be used to analyze local features, and a nonlocal decoder may be used to analyze nonlocal features. Local and nonlocal estimates may then be exchanged to improve the accuracy of the local and nonlocal decoders. Additional iterations of analysis and exchange may be performed until a predetermined threshold is reached. In some embodiments, the system may comprise extrinsic information extractors to prevent positive feedback loops from causing the system to adhere to erroneous previous decisions.
    Type: Application
    Filed: June 28, 2012
    Publication date: January 3, 2013
    Applicant: UTAH STATE UNIVERSITY
    Inventors: Jacob Gunther, Todd Moon
  • Publication number: 20120109650
    Abstract: Disclosed herein is an apparatus and method for creating an acoustic model. The apparatus includes a binary tree creation unit, an information creation unit, and a binary tree reduction unit. The binary tree creation unit creates a binary tree by repeatedly merging a plurality of Gaussian components for each Hidden Markov Model (HMM) state of an acoustic model based on a distance measure reflecting a variation in likelihood score. The information creation unit creates information about the largest size of the acoustic model in accordance with a platform including a speech recognizer. The binary tree reduction unit reduces the binary tree in accordance with the information about the largest size of the acoustic model.
    Type: Application
    Filed: October 28, 2011
    Publication date: May 3, 2012
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Hoon-Young CHO, Young-Ik Kim, Il-Bin Lee, Seung-Hi Kim, Jun Park, Dong-Hyun Kim, Sang-Hun Kim
  • Publication number: 20120101820
    Abstract: A method is disclosed for applying a multi-state barge-in acoustic model in a spoken dialogue system. The method includes receiving an audio speech input from the user during the presentation of a prompt, accumulating the audio speech input from the user, applying a non-speech component having at least two one-state Hidden Markov Models (HMMs) to the audio speech input from the user, applying a speech component having at least five three-state HMMs to the audio speech input from the user, in which each of the five three-state HMMs represents a different phonetic category, determining whether the audio speech input is a barge-in-speech input from the user, and if the audio speech input is determined to be the barge-in-speech input from the user, terminating the presentation of the prompt.
    Type: Application
    Filed: October 24, 2011
    Publication date: April 26, 2012
    Applicant: AT&T Intellectual Property I, L.P.
    Inventor: Andrej Ljolje
  • Publication number: 20120059657
    Abstract: A method for detecting and recognizing speech is provided that remotely detects body motions from a speaker during vocalization with one or more radar sensors. Specifically, the radar sensors include a transmit aperture that transmits one or more waveforms towards the speaker, and each of the waveforms has a distinct wavelength. A receiver aperture is configured to receive the scattered radio frequency energy from the speaker. Doppler signals correlated with the speaker vocalization are extracted with a receiver. Digital signal processors are configured to develop feature vectors utilizing the vocalization Doppler signals, and words associated with the feature vectors are recognized with a word classifier.
    Type: Application
    Filed: June 7, 2011
    Publication date: March 8, 2012
    Inventors: Jefferson M. Willey, Todd Stephenson, Hugh Faust, James P. Hansen, George J. Linde, Carol Chang, Justin Nevitt, James A. Ballas, Thomas Herne Crystal, Vincent Michael Stanford, Jean W. de Graaf
  • Publication number: 20120053944
    Abstract: A compressed state sequence s is determined directly from the input sequence of data x. A deterministic function ƒ(x) only tracks unique state transitions, and not the dwell times in each state. A polynomial time compressed state sequence inference method outperforms conventional compressed state sequence inference techniques.
    Type: Application
    Filed: August 31, 2010
    Publication date: March 1, 2012
    Inventors: Cuneyt Oncel Tuzel, Gungor Polatkan
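To make the notion of a compressed state sequence in the abstract above concrete, the following Python sketch collapses a full state sequence so that only the unique transitions remain and the dwell times are discarded. It illustrates the definition only: the abstract describes inferring the compressed sequence directly from the data x, whereas this sketch assumes a full state sequence is already available. The function name is hypothetical.

```python
from itertools import groupby


def compress_states(states):
    """Keep one entry per run of identical consecutive states."""
    # groupby yields one group per run, so dwell times are dropped.
    return [state for state, _run in groupby(states)]
```

For example, `compress_states("aaabbbaac")` gives `['a', 'b', 'a', 'c']`: the second visit to state `a` is kept because it is a separate transition, while the run lengths are lost.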
  • Publication number: 20120041764
    Abstract: A speech processing method, comprising: receiving a speech input which comprises a sequence of feature vectors; determining the likelihood of a sequence of words arising from the sequence of feature vectors using an acoustic model and a language model, comprising: providing an acoustic model for performing speech recognition on an input signal which comprises a sequence of feature vectors, said model having a plurality of model parameters relating to the probability distribution of a word or part thereof being related to a feature vector, wherein said speech input is a mismatched speech input which is received from a speaker in an environment which is not matched to the speaker or environment under which the acoustic model was trained; and adapting the acoustic model to the mismatched speech input, the speech processing method further comprising determining the likelihood of a sequence of features occurring in a given language using a language model; and combining the likelihoods determined by the acoustic
    Type: Application
    Filed: August 10, 2011
    Publication date: February 16, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventors: Haitian XU, Kean Kheong Chin, Mark John Francis Gales
  • Publication number: 20100191532
    Abstract: An object comparison method comprises: generating a first ordered vector sequence representation of a first object; generating a second ordered vector sequence representation of a second object; representing the first object by a first ordered sequence of model parameters generated by modeling the first ordered vector sequence representation using a semi-continuous hidden Markov model employing a universal basis; representing the second object by a second ordered sequence of model parameters generated by modeling the second ordered vector sequence representation using a semi-continuous hidden Markov model employing the universal basis; and comparing the first and second ordered sequences of model parameters to generate a quantitative comparison measure.
    Type: Application
    Filed: January 28, 2009
    Publication date: July 29, 2010
    Applicant: Xerox Corporation
    Inventors: Jose A. Rodriguez Serrano, Florent C. Perronnin
  • Publication number: 20100145698
    Abstract: Computer-implemented systems and methods are provided for assessing non-native spontaneous speech pronunciation. Speech recognition on digitized speech is performed using a non-native acoustic model trained with non-native speech to generate word hypotheses for the digitized speech. Time alignment is performed between the digitized speech and the word hypotheses using a reference acoustic model trained with native-quality speech. Statistics are calculated regarding individual words and phonemes in the word hypotheses based on the alignment. A plurality of features for use in assessing pronunciation of the speech are calculated based on the statistics, an assessment score is calculated based on one or more of the calculated features, and the assessment score is stored in a computer-readable memory.
    Type: Application
    Filed: December 1, 2009
    Publication date: June 10, 2010
    Applicant: Educational Testing Service
    Inventors: Lei Chen, Klaus Zechner, Xiaoming Xi
  • Publication number: 20100094626
    Abstract: It is an object of the present invention to provide a method and apparatus for locating a keyword of a speech and a speech recognition system. The method includes the steps of: by extracting feature parameters from frames constituting the recognition target speech, forming a feature parameter vector sequence that represents the recognition target speech; by normalizing the feature parameter vector sequence with use of a codebook containing a plurality of codebook vectors, obtaining a feature trace of the recognition target speech in a vector space; and specifying the position of a keyword by matching prestored keyword template traces with the feature trace. According to the present invention, a keyword template trace and a feature space trace of a target speech are drawn in accordance with an identical codebook. This causes resampling to be unnecessary in performing linear movement matching of speech wave frames having similar phonological feature structures.
    Type: Application
    Filed: September 27, 2007
    Publication date: April 15, 2010
    Inventors: Fengqin Li, Yadong Wu, Qinqtao Yang, Chen Chen
  • Publication number: 20100076758
    Abstract: A speech recognition system described herein includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an updater component that is in communication with a first model and a second model, wherein the updater component automatically updates parameters of the second model based at least in part upon joint estimates of additive and convolutive distortions output by the first model, wherein the joint estimates of additive and convolutive distortions are estimates of distortions based on a phase-sensitive model in the speech utterance received by the receiver component. Further, distortions other than additive and convolutive distortions, including other stationary and nonstationary sources, can also be estimated and used to update the parameters of the second model.
    Type: Application
    Filed: September 24, 2008
    Publication date: March 25, 2010
    Applicant: Microsoft Corporation
    Inventors: Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alejandro Acero
  • Publication number: 20100070274
    Abstract: An apparatus for a speech recognition based on source separation and identification includes: a sound source separator for separating mixed signals, which are input to two or more microphones, into sound source signals by using independent component analysis (ICA), and estimating direction information of the separated sound source signals; and a speech recognizer for calculating normalized log likelihood probabilities of the separated sound source signals. The apparatus further includes a speech signal identifier identifying a sound source corresponding to a user's speech signal by using both of the estimated direction information and the reliability information based on the normalized log likelihood probabilities.
    Type: Application
    Filed: July 7, 2009
    Publication date: March 18, 2010
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Hoon-Young CHO, Sang Kyu Park, Jun Park, Seung Hi Kim, Ilbin Lee, Kyuwoong Hwang, Hyung-Bae Jeon, Yunkeun Lee
  • Publication number: 20100057462
    Abstract: The present invention relates to a method for speech recognition of a speech signal, comprising the steps of providing at least one codebook comprising codebook entries, in particular multivariate Gaussians of feature vectors, that are frequency weighted such that higher weights are assigned to entries corresponding to frequencies below a predetermined level than to entries corresponding to frequencies above the predetermined level, and processing the speech signal for speech recognition, comprising extracting at least one feature vector from the speech signal and matching the feature vector with the entries of the codebook.
    Type: Application
    Filed: September 2, 2009
    Publication date: March 4, 2010
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Tobias Herbig, Martin Raab, Raymond Brueckner, Rainer Gruhn
  • Publication number: 20080294436
    Abstract: A device may identify terms in a speech signal using speech recognition. The device may further retain one or more of the identified terms by comparing them to a set of words and send the retained terms and information associated with the retained terms to a remote device. The device may also receive messages that are related to the retained terms and to the information associated with the retained terms from the remote device.
    Type: Application
    Filed: May 21, 2007
    Publication date: November 27, 2008
    Applicant: SONY ERICSSON MOBILE COMMUNICATIONS AB
    Inventors: Mans Folke Markus Andreasson, Per Emil Astrand, Erik Johan Vendel Backlund
  • Publication number: 20080114595
    Abstract: An automatic speech recognition method for identifying words from an input speech signal includes providing at least one hypothesis recognition based on the input speech signal, the hypothesis recognition being an individual hypothesis word or a sequence of individual hypothesis words, and computing a confidence measure for the hypothesis recognition, based on the input speech signal, wherein computing a confidence measure includes computing differential contributions to the confidence measure, each as a difference between a constrained acoustic score and an unconstrained acoustic score, weighting each differential contribution by applying thereto a cumulative distribution function of the differential contribution, so as to make the distributions of the confidence measures homogeneous in terms of rejection capability, as the language, vocabulary and grammar vary, and computing the confidence measure by averaging the weighted differential contributions.
    Type: Application
    Filed: December 28, 2004
    Publication date: May 15, 2008
    Inventors: Claudio Vair, Daniele Colibro
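
The confidence computation described in the last abstract above (differential contributions, weighting by a cumulative distribution function, then averaging) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented method: the cumulative distribution function is assumed to be Gaussian, and the `mean` and `std` parameters, like the function names, are hypothetical placeholders for statistics that would be estimated from data.

```python
import math


def normal_cdf(x, mean, std):
    """Gaussian CDF (assumed form of the weighting function)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def confidence(constrained, unconstrained, mean=0.0, std=1.0):
    """Average of CDF-weighted differences between paired acoustic scores."""
    if not constrained or len(constrained) != len(unconstrained):
        raise ValueError("score lists must be non-empty and of equal length")
    # Differential contribution: constrained score minus unconstrained score.
    diffs = [c - u for c, u in zip(constrained, unconstrained)]
    # Weighting: map each contribution through the (assumed Gaussian) CDF.
    weighted = [normal_cdf(d, mean, std) for d in diffs]
    # Final confidence: mean of the weighted contributions, in [0, 1].
    return sum(weighted) / len(weighted)
```

Mapping each contribution through a CDF places the result on a common [0, 1] scale, which is one way to read the abstract's goal of making confidence distributions homogeneous as the language, vocabulary and grammar vary.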