Continuous Density, e.g., Gaussian Distribution, Laplace (EPO) Patents (Class 704/256.7)
  • Patent number: 10390130
    Abstract: A sound processing apparatus includes an acquisition unit configured to acquire sound signals collected by a microphone array, a sound source localization unit configured to determine a sound source direction on the basis of the sound signals acquired by the acquisition unit, and a sound source identification unit configured to identify a type of sound source on the basis of a sound model indicating a dependence relationship between sound sources, in which the sound model is represented by a probabilistic model expression including sound source localization as an element.
    Type: Grant
    Filed: June 12, 2017
    Date of Patent: August 20, 2019
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Ryosuke Kojima
  • Patent number: 9602938
    Abstract: At least one exemplary embodiment is directed to an electronic device configured to collect acoustic information or a method including the steps of collecting acoustic data by a microphone communicatively coupled to a mobile device, analyzing the acoustic data for a sound signature, tagging the sound signature with metadata, sending the sound signature with metadata to an acoustic database, associating sounds within the acoustic data with information, presenting the information on a map on the mobile device, and accessing at least audio from the acoustic database when a cursor is placed over a specific location on the map corresponding to captured ambient sounds from a geographic location and wherein acoustic information can be retrieved corresponding to the geographic location and for different selected periods of time.
    Type: Grant
    Filed: February 1, 2016
    Date of Patent: March 21, 2017
    Assignee: Personics Holdings, LLC
    Inventors: Steven Goldstein, Marc Boillot, Gary Hoshizaki, John P Keady
  • Patent number: 9020818
    Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations, systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 28, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
  • Patent number: 8856002
    Abstract: A universal pattern processing system receives input data and produces output patterns that are best associated with said data. The system uses input means receiving and processing input data, a universal pattern decoder means transforming models using the input data and associating output patterns with original models that are changed least during transforming, and output means outputting best associated patterns chosen by a pattern decoder means.
    Type: Grant
    Filed: April 11, 2008
    Date of Patent: October 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Dimitri Kanevsky, David Nahamoo, Tara N Sainath
  • Patent number: 8700403
    Abstract: A method of statistical modeling is provided which includes constructing a statistical model and incorporating Gaussian priors during feature selection and during parameter optimization for the construction of the statistical model.
    Type: Grant
    Filed: November 3, 2005
    Date of Patent: April 15, 2014
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Lin Zhao
  • Patent number: 8504362
    Abstract: A speech recognition system includes: a speed level classifier for measuring a moving speed of a moving object by using a noise signal at an initial time of speech recognition to determine a speed level of the moving object; a first speech enhancement unit for enhancing sound quality of an input speech signal of the speech recognition by using a Wiener filter, if the speed level of the moving object is equal to or lower than a specific level; and a second speech enhancement unit for enhancing the sound quality of the input speech signal by using a Gaussian mixture model, if the speed level of the moving object is higher than the specific level. The system further includes an end point detection unit for detecting start and end points, and an elimination unit for eliminating sudden noise components based on a sudden noise Gaussian mixture model.
    Type: Grant
    Filed: July 21, 2009
    Date of Patent: August 6, 2013
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Sung Joo Lee, Ho-Young Jung, Jeon Gue Park, Hoon Chung, Yunkeun Lee, Byung Ok Kang, Hyung-Bae Jeon, Jong Jin Kim, Ki-young Park, Euisok Chung, Ji Hyun Wang, Jeom Ja Kang
  • Patent number: 8145488
    Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech. The VPHMMs include Gaussian parameters that vary as a function of at least one environmental conditioning parameter. The relationship of each Gaussian parameter to the environmental conditioning parameter(s) is modeled using a piecewise fitting approach, such as by using spline functions. In a training phase, the recognition system can use clustering to identify classes of spline functions, each class grouping together spline functions which are similar to each other based on some distance measure. The recognition system can then store sets of spline parameters that represent respective classes of spline functions. An instance of a spline function that belongs to a class can make reference to an associated shared set of spline parameters. The Gaussian parameters can be represented in an efficient form that accommodates the use of sharing in the above-summarized manner.
    Type: Grant
    Filed: September 16, 2008
    Date of Patent: March 27, 2012
    Assignee: Microsoft Corporation
    Inventors: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero
  • Patent number: 8140334
    Abstract: An apparatus and method for recognizing voice.
    Type: Grant
    Filed: June 28, 2006
    Date of Patent: March 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-bae Jeong, Nam-hoon Kim, Jeong-su Kim, In-jeong Choi, Ick-sang Han
  • Patent number: 8140333
    Abstract: A probability density function compensation method used for a continuous hidden Markov model and a speech recognition method and apparatus, the probability density function compensation method including extracting feature vectors from speech signals, and using the extracted feature vectors, training a model having a plurality of probability density functions to increase probabilities of recognizing the speech signals; obtaining a global variance by averaging variances of the plurality of the probability density functions after completing the training; obtaining a compensation factor using the global variance; and applying the global variance to each of the probability density functions and compensating each of the probability density functions for the global variance using the compensation factor.
    Type: Grant
    Filed: February 28, 2005
    Date of Patent: March 20, 2012
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Icksang Han, Sangbae Jeong, Eugene Jon
  • Patent number: 8078465
    Abstract: Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify words, vocalizations, or other noises directed to or spoken by the key child, independent of content. The analysis can include the number of responses between the child and another, such as an adult, and the number of words spoken by the child and/or another, independent of content of the speech. One or more metrics can be determined based on the analysis and provided to assist in improving the language environment and/or tracking language development of the key child.
    Type: Grant
    Filed: January 23, 2008
    Date of Patent: December 13, 2011
    Assignee: LENA Foundation
    Inventors: Terrance Paul, Dongxin Xu, Umit Yapenel, Sharmistha Gray
  • Patent number: 8014617
    Abstract: A decoding apparatus includes a random number generating section and a decoding section. The random number generating section generates random numbers according to distribution of original data corresponding to respective quantization indexes. The decoding section generates decoded data on a basis of the random numbers generated by the random number generating section.
    Type: Grant
    Filed: March 12, 2010
    Date of Patent: September 6, 2011
    Assignee: Fuji Xerox Co., Ltd.
    Inventor: Shunichi Kimura
  • Patent number: 8005306
    Abstract: A decoding apparatus includes a classification section, a distribution-information generation section and an inverse-quantization-value generation section. The classification section classifies quantization indices contained in input code data into a plurality of groups. The distribution-information generation section generates distribution information of the quantization indices for each group, based on the quantization indices classified by the classification section. The inverse-quantization-value generation section generates inverse quantization values, which correspond to the respective quantization indices, based on the distribution information generated by the distribution-information generation section.
    Type: Grant
    Filed: August 9, 2006
    Date of Patent: August 23, 2011
    Assignee: Fuji Xerox Co., Ltd.
    Inventor: Shunichi Kimura
  • Patent number: 8005666
    Abstract: An automatic system for temporal alignment between a music audio signal and lyrics is provided. The automatic system can prevent accuracy for temporal alignment from being lowered due to the influence of non-vocal sections. Alignment means of the system is provided with a phone model for singing voice that estimates phonemes corresponding to temporal-alignment features or features available for temporal alignment. The alignment means receives temporal-alignment features outputted from temporal-alignment feature extraction means, information on the vocal and non-vocal sections outputted from vocal section estimation means, and a phoneme network, and performs an alignment operation on condition that no phoneme exists at least in non-vocal sections.
    Type: Grant
    Filed: August 7, 2007
    Date of Patent: August 23, 2011
    Assignee: National Institute of Advanced Industrial Science and Technology
    Inventors: Masataka Goto, Hiromasa Fujihara, Hiroshi Okuno
  • Patent number: 7941317
    Abstract: Systems and methods for low-latency real-time speech recognition/transcription. A discriminative feature extraction, such as a heteroscedastic discriminant analysis transform, in combination with a maximum likelihood linear transform is applied during front-end processing of a digital speech signal. The extracted features reduce the word error rate. A discriminative acoustic model is applied by generating state-level lattices using Maximum Mutual Information Estimation. Recognition networks of language models are replaced by their closure. Latency is reduced by eliminating segmentation such that a number of words/sentences can be recognized as a single utterance. Latency is further reduced by performing front-end normalization in a causal fashion.
    Type: Grant
    Filed: June 5, 2007
    Date of Patent: May 10, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Vincent Goffin, Michael Dennis Riley, Murat Saraclar
  • Patent number: 7930181
    Abstract: Systems and methods for low-latency real-time speech recognition/transcription. A discriminative feature extraction, such as a heteroscedastic discriminant analysis transform, in combination with a maximum likelihood linear transform is applied during front-end processing of a digital speech signal. The extracted features reduce the word error rate. A discriminative acoustic model is applied by generating state-level lattices using Maximum Mutual Information Estimation. Recognition networks of language models are replaced by their closure. Latency is reduced by eliminating segmentation such that a number of words/sentences can be recognized as a single utterance. Latency is further reduced by performing front-end normalization in a causal fashion.
    Type: Grant
    Filed: November 21, 2002
    Date of Patent: April 19, 2011
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Vincent Goffin, Michael Dennis Riley, Murat Saraclar
  • Patent number: 7805301
    Abstract: A reliable full covariance matrix estimation algorithm for a pattern unit's state output distribution in a pattern recognition system is discussed. An intermediate hierarchical tree structure is built to relate models for product units. Full covariance matrices of the pattern unit's state output distribution are estimated based on all the related nodes in the tree.
    Type: Grant
    Filed: July 1, 2005
    Date of Patent: September 28, 2010
    Assignee: Microsoft Corporation
    Inventors: Ye Tian, Frank Kao-Ping Soong, Jian-Lai Zhou
  • Patent number: 7778463
    Abstract: A pattern recognition system, pattern recognition method, and pattern recognition program capable of increasing the accuracy in computing the false acceptance probability and capable of ensuring a stable security strength are provided. Pattern recognition systems 10 and 10a comprise a first probability computation unit 32, and a second probability computation unit 33 coupled to the first probability computation unit 32. The first probability computation unit 32 computes a first probability PFCR based on the number n of corresponding characteristic points cs1 to csn and cf1 to cfn indicating points corresponding between characteristic points s1 to sns in a first pattern and characteristic points f1 to fnf in a second pattern. The first probability PFCR indicates the probability of existence of a third pattern that has a greater number of corresponding characteristic points to the first pattern than the number n of the corresponding characteristic points.
    Type: Grant
    Filed: July 14, 2006
    Date of Patent: August 17, 2010
    Assignee: NEC Corporation
    Inventor: Lei Huang
  • Publication number: 20100191532
    Abstract: An object comparison method comprises: generating a first ordered vector sequence representation of a first object; generating a second ordered vector sequence representation of a second object; representing the first object by a first ordered sequence of model parameters generated by modeling the first ordered vector sequence representation using a semi-continuous hidden Markov model employing a universal basis; representing the second object by a second ordered sequence of model parameters generated by modeling the second ordered vector sequence representation using a semi-continuous hidden Markov model employing the universal basis; and comparing the first and second ordered sequences of model parameters to generate a quantitative comparison measure.
    Type: Application
    Filed: January 28, 2009
    Publication date: July 29, 2010
    Applicant: Xerox Corporation
    Inventors: Jose A. Rodriguez Serrano, Florent C. Perronnin
  • Patent number: 7571097
    Abstract: A method for compressing multiple dimensional Gaussian distributions with diagonal covariance matrices includes clustering a plurality of Gaussian distributions in a multiplicity of clusters for each dimension. Each cluster can be represented by a centroid having a mean and a variance. A total decrease in likelihood of a training dataset is minimized for the representation of the plurality of Gaussian distributions.
    Type: Grant
    Filed: March 13, 2003
    Date of Patent: August 4, 2009
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, Michael D. Plumpe
  • Patent number: 7505950
    Abstract: Systems and methods are provided for performing soft alignment in Gaussian mixture model (GMM) based and other vector transformations. Soft alignment may assign alignment probabilities to source and target feature vector pairs. The vector pairs and associated probabilities may then be used to calculate a conversion function, for example, by computing GMM training parameters from the joint vectors and alignment probabilities to create a voice conversion function for converting speech sounds from a source speaker to a target speaker.
    Type: Grant
    Filed: April 26, 2006
    Date of Patent: March 17, 2009
    Assignee: Nokia Corporation
    Inventors: Jilei Tian, Jani Nurminen, Victor Popa
  • Patent number: 7454336
    Abstract: A system and method that facilitate modeling unobserved speech dynamics based upon a hidden dynamic speech model in the form of a segmental switching state space model that employs model parameters including those describing the unobserved speech dynamics and those describing the relationship between the unobserved speech dynamic vector and the observed acoustic feature vector is provided. The model parameters are modified based, at least in part, upon a variational learning technique. In accordance with an aspect of the present invention, novel and powerful variational expectation maximization (EM) algorithm(s) for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production, are provided. For example, modification of model parameters can be based upon an approximate mixture of Gaussian (MOG) posterior and/or based upon an approximate hidden Markov model (HMM) posterior using a variational technique.
    Type: Grant
    Filed: June 20, 2003
    Date of Patent: November 18, 2008
    Assignee: Microsoft Corporation
    Inventors: Hagai Attias, Li Deng, Leo J. Lee
  • Patent number: 7454341
    Abstract: According to one aspect of the invention, a method is provided in which a mean vector set and a variance vector set of a set of N Gaussians are divided into multiple mean sub-vector sets and variance sub-vector sets, respectively. Each mean sub-vector set contains a subset of the dimensions of the corresponding mean vector set and each variance sub-vector set contains a subset of the dimensions of the corresponding variance vector set. Each resultant sub-vector set is clustered to build a codebook for the respective sub-vector set using a modified K-means clustering process which dynamically merges and splits clusters based upon the size and average distortion of each cluster during each iteration in the modified K-means clustering process.
    Type: Grant
    Filed: September 30, 2000
    Date of Patent: November 18, 2008
    Assignee: Intel Corporation
    Inventors: Jielin Pan, Baosheng Yuan
  • Patent number: 7263485
    Abstract: A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio segment comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) comprising a plurality of the audio samples (x(n)). The homogeneous audio segment is next divided (206) into a plurality of audio clips (711-714), with each audio clip being associated with a plurality of the frames (701-704). The method (200) then extracts (208) at least one frame feature for each clip (711-714). A clip feature vector (f) is next extracted from frame features of frames associated with the audio clip (711-714). Finally the segment is classified based on a continuous function defining the distribution of the clip feature vectors (f).
    Type: Grant
    Filed: May 28, 2003
    Date of Patent: August 28, 2007
    Assignee: Canon Kabushiki Kaisha
    Inventor: Timothy John Wark