Continuous Density, E.g., Gaussian Distribution, Laplace (EPO) Patents (Class 704/256.7)
-
Patent number: 10390130
Abstract: A sound processing apparatus includes an acquisition unit configured to acquire sound signals collected by a microphone array, a sound source localization unit configured to determine a sound source direction on the basis of the sound signals acquired by the acquisition unit, and a sound source identification unit configured to identify a type of sound source on the basis of a sound model indicating a dependence relationship between sound sources, in which the sound model is represented by a probabilistic model expression including sound source localization as an element.
Type: Grant
Filed: June 12, 2017
Date of Patent: August 20, 2019
Assignee: HONDA MOTOR CO., LTD.
Inventors: Kazuhiro Nakadai, Ryosuke Kojima
-
Patent number: 9602938
Abstract: At least one exemplary embodiment is directed to an electronic device configured to collect acoustic information or a method including the steps of collecting acoustic data by a microphone communicatively coupled to a mobile device, analyzing the acoustic data for a sound signature, tagging the sound signature with metadata, sending the sound signature with metadata to an acoustic database, associating sounds within the acoustic data with information, presenting the information on a map on the mobile device, and accessing at least audio from the acoustic database when a cursor is placed over a specific location on the map corresponding to captured ambient sounds from a geographic location and wherein acoustic information can be retrieved corresponding to the geographic location and for different selected periods of time.
Type: Grant
Filed: February 1, 2016
Date of Patent: March 21, 2017
Assignee: Personics Holdings, LLC
Inventors: Steven Goldstein, Marc Boillot, Gary Hoshizaki, John P Keady
-
Patent number: 9020818
Abstract: Implementations of systems, method and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
Type: Grant
Filed: August 20, 2012
Date of Patent: April 28, 2015
Assignee: Malaspina Labs (Barbados) Inc.
Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
-
Patent number: 8856002
Abstract: A universal pattern processing system receives input data and produces output patterns that are best associated with said data. The system uses input means receiving and processing input data, a universal pattern decoder means transforming models using the input data and associating output patterns with original models that are changed least during transforming, and output means outputting best associated patterns chosen by a pattern decoder means.
Type: Grant
Filed: April 11, 2008
Date of Patent: October 7, 2014
Assignee: International Business Machines Corporation
Inventors: Dimitri Kanevsky, David Nahamoo, Tara N Sainath
-
Patent number: 8700403
Abstract: A method of statistical modeling is provided which includes constructing a statistical model and incorporating Gaussian priors during feature selection and during parameter optimization for the construction of the statistical model.
Type: Grant
Filed: November 3, 2005
Date of Patent: April 15, 2014
Assignee: Robert Bosch GmbH
Inventors: Fuliang Weng, Lin Zhao
-
Patent number: 8504362
Abstract: A speech recognition system includes: a speed level classifier for measuring a moving speed of a moving object by using a noise signal at an initial time of speech recognition to determine a speed level of the moving object; a first speech enhancement unit for enhancing sound quality of an input speech signal of the speech recognition by using a Wiener filter, if the speed level of the moving object is equal to or lower than a specific level; and a second speech enhancement unit enhancing the sound quality of the input speech signal by using a Gaussian mixture model, if the speed level of the moving object is higher than the specific level. The system further includes an end point detection unit for detecting start and end points, an elimination unit for eliminating sudden noise components based on a sudden noise Gaussian mixture model.
Type: Grant
Filed: July 21, 2009
Date of Patent: August 6, 2013
Assignee: Electronics and Telecommunications Research Institute
Inventors: Sung Joo Lee, Ho-Young Jung, Jeon Gue Park, Hoon Chung, Yunkeun Lee, Byung Ok Kang, Hyung-Bae Jeon, Jong Jin Kim, Ki-young Park, Euisok Chung, Ji Hyun Wang, Jeom Ja Kang
-
Patent number: 8145488
Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech. The VPHMMs include Gaussian parameters that vary as a function of at least one environmental conditioning parameter. The relationship of each Gaussian parameter to the environmental conditioning parameter(s) is modeled using a piecewise fitting approach, such as by using spline functions. In a training phase, the recognition system can use clustering to identify classes of spline functions, each class grouping together spline functions which are similar to each other based on some distance measure. The recognition system can then store sets of spline parameters that represent respective classes of spline functions. An instance of a spline function that belongs to a class can make reference to an associated shared set of spline parameters. The Gaussian parameters can be represented in an efficient form that accommodates the use of sharing in the above-summarized manner.
Type: Grant
Filed: September 16, 2008
Date of Patent: March 27, 2012
Assignee: Microsoft Corporation
Inventors: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero
-
Patent number: 8140334
Abstract: An apparatus and method for recognizing voice.
Type: Grant
Filed: June 28, 2006
Date of Patent: March 20, 2012
Assignee: Samsung Electronics Co., Ltd.
Inventors: Sang-bae Jeong, Nam-hoon Kim, Jeong-su Kim, In-jeong Choi, Ick-sang Han
-
Patent number: 8140333
Abstract: A probability density function compensation method used for a continuous hidden Markov model and a speech recognition method and apparatus, the probability density function compensation method including extracting feature vectors from speech signals, and using the extracted feature vectors, training a model having a plurality of probability density functions to increase probabilities of recognizing the speech signals; obtaining a global variance by averaging variances of the plurality of the probability density functions after completing the training; obtaining a compensation factor using the global variance; and applying the global variance to each of the probability density functions and compensating each of the probability density functions for the global variance using the compensation factor.
Type: Grant
Filed: February 28, 2005
Date of Patent: March 20, 2012
Assignee: Samsung Electronics Co., Ltd.
Inventors: Icksang Han, Sangbae Jeong, Eugene Jon
-
Patent number: 8078465
Abstract: Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify words, vocalizations, or other noises directed to or spoken by the key child, independent of content. The analysis can include the number of responses between the child and another, such as an adult and the number of words spoken by the child and/or another, independent of content of the speech. One or more metrics can be determined based on the analysis and provided to assist in improving the language environment and/or tracking language development of the key child.
Type: Grant
Filed: January 23, 2008
Date of Patent: December 13, 2011
Assignee: LENA Foundation
Inventors: Terrance Paul, Dongxin Xu, Umit Yapenel, Sharmistha Gray
-
Patent number: 8014617
Abstract: A decoding apparatus includes a random number generating section and a decoding section. The random number generating section generates random numbers according to distribution of original data corresponding to respective quantization indexes. The decoding section generates decoded data on a basis of the random numbers generated by the random number generating section.
Type: Grant
Filed: March 12, 2010
Date of Patent: September 6, 2011
Assignee: Fuji Xerox Co., Ltd.
Inventor: Shunichi Kimura
-
Patent number: 8005306
Abstract: A decoding apparatus includes a classification section, a distribution-information generation section and an inverse-quantization-value generation section. The classification section classifies quantization indices contained in input code data into a plurality of groups. The distribution-information generation section generates distribution information of the quantization indices for each group, based on the quantization indices classified by the classification section. The inverse-quantization-value generation section generates inverse quantization values, which correspond to the respective quantization indices, based on the distribution information generated by the distribution-information generation section.
Type: Grant
Filed: August 9, 2006
Date of Patent: August 23, 2011
Assignee: Fuji Xerox Co., Ltd.
Inventor: Shunichi Kimura
-
Patent number: 8005666
Abstract: An automatic system for temporal alignment between a music audio signal and lyrics is provided. The automatic system can prevent accuracy for temporal alignment from being lowered due to the influence of non-vocal sections. Alignment means of the system is provided with a phone model for singing voice that estimates phonemes corresponding to temporal-alignment features or features available for temporal alignment. The alignment means receives temporal-alignment features outputted from temporal-alignment feature extraction means, information on the vocal and non-vocal sections outputted from vocal section estimation means, and a phoneme network, and performs an alignment operation on condition that no phoneme exists at least in non-vocal sections.
Type: Grant
Filed: August 7, 2007
Date of Patent: August 23, 2011
Assignee: National Institute of Advanced Industrial Science and Technology
Inventors: Masataka Goto, Hiromasa Fujihara, Hiroshi Okuno
-
Patent number: 7941317
Abstract: Systems and methods for low-latency real-time speech recognition/transcription. A discriminative feature extraction, such as a heteroscedastic discriminant analysis transform, in combination with a maximum likelihood linear transform is applied during front-end processing of a digital speech signal. The extracted features reduce the word error rate. A discriminative acoustic model is applied by generating state-level lattices using Maximum Mutual Information Estimation. Recognition networks of language models are replaced by their closure. Latency is reduced by eliminating segmentation such that a number of words/sentences can be recognized as a single utterance. Latency is further reduced by performing front-end normalization in a causal fashion.
Type: Grant
Filed: June 5, 2007
Date of Patent: May 10, 2011
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Vincent Goffin, Michael Dennis Riley, Murat Saraclar
-
Patent number: 7930181
Abstract: Systems and methods for low-latency real-time speech recognition/transcription. A discriminative feature extraction, such as a heteroscedastic discriminant analysis transform, in combination with a maximum likelihood linear transform is applied during front-end processing of a digital speech signal. The extracted features reduce the word error rate. A discriminative acoustic model is applied by generating state-level lattices using Maximum Mutual Information Estimation. Recognition networks of language models are replaced by their closure. Latency is reduced by eliminating segmentation such that a number of words/sentences can be recognized as a single utterance. Latency is further reduced by performing front-end normalization in a causal fashion.
Type: Grant
Filed: November 21, 2002
Date of Patent: April 19, 2011
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Vincent Goffin, Michael Dennis Riley, Murat Saraclar
-
Patent number: 7805301
Abstract: A reliable full covariance matrix estimation algorithm for pattern unit's state output distribution in pattern recognition system is discussed. An intermediate hierarchical tree structure is built to relate models for product units. Full covariance matrices of pattern unit's state output distribution are estimated based on all the related nodes in the tree.
Type: Grant
Filed: July 1, 2005
Date of Patent: September 28, 2010
Assignee: Microsoft Corporation
Inventors: Ye Tian, Frank Kao-Ping Soong, Jian-Lai Zhou
-
Patent number: 7778463
Abstract: A pattern recognition system, pattern recognition method, and pattern recognition program capable of increasing the accuracy in computing the false acceptance probability and capable of ensuring a stable security strength are provided. Pattern recognition systems 10 and 10a comprise a first probability computation unit 32, and a second probability computation unit 33 coupled to the first probability computation unit 32. The first probability computation unit 32 computes a first probability PFCR based on the number n of corresponding characteristic points cs1 to csn and cf1 to cfn indicating points corresponding between characteristic points s1 to sns in a first pattern and characteristic points f1 to fnf in a second pattern. The first probability PFCR indicates the probability of existence of a third pattern that has a greater number of corresponding characteristic points to the first pattern than the number n of the corresponding characteristic points.
Type: Grant
Filed: July 14, 2006
Date of Patent: August 17, 2010
Assignee: NEC Corporation
Inventor: Lei Huang
-
Publication number: 20100191532
Abstract: An object comparison method comprises: generating a first ordered vector sequence representation of a first object; generating a second ordered vector sequence representation of a second object; representing the first object by a first ordered sequence of model parameters generated by modeling the first ordered vector sequence representation using a semi-continuous hidden Markov model employing a universal basis; representing the second object by a second ordered sequence of model parameters generated by modeling the second ordered vector sequence representation using a semi-continuous hidden Markov model employing the universal basis; and comparing the first and second ordered sequences of model parameters to generate a quantitative comparison measure.
Type: Application
Filed: January 28, 2009
Publication date: July 29, 2010
Applicant: Xerox Corporation
Inventors: Jose A. Rodriguez Serrano, Florent C. Perronnin
-
Patent number: 7571097
Abstract: A method for compressing multiple dimensional gaussian distributions with diagonal covariance matrixes includes clustering a plurality of gaussian distributions in a multiplicity of clusters for each dimension. Each cluster can be represented by a centroid having a mean and a variance. A total decrease in likelihood of a training dataset is minimized for the representation of the plurality of gaussian distributions.
Type: Grant
Filed: March 13, 2003
Date of Patent: August 4, 2009
Assignee: Microsoft Corporation
Inventors: Alejandro Acero, Michael D. Plumpe
-
Patent number: 7505950
Abstract: Systems and methods are provided for performing soft alignment in Gaussian mixture model (GMM) based and other vector transformations. Soft alignment may assign alignment probabilities to source and target feature vector pairs. The vector pairs and associated probabilities may then be used calculate a conversion function, for example, by computing GMM training parameters from the joint vectors and alignment probabilities to create a voice conversion function for converting speech sounds from a source speaker to a target speaker.
Type: Grant
Filed: April 26, 2006
Date of Patent: March 17, 2009
Assignee: Nokia Corporation
Inventors: Jilei Tian, Jani Nurminen, Victor Popa
-
Patent number: 7454336
Abstract: A system and method that facilitate modeling unobserved speech dynamics based upon a hidden dynamic speech model in the form of segmental switching state space model that employs model parameters including those describing the unobserved speech dynamics and those describing the relationship between the unobserved speech dynamic vector and the observed acoustic feature vector is provided. The model parameters are modified based, at least in part, upon, a variational learning technique. In accordance with an aspect of the present invention, novel and powerful variational expectation maximization (EM) algorithm(s) for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production, are provided. For example, modification of model parameters can be based upon an approximate mixture of Gaussian (MOG) posterior and/or based upon an approximate hidden Markov model (HMM) posterior using a variational technique.
Type: Grant
Filed: June 20, 2003
Date of Patent: November 18, 2008
Assignee: Microsoft Corporation
Inventors: Hagai Attias, Li Deng, Leo J. Lee
-
Patent number: 7454341
Abstract: According to one aspect of the invention, a method is provided in which a mean vector set and a variance vector set of a set of N Gaussians are divided into multiple mean sub-vector sets and variance sub-vector sets, respectively. Each mean sub-vector set contains a subset of the dimensions of the corresponding mean vector set and each variance sub-vector set contains a subset of the dimensions of the corresponding variance vector set. Each resultant sub-vector set is clustered to build a codebook for the respective sub-vector set using a modified K-means clustering process which dynamically merges and splits clusters based upon the size and average distortion of each cluster during each iteration in the modified K-means clustering process.
Type: Grant
Filed: September 30, 2000
Date of Patent: November 18, 2008
Assignee: Intel Corporation
Inventors: Jielin Pan, Baosheng Yuan
-
Patent number: 7263485
Abstract: A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) comprising a plurality of the audio samples (x(n)). The homogeneous audio segment is next divided (206) into a plurality of audio clips (711-714), with each audio clip being associated with a plurality of the frames (701-704). The method (200) then extracts (208) at least one frame feature for each clip (711-714). A clip feature vector (f) is next extracted from frame features of frames associated with the audio clip (711-714). Finally the segment is classified based on a continuous function during the distribution of the clip feature vectors (f).
Type: Grant
Filed: May 28, 2003
Date of Patent: August 28, 2007
Assignee: Canon Kabushiki Kaisha
Inventor: Timothy John Wark
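
Several abstracts in this class (for example 8504362, 8145488 and 7571097) refer to Gaussian mixture models or to Gaussians with diagonal covariance. The sketch below is a generic illustration of how such a continuous density is typically evaluated; it is not taken from, and does not represent the claimed method of, any patent listed here. The function names `diag_gaussian_logpdf` and `gmm_loglik` and the list-based argument layout are assumptions made for this example.

```python
import math


def diag_gaussian_logpdf(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance.

    x, mean and var are equal-length sequences; var holds per-dimension
    variances (hypothetical layout chosen for this illustration).
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of x under a diagonal-covariance Gaussian mixture.

    Uses the log-sum-exp trick so that very small component densities
    do not underflow to zero.
    """
    terms = [
        math.log(w) + diag_gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))
```

Called as `gmm_loglik(frame, weights, means, variances)`, this returns the log-likelihood of one feature vector under the mixture. Diagonal covariances keep the cost linear in the feature dimension, which is one reason they are common in speech acoustic models.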