Continuous Density, E.g., Gaussian Distribution, Laplace (EPO) Patents (Class 704/256.7)

Sound processing apparatus and sound processing method

Patent number: 10390130

Abstract: A sound processing apparatus includes an acquisition unit configured to acquire sound signals collected by a microphone array, a sound source localization unit configured to determine a sound source direction on the basis of the sound signals acquired by the acquisition unit, and a sound source identification unit configured to identify a type of sound source on the basis of a sound model indicating a dependence relationship between sound sources, in which the sound model is represented by a probabilistic model expression including sound source localization as an element.

Type: Grant

Filed: June 12, 2017

Date of Patent: August 20, 2019

Assignee: HONDA MOTOR CO., LTD.

Inventors: Kazuhiro Nakadai, Ryosuke Kojima

Sound library and method

Patent number: 9602938

Abstract: At least one exemplary embodiment is directed to an electronic device configured to collect acoustic information or a method including the steps of collecting acoustic data by a microphone communicatively coupled to a mobile device, analyzing the acoustic data for a sound signature, tagging the sound signature with metadata, sending the sound signature with metadata to an acoustic database, associating sounds within the acoustic data with information, presenting the information on a map on the mobile device, and accessing at least audio from the acoustic database when a cursor is placed over a specific location on the map corresponding to captured ambient sounds from a geographic location and wherein acoustic information can be retrieved corresponding to the geographic location and for different selected periods of time.

Type: Grant

Filed: February 1, 2016

Date of Patent: March 21, 2017

Assignee: Personics Holdings, LLC

Inventors: Steven Goldstein, Marc Boillot, Gary Hoshizaki, John P Keady

Format based speech reconstruction from noisy signals

Patent number: 9020818

Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.

Type: Grant

Filed: August 20, 2012

Date of Patent: April 28, 2015

Assignee: Malaspina Labs (Barbados) Inc.

Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson

Distance metrics for universal pattern processing tasks

Patent number: 8856002

Abstract: A universal pattern processing system receives input data and produces output patterns that are best associated with said data. The system uses input means receiving and processing input data, a universal pattern decoder means transforming models using the input data and associating output patterns with original models that are changed least during transforming, and output means outputting best associated patterns chosen by a pattern decoder means.

Type: Grant

Filed: April 11, 2008

Date of Patent: October 7, 2014

Assignee: International Business Machines Corporation

Inventors: Dimitri Kanevsky, David Nahamoo, Tara N Sainath

Unified treatment of data-sparseness and data-overfitting in maximum entropy modeling

Patent number: 8700403

Abstract: A method of statistical modeling is provided which includes constructing a statistical model and incorporating Gaussian priors during feature selection and during parameter optimization for the construction of the statistical model.

Type: Grant

Filed: November 3, 2005

Date of Patent: April 15, 2014

Assignee: Robert Bosch GmbH

Inventors: Fuliang Weng, Lin Zhao

Noise reduction for speech recognition in a moving vehicle

Patent number: 8504362

Abstract: A speech recognition system includes: a speed level classifier for measuring a moving speed of a moving object by using a noise signal at an initial time of speech recognition to determine a speed level of the moving object; a first speech enhancement unit for enhancing sound quality of an input speech signal of the speech recognition by using a Wiener filter, if the speed level of the moving object is equal to or lower than a specific level; and a second speech enhancement unit enhancing the sound quality of the input speech signal by using a Gaussian mixture model, if the speed level of the moving object is higher than the specific level. The system further includes an end point detection unit for detecting start and end points, an elimination unit for eliminating sudden noise components based on a sudden noise Gaussian mixture model.

Type: Grant

Filed: July 21, 2009

Date of Patent: August 6, 2013

Assignee: Electronics and Telecommunications Research Institute

Inventors: Sung Joo Lee, Ho-Young Jung, Jeon Gue Park, Hoon Chung, Yunkeun Lee, Byung Ok Kang, Hyung-Bae Jeon, Jong Jin Kim, Ki-young Park, Euisok Chung, Ji Hyun Wang, Jeom Ja Kang

Parameter clustering and sharing for variable-parameter hidden Markov models

Patent number: 8145488

Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech. The VPHMMs include Gaussian parameters that vary as a function of at least one environmental conditioning parameter. The relationship of each Gaussian parameter to the environmental conditioning parameter(s) is modeled using a piecewise fitting approach, such as by using spline functions. In a training phase, the recognition system can use clustering to identify classes of spline functions, each class grouping together spline functions which are similar to each other based on some distance measure. The recognition system can then store sets of spline parameters that represent respective classes of spline functions. An instance of a spline function that belongs to a class can make reference to an associated shared set of spline parameters. The Gaussian parameters can be represented in an efficient form that accommodates the use of sharing in the above-summarized manner.

Type: Grant

Filed: September 16, 2008

Date of Patent: March 27, 2012

Assignee: Microsoft Corporation

Inventors: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero

Apparatus and method for recognizing voice

Patent number: 8140334

Abstract: An apparatus and method for recognizing voice.

Type: Grant

Filed: June 28, 2006

Date of Patent: March 20, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Sang-bae Jeong, Nam-hoon Kim, Jeong-su Kim, In-jeong Choi, Ick-sang Han

Probability density function compensation method for hidden Markov model and speech recognition method and apparatus using the same

Patent number: 8140333

Abstract: A probability density function compensation method used for a continuous hidden Markov model and a speech recognition method and apparatus, the probability density function compensation method including extracting feature vectors from speech signals, and using the extracted feature vectors, training a model having a plurality of probability density functions to increase probabilities of recognizing the speech signals; obtaining a global variance by averaging variances of the plurality of the probability density functions after completing the training; obtaining a compensation factor using the global variance; and applying the global variance to each of the probability density functions and compensating each of the probability density functions for the global variance using the compensation factor.

Type: Grant

Filed: February 28, 2005

Date of Patent: March 20, 2012

Assignee: Samsung Electronics Co., Ltd.

Inventors: Icksang Han, Sangbae Jeong, Eugene Jon
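
The global-variance step in the abstract above can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussians stored as plain lists of per-dimension variances. The abstract does not say how the compensation factor is derived, so the sketch takes it as a caller-supplied number and applies it as a simple scale; both choices are assumptions, and the function names are hypothetical.

```python
def global_variance(variances):
    """Average the per-dimension variances over all Gaussians."""
    count = len(variances)
    dims = len(variances[0])
    return [sum(v[d] for v in variances) / count for d in range(dims)]


def apply_global_variance(variances, factor):
    """Give every Gaussian the same (scaled) global variance.

    `factor` stands in for the compensation factor; treating it as a
    plain multiplier is an assumption, not the patented formula.
    """
    shared = [factor * g for g in global_variance(variances)]
    return [list(shared) for _ in variances]


trained = [[1.0, 4.0], [3.0, 2.0]]          # two Gaussians, two dimensions
print(global_variance(trained))             # per-dimension average
print(apply_global_variance(trained, 0.5))  # shared, scaled variances
```

Sharing one variance vector across all densities reduces the number of free parameters; the factor then rescales the shared value.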
System and method for detection and analysis of speech

Patent number: 8078465

Abstract: Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify words, vocalizations, or other noises directed to or spoken by the key child, independent of content. The analysis can include the number of responses between the child and another, such as an adult and the number of words spoken by the child and/or another, independent of content of the speech. One or more metrics can be determined based on the analysis and provided to assist in improving the language environment and/or tracking language development of the key child.

Type: Grant

Filed: January 23, 2008

Date of Patent: December 13, 2011

Assignee: LENA Foundation

Inventors: Terrance Paul, Dongxin Xu, Umit Yapenel, Sharmistha Gray

Decoding apparatus, dequantizing method, distribution determining method, and program thereof

Patent number: 8014617

Abstract: A decoding apparatus includes a random number generating section and a decoding section. The random number generating section generates random numbers according to distribution of original data corresponding to respective quantization indexes. The decoding section generates decoded data on a basis of the random numbers generated by the random number generating section.

Type: Grant

Filed: March 12, 2010

Date of Patent: September 6, 2011

Assignee: Fuji Xerox Co., Ltd.

Inventor: Shunichi Kimura
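
The idea in the abstract above, drawing each decoded value at random from the distribution of the original data for its quantization index, can be sketched as follows. The sketch assumes a uniform quantizer with step size `step` and a zero-mean Laplace distribution with scale `scale` for the original data; neither assumption comes from the abstract, and all function names are hypothetical.

```python
import math
import random


def laplace_cdf(x, scale):
    """CDF of a zero-mean Laplace distribution."""
    if x < 0:
        return 0.5 * math.exp(x / scale)
    return 1.0 - 0.5 * math.exp(-x / scale)


def laplace_inv_cdf(u, scale):
    """Inverse CDF of a zero-mean Laplace distribution, for 0 < u < 1."""
    if u < 0.5:
        return scale * math.log(2.0 * u)
    return -scale * math.log(2.0 * (1.0 - u))


def dequantize(index, step, scale, rng):
    """Sample a value inside the quantization interval of `index`."""
    lo = (index - 0.5) * step
    hi = (index + 0.5) * step
    f_lo, f_hi = laplace_cdf(lo, scale), laplace_cdf(hi, scale)
    u = rng.uniform(f_lo, f_hi)
    # Fall back to the interval midpoint if the CDF has underflowed.
    if f_hi <= f_lo or not (0.0 < u < 1.0):
        return index * step
    x = laplace_inv_cdf(u, scale)
    return min(max(x, lo), hi)  # guard against rounding at the edges


rng = random.Random(0)
print([dequantize(q, 1.0, 1.5, rng) for q in (-3, 0, 2)])
```

Sampling the uniform variate between the CDF values at the interval edges and mapping it through the inverse CDF yields a draw from the distribution truncated to that interval.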
Decoding apparatus, inverse quantization method, and computer readable medium

Patent number: 8005306

Abstract: A decoding apparatus includes a classification section, a distribution-information generation section and an inverse-quantization-value generation section. The classification section classifies quantization indices contained in input code data into a plurality of groups. The distribution-information generation section generates distribution information of the quantization indices for each group, based on the quantization indices classified by the classification section. The inverse-quantization-value generation section generates inverse quantization values, which correspond to the respective quantization indices, based on the distribution information generated by the distribution-information generation section.

Type: Grant

Filed: August 9, 2006

Date of Patent: August 23, 2011

Assignee: Fuji Xerox Co., Ltd.

Inventor: Shunichi Kimura

Automatic system for temporal alignment of music audio signal with lyrics

Patent number: 8005666

Abstract: An automatic system for temporal alignment between a music audio signal and lyrics is provided. The automatic system can prevent accuracy for temporal alignment from being lowered due to the influence of non-vocal sections. Alignment means of the system is provided with a phone model for singing voice that estimates phonemes corresponding to temporal-alignment features or features available for temporal alignment. The alignment means receives temporal-alignment features outputted from temporal-alignment feature extraction means, information on the vocal and non-vocal sections outputted from vocal section estimation means, and a phoneme network, and performs an alignment operation on condition that no phoneme exists at least in non-vocal sections.

Type: Grant

Filed: August 7, 2007

Date of Patent: August 23, 2011

Assignee: National Institute of Advanced Industrial Science and Technology

Inventors: Masataka Goto, Hiromasa Fujihara, Hiroshi Okuno

Low latency real-time speech transcription

Patent number: 7941317

Abstract: Systems and methods for low-latency real-time speech recognition/transcription. A discriminative feature extraction, such as a heteroscedastic discriminant analysis transform, in combination with a maximum likelihood linear transform is applied during front-end processing of a digital speech signal. The extracted features reduce the word error rate. A discriminative acoustic model is applied by generating state-level lattices using Maximum Mutual Information Estimation. Recognition networks of language models are replaced by their closure. Latency is reduced by eliminating segmentation such that a number of words/sentences can be recognized as a single utterance. Latency is further reduced by performing front-end normalization in a causal fashion.

Type: Grant

Filed: June 5, 2007

Date of Patent: May 10, 2011

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Vincent Goffin, Michael Dennis Riley, Murat Saraclar

Low latency real-time speech transcription

Patent number: 7930181

Abstract: Systems and methods for low-latency real-time speech recognition/transcription. A discriminative feature extraction, such as a heteroscedastic discriminant analysis transform, in combination with a maximum likelihood linear transform is applied during front-end processing of a digital speech signal. The extracted features reduce the word error rate. A discriminative acoustic model is applied by generating state-level lattices using Maximum Mutual Information Estimation. Recognition networks of language models are replaced by their closure. Latency is reduced by eliminating segmentation such that a number of words/sentences can be recognized as a single utterance. Latency is further reduced by performing front-end normalization in a causal fashion.

Type: Grant

Filed: November 21, 2002

Date of Patent: April 19, 2011

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Vincent Goffin, Michael Dennis Riley, Murat Saraclar

Covariance estimation for pattern recognition

Patent number: 7805301

Abstract: A reliable full covariance matrix estimation algorithm for pattern unit's state output distribution in pattern recognition system is discussed. An intermediate hierarchical tree structure is built to relate models for product units. Full covariance matrices of pattern unit's state output distribution are estimated based on all the related nodes in the tree.

Type: Grant

Filed: July 1, 2005

Date of Patent: September 28, 2010

Assignee: Microsoft Corporation

Inventors: Ye Tian, Frank Kao-Ping Soong, Jian-Lai Zhou

Pattern recognition system, pattern recognition method, and pattern recognition program

Patent number: 7778463

Abstract: A pattern recognition system, pattern recognition method, and pattern recognition program capable of increasing the accuracy in computing the false acceptance probability and capable of ensuring a stable security strength are provided. Pattern recognition systems 10 and 10a comprise a first probability computation unit 32, and a second probability computation unit 33 coupled to the first probability computation unit 32. The first probability computation unit 32 computes a first probability PFCR based on the number n of corresponding characteristic points cs1 to csn and cf1 to cfn indicating points corresponding between characteristic points s1 to sns in a first pattern and characteristic points f1 to fnf in a second pattern. The first probability PFCR indicates the probability of existence of a third pattern that has a greater number of corresponding characteristic points to the first pattern than the number n of the corresponding characteristic points.

Type: Grant

Filed: July 14, 2006

Date of Patent: August 17, 2010

Assignee: NEC Corporation

Inventor: Lei Huang

Model-based comparative measure for vector sequences and word spotting using same

Publication number: 20100191532

Abstract: An object comparison method comprises: generating a first ordered vector sequence representation of a first object; generating a second ordered vector sequence representation of a second object; representing the first object by a first ordered sequence of model parameters generated by modeling the first ordered vector sequence representation using a semi-continuous hidden Markov model employing a universal basis; representing the second object by a second ordered sequence of model parameters generated by modeling the second ordered vector sequence representation using a semi-continuous hidden Markov model employing the universal basis; and comparing the first and second ordered sequences of model parameters to generate a quantitative comparison measure.

Type: Application

Filed: January 28, 2009

Publication date: July 29, 2010

Applicant: Xerox Corporation

Inventors: Jose A. Rodriguez Serrano, Florent C. Perronnin

Method for training of subspace coded Gaussian models

Patent number: 7571097

Abstract: A method for compressing multiple dimensional gaussian distributions with diagonal covariance matrixes includes clustering a plurality of gaussian distributions in a multiplicity of clusters for each dimension. Each cluster can be represented by a centroid having a mean and a variance. A total decrease in likelihood of a training dataset is minimized for the representation of the plurality of gaussian distributions.

Type: Grant

Filed: March 13, 2003

Date of Patent: August 4, 2009

Assignee: Microsoft Corporation

Inventors: Alejandro Acero, Michael D. Plumpe

Soft alignment based on a probability of time alignment

Patent number: 7505950

Abstract: Systems and methods are provided for performing soft alignment in Gaussian mixture model (GMM) based and other vector transformations. Soft alignment may assign alignment probabilities to source and target feature vector pairs. The vector pairs and associated probabilities may then be used to calculate a conversion function, for example, by computing GMM training parameters from the joint vectors and alignment probabilities to create a voice conversion function for converting speech sounds from a source speaker to a target speaker.

Type: Grant

Filed: April 26, 2006

Date of Patent: March 17, 2009

Assignee: Nokia Corporation

Inventors: Jilei Tian, Jani Nurminen, Victor Popa

Variational inference and learning for segmental switching state space models of hidden speech dynamics

Patent number: 7454336

Abstract: A system and method that facilitate modeling unobserved speech dynamics based upon a hidden dynamic speech model in the form of segmental switching state space model that employs model parameters including those describing the unobserved speech dynamics and those describing the relationship between the unobserved speech dynamic vector and the observed acoustic feature vector is provided. The model parameters are modified based, at least in part, upon, a variational learning technique. In accordance with an aspect of the present invention, novel and powerful variational expectation maximization (EM) algorithm(s) for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production, are provided. For example, modification of model parameters can be based upon an approximate mixture of Gaussian (MOG) posterior and/or based upon an approximate hidden Markov model (HMM) posterior using a variational technique.

Type: Grant

Filed: June 20, 2003

Date of Patent: November 18, 2008

Assignee: Microsoft Corporation

Inventors: Hagai Attias, Li Deng, Leo J. Lee

Method, apparatus, and system for building a compact model for large vocabulary continuous speech recognition (LVCSR) system

Patent number: 7454341

Abstract: According to one aspect of the invention, a method is provided in which a mean vector set and a variance vector set of a set of N Gaussians are divided into multiple mean sub-vector sets and variance sub-vector sets, respectively. Each mean sub-vector set contains a subset of the dimensions of the corresponding mean vector set and each variance sub-vector set contains a subset of the dimensions of the corresponding variance vector set. Each resultant sub-vector set is clustered to build a codebook for the respective sub-vector set using a modified K-means clustering process which dynamically merges and splits clusters based upon the size and average distortion of each cluster during each iteration in the modified K-means clustering process.

Type: Grant

Filed: September 30, 2000

Date of Patent: November 18, 2008

Assignee: Intel Corporation

Inventors: Jielin Pan, Baosheng Yuan
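
The sub-vector codebook construction in the abstract above can be sketched roughly as follows. The sketch splits each mean vector into contiguous sub-vectors and clusters each set separately. It substitutes plain K-means with a naive first-k initialization for the modified K-means of the abstract (no dynamic merging or splitting of clusters), and every function name is hypothetical. The same routine could be run on the variance vectors.

```python
def split_dims(vectors, sizes):
    """Split each vector into contiguous sub-vectors of the given sizes."""
    streams, start = [], 0
    for size in sizes:
        streams.append([v[start:start + size] for v in vectors])
        start += size
    return streams


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain K-means; centroids start at the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: recompute each non-empty cluster's centroid.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids, assign


def build_codebooks(vectors, sizes, k):
    """One (codebook, assignment) pair per sub-vector set."""
    return [kmeans(stream, k) for stream in split_dims(vectors, sizes)]


means = [[0.0, 0.0, 10.0], [0.1, 0.0, 10.0],
         [5.0, 5.0, -1.0], [5.1, 5.0, -1.0]]
for codebook, assignment in build_codebooks(means, sizes=[2, 1], k=2):
    print(codebook, assignment)
```

After clustering, each Gaussian can be stored as one codebook index per sub-vector set instead of a full mean vector, which is where the model becomes compact.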
Robust detection and classification of objects in audio using limited training data

Patent number: 7263485

Abstract: A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) comprising a plurality of the audio samples (x(n)). The homogeneous audio segment is next divided (206) into a plurality of audio clips (711-714), with each audio clip being associated with a plurality of the frames (701-704). The method (200) then extracts (208) at least one frame feature for each clip (711-714). A clip feature vector (f) is next extracted from frame features of frames associated with the audio clip (711-714). Finally the segment is classified based on a continuous function defining the distribution of the clip feature vectors (f).

Type: Grant

Filed: May 28, 2003

Date of Patent: August 28, 2007

Assignee: Canon Kabushiki Kaisha

Inventor: Timothy John Wark
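
The frame and clip structure described in the abstract above can be sketched as follows. The sketch assumes short-time energy as the single frame feature and the mean and standard deviation of that energy as the clip feature vector; the abstract names neither, so both are illustrative choices, and the function names are hypothetical. The final classification step is left out.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Cut the sample sequence into frames of frame_len, hop apart."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def frame_energy(frame):
    """Frame feature (assumption): mean squared amplitude."""
    return sum(x * x for x in frame) / len(frame)


def clip_features(samples, frame_len, hop, frames_per_clip):
    """One (mean, std) feature vector per clip of consecutive frames."""
    energies = [frame_energy(f) for f in frame_signal(samples, frame_len, hop)]
    features = []
    last_start = len(energies) - frames_per_clip
    for i in range(0, last_start + 1, frames_per_clip):
        clip = energies[i:i + frames_per_clip]
        mean = sum(clip) / len(clip)
        var = sum((e - mean) ** 2 for e in clip) / len(clip)
        features.append((mean, math.sqrt(var)))
    return features


segment = [1.0] * 8 + [2.0] * 8
print(clip_features(segment, frame_len=4, hop=4, frames_per_clip=2))
```

The resulting clip feature vectors are what a continuous density, such as a Gaussian mixture, would then be fitted to in order to classify the segment.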