Patents by Inventor Petr Motlicek

Petr Motlicek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Apparatus and method for constructing multilingual acoustic model and computer readable recording medium for storing program for performing the method

Patent number: 10460043

Abstract: An apparatus and a method for constructing a multilingual acoustic model, and a computer readable recording medium are provided. The method for constructing a multilingual acoustic model includes dividing an input feature into a common language portion and a distinctive language portion, acquiring a tandem feature by training the divided common language portion and distinctive language portion using a neural network to estimate and remove correlation between phonemes, dividing parameters of an initial acoustic model constructed using the tandem feature into common language parameters and distinctive language parameters, adapting the common language parameters using data of a training language, adapting the distinctive language parameters using data of a target language, and constructing an acoustic model for the target language using the adapted common language parameters and the adapted distinctive language parameters.

Type: Grant

Filed: November 22, 2013

Date of Patent: October 29, 2019

Assignees: SAMSUNG ELECTRONICS CO., LTD., IDIAP RESEARCH INSTITUTE

Inventors: Nam-Hoon Kim, Petr Motlicek, Philip Neil Garner, David Imseng, Jae-won Lee, Jeong-Mi Cho

APPARATUS AND METHOD FOR CONSTRUCTING MULTILINGUAL ACOUSTIC MODEL AND COMPUTER READABLE RECORDING MEDIUM FOR STORING PROGRAM FOR PERFORMING THE METHOD

Publication number: 20140149104

Abstract: An apparatus and a method for constructing a multilingual acoustic model, and a computer readable recording medium are provided. The method for constructing a multilingual acoustic model includes dividing an input feature into a common language portion and a distinctive language portion, acquiring a tandem feature by training the divided common language portion and distinctive language portion using a neural network to estimate and remove correlation between phonemes, dividing parameters of an initial acoustic model constructed using the tandem feature into common language parameters and distinctive language parameters, adapting the common language parameters using data of a training language, adapting the distinctive language parameters using data of a target language, and constructing an acoustic model for the target language using the adapted common language parameters and the adapted distinctive language parameters.

Type: Application

Filed: November 22, 2013

Publication date: May 29, 2014

Applicants: IDIAP RESEARCH INSTITUTE, SAMSUNG ELECTRONICS CO., LTD.

Inventors: Nam-Hoon KIM, Petr MOTLICEK, Philip Neil GARNER, David IMSENG, Jae-won LEE, Jeong-Mi CHO

Spectral noise shaping in audio coding based on spectral dynamics in frequency sub-bands

Patent number: 8428957

Abstract: A technique of spectral noise shaping in an audio coding system is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. The tonality of each sub-band is determined. If a sub-band is tonal, time domain linear prediction (TDLP) processing is applied to the sub-band, yielding a residual signal and linear predictive coding (LPC) coefficients of an all-pole model representing the sub-band signal. The residual signal is further processed using a frequency domain linear prediction (FDLP) method. The FDLP parameters and LPC coefficients are transferred to a decoder. At the decoder, an inverse-FDLP process is applied to the encoded residual signal followed by an inverse TDLP process, which shapes the quantization noise according to the power spectral density of the original sub-band signal. Non-tonal sub-band signals bypass the TDLP process.

Type: Grant

Filed: August 22, 2008

Date of Patent: April 23, 2013

Assignee: QUALCOMM Incorporated

Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky

Processing of excitation in audio coding and decoding

Patent number: 8392176

Abstract: In an apparatus and method, time-varying signals are processed and encoded via a frequency domain linear prediction (FDLP) scheme to arrive at an all-pole model. Residual signals resulted from the scheme are estimated and transformed into a time domain signal. Through the process of heterodyning, the time domain signal is frequency shifted toward the baseband level as a downshifted carrier signal. Quantized values of the all-pole model and the frequency transform of the downshifted carrier signal are packetized as encoded signals suitable for transmission or storage. To reconstruct the time-varying signals, the encoded signals are decoded. The decoding process is basically the reverse of the encoding process.

Type: Grant

Filed: April 5, 2007

Date of Patent: March 5, 2013

Assignee: QUALCOMM Incorporated

Inventors: Harinath Garudadri, Naveen B. Srinivasamurthy, Petr Motlicek, Hynek Hermansky
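
The heterodyning step named in the abstract above (shifting a carrier toward baseband) can be illustrated with a minimal sketch. This is not the patented implementation: the function names `heterodyne_down` and `dominant_bin`, the sample rate, and the tone frequencies are all hypothetical, and the input is assumed to be a complex (analytic) signal so that no low-pass filter is needed after mixing.

```python
import cmath
import math

def heterodyne_down(x, shift_hz, sample_rate):
    """Shift a complex signal down in frequency by shift_hz.

    Assumption: x is already complex (analytic), so multiplying by a
    complex exponential moves its spectrum without creating an image.
    """
    return [s * cmath.exp(-2j * math.pi * shift_hz * t / sample_rate)
            for t, s in enumerate(x)]

def dominant_bin(x):
    """Return the index of the largest-magnitude DFT bin (naive O(n^2) DFT)."""
    n = len(x)
    mags = [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]
    return max(range(n), key=lambda k: mags[k])

# Hypothetical example: a 20 Hz complex tone sampled at 64 Hz,
# shifted down by 17 Hz so that it sits near baseband.
fs = 64
tone = [cmath.exp(2j * math.pi * 20 * t / fs) for t in range(64)]
shifted = heterodyne_down(tone, 17, fs)
```

With 64 samples at 64 Hz, each DFT bin is 1 Hz wide, so `dominant_bin` can be used to check where the energy sits before and after the shift. A downshifted carrier occupies a narrower band around zero frequency, which is what makes it cheaper to transform and quantize.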

SPECTRAL NOISE SHAPING IN AUDIO CODING BASED ON SPECTRAL DYNAMICS IN FREQUENCY SUB-BANDS

Publication number: 20110270616

Abstract: A technique of spectral noise shaping in an audio coding system is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. The tonality of each sub-band is determined. If a sub-band is tonal, time domain linear prediction (TDLP) processing is applied to the sub-band, yielding a residual signal and linear predictive coding (LPC) coefficients of an all-pole model representing the sub-band signal. The residual signal is further processed using a frequency domain linear prediction (FDLP) method. The FDLP parameters and LPC coefficients are transferred to a decoder. At the decoder, an inverse-FDLP process is applied to the encoded residual signal followed by an inverse TDLP process, which shapes the quantization noise according to the power spectral density of the original sub-band signal. Non-tonal sub-band signals bypass the TDLP process.

Type: Application

Filed: August 22, 2008

Publication date: November 3, 2011

Applicant: QUALCOMM Incorporated

Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky

Signal coding and decoding based on spectral dynamics

Patent number: 8027242

Abstract: In an apparatus and method, time-varying signals are processed and encoded via a frequency domain linear prediction (FDLP) scheme to arrive at an all-pole model. Residual signals resulted from the scheme are estimated. Quantized values of the all-pole model and the residual signals are packetized as encoded signals suitable for transmission or storage. To reconstruct the time-varying signals, the encoded signals are decoded. The decoding process is basically the reverse of the encoding process.

Type: Grant

Filed: October 18, 2006

Date of Patent: September 27, 2011

Assignee: QUALCOMM Incorporated

Inventors: Harinath Garudadri, Naveen B. Srinivasamurthy, Petr Motlicek, Hynek Hermansky

TEMPORAL MASKING IN AUDIO CODING BASED ON SPECTRAL DYNAMICS IN FREQUENCY SUB-BANDS

Publication number: 20090198500

Abstract: An audio coding technique based on modeling spectral dynamics is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. Each sub-band is then frequency transformed and linear prediction is applied. This results in a Hilbert envelope and a Hilbert Carrier for each of the sub-bands. Because of application of linear prediction to frequency components, the technique is called Frequency Domain Linear Prediction (FDLP). The Hilbert envelope and the Hilbert Carrier are analogous to spectral envelope and excitation signals in the Time Domain Linear Prediction (TDLP) techniques. Temporal masking is applied to the FDLP sub-bands to improve the compression efficiency. Specifically, forward masking of the sub-band FDLP carrier signal can be employed to improve compression efficiency of an encoded signal.

Type: Application

Filed: August 22, 2008

Publication date: August 6, 2009

Applicant: QUALCOMM Incorporated

Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky
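
The core idea in the abstract above, applying linear prediction to the frequency transform of a signal so that the all-pole model traces the temporal (Hilbert) envelope, can be sketched as follows. This is an illustrative approximation only, not the method claimed in the application: it assumes a DCT-II as the frequency transform, a plain autocorrelation estimate, and Levinson-Durbin recursion, and it omits sub-band decomposition, quantization, and masking. All function names and the model order are hypothetical.

```python
import math

def dct2(x):
    """Unnormalized DCT-II of a real sequence (naive O(n^2) form)."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def autocorr(c, order):
    """Autocorrelation of c for lags 0..order."""
    return [sum(c[i] * c[i + lag] for i in range(len(c) - lag))
            for lag in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion: returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def fdlp_envelope(x, order=8):
    """Approximate squared temporal envelope of x via linear prediction
    on its DCT coefficients (the roles of time and frequency are swapped
    relative to ordinary time-domain linear prediction)."""
    n = len(x)
    a, err = levinson(autocorr(dct2(x), order), order)
    env = []
    for t in range(n):
        w = math.pi * (t + 0.5) / n  # "frequency" of the model that maps to time t
        re = sum(a[k] * math.cos(w * k) for k in range(order + 1))
        im = sum(a[k] * math.sin(w * k) for k in range(order + 1))
        env.append(err / (re * re + im * im))
    return env
```

Evaluating the all-pole model over its "frequency" axis yields a smooth curve over time, because the model was fitted to frequency-domain samples. A short burst in the input should therefore appear as a peak in `fdlp_envelope` near the burst's position, in the same way that a spectral peak appears in a conventional LPC spectrum.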

Signal coding and decoding based on spectral dynamics

Publication number: 20080031365

Abstract: In an apparatus and method, time-varying signals are processed and encoded via a frequency domain linear prediction (FDLP) scheme to arrive at an all-pole model. Residual signals resulted from the scheme are estimated. Quantized values of the all-pole model and the residual signals are packetized as encoded signals suitable for transmission or storage. To reconstruct the time-varying signals, the encoded signals are decoded. The decoding process is basically the reverse of the encoding process.

Type: Application

Filed: October 18, 2006

Publication date: February 7, 2008

Inventors: Harinath Garudadri, Naveen Srinivasamurthy, Petr Motlicek, Hynek Hermansky

Multistream network feature processing for a distributed speech recognition system

Patent number: 7089178

Abstract: A distributed voice recognition system and method for obtaining acoustic features and speech activity at multiple frequencies by extracting high frequency components thereof on a device, such as a subscriber station and transmitting them to a network server having multiple stream processing capability, including cepstral feature processing, MLP nonlinear transformation processing, and multiband temporal pattern architecture processing. The features received at the network server are processed using all three streams, wherein each of the three streams provide benefits not available in the other two, thereby enhancing feature interpretation. Feature extraction and feature interpretation may operate at multiple frequencies, including but not limited to 8 kHz, 11 kHz, and 16 kHz.

Type: Grant

Filed: April 30, 2002

Date of Patent: August 8, 2006

Assignee: Qualcomm Inc.

Inventors: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek

Distributed voice recognition system utilizing multistream network feature processing

Publication number: 20030204394

Abstract: A distributed voice recognition system and method for obtaining acoustic features and speech activity at multiple frequencies by extracting high frequency components thereof on a device, such as a subscriber station and transmitting them to a network server having multiple stream processing capability, including cepstral feature processing, MLP nonlinear transformation processing, and multiband temporal pattern architecture processing. The features received at the network server are processed using all three streams, wherein each of the three streams provide benefits not available in the other two, thereby enhancing feature interpretation. Feature extraction and feature interpretation may operate at multiple frequencies, including but not limited to 8 kHz, 11 kHz, and 16 kHz.

Type: Application

Filed: April 30, 2002

Publication date: October 30, 2003

Inventors: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek