Patents by Inventor Petr Motlicek
Petr Motlicek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10460043
Abstract: An apparatus and a method for constructing a multilingual acoustic model, and a computer readable recording medium are provided. The method for constructing a multilingual acoustic model includes dividing an input feature into a common language portion and a distinctive language portion, acquiring a tandem feature by training the divided common language portion and distinctive language portion using a neural network to estimate and remove correlation between phonemes, dividing parameters of an initial acoustic model constructed using the tandem feature into common language parameters and distinctive language parameters, adapting the common language parameters using data of a training language, adapting the distinctive language parameters using data of a target language, and constructing an acoustic model for the target language using the adapted common language parameters and the adapted distinctive language parameters.
Type: Grant
Filed: November 22, 2013
Date of Patent: October 29, 2019
Assignees: SAMSUNG ELECTRONICS CO., LTD., IDIAP RESEARCH INSTITUTE
Inventors: Nam-Hoon Kim, Petr Motlicek, Philip Neil Garner, David Imseng, Jae-won Lee, Jeong-Mi Cho
-
Publication number: 20140149104
Abstract: An apparatus and a method for constructing a multilingual acoustic model, and a computer readable recording medium are provided. The method for constructing a multilingual acoustic model includes dividing an input feature into a common language portion and a distinctive language portion, acquiring a tandem feature by training the divided common language portion and distinctive language portion using a neural network to estimate and remove correlation between phonemes, dividing parameters of an initial acoustic model constructed using the tandem feature into common language parameters and distinctive language parameters, adapting the common language parameters using data of a training language, adapting the distinctive language parameters using data of a target language, and constructing an acoustic model for the target language using the adapted common language parameters and the adapted distinctive language parameters.
Type: Application
Filed: November 22, 2013
Publication date: May 29, 2014
Applicants: IDIAP RESEARCH INSTITUTE, SAMSUNG ELECTRONICS CO., LTD.
Inventors: Nam-Hoon KIM, Petr MOTLICEK, Philip Neil GARNER, David IMSENG, Jae-won LEE, Jeong-Mi CHO
-
Patent number: 8428957
Abstract: A technique of spectral noise shaping in an audio coding system is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. The tonality of each sub-band is determined. If a sub-band is tonal, time domain linear prediction (TDLP) processing is applied to the sub-band, yielding a residual signal and linear predictive coding (LPC) coefficients of an all-pole model representing the sub-band signal. The residual signal is further processed using a frequency domain linear prediction (FDLP) method. The FDLP parameters and LPC coefficients are transferred to a decoder. At the decoder, an inverse-FDLP process is applied to the encoded residual signal followed by an inverse TDLP process, which shapes the quantization noise according to the power spectral density of the original sub-band signal. Non-tonal sub-band signals bypass the TDLP process.
Type: Grant
Filed: August 22, 2008
Date of Patent: April 23, 2013
Assignee: QUALCOMM Incorporated
Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky
-
Patent number: 8392176
Abstract: In an apparatus and method, time-varying signals are processed and encoded via a frequency domain linear prediction (FDLP) scheme to arrive at an all-pole model. Residual signals resulted from the scheme are estimated and transformed into a time domain signal. Through the process of heterodyning, the time domain signal is frequency shifted toward the baseband level as a downshifted carrier signal. Quantized values of the all-pole model and the frequency transform of the downshifted carrier signal are packetized as encoded signals suitable for transmission or storage. To reconstruct the time-varying signals, the encoded signals are decoded. The decoding process is basically the reverse of the encoding process.
Type: Grant
Filed: April 5, 2007
Date of Patent: March 5, 2013
Assignee: QUALCOMM Incorporated
Inventors: Harinath Garudadri, Naveen B. Srinivasamurthy, Petr Motlicek, Hynek Hermansky
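The heterodyning step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a complex-valued (analytic) carrier and expresses the shift in DFT bins; the function names `heterodyne` and `dft_peak_bin`, the frame length, and the bin numbers are hypothetical choices made for this example and are not taken from the patent.

```python
import cmath
import math


def heterodyne(signal, shift_bins):
    """Shift a complex signal down in frequency by `shift_bins` DFT bins.

    Multiplying by exp(-j*2*pi*shift*t/N) moves every spectral component
    `shift_bins` bins toward baseband (illustrative parameterization).
    """
    n = len(signal)
    return [s * cmath.exp(-2j * math.pi * shift_bins * t / n)
            for t, s in enumerate(signal)]


def dft_peak_bin(signal):
    """Return the index of the strongest bin of a naive O(N^2) DFT."""
    n = len(signal)
    mags = [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]
    return max(range(n), key=mags.__getitem__)


# A complex tone at bin 20 stands in for the carrier (assumed test signal).
N = 64
carrier = [cmath.exp(2j * math.pi * 20 * t / N) for t in range(N)]
shifted = heterodyne(carrier, 15)  # energy moves from bin 20 toward baseband
```

Comparing `dft_peak_bin(carrier)` with `dft_peak_bin(shifted)` shows the spectral peak moving toward baseband by the chosen shift. A real-valued carrier would also have a mirrored negative-frequency component, which this sketch does not handle.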
-
Publication number: 20110270616
Abstract: A technique of spectral noise shaping in an audio coding system is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. The tonality of each sub-band is determined. If a sub-band is tonal, time domain linear prediction (TDLP) processing is applied to the sub-band, yielding a residual signal and linear predictive coding (LPC) coefficients of an all-pole model representing the sub-band signal. The residual signal is further processed using a frequency domain linear prediction (FDLP) method. The FDLP parameters and LPC coefficients are transferred to a decoder. At the decoder, an inverse-FDLP process is applied to the encoded residual signal followed by an inverse TDLP process, which shapes the quantization noise according to the power spectral density of the original sub-band signal. Non-tonal sub-band signals bypass the TDLP process.
Type: Application
Filed: August 22, 2008
Publication date: November 3, 2011
Applicant: QUALCOMM Incorporated
Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky
-
Patent number: 8027242
Abstract: In an apparatus and method, time-varying signals are processed and encoded via a frequency domain linear prediction (FDLP) scheme to arrive at an all-pole model. Residual signals resulted from the scheme are estimated. Quantized values of the all-pole model and the residual signals are packetized as encoded signals suitable for transmission or storage. To reconstruct the time-varying signals, the encoded signals are decoded. The decoding process is basically the reverse of the encoding process.
Type: Grant
Filed: October 18, 2006
Date of Patent: September 27, 2011
Assignee: QUALCOMM Incorporated
Inventors: Harinath Garudadri, Naveen B. Srinivasamurthy, Petr Motlicek, Hynek Hermansky
-
Publication number: 20090198500
Abstract: An audio coding technique based on modeling spectral dynamics is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. Each sub-band is then frequency transformed and linear prediction is applied. This results in a Hilbert envelope and a Hilbert Carrier for each of the sub-bands. Because of application of linear prediction to frequency components, the technique is called Frequency Domain Linear Prediction (FDLP). The Hilbert envelope and the Hilbert Carrier are analogous to spectral envelope and excitation signals in the Time Domain Linear Prediction (TDLP) techniques. Temporal masking is applied to the FDLP sub-bands to improve the compression efficiency. Specifically, forward masking of the sub-band FDLP carrier signal can be employed to improve compression efficiency of an encoded signal.
Type: Application
Filed: August 22, 2008
Publication date: August 6, 2009
Applicant: QUALCOMM Incorporated
Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky
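The core idea in this abstract, applying linear prediction to frequency components to obtain a temporal envelope, can be sketched for a single frame as follows. This is a rough illustration under stated assumptions, not the patented method: it uses a DCT as the frequency transform (the abstract does not say which transform is used), omits the sub-band decomposition, carrier extraction, and temporal masking, and all function names and the test frame are hypothetical.

```python
import cmath
import math


def dct2(x):
    """Naive O(N^2) DCT-II; adequate for short illustrative frames."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]


def autocorr(c, max_lag):
    """Autocorrelation of the sequence c for lags 0..max_lag."""
    n = len(c)
    return [sum(c[i] * c[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson(r, order):
    """Levinson-Durbin recursion: returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def fdlp_envelope(x, order=2):
    """All-pole approximation of the temporal envelope of frame x.

    Linear prediction is run on the DCT coefficients instead of the time
    samples, so the model's response traces the envelope over time.
    """
    n = len(x)
    r = autocorr(dct2(x), order)
    if r[0] == 0.0:
        return [0.0] * n  # silent frame: nothing to model
    a, err = levinson(r, order)
    env = []
    for t in range(n):
        theta = math.pi * (t + 0.5) / n  # time index mapped onto [0, pi]
        denom = sum(a[k] * cmath.exp(-1j * theta * k) for k in range(order + 1))
        env.append(err / abs(denom) ** 2)
    return env


# Assumed test frame: a single click at sample 20 of a 64-sample frame.
frame = [0.0] * 64
frame[20] = 1.0
envelope = fdlp_envelope(frame, order=2)
peak = max(range(len(envelope)), key=envelope.__getitem__)
```

For this click, the modeled envelope should peak close to the click position, which mirrors how conventional TDLP places spectral peaks near a signal's dominant frequencies. A higher `order` would let the model follow more envelope detail at the cost of more parameters to encode.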
-
Publication number: 20080031365
Abstract: In an apparatus and method, time-varying signals are processed and encoded via a frequency domain linear prediction (FDLP) scheme to arrive at an all-pole model. Residual signals resulted from the scheme are estimated. Quantized values of the all-pole model and the residual signals are packetized as encoded signals suitable for transmission or storage. To reconstruct the time-varying signals, the encoded signals are decoded. The decoding process is basically the reverse of the encoding process.
Type: Application
Filed: October 18, 2006
Publication date: February 7, 2008
Inventors: Harinath Garudadri, Naveen Srinivasamurthy, Petr Motlicek, Hynek Hermansky
-
Patent number: 7089178
Abstract: A distributed voice recognition system and method for obtaining acoustic features and speech activity at multiple frequencies by extracting high frequency components thereof on a device, such as a subscriber station and transmitting them to a network server having multiple stream processing capability, including cepstral feature processing, MLP nonlinear transformation processing, and multiband temporal pattern architecture processing. The features received at the network server are processed using all three streams, wherein each of the three streams provide benefits not available in the other two, thereby enhancing feature interpretation. Feature extraction and feature interpretation may operate at multiple frequencies, including but not limited to 8 kHz, 11 kHz, and 16 kHz.
Type: Grant
Filed: April 30, 2002
Date of Patent: August 8, 2006
Assignee: Qualcomm Inc.
Inventors: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek
-
Publication number: 20030204394
Abstract: A distributed voice recognition system and method for obtaining acoustic features and speech activity at multiple frequencies by extracting high frequency components thereof on a device, such as a subscriber station and transmitting them to a network server having multiple stream processing capability, including cepstral feature processing, MLP nonlinear transformation processing, and multiband temporal pattern architecture processing. The features received at the network server are processed using all three streams, wherein each of the three streams provide benefits not available in the other two, thereby enhancing feature interpretation. Feature extraction and feature interpretation may operate at multiple frequencies, including but not limited to 8 kHz, 11 kHz, and 16 kHz.
Type: Application
Filed: April 30, 2002
Publication date: October 30, 2003
Inventors: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek