Patents by Inventor Petr Motlicek

Petr Motlicek has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10460043
    Abstract: An apparatus and a method for constructing a multilingual acoustic model, and a computer readable recording medium are provided. The method for constructing a multilingual acoustic model includes dividing an input feature into a common language portion and a distinctive language portion, acquiring a tandem feature by training the divided common language portion and distinctive language portion using a neural network to estimate and remove correlation between phonemes, dividing parameters of an initial acoustic model constructed using the tandem feature into common language parameters and distinctive language parameters, adapting the common language parameters using data of a training language, adapting the distinctive language parameters using data of a target language, and constructing an acoustic model for the target language using the adapted common language parameters and the adapted distinctive language parameters.
    Type: Grant
    Filed: November 22, 2013
    Date of Patent: October 29, 2019
    Assignees: SAMSUNG ELECTRONICS CO., LTD., IDIAP RESEARCH INSTITUTE
    Inventors: Nam-Hoon Kim, Petr Motlicek, Philip Neil Garner, David Imseng, Jae-won Lee, Jeong-Mi Cho
  • Publication number: 20140149104
    Abstract: An apparatus and a method for constructing a multilingual acoustic model, and a computer readable recording medium are provided. The method for constructing a multilingual acoustic model includes dividing an input feature into a common language portion and a distinctive language portion, acquiring a tandem feature by training the divided common language portion and distinctive language portion using a neural network to estimate and remove correlation between phonemes, dividing parameters of an initial acoustic model constructed using the tandem feature into common language parameters and distinctive language parameters, adapting the common language parameters using data of a training language, adapting the distinctive language parameters using data of a target language, and constructing an acoustic model for the target language using the adapted common language parameters and the adapted distinctive language parameters.
    Type: Application
    Filed: November 22, 2013
    Publication date: May 29, 2014
    Applicants: IDIAP RESEARCH INSTITUTE, SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Nam-Hoon Kim, Petr Motlicek, Philip Neil Garner, David Imseng, Jae-won Lee, Jeong-Mi Cho
  • Patent number: 8428957
    Abstract: A technique of spectral noise shaping in an audio coding system is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. The tonality of each sub-band is determined. If a sub-band is tonal, time domain linear prediction (TDLP) processing is applied to the sub-band, yielding a residual signal and linear predictive coding (LPC) coefficients of an all-pole model representing the sub-band signal. The residual signal is further processed using a frequency domain linear prediction (FDLP) method. The FDLP parameters and LPC coefficients are transferred to a decoder. At the decoder, an inverse-FDLP process is applied to the encoded residual signal followed by an inverse TDLP process, which shapes the quantization noise according to the power spectral density of the original sub-band signal. Non-tonal sub-band signals bypass the TDLP process.
    Type: Grant
    Filed: August 22, 2008
    Date of Patent: April 23, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky
  • Patent number: 8392176
    Abstract: In an apparatus and method, time-varying signals are processed and encoded via a frequency domain linear prediction (FDLP) scheme to arrive at an all-pole model. Residual signals resulted from the scheme are estimated and transformed into a time domain signal. Through the process of heterodyning, the time domain signal is frequency shifted toward the baseband level as a downshifted carrier signal. Quantized values of the all-pole model and the frequency transform of the downshifted carrier signal are packetized as encoded signals suitable for transmission or storage. To reconstruct the time-varying signals, the encoded signals are decoded. The decoding process is basically the reverse of the encoding process.
    Type: Grant
    Filed: April 5, 2007
    Date of Patent: March 5, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Harinath Garudadri, Naveen B. Srinivasamurthy, Petr Motlicek, Hynek Hermansky
  • Publication number: 20110270616
    Abstract: A technique of spectral noise shaping in an audio coding system is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. The tonality of each sub-band is determined. If a sub-band is tonal, time domain linear prediction (TDLP) processing is applied to the sub-band, yielding a residual signal and linear predictive coding (LPC) coefficients of an all-pole model representing the sub-band signal. The residual signal is further processed using a frequency domain linear prediction (FDLP) method. The FDLP parameters and LPC coefficients are transferred to a decoder. At the decoder, an inverse-FDLP process is applied to the encoded residual signal followed by an inverse TDLP process, which shapes the quantization noise according to the power spectral density of the original sub-band signal. Non-tonal sub-band signals bypass the TDLP process.
    Type: Application
    Filed: August 22, 2008
    Publication date: November 3, 2011
    Applicant: QUALCOMM Incorporated
    Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky
  • Patent number: 8027242
    Abstract: In an apparatus and method, time-varying signals are processed and encoded via a frequency domain linear prediction (FDLP) scheme to arrive at an all-pole model. Residual signals resulted from the scheme are estimated. Quantized values of the all-pole model and the residual signals are packetized as encoded signals suitable for transmission or storage. To reconstruct the time-varying signals, the encoded signals are decoded. The decoding process is basically the reverse of the encoding process.
    Type: Grant
    Filed: October 18, 2006
    Date of Patent: September 27, 2011
    Assignee: QUALCOMM Incorporated
    Inventors: Harinath Garudadri, Naveen B. Srinivasamurthy, Petr Motlicek, Hynek Hermansky
  • Publication number: 20090198500
    Abstract: An audio coding technique based on modeling spectral dynamics is disclosed. Frequency decomposition of an input audio signal is performed to obtain multiple frequency sub-bands that closely follow critical bands of human auditory system decomposition. Each sub-band is then frequency transformed and linear prediction is applied. This results in a Hilbert envelope and a Hilbert Carrier for each of the sub-bands. Because of application of linear prediction to frequency components, the technique is called Frequency Domain Linear Prediction (FDLP). The Hilbert envelope and the Hilbert Carrier are analogous to spectral envelope and excitation signals in the Time Domain Linear Prediction (TDLP) techniques. Temporal masking is applied to the FDLP sub-bands to improve the compression efficiency. Specifically, forward masking of the sub-band FDLP carrier signal can be employed to improve compression efficiency of an encoded signal.
    Type: Application
    Filed: August 22, 2008
    Publication date: August 6, 2009
    Applicant: QUALCOMM Incorporated
    Inventors: Harinath Garudadri, Sriram Ganapathy, Petr Motlicek, Hynek Hermansky
  • Publication number: 20080031365
    Abstract: In an apparatus and method, time-varying signals are processed and encoded via a frequency domain linear prediction (FDLP) scheme to arrive at an all-pole model. Residual signals resulted from the scheme are estimated. Quantized values of the all-pole model and the residual signals are packetized as encoded signals suitable for transmission or storage. To reconstruct the time-varying signals, the encoded signals are decoded. The decoding process is basically the reverse of the encoding process.
    Type: Application
    Filed: October 18, 2006
    Publication date: February 7, 2008
    Inventors: Harinath Garudadri, Naveen Srinivasamurthy, Petr Motlicek, Hynek Hermansky
  • Patent number: 7089178
    Abstract: A distributed voice recognition system and method for obtaining acoustic features and speech activity at multiple frequencies by extracting high frequency components thereof on a device, such as a subscriber station and transmitting them to a network server having multiple stream processing capability, including cepstral feature processing, MLP nonlinear transformation processing, and multiband temporal pattern architecture processing. The features received at the network server are processed using all three streams, wherein each of the three streams provide benefits not available in the other two, thereby enhancing feature interpretation. Feature extraction and feature interpretation may operate at multiple frequencies, including but not limited to 8 kHz, 11 kHz, and 16 kHz.
    Type: Grant
    Filed: April 30, 2002
    Date of Patent: August 8, 2006
    Assignee: Qualcomm Inc.
    Inventors: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek
  • Publication number: 20030204394
    Abstract: A distributed voice recognition system and method for obtaining acoustic features and speech activity at multiple frequencies by extracting high frequency components thereof on a device, such as a subscriber station and transmitting them to a network server having multiple stream processing capability, including cepstral feature processing, MLP nonlinear transformation processing, and multiband temporal pattern architecture processing. The features received at the network server are processed using all three streams, wherein each of the three streams provide benefits not available in the other two, thereby enhancing feature interpretation. Feature extraction and feature interpretation may operate at multiple frequencies, including but not limited to 8 kHz, 11 kHz, and 16 kHz.
    Type: Application
    Filed: April 30, 2002
    Publication date: October 30, 2003
    Inventors: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek
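
Several of the abstracts above rely on frequency domain linear prediction (FDLP): a signal is frequency transformed and linear prediction is applied to the frequency components, so that the resulting all-pole model approximates the temporal (Hilbert) envelope. The following is a minimal illustrative sketch of that general idea, not the patented implementation. It assumes a DCT-II as the frequency transform, the autocorrelation method with a Levinson-Durbin recursion, and a single full-band signal (no sub-band decomposition, quantization, or carrier coding). The function names `dct_ii`, `levinson`, and `fdlp_envelope` and the default model order are hypothetical choices made for this example.

```python
import numpy as np


def dct_ii(x):
    """Unnormalized DCT-II computed with an explicit matrix (O(N^2), sketch only)."""
    n = len(x)
    k = np.arange(n)[:, None]          # frequency index
    t = np.arange(n)[None, :]          # time index
    return np.cos(np.pi * k * (2 * t + 1) / (2 * n)) @ x


def levinson(r, order):
    """Levinson-Durbin recursion on autocorrelation r; returns (a, err), a[0] == 1."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                 # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k             # updated prediction error
    return a, err


def fdlp_envelope(x, order=20):
    """Estimate a smooth temporal envelope of x by linear prediction on its DCT."""
    c = dct_ii(np.asarray(x, dtype=float))
    n = len(c)
    # Autocorrelation of the DCT coefficients (autocorrelation method).
    r = np.array([np.dot(c[:n - lag], c[lag:]) for lag in range(order + 1)])
    a, err = levinson(r, order)
    # Time sample t corresponds to the angle pi * (t + 0.5) / n in the DCT domain,
    # so evaluating the all-pole model on that grid yields one value per sample.
    w = np.pi * (np.arange(n) + 0.5) / n
    A = np.exp(-1j * np.outer(w, np.arange(order + 1))) @ a
    return err / np.abs(A) ** 2


if __name__ == "__main__":
    # Assumed test signal: a noise burst centred on sample 300 plus a weak noise floor.
    rng = np.random.default_rng(0)
    t = np.arange(512)
    burst = np.exp(-0.5 * ((t - 300) / 20.0) ** 2) * rng.standard_normal(512)
    x = burst + 0.01 * rng.standard_normal(512)
    env = fdlp_envelope(x, order=10)
    print("envelope peaks at sample", int(np.argmax(env)))
```

In this sketch the all-pole coefficients play the role of the envelope parameters mentioned in the abstracts. A codec along the lines described would additionally split the signal into sub-bands and encode the residual (carrier) signal, which is omitted here.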