Patents by Inventor Xiaobo Pi

Xiaobo Pi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8473290
    Abstract: An interactive voice response system is described that supports full duplex data transfer to enable the playing of a voice prompt to a user of a telephony system while the system listens for voice barge-in from the user. The system includes a speech detection module that may utilize various criteria such as frame energy magnitude and duration thresholds to detect speech. The system also includes an automatic speech recognition engine. When the automatic speech recognition engine recognizes a segment of speech, a feature extraction module may be used to subtract a prompt echo spectrum, which corresponds to the currently playing voice prompt, from an echo-dirtied speech spectrum recorded by the system. In order to improve spectrum subtraction, an estimation of the time delay between the echo-dirtied speech and the prompt echo may also be performed.
    Type: Grant
    Filed: August 25, 2008
    Date of Patent: June 25, 2013
    Assignee: Intel Corporation
    Inventors: Xiaobo Pi, Ying Jia
  • Patent number: 7472063
    Abstract: A speech recognition method includes several embodiments describing application of support vector machine analysis to a mouth region. Lip position can be accurately determined and used in conjunction with synchronous or asynchronous audio data to enhance speech recognition probabilities.
    Type: Grant
    Filed: December 19, 2002
    Date of Patent: December 30, 2008
    Assignee: Intel Corporation
    Inventors: Ara V. Nefian, Xiaobo Pi, Luhong Liang, Xiaoxing Liu, Yibao Zhao
  • Publication number: 20080310601
    Abstract: An interactive voice response system is described that supports full duplex data transfer to enable the playing of a voice prompt to a user of a telephony system while the system listens for voice barge-in from the user. The system includes a speech detection module that may utilize various criteria such as frame energy magnitude and duration thresholds to detect speech. The system also includes an automatic speech recognition engine. When the automatic speech recognition engine recognizes a segment of speech, a feature extraction module may be used to subtract a prompt echo spectrum, which corresponds to the currently playing voice prompt, from an echo-dirtied speech spectrum recorded by the system. In order to improve spectrum subtraction, an estimation of the time delay between the echo-dirtied speech and the prompt echo may also be performed.
    Type: Application
    Filed: August 25, 2008
    Publication date: December 18, 2008
    Inventors: Xiaobo Pi, Ying Jia
  • Patent number: 7454342
    Abstract: Method and apparatus for an audiovisual continuous speech recognition (AVCSR) system using a coupled hidden Markov model (CHMM) are described herein. In one aspect, an exemplary process includes receiving an audio data stream and a video data stream, and performing continuous speech recognition based on the audio and video data streams using a plurality of hidden Markov models (HMMs), a node of each of the HMMs at a time slot being subject to one or more nodes of related HMMs at a preceding time slot. Other methods and apparatuses are also described.
    Type: Grant
    Filed: March 19, 2003
    Date of Patent: November 18, 2008
    Assignee: Intel Corporation
    Inventors: Ara Victor Nefian, Xiaoxing Liu, Xiaobo Pi, Luhong Liang, Yibao Zhao
  • Patent number: 7437286
    Abstract: An interactive voice response system is described that supports full duplex data transfer to enable the playing of a voice prompt to a user of a telephony system while the system listens for voice barge-in from the user. The system includes a speech detection module that may utilize various criteria such as frame energy magnitude and duration thresholds to detect speech. The system also includes an automatic speech recognition engine. When the automatic speech recognition engine recognizes a segment of speech, a feature extraction module may be used to subtract a prompt echo spectrum, which corresponds to the currently playing voice prompt, from an echo-dirtied speech spectrum recorded by the system. In order to improve spectrum subtraction, an estimation of the time delay between the echo-dirtied speech and the prompt echo may also be performed.
    Type: Grant
    Filed: December 27, 2000
    Date of Patent: October 14, 2008
    Assignee: Intel Corporation
    Inventors: Xiaobo Pi, Ying Jia
  • Patent number: 7346497
    Abstract: An automatic speech recognition system comprising a speech decoder to resolve phone and word level information, a vector generator to generate information vectors on which a confidence measure is based by a neural network classifier (ANN). An error signal is designed which is not subject to false saturation or over specialization. The error signal is integrated into an error function which is back propagated through the ANN.
    Type: Grant
    Filed: May 8, 2001
    Date of Patent: March 18, 2008
    Assignee: Intel Corporation
    Inventors: Xiaobo Pi, Ying Jia
  • Patent number: 7072750
    Abstract: An automatic speech recognition system for continuous speech recognition of vocabulary words for an autoattendant system providing hands-free telephone calling and utilizing a vocabulary comprising numbers or names of people to be called, using known techniques for automatic speech recognition models of word sequencing, resulting in high confidence levels of recognition.
    Type: Grant
    Filed: May 8, 2001
    Date of Patent: July 4, 2006
    Assignee: Intel Corporation
    Inventors: Xiaobo Pi, Ying Jia
  • Publication number: 20050027530
    Abstract: A phoneme and a viseme of a person may be modeled using a coupled hidden Markov model. The coupled hidden Markov model and a second model may be compared to identify the person.
    Type: Application
    Filed: July 31, 2003
    Publication date: February 3, 2005
    Inventors: Tieyan Fu, Xiaoxing Liu, Luhong Liang, Xiaobo Pi, Ara Nefian
  • Publication number: 20050015251
    Abstract: An automatic speech recognition system comprising a speech decoder to resolve phone and word level information, a vector generator to generate information vectors on which a confidence measure is based by a neural network classifier (ANN). An error signal is designed which is not subject to false saturation or over specialization. The error signal is integrated into an error function which is back propagated through the ANN.
    Type: Application
    Filed: May 8, 2001
    Publication date: January 20, 2005
    Inventors: Xiaobo Pi, Ying Jia
  • Publication number: 20040186718
    Abstract: Method and apparatus for an audiovisual continuous speech recognition (AVCSR) system using a coupled hidden Markov model (CHMM) are described herein. In one aspect, an exemplary process includes receiving an audio data stream and a video data stream, and performing continuous speech recognition based on the audio and video data streams using a plurality of hidden Markov models (HMMs), a node of each of the HMMs at a time slot being subject to one or more nodes of related HMMs at a preceding time slot. Other methods and apparatuses are also described.
    Type: Application
    Filed: March 19, 2003
    Publication date: September 23, 2004
    Inventors: Ara Victor Nefian, Xiaoxing Liu, Xiaobo Pi, Luhong Liang, Yibao Zhao
  • Publication number: 20040122675
    Abstract: A speech recognition method includes several embodiments describing application of support vector machine analysis to a mouth region. Lip position can be accurately determined and used in conjunction with synchronous or asynchronous audio data to enhance speech recognition probabilities.
    Type: Application
    Filed: December 19, 2002
    Publication date: June 24, 2004
    Inventors: Ara Victor Nefian, Xiaobo Pi, Luhong Liang, Xiaoxing Liu, Yibao Zhao
  • Publication number: 20040015357
    Abstract: An automatic speech recognition system for continuous speech recognition of vocabulary words for an autoattendant system providing hands-free telephone calling and utilizing a vocabulary comprising numbers or names of people to be called, using known techniques for automatic speech recognition models of word sequencing, resulting in high confidence levels of recognition.
    Type: Application
    Filed: June 10, 2003
    Publication date: January 22, 2004
    Inventors: Xiaobo Pi, Ying Jia
  • Publication number: 20030212552
    Abstract: A visual feature extraction method includes application of multiclass linear discriminant analysis to the mouth region. Lip position can be accurately determined and used in conjunction with synchronous or asynchronous audio data to enhance speech recognition probabilities.
    Type: Application
    Filed: May 9, 2002
    Publication date: November 13, 2003
    Inventors: Lu Hong Liang, Xiaobo Pi, Xiaoxing Liu, Crusoe Mao, Ara V. Nefian
  • Publication number: 20030158732
    Abstract: An interactive voice response system is described that supports full duplex data transfer to enable the playing of a voice prompt to a user of a telephony system while the system listens for voice barge-in from the user. The system includes a speech detection module that may utilize various criteria such as frame energy magnitude and duration thresholds to detect speech. The system also includes an automatic speech recognition engine. When the automatic speech recognition engine recognizes a segment of speech, a feature extraction module may be used to subtract a prompt echo spectrum, which corresponds to the currently playing voice prompt, from an echo-dirtied speech spectrum recorded by the system. In order to improve spectrum subtraction, an estimation of the time delay between the echo-dirtied speech and the prompt echo may also be performed.
    Type: Application
    Filed: March 25, 2003
    Publication date: August 21, 2003
    Inventors: Xiaobo Pi, Ying Jia
  • Publication number: 20030139926
    Abstract: Methods for processing speech data are described herein. In one aspect of the invention, an exemplary method includes receiving a speech data stream, performing a Mel Frequency Cepstral Coefficients (MFCC) feature extraction on the speech data stream, optimizing feature space transformation (FST), optimizing model space transformation (MST) based on the FST, and performing recognition decoding based on the FST and the MST, generating a word sequence. Other methods and apparatuses are also described.
    Type: Application
    Filed: January 23, 2002
    Publication date: July 24, 2003
    Inventors: Ying Jia, Xiaobo Pi, Yonghong Yan