Patents by Inventor Jose Lainez

Jose Lainez has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9530400
    Abstract: Embodiments included herein are directed towards a system and method for compressed domain language identification. Embodiments may include receiving a bitstream of a sequence of packets at one or more computing devices and classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD). Embodiments may further include extracting a pseudo-cepstral representation from the speech detected packets and partially decoding without extracting a PCM format and generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation. Embodiments may also include providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: December 27, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Jose Lainez, Daniel Almendro Barreda
  • Patent number: 9489958
    Abstract: The present disclosure is directed towards a method for discontinuous transmission (“DTX”) bandwidth reduction. The method may include receiving, at a processor, a frame identified as speech and determining that the frame was mistakenly identified as speech based upon, at least in part, a voice activity detection algorithm. The method may further include labeling the frame as a silence indicator frame.
    Type: Grant
    Filed: July 31, 2014
    Date of Patent: November 8, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Sridhar Pilli, Jose Lainez, Dushyant Sharma, Daniel A. Barreda, Patrick Naylor, Mahesh Godavarti
  • Patent number: 9373342
    Abstract: The present disclosure is directed towards a method for speech intelligibility. The method may include receiving, at one or more computing devices, a first speech input from a first user and performing voice activity detection upon the first speech input. The method may also include analyzing a spectral tilt associated with the first speech input, wherein analyzing includes computing an impulse response of a linear predictive coding (“LPC”) synthesis filter in a linear pulse code modulation (“PCM”) domain and wherein the one or more computing devices includes an adaptive high pass filter configured to recalculate one or more linear prediction coefficients.
    Type: Grant
    Filed: June 23, 2014
    Date of Patent: June 21, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Sridhar Pilli, Mahesh Godavarti, Qian-Yu Tang, Jose Lainez, Jagadeesh Balam
  • Patent number: 9361899
    Abstract: The present disclosure is directed towards a process for estimating the signal to noise ratio of a speech signal. The process may include receiving, at a computing device, a speech signal having a bitstream and a signal-to-noise ratio (“SNR”) associated therewith. The process may further include estimating the SNR directly from the bitstream or using a partial decoder that is configured to extract one or more parameters, the parameters including at least one of a fixed codebook gain, an adaptive codebook gain, a pitch lag, and a line spectral frequency (“LSF”) coefficient.
    Type: Grant
    Filed: July 2, 2014
    Date of Patent: June 7, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Jose Lainez, Daniel A. Barreda, Dushyant Sharma, Patrick Naylor, Sridhar Pilli
  • Publication number: 20160093290
    Abstract: Embodiments included herein are directed towards a system and method for compressed domain language identification. Embodiments may include receiving a bitstream of a sequence of packets at one or more computing devices and classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD). Embodiments may further include extracting a pseudo-cepstral representation from the speech detected packets and partially decoding without extracting a PCM format and generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation. Embodiments may also include providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages.
    Type: Application
    Filed: September 29, 2014
    Publication date: March 31, 2016
    Inventors: Jose Lainez, Daniel Almendro Barreda
  • Publication number: 20160035359
    Abstract: The present disclosure is directed towards a method for discontinuous transmission (“DTX”) bandwidth reduction. The method may include receiving, at a processor, a frame identified as speech and determining that the frame was mistakenly identified as speech based upon, at least in part, a voice activity detection algorithm. The method may further include labeling the frame as a silence indicator frame.
    Type: Application
    Filed: July 31, 2014
    Publication date: February 4, 2016
    Inventors: Sridhar Pilli, Jose Lainez, Dushyant Sharma, Daniel A. Barreda, Patrick Naylor, Mahesh Godavarti
  • Publication number: 20160005414
    Abstract: The present disclosure is directed towards a process for estimating the signal to noise ratio of a speech signal. The process may include receiving, at a computing device, a speech signal having a bitstream and a signal-to-noise ratio (“SNR”) associated therewith. The process may further include estimating the SNR directly from the bitstream or using a partial decoder that is configured to extract one or more parameters, the parameters including at least one of a fixed codebook gain, an adaptive codebook gain, a pitch lag, and a line spectral frequency (“LSF”) coefficient.
    Type: Application
    Filed: July 2, 2014
    Publication date: January 7, 2016
    Inventors: Jose Lainez, Daniel A. Barreda, Dushyant Sharma, Patrick Naylor, Sridhar Pilli
  • Publication number: 20150371653
    Abstract: The present disclosure is directed towards a method for speech intelligibility. The method may include receiving, at one or more computing devices, a first speech input from a first user and performing voice activity detection upon the first speech input. The method may also include analyzing a spectral tilt associated with the first speech input, wherein analyzing includes computing an impulse response of a linear predictive coding (“LPC”) synthesis filter in a linear pulse code modulation (“PCM”) domain and wherein the one or more computing devices includes an adaptive high pass filter configured to recalculate one or more linear prediction coefficients.
    Type: Application
    Filed: June 23, 2014
    Publication date: December 24, 2015
    Inventors: Sridhar Pilli, Mahesh Godavarti, Qian-Yu Tang, Jose Lainez, Jagadeesh Balam
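
Illustrative note on the speech intelligibility abstract (patent 9373342, publication 20150371653): the abstract mentions analyzing spectral tilt by computing the impulse response of an LPC synthesis filter. The sketch below shows one common way such a measure can be derived: take a truncated impulse response of the all-pole filter 1/A(z), then use the ratio of its first to zeroth autocorrelation coefficient as the tilt. The function names, the 20-sample truncation, and the choice of tilt measure are assumptions made for illustration; the abstract does not state which measure the patent uses, and this is not the patented method.

```python
from typing import List, Sequence


def lpc_impulse_response(a: Sequence[float], length: int = 20) -> List[float]:
    """Truncated impulse response of the all-pole synthesis filter 1/A(z).

    `a` holds the coefficients of A(z) = 1 + a1*z^-1 + ... + ap*z^-p,
    given as [1.0, a1, ..., ap]. Names and defaults are hypothetical.
    """
    if not a or a[0] != 1.0:
        raise ValueError("expected coefficients in the form [1.0, a1, ..., ap]")
    h: List[float] = []
    for n in range(length):
        # Unit impulse input: 1 at n == 0, 0 afterwards.
        acc = 1.0 if n == 0 else 0.0
        # Recursive part: h[n] = x[n] - sum_k a[k] * h[n - k].
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * h[n - k]
        h.append(acc)
    return h


def tilt_from_impulse_response(h: Sequence[float]) -> float:
    """Assumed tilt measure: first normalized autocorrelation r(1) / r(0)."""
    r0 = sum(x * x for x in h)
    r1 = sum(h[i] * h[i + 1] for i in range(len(h) - 1))
    return r1 / r0 if r0 > 0.0 else 0.0


# A single pole at z = 0.9 emphasizes low frequencies, so the tilt is positive.
low_heavy = tilt_from_impulse_response(lpc_impulse_response([1.0, -0.9]))
# A single pole at z = -0.9 emphasizes high frequencies, so the tilt is negative.
high_heavy = tilt_from_impulse_response(lpc_impulse_response([1.0, 0.9]))
```

Under this assumed measure, a positive value indicates energy concentrated at low frequencies and a negative value indicates energy concentrated at high frequencies, which is the kind of signal an adaptive high pass filter could respond to.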