Patents by Inventor Daniel Almendro Barreda

Daniel Almendro Barreda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9530400
    Abstract: Embodiments included herein are directed towards a system and method for compressed domain language identification. Embodiments may include receiving a bitstream of a sequence of packets at one or more computing devices and classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD). Embodiments may further include extracting a pseudo-cepstral representation from the speech-detected packets and partially decoding without extracting a PCM format and generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation. Embodiments may also include providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: December 27, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Jose Lainez, Daniel Almendro Barreda
  • Publication number: 20160093290
    Abstract: Embodiments included herein are directed towards a system and method for compressed domain language identification. Embodiments may include receiving a bitstream of a sequence of packets at one or more computing devices and classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD). Embodiments may further include extracting a pseudo-cepstral representation from the speech-detected packets and partially decoding without extracting a PCM format and generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation. Embodiments may also include providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages.
    Type: Application
    Filed: September 29, 2014
    Publication date: March 31, 2016
    Inventors: Jose Lainez, Daniel Almendro Barreda
  • Patent number: 9099091
    Abstract: Typical textual prediction of voice data employs a predefined implementation arrangement of one or more prediction sources. Using a predefined implementation arrangement of the prediction sources may not provide good prediction performance in a consistent manner with variations in voice data quality. Prediction performance may be improved by employing adaptive textual prediction. According to at least one embodiment, a configuration of a plurality of prediction sources, used for textual interpretation of the voice data, is determined based at least in part on one or more features associated with the voice data or one or more a priori interpretations of the voice data. A textual output prediction of the voice data is then generated using the plurality of prediction sources according to the determined configuration. Employing an adaptive configuration of the text prediction sources facilitates providing more accurate text transcripts of the voice data.
    Type: Grant
    Filed: January 22, 2013
    Date of Patent: August 4, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Diven Topiwala, Uwe Helmut Jost, Lisa Meredith, Daniel Almendro Barreda
  • Publication number: 20140207451
    Abstract: Typical textual prediction of voice data employs a predefined implementation arrangement of one or more prediction sources. Using a predefined implementation arrangement of the prediction sources may not provide good prediction performance in a consistent manner with variations in voice data quality. Prediction performance may be improved by employing adaptive textual prediction. According to at least one embodiment, a configuration of a plurality of prediction sources, used for textual interpretation of the voice data, is determined based at least in part on one or more features associated with the voice data or one or more a priori interpretations of the voice data. A textual output prediction of the voice data is then generated using the plurality of prediction sources according to the determined configuration. Employing an adaptive configuration of the text prediction sources facilitates providing more accurate text transcripts of the voice data.
    Type: Application
    Filed: January 22, 2013
    Publication date: July 24, 2014
    Applicant: Nuance Communications, Inc.
    Inventors: Diven Topiwala, Uwe Helmut Jost, Lisa Meredith, Daniel Almendro Barreda
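
The adaptive textual prediction described in the abstracts for 9099091 and 20140207451 can be illustrated with a minimal sketch. It is not the patented method: the feature (a signal-to-noise ratio, snr_db), the two source names, the weight values, the 20 dB threshold, and the weighted-sum combination are all hypothetical choices made only to show a configuration being selected from a voice-data feature and then applied to several prediction sources.

```python
from collections import defaultdict


def choose_configuration(snr_db):
    """Pick per-source weights from one voice-data feature.

    The threshold and weights are illustrative assumptions, not values
    taken from the patent.
    """
    if snr_db >= 20.0:
        # Cleaner audio: weight the acoustic recognizer more heavily.
        return {"asr": 0.7, "language_model": 0.3}
    # Noisier audio: weight the language model more heavily.
    return {"asr": 0.3, "language_model": 0.7}


def predict_text(candidates, weights):
    """Return the candidate text with the highest weighted total score.

    candidates maps a source name to {candidate text: score in [0, 1]}.
    Sources missing from weights contribute nothing.
    """
    totals = defaultdict(float)
    for source, scored in candidates.items():
        for text, score in scored.items():
            totals[text] += weights.get(source, 0.0) * score
    return max(totals, key=totals.get)


# Hypothetical scores from two prediction sources for the same utterance.
candidates = {
    "asr": {"call tom": 0.8, "call mom": 0.2},
    "language_model": {"call tom": 0.3, "call mom": 0.7},
}

clean = predict_text(candidates, choose_configuration(snr_db=30.0))
noisy = predict_text(candidates, choose_configuration(snr_db=5.0))
print(clean)  # call tom
print(noisy)  # call mom
```

With the same candidate scores, the selected configuration alone changes the output, which is the behavior the abstract attributes to an adaptive rather than predefined arrangement of prediction sources.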