Patents by Inventor Daniel Almendro Barreda

Daniel Almendro Barreda has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

System and method for compressed domain language identification

Patent number: 9530400

Abstract: Embodiments included herein are directed towards a system and method for compressed domain language identification. Embodiments may include receiving a bitstream of a sequence of packets at one or more computing devices and classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD). Embodiments may further include extracting a pseudo-cepstral representation from the speech detected packets and partially decoding without extracting a PCM format and generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation. Embodiments may also include providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages.

Type: Grant

Filed: September 29, 2014

Date of Patent: December 27, 2016

Assignee: Nuance Communications, Inc.

Inventors: Jose Lainez, Daniel Almendro Barreda

SYSTEM AND METHOD FOR COMPRESSED DOMAIN LANGUAGE IDENTIFICATION

Publication number: 20160093290

Abstract: Embodiments included herein are directed towards a system and method for compressed domain language identification. Embodiments may include receiving a bitstream of a sequence of packets at one or more computing devices and classifying each packet into speech or non-speech based upon, at least in part, compressed domain voice activity detection (VAD). Embodiments may further include extracting a pseudo-cepstral representation from the speech detected packets and partially decoding without extracting a PCM format and generating a sequence of multi-frames, based upon, at least in part, the pseudo-cepstral representation. Embodiments may also include providing in real time the sequence of multi-frames to a deep neural network (DNN), wherein the DNN has been trained off-line for one or more desired target languages.

Type: Application

Filed: September 29, 2014

Publication date: March 31, 2016

Inventors: Jose Lainez, Daniel Almendro Barreda

Method and apparatus of adaptive textual prediction of voice data

Patent number: 9099091

Abstract: Typical textual prediction of voice data employs a predefined implementation arrangement of a single or multiple prediction sources. Using a predefined implementation arrangement of the prediction sources may not provide a good prediction performance in a consistent manner with variations in voice data quality. Prediction performance may be improved by employing adaptive textual prediction. According to at least one embodiment, a configuration of a plurality of prediction sources, used for textual interpretation of the voice data, is determined based at least in part on one or more features associated with the voice data or one or more a-priori interpretations of the voice data. A textual output prediction of the voice data is then generated using the plurality of prediction sources according to the determined configuration. Employing an adaptive configuration of the text prediction sources facilitates providing more accurate text transcripts of the voice data.

Type: Grant

Filed: January 22, 2013

Date of Patent: August 4, 2015

Assignee: Nuance Communications, Inc.

Inventors: Diven Topiwala, Uwe Helmut Jost, Lisa Meredith, Daniel Almendro Barreda

Method and Apparatus of Adaptive Textual Prediction of Voice Data

Publication number: 20140207451

Abstract: Typical textual prediction of voice data employs a predefined implementation arrangement of a single or multiple prediction sources. Using a predefined implementation arrangement of the prediction sources may not provide a good prediction performance in a consistent manner with variations in voice data quality. Prediction performance may be improved by employing adaptive textual prediction. According to at least one embodiment, a configuration of a plurality of prediction sources, used for textual interpretation of the voice data, is determined based at least in part on one or more features associated with the voice data or one or more a-priori interpretations of the voice data. A textual output prediction of the voice data is then generated using the plurality of prediction sources according to the determined configuration. Employing an adaptive configuration of the text prediction sources facilitates providing more accurate text transcripts of the voice data.

Type: Application

Filed: January 22, 2013

Publication date: July 24, 2014

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Diven Topiwala, Uwe Helmut Jost, Lisa Meredith, Daniel Almendro Barreda
