Patents by Inventor Paul J. Vozila

Paul J. Vozila has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods for providing metadata-dependent language models

Patent number: 10102849

Abstract: Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.

Type: Grant

Filed: March 24, 2017

Date of Patent: October 16, 2018

Assignee: Nuance Communications, Inc.

Inventors: Paul J. Vozila, Wilson Tam, Nils Lenke
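
The clustering and per-cluster model generation described in this abstract can be illustrated with a minimal Python sketch. Several things here are assumptions for illustration and are not taken from the patent: the function names, the toy data, the use of a simple unigram model, and the fact that the clustering attributes are passed in by hand (the abstract describes identifying them by processing the language data).

```python
from collections import Counter, defaultdict


def cluster_by_metadata(instances, attributes):
    """Group training texts by their values for the chosen metadata attributes."""
    clusters = defaultdict(list)
    for text, metadata in instances:
        # The cluster key is the tuple of values for the selected attributes.
        key = tuple(metadata.get(attr) for attr in attributes)
        clusters[key].append(text)
    return dict(clusters)


def unigram_model(texts):
    """Estimate a maximum-likelihood unigram model (a stand-in for a real language model)."""
    counts = Counter(word for text in texts for word in text.split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def build_cluster_models(instances, attributes):
    """Generate one language model per metadata cluster."""
    clusters = cluster_by_metadata(instances, attributes)
    return {key: unigram_model(texts) for key, texts in clusters.items()}


# Hypothetical training instances: (training text, metadata attribute values).
instances = [
    ("play some jazz", {"domain": "music", "device": "phone"}),
    ("play the news", {"domain": "radio", "device": "car"}),
    ("play more jazz", {"domain": "music", "device": "car"}),
]

# Assumption: "domain" has already been chosen as the clustering attribute.
models = build_cluster_models(instances, ["domain"])
```

Here `models` maps each cluster key, such as `("music",)`, to its own word-probability table, so text from different metadata contexts is modeled separately.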

Hybrid controller for ASR

Patent number: 9886944

Abstract: A mobile device is described which is adapted for automatic speech recognition (ASR). A speech input receives an unknown speech input signal from a user. A local controller determines if a remote ASR processing condition is met, transforms the speech input signal into a selected one of multiple different speech representation types, and sends the transformed speech input signal to a remote server for remote ASR processing. A local ASR arrangement performs local ASR processing of the speech input including processing any speech recognition results received from the remote server.

Type: Grant

Filed: October 4, 2012

Date of Patent: February 6, 2018

Assignee: Nuance Communications, Inc.

Inventors: Daniel Willett, Jianxiong Wu, Paul J. Vozila, William F. Ganong, III

SYSTEMS AND METHODS FOR PROVIDING METADATA-DEPENDENT LANGUAGE MODELS

Publication number: 20170200447

Abstract: Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.

Type: Application

Filed: March 24, 2017

Publication date: July 13, 2017

Applicant: Nuance Communications, Inc.

Inventors: Paul J. Vozila, Wilson Tam, Nils Lenke

Systems and methods for providing metadata-dependent language models

Patent number: 9626960

Abstract: Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.

Type: Grant

Filed: April 25, 2013

Date of Patent: April 18, 2017

Assignee: Nuance Communications, Inc.

Inventors: Paul J. Vozila, Wilson Tam, Nils Lenke

Systems and methods for providing unnormalized language models

Patent number: 9524716

Abstract: Some embodiments relate to using an unnormalized neural network language model in connection with a speech processing application. The techniques include obtaining a language segment sequence comprising one or more language segments in a vocabulary; accessing an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each of the plurality of output nodes associated with a respective language segment in the vocabulary; and determining, using the unnormalized neural network language model, a first likelihood that a first language segment in the vocabulary follows the language segment sequence.

Type: Grant

Filed: April 17, 2015

Date of Patent: December 20, 2016

Assignee: Nuance Communications, Inc.

Inventors: Abhinav Sethy, Stanley Chen, Bhuvana Ramabhadran, Paul J. Vozila

Data shredding for speech recognition acoustic model training under data retention restrictions

Patent number: 9514741

Abstract: Training speech recognizers, e.g., their language or acoustic models, using actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of an acoustic model which includes dynamically shredding a speech corpus to produce text segments and depersonalized audio features corresponding to the text segments. The method further includes enabling a system to train an acoustic model using the text segments and the depersonalized audio features. Because the data is depersonalized, actual data may be used, enabling speech recognizers to keep up-to-date with user trends in speech and usage, among other benefits.

Type: Grant

Filed: March 13, 2013

Date of Patent: December 6, 2016

Assignee: Nuance Communications, Inc.

Inventors: Uwe Helmut Jost, Philip Charles Woodland, Marcel Katz, Syed Raza Shahid, Paul J. Vozila, William F. Ganong, III

Data shredding for speech recognition language model training under data retention restrictions

Patent number: 9514740

Abstract: Training speech recognizers, e.g., their language or acoustic models, using actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of a language model which includes producing segments of text in a text corpus and counts corresponding to the segments of text, the text corpus being in a depersonalized state. The method further includes enabling a system to train a language model using the segments of text in the depersonalized state and the counts. Because the data is depersonalized, actual data may be used, enabling speech recognizers to keep up-to-date with user trends in speech and usage, among other benefits.

Type: Grant

Filed: March 13, 2013

Date of Patent: December 6, 2016

Assignee: Nuance Communications, Inc.

Inventors: Uwe Helmut Jost, Philip Charles Woodland, Marcel Katz, Syed Raza Shahid, Paul J. Vozila, William F. Ganong, III
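
As a rough illustration of training a language model from text segments and their counts, the Python sketch below reduces utterances to short n-gram segments with counts and then estimates a bigram probability from those counts alone. The function names, the segment length limit, and the sample utterances are hypothetical; the patent's actual segmentation and depersonalization procedures are not reproduced here, and short segments by themselves are not claimed to guarantee depersonalization.

```python
from collections import Counter


def shred_to_counts(utterances, max_n=2):
    """Reduce utterances to n-gram segments (up to max_n words) with counts.

    Only the aggregated segments and counts are returned; the original
    utterances and their ordering are not kept in the result.
    """
    counts = Counter()
    for utterance in utterances:
        tokens = utterance.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def bigram_probability(counts, history, word):
    """Estimate P(word | history) using only the stored segment counts."""
    history_count = counts[(history,)]
    if history_count == 0:
        return 0.0
    return counts[(history, word)] / history_count


# Hypothetical utterances standing in for a text corpus.
segment_counts = shred_to_counts(["call doctor smith", "call doctor jones"])
p_doctor_given_call = bigram_probability(segment_counts, "call", "doctor")
```

The training step (`bigram_probability`) reads only `segment_counts`, which mirrors the idea that the model is trained from segments and counts rather than from retained full records.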

SYSTEMS AND METHODS FOR PROVIDING UNNORMALIZED LANGUAGE MODELS

Publication number: 20160307564

Abstract: Some embodiments relate to using an unnormalized neural network language model in connection with a speech processing application. The techniques include obtaining a language segment sequence comprising one or more language segments in a vocabulary; accessing an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each of the plurality of output nodes associated with a respective language segment in the vocabulary; and determining, using the unnormalized neural network language model, a first likelihood that a first language segment in the vocabulary follows the language segment sequence.

Type: Application

Filed: April 17, 2015

Publication date: October 20, 2016

Applicant: Nuance Communications, Inc.

Inventors: Abhinav Sethy, Stanley Chen, Bhuvana Ramabhadran, Paul J. Vozila

Correcting N-gram probabilities by page view information

Patent number: 9311291

Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.

Type: Grant

Filed: September 9, 2013

Date of Patent: April 12, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Nathan M. Bodenstab, Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura, Paul J. Vozila
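
The four steps in this abstract (count per page, weight, merge, calculate probabilities) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented method: the weight is taken to be the raw view count, the probability is a simple conditional relative frequency without smoothing, and the function names and sample pages are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def weighted_ngram_probabilities(pages, n=2):
    """Compute P(last word | preceding words) from view-weighted n-gram counts.

    pages is a list of (text, view_count) pairs.
    """
    merged = Counter()
    for text, views in pages:
        # Step 1: count n-grams within this page.
        page_counts = Counter(ngrams(text.split(), n))
        # Steps 2 and 3: weight by the page's view count, then merge across pages.
        for gram, count in page_counts.items():
            merged[gram] += views * count

    # Step 4: normalize over all n-grams that share the same history.
    history_totals = Counter()
    for gram, weight in merged.items():
        history_totals[gram[:-1]] += weight
    return {gram: weight / history_totals[gram[:-1]] for gram, weight in merged.items()}


# Hypothetical pages with view counts.
pages = [("new york city", 90), ("new york times", 10)]
probs = weighted_ngram_probabilities(pages, n=3)
```

With these sample pages, the trigram from the heavily viewed page dominates: "city" follows "new york" with probability 0.9 and "times" with probability 0.1, whereas unweighted counts would give each 0.5.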

Correcting N-gram probabilities by page view information

Patent number: 9251135

Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.

Type: Grant

Filed: August 13, 2013

Date of Patent: February 2, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Nathan M. Bodenstab, Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura, Paul J. Vozila

HYBRID CONTROLLER FOR ASR

Publication number: 20150279352

Abstract: A mobile device is described which is adapted for automatic speech recognition (ASR). A speech input receives an unknown speech input signal from a user. A local controller determines if a remote ASR processing condition is met, transforms the speech input signal into a selected one of multiple different speech representation types, and sends the transformed speech input signal to a remote server for remote ASR processing. A local ASR arrangement performs local ASR processing of the speech input including processing any speech recognition results received from the remote server.

Type: Application

Filed: October 4, 2012

Publication date: October 1, 2015

Applicant: Nuance Communications, Inc.

Inventors: Daniel Willett, Jianxiong Wu, Paul J. Vozila, William F. Ganong, III

Protection of private information in a client/server automatic speech recognition system

Patent number: 9131369

Abstract: A mobile device is adapted for protecting private information on the mobile device in a hybrid automatic speech recognition arrangement. The mobile device includes a speech input component for receiving a speech input signal from a user. Additionally, the mobile device includes a local ASR arrangement for performing local ASR processing of the speech input signal and determining if private information is included within the speech input signal. A control unit on the mobile device obscures private information in the speech input signal if the local ASR arrangement identifies information within a speech recognition result as private information. The control unit releases the speech input signal with the obscured private information for transmission to a remote server for further ASR processing. Results from the remote server's ASR processing are integrated and combined with results from local ASR processing to display information on the mobile device.

Type: Grant

Filed: January 24, 2013

Date of Patent: September 8, 2015

Assignee: Nuance Communications, Inc.

Inventors: William F. Ganong, III, Paul J. Vozila

CORRECTING N-GRAM PROBABILITIES BY PAGE VIEW INFORMATION

Publication number: 20150051899

Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.

Type: Application

Filed: August 13, 2013

Publication date: February 19, 2015

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Nathan M. Bodenstab, Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura, Paul J. Vozila

CORRECTING N-GRAM PROBABILITIES BY PAGE VIEW INFORMATION

Publication number: 20150051902

Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.

Type: Application

Filed: September 9, 2013

Publication date: February 19, 2015

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Nathan M. Bodenstab, Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura, Paul J. Vozila

SYSTEMS AND METHODS FOR PROVIDING METADATA-DEPENDENT LANGUAGE MODELS

Publication number: 20140324434

Abstract: Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.

Type: Application

Filed: April 25, 2013

Publication date: October 30, 2014

Applicant: Nuance Communications, Inc.

Inventors: Paul J. Vozila, Wilson Tam, Nils Lenke

DATA SHREDDING FOR SPEECH RECOGNITION LANGUAGE MODEL TRAINING UNDER DATA RETENTION RESTRICTIONS

Publication number: 20140278425

Abstract: Training speech recognizers, e.g., their language or acoustic models, using actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of a language model which includes producing segments of text in a text corpus and counts corresponding to the segments of text, the text corpus being in a depersonalized state. The method further includes enabling a system to train a language model using the segments of text in the depersonalized state and the counts. Because the data is depersonalized, actual data may be used, enabling speech recognizers to keep up-to-date with user trends in speech and usage, among other benefits.

Type: Application

Filed: March 13, 2013

Publication date: September 18, 2014

Inventors: Uwe Helmut Jost, Philip Charles Woodland, Marcel Katz, Syed Raza Shahid, Paul J. Vozila, William F. Ganong, III

DATA SHREDDING FOR SPEECH RECOGNITION ACOUSTIC MODEL TRAINING UNDER DATA RETENTION RESTRICTIONS

Publication number: 20140278426

Abstract: Training speech recognizers, e.g., their language or acoustic models, using actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of an acoustic model which includes dynamically shredding a speech corpus to produce text segments and depersonalized audio features corresponding to the text segments. The method further includes enabling a system to train an acoustic model using the text segments and the depersonalized audio features. Because the data is depersonalized, actual data may be used, enabling speech recognizers to keep up-to-date with user trends in speech and usage, among other benefits.

Type: Application

Filed: March 13, 2013

Publication date: September 18, 2014

Applicant: Nuance Communications, Inc.

Inventors: Uwe Helmut Jost, Philip Charles Woodland, Marcel Katz, Syed Raza Shahid, Paul J. Vozila, William F. Ganong, III

Protection of Private Information in a Client/Server Automatic Speech Recognition System

Publication number: 20140207442

Abstract: A mobile device is adapted for protecting private information on the mobile device in a hybrid automatic speech recognition arrangement. The mobile device includes a speech input component for receiving a speech input signal from a user. Additionally, the mobile device includes a local ASR arrangement for performing local ASR processing of the speech input signal and determining if private information is included within the speech input signal. A control unit on the mobile device obscures private information in the speech input signal if the local ASR arrangement identifies information within a speech recognition result as private information. The control unit releases the speech input signal with the obscured private information for transmission to a remote server for further ASR processing. Results from the remote server's ASR processing are integrated and combined with results from local ASR processing to display information on the mobile device.

Type: Application

Filed: January 24, 2013

Publication date: July 24, 2014

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: William F. Ganong, III, Paul J. Vozila

Method and apparatus for processing spoken search queries

Patent number: 8666963

Abstract: Some embodiments relate to a method of performing a search for content on the Internet, in which a user may speak a search query and speech recognition may be performed on the spoken query to generate a text search query to be provided to a plurality of search engines. This enables a user to speak the search query rather than having to type it, and also allows the user to provide the search query only once, rather than having to provide it separately to multiple different search engines.

Type: Grant

Filed: June 19, 2012

Date of Patent: March 4, 2014

Assignee: Nuance Communications, Inc.

Inventors: Vladimir Sejnoha, William F. Ganong, III, Paul J. Vozila, Nathan M. Bodenstab, Yik-Cheung Tam

Method and Apparatus for Applying Steganography in a Signed Model

Publication number: 20130317817

Abstract: Computer models are powerful resources that can be accessed by remote users. Models can be copied without authorization or can become an out-of-date version. A model with a signature, referred to herein as a “signed” model, can indicate the signature without affecting usage by users who are unaware that the model contains the signature. The signed model can respond to an input in a steganographic way such that only the designer of the model knows that the signature is embedded in the model. The response is a way to check the source or other characteristics of the model. The signed model can include embedded signatures of various degrees of detectability to respond to select steganographic inputs with steganographic outputs. In this manner, a designer of signed models can prove whether an unauthorized copy of the signed model is being used by a third party while using publicly available user interfaces.

Type: Application

Filed: May 22, 2012

Publication date: November 28, 2013

Applicant: Nuance Communications, Inc.

Inventors: William F. Ganong, III, Paul J. Vozila, Puming Zhan
