Patents by Inventor Paul J. Vozila
Paul J. Vozila has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10102849Abstract: Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.Type: GrantFiled: March 24, 2017Date of Patent: October 16, 2018Assignee: Nuance Communications, Inc.Inventors: Paul J. Vozila, Wilson Tam, Nils Lenke
-
Patent number: 9886944Abstract: A mobile device is described which is adapted for automatic speech recognition (ASR). A speech input receives an unknown speech input signal from a user. A local controller determines if a remote ASR processing condition is met, transforms the speech input signal into a selected one of multiple different speech representation types, and sends the transformed speech input signal to a remote server for remote ASR processing. A local ASR arrangement performs local ASR processing of the speech input including processing any speech recognition results received from the remote server.Type: GrantFiled: October 4, 2012Date of Patent: February 6, 2018Assignee: Nuance Communications, Inc.Inventors: Daniel Willett, Jianxiong Wu, Paul J. Vozila, William F. Ganong, III
-
Publication number: 20170200447Abstract: Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.Type: ApplicationFiled: March 24, 2017Publication date: July 13, 2017Applicant: Nuance Communications, Inc.Inventors: Paul J. Vozila, Wilson Tam, Nils Lenke
-
Patent number: 9626960Abstract: Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.Type: GrantFiled: April 25, 2013Date of Patent: April 18, 2017Assignee: Nuance Communications, Inc.Inventors: Paul J. Vozila, Wilson Tam, Nils Lenke
-
Patent number: 9524716Abstract: Some embodiments relate to using an unnormalized neural network language model in connection with a speech processing application. The techniques include obtaining a language segment sequence comprising one or more language segments in a vocabulary; accessing an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each of the plurality of output nodes associated with a respective language segment in the vocabulary; and determining, using the unnormalized neural network language model, a first likelihood that a first language segment in the vocabulary follows the language segment sequence.Type: GrantFiled: April 17, 2015Date of Patent: December 20, 2016Assignee: Nuance Communications, Inc.Inventors: Abhinav Sethy, Stanley Chen, Bhuvana Ramabhadran, Paul J. Vozila
-
Patent number: 9514741Abstract: Training speech recognizers, e.g., their language or acoustic models, using actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of an acoustic model which includes dynamically shredding a speech corpus to produce text segments and depersonalized audio features corresponding to the text segments. The method further includes enabling a system to train an acoustic model using the text segments and the depersonalized audio features. Because the data is depersonalized, actual data may be used, enabling speech recognizers to keep up-to-date with user trends in speech and usage, among other benefits.Type: GrantFiled: March 13, 2013Date of Patent: December 6, 2016Assignee: Nuance Communications, Inc.Inventors: Uwe Helmut Jost, Philip Charles Woodland, Marcel Katz, Syed Raza Shahid, Paul J. Vozila, William F. Ganong, III
-
Patent number: 9514740Abstract: Training speech recognizers, e.g., their language or acoustic models, using actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of a language model which includes producing segments of text in a text corpus and counts corresponding to the segments of text, the text corpus being in a depersonalized state. The method further includes enabling a system to train a language model using the segments of text in the depersonalized state and the counts. Because the data is depersonalized, actual data may be used, enabling speech recognizers to keep up-to-date with user trends in speech and usage, among other benefits.Type: GrantFiled: March 13, 2013Date of Patent: December 6, 2016Assignee: Nuance Communications, Inc.Inventors: Uwe Helmut Jost, Philip Charles Woodland, Marcel Katz, Syed Raza Shahid, Paul J. Vozila, William F. Ganong, III
-
Publication number: 20160307564Abstract: Some embodiments relate to using an unnormalized neural network language model in connection with a speech processing application. The techniques include obtaining a language segment sequence comprising one or more language segments in a vocabulary; accessing an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each of the plurality of output nodes associated with a respective language segment in the vocabulary; and determining, using the unnormalized neural network language model, a first likelihood that a first language segment in the vocabulary follows the language segment sequence.Type: ApplicationFiled: April 17, 2015Publication date: October 20, 2016Applicant: Nuance Communications, Inc.Inventors: Abhinav Sethy, Stanley Chen, Bhuvana Ramabhadran, Paul J. Vozila
-
Patent number: 9311291Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.Type: GrantFiled: September 9, 2013Date of Patent: April 12, 2016Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Nathan M. Bodenstab, Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura, Paul J. Vozila
-
Patent number: 9251135Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.Type: GrantFiled: August 13, 2013Date of Patent: February 2, 2016Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Nathan M. Bodenstab, Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura, Paul J. Vozila
-
Publication number: 20150279352Abstract: A mobile device is described which is adapted for automatic speech recognition (ASR). A speech input receives an unknown speech input signal from a user. A local controller determines if a remote ASR processing condition is met, transforms the speech input signal into a selected one of multiple different speech representation types, and sends the transformed speech input signal to a remote server for remote ASR processing. A local ASR arrangement performs local ASR processing of the speech input including processing any speech recognition results received from the remote server.Type: ApplicationFiled: October 4, 2012Publication date: October 1, 2015Applicant: Nuance Communications, Inc.Inventors: Daniel Willett, Jianxiong Wu, Paul J. Vozila, William F. Ganong, III
-
Patent number: 9131369Abstract: A mobile device is adapted for protecting private information on the mobile device in a hybrid automatic speech recognition arrangement. The mobile device includes a speech input component for receiving a speech input signal from a user. Additionally, the mobile device includes a local ASR arrangement for performing local ASR processing of the speech input signal and determining if private information is included within the speech input signal. A control unit on the mobile device obscures private information in the speech input signal if the local ASR arrangement identifies information within a speech recognition result as private information. The control unit releases the speech input signal with the obscured private information for transmission to a remote server for further ASR processing. Results from the remote server's ASR processing are integrated and combined with results from local ASR processing to display information on the mobile device.Type: GrantFiled: January 24, 2013Date of Patent: September 8, 2015Assignee: Nuance Communications, Inc.Inventors: William F. Ganong, III, Paul J. Vozila
-
Publication number: 20150051899Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.Type: ApplicationFiled: August 13, 2013Publication date: February 19, 2015Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Nathan M. Bodenstab, Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura, Paul J. Vozila
-
Publication number: 20150051902Abstract: Methods and a system for calculating N-gram probabilities in a language model. A method includes counting N-grams in each page of a plurality of pages or in each document of a plurality of documents to obtain respective N-gram counts therefor. The method further includes applying weights to the respective N-gram counts based on at least one of view counts and rankings to obtain weighted respective N-gram counts. The view counts and the rankings are determined with respect to the plurality of pages or the plurality of documents. The method also includes merging the weighted respective N-gram counts to obtain merged weighted respective N-gram counts for the plurality of pages or the plurality of documents. The method additionally includes calculating a respective probability for each of the N-grams based on the merged weighted respective N-gram counts.Type: ApplicationFiled: September 9, 2013Publication date: February 19, 2015Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Nathan M. Bodenstab, Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura, Paul J. Vozila
-
Publication number: 20140324434Abstract: Techniques for generating language models. The techniques include: obtaining language data comprising training data and associated values for one or more metadata attributes, the language data comprising a plurality of instances of language data, an instance of language data comprising an instance of training data and one or more metadata attribute values associated with the instance of training data; identifying, by processing the language data using at least one processor, a set of one or more of the metadata attributes to use for clustering the instances of training data into a plurality of clusters; clustering the training data instances based on their respective values for the identified set of metadata attributes into the plurality of clusters; and generating a language model for each of the plurality of clusters.Type: ApplicationFiled: April 25, 2013Publication date: October 30, 2014Applicant: Nuance Communications, Inc.Inventors: Paul J. Vozila, Wilson Tam, Nils Lenke
-
Publication number: 20140278425Abstract: Training speech recognizers, e.g., their language or acoustic models, using actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of a language model which includes producing segments of text in a text corpus and counts corresponding to the segments of text, the text corpus being in a depersonalized state. The method further includes enabling a system to train a language model using the segments of text in the depersonalized state and the counts. Because the data is depersonalized, actual data may be used, enabling speech recognizers to keep up-to-date with user trends in speech and usage, among other benefits.Type: ApplicationFiled: March 13, 2013Publication date: September 18, 2014Inventors: Uwe Helmut Jost, Philip Charles Woodland, Marcel Katz, Syed Raza Shahid, Paul J. Vozila, William F. Ganong, III
-
Publication number: 20140278426Abstract: Training speech recognizers, e.g., their language or acoustic models, using actual user data is useful, but retaining personally identifiable information may be restricted in certain environments due to regulations. Accordingly, a method or system is provided for enabling training of an acoustic model which includes dynamically shredding a speech corpus to produce text segments and depersonalized audio features corresponding to the text segments. The method further includes enabling a system to train an acoustic model using the text segments and the depersonalized audio features. Because the data is depersonalized, actual data may be used, enabling speech recognizers to keep up-to-date with user trends in speech and usage, among other benefits.Type: ApplicationFiled: March 13, 2013Publication date: September 18, 2014Applicant: Nuance Communications, Inc.Inventors: Uwe Helmut Jost, Philip Charles Woodland, Marcel Katz, Syed Raza Shahid, Paul J. Vozila, William F. Ganong, III
-
Publication number: 20140207442Abstract: A mobile device is adapted for protecting private information on the mobile device in a hybrid automatic speech recognition arrangement. The mobile device includes a speech input component for receiving a speech input signal from a user. Additionally, the mobile device includes a local ASR arrangement for performing local ASR processing of the speech input signal and determining if private information is included within the speech input signal. A control unit on the mobile device obscures private information in the speech input signal if the local ASR arrangement identifies information within a speech recognition result as private information. The control unit releases the speech input signal with the obscured private information for transmission to a remote server for further ASR processing. Results from the remote server's ASR processing are integrated and combined with results from local ASR processing to display information on the mobile device.Type: ApplicationFiled: January 24, 2013Publication date: July 24, 2014Applicant: NUANCE COMMUNICATIONS, INC.Inventors: William F. Ganong, III, Paul J. Vozila
-
Patent number: 8666963Abstract: Some embodiments relate to a method of performing a search for content on the Internet, in which a user may speak a search query and speech recognition may be performed on the spoken query to generate a text search query to be provided to a plurality of search engines. This enables a user to speak the search query rather than having to type it, and also allows the user to provide the search query only once, rather than having to provide it separately to multiple different search engines.Type: GrantFiled: June 19, 2012Date of Patent: March 4, 2014Assignee: Nuance Communications, Inc.Inventors: Vladimir Sejnoha, William F. Ganong, III, Paul J. Vozila, Nathan M. Bodenstab, Yik-Cheung Tam
-
Publication number: 20130317817Abstract: Computer models are powerful resources that can be accessed by remote users. Models can be copied without authorization or can become an out-of-date version. A model with a signature, referred to herein as a “signed” model, can indicate the signature without affecting usage by users who are unaware that the model contains the signature. The signed model can respond to an input in a steganographic way such that only the designer of the model knows that the signature is embedded in the model. The response is a way to check the source or other characteristics of the model. The signed model can include embedded signatures of various degrees of detectability to respond to select steganographic inputs with steganographic outputs. In this manner, a designer of signed models can prove whether an unauthorized copy of the signed model is being used by a third party while using publically-available user interfaces.Type: ApplicationFiled: May 22, 2012Publication date: November 28, 2013Applicant: Nuance Communications, Inc.Inventors: William F. Ganong, III, Paul J. Vozila, Puming Zhan