Patents by Inventor Michael A. Picheny

Michael A. Picheny has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Using closed captions as parallel training data for customization of closed captioning systems

Patent number: 11250872

Abstract: Method, apparatus, and computer program product are provided for customizing an automatic closed captioning system. In some embodiments, at a data use (DU) location, an automatic closed captioning system that includes a base model is provided, search criteria are defined to request from one or more data collection (DC) locations, a search request based on the search criteria is sent to the one or more DC locations, relevant closed caption data from the one or more DC locations are received responsive to the search request, the received relevant closed caption data are processed by computing a confidence score for each of a plurality of data sub-sets of the received relevant closed caption data and selecting one or more of the data sub-sets based on the confidence scores, and the automatic closed captioning system is customized by using the selected one or more data sub-sets to train the base model.

Type: Grant

Filed: December 14, 2019

Date of Patent: February 15, 2022

Assignee: International Business Machines Corporation

Inventors: Samuel Thomas, Yinghui Huang, Masayuki Suzuki, Zoltan Tueske, Laurence P. Sansone, Michael A. Picheny
USING CLOSED CAPTIONS AS PARALLEL TRAINING DATA FOR CUSTOMIZATION OF CLOSED CAPTIONING SYSTEMS

Publication number: 20210183404

Abstract: Method, apparatus, and computer program product are provided for customizing an automatic closed captioning system. In some embodiments, at a data use (DU) location, an automatic closed captioning system that includes a base model is provided, search criteria are defined to request from one or more data collection (DC) locations, a search request based on the search criteria is sent to the one or more DC locations, relevant closed caption data from the one or more DC locations are received responsive to the search request, the received relevant closed caption data are processed by computing a confidence score for each of a plurality of data sub-sets of the received relevant closed caption data and selecting one or more of the data sub-sets based on the confidence scores, and the automatic closed captioning system is customized by using the selected one or more data sub-sets to train the base model.

Type: Application

Filed: December 14, 2019

Publication date: June 17, 2021

Inventors: Samuel Thomas, Yinghui Huang, Masayuki Suzuki, Zoltan Tueske, Laurence P. Sansone, Michael A. Picheny
Smart medical room optimization of speech recognition systems

Patent number: 10726844

Abstract: A method, computer system, and a computer program product for optimizing speech recognition in a smart medical room. The present invention may include selecting, from a database, one or more speech domain models based on a plurality of signals from a plurality of biometric sensors associated with a plurality of medical equipment, wherein the one or more speech domain models are trained with one or more feedback from a clinician based on a medical encounter and from a continuous feedback display in the smart medical room, wherein the one or more feedback from the clinician is based on an optional notification to the clinician to confirm the one or more speech models in use.

Type: Grant

Filed: September 9, 2019

Date of Patent: July 28, 2020

Assignee: International Business Machines Corporation

Inventors: Andrew J. Lavery, Kenney Ng, Michael A. Picheny, Paul C. Tang
SMART MEDICAL ROOM OPTIMIZATION OF SPEECH RECOGNITION SYSTEMS

Publication number: 20200105271

Abstract: A method, computer system, and a computer program product for optimizing speech recognition in a smart medical room. The present invention may include selecting, from a database, one or more speech domain models based on a plurality of signals from a plurality of biometric sensors associated with a plurality of medical equipment, wherein the one or more speech domain models are trained with one or more feedback from a clinician based on a medical encounter and from a continuous feedback display in the smart medical room, wherein the one or more feedback from the clinician is based on an optional notification to the clinician to confirm the one or more speech models in use.

Type: Application

Filed: September 9, 2019

Publication date: April 2, 2020

Inventors: Andrew J. Lavery, Kenney Ng, Michael A. Picheny, Paul C. Tang
Smart medical room optimization of speech recognition systems

Patent number: 10510348

Abstract: A method, computer system, and a computer program product for optimizing speech recognition in a smart medical room. The present invention may include receiving a piece of verbal data associated with a medical encounter from one or more audio recording devices. The present invention may also include accessing a plurality of signals from a plurality of biometric sensors associated with a plurality of medical equipment associated with the smart medical room based on the received piece of verbal data associated with the medical encounter. The present invention may further include selecting, from a database, one or more speech domain models based on the accessed plurality of signals from the plurality of biometric sensors associated with the plurality of medical equipment, wherein the one or more speech domain models are utilized to optimize a transcription of speech during the medical encounter in the smart medical room.

Type: Grant

Filed: September 28, 2018

Date of Patent: December 17, 2019

Assignee: International Business Machines Corporation

Inventors: Andrew J. Lavery, Kenney Ng, Michael A. Picheny, Paul C. Tang
Generalized Sigmoids and Activation Function Learning

Publication number: 20180005111

Abstract: Disclosed herein are systems, methods, and computer-readable media for classifying a set of inputs via a supervised classifier model that utilizes a novel activation function that provides the capability to learn a scale parameter in addition to a bias parameter and other weight parameters.

Type: Application

Filed: June 30, 2016

Publication date: January 4, 2018

Inventors: Upendra V. Chaudhari, Michael A. Picheny
Method and system for order-free spoken term detection

Patent number: 9704482

Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.

Type: Grant

Filed: March 11, 2015

Date of Patent: July 11, 2017

Assignee: International Business Machines Corporation

Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
Method and system for order-free spoken term detection

Patent number: 9697830

Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.

Type: Grant

Filed: June 25, 2015

Date of Patent: July 4, 2017

Assignee: International Business Machines Corporation

Inventors: Brian E. D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
AUTONOMOUS COLLABORATION AGENT FOR MEETINGS

Publication number: 20170154264

Abstract: A method, executed by a computer, includes monitoring a conversation between a plurality of meeting participants, identifying a conversational focus within the conversation, generating at least one question corresponding to the conversational focus, and retrieving at least one answer corresponding to the at least one question. A computer system and computer program product corresponding to the method are also disclosed herein.

Type: Application

Filed: November 30, 2015

Publication date: June 1, 2017

Inventors: Stanley Chen, Kenneth W. Church, Robert G. Farrell, Vaibhava Goel, Lidia L. Mangu, Etienne Marcheret, Michael A. Picheny, Bhuvana Ramabhadran, Laurence P. Sansone, Abhinav Sethy, Samuel Thomas
METHOD AND SYSTEM FOR ORDER-FREE SPOKEN TERM DETECTION

Publication number: 20160267906

Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.

Type: Application

Filed: March 11, 2015

Publication date: September 15, 2016

Inventors: Brian E.D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
METHOD AND SYSTEM FOR ORDER-FREE SPOKEN TERM DETECTION

Publication number: 20160267907

Abstract: A method for spoken term detection, comprising generating a time-marked word list, wherein the time-marked word list is an output of an automatic speech recognition system, generating an index from the time-marked word list, wherein generating the index comprises creating a word loop weighted finite state transducer for each utterance, i, receiving a plurality of keyword queries, and searching the index for a plurality of keyword hits.

Type: Application

Filed: June 25, 2015

Publication date: September 15, 2016

Inventors: Brian E.D. Kingsbury, Lidia Mangu, Michael A. Picheny, George A. Saon
Translating between spoken and written language

Patent number: 9195650

Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.

Type: Grant

Filed: September 23, 2014

Date of Patent: November 24, 2015

Assignee: Nuance Communications, Inc.

Inventors: Sara H. Basson, Rick A. Hamilton, II, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael A. Picheny, Bhuvana Ramabhadran, Tara N. Sainath
TRANSLATING BETWEEN SPOKEN AND WRITTEN LANGUAGE

Publication number: 20150120275

Abstract: Techniques for converting spoken speech into written speech are provided. The techniques include transcribing input speech via speech recognition, mapping each spoken utterance from input speech into a corresponding formal utterance, and mapping each formal utterance into a stylistically formatted written utterance.

Type: Application

Filed: September 23, 2014

Publication date: April 30, 2015

Applicant: Nuance Communications, Inc.

Inventors: Sara H. Basson, Rick A. Hamilton, II, Dan Ning Jiang, Dimitri Kanevsky, David Nahamoo, Michael A. Picheny, Bhuvana Ramabhadran, Tara N. Sainath
Method, apparatus and computer program providing a multi-speaker database for concatenative text-to-speech synthesis

Patent number: 7716052

Abstract: A method, apparatus and a computer program product to generate an audible speech word that corresponds to text. The method includes providing a text word and, in response to the text word, processing pre-recorded speech segments that are derived from a plurality of speakers to selectively concatenate together speech segments based on at least one cost function to form audio data for generating an audible speech word that corresponds to the text word. A data structure is also provided for use in a concatenative text-to-speech system that includes a plurality of speech segments derived from a plurality of speakers, where each speech segment includes an associated attribute vector each of which is comprised of at least one attribute vector element that identifies the speaker from which the speech segment was derived.

Type: Grant

Filed: April 7, 2005

Date of Patent: May 11, 2010

Assignee: Nuance Communications, Inc.

Inventors: Andrew S. Aaron, Ellen M. Eide, Wael M. Hamza, Michael A. Picheny, Charles T. Rutherfoord, Zhi Wei Shuang, Maria E. Smith
System and method for dynamically selecting among TTS systems

Patent number: 7702510

Abstract: Systems and methods for dynamically selecting among text-to-speech (TTS) systems. Exemplary embodiments of the systems and methods include identifying text for converting into a speech waveform, synthesizing said text by three TTS systems, generating a candidate waveform from each of the three systems, generating a score from each of the three systems, comparing each of the three scores, selecting a score based on a criteria and selecting one of the three waveforms based on the selected of the three scores.

Type: Grant

Filed: January 12, 2007

Date of Patent: April 20, 2010

Assignee: Nuance Communications, Inc.

Inventors: Ellen M. Eide, Raul Fernandez, Wael M. Hamza, Michael A. Picheny
Semantic language modeling and confidence measurement

Patent number: 7475015

Abstract: A system and method for speech recognition includes generating a set of likely hypotheses in recognizing speech, rescoring the likely hypotheses by using semantic content by employing semantic structured language models, and scoring parse trees to identify a best sentence according to the sentence's parse tree by employing the semantic structured language models to clarify the recognized speech.

Type: Grant

Filed: September 5, 2003

Date of Patent: January 6, 2009

Assignee: International Business Machines Corporation

Inventors: Mark E. Epstein, Hakan Erdogan, Yuqing Gao, Michael A. Picheny, Ruhi Sarikaya
SYSTEM AND METHOD FOR DYNAMICALLY SELECTING AMONG TTS SYSTEMS

Publication number: 20080172234

Abstract: Systems and methods for dynamically selecting among text-to-speech (TTS) systems. Exemplary embodiments of the systems and methods include identifying text for converting into a speech waveform, synthesizing said text by three TTS systems, generating a candidate waveform from each of the three systems, generating a score from each of the three systems, comparing each of the three scores, selecting a score based on a criteria and selecting one of the three waveforms based on the selected of the three scores.

Type: Application

Filed: January 12, 2007

Publication date: July 17, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ellen M. Eide, Raul Fernandez, Wael M. Hamza, Michael A. Picheny
METHODS AND COMPUTER PROGRAM PRODUCTS FOR PROVIDING PARAPHRASING IN A TEXT-TO-SPEECH SYSTEM

Publication number: 20080167876

Abstract: A method and computer program product for providing paraphrasing in a text-to-speech (TTS) system is provided. The method includes receiving an input text, parsing the input text, and determining a paraphrase of the input text. The method also includes synthesizing the paraphrase into synthesized speech. The method further includes selecting synthesized speech to output, which includes: assigning a score to each synthesized speech associated with each paraphrase, comparing the score of each synthesized speech associated with each paraphrase, and selecting the top-scoring synthesized speech to output. Furthermore, the method includes outputting the selected synthesized speech.

Type: Application

Filed: January 4, 2007

Publication date: July 10, 2008

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Raimo Bakis, Ellen M. Eide, Wael Hamza, Michael A. Picheny
Feature vector-based apparatus and method for robust pattern recognition

Patent number: 7054810

Abstract: N sets of feature vectors are generated from a set of observation vectors which are indicative of a pattern which it is desired to recognize. At least one of the sets of feature vectors is different than at least one other of the sets of feature vectors, and is preselected for purposes of containing at least some complimentary information with regard to the at least one other set of feature vectors. The N sets of feature vectors are combined in a manner to obtain an optimized set of feature vectors which best represents the pattern. The combination is performed via one of a weighted likelihood combination scheme and a rank-based state-selection scheme; preferably, it is done in accordance with an equation set forth herein. In one aspect, a weighted likelihood combination can be employed, while in another aspect, rank-based state selection can be employed. An apparatus suitable for performing the method is described, and implementation in a computer program product is also contemplated.

Type: Grant

Filed: October 1, 2001

Date of Patent: May 30, 2006

Assignee: International Business Machines Corporation

Inventors: Yuging Gao, Michael A. Picheny, Bhuvana Ramabhadran
Method and system for text-to-speech caching

Patent number: 7043432

Abstract: In a text-to-speech system, a method of converting text-to-speech can include receiving a text input and comparing the received text input to at least one entry in a text-to-speech cache memory. Each entry in the text-to-speech cache memory can specify a corresponding spoken output. If the text input matches one of the entries in the text-to-speech cache memory, the cached speech output specified by the matching entry can be provided.

Type: Grant

Filed: August 29, 2001

Date of Patent: May 9, 2006

Assignee: International Business Machines Corporation

Inventors: Raimo Bakis, Hari Chittaluru, Edward A. Epstein, Steven J. Friedland, Abraham Ittycheriah, Stephen G. Lawrence, Michael A. Picheny, Charles Rutherfoord, Maria E. Smith

1 2 3 next