Patents by Inventor Kenneth W. Church
Kenneth W. Church has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11145308
Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence using the corresponding related portion, including the repetition indication, of each of the candidates.
Type: Grant
Filed: September 20, 2019
Date of Patent: October 12, 2021
Assignee: International Business Machines Corporation
Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
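As a rough illustration of the repetition idea in this abstract, the following is a minimal sketch assuming the target symbol sequence is a spoken digit string: it finds digit-run candidates in a transcript, extracts a context window around each, tags repeated partial sequences with a hypothetical <rep> marker, and ranks candidates with a toy score. The function names, window size, marker, and scoring heuristic are illustrative assumptions, not the patented estimation method.

    import re

    # Hypothetical sketch: candidate detection, context extraction, repetition
    # labeling, and a toy score.  Thresholds and tags are invented for illustration.

    def find_candidates(transcript, min_len=4):
        """Return (start, end, digits) for runs of at least min_len digits."""
        pattern = re.compile(r"(?:\d[\s-]?){%d,}" % min_len)
        return [(m.start(), m.end(), re.sub(r"\D", "", m.group()))
                for m in pattern.finditer(transcript)]

    def related_portion(transcript, start, end, window=80):
        """Extract a fixed-size context window around a candidate."""
        return transcript[max(0, start - window):end + window]

    def label_repetition(candidate, context, min_part=3):
        """Wrap repeated partial sequences of the candidate in <rep> tags."""
        labeled = context
        for i in range(len(candidate) - min_part + 1):
            part = candidate[i:i + min_part]
            if labeled.count(part) > 1:
                labeled = labeled.replace(part, "<rep>" + part + "</rep>")
        return labeled

    def score_candidate(candidate, labeled_context):
        """Toy estimate: longer candidates whose parts are repeated score higher."""
        return len(candidate) + 5 * labeled_context.count("<rep>")

    transcript = "my number is 5551234 yes 1234 that is right"
    for start, end, cand in find_candidates(transcript):
        context = label_repetition(cand, related_portion(transcript, start, end))
        print(cand, score_candidate(cand, context))

A real estimator would consume the repetition-labeled context rather than this toy score.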
-
Patent number: 11120802
Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream, producing the spoken words, to which a speaker turn detection (STD) process is applied to identify a number of speaker segments, with each speaker segment ending at a word boundary. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.
Type: Grant
Filed: November 21, 2017
Date of Patent: September 14, 2021
Assignee: International Business Machines Corporation
Inventors: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Miroslav Novak, George A. Saon
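Purely to make the pipeline stages concrete, here is a minimal sketch assuming the ASR output is available as (word, voice-feature) pairs, where a single scalar stands in for an acoustic embedding: it splits the word stream at boundaries where the feature jumps (the turn-detection step) and then greedily clusters the segments by their mean feature. The names, thresholds, and scalar feature are assumptions for illustration, not the patented VAD/ASR/STD components.

    # Illustrative pipeline skeleton, not the patented implementation.

    def detect_turns(words, gap=0.5):
        """Split the word stream into segments at word boundaries where the
        voice feature jumps by more than `gap` (stand-in for turn detection)."""
        segments, current = [], [words[0]]
        for prev, cur in zip(words, words[1:]):
            if abs(cur[1] - prev[1]) > gap:
                segments.append(current)
                current = []
            current.append(cur)
        segments.append(current)
        return segments

    def cluster_segments(segments, tol=0.3):
        """Greedy clustering: attach each segment to the nearest centroid,
        or open a new speaker if none is within `tol`."""
        centroids, labels = [], []
        for seg in segments:
            mean = sum(feat for _, feat in seg) / len(seg)
            best = min(range(len(centroids)),
                       key=lambda i: abs(centroids[i] - mean), default=None)
            if best is None or abs(centroids[best] - mean) > tol:
                centroids.append(mean)
                labels.append(len(centroids) - 1)
            else:
                labels.append(best)
        return labels

    # Toy "ASR output": (word, voice_feature) pairs from two alternating speakers.
    asr_output = [("hello", 0.10), ("there", 0.15), ("hi", 0.90),
                  ("how", 0.95), ("are", 0.92), ("fine", 0.12)]
    print(cluster_segments(detect_turns(asr_output)))   # -> [0, 1, 0]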
-
Patent number: 11019306
Abstract: A method of combining data streams from fixed audio-visual sensors with data streams from personal mobile devices, including forming a communication link with at least one of one or more personal mobile devices; receiving at least one of an audio data stream and/or a video data stream from the at least one of the one or more personal mobile devices; determining the quality of the at least one of the audio data stream and/or the video data stream, wherein the audio data stream and/or the video data stream having a quality above a threshold quality is retained; and combining the retained audio data stream and/or the video data stream with the data streams from the fixed audio-visual sensors.
Type: Grant
Filed: January 9, 2019
Date of Patent: May 25, 2021
Assignee: International Business Machines Corporation
Inventors: Stanley Chen, Kenneth W. Church, Vaibhava Goel, Lidia L. Mangu, Etienne Marcheret, Bhuvana Ramabhadran, Laurence P. Sansone, Abhinav Sethy, Samuel Thomas
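A minimal sketch of the quality-gated merge this abstract describes, assuming each stream is a record with a signal-to-noise figure: mobile-device streams above a threshold are retained and combined with the fixed-sensor streams. The record layout, the SNR measure, and the threshold value are assumptions for illustration only.

    # Hypothetical fixed-sensor streams; field names are illustrative.
    FIXED_STREAMS = [
        {"source": "ceiling-mic-1", "kind": "audio", "snr_db": 24.0},
        {"source": "room-camera", "kind": "video", "snr_db": 30.0},
    ]

    def retain_if_good(stream, threshold_db=15.0):
        """Keep a mobile-device stream only if its quality exceeds the threshold."""
        return stream["snr_db"] > threshold_db

    def combine(mobile_streams, threshold_db=15.0):
        """Merge retained mobile streams with the fixed audio-visual streams."""
        retained = [s for s in mobile_streams if retain_if_good(s, threshold_db)]
        return FIXED_STREAMS + retained

    mobile = [
        {"source": "phone-A", "kind": "audio", "snr_db": 18.5},   # retained
        {"source": "phone-B", "kind": "audio", "snr_db": 6.0},    # dropped: too noisy
    ]
    for stream in combine(mobile):
        print(stream["source"], stream["kind"])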
-
Patent number: 10614797
Abstract: A diarization embodiment may include a system that clusters data up to the current point in time, consolidates it with the past decisions, and then returns the result that minimizes the difference with past decisions. The consolidation may be achieved by permuting the different possible labels and comparing the distances. For speaker diarization, a distance may be determined based on a minimum edit or Hamming distance. The distance may alternatively be a measure other than the minimum edit or Hamming distance. The clustering may have a finite time window over which the analysis is performed.
Type: Grant
Filed: November 30, 2017
Date of Patent: April 7, 2020
Assignee: International Business Machines Corporation
Inventors: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Jason W. Pelecanos, Weizhong Zhu
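The consolidation step lends itself to a short sketch: given the label sequence already returned (the past decisions) and a fresh clustering over a window that overlaps it, try every permutation of the new labels and keep the relabeling with the smallest Hamming distance to the past decisions on the overlap. The two-speaker toy data below is invented; a real system would also cover the edit-distance variant and the finite analysis window mentioned in the abstract.

    from itertools import permutations

    def hamming(a, b):
        """Hamming distance over the overlapping prefix of two label sequences."""
        return sum(x != y for x, y in zip(a, b))

    def consolidate(past, new):
        """Permute the labels of `new` so it best matches `past` on the overlap."""
        labels = sorted(set(new))
        best, best_dist = new, float("inf")
        for perm in permutations(labels):
            mapping = dict(zip(labels, perm))
            relabeled = [mapping[x] for x in new]
            dist = hamming(past, relabeled)
            if dist < best_dist:
                best, best_dist = relabeled, dist
        return best

    past = [0, 0, 1, 1, 1]            # decisions already returned to the caller
    new = [1, 1, 0, 0, 0, 0, 1]       # fresh clustering over a longer window
    print(consolidate(past, new))     # -> [0, 0, 1, 1, 1, 1, 0]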
-
Publication number: 20200013408
Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence using the corresponding related portion, including the repetition indication, of each of the candidates.
Type: Application
Filed: September 20, 2019
Publication date: January 9, 2020
Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
-
Patent number: 10529337
Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence using the corresponding related portion, including the repetition indication, of each of the candidates.
Type: Grant
Filed: January 7, 2019
Date of Patent: January 7, 2020
Assignee: International Business Machines Corporation
Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
-
Patent number: 10468031
Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream, producing the spoken words, to which a speaker turn detection (STD) process is applied to identify a number of speaker segments, with each speaker segment ending at a word boundary. The STD process analyzes the speaker segments using a language model that determines when speaker changes occur. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.
Type: Grant
Filed: November 21, 2017
Date of Patent: November 5, 2019
Assignee: International Business Machines Corporation
Inventors: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Miroslav Novak, George A. Saon
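As a loose illustration of the language-model step that distinguishes this patent from the related one above, the sketch below scores each word boundary with a hypothetical speaker-change probability based on the preceding token and keeps boundaries above a threshold. The cue table, scores, and threshold are invented; the patented approach uses an actual language model rather than a lookup table.

    # Toy stand-in for a language model that flags likely speaker changes.
    CHANGE_AFTER = {"thanks": 0.7, "okay": 0.6}

    def speaker_change_points(words, threshold=0.5):
        """Return word-boundary indices where a speaker change is likely."""
        points = []
        for i, word in enumerate(words[:-1]):
            token = word.lower()
            score = 0.9 if token.endswith("?") else CHANGE_AFTER.get(token, 0.1)
            if score >= threshold:
                points.append(i + 1)          # change occurs before words[i + 1]
        return points

    words = "how can I help you? I need to reset my password okay sure".split()
    print(speaker_change_points(words))       # boundaries after "you?" and "okay"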
-
Patent number: 10354677
Abstract: Identification of the intent of a conversation can be useful for real-time or post-processing purposes. According to example embodiments, a method, and a corresponding apparatus, for identifying at least one intent-bearing utterance in a conversation comprises determining at least one feature for each utterance among a subset of utterances of the conversation; classifying each utterance among the subset of utterances, using a classifier, as an intent classification or a non-intent classification based at least in part on a subset of the at least one determined feature; and selecting at least one utterance, with intent classification, as an intent-bearing utterance based at least in part on classification results by the classifier. Through identification of an intent-bearing utterance, a call center, for example, can provide improved service for callers, such as by more effectively directing a call to a live agent.
Type: Grant
Filed: February 28, 2013
Date of Patent: July 16, 2019
Assignees: Nuance Communications, Inc., International Business Machines Corporation
Inventors: Shajith Ikbal Mohamed, Kenneth W. Church, Ashish Verma, Prasanta Ghosh, Jeffrey N. Marcus
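To ground the abstract in something runnable, here is a hedged sketch: it computes a few simple features per utterance (position in the call, length, presence of intent cue phrases), applies a hand-written scoring rule in place of the trained classifier the abstract refers to, and returns the highest-scoring intent-bearing utterance. The cue list, feature names, weights, and threshold are all assumptions for illustration.

    INTENT_CUES = ("i want", "i need", "i'd like", "can you", "how do i")

    def features(utterance, index, total):
        """A few simple per-utterance features (illustrative, not the patent's)."""
        text = utterance.lower()
        return {
            "position": index / max(total - 1, 1),
            "length": len(text.split()),
            "has_cue": any(cue in text for cue in INTENT_CUES),
        }

    def classify(feats, threshold=1.5):
        """Stand-in classifier: cue phrases plus an early position imply intent."""
        score = 2.0 * feats["has_cue"] + (1.0 - feats["position"]) + 0.05 * feats["length"]
        return score, score > threshold

    def intent_bearing(utterances):
        """Return the highest-scoring utterance classified as intent-bearing."""
        scored = []
        for i, utt in enumerate(utterances):
            score, is_intent = classify(features(utt, i, len(utterances)))
            if is_intent:
                scored.append((score, utt))
        return max(scored)[1] if scored else None

    call = ["hello, thanks for calling",
            "hi, I need to cancel my flight reservation",
            "sure, can I have your booking number"]
    print(intent_bearing(call))   # -> the second utterance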
-
Adaptive selection of message data properties for improving communication throughput and reliability
Patent number: 10305765
Abstract: Embodiments of the present invention provide a computer-implemented method for communicating a reference code for a transaction. The method monitors a communication session conducted between a user and an agent via a communication channel, extracts user and channel properties from the monitored communication session, selects a reference code from a set of reference codes stored in a database, in which the selection is based at least in part on the extracted communication channel properties and the extracted user properties, and then communicates the selected reference code to the user.
Type: Grant
Filed: July 21, 2017
Date of Patent: May 28, 2019
Assignee: International Business Machines Corporation
Inventors: Kenneth W. Church, Martin Franz, Nicholas S. Kersting, Jeffrey S. McCarley, Jason W. Pelecanos, Weizhong Zhu
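A minimal sketch of the selection step, assuming the extracted properties include the channel type, a noise flag, and a user preference: each stored code format is scored for how easily it can be communicated over that channel, and the best match is returned. The code pool, property names, and weights are invented for illustration and are not the patented selection logic.

    # Hypothetical pool of stored reference codes in different formats.
    CODE_POOL = [
        {"code": "7 3 1 9 0 4", "style": "short_digits"},
        {"code": "BLUE-TIGER-42", "style": "word_based"},
        {"code": "X9F2-K7Q1-ZZ38", "style": "alphanumeric"},
    ]

    def score(entry, channel, user):
        """Score a code format against the extracted channel and user properties."""
        total = 0
        if channel["type"] == "voice" and channel["noisy"]:
            total += {"short_digits": 3, "word_based": 2, "alphanumeric": 0}[entry["style"]]
        elif channel["type"] == "sms":
            total += {"alphanumeric": 3, "short_digits": 2, "word_based": 1}[entry["style"]]
        if user.get("prefers_words") and entry["style"] == "word_based":
            total += 2
        return total

    def select_reference_code(channel, user):
        """Return the code whose format scores best for this session."""
        return max(CODE_POOL, key=lambda entry: score(entry, channel, user))["code"]

    print(select_reference_code({"type": "voice", "noisy": True}, {}))   # easy to say aloud
    print(select_reference_code({"type": "sms", "noisy": False}, {}))    # compact for text
-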
Publication number: 20190156832
Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream, producing the spoken words, to which a speaker turn detection (STD) process is applied to identify a number of speaker segments, with each speaker segment ending at a word boundary. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.
Type: Application
Filed: November 21, 2017
Publication date: May 23, 2019
Inventors: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Miroslav Novak, George A. Saon
-
Publication number: 20190156835
Abstract: An approach is provided that receives an audio stream and utilizes a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream, producing the spoken words, to which a speaker turn detection (STD) process is applied to identify a number of speaker segments, with each speaker segment ending at a word boundary. The STD process analyzes the speaker segments using a language model that determines when speaker changes occur. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each of the speaker segments.
Type: Application
Filed: November 21, 2017
Publication date: May 23, 2019
Inventors: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Miroslav Novak, George A. Saon
-
Publication number: 20190149769
Abstract: A method of combining data streams from fixed audio-visual sensors with data streams from personal mobile devices, including forming a communication link with at least one of one or more personal mobile devices; receiving at least one of an audio data stream and/or a video data stream from the at least one of the one or more personal mobile devices; determining the quality of the at least one of the audio data stream and/or the video data stream, wherein the audio data stream and/or the video data stream having a quality above a threshold quality is retained; and combining the retained audio data stream and/or the video data stream with the data streams from the fixed audio-visual sensors.
Type: Application
Filed: January 9, 2019
Publication date: May 16, 2019
Inventors: Stanley Chen, Kenneth W. Church, Vaibhava Goel, Lidia L. Mangu, Etienne Marcheret, Bhuvana Ramabhadran, Laurence P. Sansone, Abhinav Sethy, Samuel Thomas
-
Publication number: 20190139550
Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence using the corresponding related portion, including the repetition indication, of each of the candidates.
Type: Application
Filed: January 7, 2019
Publication date: May 9, 2019
Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
-
Patent number: 10229685
Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence using the corresponding related portion, including the repetition indication, of each of the candidates.
Type: Grant
Filed: January 18, 2017
Date of Patent: March 12, 2019
Assignee: International Business Machines Corporation
Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
-
Patent number: 10230922
Abstract: A method of combining data streams from fixed audio-visual sensors with data streams from personal mobile devices, including forming a communication link with at least one of one or more personal mobile devices; receiving at least one of an audio data stream and/or a video data stream from the at least one of the one or more personal mobile devices; determining the quality of the at least one of the audio data stream and/or the video data stream, wherein the audio data stream and/or the video data stream having a quality above a threshold quality is retained; and combining the retained audio data stream and/or the video data stream with the data streams from the fixed audio-visual sensors.
Type: Grant
Filed: October 2, 2017
Date of Patent: March 12, 2019
Assignee: International Business Machines Corporation
Inventors: Stanley Chen, Kenneth W. Church, Vaibhava Goel, Lidia L. Mangu, Etienne Marcheret, Bhuvana Ramabhadran, Laurence P. Sansone, Abhinav Sethy, Samuel Thomas
-
Adaptive selection of message data properties for improving communication throughput and reliability
Publication number: 20190028370
Abstract: Embodiments of the present invention provide a computer-implemented method for communicating a reference code for a transaction. The method monitors a communication session conducted between a user and an agent via a communication channel, extracts user and channel properties from the monitored communication session, selects a reference code from a set of reference codes stored in a database, in which the selection is based at least in part on the extracted communication channel properties and the extracted user properties, and then communicates the selected reference code to the user.
Type: Application
Filed: July 21, 2017
Publication date: January 24, 2019
Inventors: Kenneth W. Church, Martin Franz, Nicholas S. Kersting, Jeffrey S. McCarley, Jason W. Pelecanos, Weizhong Zhu
-
Patent number: 10147438
Abstract: Embodiments of the invention include methods, systems, and computer program products for role modeling. Aspects of the invention include receiving, by a processor, audio data, wherein the audio data includes a plurality of audio conversations for one or more speakers. Each of the plurality of audio conversations is partitioned into one or more segments. A speaker is associated with each of the one or more segments. The one or more segments for each of the plurality of audio conversations are labeled with roles utilizing a speaker recognition engine. Speakers are clustered based at least in part on the number of times the speakers are present in an audio conversation.
Type: Grant
Filed: March 2, 2017
Date of Patent: December 4, 2018
Assignee: International Business Machines Corporation
Inventors: Kenneth W. Church, Jason W. Pelecanos, Josef Vopicka, Weizhong Zhu
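The clustering criterion in the last sentence of the abstract can be illustrated with a tiny sketch: speakers who recur across many conversations are grouped as likely agents, while speakers seen only once are treated as customers. The data layout, speaker IDs, and the recurrence threshold are assumptions; the patented system labels roles with a speaker recognition engine rather than this counting rule.

    from collections import Counter

    conversations = {                      # hypothetical speaker lists per call
        "call-1": ["spk-A", "spk-X"],
        "call-2": ["spk-A", "spk-Y"],
        "call-3": ["spk-B", "spk-Z"],
        "call-4": ["spk-A", "spk-W"],
    }

    def label_roles(convs, agent_min_calls=2):
        """Label speakers by how many distinct conversations they appear in."""
        counts = Counter(spk for speakers in convs.values() for spk in set(speakers))
        return {spk: ("agent" if n >= agent_min_calls else "customer")
                for spk, n in counts.items()}

    for speaker, role in sorted(label_roles(conversations).items()):
        print(speaker, role)               # spk-A is labeled "agent"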
-
Publication number: 20180254051
Abstract: Embodiments of the invention include methods, systems, and computer program products for role modeling. Aspects of the invention include receiving, by a processor, audio data, wherein the audio data includes a plurality of audio conversations for one or more speakers. Each of the plurality of audio conversations is partitioned into one or more segments. A speaker is associated with each of the one or more segments. The one or more segments for each of the plurality of audio conversations are labeled with roles utilizing a speaker recognition engine. Speakers are clustered based at least in part on the number of times the speakers are present in an audio conversation.
Type: Application
Filed: March 2, 2017
Publication date: September 6, 2018
Inventors: Kenneth W. Church, Jason W. Pelecanos, Josef Vopicka, Weizhong Zhu
-
Publication number: 20180204567
Abstract: Symbol sequences are estimated using a computer-implemented method including detecting one or more candidates of a target symbol sequence from speech-to-text data, extracting a related portion of each candidate from the speech-to-text data, detecting repetition of at least a partial sequence of each candidate within the related portion of the corresponding candidate, labeling the detected repetition with a repetition indication, and estimating whether each candidate is the target symbol sequence using the corresponding related portion, including the repetition indication, of each of the candidates.
Type: Application
Filed: January 18, 2017
Publication date: July 19, 2018
Inventors: Kenneth W. Church, Gakuto Kurata, Bhuvana Ramabhadran, Abhinav Sethy, Masayuki Suzuki, Ryuki Tachibana
-
Publication number: 20180158451
Abstract: A diarization embodiment may include a system that clusters data up to the current point in time, consolidates it with the past decisions, and then returns the result that minimizes the difference with past decisions. The consolidation may be achieved by permuting the different possible labels and comparing the distances. For speaker diarization, a distance may be determined based on a minimum edit or Hamming distance. The distance may alternatively be a measure other than the minimum edit or Hamming distance. The clustering may have a finite time window over which the analysis is performed.
Type: Application
Filed: November 30, 2017
Publication date: June 7, 2018
Inventors: Kenneth W. Church, Dimitrios B. Dimitriadis, Petr Fousek, Jason W. Pelecanos, Weizhong Zhu