Patents by Inventor DIMITRIOS B. DIMITRIADIS

DIMITRIOS B. DIMITRIADIS has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Using long short-term memory recurrent neural network for speaker diarization segmentation

Patent number: 10249292

Abstract: Speaker diarization is performed on audio data including speech by a first speaker, speech by a second speaker, and silence. The speaker diarization includes segmenting the audio data using a long short-term memory (LSTM) recurrent neural network (RNN) to identify change points of the audio data that divide the audio data into segments. The speaker diarization includes assigning a label selected from a group of labels to each segment of the audio data using the LSTM RNN. The group of labels includes labels corresponding to the first speaker, the second speaker, and the silence. Each change point is a transition from one of the first speaker, the second speaker, and the silence to a different one of the first speaker, the second speaker, and the silence. Speech recognition can be performed on the segments that each correspond to one of the first speaker and the second speaker.

Type: Grant

Filed: December 14, 2016

Date of Patent: April 2, 2019

Assignee: International Business Machines Corporation

Inventors: Dimitrios B. Dimitriadis, David C. Haws, Michael Picheny, George Saon, Samuel Thomas
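Illustration (not taken from the patent): the abstract of patent 10249292 above describes change points as transitions between the first speaker, the second speaker, and silence. The minimal sketch below assumes that some classifier (the LSTM RNN in the patent, not modeled here) has already produced one label per audio frame; the label names `spk1`, `spk2`, `sil` and both function names are hypothetical.

```python
from itertools import groupby


def segment_labels(frame_labels):
    """Collapse per-frame labels into (start, end, label) segments (end exclusive)."""
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        segments.append((start, start + length, label))
        start += length
    return segments


def change_points(segments):
    """A change point is the first frame of every segment except the first one."""
    return [start for start, _, _ in segments[1:]]


# Assumed per-frame output of a frame classifier (hypothetical data).
frames = ["sil", "sil", "spk1", "spk1", "spk1", "spk2", "sil", "spk1"]
segments = segment_labels(frames)
points = change_points(segments)

# Only speech segments would be passed on to speech recognition.
speech_segments = [s for s in segments if s[2] != "sil"]
print(segments, points, speech_segments)
```

Each change point here marks a switch from one of the three labels to a different one, which matches the definition in the abstract; how the labels are produced is left open.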
Self-organized acoustic signal cancellation over a network

Patent number: 10242658

Abstract: A system for self-organized acoustic signal cancellation over a network is disclosed. The system may transmit an acoustic sounding signal to an interfering device so that a channel measurement may be performed for a channel between the interfering device and an interferee device. The system may receive the channel measurement for the channel from the interfering device and also receive a digitized version of an audio interference signal associated with the interfering device. Based on the channel measurement and the digital version of the interference signal, the system may calculate a cancellation signal prior to the arrival of the original over-the-air audio interference signal that corresponds to the digital version of the audio interference signal. The system may then apply the cancellation signal to an audio signal associated with the interferee device to remove the interference signal from the audio signal.

Type: Grant

Filed: November 9, 2017

Date of Patent: March 26, 2019

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Lusheng Ji, Donald J. Bowen, Dimitrios B. Dimitriadis, Horst J. Schroeter
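Illustration (not taken from the patent): the abstract of patent 10242658 above computes a cancellation signal from a channel measurement and a digital copy of the interference. One common way to model this, assumed here and not confirmed by the listing, is to treat the channel measurement as a finite impulse response, convolve it with the digital interference to predict what arrives over the air, and subtract that prediction. All names and values are hypothetical.

```python
def convolve(signal, impulse_response):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def cancel(received, interference_digital, channel):
    """Subtract the predicted over-the-air interference from the received audio."""
    predicted = convolve(interference_digital, channel)
    return [
        r - (predicted[n] if n < len(predicted) else 0.0)
        for n, r in enumerate(received)
    ]


# Assumed channel impulse response and signals (hypothetical values).
channel = [0.5, 0.25]
interference = [1.0, -1.0, 2.0]
desired = [0.5, 0.25, -0.5, 1.0]

# Simulate what the interferee device records: desired audio plus interference.
over_the_air = convolve(interference, channel)
received = [d + i for d, i in zip(desired, over_the_air)]

cleaned = cancel(received, interference, channel)
print(cleaned)
```

Because the digital copy travels over the network faster than sound travels through air, the prediction can in principle be ready before the acoustic interference arrives; the sketch ignores timing and alignment entirely.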
Acoustic enhancement by leveraging metadata to mitigate the impact of noisy environments

Patent number: 10170133

Abstract: A system for cloud acoustic enhancement is disclosed. In particular, the system may leverage metadata and cloud-computing network resources to mitigate the impact of noisy environments that may potentially interfere with user communications. In order to do so, the system may receive an audio stream including an audio signal associated with a user, and determine if the audio stream also includes an interference signal. The system may determine that the audio stream includes the interference signal if a portion of the audio stream correlates with metadata that identifies the interference signal. If the audio stream is determined to include the interference signal, the system may cancel the interference signal from the audio stream by utilizing the metadata and the cloud-computing network resources. Once the interference signal is cancelled, the system may transmit the audio stream including the audio signal associated with the user to an intended destination.

Type: Grant

Filed: August 22, 2017

Date of Patent: January 1, 2019

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji, Horst J. Schroeter
ACOUSTIC ENVIRONMENT RECOGNIZER FOR OPTIMAL SPEECH PROCESSING

Publication number: 20180197558

Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. In order to do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine if the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.

Type: Application

Filed: March 5, 2018

Publication date: July 12, 2018

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Horst J. Schroeter, Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji
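Illustration (not taken from the patent): the abstract of publication 20180197558 above compares first metadata from the ambient noise against second metadata stored in an environment profile, then selects a processing scheme. The sketch below assumes metadata is a flat key-value mapping and that "corresponds" means every profile field matches exactly; the field names, profile names, and scheme names are all hypothetical.

```python
def select_scheme(first_metadata, profiles):
    """Return the scheme of the first profile whose metadata all matches, else None."""
    for profile in profiles:
        second_metadata = profile["metadata"]
        if all(first_metadata.get(k) == v for k, v in second_metadata.items()):
            return profile["scheme"]
    return None


# Hypothetical acoustic environment profiles.
profiles = [
    {
        "name": "car",
        "metadata": {"noise_type": "engine", "location": "vehicle"},
        "scheme": "highpass_filter",
    },
    {
        "name": "cafe",
        "metadata": {"noise_type": "babble"},
        "scheme": "spectral_subtraction",
    },
]

# Hypothetical metadata extracted from the ambient noise of an audio stream.
cafe_scheme = select_scheme({"noise_type": "babble", "level_db": 65}, profiles)
unknown_scheme = select_scheme({"noise_type": "wind"}, profiles)
print(cafe_scheme, unknown_scheme)
```

A real system would likely use a similarity threshold rather than exact equality; exact matching is used here only to keep the selection logic visible.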
USING LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORK FOR SPEAKER DIARIZATION SEGMENTATION

Publication number: 20180166066

Abstract: Speaker diarization is performed on audio data including speech by a first speaker, speech by a second speaker, and silence. The speaker diarization includes segmenting the audio data using a long short-term memory (LSTM) recurrent neural network (RNN) to identify change points of the audio data that divide the audio data into segments. The speaker diarization includes assigning a label selected from a group of labels to each segment of the audio data using the LSTM RNN. The group of labels includes labels corresponding to the first speaker, the second speaker, and the silence. Each change point is a transition from one of the first speaker, the second speaker, and the silence to a different one of the first speaker, the second speaker, and the silence. Speech recognition can be performed on the segments that each correspond to one of the first speaker and the second speaker.

Type: Application

Filed: December 14, 2016

Publication date: June 14, 2018

Inventors: Dimitrios B. Dimitriadis, David C. Haws, Michael Picheny, George Saon, Samuel Thomas
USING RECURRENT NEURAL NETWORK FOR PARTITIONING OF AUDIO DATA INTO SEGMENTS THAT EACH CORRESPOND TO A SPEECH FEATURE CLUSTER IDENTIFIER

Publication number: 20180166067

Abstract: Audio features, such as perceptual linear prediction (PLP) features and time derivatives thereof, are extracted from frames of training audio data including speech by multiple speakers, and silence, such as by using linear discriminant analysis (LDA). The frames are clustered into k-means clusters using distance measures, such as Mahalanobis distance measures, of means and variances of the extracted audio features of the frames. A recurrent neural network (RNN) is trained on the extracted audio features of the frames and cluster identifiers of the k-means clusters into which the frames have been clustered. The RNN is applied to audio data to segment audio data into segments that each correspond to one of the cluster identifiers. Each segment can be assigned a label corresponding to one of the cluster identifiers. Speech recognition can be performed on the segments.

Type: Application

Filed: December 14, 2016

Publication date: June 14, 2018

Inventors: Dimitrios B. Dimitriadis, David C. Haws, Michael Picheny, George Saon, Samuel Thomas
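Illustration (not taken from the patent): the abstract of publication 20180166067 above clusters frames with a Mahalanobis distance and uses the cluster identifiers as training targets for an RNN. The sketch below shows only the assignment step, assuming diagonal covariances and cluster means and variances that are already known; it does not run full k-means or train a network, and all names and values are hypothetical.

```python
def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance under an assumed diagonal covariance."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var)) ** 0.5


def assign_clusters(frames, means, variances):
    """Give each frame the identifier of its nearest cluster."""
    ids = []
    for frame in frames:
        dists = [mahalanobis_diag(frame, m, v) for m, v in zip(means, variances)]
        ids.append(dists.index(min(dists)))
    return ids


# Hypothetical 2-dimensional features and two clusters.
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [4.0, 4.0]]
frames = [[0.5, -0.5], [3.0, 5.0], [1.5, 1.5]]

cluster_ids = assign_clusters(frames, means, variances)
print(cluster_ids)
```

The third frame is closer to the first mean in Euclidean terms but is assigned to the second cluster because that cluster has the larger variance, which is the effect a variance-aware distance is meant to capture.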
PREFIX METHODS FOR DIARIZATION IN STREAMING MODE

Publication number: 20180158451

Abstract: A diarization embodiment may include a system that clusters data up to a current point in time and consolidates it with the past decisions, and then returns the result that minimizes the difference with past decisions. The consolidation may be achieved by performing a permutation of the different possible labels and comparing the distance. For speaker diarization, a distance may be determined based on a minimum edit or Hamming distance. The distance may alternatively be a measure other than the minimum edit or Hamming distance. The clustering may have a finite time window over which the analysis is performed.

Type: Application

Filed: November 30, 2017

Publication date: June 7, 2018

Inventors: Kenneth W. CHURCH, Dimitrios B. DIMITRIADIS, Petr FOUSEK, Jason W. PELECANOS, Weizhong ZHU
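Illustration (not taken from the patent): the abstract of publication 20180158451 above consolidates a fresh clustering with past decisions by permuting labels and keeping the permutation with the smallest distance. The sketch below assumes integer speaker labels, a Hamming distance over the already-decided prefix, and an exhaustive search over permutations, which is only practical for a handful of speakers. Function names are hypothetical.

```python
from itertools import permutations


def hamming(a, b):
    """Number of positions at which two label sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def consolidate(past, current):
    """Relabel `current` so that its prefix agrees as closely as possible with `past`."""
    labels = sorted(set(past) | set(current))
    best = None
    for perm in permutations(labels):
        mapping = dict(zip(labels, perm))
        relabeled = [mapping[x] for x in current]
        distance = hamming(past, relabeled[: len(past)])
        if best is None or distance < best[0]:
            best = (distance, relabeled)
    return best[1], best[0]


# Past decisions for the first four frames, then a re-clustering of six frames
# whose arbitrary cluster identifiers happen to be swapped (hypothetical data).
past = [0, 0, 1, 1]
current = [1, 1, 0, 0, 0, 1]

relabeled, distance = consolidate(past, current)
print(relabeled, distance)
```

Clustering algorithms assign identifiers arbitrarily on each run, so without this step a speaker could appear to change identity between updates in streaming mode.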
SELF-ORGANIZED ACOUSTIC SIGNAL CANCELLATION OVER A NETWORK

Publication number: 20180068650

Abstract: A system for self-organized acoustic signal cancellation over a network is disclosed. The system may transmit an acoustic sounding signal to an interfering device so that a channel measurement may be performed for a channel between the interfering device and an interferee device. The system may receive the channel measurement for the channel from the interfering device and also receive a digitized version of an audio interference signal associated with the interfering device. Based on the channel measurement and the digital version of the interference signal, the system may calculate a cancellation signal prior to the arrival of the original over-the-air audio interference signal that corresponds to the digital version of the audio interference signal. The system may then apply the cancellation signal to an audio signal associated with the interferee device to remove the interference signal from the audio signal.

Type: Application

Filed: November 9, 2017

Publication date: March 8, 2018

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Lusheng Ji, Donald J. Bowen, Dimitrios B. Dimitriadis, Horst J. Schroeter
Acoustic environment recognizer for optimal speech processing

Patent number: 9911430

Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. In order to do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine if the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.

Type: Grant

Filed: November 28, 2016

Date of Patent: March 6, 2018

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Horst J. Schroeter, Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji
DENOISING A SIGNAL

Publication number: 20180047409

Abstract: A computer-implemented method according to one embodiment includes creating a clean dictionary, utilizing a clean signal, creating a noisy dictionary, utilizing a first noisy signal, determining a time varying projection, utilizing the clean dictionary and the noisy dictionary, and denoising a second noisy signal, utilizing the time varying projection.

Type: Application

Filed: October 25, 2017

Publication date: February 15, 2018

Inventors: Dimitrios B. Dimitriadis, Samuel Thomas, Colin C. Vaz
ACOUSTIC ENHANCEMENT BY LEVERAGING METADATA TO MITIGATE THE IMPACT OF NOISY ENVIRONMENTS

Publication number: 20170372720

Abstract: A system for cloud acoustic enhancement is disclosed. In particular, the system may leverage metadata and cloud-computing network resources to mitigate the impact of noisy environments that may potentially interfere with user communications. In order to do so, the system may receive an audio stream including an audio signal associated with a user, and determine if the audio stream also includes an interference signal. The system may determine that the audio stream includes the interference signal if a portion of the audio stream correlates with metadata that identifies the interference signal. If the audio stream is determined to include the interference signal, the system may cancel the interference signal from the audio stream by utilizing the metadata and the cloud-computing network resources. Once the interference signal is cancelled, the system may transmit the audio stream including the audio signal associated with the user to an intended destination.

Type: Application

Filed: August 22, 2017

Publication date: December 28, 2017

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji, Horst J. Schroeter
Methods, systems, and products for language preferences

Patent number: 9842107

Abstract: Methods, systems, and computer program products provide personalized feedback in a cloud-based environment. A client device routes image data to a server for analysis. The server analyzes the image data to recognize people of interest. Because the server performs image recognition, the client device is relieved of these intensive operations.

Type: Grant

Filed: October 31, 2016

Date of Patent: December 12, 2017

Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Dimitrios B. Dimitriadis, Horst J. Schroeter
Self-organized acoustic signal cancellation over a network

Patent number: 9842582

Abstract: A system for self-organized acoustic signal cancellation over a network is disclosed. The system may transmit an acoustic sounding signal to an interfering device so that a channel measurement may be performed for a channel between the interfering device and an interferee device. The system may receive the channel measurement for the channel from the interfering device and also receive a digitized version of an audio interference signal associated with the interfering device. Based on the channel measurement and the digital version of the interference signal, the system may calculate a cancellation signal prior to the arrival of the original over-the-air audio interference signal that corresponds to the digital version of the audio interference signal. The system may then apply the cancellation signal to an audio signal associated with the interferee device to remove the interference signal from the audio signal.

Type: Grant

Filed: May 9, 2016

Date of Patent: December 12, 2017

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Lusheng Ji, Donald J. Bowen, Dimitrios B. Dimitriadis, Horst J. Schroeter
Acoustic enhancement by leveraging metadata to mitigate the impact of noisy environments

Patent number: 9779752

Abstract: A system for cloud acoustic enhancement is disclosed. In particular, the system may leverage metadata and cloud-computing network resources to mitigate the impact of noisy environments that may potentially interfere with user communications. In order to do so, the system may receive an audio stream including an audio signal associated with a user, and determine if the audio stream also includes an interference signal. The system may determine that the audio stream includes the interference signal if a portion of the audio stream correlates with metadata that identifies the interference signal. If the audio stream is determined to include the interference signal, the system may cancel the interference signal from the audio stream by utilizing the metadata and the cloud-computing network resources. Once the interference signal is cancelled, the system may transmit the audio stream including the audio signal associated with the user to an intended destination.

Type: Grant

Filed: October 31, 2014

Date of Patent: October 3, 2017

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji, Horst J. Schroeter
DENOISING A SIGNAL

Publication number: 20170270945

Abstract: A computer-implemented method according to one embodiment includes creating a clean dictionary, utilizing a clean signal, creating a noisy dictionary, utilizing a first noisy signal, determining a time varying projection, utilizing the clean dictionary and the noisy dictionary, and denoising a second noisy signal, utilizing the time varying projection.

Type: Application

Filed: July 28, 2016

Publication date: September 21, 2017

Inventors: Dimitrios B. Dimitriadis, Samuel Thomas, Colin C. Vaz
ACOUSTIC ENVIRONMENT RECOGNIZER FOR OPTIMAL SPEECH PROCESSING

Publication number: 20170076736

Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. In order to do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine if the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.

Type: Application

Filed: November 28, 2016

Publication date: March 16, 2017

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Horst J. Schroeter, Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji
Methods, Systems, and Products for Language Preferences

Publication number: 20170046335

Abstract: Methods, systems, and computer program products provide personalized feedback in a cloud-based environment. A client device routes image data to a server for analysis. The server analyzes the image data to recognize people of interest. Because the server performs image recognition, the client device is relieved of these intensive operations.

Type: Application

Filed: October 31, 2016

Publication date: February 16, 2017

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Dimitrios B. Dimitriadis, Horst J. Schroeter
Acoustic environment recognizer for optimal speech processing

Patent number: 9530408

Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. In order to do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine if the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.

Type: Grant

Filed: October 31, 2014

Date of Patent: December 27, 2016

Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Horst J. Schroeter, Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji
Methods, systems, and products for language preferences

Patent number: 9507770

Abstract: Methods, systems, and computer program products provide personalized feedback in a cloud-based environment. A client device routes image data to a server for analysis. The server analyzes the image data to recognize people of interest. Because the server performs image recognition, the client device is relieved of these intensive operations.

Type: Grant

Filed: August 15, 2015

Date of Patent: November 29, 2016

Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.

Inventors: Dimitrios B. Dimitriadis, Horst J. Schroeter
Self-Organized Acoustic Signal Cancellation Over a Network

Publication number: 20160253988

Abstract: A system for self-organized acoustic signal cancellation over a network is disclosed. The system may transmit an acoustic sounding signal to an interfering device so that a channel measurement may be performed for a channel between the interfering device and an interferee device. The system may receive the channel measurement for the channel from the interfering device and also receive a digitized version of an audio interference signal associated with the interfering device. Based on the channel measurement and the digital version of the interference signal, the system may calculate a cancellation signal prior to the arrival of the original over-the-air audio interference signal that corresponds to the digital version of the audio interference signal. The system may then apply the cancellation signal to an audio signal associated with the interferee device to remove the interference signal from the audio signal.

Type: Application

Filed: May 9, 2016

Publication date: September 1, 2016

Applicant: AT&T Intellectual Property I, L.P.

Inventors: Lusheng Ji, Donald J. Bowen, Dimitrios B. Dimitriadis, Horst J. Schroeter
