Patents by Inventor DIMITRIOS B. DIMITRIADIS
DIMITRIOS B. DIMITRIADIS has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10249292
Abstract: Speaker diarization is performed on audio data including speech by a first speaker, speech by a second speaker, and silence. The speaker diarization includes segmenting the audio data using a long short-term memory (LSTM) recurrent neural network (RNN) to identify change points of the audio data that divide the audio data into segments. The speaker diarization includes assigning a label selected from a group of labels to each segment of the audio data using the LSTM RNN. The group of labels includes labels corresponding to the first speaker, the second speaker, and the silence. Each change point is a transition from one of the first speaker, the second speaker, and the silence to a different one of the first speaker, the second speaker, and the silence. Speech recognition can be performed on the segments that each correspond to one of the first speaker and the second speaker.
Type: Grant
Filed: December 14, 2016
Date of Patent: April 2, 2019
Assignee: International Business Machines Corporation
Inventors: Dimitrios B. Dimitriadis, David C. Haws, Michael Picheny, George Saon, Samuel Thomas
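The relationship between labels, change points, and segments in this abstract can be illustrated with a minimal sketch. It does not implement the LSTM RNN; it assumes a model has already produced one label per audio frame, and the label names ("spk1", "spk2", "silence") and function names are hypothetical.

```python
from itertools import groupby

def segments_from_frame_labels(frame_labels):
    """Collapse per-frame labels into (label, start, end) runs; end is exclusive."""
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length))
        start += length
    return segments

def change_points(segments):
    """A change point is the first frame of every segment except the first."""
    return [start for _, start, _ in segments[1:]]

def speech_segments(segments, silence_label="silence"):
    """Keep only the segments that would be passed on to speech recognition."""
    return [seg for seg in segments if seg[0] != silence_label]

# Hypothetical per-frame labels, standing in for the output of a trained model.
frames = ["silence", "silence", "spk1", "spk1", "spk1", "silence", "spk2", "spk2"]
segs = segments_from_frame_labels(frames)
points = change_points(segs)
speech = speech_segments(segs)
```

Each change point marks a transition between two different labels, so silence-to-speaker and speaker-to-speaker transitions are treated the same way.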
-
Patent number: 10242658
Abstract: A system for self-organized acoustic signal cancellation over a network is disclosed. The system may transmit an acoustic sounding signal to an interfering device so that a channel measurement may be performed for a channel between the interfering device and an interferee device. The system may receive the channel measurement for the channel from the interfering device and also receive a digitized version of an audio interference signal associated with the interfering device. Based on the channel measurement and the digital version of the interference signal, the system may calculate a cancellation signal prior to the arrival of the original over-the-air audio interference signal that corresponds to the digital version of audio interference signal. The system may then apply the cancellation signal to an audio signal associated with the interferee device to remove the interference signal from the audio signal.
Type: Grant
Filed: November 9, 2017
Date of Patent: March 26, 2019
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Lusheng Ji, Donald J. Bowen, Dimitrios B. Dimitriadis, Horst J. Schroeter
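The cancellation step can be sketched as follows. This is an illustration under stated assumptions, not the patented method: it assumes the channel measurement can be represented as a short FIR impulse response, that the signals are already time-aligned, and that cancellation is a plain subtraction. All names and sample values are hypothetical.

```python
def convolve(signal, impulse_response):
    """Full linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def cancel(mic, interference, channel):
    """Predict the over-the-air interference from its digital copy and subtract it."""
    predicted = convolve(interference, channel)
    # zip truncates the prediction to the length of the microphone signal.
    return [m - p for m, p in zip(mic, predicted)]

# Hypothetical data: desired speech, the interferer's digital signal, and the channel.
speech = [1.0, 0.0, -1.0, 0.5]
interference = [0.5, -0.5, 0.25, 0.0]
channel = [0.5, 0.25]

# What the interferee's microphone would pick up: speech plus filtered interference.
mic = [s + p for s, p in zip(speech, convolve(interference, channel))]
cleaned = cancel(mic, interference, channel)
```

Because the digital copy travels over the network faster than sound travels through the air, the prediction can be ready before the acoustic interference arrives, which is the timing property the abstract relies on.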
-
Patent number: 10170133
Abstract: A system for cloud acoustic enhancement is disclosed. In particular, the system may leverage metadata and cloud-computing network resources to mitigate the impact of noisy environments that may potentially interfere with user communications. In order to do so, the system may receive an audio stream including an audio signal associated with a user, and determine if the audio stream also includes an interference signal. The system may determine that the audio stream includes the interference signal if a portion of the audio stream correlates with metadata that identifies the interference signal. If the audio stream is determined to include the interference signal, the system may cancel the interference signal from the audio stream by utilizing the metadata and the cloud-computing network resources. Once the interference signal is cancelled, the system may transmit the audio stream including the audio signal associated with the user to an intended destination.
Type: Grant
Filed: August 22, 2017
Date of Patent: January 1, 2019
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji, Horst J. Schroeter
-
Publication number: 20180197558
Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. In order to do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine if the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.
Type: Application
Filed: March 5, 2018
Publication date: July 12, 2018
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Horst J. Schroeter, Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji
-
Publication number: 20180166066
Abstract: Speaker diarization is performed on audio data including speech by a first speaker, speech by a second speaker, and silence. The speaker diarization includes segmenting the audio data using a long short-term memory (LSTM) recurrent neural network (RNN) to identify change points of the audio data that divide the audio data into segments. The speaker diarization includes assigning a label selected from a group of labels to each segment of the audio data using the LSTM RNN. The group of labels includes labels corresponding to the first speaker, the second speaker, and the silence. Each change point is a transition from one of the first speaker, the second speaker, and the silence to a different one of the first speaker, the second speaker, and the silence. Speech recognition can be performed on the segments that each correspond to one of the first speaker and the second speaker.
Type: Application
Filed: December 14, 2016
Publication date: June 14, 2018
Inventors: Dimitrios B. Dimitriadis, David C. Haws, Michael Picheny, George Saon, Samuel Thomas
-
Publication number: 20180166067
Abstract: Audio features, such as perceptual linear prediction (PLP) features and time derivatives thereof, are extracted from frames of training audio data including speech by multiple speakers, and silence, such as by using linear discriminant analysis (LDA). The frames are clustered into k-means clusters using distance measures, such as Mahalanobis distance measures, of means and variances of the extracted audio features of the frames. A recurrent neural network (RNN) is trained on the extracted audio features of the frames and cluster identifiers of the k-means clusters into which the frames have been clustered. The RNN is applied to audio data to segment audio data into segments that each correspond to one of the cluster identifiers. Each segment can be assigned a label corresponding to one of the cluster identifiers. Speech recognition can be performed on the segments.
Type: Application
Filed: December 14, 2016
Publication date: June 14, 2018
Inventors: Dimitrios B. Dimitriadis, David C. Haws, Michael Picheny, George Saon, Samuel Thomas
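The role of a variance-aware distance in the clustering step can be shown with a small sketch. It covers only the assignment of one feature vector to its nearest cluster, and it assumes a diagonal covariance (per-dimension variances) as a simplification of the Mahalanobis distance. Feature extraction, the full k-means loop, and RNN training are not shown; all names and values are hypothetical.

```python
import math

def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance under a diagonal covariance (per-dimension variances)."""
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var)))

def euclidean(x, mean):
    """Plain Euclidean distance, included only for comparison."""
    return math.sqrt(sum((xi - mi) ** 2 for xi, mi in zip(x, mean)))

def assign_cluster(x, clusters):
    """Return the index of the (mean, var) cluster closest to x."""
    return min(range(len(clusters)), key=lambda k: mahalanobis_diag(x, *clusters[k]))

# Two hypothetical clusters: a tight one at the origin and a wide one at (4, 0).
clusters = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([4.0, 0.0], [16.0, 1.0]),
]
frame = [1.5, 0.0]

cluster_id = assign_cluster(frame, clusters)
nearest_by_euclidean = min(range(len(clusters)), key=lambda k: euclidean(frame, clusters[k][0]))
```

In this example the frame lies nearer to the first mean in Euclidean terms, but the second cluster's large variance along the first dimension makes it the closer cluster under the Mahalanobis measure. The resulting cluster identifier is the kind of per-frame training target the abstract describes.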
-
Publication number: 20180158451
Abstract: A diarization embodiment may include a system that clusters data up to a current point in time and consolidates it with the past decisions, and then returns the result that minimizes the difference with past decisions. The consolidation may be achieved by performing a permutation of the different possible labels and comparing the distance. For speaker diarization, a distance may be determined based on a minimum edit or Hamming distance. The distance may alternatively be a measure other than the minimum edit or Hamming distance. The clustering may have a finite time window over which the analysis is performed.
Type: Application
Filed: November 30, 2017
Publication date: June 7, 2018
Inventors: Kenneth W. CHURCH, Dimitrios B. DIMITRIADIS, Petr FOUSEK, Jason W. PELECANOS, Weizhong ZHU
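The consolidation step can be sketched as an exhaustive search over label permutations. This is a minimal illustration, assuming per-frame labels, the Hamming distance as the comparison measure, and a small label set (the search grows factorially with the number of labels). The function names and data are hypothetical.

```python
from itertools import permutations

def hamming(a, b):
    """Number of positions at which two equal-length label sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def consolidate(past, current, labels):
    """Relabel `current` with the permutation of `labels` that best matches `past`.

    `past` holds the decisions already reported; `current` is a fresh clustering
    of all frames so far, whose cluster names may be swapped relative to `past`.
    """
    best, best_dist = None, None
    for perm in permutations(labels):
        mapping = dict(zip(labels, perm))
        relabeled = [mapping[c] for c in current]
        # Compare only the frames for which a past decision exists.
        dist = hamming(past, relabeled[:len(past)])
        if best_dist is None or dist < best_dist:
            best, best_dist = relabeled, dist
    return best, best_dist

past_decisions = ["A", "A", "B", "B"]
new_clustering = ["B", "B", "A", "A", "A", "B"]  # same grouping, names swapped
result, distance = consolidate(past_decisions, new_clustering, ["A", "B"])
```

Clustering algorithms return arbitrary cluster names on each run, so without this alignment the speaker labels could flip between successive updates even when the grouping itself is unchanged.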
-
Publication number: 20180068650
Abstract: A system for self-organized acoustic signal cancellation over a network is disclosed. The system may transmit an acoustic sounding signal to an interfering device so that a channel measurement may be performed for a channel between the interfering device and an interferee device. The system may receive the channel measurement for the channel from the interfering device and also receive a digitized version of an audio interference signal associated with the interfering device. Based on the channel measurement and the digital version of the interference signal, the system may calculate a cancellation signal prior to the arrival of the original over-the-air audio interference signal that corresponds to the digital version of audio interference signal. The system may then apply the cancellation signal to an audio signal associated with the interferee device to remove the interference signal from the audio signal.
Type: Application
Filed: November 9, 2017
Publication date: March 8, 2018
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Lusheng Ji, Donald J. Bowen, Dimitrios B. Dimitriadis, Horst J. Schroeter
-
Patent number: 9911430
Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. In order to do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine if the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.
Type: Grant
Filed: November 28, 2016
Date of Patent: March 6, 2018
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Horst J. Schroeter, Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji
-
Publication number: 20180047409
Abstract: A computer-implemented method according to one embodiment includes creating a clean dictionary, utilizing a clean signal, creating a noisy dictionary, utilizing a first noisy signal, determining a time varying projection, utilizing the clean dictionary and the noisy dictionary, and denoising a second noisy signal, utilizing the time varying projection.
Type: Application
Filed: October 25, 2017
Publication date: February 15, 2018
Inventors: Dimitrios B. Dimitriadis, Samuel Thomas, Colin C. Vaz
-
Publication number: 20170372720
Abstract: A system for cloud acoustic enhancement is disclosed. In particular, the system may leverage metadata and cloud-computing network resources to mitigate the impact of noisy environments that may potentially interfere with user communications. In order to do so, the system may receive an audio stream including an audio signal associated with a user, and determine if the audio stream also includes an interference signal. The system may determine that the audio stream includes the interference signal if a portion of the audio stream correlates with metadata that identifies the interference signal. If the audio stream is determined to include the interference signal, the system may cancel the interference signal from the audio stream by utilizing the metadata and the cloud-computing network resources. Once the interference signal is cancelled, the system may transmit the audio stream including the audio signal associated with the user to an intended destination.
Type: Application
Filed: August 22, 2017
Publication date: December 28, 2017
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji, Horst J. Schroeter
-
Patent number: 9842107
Abstract: Methods, systems, and computer program products provide personalized feedback in a cloud-based environment. A client device routes image data to a server for analysis. The server analyzes the image data to recognize people of interest. Because the server performs image recognition, the client device is relieved of these intensive operations.
Type: Grant
Filed: October 31, 2016
Date of Patent: December 12, 2017
Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Dimitrios B. Dimitriadis, Horst J. Schroeter
-
Patent number: 9842582
Abstract: A system for self-organized acoustic signal cancellation over a network is disclosed. The system may transmit an acoustic sounding signal to an interfering device so that a channel measurement may be performed for a channel between the interfering device and an interferee device. The system may receive the channel measurement for the channel from the interfering device and also receive a digitized version of an audio interference signal associated with the interfering device. Based on the channel measurement and the digital version of the interference signal, the system may calculate a cancellation signal prior to the arrival of the original over-the-air audio interference signal that corresponds to the digital version of audio interference signal. The system may then apply the cancellation signal to an audio signal associated with the interferee device to remove the interference signal from the audio signal.
Type: Grant
Filed: May 9, 2016
Date of Patent: December 12, 2017
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Lusheng Ji, Donald J. Bowen, Dimitrios B. Dimitriadis, Horst J. Schroeter
-
Patent number: 9779752
Abstract: A system for cloud acoustic enhancement is disclosed. In particular, the system may leverage metadata and cloud-computing network resources to mitigate the impact of noisy environments that may potentially interfere with user communications. In order to do so, the system may receive an audio stream including an audio signal associated with a user, and determine if the audio stream also includes an interference signal. The system may determine that the audio stream includes the interference signal if a portion of the audio stream correlates with metadata that identifies the interference signal. If the audio stream is determined to include the interference signal, the system may cancel the interference signal from the audio stream by utilizing the metadata and the cloud-computing network resources. Once the interference signal is cancelled, the system may transmit the audio stream including the audio signal associated with the user to an intended destination.
Type: Grant
Filed: October 31, 2014
Date of Patent: October 3, 2017
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji, Horst J. Schroeter
-
Publication number: 20170270945
Abstract: A computer-implemented method according to one embodiment includes creating a clean dictionary, utilizing a clean signal, creating a noisy dictionary, utilizing a first noisy signal, determining a time varying projection, utilizing the clean dictionary and the noisy dictionary, and denoising a second noisy signal, utilizing the time varying projection.
Type: Application
Filed: July 28, 2016
Publication date: September 21, 2017
Inventors: Dimitrios B. Dimitriadis, Samuel Thomas, Colin C. Vaz
-
Publication number: 20170076736
Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. In order to do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine if the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.
Type: Application
Filed: November 28, 2016
Publication date: March 16, 2017
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Horst J. Schroeter, Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji
-
Publication number: 20170046335
Abstract: Methods, systems, and computer program products provide personalized feedback in a cloud-based environment. A client device routes image data to a server for analysis. The server analyzes the image data to recognize people of interest. Because the server performs image recognition, the client device is relieved of these intensive operations.
Type: Application
Filed: October 31, 2016
Publication date: February 16, 2017
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Dimitrios B. Dimitriadis, Horst J. Schroeter
-
Patent number: 9530408
Abstract: A system for providing an acoustic environment recognizer for optimal speech processing is disclosed. In particular, the system may utilize metadata obtained from various acoustic environments to assist in suppressing ambient noise interfering with a desired audio signal. In order to do so, the system may receive an audio stream including an audio signal associated with a user and including ambient noise obtained from an acoustic environment of the user. The system may obtain first metadata associated with the ambient noise, and may determine if the first metadata corresponds to second metadata in a profile for the acoustic environment. If the first metadata corresponds to the second metadata, the system may select a processing scheme for suppressing the ambient noise from the audio stream, and process the audio stream using the processing scheme. Once the audio stream is processed, the system may provide the audio stream to a destination.
Type: Grant
Filed: October 31, 2014
Date of Patent: December 27, 2016
Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Horst J. Schroeter, Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji
-
Patent number: 9507770
Abstract: Methods, systems, and computer program products provide personalized feedback in a cloud-based environment. A client device routes image data to a server for analysis. The server analyzes the image data to recognize people of interest. Because the server performs image recognition, the client device is relieved of these intensive operations.
Type: Grant
Filed: August 15, 2015
Date of Patent: November 29, 2016
Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Dimitrios B. Dimitriadis, Horst J. Schroeter
-
Publication number: 20160253988
Abstract: A system for self-organized acoustic signal cancellation over a network is disclosed. The system may transmit an acoustic sounding signal to an interfering device so that a channel measurement may be performed for a channel between the interfering device and an interferee device. The system may receive the channel measurement for the channel from the interfering device and also receive a digitized version of an audio interference signal associated with the interfering device. Based on the channel measurement and the digital version of the interference signal, the system may calculate a cancellation signal prior to the arrival of the original over-the-air audio interference signal that corresponds to the digital version of audio interference signal. The system may then apply the cancellation signal to an audio signal associated with the interferee device to remove the interference signal from the audio signal.
Type: Application
Filed: May 9, 2016
Publication date: September 1, 2016
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Lusheng Ji, Donald J. Bowen, Dimitrios B. Dimitriadis, Horst J. Schroeter