Patents by Inventor Alexandros Potamianos

Alexandros Potamianos has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11862145
    Abstract: A method for processing multi-modal input includes receiving multiple signal inputs, each signal input having a corresponding input mode. Each signal input is processed in a series of mode-specific processing stages. Each successive mode-specific stage is associated with a successively longer scale of analysis of the signal input. A fused output is generated based on the output of a series of fused processing stages. Each successive fused processing stage is associated with a successively longer scale of analysis of the signal input. Multiple fused processing stages receive inputs from corresponding mode-specific processing stages, so that the fused output depends on the multiple signal inputs.
    Type: Grant
    Filed: April 20, 2020
    Date of Patent: January 2, 2024
    Assignee: Behavioral Signal Technologies, Inc.
    Inventors: Efthymis Georgiou, Georgios Paraskevopoulos, James Gibson, Alexandros Potamianos, Shrikanth Narayanan
  • Publication number: 20200335086
    Abstract: Data augmentation is used for speech emotion recognition tasks where certain emotional labels, e.g., sadness, are significantly underrepresented in a training dataset. This is typical for data collected in real-life applications. We propose conditioned data augmentation using Generative Adversarial Networks (GANs), in order to generate samples for underrepresented emotions. We propose a conditional GAN architecture to generate synthetic spectrograms for the minority class. For comparison purposes, we implement a series of signal-based data augmentation methods. Results on the speech emotion recognition task show that the proposed data augmentation method significantly improves classification performance as compared to traditional speech data augmentation methods.
    Type: Application
    Filed: April 20, 2020
    Publication date: October 22, 2020
    Inventors: Georgios Paraskevopoulos, Evangelia Chatziagapi, Theodoros Giannakopoulos, Alexandros Potamianos, Shrikanth Narayanan
  • Publication number: 20200335092
    Abstract: A method for processing multi-modal input includes receiving multiple signal inputs, each signal input having a corresponding input mode. Each signal input is processed in a series of mode-specific processing stages. Each successive mode-specific stage is associated with a successively longer scale of analysis of the signal input. A fused output is generated based on the output of a series of fused processing stages. Each successive fused processing stage is associated with a successively longer scale of analysis of the signal input. Multiple fused processing stages receive inputs from corresponding mode-specific processing stages, so that the fused output depends on the multiple signal inputs.
    Type: Application
    Filed: April 20, 2020
    Publication date: October 22, 2020
    Inventors: Efthymis Georgiou, Georgios Paraskevopoulos, James Gibson, Alexandros Potamianos, Shrikanth Narayanan
  • Publication number: 20190385597
    Abstract: Behavioral profiling and shaping is used in a “closed-loop” in that an interaction with at least one human is monitored and based on inferred characteristics of the interaction with that human (e.g., their behavioral profile) the interaction is guided. In one exemplary embodiment, the interaction is between two humans, for example, a “customer” and an “agent” and the interaction is monitored and the agent is guided according to the inferred behavioral profile of the customer (or optionally of the agent themselves).
    Type: Application
    Filed: June 14, 2019
    Publication date: December 19, 2019
    Inventors: Athanasios Katsamanis, Shrikanth Narayanan, Alexandros Potamianos
  • Patent number: 6760699
    Abstract: A method and apparatus for performing automatic speech recognition (ASR) in a distributed ASR system for use over a wireless channel takes advantage of probabilistic information concerning the likelihood that a given portion of the data has been accurately decoded to a particular value. The probability of error in each feature in a transmitted feature set is employed to improve speech recognition performance under adverse channel conditions. Bit error probabilities for each of the bits which are used to encode a given ASR feature are used to compute the confidence level that the system may have in the decoded value of that feature. Features that have been corrupted with high probability are advantageously either not used or are weighted less in the acoustic distance computation performed by the speech recognizer.
    Type: Grant
    Filed: April 24, 2000
    Date of Patent: July 6, 2004
    Assignee: Lucent Technologies Inc.
    Inventors: Vijitha Weerackody, Wolfgang Reichl, Alexandros Potamianos
  • Publication number: 20030233230
    Abstract: A system for, and method of, representing and resolving ambiguity in natural language text and a spoken dialogue system incorporating the system for representing and resolving ambiguity or the method. In one embodiment, the system for representing and resolving ambiguity includes: (1) a context tracker that places the natural language text in context to yield candidate attribute-value (AV) pairs and (2) a candidate scorer, associated with the context tracker, that adjusts a confidence associated with each candidate AV pair based on system intent.
    Type: Application
    Filed: June 12, 2002
    Publication date: December 18, 2003
    Applicant: Lucent Technologies Inc.
    Inventors: Egbert Ammicht, J. Eric Fosler-Lussier, Alexandros Potamianos
  • Publication number: 20030233232
    Abstract: A system for, and method of, measuring a degree of independence of semantic classes in separate domains. In one embodiment, the system includes: (1) a cross-domain distance calculator that estimates a similarity between n-gram contexts for the semantic classes in each of the separate domains to determine domain-dependent relative entropies associated with the semantic classes and (2) a distance summer, associated with the cross-domain distance calculator, that adds the domain-dependent distances over a domain vocabulary to yield the degree of independence of the semantic classes.
    Type: Application
    Filed: June 12, 2002
    Publication date: December 18, 2003
    Applicant: Lucent Technologies Inc.
    Inventors: J. Eric Fosler-Lussier, Chin-Hui Lee, Andrew N. Pargellis, Alexandros Potamianos
  • Patent number: 6076057
    Abstract: An unsupervised, discriminative, sentence level, HMM adaptation based on speech-silence classification is presented. Silence and speech regions are determined either using a speech end-pointer or the segmentation obtained from the recognizer in a first pass. The discriminative training procedure using a GPD or any other discriminative training algorithm, employed in conjunction with the HMM-based recognizer, is then used to increase the discrimination between silence and speech.
    Type: Grant
    Filed: May 21, 1997
    Date of Patent: June 13, 2000
    Assignee: AT&T Corp
    Inventors: Shrikanth Sambasivan Narayanan, Alexandros Potamianos, Ilija Zeljkovic
  • Patent number: 5930753
    Abstract: Frequency warping approaches to speaker normalization have been proposed and evaluated on various speech recognition tasks. In all cases, frequency warping was found to significantly improve recognition performance by reducing the mismatch between test utterances presented to the recognizer and the speaker independent HMM model. This invention relates to a procedure which compensates utterances by simultaneously scaling the frequency axis and reshaping the spectral energy contour. This procedure is shown to reduce the error rate in a telephone based connected digit recognition task by 30%.
    Type: Grant
    Filed: March 20, 1997
    Date of Patent: July 27, 1999
    Assignee: AT&T Corp
    Inventors: Alexandros Potamianos, Richard Cameron Rose
  • Patent number: 5765124
    Abstract: An improved speech recognition system, in which transformation process parameters are generated in response to selected characteristics derived from speech inputs obtained from both carbon and linear microphones. The transformation process parameters are utilized in conjunction with selected digitized speech models to improve the speech recognition process based on the carbon microphone property of suppressing speech spectral energy for low energy unvoiced sounds, and also for low energy regions of the spectrum between formant peaks for voiced sounds.
    Type: Grant
    Filed: December 29, 1995
    Date of Patent: June 9, 1998
    Assignee: Lucent Technologies Inc.
    Inventors: Richard C. Rose, Alexandros Potamianos
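
The abstract of patent 6760699 above describes turning per-bit error probabilities into a per-feature confidence and then down-weighting or skipping unreliable features in the acoustic distance computation. The following is only a rough sketch of that idea, not the patented method: the independence assumption between bit errors, the squared-difference distance, and the names feature_confidence, weighted_distance, and drop_below are all hypothetical choices made for illustration.

```python
from math import prod


def feature_confidence(bit_error_probs):
    """Confidence that one decoded feature is correct.

    Assumption (not from the abstract): bit errors are independent, so the
    feature is correct only if every one of its bits was decoded correctly.
    """
    return prod(1.0 - p for p in bit_error_probs)


def weighted_distance(observed, reference, confidences, drop_below=0.0):
    """Confidence-weighted squared distance between two feature vectors.

    Features whose confidence falls below `drop_below` are skipped entirely;
    the remaining ones contribute in proportion to their confidence.
    """
    total = 0.0
    for x, r, c in zip(observed, reference, confidences):
        if c < drop_below:
            continue  # treat the feature as too corrupted to use
        total += c * (x - r) ** 2
    return total


# Two features: the first encoded with clean bits, the second with noisy bits.
confidences = [
    feature_confidence([0.0, 0.0]),
    feature_confidence([0.5, 0.5]),
]
distance = weighted_distance([1.0, 2.0], [0.0, 0.0], confidences)
```

In this sketch, raising drop_below moves from soft weighting toward discarding unreliable features outright, which mirrors the two options the abstract mentions.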