Patents by Inventor Athanasios Mouchtaris

Athanasios Mouchtaris has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

AUTOMATIC SPEECH RECOGNITION

Publication number: 20240321264

Abstract: Techniques for performing automatic speech recognition (ASR) are described. In some embodiments, an ASR component integrates contextual information from user profile data into audio encoding data to predict a token(s) corresponding to a spoken input. The user profile data may include personalized words, such as contact names, device names, etc. The ASR component determines word embedding data using the personalized words. The ASR component is configured to apply attention to audio frames that are relevant to the personalized words based on processing the audio encoding data and the word embedding data.

Type: Application

Filed: May 31, 2024

Publication date: September 26, 2024

Inventors: Jing Liu, Feng-Ju Chang, Athanasios Mouchtaris, Martin Radfar, Maurizio Omologo, Siegfried Kunzmann
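The attention step described in the abstract above can be illustrated with generic scaled dot-product attention between audio frame encodings and word embeddings. This is a minimal sketch, not the patented architecture: the function names, the toy two-dimensional vectors, and the choice to use audio frames as queries and personalized-word embeddings as keys and values are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_to_words(audio_frames, word_embeddings):
    """For each audio frame, weight the word embeddings by similarity.

    Returns (weights, contexts): one weight vector and one context
    vector per frame. Hypothetical helper, not an API from the patent.
    """
    dim = len(word_embeddings[0])
    all_weights, all_contexts = [], []
    for frame in audio_frames:
        # Scaled dot product between the frame and every word embedding.
        scores = [
            sum(f * w for f, w in zip(frame, word)) / math.sqrt(dim)
            for word in word_embeddings
        ]
        weights = softmax(scores)
        # Context vector: attention-weighted sum of the word embeddings.
        context = [
            sum(wt * word[k] for wt, word in zip(weights, word_embeddings))
            for k in range(dim)
        ]
        all_weights.append(weights)
        all_contexts.append(context)
    return all_weights, all_contexts

# Toy data: two audio frames, two personalized-word embeddings.
frames = [[1.0, 0.0], [0.0, 1.0]]
words = [[4.0, 0.0], [0.0, 4.0]]
weights, contexts = attend_to_words(frames, words)
```

In this toy setup each frame places most of its attention weight on the word embedding it is aligned with, which is the behavior a contextual biasing layer would rely on.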
Automatic speech recognition

Patent number: 12002451

Abstract: Techniques for performing automatic speech recognition (ASR) are described. In some embodiments, an ASR component integrates contextual information from user profile data into audio encoding data to predict a token(s) corresponding to a spoken input. The user profile data may include personalized words, such as contact names, device names, etc. The ASR component determines word embedding data using the personalized words. The ASR component is configured to apply attention to audio frames that are relevant to the personalized words based on processing the audio encoding data and the word embedding data.

Type: Grant

Filed: September 24, 2021

Date of Patent: June 4, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Jing Liu, Feng-Ju Chang, Athanasios Mouchtaris, Martin Radfar, Maurizio Omologo, Siegfried Kunzmann

Automatic speech recognition

Patent number: 11915690

Abstract: A multi-channel transformer acoustic model that processes a plurality of audio signals output by microphones of a microphone array and outputs probabilities for acoustic units of an utterance represented in the audio signals. The audio signals represent the individual microphones' respective capturing of the utterance. The multi-channel model may perform self-attention on embeddings of the audio signals and then cross-channel attention across the attended audio signals. The cross-channel attention may involve processing of signals relative to each other to model the relationships across channels within and across time frames. The multi-channel model may include a transducer to perform processing frame-by-frame.

Type: Grant

Filed: September 29, 2021

Date of Patent: February 27, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Feng-Ju Chang, Martin Radfar, Athanasios Mouchtaris, Brian King, Siegfried Kunzmann, Maurizio Omologo

Media content mixing apparatuses, methods and systems

Patent number: 10685667

Abstract: In aspects, systems, methods, apparatuses and computer-readable storage media implementing embodiments for mixing audio content based on a plurality of user generated recordings (UGRs) are disclosed. In embodiments, the mixing comprises: receiving a plurality of UGRs, each UGR of the plurality of UGRs comprising at least audio content; determining a correlation between samples of audio content associated with at least two UGRs of the plurality of UGRs; generating one or more clusters comprising samples of the audio content identified as having a relationship based on the determined correlations; synchronizing, for each of the one or more clusters, the samples of the audio content to produce synchronized audio content for each of the one or more clusters, normalizing, for each of the one or more clusters, the synchronized audio content to produce normalized audio content; and mixing, for each of the one or more clusters, the normalized audio content.

Type: Grant

Filed: June 12, 2018

Date of Patent: June 16, 2020

Assignee: FOUNDATION FOR RESEARCH AND TECHNOLOGY—HELLAS (FORTH)

Inventors: Nikolaos Stefanakis, Athanasios Mouchtaris
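The synchronize, normalize, and mix steps listed in the abstract above can be sketched for a single cluster as follows. This is a minimal illustration under stated assumptions: the recordings are assumed to be already grouped into one cluster, synchronization is done by a brute-force cross-correlation lag search against the first recording, normalization is peak normalization, and mixing is a plain average. All function names are hypothetical and none of these choices is confirmed by the patent.

```python
def best_lag(ref, sig, max_lag):
    """Lag (in samples) of `sig` relative to `ref` with the highest cross-correlation."""
    def score(lag):
        return sum(
            ref[i] * sig[i + lag]
            for i in range(len(ref))
            if 0 <= i + lag < len(sig)
        )
    return max(range(-max_lag, max_lag + 1), key=score)

def align(sig, lag, length):
    """Shift `sig` by `lag` samples so it lines up with the reference; zero-pad the rest."""
    return [sig[i + lag] if 0 <= i + lag < len(sig) else 0.0 for i in range(length)]

def normalize(sig):
    """Scale a signal so its largest absolute sample is 1 (silence is left unchanged)."""
    peak = max((abs(x) for x in sig), default=0.0)
    return [x / peak for x in sig] if peak > 0 else list(sig)

def mix_cluster(recordings, max_lag=8):
    """Synchronize every recording to the first one, normalize, then average."""
    ref = recordings[0]
    aligned = [
        normalize(align(rec, best_lag(ref, rec, max_lag), len(ref)))
        for rec in recordings
    ]
    return [sum(ch[i] for ch in aligned) / len(aligned) for i in range(len(ref))]

# Toy cluster: the second recording is the first one delayed by 2 samples and twice as loud.
ref = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 2.0, 1.0, 0.0, 0.0]
lag = best_lag(ref, delayed, 4)
mixed = mix_cluster([ref, delayed], max_lag=4)
```

A real system would need a faster correlation (for example FFT-based) and a clustering stage before this step; both are outside the scope of the sketch.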
Direction of arrival (DOA) estimation apparatuses, methods, and systems

Patent number: 10175335

Abstract: A processor-implemented method for direction-of-arrival estimation. The method includes: receiving a plurality of input signals at a sensor array, each sensor having an angle estimator and a cross-spectra term; transforming the input signal from each of the plurality of sensors to the time-frequency domain using a short-time Fourier transform; constructing a Perpendicular Cross-Spectra Difference (PCSD) for each of the plurality of angle estimators associated with each sensor for each frequency bin and time index; calculating an auxiliary observation for each of the angle estimators; and determining an impinging angle for each of the angle estimators based on the auxiliary observation.

Type: Grant

Filed: June 15, 2016

Date of Patent: January 8, 2019

Assignee: FOUNDATION FOR RESEARCH AND TECHNOLOGY-HELLAS (FORTH)

Inventors: Nikolaos Stefanakis, Athanasios Mouchtaris

Foreground signal suppression apparatuses, methods, and systems

Patent number: 10178475

Abstract: A processor-implemented method for foreground signal suppression. The method includes: capturing a plurality of input signals using a plurality of sensors within a sound field; subjecting each input signal to a short-time Fourier transform to transform each signal into a plurality of non-overlapping subband regions; estimating the diffuseness of the sound field based on the plurality of input signals; decomposing each of the plurality of input signals into a diffuse component and a directional component based on the diffuseness estimate; applying a spatial analysis operation to filter the directional component of each of the plurality of input signals, wherein the spatial analysis operation includes applying a set of beamformers to the directional components to produce a plurality of beamformer signals; and processing the plurality of beamformer signals to decompose the signal into a foreground channel and a background channel.

Type: Grant

Filed: June 15, 2016

Date of Patent: January 8, 2019

Assignee: FOUNDATION FOR RESEARCH AND TECHNOLOGY—HELLAS (F.O.R.T.H.)

Inventors: Nikolaos Stefanakis, Athanasios Mouchtaris

MEDIA CONTENT MIXING APPARATUSES, METHODS AND SYSTEMS

Publication number: 20180358030

Abstract: In aspects, systems, methods, apparatuses and computer-readable storage media implementing embodiments for mixing audio content based on a plurality of user generated recordings (UGRs) are disclosed. In embodiments, the mixing comprises: receiving a plurality of UGRs, each UGR of the plurality of UGRs comprising at least audio content; determining a correlation between samples of audio content associated with at least two UGRs of the plurality of UGRs; generating one or more clusters comprising samples of the audio content identified as having a relationship based on the determined correlations; synchronizing, for each of the one or more clusters, the samples of the audio content to produce synchronized audio content for each of the one or more clusters, normalizing, for each of the one or more clusters, the synchronized audio content to produce normalized audio content; and mixing, for each of the one or more clusters, the normalized audio content.

Type: Application

Filed: June 8, 2018

Publication date: December 13, 2018

Inventors: Nikolaos Stefanakis, Athanasios Mouchtaris

Direction of arrival estimation and sound source enhancement in the presence of a reflective surface apparatuses, methods, and systems

Patent number: 10149048

Abstract: A processor-implemented method for sound-source enhancement, including: capturing a signal from a sound source using a sensor array having a plurality of sensors, the sensor array being positioned between the sound source and the reflective surface; calculating a half-space propagation model by determining a modified steering vector associated with a plane sound wave produced by the sound source as a function of signal direction and the reflectivity value; calculating a half-space spatial coherence model by dividing a sphere with its center on the reflecting surface into two mirror symmetric parts intersected by a plane to create two half spheres; creating a half-space signal-enhancement module using the half-space propagation model and the half-space coherence model; and applying the half-space signal-enhancement module to the signal.

Type: Grant

Filed: September 26, 2016

Date of Patent: December 4, 2018

Assignee: FOUNDATION FOR RESEARCH AND TECHNOLOGY—HELLAS (F.O.R.T.H.) INSTITUTE OF COMPUTER SCIENCE (I.C.S.)

Inventors: Nikolaos Stefanakis, Athanasios Mouchtaris

Capturing and reproducing spatial sound apparatuses, methods, and systems

Patent number: 10136239

Abstract: A processor-implemented method for capturing and reproducing spatial sound. The method includes: capturing a plurality of input signals using a plurality of sensors within a sound field; subjecting each input signal to a short-time Fourier transform to transform each signal into a transformed signal in the time-frequency domain; decomposing each of the transformed signals into a directional component and a diffuse component; optimizing beamformer weights using vector based amplitude panning to determine an optimal directivity pattern for the diffuse component of each transformed signal; constructing a set of diffuse sound channels using the diffuse components of the transformed signals and the optimized beamformer weights; constructing a set of directional sound channels using the directional components of the transformed signals; and reproducing the sound field by distributing the directional and diffuse sound channels to a plurality of output devices.

Type: Grant

Filed: June 15, 2016

Date of Patent: November 20, 2018

Assignee: FOUNDATION FOR RESEARCH AND TECHNOLOGY—HELLAS (F.O.R.T.H.)

Inventors: Nikolaos Stefanakis, Athanasios Mouchtaris

Spatial sound characterization apparatuses, methods and systems

Patent number: 9955277

Abstract: A processor-implemented method for spatial sound characterization is described. In one implementation, each of a plurality of source signals detected by a plurality of sensing devices, is segmented into a plurality of time frames. For each time frame, time-frequency transform of the source signals is derived, an estimated number of sources and at least one estimated direction of arrival corresponding to each of the source signals is obtained. Further, source signals are extracted by spatial separation based at least on the estimated directions of arrival and the estimated number of sources, and separated source signals are processed to yield a reference signal and side information.

Type: Grant

Filed: June 2, 2014

Date of Patent: April 24, 2018

Assignee: FOUNDATION FOR RESEARCH AND TECHNOLOGY-HELLAS (F.O.R.T.H.) INSTITUTE OF COMPUTER SCIENCE (I.C.S.)

Inventors: Anastasios Alexandridis, Anthony Griffin, Athanasios Mouchtaris

Sound source characterization apparatuses, methods and systems

Patent number: 9554203

Abstract: A processor-implemented method for sound characterization is described. In one implementation, time-frequency transform of each of a plurality of sound signals from one or more sources, the sound signals being detected by a plurality of sensing devices, is derived. One or more single-source constant-time analysis zones based at least on correlation between the time-frequency transform signals from a pair of sensing devices are detected. At least one direction of arrival for each source in the detected single source analysis zones are detected. A histogram of the estimated directions of arrival is created and an estimate of a number of the sound sources and corresponding directions of arrival are generated based at least on the histogram.

Type: Grant

Filed: September 26, 2013

Date of Patent: January 24, 2017

Assignee: Foundation for Research and Technology—Hellas (FORTH) Institute of Computer Science (ICS)

Inventors: Despoina Pavlidi, Anthony Griffin, Athanasios Mouchtaris
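The last step of the abstract above, building a histogram of direction-of-arrival (DOA) estimates and reading the number of sources and their directions from it, can be sketched as follows. This is a simplified stand-in: the abstract does not say how the histogram is processed, so plain local-peak picking with a count threshold is used here. The bin width, the threshold, and the function names are all assumptions.

```python
def doa_histogram(estimates_deg, bin_width=10):
    """Count DOA estimates (in degrees) in equal-width bins covering 0..360."""
    hist = [0] * (360 // bin_width)
    for angle in estimates_deg:
        hist[int(angle % 360) // bin_width] += 1
    return hist

def count_sources(hist, bin_width=10, min_count=3):
    """Treat each sufficiently tall local peak of the circular histogram as one source.

    Returns (number_of_sources, list_of_peak_bin_centers_in_degrees).
    """
    n = len(hist)
    peaks = []
    for i, count in enumerate(hist):
        left, right = hist[(i - 1) % n], hist[(i + 1) % n]
        if count >= min_count and count > left and count >= right:
            peaks.append(i * bin_width + bin_width / 2)
    return len(peaks), peaks

# Toy data: estimates clustered near 45 and 135 degrees, plus one outlier at 250.
estimates = [42, 44, 45, 47, 48, 131, 133, 135, 136, 250]
hist = doa_histogram(estimates)
n_sources, directions = count_sources(hist)
```

The threshold keeps the isolated outlier from being counted as a third source; in practice the bin width and threshold would have to be tuned to the array and the room.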
Sound source localization and isolation apparatuses, methods and systems

Patent number: 9549253

Abstract: A processor-implemented method for spatial sound localization and isolation is described. The method includes segmenting, via a processor, each of a plurality of source signals detected by a plurality of sensors, into a plurality of time frames. For each time frame, the method further includes obtaining, via a processor, a plurality of direction of arrival (DOA) estimates from the plurality of sensors, discretizing an area of interest into a plurality of grid points, calculating, via the processor, DOA at each of grid points, comparing, via the processor, the DOA estimates with the computed DOAs. If the number of sources is more than 1, the method includes obtaining via the processor, a plurality of combinations of DOA estimates, from amongst the plurality of combinations, estimating, via the processor, one or more initial candidate locations corresponding to each of the combinations, selecting location of the sources from amongst the initial candidate locations.

Type: Grant

Filed: November 28, 2014

Date of Patent: January 17, 2017

Assignee: Foundation for Research and Technology—Hellas (FORTH) Institute of Computer Science (ICS)

Inventors: Anastasios Alexandridis, Anthony Griffin, Athanasios Mouchtaris
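The grid-based part of the abstract above (discretize the area, compute the DOA at each grid point, compare with the measured DOAs) can be sketched for the single-source case. This is a minimal illustration under assumptions: a 2-D area, one DOA estimate per sensor expressed in radians, and a sum of squared angular errors as the comparison. The multi-source combination step is not covered, and the function names are hypothetical.

```python
import math

def angle_error(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)

def localize(sensors, doa_estimates, x_range, y_range, step):
    """Return the grid point whose computed DOAs best match the estimated DOAs."""
    nx = int(round((x_range[1] - x_range[0]) / step)) + 1
    ny = int(round((y_range[1] - y_range[0]) / step)) + 1
    best_point, best_cost = None, float("inf")
    for ix in range(nx):
        for iy in range(ny):
            x = x_range[0] + ix * step
            y = y_range[0] + iy * step
            # DOA that each sensor would observe if the source sat at (x, y).
            cost = sum(
                angle_error(math.atan2(y - sy, x - sx), est) ** 2
                for (sx, sy), est in zip(sensors, doa_estimates)
            )
            if cost < best_cost:
                best_point, best_cost = (x, y), cost
    return best_point

# Toy setup: three sensors and noise-free DOA estimates for a source at (2.0, 1.0).
sensors = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
source = (2.0, 1.0)
estimates = [math.atan2(source[1] - sy, source[0] - sx) for sx, sy in sensors]
located = localize(sensors, estimates, (0.0, 4.0), (0.0, 4.0), 0.5)
```

With noisy estimates the minimum-cost grid point would only approximate the true position, with the resolution bounded by the grid step.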
Apparatuses, methods and systems for audio processing and transmission

Patent number: 9111525

Abstract: This disclosure details the implementation of apparatuses, methods and systems for audio processing and transmission. Some implementations of the system are configured to provide a method for encoding an arbitrary number of audio source signals using only a small amount of (transmitted or stored) information, while facilitating high-quality audio playback at the decoder side. Some implementations may be configured to implement a parametric model for retaining the essential information of each source signal (side information). After the side information is extracted, the remaining information for all source signals may be summed to create a reference signal from which noise information for the original source signals may be reconstructed. The reference signal and the side information form the new collection of information to be transmitted or stored for subsequent decoding.

Type: Grant

Filed: September 30, 2008

Date of Patent: August 18, 2015

Assignee: Foundation for Research and Technology—Hellas (FORTH) Institute of Computer Science (ICS)

Inventors: Athanasios Mouchtaris, Panagiotis Tsakalides

SOUND SOURCE LOCALIZATION AND ISOLATION APPARATUSES, METHODS AND SYSTEMS

Publication number: 20150156578

Abstract: A processor-implemented method for spatial sound localization and isolation is described. The method includes segmenting, via a processor, each of a plurality of source signals detected by a plurality of sensors, into a plurality of time frames. For each time frame, the method further includes obtaining, via a processor, a plurality of direction of arrival (DOA) estimates from the plurality of sensors, discretizing an area of interest into a plurality of grid points, calculating, via the processor, DOA at each of grid points, comparing, via the processor, the DOA estimates with the computed DOAs. If the number of sources is more than 1, the method includes obtaining via the processor, a plurality of combinations of DOA estimates, from amongst the plurality of combinations, estimating, via the processor, one or more initial candidate locations corresponding to each of the combinations, selecting location of the sources from amongst the initial candidate locations.

Type: Application

Filed: November 28, 2014

Publication date: June 4, 2015

Applicant: Foundation for Research and Technology - Hellas (F.O.R.T.H) Institute of Computer Science (I.C.S.)

Inventors: Anastasios Alexandridis, Anthony Griffin, Athanasios Mouchtaris

Apparatuses, methods and systems for sparse sinusoidal audio processing and transmission

Patent number: 8489403

Abstract: The APPARATUSES, METHODS AND SYSTEMS FOR SPARSE SINUSOIDAL AUDIO PROCESSING AND TRANSMISSION (hereinafter “SS-Audio”) provides a platform for encoding and decoding audio signals based on a sparse sinusoidal structure. In one embodiment, the SS-Audio encoder may encode received audio inputs based on its sparse representation in the frequency domain and transmit the encoded and quantized bit streams. In one embodiment, the SS-Audio decoder may decode received quantized bit streams based on sparse reconstruction and recover the original audio input by reconstructing the sinusoidal parameters in the frequency domain.

Type: Grant

Filed: August 25, 2010

Date of Patent: July 16, 2013

Assignee: Foundation For Research and Technology—Institute of Computer Science ‘FORTH-ICS’

Inventors: Anthony Griffin, Athanasios Mouchtaris, Panagiotis Tsakalides
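The idea of a sparse frequency-domain representation mentioned in the abstract above can be illustrated by keeping only the strongest bins of a discrete Fourier transform and rebuilding the signal from them. This is a rough sketch, not the SS-Audio codec: quantization and the sparse reconstruction step the abstract mentions are omitted, top-K bin selection is an assumed simplification, and the function names are hypothetical.

```python
import cmath
import math

def dft(signal):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def encode_sparse(signal, keep):
    """Keep the `keep` largest-magnitude DFT bins as {bin_index: coefficient}."""
    spectrum = dft(signal)
    strongest = sorted(
        range(len(spectrum)), key=lambda k: abs(spectrum[k]), reverse=True
    )[:keep]
    return {k: spectrum[k] for k in strongest}

def decode_sparse(coeffs, length):
    """Inverse DFT using only the retained bins; returns the real part."""
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / length) for k, c in coeffs.items()).real
        / length
        for t in range(length)
    ]

# Toy signal: a single sinusoid occupying DFT bins 3 and 13 of a 16-sample frame.
tone = [math.cos(2 * math.pi * 3 * t / 16) for t in range(16)]
coeffs = encode_sparse(tone, keep=2)
rebuilt = decode_sparse(coeffs, len(tone))
```

For a pure tone, two coefficients are enough to recover the frame almost exactly; real audio would need more bins per frame and would not be recovered exactly.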