Patents by Inventor Francesco Nesta

Francesco Nesta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11937054
    Abstract: Embodiments described herein provide a combined multi-source time difference of arrival (TDOA) tracking and voice activity detection (VAD) mechanism that is applicable for generic array geometries, e.g., a microphone array that lies on a plane. The combined multi-source TDOA tracking and VAD mechanism scans the azimuth and elevation angles of the microphone array in microphone pairs, based on which a planar locus of physically admissible TDOAs can be formed in the multi-dimensional TDOA space of multiple microphone pairs. In this way, the multi-dimensional TDOA tracking reduces the number of calculations usually involved in traditional TDOA tracking by performing the TDOA search for each dimension separately.
    Type: Grant
    Filed: June 16, 2021
    Date of Patent: March 19, 2024
    Assignee: Synaptics Incorporated
    Inventors: Alireza Masnadi-Shirazi, Francesco Nesta
  • Patent number: 11763832
    Abstract: Systems and methods for generating an enhanced audio signal comprise a trained neural network configured to receive an input audio signal and generate an enhanced target signal, the trained neural network comprising a pre-processing neural network configured to receive a segment of the input audio signal and output an audio classification, the pre-processing neural network including at least one hidden layer comprising an embedding vector, and a noise reduction neural network configured to receive the segment of the input audio signal and the embedding vector, and generate the enhanced target signal. The pre-processing neural network may comprise a target signal pre-processing neural network configured to output a target signal classification and comprising at least one hidden layer comprising a target embedding vector.
    Type: Grant
    Filed: May 1, 2020
    Date of Patent: September 19, 2023
    Assignees: Synaptics Incorporated, The Trustees of Indiana University
    Inventors: Francesco Nesta, Minje Kim, Sanna Wager
  • Patent number: 11694710
    Abstract: Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal, target-speech detection logic, and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams, a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams, and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: July 4, 2023
    Assignee: Synaptics Incorporated
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari
  • Patent number: 11687770
    Abstract: Systems and methods for multimodal classification include a plurality of expert modules, each expert module configured to receive data corresponding to one of a plurality of input modalities and extract associated features, a plurality of class prediction modules, each class prediction module configured to receive extracted features from a corresponding one of the expert modules and predict an associated class, a gate expert configured to receive the extracted features from the plurality of expert modules and output a set of weights for the input modalities, and a fusion module configured to generate a weighted prediction based on the class predictions and the set of weights. Various embodiments include one or more of an image expert, a video expert, an audio expert, class prediction modules, a gate expert, and a co-learning framework.
    Type: Grant
    Filed: May 20, 2019
    Date of Patent: June 27, 2023
    Assignees: Synaptics Incorporated, The Trustees of Indiana University
    Inventors: Francesco Nesta, Lijiang Guo, Minje Kim
  • Patent number: 11373667
    Abstract: Systems and methods for processing an audio signal include an audio input operable to receive an input signal comprising a time-domain, single-channel audio signal, a subband analysis block operable to transform the input signal to a frequency domain input signal comprising a plurality of k-spaced under-sampled subband signals, a reverberation reduction block operable to reduce reverberation effect, including late reverberation, in the plurality of k-spaced under-sampled subband signals, a noise reduction block operable to reduce background noise from the plurality of k-spaced under-sampled subband signals, and a subband synthesis block operable to transform the subband signals to the time-domain, thereby producing an enhanced output signal.
    Type: Grant
    Filed: April 19, 2018
    Date of Patent: June 28, 2022
    Assignee: Synaptics Incorporated
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta, Trausti Thormundsson, Thomas Aaron Gulliver
  • Patent number: 11264017
    Abstract: Systems and methods include a plurality of audio input components configured to generate a plurality of audio input signals, and a logic device configured to receive the plurality of audio input signals, determine whether the plurality of audio signals comprise target audio associated with an audio source, estimate a relative location of the audio source with respect to the plurality of audio input components based on the plurality of audio signals and a determination of whether the plurality of audio signals comprise the target audio, and process the plurality of audio signals to generate an audio output signal by enhancing the target audio based on the estimated relative location. The logic device is further configured to use relative transfer-based covariance to construct a directional covariance matrix aligned across frequency bands and find a direction that minimizes beam power subject to distortionless criteria.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: March 1, 2022
    Assignee: Synaptics Incorporated
    Inventors: Alireza Masnadi-Shirazi, Francesco Nesta
  • Patent number: 11257512
    Abstract: Systems and methods include a first voice activity detector operable to detect speech in a frame of a multichannel audio input signal and output a speech determination, a constrained minimum variance adaptive filter operable to receive the multichannel audio input signal and the speech determination and minimize a signal variance at the output of the filter, thereby producing an equalized target speech signal, a mask estimator operable to receive the equalized target speech signal and the speech determination and generate a spectral-temporal mask to discriminate a target speech from noise and interference speech, and a second voice activity detector operable to detect voice in a frame of the speech-discriminated signal. Also included are an audio input sensor array including a plurality of microphones, each microphone generating a channel of the multichannel audio input signal, and a sub-band analysis module operable to decompose each of the channels into a plurality of frequency sub-bands.
    Type: Grant
    Filed: January 6, 2020
    Date of Patent: February 22, 2022
    Assignee: Synaptics Incorporated
    Inventors: Francesco Nesta, Alireza Masnadi-Shirazi
  • Publication number: 20220013134
    Abstract: Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal, target-speech detection logic, and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams, a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams, and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
    Type: Application
    Filed: September 24, 2021
    Publication date: January 13, 2022
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari
  • Publication number: 20210390952
    Abstract: Systems and methods include a plurality of audio input components configured to generate a plurality of audio input signals, and a logic device configured to receive the plurality of audio input signals, determine whether the plurality of audio signals comprise target audio associated with an audio source, estimate a relative location of the audio source with respect to the plurality of audio input components based on the plurality of audio signals and a determination of whether the plurality of audio signals comprise the target audio, and process the plurality of audio signals to generate an audio output signal by enhancing the target audio based on the estimated relative location. The logic device is further configured to use relative transfer-based covariance to construct a directional covariance matrix aligned across frequency bands and find a direction that minimizes beam power subject to distortionless criteria.
    Type: Application
    Filed: June 12, 2020
    Publication date: December 16, 2021
    Inventors: Alireza Masnadi-Shirazi, Francesco Nesta
  • Patent number: 11158333
    Abstract: Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal, target-speech detection logic, and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams, a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams, and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
    Type: Grant
    Filed: December 6, 2019
    Date of Patent: October 26, 2021
    Assignee: Synaptics Incorporated
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari
  • Publication number: 20210314701
    Abstract: Embodiments described herein provide a combined multi-source time difference of arrival (TDOA) tracking and voice activity detection (VAD) mechanism that is applicable for generic array geometries, e.g., a microphone array that lies on a plane. The combined multi-source TDOA tracking and VAD mechanism scans the azimuth and elevation angles of the microphone array in microphone pairs, based on which a planar locus of physically admissible TDOAs can be formed in the multi-dimensional TDOA space of multiple microphone pairs. In this way, the multi-dimensional TDOA tracking reduces the number of calculations usually involved in traditional TDOA tracking by performing the TDOA search for each dimension separately.
    Type: Application
    Filed: June 16, 2021
    Publication date: October 7, 2021
    Inventors: Alireza Masnadi-Shirazi, Francesco Nesta
  • Patent number: 11100932
    Abstract: An end detector configured to receive the feature data and detect an end point of a keyword, and a start detector configured to receive an indication of the detected end point and process the feature data associated with corresponding input frames to detect a start point of the keyword. The start detector and end detector comprise neural networks trained through a process using a cross-entropy cost function for non-Region of Target (ROT) frames and a One-Spike Connectionist Temporal Classification cost function for ROT frames.
    Type: Grant
    Filed: December 20, 2019
    Date of Patent: August 24, 2021
    Assignee: Synaptics Incorporated
    Inventors: Saeed Mosayyebpour, Francesco Nesta, Trausti Thormundsson
  • Patent number: 11087213
    Abstract: A classification training system for binary and multi-class classification comprises a neural network operable to perform classification of input data, a training dataset including pre-segmented, labeled training samples, and a classification training module operable to train the neural network using the training dataset. The classification training module includes a forward pass processing module and a backward pass processing module. The backward pass processing module is operable to determine whether a current frame is in a region of target (ROT), determine ROT information such as the beginning and length of the ROT, and update weights and biases using a cross-entropy cost function and a One-Spike Connectionist Temporal Classification (OSCTC) cost function. The backward pass processing module further computes a soft target value using the ROT information and computes a signal output error using the soft target value and the network output value.
    Type: Grant
    Filed: December 20, 2019
    Date of Patent: August 10, 2021
    Assignee: Synaptics Incorporated
    Inventors: Saeed Mosayyebpour, Trausti Thormundsson, Francesco Nesta
  • Patent number: 11082460
    Abstract: Systems and methods for audio signal enhancement facilitated using video data are provided. In one example, a method includes receiving a multi-channel audio signal including audio inputs detected by a plurality of audio input devices. The method further includes receiving an image captured by a video input device. The method further includes determining a first signal based at least in part on the image. The first signal is indicative of a likelihood associated with a target audio source. The method further includes determining a second signal based at least in part on the multi-channel audio signal and the first signal. The second signal is indicative of a likelihood associated with an audio component attributed to the target audio source. The method further includes processing the multi-channel audio signal based at least in part on the second signal to generate an output audio signal.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: August 3, 2021
    Assignee: Synaptics Incorporated
    Inventors: Francesco Nesta, Boyan Bonev, Utkarsh Gaur
  • Publication number: 20210219053
    Abstract: Embodiments described herein provide a combined multi-source time difference of arrival (TDOA) tracking and voice activity detection (VAD) mechanism that is applicable for generic array geometries, e.g., a microphone array that lies on a plane. The combined multi-source TDOA tracking and VAD mechanism scans the azimuth and elevation angles of the microphone array in microphone pairs, based on which a planar locus of physically admissible TDOAs can be formed in the multi-dimensional TDOA space of multiple microphone pairs. In this way, the multi-dimensional TDOA tracking reduces the number of calculations usually involved in traditional TDOA tracking by performing the TDOA search for each dimension separately.
    Type: Application
    Filed: January 10, 2020
    Publication date: July 15, 2021
    Inventors: Alireza Masnadi-Shirazi, Francesco Nesta
  • Patent number: 11064294
    Abstract: Embodiments described herein provide a combined multi-source time difference of arrival (TDOA) tracking and voice activity detection (VAD) mechanism that is applicable for generic array geometries, e.g., a microphone array that lies on a plane. The combined multi-source TDOA tracking and VAD mechanism scans the azimuth and elevation angles of the microphone array in microphone pairs, based on which a planar locus of physically admissible TDOAs can be formed in the multi-dimensional TDOA space of multiple microphone pairs. In this way, the multi-dimensional TDOA tracking reduces the number of calculations usually involved in traditional TDOA tracking by performing the TDOA search for each dimension separately.
    Type: Grant
    Filed: January 10, 2020
    Date of Patent: July 13, 2021
    Assignee: Synaptics Incorporated
    Inventors: Alireza Masnadi-Shirazi, Francesco Nesta
  • Patent number: 10957338
    Abstract: Audio processing systems and methods comprise an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal and a target activity detector configured to identify audio target sources in the multichannel audio signal. The target activity detector includes a voice activity detector (VAD), an instantaneous locations component configured to detect a location of a plurality of audio sources, a dominant locations component configured to selectively buffer a subset of the plurality of audio sources comprising dominant audio sources, a source tracker configured to track locations of the dominant audio sources over time, and a dominance selection component configured to select the dominant target sources for further audio processing. The instantaneous locations component computes a discrete spatial map comprising the location of the plurality of audio sources, and the dominant locations component selects N of the dominant sources from the discrete spatial map for source tracking.
    Type: Grant
    Filed: May 16, 2019
    Date of Patent: March 23, 2021
    Assignee: Synaptics Incorporated
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari, Dror Givon
  • Patent number: 10930298
    Abstract: Audio signal processing for adaptive de-reverberation uses a least mean squares (LMS) filter that has improved convergence over conventional LMS filters, making embodiments practical for reducing the effects of reverberation for use in many portable and embedded devices, such as smartphones, tablets, laptops, and hearing aids, for applications such as speech recognition and audio communication in general. The LMS filter employs a frequency-dependent adaptive step size to speed up the convergence of the predictive filter process, requiring fewer computational steps compared to a conventional LMS filter applied to the same inputs. The improved convergence is achieved at low memory consumption cost. Controlling the updates of the prediction filter in a highly non-stationary condition of the acoustic channel improves the performance under such conditions. The techniques are suitable for single or multiple channels and are applicable to microphone array processing.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: February 23, 2021
    Assignee: Synaptics Incorporated
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta
  • Publication number: 20200412772
    Abstract: Systems and methods for audio signal enhancement facilitated using video data are provided. In one example, a method includes receiving a multi-channel audio signal including audio inputs detected by a plurality of audio input devices. The method further includes receiving an image captured by a video input device. The method further includes determining a first signal based at least in part on the image. The first signal is indicative of a likelihood associated with a target audio source. The method further includes determining a second signal based at least in part on the multi-channel audio signal and the first signal. The second signal is indicative of a likelihood associated with an audio component attributed to the target audio source. The method further includes processing the multi-channel audio signal based at least in part on the second signal to generate an output audio signal.
    Type: Application
    Filed: June 27, 2019
    Publication date: December 31, 2020
    Inventors: Francesco Nesta, Boyan Bonev, Utkarsh Gaur
  • Publication number: 20200349965
    Abstract: Systems and methods for generating an enhanced audio signal comprise a trained neural network configured to receive an input audio signal and generate an enhanced target signal, the trained neural network comprising a pre-processing neural network configured to receive a segment of the input audio signal and output an audio classification, the pre-processing neural network including at least one hidden layer comprising an embedding vector, and a noise reduction neural network configured to receive the segment of the input audio signal and the embedding vector, and generate the enhanced target signal. The pre-processing neural network may comprise a target signal pre-processing neural network configured to output a target signal classification and comprising at least one hidden layer comprising a target embedding vector.
    Type: Application
    Filed: May 1, 2020
    Publication date: November 5, 2020
    Inventors: Francesco Nesta, Minje Kim, Sanna Wager
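
As an illustration of the weighted fusion step described in the abstract of patent 11687770 in the listing above, the following is a minimal Python sketch, not the patented implementation. It assumes that each class prediction module outputs a probability vector for its modality and that the gate expert produces one raw score per modality, which is normalized with a softmax. The function names `softmax` and `fuse_predictions`, the softmax normalization, and the example numbers are all hypothetical and do not come from the patent.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(class_probs, gate_scores):
    """Combine per-modality class predictions into one weighted prediction.

    class_probs: one probability vector per modality (hypothetical layout).
    gate_scores: one raw gate score per modality (assumed gate output).
    Returns the modality weights and the fused class probabilities.
    """
    if len(class_probs) != len(gate_scores):
        raise ValueError("need exactly one gate score per modality")
    weights = softmax(gate_scores)
    fused = [0.0] * len(class_probs[0])
    for weight, probs in zip(weights, class_probs):
        for index, prob in enumerate(probs):
            fused[index] += weight * prob
    return weights, fused


if __name__ == "__main__":
    # Two modalities (for example audio and image), two classes.
    predictions = [[0.9, 0.1], [0.2, 0.8]]
    weights, fused = fuse_predictions(predictions, [0.0, 0.0])
    print(weights, fused)
```

With equal gate scores the two modalities are weighted equally, so the fused prediction is the plain average of the two probability vectors. A larger gate score for one modality shifts the fused prediction toward that modality's class probabilities.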