Patents by Inventor Francesco Nesta

Francesco Nesta has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10762891
    Abstract: A classification training system for binary and multi-class classification comprises a neural network operable to perform classification of input data, a training dataset including pre-segmented, labeled training samples, and a classification training module operable to train the neural network using the training dataset. The classification training module includes a forward pass processing module, and a backward pass processing module. The backward pass processing module is operable to determine whether a current frame is in a region of target (ROT), determine ROT information such as beginning and length of the ROT and update weights and biases using a cross-entropy cost function and connectionist temporal classification cost function. The backward pass module further computes a soft target value using ROT information and computes a signal output error using the soft target value and network output value.
    Type: Grant
    Filed: February 12, 2018
    Date of Patent: September 1, 2020
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Trausti Thormundsson, Francesco Nesta
  • Patent number: 10762427
    Abstract: Classification training systems and methods include a neural network for classification of input data, a training dataset providing segmented labeled training data, and a classification training module operable to train the neural network using the training data. A forward pass processing module is operable to generate neural network outputs for the training data using weights and biases for the neural network, and a backward pass processing module is operable to update the weights and biases in a backward pass, including obtaining Region of Target (ROT) information from the training data, generating a forward-backward masking based on the ROT information, the forward-backward masking placing at least one restriction on a neural network output path, computing modified forward and backward variables based on the neural network outputs and the forward-backward masking, and updating the weights and biases.
    Type: Grant
    Filed: March 1, 2018
    Date of Patent: September 1, 2020
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Trausti Thormundsson, Francesco Nesta
  • Patent number: 10762417
    Abstract: A classification system and method for training a neural network includes receiving a stream of segmented, labeled training data having a sequence of frames, computing a stream of input features data for the sequence of frames, and generating neural network outputs for the sequence of frames in a forward pass through the training data and in accordance with weights and biases. The weights and biases are updated in a backward pass through the training data, including determining Region of Target (ROT) information from the segmented, labeled training data, computing modified forward and backward variables based on the neural network outputs and the ROT information, deriving a signal error for each frame within the sequence of frames based on the modified forward and backward variables, and updating the weights and biases based on the derived signal error. An adaptive learning module is provided to improve a convergence rate of the neural network.
    Type: Grant
    Filed: February 12, 2018
    Date of Patent: September 1, 2020
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Trausti Thormundsson, Francesco Nesta
  • Publication number: 20200219530
    Abstract: Systems and methods include a first voice activity detector operable to detect speech in a frame of a multichannel audio input signal and output a speech determination, a constrained minimum variance adaptive filter operable to receive the multichannel audio input signal and the speech determination and minimize a signal variance at the output of the filter, thereby producing an equalized target speech signal, a mask estimator operable to receive the equalized target speech signal and the speech determination and generate a spectral-temporal mask to discriminate a target speech from noise and interference speech, and a second voice activity detector operable to detect voice in a frame of the speech discriminated signal. Embodiments also include an audio input sensor array including a plurality of microphones, each microphone generating a channel of the multichannel audio input signal, and a sub-band analysis module operable to decompose each of the channels into a plurality of frequency sub-bands.
    Type: Application
    Filed: January 6, 2020
    Publication date: July 9, 2020
    Inventors: Francesco Nesta, Alireza Masnadi-Shirazi
  • Publication number: 20200184985
    Abstract: Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal and target-speech detection logic and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams, a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams, and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
    Type: Application
    Filed: December 6, 2019
    Publication date: June 11, 2020
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari
  • Patent number: 10679617
    Abstract: A real-time audio signal processing system includes an audio signal processor configured to process audio signals using a modified generalized eigenvalue (GEV) beamforming technique to generate an enhanced target audio output signal. The digital signal processor includes a sub-band decomposition circuitry configured to decompose the audio signal into sub-band frames in the frequency domain and a target activity detector configured to detect whether a target audio is present in the sub-band frames. Based on information related to the sub-band frames and the determination of whether the target audio is present in the sub-band frames, the digital signal processor is configured to use the modified GEV technique to estimate the relative transfer function (RTF) of the target audio source, and generate a filter based on the estimated RTF. The filter may then be applied to the audio signals to generate the enhanced audio output signal.
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: June 9, 2020
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Frederic Philippe Denis Mustiere, Francesco Nesta
  • Publication number: 20200126556
    Abstract: An end detector configured to receive the feature data and detect an end point of a keyword, and a start detector configured to receive an indication of the detected end point and process the feature data associated with corresponding input frames to detect a start point of the keyword. The start detector and end detector comprise neural networks trained through a process using a cross-entropy cost function for non-Region of Target (ROT) frames and a One-Spike Connectionist Temporal Classification cost function for ROT frames.
    Type: Application
    Filed: December 20, 2019
    Publication date: April 23, 2020
    Inventors: Saeed Mosayyebpour, Francesco Nesta, Trausti Thormundsson
  • Publication number: 20200125951
    Abstract: A classification training system for binary and multi-class classification comprises a neural network operable to perform classification of input data, a training dataset including pre-segmented, labeled training samples, and a classification training module operable to train the neural network using the training dataset. The classification training module includes a forward pass processing module, and a backward pass processing module. The backward pass processing module is operable to determine whether a current frame is in a region of target (ROT), determine ROT information such as beginning and length of the ROT and update weights and biases using a cross-entropy cost function and One Spike Connectionist Temporal Classification (OSCTC) cost function. The backward pass module further computes a soft target value using ROT information and computes a signal output error using the soft target value and network output value.
    Type: Application
    Filed: December 20, 2019
    Publication date: April 23, 2020
    Inventors: Saeed Mosayyebpour, Trausti Thormundsson, Francesco Nesta
  • Patent number: 10614788
    Abstract: Systems and methods for enhancing a headset user's own voice include an outside microphone, an inside microphone, audio input components operable to receive a plurality of time-domain microphone signals, including an outside microphone signal from the outside microphone and an inside microphone signal from the inside microphone, a subband decomposition module operable to transform the time-domain microphone signals to frequency domain subband signals, a voice activity detector operable to detect speech presence and absence in the subband signals, a speech extraction module operable to predict a clean speech signal in each of the inside microphone signal and the outside microphone signal, and cancel audio sources other than a headset user's own voice by combining the predicted clean speech signal from the inside microphone signal and the predicted clean speech signal from the outside microphone signal, and a postfiltering module operable to reduce residual noise.
    Type: Grant
    Filed: March 15, 2018
    Date of Patent: April 7, 2020
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Frederic Philippe Denis Mustiere, Francesco Nesta, Trausti Thormundsson
  • Patent number: 10504539
    Abstract: An audio processing device or method includes an audio transducer operable to receive audio input and generate an audio signal based on the audio input. The audio processing device or method also includes an audio signal processor operable to extract local features from the audio signal, such as Power-Normalized Cepstral Coefficients (PNCC) of the audio signal. The audio signal processor is also operable to extract global features from the audio signal, such as chroma features and harmonicity features. A neural network is provided to determine a probability that a target audio is present in the audio signal based on the local and global features. In particular, the neural network is trained to output a value indicating whether the target audio is present and locally dominant in the audio signal.
    Type: Grant
    Filed: December 5, 2017
    Date of Patent: December 10, 2019
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta
  • Publication number: 20190355373
    Abstract: Audio processing systems and methods comprise an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal and a target activity detector configured to identify audio target sources in the multichannel audio signal. The target activity detector includes a VAD, an instantaneous locations component configured to detect a location of a plurality of audio sources, a dominant locations component configured to selectively buffer a subset of the plurality of audio sources comprising dominant audio sources, a source tracker configured to track locations of the dominant audio sources over time, and a dominance selection component configured to select the dominant target sources for further audio processing. The instantaneous location component computes a discrete spatial map comprising the location of the plurality of audio sources, and the dominant location component selects N of the dominant sources from the discrete spatial map for source tracking.
    Type: Application
    Filed: May 16, 2019
    Publication date: November 21, 2019
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari, Dror Givon
  • Publication number: 20190354797
    Abstract: Systems and methods for multimodal classification include a plurality of expert modules, each expert module configured to receive data corresponding to one of a plurality of input modalities and extract associated features, a plurality of class prediction modules, each class prediction module configured to receive extracted features from a corresponding one of the expert modules and predict an associated class, a gate expert configured to receive the extracted features from the plurality of expert modules and output a set of weights for the input modalities, and a fusion module configured to generate a weighted prediction based on the class predictions and the set of weights. Various embodiments include one or more of an image expert, a video expert, an audio expert, class prediction modules, a gate expert, and a co-learning framework.
    Type: Application
    Filed: May 20, 2019
    Publication date: November 21, 2019
    Inventors: Francesco Nesta, Lijiang Guo, Minje Kim
  • Patent number: 10446171
    Abstract: Systems and methods for processing multichannel audio signals include receiving a multichannel time-domain audio input, transforming the input signal to a plurality of multi-channel frequency domain, k-spaced under-sampled subband signals, buffering and delaying each channel, saving a subset of spectral frames for prediction filter estimation at each of the spectral frames, estimating a variance of the frequency domain signal at each of the spectral frames, adaptively estimating the prediction filter in an online manner using a recursive least squares (RLS) algorithm, linearly filtering each channel using the estimated prediction filter, nonlinearly filtering the linearly filtered output signal to reduce residual reverberation and the estimated variances, producing a nonlinearly filtered output signal, and synthesizing the nonlinearly filtered output signal to reconstruct a dereverberated time-domain multi-channel audio signal.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: October 15, 2019
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta, Trausti Thormundsson
  • Patent number: 10347271
    Abstract: Various techniques are provided to perform enhanced automatic speech recognition. For example, a subband analysis may be performed that transforms time-domain signals of multiple audio channels into subband signals. An adaptive configurable transformation may also be performed to produce single or multichannel-based features whose values are correlated to an Ideal Binary Mask (IBM). An unsupervised Gaussian Mixture Model (GMM) model fitting the distribution of the features and producing posterior probabilities may also be performed, and the posteriors may be combined to produce deep neural network (DNN) feature vectors. A DNN may be provided that predicts oracle spectral gains from the input feature vectors. Spectral processing may be performed to produce an estimate of the target source time-frequency magnitudes from the mixtures and the output of the DNN. Subband synthesis may be performed to transform signals back to the time domain.
    Type: Grant
    Filed: December 2, 2016
    Date of Patent: July 9, 2019
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Francesco Nesta, Xiangyuan Zhao, Trausti Thormundsson
  • Publication number: 20190172450
    Abstract: A real-time audio signal processing system includes an audio signal processor configured to process audio signals using a modified generalized eigenvalue (GEV) beamforming technique to generate an enhanced target audio output signal. The digital signal processor includes a sub-band decomposition circuitry configured to decompose the audio signal into sub-band frames in the frequency domain and a target activity detector configured to detect whether a target audio is present in the sub-band frames. Based on information related to the sub-band frames and the determination of whether the target audio is present in the sub-band frames, the digital signal processor is configured to use the modified GEV technique to estimate the relative transfer function (RTF) of the target audio source, and generate a filter based on the estimated RTF. The filter may then be applied to the audio signals to generate the enhanced audio output signal.
    Type: Application
    Filed: December 6, 2017
    Publication date: June 6, 2019
    Inventors: Frederic Philippe Denis Mustiere, Francesco Nesta
  • Publication number: 20190172480
    Abstract: An audio processing device or method includes an audio transducer operable to receive audio input and generate an audio signal based on the audio input. The audio processing device or method also includes an audio signal processor operable to extract local features from the audio signal, such as Power-Normalized Cepstral Coefficients (PNCC) of the audio signal. The audio signal processor is also operable to extract global features from the audio signal, such as chroma features and harmonicity features. A neural network is provided to determine a probability that a target audio is present in the audio signal based on the local and global features. In particular, the neural network is trained to output a value indicating whether the target audio is present and locally dominant in the audio signal.
    Type: Application
    Filed: December 5, 2017
    Publication date: June 6, 2019
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta
  • Patent number: 10123113
    Abstract: A selective audio source enhancement system includes a processor and a memory, and a pre-processing unit configured to receive audio data including a target audio signal, and to perform sub-band domain decomposition of the audio data to generate buffered outputs. In addition, the system includes a target source detection unit configured to receive the buffered outputs, and to generate a target presence probability corresponding to the target audio signal, as well as a spatial filter estimation unit configured to receive the target presence probability, and to transform frames buffered in each sub-band into a higher resolution frequency-domain. The system also includes a spectral filtering unit configured to retrieve a multichannel image of the target audio signal and noise signals associated with the target audio signal, and an audio synthesis unit configured to extract an enhanced mono signal corresponding to the target audio signal from the multichannel image.
    Type: Grant
    Filed: May 15, 2017
    Date of Patent: November 6, 2018
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Francesco Nesta, Trausti Thormundsson, Willie Wu
  • Publication number: 20180308503
    Abstract: Systems and methods for processing an audio signal include an audio input operable to receive an input signal comprising a time-domain, single-channel audio signal, a subband analysis block operable to transform the input signal to a frequency domain input signal comprising a plurality of k-spaced under-sampled subband signals, a reverberation reduction block operable to reduce reverberation effect, including late reverberation, in the plurality of k-spaced under-sampled subband signals, a noise reduction block operable to reduce background noise from the plurality of k-spaced under-sampled subband signals, and a subband synthesis block operable to transform the subband signals to the time-domain, thereby producing an enhanced output signal.
    Type: Application
    Filed: April 19, 2018
    Publication date: October 25, 2018
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta, Trausti Thormundsson, Thomas Aaron Gulliver
  • Publication number: 20180268798
    Abstract: Systems and methods for enhancing a headset user's own voice include an outside microphone, an inside microphone, audio input components operable to receive a plurality of time-domain microphone signals, including an outside microphone signal from the outside microphone and an inside microphone signal from the inside microphone, a subband decomposition module operable to transform the time-domain microphone signals to frequency domain subband signals, a voice activity detector operable to detect speech presence and absence in the subband signals, a speech extraction module operable to predict a clean speech signal in each of the inside microphone signal and the outside microphone signal, and cancel audio sources other than a headset user's own voice by combining the predicted clean speech signal from the inside microphone signal and the predicted clean speech signal from the outside microphone signal, and a postfiltering module operable to reduce residual noise.
    Type: Application
    Filed: March 15, 2018
    Publication date: September 20, 2018
    Inventors: Frederic Philippe Denis Mustiere, Francesco Nesta, Trausti Thormundsson
  • Publication number: 20180253648
    Abstract: Classification training systems and methods include a neural network for classification of input data, a training dataset providing segmented labeled training data, and a classification training module operable to train the neural network using the training data. A forward pass processing module is operable to generate neural network outputs for the training data using weights and biases for the neural network, and a backward pass processing module is operable to update the weights and biases in a backward pass, including obtaining Region of Target (ROT) information from the training data, generating a forward-backward masking based on the ROT information, the forward-backward masking placing at least one restriction on a neural network output path, computing modified forward and backward variables based on the neural network outputs and the forward-backward masking, and updating the weights and biases.
    Type: Application
    Filed: March 1, 2018
    Publication date: September 6, 2018
    Inventors: Saeed Mosayyebpour Kaskari, Trausti Thormundsson, Francesco Nesta
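
Several of the abstracts listed above (for example publication 20190354797, on multimodal classification) describe a fusion step in which a gate produces one weight per input modality and the per-modality class predictions are combined into a weighted prediction. The sketch below illustrates only that general idea and is not taken from any of the filings. The function names `softmax` and `fuse_predictions`, the use of a softmax to normalize the gate scores, and the example numbers are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(class_probs, gate_scores):
    """Combine per-modality class probabilities into one weighted prediction.

    class_probs: one list of class probabilities per modality (hypothetical
                 outputs of the class prediction modules).
    gate_scores: one raw score per modality (hypothetical gate output);
                 normalizing with softmax is an assumption of this sketch.
    """
    if len(class_probs) != len(gate_scores):
        raise ValueError("need exactly one gate score per modality")
    weights = softmax(gate_scores)
    fused = [0.0] * len(class_probs[0])
    for weight, probs in zip(weights, class_probs):
        for k, p in enumerate(probs):
            fused[k] += weight * p
    return fused


if __name__ == "__main__":
    # Two modalities (say, audio and image), three classes, made-up numbers.
    audio = [0.7, 0.2, 0.1]
    image = [0.1, 0.3, 0.6]
    print(fuse_predictions([audio, image], gate_scores=[2.0, 0.5]))
```

Because the weights sum to 1 and each modality supplies a probability distribution, the fused output is itself a probability distribution over the classes. A real system would learn the gate jointly with the expert modules rather than take fixed scores.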