Patents by Inventor Saeed Mosayyebpour Kaskari

Saeed Mosayyebpour Kaskari has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11853884
    Abstract: A classification training system comprises a neural network configured to perform classification of input data, a training dataset including pre-segmented, labeled training samples, and a classification training module configured to train the neural network using the training dataset. The classification training module includes a forward pass processing module, and a backward pass processing module. The backward pass processing module is configured to determine whether a current frame is in a region of target (ROT), determine ROT information such as beginning and length of the ROT and update weights and biases using a cross-entropy cost function and a tunable many-or-one detection (MOOD) cost function, that comprises a tunable hyperparameter for tuning the classifier for a particular task. The backward pass module further computes a soft target value using ROT information and computes a signal output error using the soft target value and network output value.
    Type: Grant
    Filed: April 28, 2021
    Date of Patent: December 26, 2023
    Assignee: Synaptics Incorporated
    Inventor: Saeed Mosayyebpour Kaskari
  • Patent number: 11823707
    Abstract: An audio spotting system configured for various operating modes including a regular mode and sensitivity mode is described. An example cascade audio spotting system may include a high-power subsystem including a high-power trigger and a transfer module. This high-power trigger includes one or more detection models used to detect whether a target sound activity is included in the one or more audio streams. The one or more detection models are associated with a first set of hyperparameters when the cascade audio spotting system is in a regular mode, and the one or more detection models are associated with a second set of hyperparameters when the cascade audio spotting system is in a sensitivity mode. The transfer module provides at least one of one or more processed audio streams for further processing in response to the high-power trigger detecting the target sound activity in the one or more audio streams.
    Type: Grant
    Filed: January 10, 2022
    Date of Patent: November 21, 2023
    Assignee: Synaptics Incorporated
    Inventor: Saeed Mosayyebpour Kaskari
  • Publication number: 20230326167
    Abstract: Systems and methods for classification of data comprise optimizing a neural network by minimizing a rhino loss value, including receiving a training batch of data samples comprising a plurality of samples for each of a plurality of classifications, extracting features from the samples to generate a batch of features, processing the batch of features using a neural network to generate a plurality of classifications to differentiate the samples, computing a rhino loss value for the training batch based, at least in part, on the classifications, and modifying weights of the neural network to reduce the rhino loss value.
    Type: Application
    Filed: June 14, 2023
    Publication date: October 12, 2023
    Applicant: Synaptics Incorporated
    Inventors: Saeed MOSAYYEBPOUR KASKARI, Atabak POUYA
  • Publication number: 20230223042
    Abstract: An audio spotting system configured for various operating modes including a regular mode and sensitivity mode is described. An example cascade audio spotting system may include a high-power subsystem including a high-power trigger and a transfer module. This high-power trigger includes one or more detection models used to detect whether a target sound activity is included in the one or more audio streams. The one or more detection models are associated with a first set of hyperparameters when the cascade audio spotting system is in a regular mode, and the one or more detection models are associated with a second set of hyperparameters when the cascade audio spotting system is in a sensitivity mode. The transfer module provides at least one of one or more processed audio streams for further processing in response to the high-power trigger detecting the target sound activity in the one or more audio streams.
    Type: Application
    Filed: January 10, 2022
    Publication date: July 13, 2023
    Inventor: Saeed MOSAYYEBPOUR KASKARI
  • Publication number: 20230223041
    Abstract: Systems and methods for identifying audio events in one or more audio streams include the use of a cascade audio spotting system (such as a cascade keyword spotting system (KWS)) to reduce power consumption while maintaining a desired performance. An example cascade audio spotting system may include a first module and a high-power subsystem. The first module is to receive an audio stream from one or more audio streams, process the audio stream to detect a first target sound activity in the audio stream, and provide a first signal in response to detecting the first target sound activity in the audio stream. The high-power subsystem is to (in response to the first signal being provided by the first module) receive the one or more audio streams and process the one or more audio streams to detect a second target sound activity in the one or more audio streams.
    Type: Application
    Filed: January 10, 2022
    Publication date: July 13, 2023
    Inventors: Saeed MOSAYYEBPOUR KASKARI, Hong QIU, Atabak POUYA
  • Patent number: 11694710
    Abstract: Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal and target-speech detection logic and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams, a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams, and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: July 4, 2023
    Assignee: Synaptics Incorporated
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari
  • Publication number: 20220207305
    Abstract: Systems and methods for classification of data comprise optimizing a neural network by minimizing a rhino loss function, including receiving a training batch of data samples comprising a plurality of samples for each of a plurality of classifications, extracting features from the samples to generate a batch of features, processing the batch of features using a neural network to generate a plurality of classifications to differentiate the samples, computing a rhino loss value for the training batch based, at least in part, on the classifications, and modifying weights of the neural network to reduce the rhino loss value.
    Type: Application
    Filed: December 30, 2020
    Publication date: June 30, 2022
    Inventors: Saeed MOSAYYEBPOUR KASKARI, Atabak POUYA
  • Patent number: 11373667
    Abstract: Systems and methods for processing an audio signal include an audio input operable to receive an input signal comprising a time-domain, single-channel audio signal, a subband analysis block operable to transform the input signal to a frequency domain input signal comprising a plurality of k-spaced under-sampled subband signals, a reverberation reduction block operable to reduce reverberation effect, including late reverberation, in the plurality of k-spaced under-sampled subband signals, a noise reduction block operable to reduce background noise from the plurality of k-spaced under-sampled subband signals, and a subband synthesis block operable to transform the subband signals to the time-domain, thereby producing an enhanced output signal.
    Type: Grant
    Filed: April 19, 2018
    Date of Patent: June 28, 2022
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta, Trausti Thormundsson, Thomas Aaron Gulliver
  • Patent number: 11328733
    Abstract: Systems and methods for speaker verification comprise optimizing a neural network by minimizing a generalized negative log likelihood function, including receiving a training batch of audio samples comprising a plurality of utterances for each of a plurality of speakers, extracting features from the audio samples to generate a batch of features, processing the batch of features using a neural network to generate a plurality of embedding vectors configured to differentiate audio samples by speaker, computing a generalized negative log-likelihood loss (GNLL) value for the training batch based, at least in part, on the embedding vectors, and modifying weights of the neural network to reduce the GNLL value. Computing the GNLL may include generating a centroid vector for each of a plurality of speakers, based at least in part on the embedding vectors.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: May 10, 2022
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Atabak Pouya
  • Publication number: 20220093106
    Abstract: Systems and methods for speaker verification comprise optimizing a neural network by minimizing a generalized negative log likelihood function, including receiving a training batch of audio samples comprising a plurality of utterances for each of a plurality of speakers, extracting features from the audio samples to generate a batch of features, processing the batch of features using a neural network to generate a plurality of embedding vectors configured to differentiate audio samples by speaker, computing a generalized negative log-likelihood loss (GNLL) value for the training batch based, at least in part, on the embedding vectors, and modifying weights of the neural network to reduce the GNLL value. Computing the GNLL may include generating a centroid vector for each of a plurality of speakers, based at least in part on the embedding vectors.
    Type: Application
    Filed: September 24, 2020
    Publication date: March 24, 2022
    Inventors: Saeed Mosayyebpour Kaskari, Atabak Pouya
  • Publication number: 20220013134
    Abstract: Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal and target-speech detection logic and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams, a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams, and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
    Type: Application
    Filed: September 24, 2021
    Publication date: January 13, 2022
    Inventors: Francesco NESTA, Saeed MOSAYYEBPOUR KASKARI
  • Patent number: 11158333
    Abstract: Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal and target-speech detection logic and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams, a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams, and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
    Type: Grant
    Filed: December 6, 2019
    Date of Patent: October 26, 2021
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari
  • Publication number: 20210248470
    Abstract: A classification training system comprises a neural network configured to perform classification of input data, a training dataset including pre-segmented, labeled training samples, and a classification training module configured to train the neural network using the training dataset. The classification training module includes a forward pass processing module, and a backward pass processing module. The backward pass processing module is configured to determine whether a current frame is in a region of target (ROT), determine ROT information such as beginning and length of the ROT and update weights and biases using a cross-entropy cost function and a tunable many-or-one detection (MOOD) cost function, that comprises a tunable hyperparameter for tuning the classifier for a particular task. The backward pass module further computes a soft target value using ROT information and computes a signal output error using the soft target value and network output value.
    Type: Application
    Filed: April 28, 2021
    Publication date: August 12, 2021
    Inventor: Saeed Mosayyebpour Kaskari
  • Patent number: 10957338
    Abstract: Audio processing systems and methods comprise an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal and a target activity detector configured to identify audio target sources in the multichannel audio signal. The target activity detector includes a VAD, an instantaneous locations component configured to detect a location of a plurality of audio sources, a dominant locations component configured to selectively buffer a subset of the plurality of audio sources comprising dominant audio sources, a source tracker configured to track locations of the dominant audio sources over time, and a dominance selection component configured to select the dominant target sources for further audio processing. The instantaneous location component computes a discrete spatial map comprising the location of the plurality of audio sources, and the dominant location component selects N of the dominant sources from the discrete spatial map for source tracking.
    Type: Grant
    Filed: May 16, 2019
    Date of Patent: March 23, 2021
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari, Dror Givon
  • Patent number: 10930298
    Abstract: Audio signal processing for adaptive de-reverberation uses a least mean squares (LMS) filter that has improved convergence over conventional LMS filters, making embodiments practical for reducing the effects of reverberation for use in many portable and embedded devices, such as smartphones, tablets, laptops, and hearing aids, for applications such as speech recognition and audio communication in general. The LMS filter employs a frequency-dependent adaptive step size to speed up the convergence of the predictive filter process, requiring fewer computational steps compared to a conventional LMS filter applied to the same inputs. The improved convergence is achieved at low memory consumption cost. Controlling the updates of the prediction filter in a high non-stationary condition of the acoustic channel improves the performance under such conditions. The techniques are suitable for single or multiple channels and are applicable to microphone array processing.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: February 23, 2021
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta
  • Patent number: 10762891
    Abstract: A classification training system for binary and multi-class classification comprises a neural network operable to perform classification of input data, a training dataset including pre-segmented, labeled training samples, and a classification training module operable to train the neural network using the training dataset. The classification training module includes a forward pass processing module, and a backward pass processing module. The backward pass processing module is operable to determine whether a current frame is in a region of target (ROT), determine ROT information such as beginning and length of the ROT and update weights and biases using a cross-entropy cost function and connectionist temporal classification cost function. The backward pass module further computes a soft target value using ROT information and computes a signal output error using the soft target value and network output value.
    Type: Grant
    Filed: February 12, 2018
    Date of Patent: September 1, 2020
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Trausti Thormundsson, Francesco Nesta
  • Patent number: 10762427
    Abstract: Classification training systems and methods include a neural network for classification of input data, a training dataset providing segmented labeled training data, and a classification training module operable to train the neural network using the training data. A forward pass processing module is operable to generate neural network outputs for the training data using weights and bias for the neural network, and a backward pass processing module is operable to update the weights and biases in a backward pass, including obtaining Region of Target (ROT) information from the training data, generate a forward-backward masking based on the ROT information, the forward-backward masking placing at least one restriction on a neural network output path, compute modified forward and backward variables based on the neural network outputs and the forward-backward masking, and update the weights and biases.
    Type: Grant
    Filed: March 1, 2018
    Date of Patent: September 1, 2020
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Trausti Thormundsson, Francesco Nesta
  • Patent number: 10762417
    Abstract: A classification system and method for training a neural network includes receiving a stream of segmented, labeled training data having a sequence of frames, computing a stream of input features data for the sequence of frames, and generating neural network outputs for the sequence of frames in a forward pass through the training data and in accordance weights and biases. The weights and biases are updated in a backward pass through the training data, including determining Region of Target (ROT) information from the segmented, labeled training data, computing modified forward and backward variables based on the neural network outputs and the ROT information, deriving a signal error for each frame within the sequence of frames based on the modified forward and backward variables, and updating the weights and biases based on the derived signal error. An adaptive learning module is provided to improve a convergence rate of the neural network.
    Type: Grant
    Filed: February 12, 2018
    Date of Patent: September 1, 2020
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Trausti Thormundsson, Francesco Nesta
  • Publication number: 20200184985
    Abstract: Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal and target-speech detection logic and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams, a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams, and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
    Type: Application
    Filed: December 6, 2019
    Publication date: June 11, 2020
    Inventors: Francesco Nesta, Saeed Mosayyebpour Kaskari
  • Patent number: 10504539
    Abstract: An audio processing device or method includes an audio transducer operable to receive audio input and generate an audio signal based on the audio input. The audio processing device or method also includes an audio signal processor operable to extract local features from the audio signal, such as Power-Normalized Coefficients (PNCC) of the audio signal. The audio signal processor also is operable to extract global features from the audio signal, such as chroma features and harmonicity features. A neural network is provided to determine a probability that a target audio is present in the audio signal based on the local and global features. In particular, the neural network is trained to output a value indicating whether the target audio is present and locally dominant in the audio signal.
    Type: Grant
    Filed: December 5, 2017
    Date of Patent: December 10, 2019
    Assignee: SYNAPTICS INCORPORATED
    Inventors: Saeed Mosayyebpour Kaskari, Francesco Nesta