Abstract: A computer-implemented method of generating advanced feature discrimination vectors (AFDVs) representing sounds forming part of an audio signal input to a device is provided. The method includes taking a plurality of samples of the audio signal, and for each sample of the audio signal taken: performing a signal analysis on the sample to extract one or more high resolution oscillator peaks therefrom; renormalizing the extracted oscillator peaks to eliminate variations in the fundamental frequency and time duration occurring over the sample window; normalizing the power of the renormalized extracted oscillator peaks; and forming the renormalized and power normalized extracted oscillator peaks into a respective AFDV for the sample. The method further includes outputting the respective AFDV to a comparison function configured to identify a characteristic of the sample based on a comparison of the respective AFDV with a library of AFDVs associated with known sounds and/or known speakers.
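To make the processing steps in the abstract above concrete, here is a minimal NumPy sketch of a pipeline of that general shape: local maxima of the magnitude spectrum stand in for the high-resolution oscillator peaks, the strongest peak stands in for the fundamental, and nearest-neighbour distance stands in for the comparison function. All function names and parameters (`extract_peaks`, `form_afdv`, `identify`, `n_peaks`) are illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch of the AFDV-style pipeline (not the patented implementation):
# extract spectral peaks from a windowed sample, renormalize frequencies against an
# estimated fundamental, normalize power, and compare against a labelled library.
import numpy as np

def extract_peaks(sample, sr, n_peaks=16):
    """Return (frequencies, magnitudes) of the strongest spectral peaks."""
    spectrum = np.abs(np.fft.rfft(sample * np.hanning(len(sample))))
    freqs = np.fft.rfftfreq(len(sample), 1.0 / sr)
    # Local maxima of the magnitude spectrum serve as stand-in "oscillator peaks".
    idx = [i for i in range(1, len(spectrum) - 1)
           if spectrum[i] > spectrum[i - 1] and spectrum[i] > spectrum[i + 1]]
    idx = sorted(idx, key=lambda i: spectrum[i], reverse=True)[:n_peaks]
    return freqs[idx], spectrum[idx]

def form_afdv(sample, sr, n_peaks=16):
    """Renormalize peak frequencies by a fundamental estimate and unit-normalize power."""
    freqs, mags = extract_peaks(sample, sr, n_peaks)
    f0 = freqs[np.argmax(mags)]                    # crude fundamental estimate (strongest peak)
    ratios = np.sort(freqs / max(f0, 1e-9))        # frequency renormalization
    mags = mags / (np.linalg.norm(mags) + 1e-12)   # power normalization
    return np.concatenate([ratios, np.sort(mags)[::-1]])

def identify(afdv, library):
    """library: list of (label, afdv_vector) with vectors of matching length.
    Returns the label of the nearest library vector."""
    return min(library, key=lambda item: np.linalg.norm(afdv - item[1]))[0]
```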
Abstract: A method of processing a signal includes taking a signal recorded by a plurality of signal recorders, applying at least one super-resolution technique to the signal to produce an oscillator peak representation of the signal comprising a plurality of frequency components for a plurality of oscillator peaks, computing at least one Cross Channel Complex Spectral Phase Evolution (XCSPE) attribute for the signal to produce a measure of a spatial evolution of the plurality of oscillator peaks between the recorded signals, identifying a known predicted XCSPE curve (PXC) trace corresponding to the frequency components and at least one XCSPE attribute of the plurality of oscillator peaks, and utilizing the identified PXC trace to determine a spatial attribute corresponding to an origin of the signal.
Type:
Grant
Filed:
October 17, 2019
Date of Patent:
April 13, 2021
Assignee:
XMOS INC.
Inventors:
Kevin M. Short, Brian T. Hone, Pascal Brunet
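The XCSPE attribute described in the entry above measures how spectral phase evolves across recording channels. Below is a hedged sketch of one such cross-channel measurement: it converts the per-peak phase difference between two channels into an implied inter-channel delay. The function name, the Hann window, and the delay formula are illustrative assumptions, not the patent's definition of the attribute.

```python
# Cross-channel phase measurement in the spirit of an XCSPE attribute (illustrative only).
import numpy as np

def cross_channel_phase_delay(frame_a, frame_b, sr, peak_bins):
    """For each oscillator-peak bin, estimate the inter-channel time delay implied by
    the cross-channel phase difference (sign convention depends on channel ordering)."""
    win = np.hanning(len(frame_a))
    A = np.fft.rfft(frame_a * win)
    B = np.fft.rfft(frame_b * win)
    freqs = np.fft.rfftfreq(len(frame_a), 1.0 / sr)
    delays = {}
    for k in peak_bins:
        dphi = np.angle(A[k] * np.conj(B[k]))   # phase of channel A relative to channel B
        f = max(freqs[k], 1e-9)
        delays[k] = dphi / (2.0 * np.pi * f)    # implied inter-channel delay in seconds
    return delays
```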
Abstract: A method includes receiving an input signal comprising an original domain signal and creating a first window data set and a second window data set from the signal, wherein an initiation of the second window data set is offset from an initiation of the first window data set, converting the first window data set and the second window data set to a frequency domain and storing the resulting data as data in a second domain different from the original domain, performing complex spectral phase evolution (CSPE) on the second domain data to estimate component frequencies of the first and second window data sets, using the component frequencies estimated in the CSPE, sampling a set of second-domain high resolution windows to select a mathematical representation comprising a second-domain high resolution window that fits at least one of the amplitude, phase, amplitude modulation and frequency modulation of a component of an underlying signal, wherein the component comprises at least one oscillator peak, generating an ou…
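The two-offset-window analysis in the abstract above is the core of complex spectral phase evolution: the phase advance between two windows whose start points differ by a known number of samples pins down a component's frequency far more precisely than the FFT bin spacing. A self-contained sketch under that reading follows; `cspe_refine`, the one-sample offset, and the Hann window are illustrative choices, not the patented procedure.

```python
# CSPE-style frequency refinement from two offset analysis windows (illustrative only).
import numpy as np

def cspe_refine(signal, sr, n_fft=1024, offset=1):
    """Refine per-bin frequency estimates by comparing the spectral phase of two
    windows whose start points differ by `offset` samples."""
    win = np.hanning(n_fft)
    F0 = np.fft.rfft(signal[:n_fft] * win)
    F1 = np.fft.rfft(signal[offset:offset + n_fft] * win)
    # Phase advance over `offset` samples maps directly to frequency.
    dphi = np.angle(F1 * np.conj(F0))
    return dphi * sr / (2.0 * np.pi * offset)   # one refined estimate per FFT bin

# Usage: a 440.3 Hz tone falls between FFT bins (15.625 Hz spacing at these settings),
# but the refined estimate at the peak bin recovers the off-bin frequency.
sr = 16000
t = np.arange(4096) / sr
x = np.sin(2 * np.pi * 440.3 * t)
spectrum = np.abs(np.fft.rfft(x[:1024] * np.hanning(1024)))
k = int(np.argmax(spectrum))
print(round(cspe_refine(x, sr)[k], 1))          # prints a value close to 440.3
```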
Abstract: A method of processing a signal includes taking a signal recorded by a plurality of signal recorders, applying at least one super-resolution technique to the signal to produce an oscillator peak representation of the signal comprising a plurality of frequency components for a plurality of oscillator peaks, computing at least one Cross Channel Complex Spectral Phase Evolution (XCSPE) attribute for the signal to produce a measure of a spatial evolution of the plurality of oscillator peaks between the recorded signals, identifying a known predicted XCSPE curve (PXC) trace corresponding to the frequency components and at least one XCSPE attribute of the plurality of oscillator peaks, and utilizing the identified PXC trace to determine a spatial attribute corresponding to an origin of the signal.
Type:
Grant
Filed:
April 8, 2015
Date of Patent:
December 3, 2019
Assignee:
XMOS INC.
Inventors:
Kevin M. Short, Brian T. Hone, Pascal Brunet
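A hedged sketch of the PXC-trace idea from the entry above: predicted phase-versus-frequency curves are generated for a family of candidate inter-channel delays, and the curve that best fits the measured per-peak phases selects the spatial attribute. The pure-delay curve model, the grid search, and the name `best_delay` are assumptions made for this example, not the patent's trace-matching method.

```python
# Match measured per-peak cross-channel phases against predicted phase-vs-frequency
# curves for candidate inter-channel delays (illustrative only).
import numpy as np

def best_delay(peak_freqs, measured_dphi, candidate_delays):
    """Return the candidate delay whose predicted phase curve best fits the
    measured cross-channel phases at the oscillator peaks."""
    peak_freqs = np.asarray(peak_freqs, dtype=float)
    measured_dphi = np.asarray(measured_dphi, dtype=float)
    best, best_err = None, np.inf
    for tau in candidate_delays:
        predicted = 2.0 * np.pi * peak_freqs * tau
        # Wrap the phase error into (-pi, pi] before scoring.
        err = np.angle(np.exp(1j * (measured_dphi - predicted)))
        score = float(np.sum(err ** 2))
        if score < best_err:
            best, best_err = tau, score
    return best   # delay in seconds; converting to a bearing needs the mic geometry

# Usage with the per-peak phases from the cross-channel sketch above:
# tau = best_delay(freqs_at_peaks, dphi_at_peaks, np.linspace(-5e-4, 5e-4, 201))
```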
Abstract: A method of renormalizing high-resolution oscillator peaks, extracted from windowed samples of an audio signal, is disclosed. Feature vectors are generated for which variations in both fundamental frequency and time duration of speech are substantially mitigated. The feature vectors may be aligned within a common coordinate space, free of those variations in frequency and time duration that occur between speakers, and even over speech by a single speaker, to facilitate a simple and accurate determination of matches between those feature vectors (AFDVs) generated from a sample of the audio signal and corpus AFDVs generated for known speech at the phoneme and sub-phoneme level. The renormalized feature vectors can be combined with traditional feature vectors such as MFCCs, or they can be used exclusively to identify voiced, semi-voiced and unvoiced sounds.
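As a rough illustration of that last point, the sketch below fuses a renormalized feature vector with a conventional MFCC vector by concatenation and scores the result against a labelled phoneme-level corpus using cosine similarity. Both the fusion-by-concatenation scheme and the cosine scoring are assumptions made for the example, not the disclosed method.

```python
# Fuse a renormalized feature vector with an MFCC vector and match against a
# labelled phoneme-level corpus (illustrative only).
import numpy as np

def fuse(renorm_vec, mfcc_vec):
    """Concatenate the two feature families after scaling each to unit norm,
    so neither dominates the similarity computation."""
    r = renorm_vec / (np.linalg.norm(renorm_vec) + 1e-12)
    m = mfcc_vec / (np.linalg.norm(mfcc_vec) + 1e-12)
    return np.concatenate([r, m])

def best_phoneme(query, corpus):
    """corpus: list of (phoneme_label, fused_vector). Returns the label whose
    vector has the highest cosine similarity to the query."""
    q = query / (np.linalg.norm(query) + 1e-12)
    sims = [(label, float(np.dot(q, v) / (np.linalg.norm(v) + 1e-12)))
            for label, v in corpus]
    return max(sims, key=lambda lv: lv[1])[0]
```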