Patents by Inventor Patrick Naylor
Patrick Naylor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250087231
Abstract: There is provided a speech processing system that includes a neural encoder module. A processor that receives an audio signal; and the memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs an N-1 RTFs or an RTF codebook ids and computes and processes an N-1 residual stream; and a back end module comprising a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.
Type: Application
Filed: September 20, 2024
Publication date: March 13, 2025
Inventors: Dushyant SHARMA, Patrick NAYLOR, Daniel T. JONES
-
Patent number: 12165668
Abstract: A method of performing at least de-reverberation and noise-reduction of an input sound signal of at least one input channel includes: performing, using at least one filter element, at least one of de-reverberation and noise-reduction of the input sound signal to generate a clean output sound signal; and determining, by a non-intrusive measure (NIM) estimation element, at least one non-intrusive measure (NIM) from the sound signal, wherein the at least one NIM includes at least one of voice activity detection (VAD) posterior, reverberation time, clarity index, direct-to-reverberant ratio (DRR), and signal-to-noise ratio (SNR); the de-reverberation is achieved by applying at least one channel shortening (CS) filter component of the at least one filter element in conjunction with the at least one NIM; and the noise reduction is performed in combination with the de-reverberation by the channel shortening (CS) filter component.
Type: Grant
Filed: February 18, 2022
Date of Patent: December 10, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Dushyant Sharma, James Fosburgh, Patrick Naylor
-
Patent number: 12142293
Abstract: There is provided a speech processing system that includes a neural encoder module. A processor that receives an audio signal; and the memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs an N-1 RTFs or an RTF codebook ids and computes and processes an N-1 residual stream; and a back end module comprising a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.
Type: Grant
Filed: June 30, 2022
Date of Patent: November 12, 2024
Assignee: Microsoft Technology Licensing, LLC.
Inventors: Dushyant Sharma, Patrick Naylor, Daniel T. Jones
-
Publication number: 20240005946
Abstract: There is provided a speech processing system that includes a neural encoder module. A processor that receives an audio signal; and the memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs an N-1 RTFs or an RTF codebook ids and computes and processes an N-1 residual stream; and a back end module comprising a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.
Type: Application
Filed: June 30, 2022
Publication date: January 4, 2024
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Dushyant SHARMA, Patrick NAYLOR, Daniel T. JONES
-
Patent number: 11835625
Abstract: A method of performing distance estimation between a first recording device at a first location and a second recording device at a second location includes: estimating acoustic relative transfer function (RTF) between the first recording device and the second recording device for a sound signal, e.g., by applying an improved proportionate normalized least mean square (IPNLMS) filter; and estimating the distance between the first recording device and the second recording device based on the RTF. The at least one acoustic feature extracted from the RTF estimated between the first recording device and the second recording device includes at least one of clarity index, direct-to-reverberant ratio (DRR), and reverberation time. A distributed-gradient-boosting algorithm with regression trees is used in combination with signal-to-reverberation ratio (SRR) and the at least one acoustic feature extracted from the RTF to estimate the distance between the first recording device and the second recording device.
Type: Grant
Filed: March 15, 2022
Date of Patent: December 5, 2023
Assignee: Microsoft Technology Licensing, LLC.
Inventors: Francesco Nespoli, Patrick Naylor, Daniel Barreda
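Editorial illustration (not part of the patent record): the abstract above names clarity index and direct-to-reverberant ratio (DRR) as features extracted from the estimated RTF. The sketch below shows how such energy-ratio features are commonly computed from a time-domain response. It assumes the RTF estimate is available as a plain list of filter taps; the function names, the 50 ms early window, and the 2.5 ms direct-path window are illustrative assumptions, not details taken from the patent.

```python
import math


def energy_ratio_db(taps, split_index):
    """Ratio (in dB) of the energy before split_index to the energy from it onward."""
    early = sum(v * v for v in taps[:split_index])
    late = sum(v * v for v in taps[split_index:])
    if early <= 0.0 or late <= 0.0:
        raise ValueError("both segments must contain non-zero energy")
    return 10.0 * math.log10(early / late)


def clarity_index_db(taps, sample_rate, early_ms=50.0):
    """Clarity index: early energy (first early_ms, assumed 50 ms) over late energy."""
    return energy_ratio_db(taps, int(sample_rate * early_ms / 1000))


def direct_to_reverberant_db(taps, sample_rate, direct_window_ms=2.5):
    """DRR: energy up to a short window after the strongest tap over the remainder.

    The window length after the peak is an assumption made for this sketch.
    """
    peak = max(range(len(taps)), key=lambda i: abs(taps[i]))
    split = peak + int(sample_rate * direct_window_ms / 1000) + 1
    return energy_ratio_db(taps, split)


# Toy response at a 1 kHz sample rate: a direct path, one early echo, one late echo.
toy_taps = [1.0, 0.0, 0.0, 0.1] + [0.0] * 56 + [0.1]
toy_drr = direct_to_reverberant_db(toy_taps, sample_rate=1000)
toy_c50 = clarity_index_db(toy_taps, sample_rate=1000)
```

In the method the abstract describes, features like these would be fed, together with the signal-to-reverberation ratio, to a gradient-boosted regression model; that stage is not sketched here.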
-
Publication number: 20230315815
Abstract: A method includes: providing a workstation having a playback app configured for audio playback; providing a decryption module having a decryption functionality communicatively connected to the playback app; encrypting, by a server using an encryption key associated with the decryption module, audio data; and decrypting, using the decryption module, the encrypted audio data. The decryption module having the decryption functionality is provided as part of the playback app, as part of firmware of a headphone, or as part of a phone app. The method can additionally include: i) authenticating, using a voice biometric authentication module, a transcriber; ii) enabling decryption by the decryption module only upon input of a decode PIN by the transcriber; and iii) a) modifying the audio data to spatialize speech component and noise component of the audio data at different angles using head-related transfer function (HRTF) filtering, and b) playing back the audio data binaurally.
Type: Application
Filed: April 5, 2022
Publication date: October 5, 2023
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: William F. GANONG, III, Ljubomir MILANOVIC, Uwe JOST, Dushyant SHARMA, Patrick NAYLOR
-
Publication number: 20230296767
Abstract: A method of performing distance estimation between a first recording device at a first location and a second recording device at a second location includes: estimating acoustic relative transfer function (RTF) between the first recording device and the second recording device for a sound signal, e.g., by applying an improved proportionate normalized least mean square (IPNLMS) filter; and estimating the distance between the first recording device and the second recording device based on the RTF. The at least one acoustic feature extracted from the RTF estimated between the first recording device and the second recording device includes at least one of clarity index, direct-to-reverberant ratio (DRR), and reverberation time. A distributed-gradient-boosting algorithm with regression trees is used in combination with signal-to-reverberation ratio (SRR) and the at least one acoustic feature extracted from the RTF to estimate the distance between the first recording device and the second recording device.
Type: Application
Filed: March 15, 2022
Publication date: September 21, 2023
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Francesco NESPOLI, Patrick NAYLOR, Daniel BARREDA
-
Publication number: 20230267944
Abstract: A method of performing at least de-reverberation and noise-reduction of an input sound signal of at least one input channel includes: performing, using at least one filter element, at least one of de-reverberation and noise-reduction of the input sound signal to generate a clean output sound signal; and determining, by a non-intrusive measure (NIM) estimation element, at least one non-intrusive measure (NIM) from the sound signal, wherein the at least one NIM includes at least one of voice activity detection (VAD) posterior, reverberation time, clarity index, direct-to-reverberant ratio (DRR), and signal-to-noise ratio (SNR); the de-reverberation is achieved by applying at least one channel shortening (CS) filter component of the at least one filter element in conjunction with the at least one NIM; and the noise reduction is performed in combination with the de-reverberation by the channel shortening (CS) filter component.
Type: Application
Filed: February 18, 2022
Publication date: August 24, 2023
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Sharma DUSHYANT, James FOSBURGH, Patrick NAYLOR
-
Patent number: 11482241
Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
Type: Grant
Filed: March 27, 2017
Date of Patent: October 25, 2022
Assignee: Nuance Communications, Inc
Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
-
Publication number: 20220051772
Abstract: A computer-implemented method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least second audio encounter information. A computer-implemented method, computer program product, and computing system for compartmentalizing a virtual assistant is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a core functionality module.
Type: Application
Filed: March 23, 2021
Publication date: February 17, 2022
Inventors: Guido Remi Marcel Gallopyn, Dushyant Sharma, Uwe Helmut Jost, Donald E. Owen, Patrick Naylor, Amr Nour-Eldin, Daniel Paulino Almendro Barreda, Mehmet Mert Öz, Garret N. Erskine
-
Patent number: 10847171
Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.
Type: Grant
Filed: September 24, 2019
Date of Patent: November 24, 2020
Assignee: Nuance Communications, Inc.
Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
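Editorial illustration (not part of the patent record): the abstract above refers to calculating a time difference of arrival (TDOA) and choosing a microphone pair with robust statistics. The sketch below is one plausible reading, under stated assumptions: TDOA is taken as the lag that maximizes a plain time-domain cross-correlation, and the "best" pair is the one whose per-segment TDOA estimates have the smallest median absolute deviation. Both choices, and all function names, are assumptions made for illustration; the patent's actual criteria are not given in this listing.

```python
import random
import statistics
from itertools import combinations


def tdoa_samples(x, y, max_lag):
    """Lag in samples that maximizes the cross-correlation of x and y.

    A positive result means y is delayed relative to x.
    """
    n = min(len(x), len(y))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(x[i] * y[i + lag]
                    for i in range(max(0, -lag), min(n, n - lag)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def median_abs_deviation(values):
    """Robust spread measure: median of absolute deviations from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def best_microphone_pair(channels, segment_len, max_lag):
    """Pick the channel pair whose segment-wise TDOA estimates are most stable."""
    best_pair, best_spread = None, float("inf")
    for a, b in combinations(range(len(channels)), 2):
        lags = [
            tdoa_samples(channels[a][s:s + segment_len],
                         channels[b][s:s + segment_len], max_lag)
            for s in range(0, len(channels[a]) - segment_len + 1, segment_len)
        ]
        spread = median_abs_deviation(lags)
        if spread < best_spread:  # ties keep the earlier pair
            best_pair, best_spread = (a, b), spread
    return best_pair


# Toy data: channel 1 is channel 0 delayed by 3 samples; channel 2 is unrelated noise.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(1024)]
delayed = [0.0] * 3 + source[:-3]
unrelated = [rng.gauss(0.0, 1.0) for _ in range(1024)]
pair = best_microphone_pair([source, delayed, unrelated], segment_len=128, max_lag=8)
```

A frequency-domain estimator such as GCC-PHAT is a common alternative in reverberant rooms; the brute-force correlation is used here only to keep the sketch dependency-free.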
-
Publication number: 20200312349
Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
Type: Application
Filed: March 27, 2017
Publication date: October 1, 2020
Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
-
Publication number: 20200184986
Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.
Type: Application
Filed: September 24, 2019
Publication date: June 11, 2020
Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
-
Patent number: 10424317
Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.
Type: Grant
Filed: January 11, 2017
Date of Patent: September 24, 2019
Assignee: Nuance Communications, Inc.
Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
-
Publication number: 20190051378
Abstract: A method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a patient encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least a second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least a second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least a second audio encounter information.
Type: Application
Filed: August 8, 2018
Publication date: February 14, 2019
Inventors: Guido Remi Marcel Gallopyn, Dushyant Sharma, Uwe Helmut Jost, Donald E. Owen, Patrick Naylor, Amr Nour-Eldin, Daniel Paulino Almendro Barreda
-
Patent number: 9997172
Abstract: A system, method and computer program product are described for voice activity detection (VAD) within a digitally encoded bitstream. A parameter extraction module is configured to extract parameters from a sequence of coded frames from a digitally encoded bitstream containing speech. A VAD classifier is configured to operate with input of the digitally encoded bitstream to evaluate each coded frame based on bitstream coding parameter classification features to output a VAD decision indicative of whether or not speech is present in one or more of the coded frames.
Type: Grant
Filed: December 2, 2013
Date of Patent: June 12, 2018
Assignee: Nuance Communications, Inc.
Inventors: Daniel A. Barreda, Jose E. G. Lainez, Dushyant Sharma, Patrick Naylor
-
Patent number: 9922664
Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
Type: Grant
Filed: March 28, 2016
Date of Patent: March 20, 2018
Assignee: Nuance Communications, Inc.
Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
-
Publication number: 20180075860
Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.
Type: Application
Filed: January 11, 2017
Publication date: March 15, 2018
Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
-
Patent number: 9870784
Abstract: A system and method for speech quality detection is included. The method may include receiving, at a computing device, a first speech signal associated with a particular user. The method may include extracting one or more short-term features from the first speech signal wherein extracting short-term features includes extracting a time frame of between 10-50 ms. The method may also include determining one or more statistics of each of the one or more short-term features from the first speech signal. The method may further include classifying the one or more statistics as belonging to one of a set of quality classes.
Type: Grant
Filed: September 6, 2013
Date of Patent: January 16, 2018
Assignee: Nuance Communications, Inc.
Inventors: Dushyant Sharma, Patrick Naylor
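Editorial illustration (not part of the patent record): the abstract above outlines a pipeline of 10-50 ms framing, short-term feature extraction, per-feature statistics, and classification into quality classes. The sketch below walks through those stages with two stand-in features (frame energy and zero-crossing rate) and a placeholder threshold rule. The specific features, the threshold, and all function names are hypothetical; the listing does not say which features or classifier the patent uses.

```python
import math
import statistics


def frame_signal(samples, sample_rate, frame_ms=20):
    """Split samples into non-overlapping frames; frame_ms must lie in 10-50 ms."""
    if not 10 <= frame_ms <= 50:
        raise ValueError("frame_ms must be between 10 and 50 ms")
    frame_len = int(sample_rate * frame_ms / 1000)
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def short_term_features(frame):
    """Two illustrative short-term features: mean energy and zero-crossing rate."""
    energy = sum(v * v for v in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return {"energy": energy, "zcr": crossings / (len(frame) - 1)}


def feature_statistics(frames):
    """Mean and variance of each short-term feature across all frames."""
    per_frame = [short_term_features(f) for f in frames]
    return {
        name: {
            "mean": statistics.fmean(f[name] for f in per_frame),
            "variance": statistics.pvariance([f[name] for f in per_frame]),
        }
        for name in per_frame[0]
    }


def classify_quality(stats, energy_threshold=0.01):
    """Placeholder rule standing in for a trained classifier (hypothetical)."""
    return "good" if stats["energy"]["mean"] >= energy_threshold else "poor"


# Toy input: one second of a 440 Hz tone sampled at 8 kHz.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]
tone_frames = frame_signal(tone, sample_rate=8000, frame_ms=20)
tone_stats = feature_statistics(tone_frames)
tone_class = classify_quality(tone_stats)
```

In practice the statistics vector would be passed to a trained classifier rather than a fixed threshold; the rule above only marks where that step sits in the pipeline.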
-
Publication number: 20170278527
Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
Type: Application
Filed: March 28, 2016
Publication date: September 28, 2017
Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost