Patents by Inventor Patrick Naylor

Patrick Naylor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250087231
    Abstract: There is provided a speech processing system that includes a neural encoder module. A processor that receives an audio signal; and the memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs an N−1 RTFs or an RTF codebook ids and computes and processes an N−1 residual stream; and a back end module comprising a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.
    Type: Application
    Filed: September 20, 2024
    Publication date: March 13, 2025
    Inventors: Dushyant SHARMA, Patrick NAYLOR, Daniel T. JONES
  • Patent number: 12165668
    Abstract: A method of performing at least de-reverberation and noise-reduction of an input sound signal of at least one input channel includes: performing, using at least one filter element, at least one of de-reverberation and noise-reduction of the input sound signal to generate a clean output sound signal; and determining, by a non-intrusive measure (NIM) estimation element, at least one non-intrusive measure (NIM) from the sound signal, wherein the at least one NIM includes at least one of voice activity detection (VAD) posterior, reverberation time, clarity index, direct-to-reverberant ratio (DRR), and signal-to-noise ratio (SNR); the de-reverberation is achieved by applying at least one channel shortening (CS) filter component of the at least one filter element in conjunction with the at least one NIM; and the noise reduction is performed in combination with the de-reverberation by the channel shortening (CS) filter component.
    Type: Grant
    Filed: February 18, 2022
    Date of Patent: December 10, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dushyant Sharma, James Fosburgh, Patrick Naylor
  • Patent number: 12142293
    Abstract: There is provided a speech processing system that includes a neural encoder module. A processor that receives an audio signal; and the memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs an N−1 RTFs or an RTF codebook ids and computes and processes an N−1 residual stream; and a back end module comprising a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.
    Type: Grant
    Filed: June 30, 2022
    Date of Patent: November 12, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Dushyant Sharma, Patrick Naylor, Daniel T. Jones
  • Publication number: 20240005946
    Abstract: There is provided a speech processing system that includes a neural encoder module. A processor that receives an audio signal; and the memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs an N−1 RTFs or an RTF codebook ids and computes and processes an N−1 residual stream; and a back end module comprising a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.
    Type: Application
    Filed: June 30, 2022
    Publication date: January 4, 2024
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Dushyant SHARMA, Patrick NAYLOR, Daniel T. JONES
  • Patent number: 11835625
    Abstract: A method of performing distance estimation between a first recording device at a first location and a second recording device at a second location includes: estimating acoustic relative transfer function (RTF) between the first recording device and the second recording device for a sound signal, e.g., by applying an improved proportionate normalized least mean square (IPNLMS) filter; and estimating the distance between the first recording device and the second recording device based on the RTF. The at least one acoustic feature extracted from the RTF estimated between the first recording device and the second recording device includes at least one of clarity index, direct-to-reverberant ratio (DRR), and reverberation time. A distributed-gradient-boosting algorithm with regression trees is used in combination with signal-to-reverberation ratio (SRR) and the at least one acoustic feature extracted from the RTF to estimate the distance between the first recording device and the second recording device.
    Type: Grant
    Filed: March 15, 2022
    Date of Patent: December 5, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Francesco Nespoli, Patrick Naylor, Daniel Barreda
  • Publication number: 20230315815
    Abstract: A method includes: providing a workstation having a playback app configured for audio playback; providing a decryption module having a decryption functionality communicatively connected to the playback app; encrypting, by a server using an encryption key associated with the decryption module, audio data; and decrypting, using the decryption module, the encrypted audio data. The decryption module having the decryption functionality is provided as part of the playback app, as part of firmware of a headphone, or as part of a phone app. The method can additionally include: i) authenticating, using a voice biometric authentication module, a transcriber; ii) enabling decryption by the decryption module only upon input of a decode PIN by the transcriber; and iii) a) modifying the audio data to spatialize speech component and noise component of the audio data at different angles using head-related transfer function (HRTF) filtering, and b) playing back the audio data binaurally.
    Type: Application
    Filed: April 5, 2022
    Publication date: October 5, 2023
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: William F. GANONG, III, Ljubomir MILANOVIC, Uwe JOST, Dushyant SHARMA, Patrick NAYLOR
  • Publication number: 20230296767
    Abstract: A method of performing distance estimation between a first recording device at a first location and a second recording device at a second location includes: estimating acoustic relative transfer function (RTF) between the first recording device and the second recording device for a sound signal, e.g., by applying an improved proportionate normalized least mean square (IPNLMS) filter; and estimating the distance between the first recording device and the second recording device based on the RTF. The at least one acoustic feature extracted from the RTF estimated between the first recording device and the second recording device includes at least one of clarity index, direct-to-reverberant ratio (DRR), and reverberation time. A distributed-gradient-boosting algorithm with regression trees is used in combination with signal-to-reverberation ratio (SRR) and the at least one acoustic feature extracted from the RTF to estimate the distance between the first recording device and the second recording device.
    Type: Application
    Filed: March 15, 2022
    Publication date: September 21, 2023
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Francesco NESPOLI, Patrick NAYLOR, Daniel BARREDA
  • Publication number: 20230267944
    Abstract: A method of performing at least de-reverberation and noise-reduction of an input sound signal of at least one input channel includes: performing, using at least one filter element, at least one of de-reverberation and noise-reduction of the input sound signal to generate a clean output sound signal; and determining, by a non-intrusive measure (NIM) estimation element, at least one non-intrusive measure (NIM) from the sound signal, wherein the at least one NIM includes at least one of voice activity detection (VAD) posterior, reverberation time, clarity index, direct-to-reverberant ratio (DRR), and signal-to-noise ratio (SNR); the de-reverberation is achieved by applying at least one channel shortening (CS) filter component of the at least one filter element in conjunction with the at least one NIM; and the noise reduction is performed in combination with the de-reverberation by the channel shortening (CS) filter component.
    Type: Application
    Filed: February 18, 2022
    Publication date: August 24, 2023
    Applicant: NUANCE COMMUNICATIONS, INC.
    Inventors: Dushyant SHARMA, James FOSBURGH, Patrick NAYLOR
  • Patent number: 11482241
    Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
    Type: Grant
    Filed: March 27, 2017
    Date of Patent: October 25, 2022
    Assignee: Nuance Communications, Inc.
    Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
  • Publication number: 20220051772
    Abstract: A computer-implemented method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least second audio encounter information. A computer-implemented method, computer program product, and computing system for compartmentalizing a virtual assistant is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a core functionality module.
    Type: Application
    Filed: March 23, 2021
    Publication date: February 17, 2022
    Inventors: Guido Remi Marcel Gallopyn, Dushyant Sharma, Uwe Helmut Jost, Donald E. Owen, Patrick Naylor, Amr Nour-Eldin, Daniel Paulino Almendro Barreda, Mehmet Mert Öz, Garret N. Erskine
  • Patent number: 10847171
    Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.
    Type: Grant
    Filed: September 24, 2019
    Date of Patent: November 24, 2020
    Assignee: Nuance Communications, Inc.
    Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
  • Publication number: 20200312349
    Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
    Type: Application
    Filed: March 27, 2017
    Publication date: October 1, 2020
    Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
  • Publication number: 20200184986
    Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.
    Type: Application
    Filed: September 24, 2019
    Publication date: June 11, 2020
    Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
  • Patent number: 10424317
    Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.
    Type: Grant
    Filed: January 11, 2017
    Date of Patent: September 24, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
  • Publication number: 20190051378
    Abstract: A method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a patient encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least a second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least a second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least a second audio encounter information.
    Type: Application
    Filed: August 8, 2018
    Publication date: February 14, 2019
    Inventors: Guido Remi Marcel Gallopyn, Dushyant Sharma, Uwe Helmut Jost, Donald E. Owen, Patrick Naylor, Amr Nour-Eldin, Daniel Paulino Almendro Barreda
  • Patent number: 9997172
    Abstract: A system, method and computer program product are described for voice activity detection (VAD) within a digitally encoded bitstream. A parameter extraction module is configured to extract parameters from a sequence of coded frames from a digitally encoded bitstream containing speech. A VAD classifier is configured to operate with input of the digitally encoded bitstream to evaluate each coded frame based on bitstream coding parameter classification features to output a VAD decision indicative of whether or not speech is present in one or more of the coded frames.
    Type: Grant
    Filed: December 2, 2013
    Date of Patent: June 12, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel A. Barreda, Jose E. G. Lainez, Dushyant Sharma, Patrick Naylor
  • Patent number: 9922664
    Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
    Type: Grant
    Filed: March 28, 2016
    Date of Patent: March 20, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
  • Publication number: 20180075860
    Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.
    Type: Application
    Filed: January 11, 2017
    Publication date: March 15, 2018
    Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
  • Patent number: 9870784
    Abstract: A system and method for speech quality detection is included. The method may include receiving, at a computing device, a first speech signal associated with a particular user. The method may include extracting one or more short-term features from the first speech signal wherein extracting short-term features includes extracting a time frame of between 10-50 ms. The method may also include determining one or more statistics of each of the one or more short-term features from the first speech signal. The method may further include classifying the one or more statistics as belonging to one of a set of quality classes.
    Type: Grant
    Filed: September 6, 2013
    Date of Patent: January 16, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Dushyant Sharma, Patrick Naylor
  • Publication number: 20170278527
    Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
    Type: Application
    Filed: March 28, 2016
    Publication date: September 28, 2017
    Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost