Patents by Inventor Patrick Naylor

Patrick Naylor has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Automated clinical documentation system and method for synchronizing machine vision and audio encounter information

Patent number: 12658295

Abstract: A computer-implemented method, computer program product, and computing system for synchronizing machine vision and audio is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes machine vision encounter information and audio encounter information. The machine vision encounter information and the audio encounter information are temporally-aligned to produce a temporally-aligned encounter recording.

Type: Grant

Filed: March 23, 2021

Date of Patent: June 16, 2026

Assignee: Microsoft Technology Licensing, LLC.

Inventors: Guido Remi Marcel Gallopyn, Dushyant Sharma, Uwe Helmut Jost, Donald E. Owen, Patrick Naylor, Amr Nour-Eldin, Daniel Paulino Almendro Barreda, Mehmet Mert Öz, Garret N. Erskine
SPEECH DIALOG SYSTEM AND RECIPROCITY ENFORCED NEURAL RELATIVE TRANSFER FUNCTION ESTIMATOR

Publication number: 20250087231

Abstract: There is provided a speech processing system that includes a neural encoder module, a processor that receives an audio signal, and a memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured to accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs N-1 RTFs or RTF codebook IDs; and computes and processes an N-1 residual stream; and a back end module comprises a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.

Type: Application

Filed: September 20, 2024

Publication date: March 13, 2025

Inventors: Dushyant SHARMA, Patrick NAYLOR, Daniel T. JONES
Method for neural beamforming, channel shortening and noise reduction

Patent number: 12165668

Abstract: A method of performing at least de-reverberation and noise-reduction of an input sound signal of at least one input channel includes: performing, using at least one filter element, at least one of de-reverberation and noise-reduction of the input sound signal to generate a clean output sound signal; and determining, by a non-intrusive measure (NIM) estimation element, at least one non-intrusive measure (NIM) from the sound signal, wherein the at least one NIM includes at least one of voice activity detection (VAD) posterior, reverberation time, clarity index, direct-to-reverberant ratio (DRR), and signal-to-noise ratio (SNR); the de-reverberation is achieved by applying at least one channel shortening (CS) filter component of the at least one filter element in conjunction with the at least one NIM; and the noise reduction is performed in combination with the de-reverberation by the channel shortening (CS) filter component.

Type: Grant

Filed: February 18, 2022

Date of Patent: December 10, 2024

Assignee: Microsoft Technology Licensing, LLC.

Inventors: Dushyant Sharma, James Fosburgh, Patrick Naylor
Speech dialog system and reciprocity enforced neural relative transfer function estimator

Patent number: 12142293

Abstract: There is provided a speech processing system that includes a neural encoder module, a processor that receives an audio signal, and a memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured to accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs N-1 RTFs or RTF codebook IDs; and computes and processes an N-1 residual stream; and a back end module comprises a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.

Type: Grant

Filed: June 30, 2022

Date of Patent: November 12, 2024

Assignee: Microsoft Technology Licensing, LLC.

Inventors: Dushyant Sharma, Patrick Naylor, Daniel T. Jones
SPEECH DIALOG SYSTEM AND RECIPROCITY ENFORCED NEURAL RELATIVE TRANSFER FUNCTION ESTIMATOR

Publication number: 20240005946

Abstract: There is provided a speech processing system that includes a neural encoder module, a processor that receives an audio signal, and a memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured to accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs N-1 RTFs or RTF codebook IDs; and computes and processes an N-1 residual stream; and a back end module comprises a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.

Type: Application

Filed: June 30, 2022

Publication date: January 4, 2024

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Dushyant SHARMA, Patrick NAYLOR, Daniel T. JONES
Acoustic-environment mismatch and proximity detection with a novel set of acoustic relative features and adaptive filtering

Patent number: 11835625

Abstract: A method of performing distance estimation between a first recording device at a first location and a second recording device at a second location includes: estimating acoustic relative transfer function (RTF) between the first recording device and the second recording device for a sound signal, e.g., by applying an improved proportionate normalized least mean square (IPNLMS) filter; and estimating the distance between the first recording device and the second recording device based on the RTF. The at least one acoustic feature extracted from the RTF estimated between the first recording device and the second recording device includes at least one of clarity index, direct-to-reverberant ratio (DRR), and reverberation time. A distributed-gradient-boosting algorithm with regression trees is used in combination with signal-to-reverberation ratio (SRR) and the at least one acoustic feature extracted from the RTF to estimate the distance between the first recording device and the second recording device.

Type: Grant

Filed: March 15, 2022

Date of Patent: December 5, 2023

Assignee: Microsoft Technology Licensing, LLC.

Inventors: Francesco Nespoli, Patrick Naylor, Daniel Barreda
SECURE AUDIO PLAYBACK

Publication number: 20230315815

Abstract: A method includes: providing a workstation having a playback app configured for audio playback; providing a decryption module having a decryption functionality communicatively connected to the playback app; encrypting, by a server using an encryption key associated with the decryption module, audio data; and decrypting, using the decryption module, the encrypted audio data. The decryption module having the decryption functionality is provided as part of the playback app, as part of firmware of a headphone, or as part of a phone app. The method can additionally include: i) authenticating, using a voice biometric authentication module, a transcriber; ii) enabling decryption by the decryption module only upon input of a decode PIN by the transcriber; and iii) a) modifying the audio data to spatialize speech component and noise component of the audio data at different angles using head-related transfer function (HRTF) filtering, and b) playing back the audio data binaurally.

Type: Application

Filed: April 5, 2022

Publication date: October 5, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: William F. GANONG, III, Ljubomir MILANOVIC, Uwe JOST, Dushyant SHARMA, Patrick NAYLOR
ACOUSTIC-ENVIRONMENT MISMATCH AND PROXIMITY DETECTION WITH A NOVEL SET OF ACOUSTIC RELATIVE FEATURES AND ADAPTIVE FILTERING

Publication number: 20230296767

Abstract: A method of performing distance estimation between a first recording device at a first location and a second recording device at a second location includes: estimating acoustic relative transfer function (RTF) between the first recording device and the second recording device for a sound signal, e.g., by applying an improved proportionate normalized least mean square (IPNLMS) filter; and estimating the distance between the first recording device and the second recording device based on the RTF. The at least one acoustic feature extracted from the RTF estimated between the first recording device and the second recording device includes at least one of clarity index, direct-to-reverberant ratio (DRR), and reverberation time. A distributed-gradient-boosting algorithm with regression trees is used in combination with signal-to-reverberation ratio (SRR) and the at least one acoustic feature extracted from the RTF to estimate the distance between the first recording device and the second recording device.

Type: Application

Filed: March 15, 2022

Publication date: September 21, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Francesco NESPOLI, Patrick NAYLOR, Daniel BARREDA
METHOD FOR NEURAL BEAMFORMING, CHANNEL SHORTENING AND NOISE REDUCTION

Publication number: 20230267944

Abstract: A method of performing at least de-reverberation and noise-reduction of an input sound signal of at least one input channel includes: performing, using at least one filter element, at least one of de-reverberation and noise-reduction of the input sound signal to generate a clean output sound signal; and determining, by a non-intrusive measure (NIM) estimation element, at least one non-intrusive measure (NIM) from the sound signal, wherein the at least one NIM includes at least one of voice activity detection (VAD) posterior, reverberation time, clarity index, direct-to-reverberant ratio (DRR), and signal-to-noise ratio (SNR); the de-reverberation is achieved by applying at least one channel shortening (CS) filter component of the at least one filter element in conjunction with the at least one NIM; and the noise reduction is performed in combination with the de-reverberation by the channel shortening (CS) filter component.

Type: Application

Filed: February 18, 2022

Publication date: August 24, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Dushyant SHARMA, James FOSBURGH, Patrick NAYLOR
Characterizing, selecting and adapting audio and acoustic training data for automatic speech recognition systems

Patent number: 11482241

Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.

Type: Grant

Filed: March 27, 2017

Date of Patent: October 25, 2022

Assignee: Nuance Communications, Inc.

Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
Automated Clinical Documentation System and Method

Publication number: 20220051772

Abstract: A computer-implemented method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a user encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least second audio encounter information. A computer-implemented method, computer program product, and computing system for compartmentalizing a virtual assistant is executed on a computing device and includes obtaining encounter information via a compartmentalized virtual assistant during a user encounter, wherein the compartmentalized virtual assistant includes a core functionality module.

Type: Application

Filed: March 23, 2021

Publication date: February 17, 2022

Inventors: Guido Remi Marcel Gallopyn, Dushyant Sharma, Uwe Helmut Jost, Donald E. Owen, Patrick Naylor, Amr Nour-Eldin, Daniel Paulino Almendro Barreda, Mehmet Mert Öz, Garret N. Erskine
Method for microphone selection and multi-talker segmentation with ambient automated speech recognition (ASR)

Patent number: 10847171

Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.

Type: Grant

Filed: September 24, 2019

Date of Patent: November 24, 2020

Assignee: Nuance Communications, Inc.

Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
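
The abstract above mentions calculating a time difference of arrival (TDOA) and using robust statistics to choose a microphone pair, but it does not say how either step is done. The following is only an illustrative sketch under stated assumptions: the TDOA is estimated with plain time-domain cross-correlation, the robust statistic is the median absolute deviation (MAD), and the pair with the most stable per-segment TDOA is preferred. The function names `tdoa_samples`, `mad`, and `pick_best_pair` are hypothetical and do not come from the patent.

```python
from statistics import median


def tdoa_samples(x, y, max_lag):
    """Return the lag (in samples) at which y best matches x.

    A positive result means y is a delayed copy of x. Plain time-domain
    cross-correlation is an assumption made for illustration only.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            x[n] * y[n + lag]
            for n in range(len(x))
            if 0 <= n + lag < len(y)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def mad(values):
    """Median absolute deviation, a robust measure of spread."""
    values = list(values)
    center = median(values)
    return median([abs(v - center) for v in values])


def pick_best_pair(tdoas_by_pair):
    """Hypothetical rule: prefer the pair whose per-segment TDOAs vary least."""
    return min(tdoas_by_pair, key=lambda pair: mad(tdoas_by_pair[pair]))


# Toy example: mic2 hears the same impulse two samples after mic1.
mic1 = [0, 0, 1, 0, 0, 0, 0, 0]
mic2 = [0, 0, 0, 0, 1, 0, 0, 0]
delay = tdoa_samples(mic1, mic2, max_lag=3)

# Per-segment TDOA estimates for two candidate pairs (made-up numbers).
candidates = {
    ("mic1", "mic2"): [2, 2, 2, 3],
    ("mic1", "mic3"): [2, -5, 7, 0],
}
best = pick_best_pair(candidates)
```

A practical system would more likely use a frequency-domain estimator and a richer selection criterion; the sketch only shows how a robust spread measure can rank candidate pairs.
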
Characterizing, Selecting And Adapting Audio And Acoustic Training Data For Automatic Speech Recognition Systems

Publication number: 20200312349

Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.

Type: Application

Filed: March 27, 2017

Publication date: October 1, 2020

Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
METHOD FOR MICROPHONE SELECTION AND MULTI-TALKER SEGMENTATION WITH AMBIENT AUTOMATED SPEECH RECOGNITION (ASR)

Publication number: 20200184986

Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.

Type: Application

Filed: September 24, 2019

Publication date: June 11, 2020

Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
Method for microphone selection and multi-talker segmentation with ambient automated speech recognition (ASR)

Patent number: 10424317

Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.

Type: Grant

Filed: January 11, 2017

Date of Patent: September 24, 2019

Assignee: Nuance Communications, Inc.

Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
Automated Clinical Documentation System and Method

Publication number: 20190051378

Abstract: A method, computer program product, and computing system for source separation is executed on a computing device and includes obtaining encounter information of a patient encounter, wherein the encounter information includes first audio encounter information obtained from a first encounter participant and at least a second audio encounter information obtained from at least a second encounter participant. The first audio encounter information and the at least a second audio encounter information are processed to eliminate audio interference between the first audio encounter information and the at least a second audio encounter information.

Type: Application

Filed: August 8, 2018

Publication date: February 14, 2019

Inventors: Guido Remi Marcel Gallopyn, Dushyant Sharma, Uwe Helmut Jost, Donald E. Owen, Patrick Naylor, Amr Nour-Eldin, Daniel Paulino Almendro Barreda
Voice activity detection (VAD) for a coded speech bitstream without decoding

Patent number: 9997172

Abstract: A system, method and computer program product are described for voice activity detection (VAD) within a digitally encoded bitstream. A parameter extraction module is configured to extract parameters from a sequence of coded frames from a digitally encoded bitstream containing speech. A VAD classifier is configured to operate with input of the digitally encoded bitstream to evaluate each coded frame based on bitstream coding parameter classification features to output a VAD decision indicative of whether or not speech is present in one or more of the coded frames.

Type: Grant

Filed: December 2, 2013

Date of Patent: June 12, 2018

Assignee: Nuance Communications, Inc.

Inventors: Daniel A. Barreda, Jose E. G. Lainez, Dushyant Sharma, Patrick Naylor
Characterizing, selecting and adapting audio and acoustic training data for automatic speech recognition systems

Patent number: 9922664

Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.

Type: Grant

Filed: March 28, 2016

Date of Patent: March 20, 2018

Assignee: Nuance Communications, Inc.

Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
Method for Microphone Selection and Multi-Talker Segmentation with Ambient Automated Speech Recognition (ASR)

Publication number: 20180075860

Abstract: Disclosed methods and systems are directed to determining a best microphone pair and segmenting sound signals. The methods and systems may include receiving a collection of sound signals comprising speech from one or more audio sources (e.g., meeting participants) and/or background noise. The methods and systems may include calculating a TDOA and determining, based on the TDOA and via robust statistics, the best pair of microphones. The methods and systems may also include segmenting sound signals from multiple sources.

Type: Application

Filed: January 11, 2017

Publication date: March 15, 2018

Inventors: Pablo Peso Parada, Dushyant Sharma, Patrick Naylor
Method for voicemail quality detection

Patent number: 9870784

Abstract: A system and method for speech quality detection is included. The method may include receiving, at a computing device, a first speech signal associated with a particular user. The method may include extracting one or more short-term features from the first speech signal wherein extracting short-term features includes extracting a time frame of between 10-50 ms. The method may also include determining one or more statistics of each of the one or more short-term features from the first speech signal. The method may further include classifying the one or more statistics as belonging to one of a set of quality classes.

Type: Grant

Filed: September 6, 2013

Date of Patent: January 16, 2018

Assignee: Nuance Communications, Inc.

Inventors: Dushyant Sharma, Patrick Naylor
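
The steps named in the abstract above (short-term features over 10-50 ms frames, statistics of those features, then classification into quality classes) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the patented method: frame energy stands in for the unspecified short-term features, mean and standard deviation stand in for the statistics, and a fixed threshold stands in for the classifier. All function names and the threshold value are hypothetical.

```python
from statistics import mean, pstdev


def frame_signal(samples, sample_rate, frame_ms=20):
    """Split samples into non-overlapping frames of frame_ms milliseconds."""
    if not 10 <= frame_ms <= 50:
        raise ValueError("frame_ms is expected to lie between 10 and 50 ms")
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len < 1:
        raise ValueError("sample_rate is too low for the requested frame size")
    return [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def short_term_energy(frame):
    """Mean squared amplitude of one frame (an illustrative feature)."""
    return sum(x * x for x in frame) / len(frame)


def feature_statistics(samples, sample_rate, frame_ms=20):
    """Summarize the per-frame feature with its mean and standard deviation."""
    frames = frame_signal(samples, sample_rate, frame_ms)
    energies = [short_term_energy(f) for f in frames]
    return {"mean": mean(energies), "std": pstdev(energies)}


def classify_quality(stats, threshold=0.01):
    """Hypothetical two-class rule; a real system would use a trained model."""
    return "acceptable" if stats["mean"] >= threshold else "poor"


# One second of a constant-amplitude signal at 8 kHz, and one second of silence.
loud = [0.5] * 8000
silent = [0.0] * 8000
loud_class = classify_quality(feature_statistics(loud, 8000))
silent_class = classify_quality(feature_statistics(silent, 8000))
```

The split into framing, feature, statistics, and classification functions mirrors the order of the steps in the abstract, so each stage could be swapped for a more realistic one independently.
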
