Patents Assigned to Nuance Communications, Inc.

System and method for dynamic facial features for speaker recognition

Patent number: 12080295

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic feature can include a movement pattern of head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.

Type: Grant

Filed: March 15, 2021

Date of Patent: September 3, 2024

Assignee: Nuance Communications, Inc.

Inventors: Ann K. Syrdal, Sumit Chopra, Patrick Haffner, Taniya Mishra, Ilija Zeljkovic, Eric Zavesky
INTERACTIVE VOICE RESPONSE SYSTEMS HAVING IMAGE ANALYSIS

Publication number: 20240046683

Abstract: An interactive voice response system is provided that includes an interactive voice recognition module, an image collection module, and a data extraction module. The image collection module communicates with the voice recognition module and the user device. The extraction module communicates with the image collection module. The voice recognition module collects speech data from a user of the user device and provides an indication to the image collection module when the speech data includes complex data. The image collection module, in response to the indication, communicates with the user device in a text message. The text message includes a link that, when activated, opens a camera on the user device. The image collection module, in response to receiving an image having the complex data from the camera, communicates the image to the extraction module, which extracts the complex data from the image as textual data.

Type: Application

Filed: August 2, 2022

Publication date: February 8, 2024

Applicant: Nuance Communications, Inc.

Inventors: Akash Chawla, Jenny DeGroot, Sergey A. Vovk
GESTURAL PROMPTING BASED ON CONVERSATIONAL ARTIFICIAL INTELLIGENCE

Publication number: 20240038225

Abstract: There is provided a method that includes obtaining data that describes (a) a situation, (b) a gesture for a response to the situation, (c) a prompt to accompany the response, and (d) a gestural annotation for the response, and utilizing a conversational machine learning technique to train a natural language understanding (NLU) model to address the situation, based on the data.

Type: Application

Filed: July 26, 2022

Publication date: February 1, 2024

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Abhishek ROHATGI, Eduardo OLVERA, Dinesh SAMTANI, Flaviu Gelu NEGREAN, Manar ALAZMA
SPEECH DIALOG SYSTEM AND RECIPROCITY ENFORCED NEURAL RELATIVE TRANSFER FUNCTION ESTIMATOR

Publication number: 20240005946

Abstract: There is provided a speech processing system that includes a neural encoder module. A processor that receives an audio signal; and the memory that contains instructions that control said processor to perform operations that process speech. In an implementation, a front end module can include a Neural Spatial RTF Estimator and a neural spatial and residual encoder (NSRE) configured to accept as inputs a spectral encoded reference channel stream to output Neural Transfer Functions (NTFs). In another implementation, a front end module encodes and outputs a Ch1 bitstream; computes a plurality of relative transfer functions (RTFs) for an N-Channel signal and outputs an N−1 RTFs or an RTF codebook ids and computes and processes an N−1 residual stream; and a back end module comprising a neural encoder module configured to accept the RTFs and output an encoded speech signal comprising an embedding that comprises features extracted from RTFs.

Type: Application

Filed: June 30, 2022

Publication date: January 4, 2024

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Dushyant SHARMA, Patrick NAYLOR, Daniel T. JONES
Automated clinical documentation system and method

Patent number: 11853691

Abstract: A method, computer program product, and computing system for synchronizing machine vision and audio is executed on a computing device and includes obtaining encounter information of a patient encounter, wherein the encounter information includes machine vision encounter information and audio encounter information. The machine vision encounter information and the audio encounter information are temporally-aligned to produce a temporally-aligned encounter recording.

Type: Grant

Filed: March 23, 2021

Date of Patent: December 26, 2023

Assignee: Nuance Communications, Inc.

Inventors: Donald E. Owen, Uwe Helmut Jost, Daniel Paulino Almendro Barreda, Dushyant Sharma
CROSS-ATTENTION BETWEEN SPARSE EXTERNAL FEATURES AND CONTEXTUAL WORD EMBEDDINGS TO IMPROVE TEXT CLASSIFICATION

Publication number: 20230401383

Abstract: There is provided a method that includes obtaining (a) a dense representation of external features, (b) a dense representation of text, and (c) a mask that associates the external features to tokens of the text, and employing a cross-attention process that utilizes the mask to perform an information fusion of the dense representation of the external features and the tokens of the text, thus yielding a joint representation of the external features and the tokens of the text. There is also provided a system that executes the method, and a storage device that includes instructions for controlling a processor to perform the method.

Type: Application

Filed: June 10, 2022

Publication date: December 14, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Jean-Michel Attendu, Alexandre Jules Dos Santos, François Duplessis Beaulieu
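
The masked cross-attention fusion described in the abstract of publication 20230401383 above can be illustrated with a minimal sketch. This is not the applicants' implementation: the function name `masked_cross_attention`, the use of scaled dot-product attention, and the choice to concatenate each token with its attended features are all assumptions made for illustration.

```python
import math

def masked_cross_attention(tokens, features, mask):
    """Fuse external features into token vectors with masked cross-attention.

    tokens:   list of T token vectors (queries), each of length d
    features: list of F external-feature vectors (keys and values), length d
    mask:     T x F matrix of 0/1; mask[t][f] == 1 lets token t attend to feature f
    Returns T joint vectors: each token vector concatenated with its attended features.
    (Hypothetical helper; concatenation is an assumed fusion choice.)
    """
    d = len(tokens[0])
    joint = []
    for t, query in enumerate(tokens):
        allowed = [f for f in range(len(features)) if mask[t][f]]
        attended = [0.0] * d  # stays zero when no feature is associated with the token
        if allowed:
            # Scaled dot-product scores over the permitted features only.
            scores = [sum(q * k for q, k in zip(query, features[f])) / math.sqrt(d)
                      for f in allowed]
            peak = max(scores)  # subtract the maximum for a numerically stable softmax
            weights = [math.exp(s - peak) for s in scores]
            total = sum(weights)
            for w, f in zip(weights, allowed):
                for i in range(d):
                    attended[i] += (w / total) * features[f][i]
        joint.append(list(query) + attended)
    return joint
```

A token whose mask row is all zeros keeps a zero vector in the feature half of its joint representation, so unassociated tokens pass through unchanged.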
System and method for data augmentation for multi-microphone signal processing

Patent number: 11837228

Abstract: A method, computer program product, and computing system for receiving a speech signal from each microphone of a plurality of microphones, thus defining a plurality of signals. One or more noise signals associated with microphone self-noise may be received. One or more self-noise-based augmentations may be performed on the plurality of signals based upon, at least in part, the one or more noise signals associated with microphone self-noise, thus defining one or more self-noise-based augmented signals.

Type: Grant

Filed: May 7, 2021

Date of Patent: December 5, 2023

Assignee: Nuance Communications, Inc.

Inventors: Dushyant Sharma, Patrick A. Naylor, Rong Gong, Stanislav Kruchinin, Ljubomir Milanovic
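
The self-noise-based augmentation described in the abstract of patent 11837228 above can be sketched as follows. This is an illustrative assumption, not the patented method: the names `rms` and `add_self_noise`, the random choice of noise segment, and the `snr_db` parameter are all hypothetical.

```python
import math
import random

def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def add_self_noise(signals, self_noise, snr_db, rng=random):
    """Return a copy of each channel with self-noise mixed in at snr_db.

    signals:    list of channels, one list of samples per microphone
    self_noise: recorded self-noise, at least as long as each channel
    snr_db:     target speech-to-noise ratio in decibels (assumed parameter)
    """
    augmented = []
    for channel in signals:
        # Pick a random noise segment, so channels may receive different noise.
        start = rng.randrange(len(self_noise) - len(channel) + 1)
        segment = self_noise[start:start + len(channel)]
        noise_level = rms(segment)
        if noise_level == 0.0:
            augmented.append(list(channel))  # silent noise: nothing to add
            continue
        # Scale the noise so rms(channel) / rms(scaled noise) matches snr_db.
        gain = rms(channel) / (noise_level * 10 ** (snr_db / 20))
        augmented.append([s + gain * n for s, n in zip(channel, segment)])
    return augmented
```

Sweeping `snr_db` over a range of values would yield several augmented copies of the same multi-microphone recording.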
AMBIENT LISTENING SYSTEM FOR SALES ASSISTANCE

Publication number: 20230385893

Abstract: An ambient listening system includes: an ambient device configured to listen to a conversation between a customer and a sales agent regarding at least one of a desired service and a desired product, and generate an audio stream of the conversation and a unique identifier for the customer; a voice biometrics service module configured to perform at least one of i) identification of the customer based on the audio stream, and ii) voiceprint enrollment of the customer's voice based on the audio stream and the unique identifier for the customer; a business logic module configured to generate at least one business logic output based on at least one of customer's intent and entity extracted from the audio stream; and an automation platform configured to automate, based on the business logic output, the sales agent's workflow related to at least one of customer record, the desired service and the desired product.

Type: Application

Filed: May 26, 2022

Publication date: November 30, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Abhishek ROHATGI, Eduardo OLVERA, Dinesh SAMTANI, Manpreet SINGH, Manar ALAZMA
Ambient cooperative intelligence system and method

Patent number: 11817095

Abstract: A method, computer program product, and computing system for monitoring a plurality of conversations within a monitored space to generate a conversation data set; processing the conversation data set using machine learning to: define a system-directed command for an ACI system, and associate one or more conversational contexts with the system-directed command; detecting the occurrence of a specific conversational context within the monitored space, wherein the specific conversational context is included in the one or more conversational contexts associated with the system-directed command; and executing, in whole or in part, functionality associated with the system-directed command in response to detecting the occurrence of the specific conversational context without requiring the utterance of the system-directed command and/or a wake-up word/phrase.

Type: Grant

Filed: February 3, 2022

Date of Patent: November 14, 2023

Assignee: Nuance Communications, Inc.

Inventors: Paul Joseph Vozila, Neal Snider
END-TO-END AUTOMATIC SPEECH RECOGNITION SYSTEM FOR BOTH CONVERSATIONAL AND COMMAND-AND-CONTROL SPEECH

Publication number: 20230360646

Abstract: A contextual end-to-end automatic speech recognition (ASR) system includes: an audio encoder configured to process input audio signal to produce as output encoded audio signal; a bias encoder configured to produce as output at least one bias entry corresponding to a word to bias for recognition by the ASR system; a transcription token probability prediction network configured to produce as output a probability of a selected transcription token, based at least in part on the output of the bias encoder and the output of the audio encoder; a first attention mechanism configured to receive the at least one bias entry and determine whether the at least one bias entry is suitable to be transcribed at a specific moment of an ongoing transcription; and a second attention mechanism configured to produce prefix penalties for restricting the first attention mechanism to only entries fitting a current transcription context.

Type: Application

Filed: May 5, 2022

Publication date: November 9, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Alejandro COUCHEIRO LIMERES, Junho PARK
CONTEXT SHARING BETWEEN PHYSICAL AND DIGITAL WORLDS

Publication number: 20230359832

Abstract: There is provided a method that includes (a) obtaining context information that is associated with a physical object, and is related to a context concerning the physical object, (b) searching a database for resultant information, based on the context information, (c) extracting and inferencing intents and entities, from the context information and the resultant information, (d) providing the intents and entities to a virtual assistant, and (e) facilitating a conversation between the virtual assistant and a user.

Type: Application

Filed: May 6, 2022

Publication date: November 9, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Eduardo OLVERA, Abhishek ROHATGI, Marco PADRÓN, Dinesh SAMTANI
System and method for data augmentation and speech processing in dynamic acoustic environments

Patent number: 11783826

Abstract: A method, computer program product, and computing system for receiving one or more inputs indicative of at least one of: a relative location of a speaker and a microphone array, and a relative orientation of the speaker and the microphone array. One or more reference signals may be received. A speech processing system may be trained using the one or more inputs and the one or more reference signals.

Type: Grant

Filed: February 18, 2021

Date of Patent: October 10, 2023

Assignee: Nuance Communications, Inc.

Inventors: Patrick A. Naylor, Dushyant Sharma, Uwe Helmut Jost, William F. Ganong, III
SECURE AUDIO PLAYBACK

Publication number: 20230315815

Abstract: A method includes: providing a workstation having a playback app configured for audio playback; providing a decryption module having a decryption functionality communicatively connected to the playback app; encrypting, by a server using an encryption key associated with the decryption module, audio data; and decrypting, using the decryption module, the encrypted audio data. The decryption module having the decryption functionality is provided as part of the playback app, as part of firmware of a headphone, or as part of a phone app. The method can additionally include: i) authenticating, using a voice biometric authentication module, a transcriber; ii) enabling decryption by the decryption module only upon input of a decode PIN by the transcriber; and iii) a) modifying the audio data to spatialize speech component and noise component of the audio data at different angles using head-related transfer function (HRTF) filtering, and b) playing back the audio data binaurally.

Type: Application

Filed: April 5, 2022

Publication date: October 5, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: William F. GANONG, III, Ljubomir MILANOVIC, Uwe JOST, Dushyant SHARMA, Patrick NAYLOR
Ambient cooperative intelligence system and method

Patent number: 11777947

Abstract: A method, computer program product, and computing system for initiating a session within an ACI platform; receiving an authentication request from a requester; and authenticating that the requester has the authority to access the ACI platform.

Type: Grant

Filed: March 16, 2022

Date of Patent: October 3, 2023

Assignee: Nuance Communications, Inc.

Inventors: Guido Remi Marcel Gallopyn, William F. Ganong, III
System and method for data augmentation and speech processing in dynamic acoustic environments

Patent number: 11769486

Abstract: A method, computer program product, and computing system for defining a model representative of a plurality of acoustic variations to a speech signal, thus defining a plurality of time-varying spectral modifications. The plurality of time-varying spectral modifications may be applied to a plurality of feature coefficients of a target domain of a reference signal, thus generating a plurality of time-varying spectrally-augmented feature coefficients of the reference signal.

Type: Grant

Filed: February 18, 2021

Date of Patent: September 26, 2023

Assignee: Nuance Communications, Inc.

Inventors: Patrick A. Naylor, Dushyant Sharma, Uwe Helmut Jost, William F. Ganong, III
ACOUSTIC-ENVIRONMENT MISMATCH AND PROXIMITY DETECTION WITH A NOVEL SET OF ACOUSTIC RELATIVE FEATURES AND ADAPTIVE FILTERING

Publication number: 20230296767

Abstract: A method of performing distance estimation between a first recording device at a first location and a second recording device at a second location includes: estimating acoustic relative transfer function (RTF) between the first recording device and the second recording device for a sound signal, e.g., by applying an improved proportionate normalized least mean square (IPNLMS) filter; and estimating the distance between the first recording device and the second recording device based on the RTF. The at least one acoustic feature extracted from the RTF estimated between the first recording device and the second recording device includes at least one of clarity index, direct-to-reverberant ratio (DRR), and reverberation time. A distributed-gradient-boosting algorithm with regression trees is used in combination with signal-to-reverberation ratio (SRR) and the at least one acoustic feature extracted from the RTF to estimate the distance between the first recording device and the second recording device.

Type: Application

Filed: March 15, 2022

Publication date: September 21, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Francesco NESPOLI, Patrick NAYLOR, Daniel BARREDA
Development system and method

Patent number: 11762638

Abstract: A method, computer program product, and computing system for defining a library of functional modules; enabling a user to select a plurality of functional modules from the library of functional modules; and enabling the user to visually arrange the plurality of functional modules to form a conversational application.

Type: Grant

Filed: December 28, 2022

Date of Patent: September 19, 2023

Assignee: Nuance Communications, Inc.

Inventors: David Ardman, Andrew Matkin, Nirvana Tikku, John B. Fisler, Matthias Haack, Christopher A. Starbird, Bryan A. Reif, Alfred Sterphone, III, Nikos Polis, Michael S. Gourlay, Robert A. Follett
Methods and apparatus for presenting alternative hypotheses for medical facts

Patent number: 11742088

Abstract: Techniques for presenting alternative hypotheses for medical facts may include identifying, using at least one statistical fact extraction model, a plurality of alternative hypotheses for a medical fact to be extracted from a portion of text documenting a patient encounter. At least two of the alternative hypotheses may be selected, and the selected hypotheses may be presented to a user documenting the patient encounter.

Type: Grant

Filed: December 23, 2020

Date of Patent: August 29, 2023

Assignee: Nuance Communications, Inc.

Inventor: Girija Yegnanarayanan
METHOD FOR NEURAL BEAMFORMING, CHANNEL SHORTENING AND NOISE REDUCTION

Publication number: 20230267944

Abstract: A method of performing at least de-reverberation and noise-reduction of an input sound signal of at least one input channel includes: performing, using at least one filter element, at least one of de-reverberation and noise-reduction of the input sound signal to generate a clean output sound signal; and determining, by a non-intrusive measure (NIM) estimation element, at least one non-intrusive measure (NIM) from the sound signal, wherein the at least one NIM includes at least one of voice activity detection (VAD) posterior, reverberation time, clarity index, direct-to-reverberant ratio (DRR), and signal-to-noise ratio (SNR); the de-reverberation is achieved by applying at least one channel shortening (CS) filter component of the at least one filter element in conjunction with the at least one NIM; and the noise reduction is performed in combination with the de-reverberation by the channel shortening (CS) filter component.

Type: Application

Filed: February 18, 2022

Publication date: August 24, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Dushyant SHARMA, James FOSBURGH, Patrick NAYLOR
FREQUENCY MAPPING IN THE VOICEPRINT DOMAIN

Publication number: 20230267936

Abstract: There is provided a method that includes (a) obtaining a first voice vector that was derived from a signal of a voice that was sampled at a first sampling frequency, (b) obtaining a second voice vector that was derived from a signal of a voice that was sampled at a second sampling frequency, (c) mapping the second voice vector into a mapped voice vector in accordance with a machine learning model, and (d) comparing the first voice vector to the mapped voice vector to yield a score that indicates a probability that the first voice vector and the second voice vector originated from a same person.

Type: Application

Filed: February 23, 2022

Publication date: August 24, 2023

Applicant: NUANCE COMMUNICATIONS, INC.

Inventors: Claudio VAIR, Haydar TALIB, Kevin Robert FARRELL, Daniele Ernesto COLIBRO
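
Steps (c) and (d) of the abstract of publication 20230267936 above, mapping one voice vector and then scoring it against the other, can be sketched as follows. The linear map standing in for the machine learning model, the cosine comparison, the logistic squashing, and the names `map_vector`, `cosine`, and `same_speaker_score` are all assumptions for illustration; the abstract does not specify any of them.

```python
import math

def map_vector(vector, weights, bias):
    """Apply a linear map, a hypothetical stand-in for the trained mapping model."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def same_speaker_score(first, second, weights, bias, scale=5.0):
    """Score in (0, 1) that both voice vectors come from the same person.

    `second` (derived at the second sampling frequency) is mapped first, then
    compared with `first`. `scale` is an assumed calibration constant.
    """
    mapped = map_vector(second, weights, bias)
    return 1.0 / (1.0 + math.exp(-scale * cosine(first, mapped)))
```

In practice the weights, bias, and calibration would be learned from paired recordings at both sampling frequencies; the values here are placeholders supplied by the caller.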
