Patents Assigned to Nuance Communications

System and method for addressing acoustic signal reverberation

Patent number: 10276181

Abstract: A method, computer program product, and computer system for addressing acoustic signal reverberation is provided. Embodiments may include receiving, at one or more microphones, a first audio signal and a reverberation audio signal. Embodiments may further include processing at least one of the first audio signal and the reverberation audio signal. Embodiments may also include limiting a model based reverberation equalizer using a temporal constraint for direct sound distortions, the model based reverberation equalizer configured to generate one or more outputs, based upon, at least in part, at least one of the first audio signal and the reverberation audio signal.

Type: Grant

Filed: September 5, 2017

Date of Patent: April 30, 2019

Assignee: Nuance Communications, Inc.

Inventors: Tobias Wolff, Lars Tebelmann
Systems and methods for providing a voice agent user interface

Patent number: 10276157

Abstract: Some embodiments provide techniques performed by at least one voice agent. The techniques include receiving voice input; identifying at least one application program as relating to the received voice input; and displaying at least one selectable visual representation that, when selected, causes focus of the computing device to be directed to the at least one application program identified as relating to the received voice input.

Type: Grant

Filed: October 1, 2012

Date of Patent: April 30, 2019

Assignee: Nuance Communications, Inc.

Inventors: Timothy Lynch, Sean P. Brown, Paweena Attayadmawittaya, Tiago Goncalves Cabaco, Victor Shine Chen
Method and apparatus for detecting splicing attacks on a speaker verification system

Patent number: 10276166

Abstract: A method of detecting an occurrence of splicing in a speech signal includes comparing one or more discontinuities in the test speech signal to one or more reference speech signals corresponding to the test speech signal. The method may further include calculating a frame-based spectral-like representation ST of the speech signal, and calculating a frame-based spectral-like representation SE of a reference speech signal corresponding to the speech signal. The method further includes aligning ST and SE in time and frequency, calculating a distance function associated with aligned ST and SE, and evaluating the distance function to determine a score. The method also includes comparing the score to a threshold to detect if splicing occurs in the speech signal.

Type: Grant

Filed: July 22, 2014

Date of Patent: April 30, 2019

Assignee: Nuance Communications, Inc.

Inventors: Zvi Kons, Ron Hoory, Hagai Aronowitz
VIRTUAL AGENT COMMUNICATION FOR ELECTRONIC DEVICE

Publication number: 20190109880

Abstract: Methods and apparatus for communicating between virtual agents associated with users of electronic devices connected via at least one network. A first user may instruct an associated first virtual agent to invoke a communication session with a second virtual agent associated with a second user. To invoke the communication session, the first virtual agent may send an outgoing communication to the second virtual agent and the outgoing communication may instruct the second virtual agent to perform at least one action on behalf of the first user. Virtual agents associated with different users may alternatively communicate with each other in the absence of user interaction to perform a collaborative action.

Type: Application

Filed: December 7, 2018

Publication date: April 11, 2019

Applicant: Nuance Communications, Inc.

Inventors: Michael Stuart Phillips, John Nguyen, Thomas Jay Leonard, David Grannan
SYSTEMS AND METHODS FOR MULTI-STYLE SPEECH SYNTHESIS

Publication number: 20190108830

Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a desired speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identifying comprising identifying a first speech segment recorded and/or synthesized in a first speaking style that is different from the desired speaking style based at least in part on a measure of similarity between the desired speaking style and the first speaking style; synthesizing speech from the text in the desired speaking style at least in part by using the first speech segment; and outputting the synthesized speech.

Type: Application

Filed: June 4, 2018

Publication date: April 11, 2019

Applicant: Nuance Communications, Inc.

Inventor: Vincent Pollet
System and method for speech enhancement using a coherent to diffuse sound ratio

Patent number: 10242690

Abstract: Embodiments of the present disclosure may include a system and method for speech enhancement using the coherent to diffuse sound ratio. Embodiments may include receiving an audio signal at one or more microphones and controlling one or more adaptive filters of a beamformer using a coherent to diffuse ratio (“CDR”).

Type: Grant

Filed: December 12, 2014

Date of Patent: March 26, 2019

Assignee: Nuance Communications, Inc.

Inventors: Tobias Wolff, Timo Matheja, Markus Buck
System and method for audio conferencing

Patent number: 10237412

Abstract: The present disclosure is directed towards an audio conferencing method. Some embodiments may include receiving, at a first mixing device, a first audio stream from one or more participant conferencing devices. The method may further include generating a top-N voice stream at the first mixing device, wherein the top-N voice stream corresponds with at least one top-N talker and wherein the identification of the at least one top-N talker is based upon, at least in part, an activity ranking. The method may also include receiving the top-N voice stream at a centralized mixing device and generating at least one mixed audio stream at the centralized mixing device.

Type: Grant

Filed: April 18, 2014

Date of Patent: March 19, 2019

Assignee: Nuance Communications, Inc.

Inventors: Sridhar Pilli, Mahesh Godavarti
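
The top-N selection step described in the abstract above can be illustrated with a minimal sketch. It assumes each participant already has a numeric activity score; the function name `top_n_talkers`, the score values, and the tie-breaking rule are hypothetical and are not taken from the patent.

```python
from typing import Dict, List


def top_n_talkers(activity: Dict[str, float], n: int) -> List[str]:
    """Return the ids of the n most active participants, most active first.

    The activity scores are assumed inputs; how they are computed is outside
    the scope of this sketch. Ties are broken alphabetically by id (an
    arbitrary choice made only to keep the result deterministic).
    """
    ranked = sorted(activity.items(), key=lambda item: (-item[1], item[0]))
    return [participant for participant, _ in ranked[:n]]


# Hypothetical activity ranking at a first mixing device.
scores = {"alice": 0.82, "bob": 0.10, "carol": 0.55, "dave": 0.31}
selected = top_n_talkers(scores, n=2)
```

Only the streams of the selected participants would then be forwarded to the centralized mixing device, which is what keeps the mixing load bounded as the number of participants grows.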
Ontology and annotation driven grammar inference

Patent number: 10235359

Abstract: Inferring a natural language grammar is based on providing natural language understanding (NLU) data with concept annotations according to an application ontology characterizing a relationship structure between application-related concepts for a given NLU application. An application grammar is then inferred from the concept annotations and the application ontology.

Type: Grant

Filed: July 15, 2013

Date of Patent: March 19, 2019

Assignee: Nuance Communications, Inc.

Inventors: Réal Tremblay, Jerome Tremblay, Stephen Douglas Peters, Serge Robillard
Initializing a workspace for building a natural language understanding system

Patent number: 10229106

Abstract: Designing a natural language understanding (NLU) model for an application from scratch can be difficult for non-experts. A system can simplify the design process by providing an interface allowing a designer to input example usage sentences and build an NLU model based on presented matches to those example sentences. In one embodiment, a method for initializing a workspace for building an NLU system includes parsing a sample sentence to select at least one candidate stub grammar from among multiple candidate stub grammars. The method can include presenting, to a user, respective representations of the candidate stub grammars selected by the parsing of the sample sentence. The method can include enabling the user to choose one of the respective representations of the candidate stub grammars. The method can include adding to the workspace a stub grammar corresponding to the representation of the candidate stub grammar chosen by the user.

Type: Grant

Filed: July 26, 2013

Date of Patent: March 12, 2019

Assignee: Nuance Communications, Inc.

Inventor: Jeffrey N. Marcus
Server-side ASR adaptation to speaker, device and noise condition via non-ASR audio transmission

Patent number: 10229701

Abstract: A mobile device is adapted for automatic speech recognition (ASR). A user interface for interaction with a user includes an input microphone for obtaining speech inputs from the user for automatic speech recognition, and an output interface for system output to the user based on ASR results that correspond to the speech input. A local controller obtains a sample of non-ASR audio from the input microphone for ASR-adaptation to channel-specific ASR characteristics, and then provides a representation of the non-ASR audio to a remote ASR server for server-side adaptation to the channel-specific ASR characteristics, and then provides a representation of an unknown ASR speech input from the input microphone to the remote ASR server for determining ASR results corresponding to the unknown ASR speech input, and then provides the system output to the output interface.

Type: Grant

Filed: June 12, 2017

Date of Patent: March 12, 2019

Assignee: Nuance Communications, Inc.

Inventors: Daniel Willett, Jean-Guy E. Dahan, William F. Ganong, III, Jianxiong Wu
TECHNIQUES FOR SPATIALLY SELECTIVE WAKE-UP WORD RECOGNITION AND RELATED SYSTEMS AND METHODS

Publication number: 20190073999

Abstract: According to some aspects, a system for detecting a designated wake-up word is provided, the system comprising a plurality of microphones to detect acoustic information from a physical space having a plurality of acoustic zones, at least one processor configured to receive a first acoustic signal representing the acoustic information received by the plurality of microphones, process the first acoustic signal to identify content of the first acoustic signal originating from each of the plurality of acoustic zones, provide a plurality of second acoustic signals, each of the plurality of second acoustic signals substantially corresponding to the content identified as originating from a respective one of the plurality of acoustic zones, and performing automatic speech recognition on each of the plurality of second acoustic signals to determine whether the designated wake-up word was spoken.

Type: Application

Filed: February 10, 2016

Publication date: March 7, 2019

Applicant: Nuance Communications, Inc.

Inventors: Julien Prémont, Tim Haulick, Emanuele Dalmasso, Munir Nikolai Alexander Georges, Andreas Kellner, Gaetan Martens, Oliver van Porten, Holger Quast, Martin Rößler, Tobias Wolff, Markus Buck
Task assistant utilizing context for improved interaction

Patent number: 10223411

Abstract: A method of providing a task assistant is described. The task assistant is designed to receive input from a user through multimodal input including a plurality of: speech input, typing input, and touch input; determine the meaning of the input; and determine whether there is a context based on prior interactions with the user. The method further generates an interpreted input based on a combination of the input and the context, and provides a formatted query to an application. The method further receives data from the application in response to the formatted query, and provides a response to the user through multimodal output including a plurality of: speech output, text output, non-speech audio output, haptic output, and visual non-text output. The method further updates the context based on the interpreted input.

Type: Grant

Filed: March 6, 2013

Date of Patent: March 5, 2019

Assignee: Nuance Communications, Inc.

Inventors: David Andrew Mauro, Henri Bouvier, Elizabeth Ann Dykstra-Erickson, Simona Gandrabur, Susan Dawnstarr Daniel, Aimee Piercy, Robert Douglas Sharp
Methods and apparatus for generating clinical reports

Patent number: 10199124

Abstract: Techniques for documenting a clinical procedure involve transcribing audio data comprising audio of one or more clinical personnel speaking while performing the clinical procedure. Examples of applicable clinical procedures include sterile procedures such as surgical procedures, as well as non-sterile procedures such as those conventionally involving a core code reporter. The transcribed audio data may be analyzed to identify relevant information for documenting the clinical procedure, and a text report including the relevant information documenting the clinical procedure may be automatically generated.

Type: Grant

Filed: October 9, 2017

Date of Patent: February 5, 2019

Assignee: Nuance Communications, Inc.

Inventor: Mariana Casella dos Santos
Systems and methods for generating speech of multiple styles from text

Patent number: 10192541

Abstract: A text-to-speech (TTS) system includes components capable of supporting the generation of speech output in any of multiple styles, and may switch seamlessly from producing speech output in one style to producing speech output in another style. For example, a concatenative TTS system may include a speech base storing speech units associated with multiple speech styles, and a linguistic analysis component to generate a phonetic transcription specifying speech output in any of multiple styles. Text input may include a style indication associated with a particular segment of the input text. The linguistic analysis component may invoke encoded rules and/or components based upon the style indication, and generate a phonetic transcription specifying a speech style, which may be processed to generate output speech.

Type: Grant

Filed: June 5, 2014

Date of Patent: January 29, 2019

Assignee: Nuance Communications, Inc.

Inventors: Paolo Mairano, Corinne Bos-Plachez, Sourav Nandy, Johan Wouters, Silvia Maria Antonella Quazza, Dong-Jian Yue
Method and system for conveying an example in a natural language understanding application

Patent number: 10192543

Abstract: A method (300) and system (100) is provided to add the creation of examples at a developer level in the generation of Natural Language Understanding (NLU) models, tying the examples into a NLU sentence database (130), automatically validating (310) a correct outcome of using the examples, and automatically resolving (316) problems the user has using the examples. The method (300) can convey examples of what a caller can say to a Natural Language Understanding (NLU) application. The method includes entering at least one example associated with an existing routing destination, and ensuring an NLU model correctly interprets the example unambiguously for correctly routing a call to the routing destination. The method can include presenting the example sentence in a help message (126) within an NLU dialogue as an example of what a caller can say for connecting the caller to a desired routing destination.

Type: Grant

Filed: May 10, 2016

Date of Patent: January 29, 2019

Assignee: Nuance Communications, Inc.

Inventors: Rajesh Balchandran, Linda M. Boyer, James R. Lewis, Brent D. Metz
DOCUMENTATION TAG PROCESSING SYSTEM

Publication number: 20190027149

Abstract: Described herein are embodiments of a system configured to receive text input (e.g., in the form of speech input) that includes provisional text and interpret the provisional text to produce substitute text with which the provisional text is replaced. A user dictating speech input may dictate the provisional text along with other content of the speech, and the speech input including the provisional text may be converted to text in a speech recognition process performed by an automatic speech recognition (ASR) system. The text corresponding to the speech input may be reviewed to determine whether any character strings included in the text match a character pattern defined for provisional text. If so, the character string is interpreted to determine a data field indicated by the provisional text, and substitute text including a value for the data field is determined. The provisional text may then be replaced with the substitute text.

Type: Application

Filed: July 20, 2017

Publication date: January 24, 2019

Applicant: Nuance Communications, Inc.

Inventor: Markus Vogel
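
The match-and-replace flow described in the abstract above can be sketched as follows. The `[[field]]` character pattern, the function name `replace_provisional`, and the field values are assumptions made for illustration; the abstract does not specify the actual pattern or data source.

```python
import re
from typing import Dict

# Hypothetical character pattern for provisional text: [[field_name]].
PROVISIONAL = re.compile(r"\[\[(\w+)\]\]")


def replace_provisional(text: str, fields: Dict[str, str]) -> str:
    """Replace each provisional tag with the value of the data field it names.

    Tags whose field has no known value are left unchanged so that they can
    be resolved later.
    """
    def substitute(match: "re.Match[str]") -> str:
        field_name = match.group(1)
        return fields.get(field_name, match.group(0))

    return PROVISIONAL.sub(substitute, text)


# Hypothetical recognized text and data-field values.
recognized = "Patient is [[age]] years old, seen by [[physician]]."
report = replace_provisional(recognized, {"age": "54"})
```

Leaving unresolved tags in place, rather than deleting them, is a design choice of this sketch: it keeps the text reviewable when a field value is not yet available.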
Method and apparatus for exploiting language skill information in automatic speech recognition

Patent number: 10186256

Abstract: Typical speech recognition systems usually use speaker-specific speech data to apply speaker adaptation to models and parameters associated with the speech recognition system. Given that speaker-specific speech data may not be available to the speech recognition system, information indicative of language skills is employed in adapting configurations of a speech recognition system. According to at least one example embodiment, a method and corresponding apparatus, for speech recognition comprise maintaining information indicative of language skills of users of the speech recognition system. A configuration of the speech recognition system for a user is determined based at least in part on corresponding information indicative of language skills of the user. Upon receiving speech data from the user, the configuration of the speech recognition system determined is employed in performing speech recognition.

Type: Grant

Filed: January 23, 2014

Date of Patent: January 22, 2019

Assignee: Nuance Communications, Inc.

Inventors: Weiying Li, Daniel Willett
Audio-visual speech recognition with scattering operators

Patent number: 10181325

Abstract: Aspects described herein are directed towards methods, computing devices, systems, and computer-readable media that apply scattering operations to extracted visual features of audiovisual input to generate predictions regarding the speech status of a subject. Visual scattering coefficients generated according to one or more aspects described herein may be used as input to a neural network operative to generate the predictions regarding the speech status of the subject. Predictions generated based on the visual features may be combined with predictions based on audio input associated with the visual features. In some embodiments, the extracted visual features may be combined with the audio input to generate a combined feature vector for use in generating predictions.

Type: Grant

Filed: June 30, 2017

Date of Patent: January 15, 2019

Assignee: Nuance Communications, Inc.

Inventors: Etienne Marcheret, Josef Vopicka, Vaibhava Goel
Updating population language models based on changes made by user clusters

Patent number: 10176803

Abstract: Technology for improving the predictive accuracy of input word recognition on a device by dynamically updating the lexicon of recognized words based on the word choices made by similar users. The technology collects users' vocabulary choices (e.g., words that each user uses, or adds to or removes from a word recognition dictionary), associates users who make similar choices, aggregates related vocabulary choices, filters the words, and sends words identified as likely choices for that user to the user's device. Clusters may include, for example, users in a particular location (e.g., sets of people who use words such as “Puyallup,” “Gloucester,” or “Waiheke”), users with a particular professional or hobby vocabulary, or application-specific vocabulary (e.g., word choices in map searches or email messages).

Type: Grant

Filed: June 5, 2017

Date of Patent: January 8, 2019

Assignee: Nuance Communications, Inc.

Inventors: Ethan R. Bradford, Simon Corston, David J. Kay, Donni McCray, Keith Trnka
SYSTEM AND METHOD FOR SUPPRESSION OF NON-LINEAR ACOUSTIC ECHOES

Publication number: 20180367674

Abstract: A method for residual echo suppression is provided. Embodiments may include receiving an original reference signal and applying a distortion function to the original reference signal to generate a second signal. Embodiments may include generating a non-linear signal from the distortion function that does not include linear components of the original reference signal. Embodiments may also include calculating a residual echo power of a linear component and a non-linear component, wherein the linear component is based upon the original reference signal and the non-linear component is based upon the non-linear signal. Embodiments may further include applying a room model to each of the original reference signal and the non-linear signal and estimating a power associated with the original reference signal and the non-linear signal. Embodiments may include calculating a combined echo power estimate as a weighted sum of a weighted original reference signal power and a weighted non-linear signal power.

Type: Application

Filed: December 8, 2015

Publication date: December 20, 2018

Applicant: Nuance Communications, Inc.

Inventors: Ingo Schalk-Schupp, Markus Buck, Friedrich Faubel
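
As a rough illustration of the abstract above, the sketch below separates a distorted reference signal into linear and non-linear parts and combines their powers as a weighted sum. Everything specific here is an assumption: the `tanh` distortion function, the least-squares projection used to remove the linear component, and the weights are hypothetical, and the room model mentioned in the abstract is omitted entirely.

```python
import math
from typing import List, Sequence


def power(signal: Sequence[float]) -> float:
    """Mean squared amplitude of a signal (0.0 for an empty signal)."""
    return sum(s * s for s in signal) / len(signal) if signal else 0.0


def nonlinear_part(reference: Sequence[float],
                   distorted: Sequence[float]) -> List[float]:
    """Remove the component of `distorted` that is linear in `reference`.

    Uses a least-squares projection (an assumed technique), so the result
    is orthogonal to the reference signal.
    """
    energy = sum(r * r for r in reference)
    if energy == 0.0:
        return list(distorted)
    gain = sum(d * r for d, r in zip(distorted, reference)) / energy
    return [d - gain * r for d, r in zip(distorted, reference)]


def combined_echo_power(linear_power: float, nonlinear_power: float,
                        w_linear: float, w_nonlinear: float) -> float:
    """Weighted sum of the linear and non-linear power estimates."""
    return w_linear * linear_power + w_nonlinear * nonlinear_power


# Hypothetical reference signal and distortion function.
reference = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
distorted = [math.tanh(2.0 * r) for r in reference]
residual = nonlinear_part(reference, distorted)
estimate = combined_echo_power(power(reference), power(residual), 0.8, 0.2)
```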
