Patents Assigned to Nuance Communications
  • Patent number: 10276166
    Abstract: A method of detecting an occurrence of splicing in a speech signal includes comparing one or more discontinuities in the test speech signal to one or more reference speech signals corresponding to the test speech signal. The method may further include calculating a frame-based spectral-like representation ST of the speech signal, and calculating a frame-based spectral-like representation SE of a reference speech signal corresponding to the speech signal. The method further includes aligning ST and SE in time and frequency, calculating a distance function associated with aligned ST and SE, and evaluating the distance function to determine a score. The method also includes comparing the score to a threshold to detect if splicing occurs in the speech signal.
    Type: Grant
    Filed: July 22, 2014
    Date of Patent: April 30, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Zvi Kons, Ron Hoory, Hagai Aronowitz
  • Patent number: 10276157
    Abstract: Some embodiments provide techniques performed by at least one voice agent. The techniques include receiving voice input; identifying at least one application program as relating to the received voice input; and displaying at least one selectable visual representation that, when selected, causes focus of the computing device to be directed to the at least one application program identified as relating to the received voice input.
    Type: Grant
    Filed: October 1, 2012
    Date of Patent: April 30, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Timothy Lynch, Sean P. Brown, Paweena Attayadmawittaya, Tiago Goncalves Cabaco, Victor Shine Chen
  • Patent number: 10276181
    Abstract: A method, computer program product, and computer system for addressing acoustic signal reverberation is provided. Embodiments may include receiving, at one or more microphones, a first audio signal and a reverberation audio signal. Embodiments may further include processing at least one of the first audio signal and the reverberation audio signal. Embodiments may also include limiting a model based reverberation equalizer using a temporal constraint for direct sound distortions, the model based reverberation equalizer configured to generate one or more outputs, based upon, at least in part, at least one of the first audio signal and the reverberation audio signal.
    Type: Grant
    Filed: September 5, 2017
    Date of Patent: April 30, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Tobias Wolff, Lars Tebelmann
  • Publication number: 20190108830
    Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a desired speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identifying comprising identifying a first speech segment recorded and/or synthesized in a first speaking style that is different from the desired speaking style based at least in part on a measure of similarity between the desired speaking style and the first speaking style; synthesizing speech from the text in the desired speaking style at least in part by using the first speech segment; and outputting the synthesized speech.
    Type: Application
    Filed: June 4, 2018
    Publication date: April 11, 2019
    Applicant: Nuance Communications, Inc.
    Inventor: Vincent Pollet
  • Publication number: 20190109880
    Abstract: Methods and apparatus for communicating between virtual agents associated with users of electronic devices connected via at least one network. A first user may instruct an associated first virtual agent to invoke a communication session with a second virtual agent associated with a second user. To invoke the communication session, the first virtual agent may send an outgoing communication to the second virtual agent and the outgoing communication may instruct the second virtual agent to perform at least one action on behalf of the first user. Virtual agents associated with different users may alternatively communicate with each other in the absence of user interaction to perform a collaborative action.
    Type: Application
    Filed: December 7, 2018
    Publication date: April 11, 2019
    Applicant: Nuance Communications, Inc.
    Inventors: Michael Stuart Phillips, John Nguyen, Thomas Jay Leonard, David Grannan
  • Patent number: 10242690
    Abstract: Embodiments of the present disclosure may include a system and method for speech enhancement using the coherent to diffuse sound ratio. Embodiments may include receiving an audio signal at one or more microphones and controlling one or more adaptive filters of a beamformer using a coherent to diffuse ratio (“CDR”).
    Type: Grant
    Filed: December 12, 2014
    Date of Patent: March 26, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Tobias Wolff, Timo Matheja, Markus Buck
  • Patent number: 10235359
    Abstract: Inferring a natural language grammar is based on providing natural language understanding (NLU) data with concept annotations according to an application ontology characterizing a relationship structure between application-related concepts for a given NLU application. An application grammar is then inferred from the concept annotations and the application ontology.
    Type: Grant
    Filed: July 15, 2013
    Date of Patent: March 19, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Réal Tremblay, Jerome Tremblay, Stephen Douglas Peters, Serge Robillard
  • Patent number: 10237412
    Abstract: The present disclosure is directed towards an audio conferencing method. Some embodiments may include receiving, at a first mixing device, a first audio stream from one or more participant conferencing devices. The method may further include generating a top-N voice stream at the first mixing device, wherein the top-N voice stream corresponds with at least one top-N talker and wherein the identification of the at least one top-N talker is based upon, at least in part, an activity ranking. The method may also include receiving the top-N voice stream at a centralized mixing device and generating at least one mixed audio stream at the centralized mixing device.
    Type: Grant
    Filed: April 18, 2014
    Date of Patent: March 19, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Sridhar Pilli, Mahesh Godavarti
  • Patent number: 10229106
    Abstract: Designing a natural language understanding (NLU) model for an application from scratch can be difficult for non-experts. A system can simplify the design process by providing an interface allowing a designer to input example usage sentences and build an NLU model based on presented matches to those example sentences. In one embodiment, a method for initializing a workspace for building an NLU system includes parsing a sample sentence to select at least one candidate stub grammar from among multiple candidate stub grammars. The method can include presenting, to a user, respective representations of the candidate stub grammars selected by the parsing of the sample sentence. The method can include enabling the user to choose one of the respective representations of the candidate stub grammars. The method can include adding to the workspace a stub grammar corresponding to the representation of the candidate stub grammar chosen by the user.
    Type: Grant
    Filed: July 26, 2013
    Date of Patent: March 12, 2019
    Assignee: Nuance Communications, Inc.
    Inventor: Jeffrey N. Marcus
  • Patent number: 10229701
    Abstract: A mobile device is adapted for automatic speech recognition (ASR). A user interface for interaction with a user includes an input microphone for obtaining speech inputs from the user for automatic speech recognition, and an output interface for system output to the user based on ASR results that correspond to the speech input. A local controller obtains a sample of non-ASR audio from the input microphone for ASR-adaptation to channel-specific ASR characteristics, and then provides a representation of the non-ASR audio to a remote ASR server for server-side adaptation to the channel-specific ASR characteristics, and then provides a representation of an unknown ASR speech input from the input microphone to the remote ASR server for determining ASR results corresponding to the unknown ASR speech input, and then provides the system output to the output interface.
    Type: Grant
    Filed: June 12, 2017
    Date of Patent: March 12, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Daniel Willett, Jean-Guy E. Dahan, William F. Ganong, III, Jianxiong Wu
  • Patent number: 10229686
    Abstract: Methods and apparatus to process microphone signals by a speech enhancement module to generate an audio stream signal including first and second metadata for use by a speech recognition module. In an embodiment, speech recognition is performed using endpointing information including transitioning from a silence state to a maybe speech state, in which data is buffered, based on the first metadata and transitioning to a speech state, in which speech recognition is performed, based upon the second metadata.
    Type: Grant
    Filed: August 18, 2014
    Date of Patent: March 12, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Markus Buck, Tobias Herbig, Simon Graf, Christophe Ris
  • Publication number: 20190073999
    Abstract: According to some aspects, a system for detecting a designated wake-up word is provided, the system comprising a plurality of microphones to detect acoustic information from a physical space having a plurality of acoustic zones, and at least one processor configured to receive a first acoustic signal representing the acoustic information received by the plurality of microphones, process the first acoustic signal to identify content of the first acoustic signal originating from each of the plurality of acoustic zones, provide a plurality of second acoustic signals, each of the plurality of second acoustic signals substantially corresponding to the content identified as originating from a respective one of the plurality of acoustic zones, and perform automatic speech recognition on each of the plurality of second acoustic signals to determine whether the designated wake-up word was spoken.
    Type: Application
    Filed: February 10, 2016
    Publication date: March 7, 2019
    Applicant: Nuance Communications, Inc.
    Inventors: Julien Prémont, Tim Haulick, Emanuele Dalmasso, Munir Nikolai Alexander Georges, Andreas Kellner, Gaetan Martens, Oliver van Porten, Holger Quast, Martin Rößler, Tobias Wolff, Markus Buck
  • Patent number: 10223411
    Abstract: A method of providing a task assistant is described. The task assistant is designed to receive input from a user through multimodal input including a plurality of: speech input, typing input, and touch input; to determine the meaning of the input; and to determine whether there is a context based on prior interactions with the user. The method further generates an interpreted input based on a combination of the input and the context, and provides a formatted query to an application. The method further receives data from the application in response to the formatted query, and provides a response to the user through multimodal output including a plurality of: speech output, text output, non-speech audio output, haptic output, and visual non-text output. The method further updates the context based on the interpreted input.
    Type: Grant
    Filed: March 6, 2013
    Date of Patent: March 5, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: David Andrew Mauro, Henri Bouvier, Elizabeth Ann Dykstra-Erickson, Simona Gandrabur, Susan Dawnstarr Daniel, Aimee Piercy, Robert Douglas Sharp
  • Patent number: 10210003
    Abstract: Methods and apparatus to process a user input on independent applications that provide classifier outputs to an arbitration module, which selects one of the applications to respond to the user input. The classifier outputs include a probability that the user input is in domain for the application functionality.
    Type: Grant
    Filed: September 30, 2014
    Date of Patent: February 19, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Frederick Ducatelle, Marcus Grober, Gaetan Martens
  • Patent number: 10199035
    Abstract: Systems, methods, and computer-readable storage devices for performing per-channel automatic speech recognition. An example system configured to practice the method combines a first audio signal of a first speaker in a communication session and a second audio signal from a second speaker in the communication session as a first audio channel and a second audio channel. The system can recognize speech in the first audio channel of the recording using a first model specific to the first speaker, and recognize speech in the second audio channel of the recording using a second model specific to the second speaker, wherein the first model is different from the second model. The system can generate recognized speech as an output from the communication session. The system can identify the models based on identifiers of the speakers, such as a telephone number, an IP address, a customer number, or account number.
    Type: Grant
    Filed: November 22, 2013
    Date of Patent: February 5, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Ilya Dan Melamed, Andrej Ljolje
  • Patent number: 10199052
    Abstract: Systems, computer-implemented methods, and tangible computer-readable media are presented to provide dynamic speech processing services during variable network connectivity. The method includes monitoring, via a processor, a level of network connectivity between a device and a network server. When the level of network connectivity between the device and the network server is below a threshold, the method includes performing speech processing using a speech processor of the device. When the level of network connectivity between the device and the network server is at or above the threshold, the method includes performing speech processing using a speech processor at the network server.
    Type: Grant
    Filed: June 12, 2017
    Date of Patent: February 5, 2019
    Assignee: Nuance Communications, Inc.
    Inventor: Horst Schroeter
  • Patent number: 10199124
    Abstract: Techniques for documenting a clinical procedure involve transcribing audio data comprising audio of one or more clinical personnel speaking while performing the clinical procedure. Examples of applicable clinical procedures include sterile procedures such as surgical procedures, as well as non-sterile procedures such as those conventionally involving a core code reporter. The transcribed audio data may be analyzed to identify relevant information for documenting the clinical procedure, and a text report including the relevant information documenting the clinical procedure may be automatically generated.
    Type: Grant
    Filed: October 9, 2017
    Date of Patent: February 5, 2019
    Assignee: Nuance Communications, Inc.
    Inventor: Mariana Casella dos Santos
  • Patent number: 10199039
    Abstract: A machine-readable medium may include a group of reusable components for building a spoken dialog system. The reusable components may include a group of previously collected audible utterances. A machine-implemented method to build a library of reusable components for use in building a natural language spoken dialog system may include storing a dataset in a database. The dataset may include a group of reusable components for building a spoken dialog system. The reusable components may further include a group of previously collected audible utterances. A second method may include storing at least one set of data. Each one of the at least one set of data may include ones of the reusable components associated with audible data collected during a different collection phase.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: February 5, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Lee Begeja, Giuseppe Di Fabbrizio, David Crawford Gibbon, Dilek Z. Hakkani-Tur, Zhu Liu, Bernard S. Renger, Behzad Shahraray, Gokhan Tur
  • Patent number: 10192541
    Abstract: A text-to-speech (TTS) system includes components capable of supporting the generation of speech output in any of multiple styles, and may switch seamlessly from producing speech output in one style to producing speech output in another style. For example, a concatenative TTS system may include a speech base storing speech units associated with multiple speech styles, and a linguistic analysis component to generate a phonetic transcription specifying speech output in any of multiple styles. Text input may include a style indication associated with a particular segment of the input text. The linguistic analysis component may invoke encoded rules and/or components based upon the style indication, and generate a phonetic transcription specifying a speech style, which may be processed to generate output speech.
    Type: Grant
    Filed: June 5, 2014
    Date of Patent: January 29, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Paolo Mairano, Corinne Bos-Plachez, Sourav Nandy, Johan Wouters, Silvia Maria Antonella Quazza, Dong-Jian Yue
  • Patent number: 10192543
    Abstract: A method (300) and system (100) are provided to add the creation of examples at a developer level in the generation of Natural Language Understanding (NLU) models, tying the examples into an NLU sentence database (130), automatically validating (310) a correct outcome of using the examples, and automatically resolving (316) problems the user has using the examples. The method (300) can convey examples of what a caller can say to a Natural Language Understanding (NLU) application. The method includes entering at least one example associated with an existing routing destination, and ensuring an NLU model correctly interprets the example unambiguously for correctly routing a call to the routing destination. The method can include presenting the example sentence in a help message (126) within an NLU dialogue as an example of what a caller can say for connecting the caller to a desired routing destination.
    Type: Grant
    Filed: May 10, 2016
    Date of Patent: January 29, 2019
    Assignee: Nuance Communications, Inc.
    Inventors: Rajesh Balchandran, Linda M. Boyer, James R. Lewis, Brent D. Metz
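
For readers who want a concrete picture of one of the listed abstracts, the threshold rule stated in patent 10199052 (process speech on the device when connectivity is below a threshold, and at the network server when it is at or above the threshold) can be sketched in a few lines of Python. This is only an illustration of the rule as worded in the abstract, not the patented implementation: the names `Processor` and `choose_processor`, the use of a single numeric connectivity reading, and the example threshold of 0.5 are all assumptions made for this sketch.

```python
from enum import Enum


class Processor(Enum):
    """Hypothetical labels for where speech processing runs."""
    DEVICE = "device"
    SERVER = "server"


def choose_processor(connectivity: float, threshold: float) -> Processor:
    """Pick a speech processor for one connectivity reading.

    Assumption: connectivity is a single number where larger means a
    better connection. The abstract does not say how it is measured.
    """
    # Below the threshold: use the speech processor of the device.
    if connectivity < threshold:
        return Processor.DEVICE
    # At or above the threshold: use the speech processor at the server.
    return Processor.SERVER


if __name__ == "__main__":
    # Example readings around an assumed threshold of 0.5.
    for level in (0.2, 0.5, 0.9):
        print(level, choose_processor(level, threshold=0.5).value)
```

Note that the boundary case goes to the server, matching the abstract's "at or above the threshold" wording.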