Patents Assigned to Nuance Communications, Inc.
-
Patent number: 9600086
Abstract: A portable electronic device (100,400) and user interface (425) are operated using a method including initiating entry of a content string; determining the most probable completion alternative or a content prediction using a personalized and learning database (430); displaying the most probable completion alternative or next content prediction; determining whether a user has accepted the most probable completion alternative or next content prediction; and adding the most probable completion alternative or next content prediction to the content string upon user acceptance.
Type: Grant
Filed: February 29, 2012
Date of Patent: March 21, 2017
Assignee: NUANCE COMMUNICATIONS, INC.
Inventors: Heiko K. Sacher, Maria E. Romera, Jens Nagel
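The predict, display, accept, and append loop described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: `LearningStore`, `enter`, and the frequency-count ranking are hypothetical names and choices made only for illustration, and learning the raw typed text on rejection is likewise an assumption.

```python
from collections import Counter


class LearningStore:
    """Hypothetical stand-in for the personalized, learning database."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, word):
        # Record one more use of this word by the user.
        self.counts[word] += 1

    def most_probable_completion(self, prefix):
        # Assumption: "most probable" means most frequently used so far.
        matches = [w for w in self.counts if w.startswith(prefix)]
        if not matches:
            return None
        # Highest count first; ties broken alphabetically for determinism.
        return min(matches, key=lambda w: (-self.counts[w], w))


def enter(store, content, prefix, accept):
    """Offer a completion for `prefix`; append it to `content` if accepted."""
    suggestion = store.most_probable_completion(prefix)
    if suggestion is not None and accept(suggestion):
        chosen = suggestion
    else:
        chosen = prefix  # user kept what was typed
    store.learn(chosen)
    return content + chosen + " "
```

Here `accept` is a callback standing in for the user's decision, so the same function covers both the accepted and the rejected case.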
-
Patent number: 9602672
Abstract: A device that enables users to send and receive a message in different formats may include a text message gateway, an audio message gateway, and a processor. The text message gateway may include a Short Message Service (SMS) gateway. The audio message gateway may include an Interactive Voice Response (IVR) unit and/or a client application interface that receives audio from a client application of a mobile communications device. The processor may be configured to convert text messages received at the text message gateway into audio messages and then to send the audio messages via the audio message gateway. The processor may also be configured to convert audio messages received at the audio message gateway into text messages and then to send the text messages via the text message gateway.
Type: Grant
Filed: March 30, 2015
Date of Patent: March 21, 2017
Assignee: Nuance Communications, Inc.
Inventor: Robert Lee Engelhart, Sr.
-
Publication number: 20170076718
Abstract: Methods and apparatus for performing speech recognition using a garbage model. The method comprises receiving audio comprising speech and processing at least some of the speech using a garbage model to produce a garbage speech recognition result. The garbage model includes a plurality of sub-words, each of which corresponds to a possible combination of phonemes in a particular language.
Type: Application
Filed: May 9, 2014
Publication date: March 16, 2017
Applicant: Nuance Communication, Inc
Inventors: Cosmin Popovici, Kenneth W.D. Smith, Petrus C. Cools
-
Patent number: 9595257
Abstract: An approach for phoneme recognition is described. A sequence of intermediate output posterior vectors is generated from an input sequence of cepstral features using a first layer perceptron. The intermediate output posterior vectors are then downsampled to form a reduced input set of intermediate posterior vectors for a second layer perceptron. A sequence of final posterior vectors is generated from the reduced input set of intermediate posterior vectors using the second layer perceptron. Then the final posterior vectors are decoded to determine an output recognized phoneme sequence representative of the input sequence of cepstral features.
Type: Grant
Filed: September 28, 2009
Date of Patent: March 14, 2017
Assignee: Nuance Communications, Inc.
Inventors: Daniel Andrés Vásquez Cano, Guillermo Aradilla, Rainer Gruhn
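The pipeline in this abstract (first perceptron, downsampling, second perceptron, decoding) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the patented method: each perceptron is reduced to a single softmax layer, downsampling keeps every `factor`-th vector, and decoding is a simple per-frame argmax with repeated labels collapsed. All function names are hypothetical.

```python
import math


def softmax(scores):
    # Numerically stable softmax over one score vector.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def perceptron_layer(frames, weights, biases):
    # Map each input frame to a posterior vector.
    out = []
    for frame in frames:
        scores = [sum(w * x for w, x in zip(row, frame)) + b
                  for row, b in zip(weights, biases)]
        out.append(softmax(scores))
    return out


def downsample(vectors, factor):
    # Assumption: uniform downsampling that keeps every factor-th vector.
    return vectors[::factor]


def decode(posteriors, phonemes):
    # Assumption: argmax per frame, collapsing consecutive repeats.
    sequence = []
    for vector in posteriors:
        label = phonemes[vector.index(max(vector))]
        if not sequence or sequence[-1] != label:
            sequence.append(label)
    return sequence


def recognize(cepstra, layer1, layer2, factor, phonemes):
    intermediate = perceptron_layer(cepstra, *layer1)  # first layer
    reduced = downsample(intermediate, factor)         # reduced input set
    final = perceptron_layer(reduced, *layer2)         # second layer
    return decode(final, phonemes)
```

Each of `layer1` and `layer2` is a `(weights, biases)` pair. Downsampling means the second layer processes roughly `1/factor` as many vectors as the first.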
-
Patent number: 9589198
Abstract: The present invention relates to a camera based method for text input and detection of a keyword or of a text-part within a page or a screen comprising the steps of: directing a camera module on the printed page and capturing an image thereof; digital image filtering of the captured image; detection of word blocks contained in the image, each word block containing most likely a recognizable word; performing OCR within each word block; determination of A-blocks among the word blocks according to a keyword probability determination rule, wherein each of the A-blocks contains most likely the keyword; assignment of an attribute to each A-block; indication of the A-blocks in the display by a frame or the like for a further selection of the keyword; further selection of the A-block containing the keyword based on the displayed attribute of the keyword; forwarding the text content as text input to an application.
Type: Grant
Filed: March 5, 2015
Date of Patent: March 7, 2017
Assignee: Nuance Communications, Inc.
Inventors: Cuneyt Goktekin, Oliver Tenchio
-
Publication number: 20170061968
Abstract: Techniques for automatically identifying a speaker in a conversation as a known person based on processing of audio of the speaker's voice to extract characteristics of that voice and on an automated comparison of those characteristics to known characteristics of the known person's voice. A speaker segmentation process may be performed on audio of the conversation to produce, for each speaker in the conversation, a segment that includes the audio of that speaker. Audio of each of the segments may then be processed to extract characteristics of that speaker's voice. The characteristics derived from each segment (and thus for multiple speakers) may then be compared to characteristics of the known person's voice to determine whether the speaker for that segment is the known person. For each segment, a degree of match between the voice characteristics of the speaker and the voice characteristics of the known person may be calculated.
Type: Application
Filed: August 27, 2015
Publication date: March 2, 2017
Applicant: Nuance Communications, Inc.
Inventors: Emanuele Dalmasso, Daniele Colibro, Claudio Vair, Kevin R. Farrell
-
Patent number: 9583096
Abstract: A method for state transition in voice systems including: generating one or more stackable state macros, each of the one or more stackable state macros including a plurality of commands; saving the current state before executing another macro; enabling restoring the previous state after a plurality of commands is completed, allowing a user to utter voice commands to restore the individual state of components or the voice systems as a whole to the previous state or to a known home state. The method further utilizes voice commands not specific to the current state and is used specifically for automatically controlling a plurality of components of a vehicle.
Type: Grant
Filed: August 15, 2006
Date of Patent: February 28, 2017
Assignee: Nuance Communications, Inc.
Inventors: Ciprian Agapi, Musaed A. Almutawa, Oscar J. Blass, Patrick M. Commarford, Roberto Vila
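The save-before-macro and restore behaviour described in this abstract maps naturally onto a stack. The sketch below is only an illustration under that assumption: `StateStack`, its method names, and the representation of state as a dictionary of component settings are all hypothetical, and voice recognition itself is out of scope.

```python
class StateStack:
    """Hypothetical model of stackable state macros for vehicle components."""

    def __init__(self, home_state):
        self.home = dict(home_state)     # known home state
        self.current = dict(home_state)  # live component settings
        self._saved = []                 # states saved before each macro

    def run_macro(self, commands):
        # Save the current state, then apply every command in the macro.
        self._saved.append(dict(self.current))
        for component, value in commands:
            self.current[component] = value

    def restore_previous(self):
        # Undo the most recent macro, if any.
        if self._saved:
            self.current = self._saved.pop()

    def go_home(self):
        # Return the whole system to the known home state.
        self._saved.clear()
        self.current = dict(self.home)
```

Because each macro pushes a copy of the state, macros can be nested and undone one at a time, while `go_home` discards the whole history.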
-
Publication number: 20170053667
Abstract: Methods and apparatus for broadening the beamwidth of beamforming and postfiltering using a plurality of beamformers and signal and power spectral density mixing, and controlling a postfilter based on spatial activity detection such that de-reverberation or noise reduction is performed when a speech source is between the first and second beams.
Type: Application
Filed: July 2, 2014
Publication date: February 23, 2017
Applicant: NUANCE COMMUNICATIONS, INC.
Inventors: Tobias Wolff, Tim Haulick, Markus Buck
-
Publication number: 20170053644
Abstract: According to some aspects, a method of classifying speech recognition results is provided, using a neural network comprising a plurality of interconnected network units, each network unit having one or more weight values, the method comprising using at least one computer, performing acts of providing a first vector as input to a first network layer comprising one or more network units of the neural network, transforming, by a first network unit of the one or more network units, the input vector to produce a plurality of values, the transformation being based at least in part on a plurality of weight values of the first network unit, sorting the plurality of values to produce a sorted plurality of values, and providing the sorted plurality of values as input to a second network layer of the neural network.
Type: Application
Filed: August 20, 2015
Publication date: February 23, 2017
Applicant: Nuance Communications, Inc.
Inventors: Steven John Rennie, Vaibhava Goel
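The transform-then-sort step in this abstract can be shown with a small sketch. It assumes the transformation is an affine map (weights plus biases) and that values are sorted in descending order; neither detail is stated in the abstract, and the function name `sorted_unit` is hypothetical.

```python
def sorted_unit(x, weights, biases):
    """Transform input vector x with the unit's weights, then sort the values."""
    # Assumed affine transform: one output value per weight row.
    values = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    # Assumed descending order; the sorted list feeds the next layer.
    return sorted(values, reverse=True)
```

For instance, `sorted_unit([1.0, 2.0], [[1, 0], [0, 1], [1, 1]], [0, 0, 0])` computes the values 1.0, 2.0, and 3.0 and passes them on as `[3.0, 2.0, 1.0]`, so the next layer sees the values by rank rather than by which weight row produced them.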
-
Patent number: 9575946
Abstract: An automotive text display arrangement is described which includes a driver text display positioned directly in front of an automobile driver and displaying a limited amount of text to the driver without impairing forward visual attention of the driver. The arrangement may include a boundary insertion mode wherein when the active text position is an active text boundary, new text is inserted between the text items separated by the active text boundary, and when the active text position is an active text item, new text replaces the active text item. In addition or alternatively, there may be a multifunctional text control knob offering multiple different user movements, each performing an associated text processing function.
Type: Grant
Filed: May 23, 2011
Date of Patent: February 21, 2017
Assignee: NUANCE COMMUNICATIONS, INC.
Inventors: Jan Curin, Jan Kleindienst, Martin Labsky, Tomas Macek, Lars Koenig, Holger Quast, Garrett Weinberg
-
Patent number: 9576571
Abstract: Techniques are disclosed for recognizing user personality in accordance with a speech recognition system. For example, a technique for recognizing a personality trait associated with a user interacting with a speech recognition system includes the following steps/operations. One or more decoded spoken utterances of the user are obtained. The one or more decoded spoken utterances are generated by the speech recognition system. The one or more decoded spoken utterances are analyzed to determine one or more linguistic attributes (morphological and syntactic filters) that are associated with the one or more decoded spoken utterances. The personality trait associated with the user is then determined based on the analyzing step/operation.
Type: Grant
Filed: May 2, 2014
Date of Patent: February 21, 2017
Assignee: Nuance Communications, Inc.
Inventors: Osamuyimen Thompson Stewart, Liwei Dai
-
Patent number: 9576580
Abstract: Described herein are techniques for determining corresponding positions between different representations of a textual work. In some of the techniques, portions of one or more representations may be processed. A determination of a corresponding position may be made in response to a request received from a user, such as a reader that desires to switch between representations. The request may indicate a position in one representation and the representation to which the user would like to switch. In response to receiving the request, one or more portions of one or more representations of a textual work may be processed. In some techniques, a corresponding position between different representations may be determined without processing the entirety of one or more representations of the textual work. For example, a corresponding position may be determined without processing an entire audio representation.
Type: Grant
Filed: May 2, 2016
Date of Patent: February 21, 2017
Assignee: Nuance Communications, Inc.
Inventor: William F. Ganong, III
-
Patent number: 9572103
Abstract: Embodiments included herein are directed towards a system and method for addressing discontinuous transmission (DTX) in a network device. Embodiments may include receiving, at a computing device, an audio signal and generating at least one silence descriptor (SID) frame associated with the audio signal. Embodiments may also include generating at least one no data frame associated with the audio signal. Embodiments may also include initiating a speech decoder, voice enhancement, and speech encoder operation for the at least one SID frame during a DTX operation and bypassing the speech decoder, voice enhancement, and speech encoder functions for the at least one no data frame.
Type: Grant
Filed: September 24, 2014
Date of Patent: February 14, 2017
Assignee: Nuance Communications, Inc.
Inventors: Qian-Yu Tang, Victor Zeyliger, Franck Bonard, Weiying Li
-
Patent number: 9569594
Abstract: Techniques for documenting a clinical procedure involve transcribing audio data comprising audio of one or more clinical personnel speaking while performing the clinical procedure. Examples of applicable clinical procedures include sterile procedures such as surgical procedures, as well as non-sterile procedures such as those conventionally involving a core code reporter. The transcribed audio data may be analyzed to identify relevant information for documenting the clinical procedure, and a text report including the relevant information documenting the clinical procedure may be automatically generated.
Type: Grant
Filed: March 8, 2012
Date of Patent: February 14, 2017
Assignee: Nuance Communications, Inc.
Inventor: Mariana Casella dos Santos
-
Patent number: 9569424
Abstract: Methods and apparatus for processing a voicemail message to generate a textual representation of at least a portion of the voicemail message. At least one emotion expressed in the voicemail message is determined by applying at least one emotion classifier to the voicemail message and/or the textual representation. An indication of the determined at least one emotion is provided in a manner associated with the textual representation of the at least a portion of the voicemail message.
Type: Grant
Filed: February 21, 2013
Date of Patent: February 14, 2017
Assignee: Nuance Communications, Inc.
Inventor: Raquel Sanchez Martinez
-
Patent number: 9569593
Abstract: Techniques for documenting a clinical procedure involve transcribing audio data comprising audio of one or more clinical personnel speaking while performing the clinical procedure. Examples of applicable clinical procedures include sterile procedures such as surgical procedures, as well as non-sterile procedures such as those conventionally involving a core code reporter. The transcribed audio data may be analyzed to identify relevant information for documenting the clinical procedure, and a text report including the relevant information documenting the clinical procedure may be automatically generated.
Type: Grant
Filed: March 8, 2012
Date of Patent: February 14, 2017
Assignee: Nuance Communications, Inc.
Inventor: Mariana Casella dos Santos
-
Patent number: 9570065
Abstract: Techniques for performing multi-style speech synthesis. The techniques include using at least one computer hardware processor to perform: obtaining input comprising text and an identification of a first speaking style to use in rendering the text as speech; identifying a plurality of speech segments for use in rendering the text as speech, the identified plurality of speech segments comprising a first speech segment having the first speaking style and a second speech segment having a second speaking style different from the first speaking style; and rendering the text as speech having the first speaking style, at least in part, by using the identified plurality of speech segments.
Type: Grant
Filed: September 29, 2014
Date of Patent: February 14, 2017
Assignee: Nuance Communications, Inc.
Inventor: Vincent Pollet
-
Patent number: 9571645
Abstract: A method for conducting a call between a caller and an interactive voice response (IVR) system, the caller using a device to conduct the call, the device configured to execute a virtual assistant, the method comprising using the virtual assistant to conduct the call at least in part by influencing the style of information provided to the caller during the call and/or the content of information passed between the device and the IVR system during the call.
Type: Grant
Filed: December 16, 2013
Date of Patent: February 14, 2017
Assignee: Nuance Communications, Inc.
Inventors: Holger Quast, Kenneth W. D. Smith, Jean-Guy E. Dahan, Andrew D. Mauro
-
Patent number: 9564140
Abstract: Some embodiments relate to techniques for encoding an audio signal represented by a plurality of frames including a first frame. The techniques include using at least one computer hardware processor to perform: obtaining an initial discrete spectral representation of the first frame; obtaining a primary discrete spectral representation of the initial discrete spectral representation at least in part by estimating a phase envelope of the initial discrete spectral representation and evaluating the estimated phase envelope at a discrete set of frequencies; calculating a residual discrete spectral representation of the initial discrete spectral representation based on the initial discrete spectral representation and the primary discrete spectral representation; and encoding the residual discrete spectral representation using a plurality of codewords.
Type: Grant
Filed: April 7, 2015
Date of Patent: February 7, 2017
Assignee: Nuance Communications, Inc.
Inventors: Slava Shechtman, Alexander Sorin
-
Patent number: 9564126
Abstract: In some embodiments, a recognition result produced by a speech processing system based on an analysis of a speech input is evaluated for indications of potential errors. In some embodiments, sets of words/phrases that may be acoustically similar or otherwise confusable, the misrecognition of which can be significant in the domain, may be used together with a language model to evaluate a recognition result to determine whether the recognition result includes such an indication. In some embodiments, a word/phrase of a set that appears in the result is iteratively replaced with each of the other words/phrases of the set. The result of the replacement may be evaluated using a language model to determine a likelihood of the newly-created string of words appearing in a language and/or domain. The likelihood may then be evaluated to determine whether the result of the replacement is sufficiently likely for an alert to be triggered.
Type: Grant
Filed: December 1, 2014
Date of Patent: February 7, 2017
Assignee: Nuance Communications, Inc.
Inventors: William F. Ganong, III, Raghu Vemula, Robert Fleming
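The replace-and-rescore check described in this abstract can be sketched as follows. This is an illustration only: it handles single words rather than phrases, uses a toy bigram log-probability table in place of a real language model, and compares against a fixed threshold. The names `score` and `find_alerts`, the `unseen` penalty, and the threshold rule are all assumptions.

```python
def score(words, bigram_logprob, unseen=-10.0):
    """Toy language-model score: sum of bigram log-probabilities."""
    return sum(bigram_logprob.get(pair, unseen)
               for pair in zip(words, words[1:]))


def find_alerts(result, confusable_sets, bigram_logprob, threshold):
    """Return (position, original, alternative) for each plausible swap."""
    alerts = []
    for i, word in enumerate(result):
        for group in confusable_sets:
            if word not in group:
                continue
            # Replace the word with each other member of its confusable set.
            for alt in sorted(group - {word}):
                candidate = result[:i] + [alt] + result[i + 1:]
                # Alert if the replaced string is sufficiently likely.
                if score(candidate, bigram_logprob) >= threshold:
                    alerts.append((i, word, alt))
    return alerts
```

`confusable_sets` is expected to be a list of Python sets, for example `[{"hypotension", "hypertension"}]`, and `result` a list of recognized words.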