Creating Patterns For Matching Patents (Class 704/243)
  • Patent number: 10037758
    Abstract: A voice recognizer (3) generates multiple voice recognition results from a single input speech (2). For each voice recognition result, an intent understanding processor (7) estimates an intent, outputting one or more intent-understanding candidates together with their scores. A weight calculator (11) calculates standby weights using setting information (9) of a control target apparatus. An intent understanding corrector (12) corrects the candidates' scores using the standby weights to obtain final scores, and then selects one candidate as the intent understanding result (13) on the basis of the final scores.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: July 31, 2018
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Yi Jing, Yoichi Fujii, Jun Ishii
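The score-correction step described in the abstract above can be sketched as follows. This is a minimal illustration, assuming a simple multiplicative weighting; the function and intent names are hypothetical, not the patent's exact formulation:

```python
def correct_intent_scores(candidates, standby_weights):
    """Re-score intent candidates using standby weights, then pick the best.

    candidates: list of (intent, score) pairs from intent understanding.
    standby_weights: dict mapping intent -> weight derived from the
    target apparatus's current setting information.
    """
    final = [(intent, score * standby_weights.get(intent, 1.0))
             for intent, score in candidates]
    # Select the candidate with the highest final score.
    return max(final, key=lambda pair: pair[1])

# A lower-scored candidate can win once the apparatus state is factored in.
best = correct_intent_scores(
    [("play_music", 0.6), ("set_temperature", 0.5)],
    {"set_temperature": 1.5},  # apparatus is in climate-control mode
)
```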
  • Patent number: 10032451
    Abstract: Systems, methods, and devices for recognizing a user are disclosed. A speech-controlled device captures a spoken utterance and sends corresponding audio data to a server. The server determines content sources storing or having access to content responsive to the spoken utterance. The server also determines multiple users associated with a profile of the speech-controlled device. Using the audio data, the server may determine user recognition data with respect to each user indicated in the speech-controlled device's profile. The server may also receive user recognition confidence threshold data from each of the content sources. The server may determine which user recognition data satisfies (i.e., meets or exceeds) the most stringent (i.e., highest) of the user recognition confidence thresholds. Thereafter, the server may send data indicating the user associated with that user recognition data to all of the content sources.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: July 24, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Natalia Vladimirovna Mamkina, Naomi Bancroft, Nishant Kumar, Shamitha Somashekar
  • Patent number: 10019985
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
    Type: Grant
    Filed: April 22, 2014
    Date of Patent: July 10, 2018
    Assignee: Google LLC
    Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
  • Patent number: 10019986
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for speech recognition. One of the methods includes receiving first audio data corresponding to an utterance; obtaining a first transcription of the first audio data; receiving data indicating (i) a selection of one or more terms of the first transcription and (ii) one or more of replacement terms; determining that one or more of the replacement terms are classified as a correction of one or more of the selected terms; in response to determining that the one or more of the replacement terms are classified as a correction of the one or more of the selected terms, obtaining a first portion of the first audio data that corresponds to one or more terms of the first transcription; and using the first portion of the first audio data that is associated with the one or more terms of the first transcription to train an acoustic model for recognizing the one or more of the replacement terms.
    Type: Grant
    Filed: July 29, 2016
    Date of Patent: July 10, 2018
    Assignee: Google LLC
    Inventors: Olga Kapralova, Evgeny A. Cherepanov, Dmitry Osmakov, Martin Baeuml, Gleb Skobeltsyn
  • Patent number: 10010288
    Abstract: Detection of neurological diseases such as Parkinson's disease can be accomplished through analyzing a subject's speech for acoustic measures based on human factor cepstral coefficients (HFCC). Upon receiving a speech sample from a subject, a signal analysis can be performed that includes identifying articulation range and articulation rate using HFCC and delta coefficients. A likelihood of Parkinson's disease, for example, can be determined based upon the identified articulation range and articulation rate of the speech.
    Type: Grant
    Filed: January 17, 2017
    Date of Patent: July 3, 2018
    Assignees: Board of Trustees of Michigan State University, University of Florida Research Foundation, Inc.
    Inventors: John Clyde Rosenbek, Mark D. Skowronski, Rahul Shrivastav, Supraja Anand
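The delta coefficients mentioned in the abstract above are conventionally computed as a regression over neighboring cepstral frames. A minimal one-dimensional sketch of that standard formula (the HFCC filterbank itself is omitted, and the function name is illustrative):

```python
def delta_coefficients(frames, window=2):
    """Compute delta (first-difference) coefficients over a cepstral
    frame sequence using the standard regression formula.

    frames: list of per-frame coefficient values (one dimension shown).
    window: number of neighboring frames on each side of the regression.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, window + 1))
    deltas = []
    for t in range(n):
        num = 0.0
        for k in range(1, window + 1):
            # Clamp indices at the sequence edges.
            right = frames[min(t + k, n - 1)]
            left = frames[max(t - k, 0)]
            num += k * (right - left)
        deltas.append(num / denom)
    return deltas
```

Articulation rate and range could then be derived from statistics (e.g., spread and magnitude) of these deltas over an utterance.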
  • Patent number: 9996524
    Abstract: A first set of characters may be received in response to a user input for text prediction. An estimate may be generated indicating what second set of characters will be inputted. The generating an estimate may be based on at least receiving data from a second user device. At least some of the data may not be located within the second user device's text dictionary. At least some of the data may be provided to the first user device.
    Type: Grant
    Filed: November 14, 2017
    Date of Patent: June 12, 2018
    Assignee: International Business Machines Corporation
    Inventors: Inseok Hwang, Su Liu, Eric J. Rozner, Chin Ngai Sze
  • Patent number: 9997160
    Abstract: Systems and methods for dynamic download of embedded voice components are disclosed. One embodiment may be configured to receive an application via a wireless communication, where the application comprises a speech recognition component, receive a voice command from a user, and analyze the speech recognition component to determine a translation action to perform based on the voice command. In some embodiments, in response to determining that the translation action includes downloading a vocabulary from a first remote computing device, the vocabulary may be downloaded from the first remote computing device and utilized to translate the voice command. In some embodiments, in response to determining that the translation action includes communicating the voice command to a second remote computing device, the voice command may be sent to the second remote computing device and a translated version of the voice command may be received.
    Type: Grant
    Filed: July 1, 2013
    Date of Patent: June 12, 2018
    Assignee: TOYOTA MOTOR ENGINEERING & MANUFACTURING NORTH AMERICA, INC.
    Inventor: Eric Randell Schmidt
  • Patent number: 9984682
    Abstract: Automatic assessment of oral recitations during computer-based language assessments is provided using a trained neural network, automating the scoring and feedback processes without human transcription or scoring input. An automatic speech recognition ("ASR") scoring system trains multiple scoring reference vectors associated with the possible scores of an assessment and receives an acoustic language assessment response to an assessment item. A transcription is automatically generated from the acoustic response, and an individual word vector is generated from the transcription. An input vector is formed by concatenating the individual word vector with a transcription feature vector and supplied as input to a neural network. An output vector is generated based on the weights of the neural network, and a score is generated by comparing the output vector with the scoring reference vectors.
    Type: Grant
    Filed: March 30, 2017
    Date of Patent: May 29, 2018
    Assignee: Educational Testing Service
    Inventors: Jidong Tao, Lei Chen, Chong Min Lee
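The comparison step in the entry above — matching an output vector against per-score reference vectors — might look like the following sketch. The neural-network forward pass is stubbed out (the concatenated input stands in for the network's output vector), and all names are illustrative assumptions:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def score_response(word_vector, feature_vector, scoring_vectors):
    """Concatenate the word vector with transcription features and
    assign the score whose reference vector is most similar.

    scoring_vectors: dict mapping a score -> trained reference vector.
    """
    output = word_vector + feature_vector  # stand-in for the NN output
    return max(scoring_vectors,
               key=lambda s: cosine(scoring_vectors[s], output))
```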
  • Patent number: 9972340
    Abstract: In a computer system for navigating to a location in recorded content, a computer receives a descriptive term or phrase associated with a searchable tag. The searchable tag corresponds to a point-in-time at which a non-speech sound occurred during the recording of recorded content of a communication between a plurality of participants. The recorded content includes speech from one or more of the plurality of participants, the descriptive term includes an automatically generated phonetic translation of the non-speech sound, and the non-speech sound was transmitted to the plurality of participants during the recording. The computer navigates to a location in the recorded content corresponding to the point-in-time at which the non-speech sound occurred.
    Type: Grant
    Filed: July 27, 2016
    Date of Patent: May 15, 2018
    Assignee: International Business Machines Corporation
    Inventors: Denise A. Bell, Lisa Seacat Deluca, Jana H. Jenkins, Jeffrey A. Kusnitz
  • Patent number: 9965028
    Abstract: Provided are a method for proximity sensing in an interactive display and a method of processing a proximity-sensing image. The proximity-sensing method may include suspending, in a three-dimensional (3D) space, at least one hand of a user to manipulate at least one of the user's viewpoint and the position of a virtual object displayed on a screen, and changing the virtual object by manipulating at least one of the position of the virtual object and the viewpoint of the user based on the suspending.
    Type: Grant
    Filed: April 6, 2011
    Date of Patent: May 8, 2018
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Byung In Yoo, Du Sik Park, Chang Kyu Choi
  • Patent number: 9959296
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing suggestions within a document. In one aspect, a method includes obtaining textual input provided to a document editing application by a user device, the textual input being provided to the document editing application for inclusion in a document; identifying performance measures associated with the current editing session for the document, each performance measure being based on session data obtained from the user device during a document editing session, the session data being for the textual input and prior text that was included in the document prior to the textual input; providing the performance measures as input to a suggestion model that was trained using historical performance measures identified in performance logs for historical document editing sessions of users; and throttling textual suggestions during the current editing session based on the output of the suggestion model.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: May 1, 2018
    Assignee: Google LLC
    Inventors: Maxim Gubin, Kenneth W. Dauber, Sangsoo Sung, Krishna Bharat
  • Patent number: 9953632
    Abstract: According to an aspect of the present disclosure, a method for generating a keyword model of a user-defined keyword in an electronic device is disclosed. The method includes receiving at least one input indicative of the user-defined keyword, determining a sequence of subwords from the at least one input, generating the keyword model associated with the user-defined keyword based on the sequence of subwords and a subword model of the subwords, wherein the subword model is configured to model a plurality of acoustic features of the subwords based on a speech database, and providing the keyword model associated with the user-defined keyword to a voice activation unit configured with a keyword model associated with a predetermined keyword.
    Type: Grant
    Filed: August 22, 2014
    Date of Patent: April 24, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Sungrack Yun, Taesu Kim
  • Patent number: 9939896
    Abstract: Methods and systems for determining intent in voice and gesture interfaces are described. An example method includes determining that a gaze direction is in a direction of a gaze target, and determining whether a predetermined time period has elapsed while the gaze direction is in the direction of the gaze target. The method may also include providing an indication that the predetermined time period has elapsed when the predetermined time period has elapsed. According to the method, a voice or gesture command that is received after the predetermined time period has elapsed may be determined to be an input for a computing system. Additional example systems and methods are described herein.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: April 10, 2018
    Assignee: Google LLC
    Inventors: Eric Teller, Daniel Aminzade
  • Patent number: 9934777
    Abstract: User-specific language models (LMs) include internal word indices referencing a word table specific to the user-specific LM rather than a word table specific to a system-wide LM. When the system-wide LM is updated, the word table of the user-specific LM may be updated to translate the user-specific indices to system-wide indices. This avoids having to update the internal indices of the user-specific LM every time the system-wide LM is updated.
    Type: Grant
    Filed: August 26, 2016
    Date of Patent: April 3, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Shaun Nidhiri Joseph, Sonal Pareek, Ariya Rastrow, Gautam Tiwari, Alexander David Rosen
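The index-translation idea in the entry above can be sketched directly: the words themselves act as the stable key between the two tables, so only the small translation map needs refreshing when the system-wide LM is rebuilt. Names and table shapes are illustrative assumptions:

```python
def translate_indices(user_lm_indices, user_word_table, system_word_table):
    """Map a user-specific LM's internal word indices to system-wide
    indices via the words themselves, so the user LM's internals never
    need rewriting when the system LM is rebuilt.

    user_word_table: index -> word, specific to the user LM.
    system_word_table: word -> index, for the current system-wide LM.
    """
    return [system_word_table[user_word_table[i]] for i in user_lm_indices]

# After a system-wide rebuild, only system_word_table changes; the
# user LM's internal indices (here 0 and 2) stay untouched.
system_ids = translate_indices(
    [0, 2],
    {0: "play", 1: "the", 2: "beatles"},
    {"play": 17, "the": 3, "beatles": 942},
)
```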
  • Patent number: 9934792
    Abstract: A method of detecting predetermined phrases to determine compliance quality of an agent includes determining a presence of a predetermined input based on a comparison between stored predetermined phrases and a received communication, and determining a compliance rating of the agent based on a presence of a predetermined phrase associated with the predetermined input in the communication.
    Type: Grant
    Filed: January 31, 2017
    Date of Patent: April 3, 2018
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: I. Dan Melamed, Andrej Ljolje, Bernard S. Renger, David J. Smith, Yeon-Jun Kim
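A minimal sketch of the rating step described above, assuming a simple fraction-of-phrases-found rating (the patent does not specify this particular formula):

```python
def compliance_rating(required_phrases, transcript):
    """Fraction of required compliance phrases found in the agent's
    transcript -- a simple stand-in for the patent's rating step."""
    text = transcript.lower()
    hits = sum(1 for phrase in required_phrases if phrase.lower() in text)
    return hits / len(required_phrases)
```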
  • Patent number: 9922643
    Abstract: A method for adapting a phonetic dictionary to the speech peculiarities of at least one speaker, comprising generating search pronunciations for a search term, retrieving audio sections from an audio database for each search pronunciation, audibly presenting the audio sections of the speaker's speech to a person, and updating the phonetic dictionary based on the acceptability of the audio sections, as determined from the person's judgments of how intelligibly the audio sections pronounce the provided word, wherein the method is performed on at least one computerized apparatus configured to perform the method.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: March 20, 2018
    Assignee: NICE LTD.
    Inventors: Maor Nissan, Ronny Bretter
  • Patent number: 9922234
    Abstract: A biometric identification method comprising the steps of comparing a candidate print with a reference print and validating identification as a function of a number of characteristics that are common in the two prints and of a predetermined validation threshold, the method being characterized in that it comprises the steps of altering the biometric characteristics of one of the two prints prior to comparison and of taking the alteration into account during validation.
    Type: Grant
    Filed: June 16, 2016
    Date of Patent: March 20, 2018
    Assignee: MORPHO
    Inventors: Yves Bocktaels, Julien Bringer, Mael Berthier, Marcelin Ragot
  • Patent number: 9922664
    Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
    Type: Grant
    Filed: March 28, 2016
    Date of Patent: March 20, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
  • Patent number: 9916010
    Abstract: Systems and methods described herein are for transmitting a command to a remote system. A processing system determines the identity of the user based on the unique identifier and the biometric information. Thereafter, a sensor detects a gesture performed by the user. The sensor is configured to detect the gesture performed by the user when the user is located within the detectable range of the wireless antenna. The processing system determines an action associated with the detected gesture based on the identity of the user and sends a command to a remote computer system to cause it to perform the action associated with the detected gesture.
    Type: Grant
    Filed: May 18, 2015
    Date of Patent: March 13, 2018
    Assignee: VISA INTERNATIONAL SERVICE ASSOCIATION
    Inventors: Theodore Harris, Scott Edington, Patrick Faith
  • Patent number: 9905267
    Abstract: In one aspect, an example method includes (i) receiving a first group of video content items; (ii) identifying from among the first group of video content items, a second group of video content items having a threshold extent of similarity with each other; (iii) determining a quality score for each video content item of the second group; (iv) identifying from among the second group of video content items, a third group of video content items each having a quality score that exceeds a quality score threshold; and (v) based on the identifying of the third group, transmitting at least a portion of at least one video content item of the identified third group to a digital video-effect (DVE) system, wherein the system is configured for using the at least the portion of the at least one video content item of the identified third group to generate a video content item.
    Type: Grant
    Filed: July 13, 2016
    Date of Patent: February 27, 2018
    Assignee: Gracenote, Inc.
    Inventors: Dale T. Roberts, Michael Gubman
  • Patent number: 9905221
    Abstract: A system and method for automatic generation of a database for speech recognition, comprising: a source of text signals; a source of audio signals comprising an audio representation of said text signals; a text words separation module configured to separate said text into a string of text words; an audio words separation module configured to separate said audio signal into a string of audio words; and a matching module configured to receive said string of text words and said string of audio words and store each pair of matching text word and audio word in a database.
    Type: Grant
    Filed: March 9, 2014
    Date of Patent: February 27, 2018
    Inventor: Igal Nir
  • Patent number: 9864741
    Abstract: Knowledge automation techniques may include selecting a knowledge element from a knowledge corpus of an enterprise for extraction of n-grams, and deriving a term vector comprising terms in the knowledge element. Based at least on a frequency of occurrence of each term in the knowledge element, key terms are identified in the term vector. Thereafter, the identified key terms are used to extract one or more n-grams from the knowledge element. Each of the extracted n-grams is scored as a function of at least a frequency of occurrence of each of the n-grams across the knowledge corpus of the enterprise, and based on the scoring, one or more of the n-grams is added to a collective term and phrase index.
    Type: Grant
    Filed: September 23, 2015
    Date of Patent: January 9, 2018
    Assignee: PRYSM, INC.
    Inventors: Gazi Mahmud, Seenu Banda, Deanna Liang
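The pipeline in the entry above — key terms by in-document frequency, n-gram extraction around those terms, then corpus-frequency scoring — can be sketched as follows. A toy illustration under simplifying assumptions (raw counts rather than a full scoring function; names are hypothetical):

```python
from collections import Counter

def extract_and_score_ngrams(doc_tokens, corpus_docs, n=2, top_terms=3):
    """Pick key terms by in-document frequency, extract n-grams that
    contain a key term, and score each n-gram by its frequency across
    the knowledge corpus."""
    term_freq = Counter(doc_tokens)
    key_terms = {t for t, _ in term_freq.most_common(top_terms)}
    # Keep only n-grams touching at least one key term.
    ngrams = [tuple(doc_tokens[i:i + n])
              for i in range(len(doc_tokens) - n + 1)
              if key_terms & set(doc_tokens[i:i + n])]
    # Count every n-gram across the whole corpus.
    corpus_counts = Counter()
    for doc in corpus_docs:
        corpus_counts.update(tuple(doc[i:i + n])
                             for i in range(len(doc) - n + 1))
    return {g: corpus_counts[g] for g in set(ngrams)}
```

High-scoring n-grams would then be added to the collective term and phrase index.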
  • Patent number: 9837071
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.
    Type: Grant
    Filed: April 6, 2015
    Date of Patent: December 5, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 9836450
    Abstract: Systems, methods, and apparatuses are presented for a trained language model to be stored in an efficient manner such that the trained language model may be utilized in virtually any computing device to conduct natural language processing. Unlike other natural language processing engines that may be computationally intensive to the point of being capable of running only on high performance machines, the organization of the natural language models according to the present disclosures allows for natural language processing to be performed even on smaller devices, such as mobile devices.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: December 5, 2017
    Assignee: Sansa AI Inc.
    Inventors: Schuyler D. Erle, Robert J. Munro, Brendan D. Callahan, Gary C. King, Jason Brenier, James B. Robinson
  • Patent number: 9818401
    Abstract: Various embodiments contemplate systems and methods for performing automatic speech recognition (ASR) and natural language understanding (NLU) that enable high accuracy recognition and understanding of freely spoken utterances which may contain proper names and similar entities. The proper name entities may contain or be comprised wholly of words that are not present in the vocabularies of these systems as normally constituted. Recognition of the other words in the utterances in question, e.g. words that are not part of the proper name entities, may occur at regular, high recognition accuracy. Various embodiments provide as output not only accurately transcribed running text of the complete utterance, but also a symbolic representation of the meaning of the input, including appropriate symbolic representations of proper name entities, adequate to allow a computer system to respond appropriately to the spoken request without further analysis of the user's input.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: November 14, 2017
    Assignee: Promptu Systems Corporation
    Inventor: Harry William Printz
  • Patent number: 9805719
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, receiving audio data; determining that an initial portion of the audio data corresponds to an initial portion of a hotword; in response to determining that the initial portion of the audio data corresponds to the initial portion of the hotword, selecting, from among a set of one or more actions that are performed when the entire hotword is detected, a subset of the one or more actions; and causing one or more actions of the subset to be performed.
    Type: Grant
    Filed: March 23, 2017
    Date of Patent: October 31, 2017
    Assignee: Google Inc.
    Inventor: Matthew Sharifi
  • Patent number: 9799228
    Abstract: Computer-implemented systems and methods are provided for scoring content of a spoken response to a prompt. A scoring model is generated for a prompt, where generating the scoring model includes generating a transcript for each of a plurality of training responses to the prompt, dividing the plurality of training responses into clusters based on the transcripts of the training responses, selecting a subset of the training responses in each cluster for scoring, scoring the selected subset of training responses for each cluster, and generating content training vectors using the transcripts from the scored subset. A transcript is generated for a received spoken response to be scored, and a similarity metric is computed between the transcript of the spoken response to be scored and the content training vectors. A score is assigned to the spoken response based on the determined similarity metric.
    Type: Grant
    Filed: January 10, 2014
    Date of Patent: October 24, 2017
    Assignee: Educational Testing Service
    Inventors: Lei Chen, Klaus Zechner, Anastassia Loukina
  • Patent number: 9761223
    Abstract: At least one spoken utterance and a stored vehicle acoustic impulse response can be provided to a computing device. The computing device is programmed to provide at least one speech file based at least in part on the spoken utterance and the vehicle acoustic impulse response.
    Type: Grant
    Filed: October 13, 2014
    Date of Patent: September 12, 2017
    Assignee: FORD GLOBAL TECHNOLOGIES, LLC
    Inventors: Michael Alan Blommer, Scott Andrew Amman, Brigitte Frances Mora Richardson, Francois Charette, Mark Edward Porter, Gintaras Vincent Puskorius, Anthony Dwayne Cooprider
  • Patent number: 9756093
    Abstract: Systems and methods are provided for creating a sample of electronic content for transmission to others. For example, a user may select a portion of an electronic content file to be transmitted as an excerpt file to another user. In some instances, transmission may be prohibited based on various criteria.
    Type: Grant
    Filed: December 30, 2013
    Date of Patent: September 5, 2017
    Assignee: Audible, Inc.
    Inventors: Guy Story, Howard Wolfish, Bryan Field-Elliot, Glenn Rogers, Alexander Galkin, Igor Grebnev, John Federico, Steven Hatch, Deepa Muralikrishnan, Arik Meyer
  • Patent number: 9754593
    Abstract: A speech recognition capability in which speakers of spoken text are identified based on the contour of sound waves representing the spoken text. Variations in the contour of the sound waves are identified, features are assigned to those variations, and parameters of those features are grouped into predefined characteristics. The predefined characteristics are combined into voice characteristic groups. If a prior voice characteristic group is present, the voice characteristic group from the soundlet is compared to existing voice characteristic groups and, if a match is present, the sound construct is assigned to a speaker identified by the existing voice characteristic group.
    Type: Grant
    Filed: November 4, 2015
    Date of Patent: September 5, 2017
    Assignee: International Business Machines Corporation
    Inventor: Mukundan Sundararajan
  • Patent number: 9753546
    Abstract: A system and method for selective gesture interaction using spatial volumes is disclosed. The method includes processing data frames that each includes one or more body point locations of a collaborating user that is interfacing with an application at each time intervals, defining a spatial volume for each collaborating user based on the processed data frames, detecting a gesture performed by a first collaborating user based on the processed data frames, determining the gesture to be an input gesture performed by the first collaborating user in a first spatial volume, interpreting the input gesture based on a context of the first spatial volume that includes a role of the first collaborating user, a phase of the application, and an intersection volume between the first spatial volume and a second spatial volume for a second collaborating user, and providing an input command to the application based on the interpreted input gesture.
    Type: Grant
    Filed: August 29, 2014
    Date of Patent: September 5, 2017
    Assignee: General Electric Company
    Inventors: Habib Abi-Rached, Jeng-Weei Lin, Sundar Murugappan, Arnold Lund, Veeraraghavan Ramaswamy
  • Patent number: 9747895
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for building language models. One of the methods includes identifying a first group of one or more users associated with a user in a social network. The method includes identifying first linguistic information associated with the first group. The method includes generating a first language model based on the first linguistic information. The method includes identifying a second group of one or more users associated with the user. The method includes identifying second linguistic information associated with the second group. The method includes generating a second language model based on the second linguistic information. The method includes associating the first language model and the second language model with the user.
    Type: Grant
    Filed: July 8, 2013
    Date of Patent: August 29, 2017
    Assignee: Google Inc.
    Inventors: Martin Jansche, Mark Edward Epstein
  • Patent number: 9743212
    Abstract: The subject disclosure is directed towards calibrating sound pressure levels of speakers to determine desired attenuation data for use in later playback. A user may be guided to a calibration location to place a microphone, and each speaker is calibrated to output a desired sound pressure level in its current acoustic environment based upon the attenuation data learned during calibration. During playback, the attenuation data is used. Also described is testing the setup of the speakers, and dynamically adjusting the attenuation data in real time based upon tracking the listener's current location.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: August 22, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Robert Lucas Ridihalgh, Gregory Michael Shaw, Todd Matthew Williams, Tarlochan Singh Randhawa
  • Patent number: 9741342
    Abstract: A dictation device includes: an audio input device configured to receive a voice utterance including a plurality of words; a video input device configured to receive video of lip motion during the voice utterance; a memory portion; a controller configured according to instructions in the memory portion to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and a transceiver for sending the first data packets to a server end device and receiving second data packets including combined dictation based upon the audio stream and the video stream from the server end device. In the combined dictation, first dictation generated based upon the audio stream has been corrected by second dictation generated based upon the video stream.
    Type: Grant
    Filed: August 13, 2015
    Date of Patent: August 22, 2017
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventors: Yuichiro Takayanagi, Masashi Kusaka
  • Patent number: 9734826
    Abstract: Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: August 15, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke, Shuangyu Chang
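The iterative weight re-estimation described above resembles classic EM interpolation of language models; a toy sketch under that assumption (probabilities invented, context-independent weights only):

```python
def em_interpolation_weights(component_probs, iters=50):
    """EM re-estimation of linear interpolation weights.
    component_probs[t][k] = P_k(token_t | history) from component LM k on held-out data."""
    k = len(component_probs[0])
    w = [1.0 / k] * k
    for _ in range(iters):
        totals = [0.0] * k
        for probs in component_probs:
            mix = sum(wi * pi for wi, pi in zip(w, probs))
            for i in range(k):
                totals[i] += w[i] * probs[i] / mix  # posterior of component i for this token
        w = [t / len(component_probs) for t in totals]
    return w

# Toy held-out tokens on which component 0 consistently assigns higher probability,
# so its interpolation weight should grow toward 1.
probs = [(0.5, 0.1), (0.4, 0.2), (0.6, 0.1)]
weights = em_interpolation_weights(probs)
print(weights)
```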
  • Patent number: 9736613
    Abstract: Methods, apparatus, and computer programs for simulating the source of sound are provided. One method includes operations for determining a location in space of the head of a user utilizing face recognition of images of the user. Further, the method includes an operation for determining a sound for two speakers, and an operation for determining an emanating location in space for the sound, each speaker being associated with one ear of the user. The acoustic signals for each speaker are established based on the location in space of the head, the sound, the emanating location in space, and the auditory characteristics of the user. In addition, the acoustic signals are transmitted to the two speakers. When played by the two speakers, the acoustic signals make the sound appear to originate at the emanating location in space.
    Type: Grant
    Filed: May 7, 2015
    Date of Patent: August 15, 2017
    Assignee: Sony Interactive Entertainment Inc.
    Inventor: Steven Osman
  • Patent number: 9728185
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing speech using neural networks. One of the methods includes receiving an audio input; processing the audio input using an acoustic model to generate a respective phoneme score for each of a plurality of phoneme labels; processing one or more of the phoneme scores using an inverse pronunciation model to generate a respective grapheme score for each of a plurality of grapheme labels; and processing one or more of the grapheme scores using a language model to generate a respective text label score for each of a plurality of text labels.
    Type: Grant
    Filed: May 22, 2015
    Date of Patent: August 8, 2017
    Assignee: Google Inc.
    Inventors: Johan Schalkwyk, Francoise Beaufays, Hasim Sak, John Giannandrea
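The three-stage scoring pipeline above (acoustic model, then inverse pronunciation model, then language model) can be illustrated by chaining toy score tables; the vocabularies and probabilities below are invented for illustration and bear no relation to the neural models the patent actually uses:

```python
# Toy scores standing in for the three model stages.
phoneme_scores = {"k": 0.7, "s": 0.3}                      # acoustic model output
grapheme_given_phoneme = {"k": {"c": 0.6, "k": 0.4},       # inverse pronunciation model
                          "s": {"s": 0.8, "c": 0.2}}
text_given_grapheme = {"c": {"cat": 0.9, "kit": 0.1},      # language model
                       "k": {"kit": 1.0},
                       "s": {"sat": 1.0}}

# Stage 2: marginalize phoneme scores into grapheme scores.
grapheme_scores = {}
for ph, ps in phoneme_scores.items():
    for g, pg in grapheme_given_phoneme[ph].items():
        grapheme_scores[g] = grapheme_scores.get(g, 0.0) + ps * pg

# Stage 3: marginalize grapheme scores into text label scores.
text_scores = {}
for g, gs in grapheme_scores.items():
    for text, pt in text_given_grapheme[g].items():
        text_scores[text] = text_scores.get(text, 0.0) + gs * pt

print(max(text_scores, key=text_scores.get))
```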
  • Patent number: 9728183
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations.
    Type: Grant
    Filed: November 10, 2015
    Date of Patent: August 8, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Sumit Chopra, Dimitrios Dimitriadis, Patrick Haffner
  • Patent number: 9728188
    Abstract: Systems and methods for detecting similar audio being received by separate voice activated electronic devices, and ignoring the resulting commands, are described herein. In some embodiments, a voice activated electronic device may be activated by a wakeword output by another electronic device, such as a television or radio, may capture audio of the sound following the wakeword, and may send audio data representing that sound to a backend system. Upon receipt, the backend system may, in parallel with performing automated speech recognition processing on the audio data, generate a sound profile of the audio data, and may compare that sound profile to sound profiles of recently received audio data and/or flagged sound profiles. If the generated sound profile is determined to match another sound profile, the automated speech recognition processing may be stopped, and the voice activated electronic device may be instructed to return to a keyword spotting mode.
    Type: Grant
    Filed: June 28, 2016
    Date of Patent: August 8, 2017
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Alexander David Rosen, Michael James Rodehorst, George Jay Tucker, Aaron Lee Mathers Challenner
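A minimal sketch of the sound-profile comparison above, assuming, purely for illustration, that a profile is a per-band energy bitmask and that matching tolerates one differing bit (the real profiles and thresholds are not specified in the abstract):

```python
def sound_profile(samples, bands=8):
    """Coarse energy fingerprint: one bit per band, set when band energy is above the mean."""
    n = len(samples) // bands
    energies = [sum(s * s for s in samples[i * n:(i + 1) * n]) for i in range(bands)]
    mean = sum(energies) / bands
    return tuple(int(e > mean) for e in energies)

def profiles_match(a, b, max_mismatch=1):
    """Two profiles match when their bitmasks differ in at most max_mismatch positions."""
    return sum(x != y for x, y in zip(a, b)) <= max_mismatch

# Two devices hearing the same TV audio (one with slight noise) yield near-identical
# profiles, so the duplicate request could be dropped before full speech recognition runs.
tv_audio = [((i % 5) - 2) * (1 if i < 400 else 3) for i in range(800)]
device_b = [s + (1 if i % 7 == 0 else 0) for i, s in enumerate(tv_audio)]
print(profiles_match(sound_profile(tv_audio), sound_profile(device_b)))
```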
  • Patent number: 9715874
    Abstract: Techniques are described for updating an automatic speech recognition (ASR) system that, prior to the update, is configured to perform ASR using a first finite-state transducer (FST) comprising a first set of paths representing recognizable speech sequences. A second FST may be accessed, comprising a second set of paths representing speech sequences to be recognized by the updated ASR system. By analyzing the second FST together with the first FST, a patch may be extracted and provided to the ASR system as an update, capable of being applied non-destructively to the first FST at the ASR system to cause the ASR system using the first FST with the patch to recognize speech using the second set of paths from the second FST. In some embodiments, the patch may be configured such that destructively applying the patch to the first FST creates a modified FST that is globally minimized.
    Type: Grant
    Filed: October 30, 2015
    Date of Patent: July 25, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Stephan Kanthak, Jan Vlietinck, Johan Vantieghem, Stijn Verschaeren
  • Patent number: 9711139
    Abstract: A method for building a language model, a speech recognition method and an electronic apparatus are provided. The speech recognition method includes the following steps. Phonetic transcriptions of a speech signal are obtained from an acoustic model. Phonetic spellings matching the phonetic transcriptions are obtained according to the phonetic transcriptions and a syllable acoustic lexicon. According to the phonetic spellings, a plurality of text sequences and a plurality of text sequence probabilities are obtained from a language model. Each phonetic spelling is matched to a candidate sentence table; a word probability of each phonetic spelling matching a word in a sentence of the sentence table is obtained; and the word probabilities of the phonetic spellings are combined so as to obtain the text sequence probabilities. The text sequence corresponding to the largest of the text sequence probabilities is selected as the recognition result of the speech signal.
    Type: Grant
    Filed: July 5, 2016
    Date of Patent: July 18, 2017
    Assignee: VIA Technologies, Inc.
    Inventor: Guo-Feng Zhang
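The final selection step above, picking the text sequence whose combined per-word match probabilities are largest, can be sketched as follows (candidate sentences and probabilities are hypothetical):

```python
import math

# Hypothetical per-word probabilities of each phonetic spelling matching
# a word in two candidate sentences.
candidates = {
    "today is sunny": [0.9, 0.8, 0.7],
    "two day is sun knee": [0.6, 0.5, 0.5, 0.4, 0.4],
}

def sequence_log_prob(word_probs):
    """Combine word probabilities in log space to avoid underflow on long sentences."""
    return sum(math.log(p) for p in word_probs)

# The recognition result is the candidate with the largest sequence probability.
best = max(candidates, key=lambda s: sequence_log_prob(candidates[s]))
print(best)
```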
  • Patent number: 9706280
    Abstract: At least one exemplary embodiment is directed to an acoustic device including an ear canal microphone configured to detect a first acoustic signal, an ambient microphone configured to detect a second acoustic signal, and a processor operatively coupled to the ear canal microphone and the ambient microphone. The processor is configured to detect a predetermined speech pattern based on an analysis of the first acoustic signal and the second acoustic signal and upon detection of the predetermined speech pattern, subsequently process acoustic signals from the ear canal microphone for a predetermined time or until the ear canal microphone detects a lack of an acoustic signal for a second predetermined time. After the predetermined time or after a second predetermined time, the processor processes acoustic signals from the ear canal microphone and the ambient microphone.
    Type: Grant
    Filed: July 15, 2015
    Date of Patent: July 11, 2017
    Assignee: Personics Holdings, LLC
    Inventors: John Usher, Steve Goldstein, Marc Boillot
  • Patent number: 9697822
    Abstract: A method for updating an adaptive speech recognition model is provided. In some implementations, the method is performed at a communications device including one or more processors and memory storing instructions for execution by the one or more processors. The method includes determining that a first user of a first mobile communication device is engaged in a call over a communications network and providing an adaptive speech recognition model. The method also includes analyzing an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device and updating the adaptive speech recognition model with training data derived from the call audio signal.
    Type: Grant
    Filed: April 28, 2014
    Date of Patent: July 4, 2017
    Assignee: Apple Inc.
    Inventors: Devang K. Naik, Onur E. Tackin
  • Patent number: 9692894
    Abstract: A method of generating a customer satisfaction score based on behavioral assessment data across one or more recorded communications, which includes analyzing one or more communications between a customer and an agent by applying a linguistic-based psychological behavioral model to each communication to determine a personality type of the customer, selecting at least one filter criterion which comprises a customer, an agent, a team, or a call type, calculating a customer satisfaction score using the at least one selected filter criterion across a selected time interval and based on one or more communications, and displaying a report including the calculated customer satisfaction score to a user that matches the at least one selected filter criterion for the selected time interval. Systems and non-transitory computer readable media configured to generate a customer satisfaction score based on behavioral assessment data are also included.
    Type: Grant
    Filed: August 5, 2016
    Date of Patent: June 27, 2017
    Assignee: Mattersight Corporation
    Inventors: Kelly Conway, David Gustafson, Douglas Brown, Christopher Danson
  • Patent number: 9685159
    Abstract: A method for speaker recognition comprising: obtaining speaker information for a target speaker; obtaining speech samples from telephone calls from an unknown speaker; classifying the speech samples according to the unknown speaker, thereby providing speaker-dependent classes of speech samples; extracting speaker information from each of the speaker-dependent classes of speech samples; combining the extracted speaker information; comparing the combined extracted speaker information with the stored speaker information for the target speaker to obtain a comparison result; and determining whether the unknown speaker is identical to the target speaker based on the comparison result.
    Type: Grant
    Filed: August 3, 2015
    Date of Patent: June 20, 2017
    Assignee: Agnitio SL
    Inventors: Marta Garcia Gomar, Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez
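A toy sketch of the combine-and-compare steps above, assuming, as an illustration only, that the extracted speaker information is a fixed-length vector per class, combined by averaging and scored against the target by cosine similarity with an invented threshold:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def combine(vectors):
    """Average the per-class speaker vectors into one combined representation."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Hypothetical stored target-speaker vector and per-class vectors from the unknown speaker.
target = [0.9, 0.1, 0.2]
class_vectors = [[0.8, 0.2, 0.1], [0.85, 0.05, 0.3]]

score = cosine(combine(class_vectors), target)
decision = score > 0.95  # assumed verification threshold, not from the patent
print(decision)
```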
  • Patent number: 9674355
    Abstract: A system and method for processing call data is provided. A call between a user and an agent is monitored. A selected script having a dialog grammar is received. The selected script is executed by converting at least a portion of the script into synthesized speech utterances and providing the synthesized speech utterances to the user. Speech utterances are received from the user in reply to each of the synthesized speech utterances from the script. Each received speech utterance is converted to text as a user message and a form is populated with the user messages. The user speech utterances and the form with the user messages are provided to the agent.
    Type: Grant
    Filed: March 10, 2016
    Date of Patent: June 6, 2017
    Assignee: Intellisist, Inc.
    Inventors: Gilad Odinak, Alastair Sutherland, William A. Tolhurst
  • Patent number: 9671871
    Abstract: An apparatus and a method for recognizing a gesture use infrared light. The apparatus for recognizing a gesture includes a sensing unit, which detects a gesture using an infrared sensor to obtain a sensing value from the sensing result; a control unit, which performs gesture recognition reflecting the user's intention in accordance with a predetermined recognizing mode based on the obtained sensing value; and a storing unit, which stores the predetermined recognizing mode when the gesture recognition set in advance by the user is performed. The predetermined recognizing mode includes a first recognizing mode, in which the gesture is recognized directly, and a second recognizing mode, in which the gesture is recognized after a hold motion that marks the start of gesture recognition.
    Type: Grant
    Filed: February 12, 2015
    Date of Patent: June 6, 2017
    Assignee: HYUNDAI MOBIS CO., LTD
    Inventor: Chan Hee Park
  • Patent number: 9666182
    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on the utterances lacking a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances lacking a manual transcription are intelligently selected and manually transcribed. Both the automatically transcribed utterances and those having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model for call classification is trained from the mined audio data.
    Type: Grant
    Filed: October 5, 2015
    Date of Patent: May 30, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
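The "intelligently selected" step above is commonly realized as confidence-based active learning; a sketch under that assumption (utterances, hypotheses, and confidence scores all invented):

```python
# Hypothetical ASR hypotheses with confidence scores for untranscribed utterances.
utterances = [
    {"id": "u1", "hyp": "cancel my order", "confidence": 0.92},
    {"id": "u2", "hyp": "a cant the or", "confidence": 0.31},
    {"id": "u3", "hyp": "check my balance", "confidence": 0.88},
    {"id": "u4", "hyp": "uh talk to agent", "confidence": 0.55},
]

def select_for_manual_transcription(utts, budget):
    """Pick the `budget` least-confident utterances: the ones whose manual
    transcription is most likely to improve the model."""
    return [u["id"] for u in sorted(utts, key=lambda u: u["confidence"])[:budget]]

print(select_for_manual_transcription(utterances, 2))
```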
  • Patent number: 9666179
    Abstract: A waveform memory stores a plurality of speech unit waveforms corresponding to respective speech units. The address order of the speech unit waveforms is determined by the sort order of the speech units included in a speech unit sequence corresponding to a phoneme sequence of training data, and the speech units included in the speech unit sequence are selected so as to synthesize speech for that phoneme sequence.
    Type: Grant
    Filed: February 26, 2014
    Date of Patent: May 30, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Takehiko Kagoshima
  • Patent number: 9659092
    Abstract: A music information searching method includes extracting modulation spectrums from audio data, generating modulation-spectrum peak-point audio fingerprints using position information of preset peak points in the extracted modulation spectrums, converting the generated fingerprints via hash functions into hash keys, which indicate addresses of hash tables, and hash values, which are stored in the hash tables, and searching for music information by extracting hash keys from audio query clips and comparing the extracted hash keys with the addresses of the hash tables.
    Type: Grant
    Filed: November 13, 2013
    Date of Patent: May 23, 2017
    Assignees: SAMSUNG ELECTRONICS CO., LTD., KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
    Inventors: Ki-wan Eom, Hyoung-Gook Kim, Kwang-ki Kim
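The hash-key lookup stage described above can be sketched with an in-memory table; the bit packing, peak values, and track names below are all hypothetical:

```python
def peak_hash(freq_bin, time_delta):
    """Pack a peak's frequency bin and time offset into one integer hash key."""
    return (freq_bin << 8) | (time_delta & 0xFF)

# Hypothetical fingerprint database: hash key -> list of (track, offset) hash values.
db = {}

def index_track(track, peaks):
    for f, dt, offset in peaks:
        db.setdefault(peak_hash(f, dt), []).append((track, offset))

index_track("song_a", [(10, 3, 0), (22, 5, 1), (31, 2, 2)])
index_track("song_b", [(11, 7, 0), (25, 4, 1)])

def match(query_peaks):
    """Hash each query peak and vote for the track with the most key collisions."""
    votes = {}
    for f, dt in query_peaks:
        for track, _ in db.get(peak_hash(f, dt), []):
            votes[track] = votes.get(track, 0) + 1
    return max(votes, key=votes.get) if votes else None

print(match([(10, 3), (31, 2)]))
```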