Creating Patterns For Matching Patents (Class 704/243)
  • Patent number: 10037758
    Abstract: A voice recognizer (3) generates multiple voice recognition results from a single input speech (2). For each voice recognition result, an intent understanding processor (7) estimates an intent, outputting one or more intent-understanding candidates together with their scores. A weight calculator (11) calculates standby weights using setting information (9) of a control target apparatus. An intent understanding corrector (12) corrects the candidates' scores using the standby weights to obtain final scores, and then selects one candidate as the intent understanding result (13) on the basis of the final scores.
    Type: Grant
    Filed: March 31, 2014
    Date of Patent: July 31, 2018
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Yi Jing, Yoichi Fujii, Jun Ishii
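The score-correction step described in the abstract above can be sketched as follows. This is a minimal illustration, assuming a simple multiplicative weighting; the function and intent names are hypothetical, not the patent's exact formulation:

```python
def correct_intent_scores(candidates, standby_weights):
    """Re-score intent candidates using standby weights, then pick the best.

    candidates: list of (intent, score) pairs from intent understanding.
    standby_weights: dict mapping intent -> weight derived from the
    target apparatus's current setting information.
    """
    final = [(intent, score * standby_weights.get(intent, 1.0))
             for intent, score in candidates]
    # Select the candidate with the highest final score.
    return max(final, key=lambda pair: pair[1])

# A lower-scored candidate can win once the apparatus state is factored in.
best = correct_intent_scores(
    [("play_music", 0.6), ("set_temperature", 0.5)],
    {"set_temperature": 1.5},  # apparatus is in climate-control mode
)
```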
  • Patent number: 10032451
    Abstract: Systems, methods, and devices for recognizing a user are disclosed. A speech-controlled device captures a spoken utterance and sends corresponding audio data to a server. The server determines content sources storing or having access to content responsive to the spoken utterance. The server also determines multiple users associated with a profile of the speech-controlled device. Using the audio data, the server may determine user recognition data with respect to each user indicated in the speech-controlled device's profile. The server may also receive user recognition confidence threshold data from each of the content sources. The server may determine which user recognition data satisfies (i.e., meets or exceeds) the most stringent (i.e., highest) of the user recognition confidence thresholds. Thereafter, the server may send data indicating the user associated with that user recognition data to all of the content sources.
    Type: Grant
    Filed: December 20, 2016
    Date of Patent: July 24, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Natalia Vladimirovna Mamkina, Naomi Bancroft, Nishant Kumar, Shamitha Somashekar
  • Patent number: 10019985
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, by a first sequence-training speech model, a first batch of training frames that represent speech features of first training utterances; obtaining, by the first sequence-training speech model, one or more first neural network parameters; determining, by the first sequence-training speech model, one or more optimized first neural network parameters based on (i) the first batch of training frames and (ii) the one or more first neural network parameters; obtaining, by a second sequence-training speech model, a second batch of training frames that represent speech features of second training utterances; obtaining one or more second neural network parameters; and determining, by the second sequence-training speech model, one or more optimized second neural network parameters based on (i) the second batch of training frames and (ii) the one or more second neural network parameters.
    Type: Grant
    Filed: April 22, 2014
    Date of Patent: July 10, 2018
    Assignee: Google LLC
    Inventors: Georg Heigold, Erik McDermott, Vincent O. Vanhoucke, Andrew W. Senior, Michiel A. U. Bacchiani
  • Patent number: 10019986
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for speech recognition. One of the methods includes receiving first audio data corresponding to an utterance; obtaining a first transcription of the first audio data; receiving data indicating (i) a selection of one or more terms of the first transcription and (ii) one or more of replacement terms; determining that one or more of the replacement terms are classified as a correction of one or more of the selected terms; in response to determining that the one or more of the replacement terms are classified as a correction of the one or more of the selected terms, obtaining a first portion of the first audio data that corresponds to one or more terms of the first transcription; and using the first portion of the first audio data that is associated with the one or more terms of the first transcription to train an acoustic model for recognizing the one or more of the replacement terms.
    Type: Grant
    Filed: July 29, 2016
    Date of Patent: July 10, 2018
    Assignee: Google LLC
    Inventors: Olga Kapralova, Evgeny A. Cherepanov, Dmitry Osmakov, Martin Baeuml, Gleb Skobeltsyn
  • Patent number: 10010288
    Abstract: Detection of neurological diseases such as Parkinson's disease can be accomplished through analyzing a subject's speech for acoustic measures based on human factor cepstral coefficients (HFCC). Upon receiving a speech sample from a subject, a signal analysis can be performed that includes identifying articulation range and articulation rate using HFCC and delta coefficients. A likelihood of Parkinson's disease, for example, can be determined based upon the identified articulation range and articulation rate of the speech.
    Type: Grant
    Filed: January 17, 2017
    Date of Patent: July 3, 2018
    Assignees: Board of Trustees of Michigan State University, University of Florida Research Foundation, Inc.
    Inventors: John Clyde Rosenbek, Mark D. Skowronski, Rahul Shrivastav, Supraja Anand
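The delta coefficients mentioned in the abstract above are conventionally computed as a regression over neighboring cepstral frames. A minimal one-dimensional sketch of that standard formula (the HFCC filterbank itself is omitted, and the function name is illustrative):

```python
def delta_coefficients(frames, window=2):
    """Compute delta (first-difference) coefficients over a cepstral
    frame sequence using the standard regression formula.

    frames: list of per-frame coefficient values (one dimension shown).
    window: number of neighboring frames on each side of the regression.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, window + 1))
    deltas = []
    for t in range(n):
        num = 0.0
        for k in range(1, window + 1):
            # Clamp indices at the sequence edges.
            right = frames[min(t + k, n - 1)]
            left = frames[max(t - k, 0)]
            num += k * (right - left)
        deltas.append(num / denom)
    return deltas
```

Articulation rate and range could then be derived from statistics (e.g., spread and magnitude) of these deltas over an utterance.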
  • Patent number: 9996524
    Abstract: A first set of characters may be received in response to a user input for text prediction. An estimate may be generated indicating what second set of characters will be inputted. The generating an estimate may be based on at least receiving data from a second user device. At least some of the data may not be located within the second user device's text dictionary. At least some of the data may be provided to the first user device.
    Type: Grant
    Filed: November 14, 2017
    Date of Patent: June 12, 2018
    Assignee: International Business Machines Corporation
    Inventors: Inseok Hwang, Su Liu, Eric J. Rozner, Chin Ngai Sze
  • Patent number: 9997160
    Abstract: Systems and methods for dynamic download of embedded voice components are disclosed. One embodiment may be configured to receive an application via a wireless communication, where the application comprises a speech recognition component, receive a voice command from a user, and analyze the speech recognition component to determine a translation action to perform based on the voice command. In some embodiments, in response to determining that the translation action includes downloading a vocabulary from a first remote computing device, the vocabulary may be downloaded from the first remote computing device and utilized to translate the voice command. In some embodiments, in response to determining that the translation action includes communicating the voice command to a second remote computing device, the voice command may be sent to the second remote computing device and a translated version of the voice command may be received.
    Type: Grant
    Filed: July 1, 2013
    Date of Patent: June 12, 2018
    Assignee: TOYOTA MOTOR ENGINEERING & MANUFACTURING NORTH AMERICA, INC.
    Inventor: Eric Randell Schmidt
  • Patent number: 9984682
    Abstract: Automatic assessment of oral recitations during computer-based language assessments is provided using a trained neural network, automating the scoring and feedback processes without human transcription or scoring input. An automatic speech recognition ("ASR") scoring system trains multiple scoring reference vectors associated with the possible scores of an assessment and receives an acoustic language assessment response to an assessment item. A transcription is automatically generated from the acoustic response, and an individual word vector is generated from the transcription. An input vector is formed by concatenating the individual word vector with a transcription feature vector and supplied as input to a neural network. An output vector is generated based on the weights of the neural network, and a score is generated by comparing the output vector with the scoring reference vectors.
    Type: Grant
    Filed: March 30, 2017
    Date of Patent: May 29, 2018
    Assignee: Educational Testing Service
    Inventors: Jidong Tao, Lei Chen, Chong Min Lee
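The comparison step in the entry above — matching an output vector against per-score reference vectors — might look like the following sketch. The neural-network forward pass is stubbed out (the concatenated input stands in for the network's output vector), and all names are illustrative assumptions:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def score_response(word_vector, feature_vector, scoring_vectors):
    """Concatenate the word vector with transcription features and
    assign the score whose reference vector is most similar.

    scoring_vectors: dict mapping a score -> trained reference vector.
    """
    output = word_vector + feature_vector  # stand-in for the NN output
    return max(scoring_vectors,
               key=lambda s: cosine(scoring_vectors[s], output))
```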
  • Patent number: 9972340
    Abstract: In a computer system for navigating to a location in recorded content, a computer receives a descriptive term or phrase associated with a searchable tag. The searchable tag corresponds to a point-in-time at which a non-speech sound occurred during the recording of recorded content of a communication between a plurality of participants. The recorded content includes speech from one or more of the plurality of participants, the descriptive term includes an automatically generated phonetic translation of the non-speech sound, and the non-speech sound was transmitted to the plurality of participants during the recording. The computer navigates to a location in the recorded content corresponding to the point-in-time at which the non-speech sound occurred.
    Type: Grant
    Filed: July 27, 2016
    Date of Patent: May 15, 2018
    Assignee: International Business Machines Corporation
    Inventors: Denise A. Bell, Lisa Seacat Deluca, Jana H. Jenkins, Jeffrey A. Kusnitz
  • Patent number: 9965028
    Abstract: Provided are a method for proximity sensing in an interactive display and a method of processing a proximity-sensing image. The proximity-sensing method may include suspending, in a three-dimensional (3D) space, at least one hand of a user to manipulate at least one of the user's viewpoint and the position of a virtual object displayed on a screen, and changing the virtual object by manipulating at least one of the position of the virtual object and the viewpoint of the user based on the suspending.
    Type: Grant
    Filed: April 6, 2011
    Date of Patent: May 8, 2018
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Byung In Yoo, Du Sik Park, Chang Kyu Choi
  • Patent number: 9959296
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing suggestions within a document. In one aspect, a method includes obtaining textual input provided to a document editing application by a user device, the textual input being provided to the document editing application for inclusion in a document; identifying performance measures associated with the current editing session for the document, each performance measure being based on session data obtained from the user device during a document editing session, the session data being for the textual input and prior text that was included in the document prior to the textual input; providing the performance measures as input to a suggestion model that was trained using historical performance measures identified in performance logs for historical document editing sessions of users; and throttling textual suggestions during the current editing session based on the output of the suggestion model.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: May 1, 2018
    Assignee: Google LLC
    Inventors: Maxim Gubin, Kenneth W. Dauber, Sangsoo Sung, Krishna Bharat
  • Patent number: 9953632
    Abstract: According to an aspect of the present disclosure, a method for generating a keyword model of a user-defined keyword in an electronic device is disclosed. The method includes receiving at least one input indicative of the user-defined keyword, determining a sequence of subwords from the at least one input, generating the keyword model associated with the user-defined keyword based on the sequence of subwords and a subword model of the subwords, wherein the subword model is configured to model a plurality of acoustic features of the subwords based on a speech database, and providing the keyword model associated with the user-defined keyword to a voice activation unit configured with a keyword model associated with a predetermined keyword.
    Type: Grant
    Filed: August 22, 2014
    Date of Patent: April 24, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Sungrack Yun, Taesu Kim
  • Patent number: 9939896
    Abstract: Methods and systems for determining intent in voice and gesture interfaces are described. An example method includes determining that a gaze direction is in a direction of a gaze target, and determining whether a predetermined time period has elapsed while the gaze direction is in the direction of the gaze target. The method may also include providing an indication that the predetermined time period has elapsed when the predetermined time period has elapsed. According to the method, a voice or gesture command that is received after the predetermined time period has elapsed may be determined to be an input for a computing system. Additional example systems and methods are described herein.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: April 10, 2018
    Assignee: Google LLC
    Inventors: Eric Teller, Daniel Aminzade
  • Patent number: 9934777
    Abstract: User-specific language models (LMs) include internal word indices referencing a word table specific to the user-specific LM rather than a word table specific to a system-wide LM. When the system-wide LM is updated, the word table of the user-specific LM may be updated to translate the user-specific indices to system-wide indices. This avoids having to update the internal indices of the user-specific LM every time the system-wide LM is updated.
    Type: Grant
    Filed: August 26, 2016
    Date of Patent: April 3, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Shaun Nidhiri Joseph, Sonal Pareek, Ariya Rastrow, Gautam Tiwari, Alexander David Rosen
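The index-translation idea in the entry above can be sketched directly: the words themselves act as the stable key between the two tables, so only the small translation map needs refreshing when the system-wide LM is rebuilt. Names and table shapes are illustrative assumptions:

```python
def translate_indices(user_lm_indices, user_word_table, system_word_table):
    """Map a user-specific LM's internal word indices to system-wide
    indices via the words themselves, so the user LM's internals never
    need rewriting when the system LM is rebuilt.

    user_word_table: index -> word, specific to the user LM.
    system_word_table: word -> index, for the current system-wide LM.
    """
    return [system_word_table[user_word_table[i]] for i in user_lm_indices]

# After a system-wide rebuild, only system_word_table changes; the
# user LM's internal indices (here 0 and 2) stay untouched.
system_ids = translate_indices(
    [0, 2],
    {0: "play", 1: "the", 2: "beatles"},
    {"play": 17, "the": 3, "beatles": 942},
)
```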
  • Patent number: 9934792
    Abstract: A method of detecting predetermined phrases to determine compliance quality of an agent includes determining a presence of a predetermined input based on a comparison between stored predetermined phrases and a received communication, and determining a compliance rating of the agent based on a presence of a predetermined phrase associated with the predetermined input in the communication.
    Type: Grant
    Filed: January 31, 2017
    Date of Patent: April 3, 2018
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: I. Dan Melamed, Andrej Ljolje, Bernard S. Renger, David J. Smith, Yeon-Jun Kim
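A minimal sketch of the rating step described above, assuming a simple fraction-of-phrases-found rating (the patent does not specify this particular formula):

```python
def compliance_rating(required_phrases, transcript):
    """Fraction of required compliance phrases found in the agent's
    transcript -- a simple stand-in for the patent's rating step."""
    text = transcript.lower()
    hits = sum(1 for phrase in required_phrases if phrase.lower() in text)
    return hits / len(required_phrases)
```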
  • Patent number: 9922643
    Abstract: A method for adapting a phonetic dictionary to the speech peculiarities of at least one speaker, comprising generating search pronunciations for a search term, retrieving audio sections from an audio database for each search pronunciation, audibly presenting the audio sections of the speaker's speech to a person, and updating the phonetic dictionary based on the acceptability of the audio sections, as determined from the person's judgments of how intelligibly the audio sections pronounce the provided word, wherein the method is performed on at least one computerized apparatus configured to perform the method.
    Type: Grant
    Filed: December 23, 2014
    Date of Patent: March 20, 2018
    Assignee: NICE LTD.
    Inventors: Maor Nissan, Ronny Bretter
  • Patent number: 9922234
    Abstract: A biometric identification method comprising the steps of comparing a candidate print with a reference print and validating identification as a function of a number of characteristics that are common in the two prints and of a predetermined validation threshold, the method being characterized in that it comprises the steps of altering the biometric characteristics of one of the two prints prior to comparison and of taking the alteration into account during validation.
    Type: Grant
    Filed: June 16, 2016
    Date of Patent: March 20, 2018
    Assignee: MORPHO
    Inventors: Yves Bocktaels, Julien Bringer, Mael Berthier, Marcelin Ragot
  • Patent number: 9922664
    Abstract: A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit-rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out of domain speech samples based on the one or more target acoustic characteristics.
    Type: Grant
    Filed: March 28, 2016
    Date of Patent: March 20, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Dushyant Sharma, Patrick Naylor, Uwe Helmut Jost
  • Patent number: 9916010
    Abstract: Systems and methods described herein are for transmitting a command to a remote system. A processing system determines the identity of the user based on the unique identifier and the biometric information. Thereafter, a sensor detects a gesture performed by the user. The sensor is configured to detect the gesture performed by the user when the user is located within the detectable range of the wireless antenna. The processing system determines an action associated with the detected gesture based on the identity of the user and sends a command to a remote computer system to cause it to perform the action associated with the detected gesture.
    Type: Grant
    Filed: May 18, 2015
    Date of Patent: March 13, 2018
    Assignee: VISA INTERNATIONAL SERVICE ASSOCIATION
    Inventors: Theodore Harris, Scott Edington, Patrick Faith
  • Patent number: 9905267
    Abstract: In one aspect, an example method includes (i) receiving a first group of video content items; (ii) identifying from among the first group of video content items, a second group of video content items having a threshold extent of similarity with each other; (iii) determining a quality score for each video content item of the second group; (iv) identifying from among the second group of video content items, a third group of video content items each having a quality score that exceeds a quality score threshold; and (v) based on the identifying of the third group, transmitting at least a portion of at least one video content item of the identified third group to a digital video-effect (DVE) system, wherein the system is configured for using the at least the portion of the at least one video content item of the identified third group to generate a video content item.
    Type: Grant
    Filed: July 13, 2016
    Date of Patent: February 27, 2018
    Assignee: Gracenote, Inc.
    Inventors: Dale T. Roberts, Michael Gubman
  • Patent number: 9905221
    Abstract: A system and method for automatic generation of a database for speech recognition, comprising: a source of text signals; a source of audio signals comprising an audio representation of said text signals; a text words separation module configured to separate said text into a string of text words; an audio words separation module configured to separate said audio signal into a string of audio words; and a matching module configured to receive said string of text words and said string of audio words and store each pair of matching text word and audio word in a database.
    Type: Grant
    Filed: March 9, 2014
    Date of Patent: February 27, 2018
    Inventor: Igal Nir
  • Patent number: 9864741
    Abstract: Knowledge automation techniques may include selecting a knowledge element from a knowledge corpus of an enterprise for extraction of n-grams, and deriving a term vector comprising terms in the knowledge element. Based at least on a frequency of occurrence of each term in the knowledge element, key terms are identified in the term vector. Thereafter, the identified key terms are used to extract one or more n-grams from the knowledge element. Each of the extracted n-grams is scored as a function of at least a frequency of occurrence of each of the n-grams across the knowledge corpus of the enterprise, and based on the scoring, one or more of the n-grams is added to a collective term and phrase index.
    Type: Grant
    Filed: September 23, 2015
    Date of Patent: January 9, 2018
    Assignee: PRYSM, INC.
    Inventors: Gazi Mahmud, Seenu Banda, Deanna Liang
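The pipeline in the entry above — key terms by in-document frequency, n-gram extraction around those terms, then corpus-frequency scoring — can be sketched as follows. A toy illustration under simplifying assumptions (raw counts rather than a full scoring function; names are hypothetical):

```python
from collections import Counter

def extract_and_score_ngrams(doc_tokens, corpus_docs, n=2, top_terms=3):
    """Pick key terms by in-document frequency, extract n-grams that
    contain a key term, and score each n-gram by its frequency across
    the knowledge corpus."""
    term_freq = Counter(doc_tokens)
    key_terms = {t for t, _ in term_freq.most_common(top_terms)}
    # Keep only n-grams touching at least one key term.
    ngrams = [tuple(doc_tokens[i:i + n])
              for i in range(len(doc_tokens) - n + 1)
              if key_terms & set(doc_tokens[i:i + n])]
    # Count every n-gram across the whole corpus.
    corpus_counts = Counter()
    for doc in corpus_docs:
        corpus_counts.update(tuple(doc[i:i + n])
                             for i in range(len(doc) - n + 1))
    return {g: corpus_counts[g] for g in set(ngrams)}
```

High-scoring n-grams would then be added to the collective term and phrase index.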
  • Patent number: 9837071
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.
    Type: Grant
    Filed: April 6, 2015
    Date of Patent: December 5, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 9836450
    Abstract: Systems, methods, and apparatuses are presented for a trained language model to be stored in an efficient manner such that the trained language model may be utilized in virtually any computing device to conduct natural language processing. Unlike other natural language processing engines that may be computationally intensive to the point of being capable of running only on high performance machines, the organization of the natural language models according to the present disclosures allows for natural language processing to be performed even on smaller devices, such as mobile devices.
    Type: Grant
    Filed: December 9, 2015
    Date of Patent: December 5, 2017
    Assignee: Sansa AI Inc.
    Inventors: Schuyler D. Erle, Robert J. Munro, Brendan D. Callahan, Gary C. King, Jason Brenier, James B. Robinson
  • Patent number: 9818401
    Abstract: Various embodiments contemplate systems and methods for performing automatic speech recognition (ASR) and natural language understanding (NLU) that enable high accuracy recognition and understanding of freely spoken utterances which may contain proper names and similar entities. The proper name entities may contain or be comprised wholly of words that are not present in the vocabularies of these systems as normally constituted. Recognition of the other words in the utterances in question, e.g. words that are not part of the proper name entities, may occur at regular, high recognition accuracy. Various embodiments provide as output not only accurately transcribed running text of the complete utterance, but also a symbolic representation of the meaning of the input, including appropriate symbolic representations of proper name entities, adequate to allow a computer system to respond appropriately to the spoken request without further analysis of the user's input.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: November 14, 2017
    Assignee: Promptu Systems Corporation
    Inventor: Harry William Printz
  • Patent number: 9805719
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, receiving audio data; determining that an initial portion of the audio data corresponds to an initial portion of a hotword; in response to determining that the initial portion of the audio data corresponds to the initial portion of the hotword, selecting, from among a set of one or more actions that are performed when the entire hotword is detected, a subset of the one or more actions; and causing one or more actions of the subset to be performed.
    Type: Grant
    Filed: March 23, 2017
    Date of Patent: October 31, 2017
    Assignee: Google Inc.
    Inventor: Matthew Sharifi
  • Patent number: 9799228
    Abstract: Computer-implemented systems and methods are provided for scoring content of a spoken response to a prompt. A scoring model is generated for a prompt, where generating the scoring model includes generating a transcript for each of a plurality of training responses to the prompt, dividing the plurality of training responses into clusters based on the transcripts of the training responses, selecting a subset of the training responses in each cluster for scoring, scoring the selected subset of training responses for each cluster, and generating content training vectors using the transcripts from the scored subset. A transcript is generated for a received spoken response to be scored, and a similarity metric is computed between the transcript of the spoken response to be scored and the content training vectors. A score is assigned to the spoken response based on the determined similarity metric.
    Type: Grant
    Filed: January 10, 2014
    Date of Patent: October 24, 2017
    Assignee: Educational Testing Service
    Inventors: Lei Chen, Klaus Zechner, Anastassia Loukina
  • Patent number: 9761223
    Abstract: At least one spoken utterance and a stored vehicle acoustic impulse response can be provided to a computing device. The computing device is programmed to provide at least one speech file based at least in part on the spoken utterance and the vehicle acoustic impulse response.
    Type: Grant
    Filed: October 13, 2014
    Date of Patent: September 12, 2017
    Assignee: FORD GLOBAL TECHNOLOGIES, LLC
    Inventors: Michael Alan Blommer, Scott Andrew Amman, Brigitte Frances Mora Richardson, Francois Charette, Mark Edward Porter, Gintaras Vincent Puskorius, Anthony Dwayne Cooprider
  • Patent number: 9756093
    Abstract: Systems and methods are provided for creating a sample of electronic content for transmission to others. For example, a user may select a portion of an electronic content file to be transmitted as an excerpt file to another user. In some instances, transmission may be prohibited based on various criteria.
    Type: Grant
    Filed: December 30, 2013
    Date of Patent: September 5, 2017
    Assignee: Audible, Inc.
    Inventors: Guy Story, Howard Wolfish, Bryan Field-Elliot, Glenn Rogers, Alexander Galkin, Igor Grebnev, John Federico, Steven Hatch, Deepa Muralikrishnan, Arik Meyer
  • Patent number: 9754593
    Abstract: A speech recognition capability in which speakers of spoken text are identified based on the contour of sound waves representing the spoken text. Variations in the contour of the sound waves are identified, features are assigned to those variations, and parameters of those features are grouped into predefined characteristics. The predefined characteristics are combined into voice characteristic groups. If a prior voice characteristic group is present, the voice characteristic group from the soundlet is compared to existing voice characteristic groups and, if a match is present, the sound construct is assigned to a speaker identified by the existing voice characteristic group.
    Type: Grant
    Filed: November 4, 2015
    Date of Patent: September 5, 2017
    Assignee: International Business Machines Corporation
    Inventor: Mukundan Sundararajan
  • Patent number: 9753546
    Abstract: A system and method for selective gesture interaction using spatial volumes is disclosed. The method includes processing data frames that each includes one or more body point locations of a collaborating user that is interfacing with an application at each time intervals, defining a spatial volume for each collaborating user based on the processed data frames, detecting a gesture performed by a first collaborating user based on the processed data frames, determining the gesture to be an input gesture performed by the first collaborating user in a first spatial volume, interpreting the input gesture based on a context of the first spatial volume that includes a role of the first collaborating user, a phase of the application, and an intersection volume between the first spatial volume and a second spatial volume for a second collaborating user, and providing an input command to the application based on the interpreted input gesture.
    Type: Grant
    Filed: August 29, 2014
    Date of Patent: September 5, 2017
    Assignee: General Electric Company
    Inventors: Habib Abi-Rached, Jeng-Weei Lin, Sundar Murugappan, Arnold Lund, Veeraraghavan Ramaswamy
  • Patent number: 9747895
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for building language models. One of the methods includes identifying a first group of one or more users associated with a user in a social network. The method includes identifying first linguistic information associated with the first group. The method includes generating a first language model based on the first linguistic information. The method includes identifying a second group of one or more users associated with the user. The method includes identifying second linguistic information associated with the second group. The method includes generating a second language model based on the second linguistic information. The method includes associating the first language model and the second language model with the user.
    Type: Grant
    Filed: July 8, 2013
    Date of Patent: August 29, 2017
    Assignee: Google Inc.
    Inventors: Martin Jansche, Mark Edward Epstein
  • Patent number: 9743212
    Abstract: The subject disclosure is directed towards calibrating sound pressure levels of speakers to determine desired attenuation data for use in later playback. A user may be guided to a calibration location to place a microphone, and each speaker is calibrated to output a desired sound pressure level in its current acoustic environment based upon the attenuation data learned during calibration. During playback, the attenuation data is used. Also described is testing the setup of the speakers, and dynamically adjusting the attenuation data in real time based upon tracking the listener's current location.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: August 22, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Robert Lucas Ridihalgh, Gregory Michael Shaw, Todd Matthew Williams, Tarlochan Singh Randhawa
  • Patent number: 9741342
    Abstract: A dictation device includes: an audio input device configured to receive a voice utterance including a plurality of words; a video input device configured to receive video of lip motion during the voice utterance; a memory portion; a controller configured according to instructions in the memory portion to generate first data packets including an audio stream representative of the voice utterance and a video stream representative of the lip motion; and a transceiver for sending the first data packets to a server end device and receiving second data packets including combined dictation based upon the audio stream and the video stream from the server end device. In the combined dictation, first dictation generated based upon the audio stream has been corrected by second dictation generated based upon the video stream.
    Type: Grant
    Filed: August 13, 2015
    Date of Patent: August 22, 2017
    Assignee: Panasonic Intellectual Property Corporation of America
    Inventors: Yuichiro Takayanagi, Masashi Kusaka
  • Patent number: 9734826
    Abstract: Optimized language models are provided for in-domain applications through an iterative, joint-modeling approach that interpolates a language model (LM) from a number of component LMs according to interpolation weights optimized for a target domain. The component LMs may include class-based LMs, and the interpolation may be context-specific or context-independent. Through iterative processes, the component LMs may be interpolated and used to express training material as alternative representations or parses of tokens. Posterior probabilities may be determined for these parses and used for determining new (or updated) interpolation weights for the LM components, such that a combination or interpolation of component LMs is further optimized for the domain. The component LMs may be merged, according to the optimized weights, into a single, combined LM, for deployment in an application scenario.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: August 15, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke, Shuangyu Chang
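The iterative weight re-estimation described above resembles classic EM interpolation of language models; a toy sketch under that assumption (probabilities invented, context-independent weights only):

```python
def em_interpolation_weights(component_probs, iters=50):
    """EM re-estimation of linear interpolation weights.
    component_probs[t][k] = P_k(token_t | history) from component LM k on held-out data."""
    k = len(component_probs[0])
    w = [1.0 / k] * k
    for _ in range(iters):
        totals = [0.0] * k
        for probs in component_probs:
            mix = sum(wi * pi for wi, pi in zip(w, probs))
            for i in range(k):
                totals[i] += w[i] * probs[i] / mix  # posterior of component i for this token
        w = [t / len(component_probs) for t in totals]
    return w

# Toy held-out tokens on which component 0 consistently assigns higher probability,
# so its interpolation weight should grow toward 1.
probs = [(0.5, 0.1), (0.4, 0.2), (0.6, 0.1)]
weights = em_interpolation_weights(probs)
print(weights)
```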
  • Patent number: 9736613
    Abstract: Methods, apparatus, and computer programs for simulating the source of sound are provided. One method includes operations for determining a location in space of the head of a user utilizing face recognition of images of the user. Further, the method includes an operation for determining a sound for two speakers, and an operation for determining an emanating location in space for the sound, each speaker being associated with one ear of the user. The acoustic signals for each speaker are established based on the location in space of the head, the sound, the emanating location in space, and the auditory characteristics of the user. In addition, the acoustic signals are transmitted to the two speakers. When played by the two speakers, the acoustic signals make the sound appear to originate at the emanating location in space.
    Type: Grant
    Filed: May 7, 2015
    Date of Patent: August 15, 2017
    Assignee: Sony Interactive Entertainment Inc.
    Inventor: Steven Osman
  • Patent number: 9728185
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for recognizing speech using neural networks. One of the methods includes receiving an audio input; processing the audio input using an acoustic model to generate a respective phoneme score for each of a plurality of phoneme labels; processing one or more of the phoneme scores using an inverse pronunciation model to generate a respective grapheme score for each of a plurality of grapheme labels; and processing one or more of the grapheme scores using a language model to generate a respective text label score for each of a plurality of text labels.
    Type: Grant
    Filed: May 22, 2015
    Date of Patent: August 8, 2017
    Assignee: Google Inc.
    Inventors: Johan Schalkwyk, Francoise Beaufays, Hasim Sak, John Giannandrea
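The three-stage scoring pipeline above (acoustic model, then inverse pronunciation model, then language model) can be illustrated by chaining toy score tables; the vocabularies and probabilities below are invented for illustration and bear no relation to the neural models the patent actually uses:

```python
# Toy scores standing in for the three model stages.
phoneme_scores = {"k": 0.7, "s": 0.3}                      # acoustic model output
grapheme_given_phoneme = {"k": {"c": 0.6, "k": 0.4},       # inverse pronunciation model
                          "s": {"s": 0.8, "c": 0.2}}
text_given_grapheme = {"c": {"cat": 0.9, "kit": 0.1},      # language model
                       "k": {"kit": 1.0},
                       "s": {"sat": 1.0}}

# Stage 2: marginalize phoneme scores into grapheme scores.
grapheme_scores = {}
for ph, ps in phoneme_scores.items():
    for g, pg in grapheme_given_phoneme[ph].items():
        grapheme_scores[g] = grapheme_scores.get(g, 0.0) + ps * pg

# Stage 3: marginalize grapheme scores into text label scores.
text_scores = {}
for g, gs in grapheme_scores.items():
    for text, pt in text_given_grapheme[g].items():
        text_scores[text] = text_scores.get(text, 0.0) + gs * pt

print(max(text_scores, key=text_scores.get))
```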
  • Patent number: 9728183
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for combining frame and segment level processing, via temporal pooling, for phonetic classification. A frame processor unit receives an input and extracts the time-dependent features from the input. A plurality of pooling interface units generates a plurality of feature vectors based on pooling the time-dependent features and selecting a plurality of time-dependent features according to a plurality of selection strategies. Next, a plurality of segmental classification units generates scores for the feature vectors. Each segmental classification unit (SCU) can be dedicated to a specific pooling interface unit (PIU) to form a PIU-SCU combination. Multiple PIU-SCU combinations can be further combined to form an ensemble of combinations, and the ensemble can be diversified by varying the pooling operations used by the PIU-SCU combinations.
    Type: Grant
    Filed: November 10, 2015
    Date of Patent: August 8, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Sumit Chopra, Dimitrios Dimitriadis, Patrick Haffner
  • Patent number: 9728188
    Abstract: Systems and methods for detecting similar audio being received by separate voice activated electronic devices, and ignoring the resulting commands, are described herein. In some embodiments, a voice activated electronic device may be activated by a wakeword output by another electronic device, such as a television or radio, may capture audio of the sound following the wakeword, and may send audio data representing that sound to a backend system. Upon receipt, the backend system may, in parallel with performing automated speech recognition processing on the audio data, generate a sound profile of the audio data, and may compare that sound profile to sound profiles of recently received audio data and/or flagged sound profiles. If the generated sound profile is determined to match another sound profile, the automated speech recognition processing may be stopped, and the voice activated electronic device may be instructed to return to a keyword spotting mode.
    Type: Grant
    Filed: June 28, 2016
    Date of Patent: August 8, 2017
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Alexander David Rosen, Michael James Rodehorst, George Jay Tucker, Aaron Lee Mathers Challenner
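A minimal sketch of the sound-profile comparison above, assuming, purely for illustration, that a profile is a per-band energy bitmask and that matching tolerates one differing bit (the real profiles and thresholds are not specified in the abstract):

```python
def sound_profile(samples, bands=8):
    """Coarse energy fingerprint: one bit per band, set when band energy is above the mean."""
    n = len(samples) // bands
    energies = [sum(s * s for s in samples[i * n:(i + 1) * n]) for i in range(bands)]
    mean = sum(energies) / bands
    return tuple(int(e > mean) for e in energies)

def profiles_match(a, b, max_mismatch=1):
    """Two profiles match when their bitmasks differ in at most max_mismatch positions."""
    return sum(x != y for x, y in zip(a, b)) <= max_mismatch

# Two devices hearing the same TV audio (one with slight noise) yield near-identical
# profiles, so the duplicate request could be dropped before full speech recognition runs.
tv_audio = [((i % 5) - 2) * (1 if i < 400 else 3) for i in range(800)]
device_b = [s + (1 if i % 7 == 0 else 0) for i, s in enumerate(tv_audio)]
print(profiles_match(sound_profile(tv_audio), sound_profile(device_b)))
```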
  • Patent number: 9715874
    Abstract: Techniques are described for updating an automatic speech recognition (ASR) system that, prior to the update, is configured to perform ASR using a first finite-state transducer (FST) comprising a first set of paths representing recognizable speech sequences. A second FST may be accessed, comprising a second set of paths representing speech sequences to be recognized by the updated ASR system. By analyzing the second FST together with the first FST, a patch may be extracted and provided to the ASR system as an update, capable of being applied non-destructively to the first FST at the ASR system to cause the ASR system using the first FST with the patch to recognize speech using the second set of paths from the second FST. In some embodiments, the patch may be configured such that destructively applying the patch to the first FST creates a modified FST that is globally minimized.
    Type: Grant
    Filed: October 30, 2015
    Date of Patent: July 25, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Stephan Kanthak, Jan Vlietinck, Johan Vantieghem, Stijn Verschaeren
  • Patent number: 9711139
    Abstract: A method for building a language model, a speech recognition method and an electronic apparatus are provided. The speech recognition method includes the following steps. Phonetic transcriptions of a speech signal are obtained from an acoustic model. Phonetic spellings matching the phonetic transcriptions are obtained according to the phonetic transcriptions and a syllable acoustic lexicon. According to the phonetic spellings, a plurality of text sequences and a plurality of text sequence probabilities are obtained from a language model. Each phonetic spelling is matched to a candidate sentence table; a word probability of each phonetic spelling matching a word in a sentence of the sentence table is obtained; and the word probabilities of the phonetic spellings are combined so as to obtain the text sequence probabilities. The text sequence corresponding to the largest of the text sequence probabilities is selected as the recognition result of the speech signal.
    Type: Grant
    Filed: July 5, 2016
    Date of Patent: July 18, 2017
    Assignee: VIA Technologies, Inc.
    Inventor: Guo-Feng Zhang
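The final selection step above, picking the text sequence whose combined per-word match probabilities are largest, can be sketched as follows (candidate sentences and probabilities are hypothetical):

```python
import math

# Hypothetical per-word probabilities of each phonetic spelling matching
# a word in two candidate sentences.
candidates = {
    "today is sunny": [0.9, 0.8, 0.7],
    "two day is sun knee": [0.6, 0.5, 0.5, 0.4, 0.4],
}

def sequence_log_prob(word_probs):
    """Combine word probabilities in log space to avoid underflow on long sentences."""
    return sum(math.log(p) for p in word_probs)

# The recognition result is the candidate with the largest sequence probability.
best = max(candidates, key=lambda s: sequence_log_prob(candidates[s]))
print(best)
```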
  • Patent number: 9706280
    Abstract: At least one exemplary embodiment is directed to an acoustic device including an ear canal microphone configured to detect a first acoustic signal, an ambient microphone configured to detect a second acoustic signal, and a processor operatively coupled to the ear canal microphone and the ambient microphone. The processor is configured to detect a predetermined speech pattern based on an analysis of the first acoustic signal and the second acoustic signal and upon detection of the predetermined speech pattern, subsequently process acoustic signals from the ear canal microphone for a predetermined time or until the ear canal microphone detects a lack of an acoustic signal for a second predetermined time. After the predetermined time or after a second predetermined time, the processor processes acoustic signals from the ear canal microphone and the ambient microphone.
    Type: Grant
    Filed: July 15, 2015
    Date of Patent: July 11, 2017
    Assignee: Personics Holdings, LLC
    Inventors: John Usher, Steve Goldstein, Marc Boillot
  • Patent number: 9697822
    Abstract: A method for updating an adaptive speech recognition model is provided. In some implementations, the method is performed at a communications device including one or more processors and memory storing instructions for execution by the one or more processors. The method includes determining that a first user of a first mobile communication device is engaged in a call over a communications network and providing an adaptive speech recognition model. The method also includes analyzing an outbound audio channel of the first mobile communication device to obtain a call audio signal corresponding to audio input from one or more microphones of the first mobile communication device and updating the adaptive speech recognition model with training data derived from the call audio signal.
    Type: Grant
    Filed: April 28, 2014
    Date of Patent: July 4, 2017
    Assignee: Apple Inc.
    Inventors: Devang K. Naik, Onur E. Tackin
  • Patent number: 9692894
    Abstract: A method of generating a customer satisfaction score based on behavioral assessment data across one or more recorded communications, which includes analyzing one or more communications between a customer and an agent by applying a linguistic-based psychological behavioral model to each communication to determine a personality type of the customer, selecting at least one filter criterion which comprises a customer, an agent, a team, or a call type, calculating a customer satisfaction score using the at least one selected filter criterion across a selected time interval and based on one or more communications, and displaying a report including the calculated customer satisfaction score to a user that matches the at least one selected filter criterion for the selected time interval. Systems and non-transitory computer readable media configured to generate a customer satisfaction score based on behavioral assessment data are also included.
    Type: Grant
    Filed: August 5, 2016
    Date of Patent: June 27, 2017
    Assignee: Mattersight Corporation
    Inventors: Kelly Conway, David Gustafson, Douglas Brown, Christopher Danson
  • Patent number: 9685159
    Abstract: A method for speaker recognition comprising: obtaining speaker information for a target speaker; obtaining speech samples from telephone calls from an unknown speaker; classifying the speech samples according to the unknown speaker, thereby providing speaker-dependent classes of speech samples; extracting speaker information from each of the speaker-dependent classes of speech samples; combining the extracted speaker information; comparing the combined extracted speaker information with the stored speaker information for the target speaker to obtain a comparison result; and determining whether the unknown speaker is identical to the target speaker based on the comparison result.
    Type: Grant
    Filed: August 3, 2015
    Date of Patent: June 20, 2017
    Assignee: Agnitio SL
    Inventors: Marta Garcia Gomar, Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez
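A toy sketch of the combine-and-compare steps above, assuming, as an illustration only, that the extracted speaker information is a fixed-length vector per class, combined by averaging and scored against the target by cosine similarity with an invented threshold:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def combine(vectors):
    """Average the per-class speaker vectors into one combined representation."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Hypothetical stored target-speaker vector and per-class vectors from the unknown speaker.
target = [0.9, 0.1, 0.2]
class_vectors = [[0.8, 0.2, 0.1], [0.85, 0.05, 0.3]]

score = cosine(combine(class_vectors), target)
decision = score > 0.95  # assumed verification threshold, not from the patent
print(decision)
```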
  • Patent number: 9674355
    Abstract: A system and method for processing call data is provided. A call between a user and an agent is monitored. A selected script having a dialog grammar is received. The selected script is executed by converting at least a portion of the script into synthesized speech utterances and providing the synthesized speech utterances to the user. Speech utterances are received from the user in reply to each of the synthesized speech utterances from the script. Each received speech utterance is converted to text as a user message and a form is populated with the user messages. The user speech utterances and the form with the user messages are provided to the agent.
    Type: Grant
    Filed: March 10, 2016
    Date of Patent: June 6, 2017
    Assignee: Intellisist, Inc.
    Inventors: Gilad Odinak, Alastair Sutherland, William A. Tolhurst
  • Patent number: 9671871
    Abstract: An apparatus and a method for recognizing a gesture use infrared light. The apparatus for recognizing a gesture includes a sensing unit, which detects a gesture using an infrared sensor to obtain a sensing value from the sensing result; a control unit, which performs gesture recognition reflecting the user's intention in accordance with a predetermined recognizing mode based on the obtained sensing value; and a storing unit, which stores the predetermined recognizing mode when the gesture recognition set in advance by the user is performed. The predetermined recognizing mode includes a first recognizing mode, in which the gesture is recognized directly, and a second recognizing mode, in which the gesture is recognized after a hold motion that marks the start of gesture recognition.
    Type: Grant
    Filed: February 12, 2015
    Date of Patent: June 6, 2017
    Assignee: HYUNDAI MOBIS CO., LTD
    Inventor: Chan Hee Park
  • Patent number: 9666182
    Abstract: Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on the utterances lacking a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances lacking a manual transcription are intelligently selected and manually transcribed. Both the automatically transcribed utterances and those having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model for call classification is trained from the mined audio data.
    Type: Grant
    Filed: October 5, 2015
    Date of Patent: May 30, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Dilek Z. Hakkani-Tur, Mazin G. Rahim, Giuseppe Riccardi, Gokhan Tur
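The "intelligently selected" step above is commonly realized as confidence-based active learning; a sketch under that assumption (utterances, hypotheses, and confidence scores all invented):

```python
# Hypothetical ASR hypotheses with confidence scores for untranscribed utterances.
utterances = [
    {"id": "u1", "hyp": "cancel my order", "confidence": 0.92},
    {"id": "u2", "hyp": "a cant the or", "confidence": 0.31},
    {"id": "u3", "hyp": "check my balance", "confidence": 0.88},
    {"id": "u4", "hyp": "uh talk to agent", "confidence": 0.55},
]

def select_for_manual_transcription(utts, budget):
    """Pick the `budget` least-confident utterances: the ones whose manual
    transcription is most likely to improve the model."""
    return [u["id"] for u in sorted(utts, key=lambda u: u["confidence"])[:budget]]

print(select_for_manual_transcription(utterances, 2))
```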
  • Patent number: 9666179
    Abstract: A waveform memory stores a plurality of speech unit waveforms corresponding to respective speech units. The address order of the speech unit waveforms is determined by the sort order of the speech units included in a speech unit sequence corresponding to a phoneme sequence of training data, and the speech units included in the speech unit sequence are selected so as to synthesize speech for that phoneme sequence.
    Type: Grant
    Filed: February 26, 2014
    Date of Patent: May 30, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Takehiko Kagoshima
  • Patent number: 9659092
    Abstract: A music information searching method includes extracting modulation spectrums from audio data, generating modulation-spectrum peak-point audio fingerprints using position information of preset peak points in the extracted modulation spectrums, converting the generated fingerprints via hash functions into hash keys, which indicate addresses of hash tables, and hash values, which are stored in the hash tables, and searching for music information by extracting hash keys from audio query clips and comparing the extracted hash keys with the addresses of the hash tables.
    Type: Grant
    Filed: November 13, 2013
    Date of Patent: May 23, 2017
    Assignees: SAMSUNG ELECTRONICS CO., LTD., KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION
    Inventors: Ki-wan Eom, Hyoung-Gook Kim, Kwang-ki Kim
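The hash-key lookup stage described above can be sketched with an in-memory table; the bit packing, peak values, and track names below are all hypothetical:

```python
def peak_hash(freq_bin, time_delta):
    """Pack a peak's frequency bin and time offset into one integer hash key."""
    return (freq_bin << 8) | (time_delta & 0xFF)

# Hypothetical fingerprint database: hash key -> list of (track, offset) hash values.
db = {}

def index_track(track, peaks):
    for f, dt, offset in peaks:
        db.setdefault(peak_hash(f, dt), []).append((track, offset))

index_track("song_a", [(10, 3, 0), (22, 5, 1), (31, 2, 2)])
index_track("song_b", [(11, 7, 0), (25, 4, 1)])

def match(query_peaks):
    """Hash each query peak and vote for the track with the most key collisions."""
    votes = {}
    for f, dt in query_peaks:
        for track, _ in db.get(peak_hash(f, dt), []):
            votes[track] = votes.get(track, 0) + 1
    return max(votes, key=votes.get) if votes else None

print(match([(10, 3), (31, 2)]))
```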