Probability Patents (Class 704/240)
  • Patent number: 10354647
    Abstract: Implementations of the present disclosure include actions of providing first text for display on a computing device of a user, the first text being provided from a first speech recognition engine based on first speech received from the computing device, and being displayed as a search query, receiving a speech correction indication from the computing device, the speech correction indication indicating a portion of the first text that is to be corrected, receiving second speech from the computing device, receiving second text from a second speech recognition engine based on the second speech, the second speech recognition engine being different from the first speech recognition engine, replacing the portion of the first text with the second text to provide a combined text, and providing the combined text for display on the computing device as a revised search query.
    Type: Grant
    Filed: April 28, 2016
    Date of Patent: July 16, 2019
    Assignee: Google LLC
    Inventors: Dhruv Bakshi, Zaheed Sabur, Tilke Mary Judd, Nicholas G. Fey
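The correction flow in 10354647 above amounts to splicing the second engine's transcription into the span the user flagged. A minimal Python sketch, assuming the speech correction indication arrives as a simple (start, end) character range; the patent leaves the form of that indication open:

```python
def revise_query(first_text: str, span: tuple, second_text: str) -> str:
    """Splice the second engine's transcription into the user-flagged span of
    the first transcription to form the revised search query. Representing the
    speech correction indication as a (start, end) character range is an
    assumption, not something the patent specifies."""
    start, end = span
    combined = first_text[:start] + second_text + first_text[end:]
    return " ".join(combined.split())   # normalize whitespace after the splice

# First engine heard "weather in Austin"; the user flags the city and re-speaks
# it, and the second engine returns "Boston".
print(revise_query("weather in Austin", (11, 17), "Boston"))   # weather in Boston
```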
  • Patent number: 10347245
    Abstract: Either or both of voice speaker identification and utterance classification, such as by age, gender, accent, mood, and prosody, characterize speech utterances in a system that performs automatic speech recognition (ASR) and natural language processing (NLP). The characterization conditions NLP, either through application to interpretation hypotheses or to specific grammar rules. The characterization also conditions language models of ASR. Conditioning may comprise enablement and may comprise reweighting of hypotheses.
    Type: Grant
    Filed: January 20, 2017
    Date of Patent: July 9, 2019
    Assignee: SOUNDHOUND, INC.
    Inventor: Karl Stahl
  • Patent number: 10332016
    Abstract: The invention concerns a method for comparing two data items obtained from a sensor or interface, carried out by processing means of a processing unit, the method comprising computing a similarity function between two feature vectors of the data to be compared, characterized in that each feature vector of a datum is modelled as the summation of Gaussian variables, said variables comprising: a mean of a class to which the vector belongs, an intrinsic deviation, and an observation noise of the vector, each feature vector being associated with a quality vector comprising information on the observation noise of the feature vector, and in that the similarity function is computed from the feature vectors and associated quality vectors.
    Type: Grant
    Filed: November 3, 2015
    Date of Patent: June 25, 2019
    Assignee: IDEMIA IDENTITY & SECURITY
    Inventors: Julien Bohne, Stephane Gentric
  • Patent number: 10319209
    Abstract: A system and method of motion analysis, fall detection, and fall prediction using machine learning and classifiers. A wearable motion sensor for collecting and transmitting motion data for use in a fall prediction model using features and parameters to classify the motion data and notify when a fall is emergent. Using machine learning, the fall prediction model can be created, implemented, evaluated, and it can evolve over time with additional data. The system and method can use individual data or pool data from various individuals for use in fall prediction.
    Type: Grant
    Filed: June 5, 2017
    Date of Patent: June 11, 2019
    Inventor: John Carlton-Foss
  • Patent number: 10318009
    Abstract: For the purpose of enhancing accessibility of a user with respect to various applications, the disclosed technique provides a method for controlling a user-interface that provides an instruction to an application through a user-operation which is performed on a display provided on a device. The method includes a process performed by the device. The process includes: acquiring information displayed by the display; extracting at least one feature existing on the acquired information; receiving an action of a user; searching a database to identify a predetermined operation, which corresponds to the received action and the extracted at least one feature; and providing an instruction to the application through applying the identified predetermined operation, not the received action, to the user-interface.
    Type: Grant
    Filed: April 13, 2017
    Date of Patent: June 11, 2019
    Assignee: HI CORPORATION
    Inventors: Tomonobu Aoyama, Tatsuo Sasaki, Seiichi Kataoka
  • Patent number: 10305828
    Abstract: A computing device is described that includes at least one processor and a memory including instructions that when executed cause the at least one processor to output, for display, a graphical keyboard comprising a plurality of keys, and determine, based on an indication of a selection of one or more keys from the plurality of keys, text of an electronic communication. The instructions, when executed, further cause the at least one processor to identify, based at least in part on the text, a searchable entity or trigger phrase, generate, based on the searchable entity or trigger phrase, a search query, and output, for display, within the graphical keyboard, a graphical indication to indicate that the computing device generated the search query.
    Type: Grant
    Filed: April 20, 2016
    Date of Patent: May 28, 2019
    Assignee: Google LLC
    Inventors: Jing Cao, Alexa Greenberg, Abhanshu Sharma, Yanchao Su, Nicholas Kong, Muhammad Mohsin, Jacek Jurewicz, Wei Huang, Matthew Sharifi, Benjamin Sidhom
  • Patent number: 10296633
    Abstract: A system includes a storage system configured to store data objects as a plurality of shards according to a redundancy encoding technique at a plurality of availability zones. The system further includes a redundancy reduction manager configured to perform a shard spreading process and a shard pruning process. The shard spreading process involves identifying an underutilized availability zone for a particular data object and moving at least one shard of the particular data object from another availability zone to the underutilized availability zone. The shard pruning process involves identifying a pruning candidate availability zone and deleting a shard of a particular data object at the pruning candidate availability zone in response to determining that deleting the shard would not violate a durability model for the particular data object.
    Type: Grant
    Filed: March 23, 2016
    Date of Patent: May 21, 2019
    Assignee: Amazon Technologies, Inc.
    Inventor: Jonathan Robert Collins
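For the shard pruning process in 10296633 above, the key check is that removing a shard from the candidate zone must not violate the object's durability model. A hedged sketch, assuming the durability model reduces to a minimum count of surviving shards (the patent does not specify its form):

```python
def can_prune(shards_by_zone: dict, candidate_zone: str, min_shards: int) -> bool:
    """Return True if deleting one shard of the object in the candidate zone
    still satisfies the (assumed) durability model of `min_shards` survivors."""
    if shards_by_zone.get(candidate_zone, 0) == 0:
        return False                                # nothing to prune in that zone
    total_after = sum(shards_by_zone.values()) - 1
    return total_after >= min_shards

# Object encoded into 6 shards spread over 4 availability zones (illustrative).
layout = {"us-east-1a": 2, "us-east-1b": 2, "us-east-1c": 1, "us-east-1d": 1}
print(can_prune(layout, "us-east-1a", min_shards=5))   # True
```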
  • Patent number: 10283168
    Abstract: Provided are an audio file re-recording method and device, and a storage medium. The method includes: determining a first time, the first time being the start time of a recorded clip to be re-recorded in an audio file; playing a first recorded clip that has been recorded, the first recorded clip using the first time as its end time in the audio file; upon arrival of the first time, collecting first voice data of a user to obtain a second recorded clip; and processing the first recorded clip and the second recorded clip to obtain a re-recorded audio file.
    Type: Grant
    Filed: May 1, 2018
    Date of Patent: May 7, 2019
    Assignee: GUANGZHOU KUGOU COMPUTER TECHNOLOGY CO., LTD.
    Inventor: Suiyu Feng
  • Patent number: 10242668
    Abstract: An apparatus includes a language model group identifier configured to identify a language model group based on determined characteristic data of a user, and a language model generator configured to generate a user-based language model by interpolating a general language model for speech recognition based on the identified language model group.
    Type: Grant
    Filed: August 3, 2016
    Date of Patent: March 26, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Min Young Mun
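Patent 10242668 above describes interpolating a general language model toward a model for the user's identified language model group. A minimal sketch, assuming unigram probabilities stored as dicts and a fixed interpolation weight (the abstract fixes neither):

```python
def interpolate_language_models(general_lm: dict, group_lm: dict, lam: float = 0.3) -> dict:
    """User-based LM as a linear interpolation of a general LM with the LM of
    the identified language model group:
        P_user(w) = lam * P_group(w) + (1 - lam) * P_general(w)
    Unigram dict models and a fixed lam are illustrative assumptions."""
    vocab = set(general_lm) | set(group_lm)
    return {w: lam * group_lm.get(w, 0.0) + (1.0 - lam) * general_lm.get(w, 0.0)
            for w in vocab}

general = {"hello": 0.020, "hi": 0.010, "howdy": 0.001}
teen_group = {"hi": 0.030, "hey": 0.020}   # group identified from user characteristic data
user_lm = interpolate_language_models(general, teen_group, lam=0.4)
```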
  • Patent number: 10235996
    Abstract: A system and method for providing a voice assistant, including receiving, at a first device, a first audio input from a user requesting a first action; performing automatic speech recognition on the first audio input; obtaining a context of the user; performing natural language understanding based on the speech recognition of the first audio input; and taking the first action based on the context of the user and the natural language understanding.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: March 19, 2019
    Assignee: Xbrain, Inc.
    Inventors: Gregory Renard, Mathias Herbaux
  • Patent number: 10199040
    Abstract: A method of automatic speech recognition, the method comprising the steps of receiving a speech signal, dividing the speech signal into time windows, for each time window, determining acoustic parameters of the speech signal within that window, and identifying speech features from the acoustic parameters, such that a sequence of speech features is generated for the speech signal, separating the sequence of speech features into a sequence of phonological segments, and comparing the sequential phonological segments to a stored lexicon to identify one or more words in the speech signal.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: February 5, 2019
    Assignee: OXFORD UNIVERSITY INNOVATION LIMITED
    Inventors: Aditi Lahiri, Henning Reetz, Philip Roberts
  • Patent number: 10152507
    Abstract: Methods and systems are provided for finding a target document in spoken language processing. One of the methods includes calculating a score of each document in a document set, in response to receipt of the first n words of output of an automatic speech recognition (ASR) system, n being equal to or greater than zero. The method further includes reading a prior distribution of each document in the document set from a memory device, and updating, for each document in the document set, the score, using the prior distribution, and a weight for interpolation, the weight for interpolation being set based on a confidence score of output of the ASR system. The method additionally includes finding a target document among the document set, based on the updated score of each document.
    Type: Grant
    Filed: March 22, 2016
    Date of Patent: December 11, 2018
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Masayuki A. Suzuki, Ryuki Tachibana
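The score update in 10152507 above blends each document's running score with its prior, weighted by how much the ASR output can be trusted. A sketch under the assumption that the interpolation weight is simply one minus the ASR confidence; the patent only says the weight is set based on that confidence:

```python
def update_scores(scores: dict, priors: dict, asr_confidence: float) -> dict:
    """Interpolate each document's running score with its prior distribution;
    the weight on the prior grows as ASR confidence falls (assumed mapping)."""
    w = 1.0 - asr_confidence                      # low confidence -> trust the prior
    return {doc: w * priors[doc] + (1.0 - w) * s for doc, s in scores.items()}

scores = {"manual_a": 0.6, "faq_b": 0.4}          # scores after the first n words
priors = {"manual_a": 0.2, "faq_b": 0.8}          # prior distribution read from memory
updated = update_scores(scores, priors, asr_confidence=0.3)
target = max(updated, key=updated.get)            # find the target document
```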
  • Patent number: 10140581
    Abstract: Features are disclosed for generating models, such as conditional random field (“CRF”) models, that consume less storage space and/or transmission bandwidth than conventional models. In some embodiments, the generated CRF models are composed of fewer or alternate components in comparison with conventional CRF models. For example, a system generating such CRF models may forgo the use of large dictionaries or other cross-reference lists that map information extracted from input (e.g., “features”) to model parameters; reduce in weight (or exclude altogether) certain model parameters that may not have a significant effect on model accuracy; and/or reduce the numerical precision of model parameters.
    Type: Grant
    Filed: December 22, 2014
    Date of Patent: November 27, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Imre Attila Kiss, Wei Chen, Anjishnu Kumar
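Two of the compaction steps named in 10140581 above, excluding parameters that barely affect accuracy and reducing numerical precision, can be illustrated in a few lines of NumPy. The pruning threshold and the float16 target are illustrative assumptions:

```python
import numpy as np

def compact_model(weights: np.ndarray, prune_threshold: float = 1e-3,
                  dtype=np.float16) -> np.ndarray:
    """Zero out parameters too small to matter, then store the rest at reduced
    numerical precision (threshold and float16 target are assumptions)."""
    w = weights.copy()
    w[np.abs(w) < prune_threshold] = 0.0       # exclude insignificant parameters
    return w.astype(dtype)                     # reduce numerical precision

params = np.random.randn(1_000_000).astype(np.float32)
small = compact_model(params)
print(params.nbytes, "->", small.nbytes)       # roughly half the storage
```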
  • Patent number: 10129135
    Abstract: A flow of packets is communicated through a data center. The data center includes multiple racks, where each rack includes multiple network devices. A group of packets of the flow is received onto an integrated circuit located in a first network device. The integrated circuit includes a neural network. The neural network analyzes the group of packets and in response outputs a neural network output value. The neural network output value is used to determine how the packets of the flow are to be output from a second network device. In one example, each packet of the flow output by the first network device is output along with a tag. The tag is indicative of the neural network output value. The second device uses the tag to determine which output port located on the second device is to be used to output each of the packets.
    Type: Grant
    Filed: September 1, 2015
    Date of Patent: November 13, 2018
    Assignee: Netronome Systems, Inc.
    Inventor: Nicolaas J. Viljoen
  • Patent number: 10127224
    Abstract: Technologies for extensible, context-aware natural language interactions include a computing device having a number of context source modules. Context source modules may be developed or installed after deployment of the computing device to a user. Each context source module includes a context capture module, a language model, one or more database query mappings, and may include one or more user interface element mappings. The context capture module interprets, generates, and stores context data. A virtual personal assistant (VPA) of the computing device indexes the language models and generates a semantic representation of a user request that associates each word of the request to a language model. The VPA translates the user request into a database query, and may generate a user interface element for the request. The VPA may execute locally on the computing device or remotely on a cloud server. Other embodiments are described and claimed.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: November 13, 2018
    Assignee: Intel Corporation
    Inventor: William C. Deleeuw
  • Patent number: 10114668
    Abstract: Techniques are described for managing execution of programs, including using excess program execution capacity of one or more computing systems. For example, a private pool of excess computing capacity may be maintained for a user based on unused dedicated program execution capacity allocated for that user, with the private pool of excess capacity being available for priority use by that user. Such private excess capacity pools may further in some embodiments be provided in addition to a general, non-private excess computing capacity pool that is available for use by multiple users, optionally including users who are associated with the private excess capacity pools. In some such situations, excess computing capacity may be made available to execute programs on a temporary basis, such that the programs executing using the excess capacity may be terminated at any time if other preferred use for the excess capacity arises.
    Type: Grant
    Filed: December 29, 2014
    Date of Patent: October 30, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Eric Jason Brandwine, James Alfred Gordon Greenfield
  • Patent number: 10115394
    Abstract: An object is to provide a technique that can yield a highly valid recognition result while preventing unnecessary processing. A voice recognition device includes first to third voice recognition units, and a control unit. When it is decided, based on recognition results obtained by the first and second voice recognition units, to cause the third voice recognition unit to recognize an input voice, the control unit causes the third voice recognition unit to recognize the input voice by using a dictionary including a candidate character string obtained by at least one of the first and second voice recognition units.
    Type: Grant
    Filed: July 8, 2014
    Date of Patent: October 30, 2018
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Naoya Sugitani, Yohei Okato, Michihiro Yamazaki
  • Patent number: 10078673
    Abstract: A computing device is described that includes at least one processor and a memory including instructions that when executed cause the at least one processor to output, for display, a graphical keyboard comprising a plurality of keys, determine, based on an indication of a selection of one or more keys from the plurality of keys, inputted text, determine, based on the inputted text, an information category associated with the inputted text, determine, based on the information category, a graphical symbol associated with the information category, and output, for display, the graphical symbol in a suggestion region of the graphical keyboard.
    Type: Grant
    Filed: April 20, 2016
    Date of Patent: September 18, 2018
    Assignee: Google LLC
    Inventors: Jens Nagel, Alexa Greenberg, Christian Paul Charsagua
  • Patent number: 10078785
    Abstract: A sound source separation method comprising the steps of determining at least one location of at least one sound source based on video data, determining initial estimates of at least two parameters characterizing an audio signal emitted by said sound source, said initial estimates being determined based on said at least one location, performing an expectation-maximization method for determining final estimates of said parameters, and separating the audio signal from a combination of audio signals based on said final estimates of said parameters.
    Type: Grant
    Filed: December 15, 2015
    Date of Patent: September 18, 2018
    Assignee: Canon Kabushiki Kaisha
    Inventors: Johann Citerin, Gérald Kergourlay
  • Patent number: 10062067
    Abstract: An object recognition device includes an operation unit configured to receive a user input about an item, a storage unit that stores image data of the item, an imaging unit configured to acquire an image of the item and generate image data therefrom, and a control unit configured to compare the generated image data with the stored image data, and cause information about updating the stored image data to be presented to a user, based on a comparison result.
    Type: Grant
    Filed: January 6, 2015
    Date of Patent: August 28, 2018
    Assignee: TOSHIBA TEC KABUSHIKI KAISHA
    Inventor: Masatsugu Fukuda
  • Patent number: 10032452
    Abstract: A system of multi-modal transmission of packetized data in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. Based on the input audio signal, a direct action application programming interface can generate a first action data structure, and a content selector component can select a content item. An interface management component can identify first and second candidate interfaces, and respective resource utilization values. The interface management component can select, based on the resource utilization values, the first candidate interface to present the content item. The interface management component can provide the first action data structure to the client computing device for rendering as audio output, and can transmit the content item converted for a first modality to deliver the content item for rendering from the selected interface.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: July 24, 2018
    Assignee: GOOGLE LLC
    Inventors: Gaurav Bhaya, Robert Stets
  • Patent number: 10013974
    Abstract: Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime portions of the FSTs may be decompressed for processing by an ASR engine.
    Type: Grant
    Filed: June 20, 2016
    Date of Patent: July 3, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Denis Sergeyevich Filimonov, Gautam Tiwari, Shaun Nidhiri Joseph, Ariya Rastrow
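The weight binning described in 10013974 above replaces each arc weight with an index into a small codebook, cutting the number of bits stored per weight. A sketch assuming a uniform 256-level codebook; the patent does not prescribe the binning scheme:

```python
import numpy as np

def bin_weights(weights, n_bins: int = 256):
    """Quantize FST arc weights into n_bins levels so each weight is stored as
    an 8-bit index into a codebook instead of a 32-bit float. Uniform binning
    over the observed range is an illustrative assumption."""
    w = np.asarray(weights, dtype=np.float32)
    centers = np.linspace(w.min(), w.max(), n_bins)                 # codebook
    codes = np.abs(w[:, None] - centers[None, :]).argmin(axis=1).astype(np.uint8)
    return codes, centers

def unbin_weights(codes, centers):
    """Decompress at runtime: look the approximate arc weight back up."""
    return centers[codes]

codes, centers = bin_weights([0.0, 1.7, 3.14, 2.99, 0.05])
print(unbin_weights(codes, centers))
```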
  • Patent number: 10013986
    Abstract: Systems and methods of voice activated thread management in a voice activated data packet based environment are provided. A natural language processor (“NLP”) component can receive and parse data packets comprising a first input audio signal to identify a first request and a first trigger keyword. A direct action application programming interface (“API”) can generate a first action data structure with a parameter defining a first action. The NLP component can receive and parse a second input audio signal to identify a second request and a second trigger keyword, and can generate a second action data structure with a parameter defining a second action. A pooling component can combine the first and second action data structures into a pooled data structure, and can transmit the pooled data structure to a service provider computing device to cause it to perform an operation defined by the pooled data structure.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: July 3, 2018
    Assignee: GOOGLE LLC
    Inventors: Gaurav Bhaya, Robert Stets
  • Patent number: 9984679
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user.
    Type: Grant
    Filed: July 18, 2016
    Date of Patent: May 29, 2018
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Andrej Ljolje, Diamantino Antonio Caseiro, Mazin Gilbert, Vincent Goffin, Taniya Mishra
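In 9984679 above, saliency values derived from human perception judgments act as per-word weights on the ASR model. A hedged sketch of applying such weights when re-scoring word hypotheses; the multiplicative form and the alpha exponent are assumptions, not the patent's stated mechanism:

```python
def apply_saliency(word_scores: dict, saliency: dict, alpha: float = 1.0) -> dict:
    """Re-weight ASR word scores by saliency values learned from human
    perception judgments of previous transcripts (assumed multiplicative form;
    alpha controls how strongly saliency modifies the base model)."""
    return {w: s * (saliency.get(w, 1.0) ** alpha) for w, s in word_scores.items()}

hypotheses = {"deductible": 0.40, "debt": 0.45}    # raw ASR scores for one slot
saliency = {"deductible": 1.5}                     # humans judge this word salient
print(apply_saliency(hypotheses, saliency))        # "deductible" now outranks "debt"
```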
  • Patent number: 9953634
    Abstract: Provided are methods and systems for passive training for automatic speech recognition. An example method includes utilizing a first, speaker-independent model to detect a spoken keyword or a key phrase in spoken utterances. While utilizing the first model, a second model is passively trained to detect the spoken keyword or the key phrase in the spoken utterances using at least partially the spoken utterances. The second, speaker dependent model may utilize deep neural network (DNN) or convolutional neural network (CNN) techniques. In response to completion of the training, a switch is made from utilizing the first model to utilizing the second model to detect the spoken keyword or the key phrase in spoken utterances. While utilizing the second model, parameters associated therewith are updated using the spoken utterances in response to detecting the keyword or the key phrase in the spoken utterances. User authentication functionality may be provided.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: April 24, 2018
    Assignee: Knowles Electronics, LLC
    Inventors: David Pearce, Brian Clark
  • Patent number: 9922647
    Abstract: A method for reducing response time in a speech interface including constructing a partially completed word sequence from a partially received utterance from a speaker received by an audio sensor, modeling a remainder portion using a processor based on a rich predictive model to predict the remainder portion, and responding to the partially completed word sequence and the predicted remainder portion using a natural language vocalization generator with a vocalization, wherein the vocalization is prepared before a complete utterance is received from the speaker and conveyed to the speaker by an audio transducer.
    Type: Grant
    Filed: January 29, 2016
    Date of Patent: March 20, 2018
    Assignee: International Business Machines Corporation
    Inventors: Gakuto Kurata, Tohru Nagano
  • Patent number: 9911437
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for improving speech recognition accuracy using textual context. The method includes retrieving a recorded utterance, capturing text from a device display associated with the spoken dialog and viewed by one party to the recorded utterance, and identifying words in the captured text that are relevant to the recorded utterance. The method further includes adding the identified words to a dynamic language model, and recognizing the recorded utterance using the dynamic language model. The recorded utterance can be a spoken dialog. A time stamp can be assigned to each identified word. The method can include adding identified words to and/or removing identified words from the dynamic language model based on their respective time stamps. A screen scraper can capture text from the device display associated with the recorded utterance. The device display can contain customer service data.
    Type: Grant
    Filed: May 4, 2016
    Date of Patent: March 6, 2018
    Assignee: Nuance Communications, Inc.
    Inventors: Dan Melamed, Srinivas Bangalore, Michael Johnston
  • Patent number: 9886946
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters are selected based at least on the likely context associated with the user. A context confidence score associated with the likely context is determined based on at least a portion of the context data. One or more language model biasing parameters are adjusted based at least on the context confidence score. A baseline language model is biased based at least on one or more of the adjusted language model biasing parameters. The baseline language model is provided for use by an automated speech recognizer (ASR).
    Type: Grant
    Filed: September 13, 2016
    Date of Patent: February 6, 2018
    Assignee: Google LLC
    Inventors: Pedro J. Moreno-Mengibar, Petar Aleksic
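The modulation step in 9886946 above scales how strongly context-specific biasing is applied according to the confidence in the inferred context. A sketch assuming multiplicative per-word boosts scaled linearly by the confidence score and renormalized; the abstract only states that the biasing parameters are adjusted based on that score:

```python
def bias_language_model(baseline_lm: dict, context_boosts: dict, confidence: float) -> dict:
    """Apply context boosts to a baseline LM, scaled by the context confidence
    score, then renormalize (multiplicative boosting is an assumption)."""
    biased = {w: p * (1.0 + confidence * context_boosts.get(w, 0.0))
              for w, p in baseline_lm.items()}
    total = sum(biased.values())
    return {w: p / total for w, p in biased.items()}

baseline = {"play": 0.2, "pause": 0.2, "navigate": 0.1, "call": 0.5}
driving_boosts = {"navigate": 2.0, "call": 0.5}    # likely context: driving
print(bias_language_model(baseline, driving_boosts, confidence=0.8))
```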
  • Patent number: 9875736
    Abstract: Systems and methods for pre-training a sequence tagger with unlabeled data, such as a hidden layered conditional random field model are provided. Additionally, systems and methods for transfer learning are provided. Accordingly, the systems and methods build more accurate, more reliable, and/or more efficient sequence taggers than previously utilized sequence taggers that are not pre-trained with unlabeled data and/or that are not capable of transfer learning/training.
    Type: Grant
    Filed: February 19, 2015
    Date of Patent: January 23, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Young-Bum Kim, Minwoo Jeong, Ruhi Sarikaya
  • Patent number: 9852729
    Abstract: Features are disclosed for spotting keywords in utterance audio data without requiring the entire utterance to first be processed. Likelihoods that a portion of the utterance audio data corresponds to the keyword may be compared to likelihoods that the portion corresponds to background audio (e.g., general speech and/or non-speech sounds). The difference in the likelihoods may be determined, and the keyword may be triggered when the difference exceeds a threshold, or shortly thereafter. Traceback information and other data may be stored during the process so that a second speech processing pass may be performed. For efficient management of system memory, traceback information may only be stored for those frames that may encompass a keyword; the traceback information for older frames may be overwritten by traceback information for newer frames.
    Type: Grant
    Filed: July 11, 2016
    Date of Patent: December 26, 2017
    Assignee: Amazon Technologies, Inc.
    Inventor: Bjorn Hoffmeister
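The trigger condition in 9852729 above compares keyword-model likelihoods against background-model likelihoods and fires when the difference exceeds a threshold. A minimal sketch; accumulating per-frame log-likelihoods over a window and the threshold value are illustrative assumptions:

```python
def keyword_triggered(keyword_loglikes, background_loglikes, threshold: float = 5.0) -> bool:
    """Trigger when the keyword model's log-likelihood over the current window
    exceeds the background model's by more than a threshold."""
    difference = sum(keyword_loglikes) - sum(background_loglikes)
    return difference > threshold

# Per-frame log-likelihoods for a short window of audio frames (illustrative).
kw = [-1.1, -0.9, -1.0, -0.8]     # window scored against the keyword model
bg = [-2.5, -2.4, -2.6, -2.3]     # same window scored against background audio
print(keyword_triggered(kw, bg))  # True: this window likely contains the keyword
```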
  • Patent number: 9852337
    Abstract: A method for assessing similarity of documents. The method includes extracting a reference document text from a reference document, extracting an archived document text from an archived document, and quantifying the reference document and the archived document. Quantifying the reference and archived documents includes tokenizing sentences of the reference document and archived document, respectively, and vectorizing the tokenized sentences to obtain a reference document text vector and an archived document text vector for each sentence of the reference and archived document, respectively. The method also includes determining a document similarity value of the quantified reference document and the quantified archived document.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: December 26, 2017
    Assignee: Open Text Corporation
    Inventors: Jeroen Mattijs van Rotterdam, Michael T Mohen, Chao Chen, Kun Zhao
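Patent 9852337 above quantifies documents by tokenizing and vectorizing their sentences before computing a document similarity value. A sketch using whitespace tokenization, bag-of-words vectors, and averaged best-match cosine similarity; all three concrete choices are assumptions the abstract does not fix:

```python
from collections import Counter
import math

def vectorize(sentence: str) -> Counter:
    """Bag-of-words vector for one sentence (whitespace tokenization assumed)."""
    return Counter(sentence.lower().split())

def cosine(u: Counter, v: Counter) -> float:
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def document_similarity(reference_text: str, archived_text: str) -> float:
    """Average each reference sentence's best cosine match against the archived
    document's sentences -- one assumed way to reduce per-sentence vectors to a
    single document similarity value."""
    ref = [vectorize(s) for s in reference_text.split(".") if s.strip()]
    arc = [vectorize(s) for s in archived_text.split(".") if s.strip()]
    return sum(max(cosine(r, a) for a in arc) for r in ref) / len(ref)

reference = "The parties agree to arbitration. Fees are split evenly."
archived = "Fees are split evenly between the parties. Disputes go to arbitration."
print(round(document_similarity(reference, archived), 2))
```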
  • Patent number: 9811765
    Abstract: Techniques for image captioning with weak supervision are described herein. In implementations, weak supervision data regarding a target image is obtained and utilized to provide detail information that supplements global image concepts derived for image captioning. Weak supervision data refers to noisy data that is not closely curated and may include errors. Given a target image, weak supervision data for visually similar images may be collected from sources of weakly annotated images, such as online social networks. Generally, images posted online include “weak” annotations in the form of tags, titles, labels, and short descriptions added by users. Weak supervision data for the target image is generated by extracting keywords for visually similar images discovered in the different sources. The keywords included in the weak supervision data are then employed to modulate weights applied for probabilistic classifications during image captioning analysis.
    Type: Grant
    Filed: January 13, 2016
    Date of Patent: November 7, 2017
    Assignee: Adobe Systems Incorporated
    Inventors: Zhaowen Wang, Quanzeng You, Hailin Jin, Chen Fang
  • Patent number: 9812129
    Abstract: A method for operating a motor vehicle operating device to carry out two operating steps with voice control. A first vocabulary, which is provided for the first operating step, is set in a speech recognition device. Based on the first set vocabulary, a first recognition result is generated and the first operating step is carried out. A second vocabulary, which is provided for the second operating step, is then set in the speech recognition device and a second speech input is received. A repetition recognition device recognizes, during or after the second speech input, a correction request of the user. The first operating step is then reversed for the device and the first vocabulary is reinstalled for the speech recognition device. The first operating step is repeated based on a second recognition result that is detected in dependence on a part of the second speech input.
    Type: Grant
    Filed: July 17, 2015
    Date of Patent: November 7, 2017
    Assignee: AUDI AG
    Inventors: Doreen Engelhardt, Jana Paulick, Kerstin Tellermann, Sarah Schadl
  • Patent number: 9792534
    Abstract: Techniques for image captioning with word vector representations are described. In implementations, instead of outputting results of caption analysis directly, the framework is adapted to output points in a semantic word vector space. These word vector representations reflect distance values in the context of the semantic word vector space. In this approach, words are mapped into a vector space and the results of caption analysis are expressed as points in the vector space that capture semantics between words. In the vector space, similar concepts will have small distance values. The word vectors are not tied to particular words or a single dictionary. A post-processing step is employed to map the points to words and convert the word vector representations to captions. Accordingly, conversion is delayed to a later stage in the process.
    Type: Grant
    Filed: January 13, 2016
    Date of Patent: October 17, 2017
    Assignee: Adobe Systems Incorporated
    Inventors: Zhaowen Wang, Quanzeng You, Hailin Jin, Chen Fang
  • Patent number: 9786283
    Abstract: A speech media transcription system comprises a playback device arranged to play back speech delimited in segments. The system is programmed to provide, for a segment being transcribed, an adaptive estimate of the proportion of the segment that has not been transcribed by a transcriber. The device is arranged to play back that proportion of the segment, optionally after having already played back the entire segment. Additionally, a segmentation engine is arranged to divide speech media into a plurality of segments by identifying speech as such and using timing information but without using a machine conversion of the speech media into text or a representation of text.
    Type: Grant
    Filed: March 26, 2013
    Date of Patent: October 10, 2017
    Assignee: JPAL LIMITED
    Inventor: John Richard Baker
  • Patent number: 9780887
    Abstract: Methods and systems for providing and processing data are disclosed. An example method can comprise determining a first weighted probability based on a probability of occurrence of a noise signal and a first likelihood ratio. The first likelihood ratio is based on a frequency distribution of the noise signal. An example method can comprise determining a second weighted probability based on a probability of non-occurrence of the noise signal and a second likelihood ratio. An example method can comprise determining a combination of the first weighted probability and the second weighted probability, and providing the combination to a decoder configured to decode a value based on the combination.
    Type: Grant
    Filed: February 19, 2016
    Date of Patent: October 3, 2017
    Assignee: Comcast Cable Communications, LLC
    Inventor: David Urban
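The combination step in 9780887 above is a two-term mixture: each likelihood ratio is weighted by the probability that the noise signal is, or is not, present, and the sum is handed to the decoder. A minimal sketch with illustrative numbers:

```python
def combined_metric(p_noise: float, lr_noise: float, lr_clean: float) -> float:
    """Weight the likelihood ratio under the noise-present hypothesis by the
    probability of occurrence of the noise signal, and the likelihood ratio
    under the noise-absent hypothesis by the probability of non-occurrence,
    then sum the two weighted probabilities."""
    return p_noise * lr_noise + (1.0 - p_noise) * lr_clean

# Impulse noise present about 10% of the time; the two likelihood ratios come
# from the respective frequency distributions (all values are illustrative).
metric = combined_metric(p_noise=0.1, lr_noise=0.4, lr_clean=3.2)
# The combination is what gets provided to the decoder to decode a value.
```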
  • Patent number: 9760559
    Abstract: Systems and processes for predictive text input are provided. In one example process, a text input can be received. The text input can be associated with an input context. A frequency of occurrence of an m-gram with respect to a subset of a corpus can be determined using a language model. The subset can be associated with a context. A weighting factor can be determined based on a degree of similarity between the input context and the context. A weighted probability of a predicted text given the text input can be determined based on the frequency of occurrence of the m-gram and the weighting factor. The m-gram can include at least one word in the text input and at least one word in the predicted text.
    Type: Grant
    Filed: May 22, 2015
    Date of Patent: September 12, 2017
    Assignee: Apple Inc.
    Inventors: Jannes Dolfing, Brent Ramerth, Douglas Davidson, Jerome Bellegarda, Jennifer Moore, Andreas Eminidis, Joshua Shaffer
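Patent 9760559 above weights an m-gram's frequency of occurrence within a context-tagged subset of the corpus by how closely that subset's context matches the input context. A sketch assuming the weighted probability is a simple product of the two factors; the abstract does not pin down the combination:

```python
def weighted_prediction(mgram_count: int, subset_size: int, context_similarity: float) -> float:
    """Weighted probability of a predicted text: the m-gram's frequency within
    a context-specific corpus subset, scaled by the degree of similarity
    between that context and the input context (product form assumed)."""
    frequency = mgram_count / subset_size
    return context_similarity * frequency

# "good morning" seen 40 times among 10,000 m-grams in a messaging-context
# subset whose context closely matches the current input context (0.9).
print(weighted_prediction(40, 10_000, 0.9))   # 0.0036
```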
  • Patent number: 9741337
    Abstract: In some embodiments, the present invention provides for an exemplary computer system which includes at least the following components: an adaptive self-trained computer engine programmed, during a training stage, to electronically receive initial speech audio data generated by a microphone of a computing device; dynamically segment the initial speech audio data and the corresponding initial text into a plurality of user phonemes; dynamically associate a plurality of first timestamps with the plurality of user-specific subject-specific phonemes; and, during a transcription stage, electronically receive to-be-transcribed speech audio data of at least one user; dynamically split the to-be-transcribed speech audio data into a plurality of to-be-transcribed speech audio segments; dynamically assign each timestamped to-be-transcribed speech audio segment to a particular core of the multi-core processor; and dynamically transcribe, in parallel, the plurality of timestamped to-be-transcribed speech audio segments.
    Type: Grant
    Filed: April 3, 2017
    Date of Patent: August 22, 2017
    Assignee: Green Key Technologies LLC
    Inventors: Tejas Shastry, Anthony Tassone, Patrick Kuca, Svyatoslav Vergun
  • Patent number: 9740979
    Abstract: Techniques relating to managing “bad” or “imperfect” data being imported into a database system are described herein. As an example, a lifecycle technology solution helps receive data from a variety of different data sources of a variety of known and/or unknown formats, standardize it, fit it to a known taxonomy through model-assisted classification, store it to a database in a manner that is consistent with the taxonomy, and allow it to be queried for a variety of different usages. Some or all of the disclosed technology concerning auto-classification, enrichment, clustering model and model stacks, and/or the like, may be used in these and/or other regards.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: August 22, 2017
    Assignee: XEEVA, INC.
    Inventors: Dilip Dubey, Dineshchandra Harikisan Rathi, Koushik Kumaraswamy
  • Patent number: 9741341
    Abstract: A speech processing method and arrangement are described. A dynamic noise adaptation (DNA) model characterizes a speech input reflecting effects of background noise. A null noise DNA model characterizes the speech input based on reflecting a null noise mismatch condition. A DNA interaction model performs Bayesian model selection and re-weighting of the DNA model and the null noise DNA model to realize a modified DNA model characterizing the speech input for automatic speech recognition and compensating for noise to a varying degree depending on relative probabilities of the DNA model and the null noise DNA model.
    Type: Grant
    Filed: January 20, 2015
    Date of Patent: August 22, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Steven J. Rennie, Pierre Dognin, Petr Fousek
  • Patent number: 9720976
    Abstract: An extracting method includes storing to a storage device: files that include character units; first index information indicating which file includes at least one character unit in a character unit group having a usage frequency less than a predetermined frequency and among character units having common information in a predetermined portion, the usage frequency indicating the extent of files having a given character unit; second index information indicating which file includes a first character unit having a usage frequency at least equal to the predetermined frequency and among the character units having common information in a predetermined portion; and referring to the first and second index information to extract a file having character units in the first and second index information, when a request is received for extraction of a file having the first character unit and a second character unit that is included in the character unit group.
    Type: Grant
    Filed: April 2, 2014
    Date of Patent: August 1, 2017
    Assignee: FUJITSU LIMITED
    Inventors: Masahiro Kataoka, Takahiro Murata, Takafumi Ohta
  • Patent number: 9690806
    Abstract: The disclosure provides a method and device for recommending a candidate word according to a geographic position. The method may include receiving a coded character string of a user by a computing device. The computing device may collect geographic position information corresponding to the coded character string, and then determine a geographic area in which the geographic position information is located. The computing device may obtain a geographic candidate word corresponding to the coded character string according to a geographic word stock of the determined geographic area. The geographic word stock of the geographic area may store the coded character strings and a corresponding geographic word according to the geographic area. As compared to current technologies, complexity of input can be reduced and intelligence of an input method can be improved.
    Type: Grant
    Filed: March 31, 2016
    Date of Patent: June 27, 2017
    Assignee: Alibaba Group Holding Limited
    Inventor: Maojian Fu
  • Patent number: 9685154
    Abstract: The technology of the present application provides a method and apparatus for managing resources for a system using voice recognition. The method and apparatus include maintaining a database of historical data regarding a plurality of users. The historical database maintains data regarding the training resources required for users to achieve an accuracy score using voice recognition. A resource calculation module determines from the historical data an expected amount of training resources necessary to train a new user to the accuracy score.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: June 20, 2017
    Assignee: nVoq Incorporated
    Inventor: Charles Corfield
  • Patent number: 9686300
    Abstract: A non-transitory computer readable storage medium including instructions that, when executed by a computing system, cause the computing system to perform operations. The operations include collecting, by a processing device, raw data regarding a user action. The operations also include converting, by the processing device, the raw data to characteristic test data (CTD), wherein the CTD represents behavior characteristics of a current user. The operations also include identifying, by the processing device, a characteristic model corresponding to the behavior characteristics represented by the CTD. The operations also include generating, by the processing device, a predictor from a comparison of the CTD against the corresponding characteristic model, wherein the predictor comprises a score indicating a probability that the user action came from an authenticated user.
    Type: Grant
    Filed: July 14, 2015
    Date of Patent: June 20, 2017
    Assignee: Akamai Technologies, Inc.
    Inventor: Sreenath Kurupati
  • Patent number: 9679561
    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating domain-specific speech recognition models for a domain of interest by combining and tuning existing speech recognition models when a speech recognizer does not have access to a speech recognition model for that domain of interest and when available domain-specific data is below a minimum desired threshold to create a new domain-specific speech recognition model. A system configured to practice the method identifies a speech recognition domain and combines a set of speech recognition models, each speech recognition model of the set of speech recognition models being from a respective speech recognition domain. The system receives an amount of data specific to the speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model, and tunes the combined speech recognition model for the speech recognition domain based on the data.
    Type: Grant
    Filed: March 28, 2011
    Date of Patent: June 13, 2017
    Assignee: Nuance Communications, Inc.
    Inventors: Srinivas Bangalore, Robert Bell, Diamantino Antonio Caseiro, Mazin Gilbert, Patrick Haffner
  • Patent number: 9679554
    Abstract: A system may determine text for inclusion in a voice corpus for use in text-to-speech (TTS) processing using an interface that allows multiple entities to connect and review potential text segments concurrently. The interface may allow networked communication with the system. Individual text segments may be approved or rejected by reviewing entities, such as proofreaders. The system may prioritize text segments to send to reviewers and may re-prioritize text segments based on a linguistic coverage of previously accepted text segments.
    Type: Grant
    Filed: June 23, 2014
    Date of Patent: June 13, 2017
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Michal Czuczman, Michal Grzegorz Kurpanik, Remus Razvan Mois
  • Patent number: 9652472
    Abstract: A service requirement analysis system includes a service provider database and an analysis server. The service provider database stores multiple service provider data entries, and is connected to a client device. The analysis server receives a service requirement string, and performs segmentation and filtering to obtain requirement keywords. The correlation values quantifying semantic relatedness between any two of the requirement keywords are calculated to construct a requirement keyword connected graph for dividing the requirement keywords into one or more requirement keyword groups associated with one or more concepts in the service requirement string. A semantic hierarchical structure of each of the requirement keyword groups is constructed for searching the service provider database to obtain service provider data entries matching the service requirement string. The matched entries are displayed on the client device.
    Type: Grant
    Filed: November 21, 2014
    Date of Patent: May 16, 2017
    Assignee: INSTITUTE FOR INFORMATION INDUSTRY
    Inventors: Chi-Hung Tsai, Ping-I Chen, Chun-Yen Chu, Shao-Hua Cheng
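The grouping step in 9652472 above can be read as finding connected components of a keyword graph whose edges are pairwise semantic-correlation values above a cutoff. A sketch under that reading; the threshold, the toy correlation function, and the depth-first traversal are implementation assumptions:

```python
def group_keywords(keywords: list, correlation, threshold: float = 0.5) -> list:
    """Build a keyword graph with an edge wherever pairwise semantic correlation
    exceeds the threshold, then return connected components as keyword groups."""
    adj = {k: set() for k in keywords}
    for i, a in enumerate(keywords):
        for b in keywords[i + 1:]:
            if correlation(a, b) > threshold:
                adj[a].add(b)
                adj[b].add(a)
    groups, seen = [], set()
    for k in keywords:
        if k in seen:
            continue
        stack, component = [k], set()
        while stack:                       # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        groups.append(component)
    return groups

def toy_correlation(a: str, b: str) -> float:
    # Hypothetical stand-in for the semantic relatedness measure.
    related = {frozenset({"hosting", "server"}), frozenset({"logo", "design"})}
    return 1.0 if frozenset({a, b}) in related else 0.0

print(group_keywords(["hosting", "server", "logo", "design"], toy_correlation))
# e.g. [{'hosting', 'server'}, {'logo', 'design'}]
```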
  • Patent number: 9626968
    Abstract: A method of operating a speech processing system is provided. The method includes translating a portion of a speech record into a plurality of possible words associated with a plurality of contexts, and determining a plurality of correctness values based on a plurality of probabilities that each of the plurality of possible words is correct for each of the plurality of contexts. The method also includes determining which of the plurality of possible words is a correct translation of the portion of the speech record based on the plurality of correctness values.
    Type: Grant
    Filed: January 12, 2015
    Date of Patent: April 18, 2017
    Assignee: VERINT SYSTEMS LTD.
    Inventor: Michael Brand
  • Patent number: 9626969
    Abstract: A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
    Type: Grant
    Filed: April 13, 2015
    Date of Patent: April 18, 2017
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: George Zavaliagkos, William F. Ganong, III, Uwe H. Jost, Shreedhar Madhavapeddi, Gary B. Clayton
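In 9626969 above, replacement words drawn from the user's personal data re-score the transcription so a personal word can overtake the ASR engine's choice. A sketch assuming a fixed multiplicative boost applied to an n-best candidate list; both choices are illustrative, not the patent's stated scoring:

```python
def rescore_with_personal_vocab(transcribed_word: str, candidates: dict,
                                personal_vocab: set, boost: float = 2.0) -> str:
    """Re-score ASR word candidates, boosting any candidate found in the user's
    personal vocabulary, and return the winner (possibly a replacement word)."""
    rescored = {w: s * (boost if w in personal_vocab else 1.0)
                for w, s in candidates.items()}
    return max(rescored, key=rescored.get)

candidates = {"jon": 0.45, "john": 0.55}          # ASR n-best scores for one word
personal = {"jon"}                                 # e.g. from the user's contacts
print(rescore_with_personal_vocab("john", candidates, personal))  # "jon"
```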
  • Patent number: 9626965
    Abstract: Efficient empirical determination, computation, and use of an acoustic confusability measure comprises: (1) an empirically derived acoustic confusability measure, comprising a means for determining the acoustic confusability between any two textual phrases in a given language, where the measure of acoustic confusability is empirically derived from examples of the application of a specific speech recognition technology, where the procedure does not require access to the internal computational models of the speech recognition technology, and does not depend upon any particular internal structure or modeling technique, and where the procedure is based upon iterative improvement from an initial estimate; (2) techniques for efficient computation of empirically derived acoustic confusability measure, comprising means for efficient application of an acoustic confusability score, allowing practical application to very large-scale problems; and (3) a method for using acoustic confusability measures to make principled
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: April 18, 2017
    Assignee: PROMPTU SYSTEMS CORPORATION
    Inventors: Harry Printz, Naren Chittar