Probability Patents (Class 704/240)
-
Patent number: 10957308
Abstract: Provided are a method and device to personalize a speech recognition model. The device personalizes a speech recognition model by identifying a language group corresponding to a user, and generating a personalized speech recognition model by applying a group scale matrix corresponding to the identified language group to at least one layer of a speech recognition model.
Type: Grant
Filed: August 31, 2018
Date of Patent: March 23, 2021
Assignee: Samsung Electronics Co., Ltd.
Inventors: Ki Soo Kwon, Inchul Song, YoungSang Choi
-
Patent number: 10943583
Abstract: A system to perform automatic speech recognition (ASR) using a dynamic language model. Portions of the language model can include a group of probabilities rather than a single probability. At runtime individual probabilities of the group are weighted and combined to create an adjusted probability for the portion of the language model. The adjusted probability can be used for ASR processing. The weights can be determined based on a characteristic of the utterance, for example an associated speechlet/application, the specific user speaking, or other characteristic. By applying the weights at runtime the system can use a single language model to dynamically adjust to different utterance conditions.
Type: Grant
Filed: March 23, 2018
Date of Patent: March 9, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Ankur Gandhe, Ariya Rastrow, Shaswat Pratap Shah
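The runtime weighting this abstract describes amounts to a convex combination of per-domain probabilities. A minimal sketch (the domain names, weights, and probability values below are invented for illustration, not taken from the patent):

```python
# Sketch: a language-model entry stores one probability per domain
# ("speechlet"); at runtime, utterance-specific weights combine them
# into a single adjusted probability used for ASR scoring.

def adjusted_probability(domain_probs, weights):
    """Combine per-domain probabilities with runtime weights.

    domain_probs: dict domain -> P(word | history, domain)
    weights: dict domain -> runtime weight (should sum to 1)
    """
    return sum(weights[d] * p for d, p in domain_probs.items())

# "play" is likelier in a music context than in a shopping context.
probs = {"music": 0.30, "shopping": 0.02}

music_heavy = adjusted_probability(probs, {"music": 0.9, "shopping": 0.1})
shopping_heavy = adjusted_probability(probs, {"music": 0.1, "shopping": 0.9})

assert music_heavy > shopping_heavy
```

Because only the weights change at runtime, one stored model can serve many utterance conditions.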
-
Patent number: 10943143
Abstract: Techniques are disclosed relating to scoring partial matches between words. In certain embodiments, a method may include receiving a request to determine a similarity between an input text data and a stored text data. The method also includes determining, based on comparing one or more words included in the input text data with one or more words included in the stored text data, a set of word pairs and a set of unpaired words. Further, in response to determining that the set of unpaired words passes elimination criteria, the method includes calculating a base similarity score between the input text data and the stored text data based on the set of word pairs. The method also includes determining a scoring penalty based on the set of unpaired words and generating a final similarity score between the input text data and the stored text data by modifying the base similarity score based on the scoring penalty.
Type: Grant
Filed: December 28, 2018
Date of Patent: March 9, 2021
Assignee: PAYPAL, INC.
Inventors: Rushik Upadhyay, Dhamodharan Lakshmipathy, Nandhini Ramesh, Aditya Kaulagi
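The pair-then-penalize flow above can be sketched in a few lines. The pairing rule (exact word match) and the penalty formula here are illustrative assumptions, not the patent's actual scoring:

```python
# Sketch: pair words between two texts, score the paired portion,
# then reduce the score by a penalty for each unpaired word.

def similarity(input_text, stored_text, penalty_per_word=0.1):
    input_words = input_text.lower().split()
    stored_words = stored_text.lower().split()

    # Pair words that appear in both texts; collect the leftovers.
    stored_pool = list(stored_words)
    pairs, unpaired = [], []
    for w in input_words:
        if w in stored_pool:
            stored_pool.remove(w)
            pairs.append((w, w))
        else:
            unpaired.append(w)
    unpaired += stored_pool

    # Base score: fraction of words that found a partner.
    total = len(input_words) + len(stored_words)
    base = 2 * len(pairs) / total if total else 0.0

    # Final score: base score minus a per-unpaired-word penalty.
    return max(0.0, base - penalty_per_word * len(unpaired))

assert similarity("acme corp", "acme corp") == 1.0
assert similarity("acme corp", "acme corp inc") < 1.0
```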
-
Patent number: 10937415
Abstract: There is provided an information processing device to further improve the operability of user interfaces that use a voice as an input, the information processing device including: an acquisition unit configured to acquire context information in a period for collection of a voice; and a control unit configured to cause a predetermined output unit to present a candidate for character information obtained by converting the voice in a mode in accordance with the context information.
Type: Grant
Filed: March 15, 2017
Date of Patent: March 2, 2021
Assignee: SONY CORPORATION
Inventors: Ayumi Kato, Shinichi Kawano, Yuhei Taki, Yusuke Nakagawa
-
Patent number: 10922990
Abstract: A display apparatus and a method for questions and answers are provided. The display apparatus includes: an input unit configured to receive a user's speech voice; a communication unit configured to perform data communication with an answer server; and a processor configured to create and display one or more question sentences using the speech voice in response to the speech voice being a word speech, create a question language corresponding to the question sentence selected from among the displayed one or more question sentences, transmit the created question language to the answer server via the communication unit, and, in response to one or more answer results related to the question language being received from the answer server, display the received one or more answer results. Accordingly, the display apparatus may provide an answer result appropriate to a user's question intention even when a non-sentence speech is input.
Type: Grant
Filed: May 23, 2019
Date of Patent: February 16, 2021
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Eun-sang Bak
-
Patent number: 10896293
Abstract: Provided is an information processing apparatus including a processing unit configured to determine, on a basis of a word of a predetermined unit selected in a text string indicated by text string information, another word connected to the selected word and included in the text string and to set a delimitation in the text string with regard to the selected word.
Type: Grant
Filed: April 19, 2017
Date of Patent: January 19, 2021
Assignee: SONY CORPORATION
Inventors: Yuhei Taki, Shinichi Kawano
-
Patent number: 10896681
Abstract: This document describes, among other things, a computer-implemented method for transcribing an utterance. The method can include receiving, at a computing system, speech data that characterizes an utterance of a user. A first set of candidate transcriptions of the utterance can be generated using a static class-based language model that includes a plurality of classes that are each populated with class-based terms selected independently of the utterance or the user. The computing system can then determine whether the first set of candidate transcriptions includes class-based terms. Based on whether the first set of candidate transcriptions includes class-based terms, the computing system can determine whether to generate a dynamic class-based language model that includes at least one class that is populated with class-based terms selected based on a context associated with at least one of the utterance and the user.
Type: Grant
Filed: December 29, 2015
Date of Patent: January 19, 2021
Assignee: Google LLC
Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
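The static-then-dynamic decision above can be sketched as a membership test over first-pass hypotheses. The class names, hypothesis strings, and contact list below are invented for illustration:

```python
# Sketch: if first-pass hypotheses from a static class-based model
# contain class terms (e.g. $CONTACT), build a dynamic model whose
# classes are populated from the user's context; otherwise skip it.

STATIC_CLASSES = {"$CONTACT", "$SONG"}

def mentions_class(hypotheses):
    """True if any hypothesis token is a known class symbol."""
    return any(tok in STATIC_CLASSES for hyp in hypotheses for tok in hyp.split())

def build_dynamic_classes(hypotheses, user_context):
    """Return per-class term lists only when a class term was hypothesized."""
    if not mentions_class(hypotheses):
        return None  # no dynamic model needed
    return {"$CONTACT": user_context.get("contacts", [])}

hyps = ["call $CONTACT", "call mom"]
model = build_dynamic_classes(hyps, {"contacts": ["mom", "alice"]})
assert model == {"$CONTACT": ["mom", "alice"]}
assert build_dynamic_classes(["play jazz"], {}) is None
```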
-
Patent number: 10885909
Abstract: A speech recognition method to be performed by a computer, the method including: detecting a first keyword uttered by a user from an audio signal representing voice of the user; detecting a term indicating a request of the user from sections that follow the first keyword in the audio signal; and determining a type of speech recognition processing applied to the following sections in accordance with the detected term indicating the request of the user.
Type: Grant
Filed: February 6, 2018
Date of Patent: January 5, 2021
Assignee: FUJITSU LIMITED
Inventors: Chikako Matsumoto, Naoshi Matsuo
-
Patent number: 10872613
Abstract: A method includes generating a synthesized non-reference high-band channel based on a non-reference high-band excitation corresponding to a non-reference target channel. The method further includes estimating one or more spectral mapping parameters based on the synthesized non-reference high-band channel and a high-band portion of the non-reference target channel. The method also includes applying the one or more spectral mapping parameters to the synthesized non-reference high-band channel to generate a spectrally shaped synthesized non-reference high-band channel. The method further includes generating an encoded bitstream based on the one or more spectral mapping parameters and the spectrally shaped synthesized non-reference high-band channel.
Type: Grant
Filed: November 4, 2019
Date of Patent: December 22, 2020
Assignee: QUALCOMM Incorporated
Inventors: Venkata Subrahmanyam Chandra Sekhar Chebiyyam, Venkatraman Atti
-
Patent number: 10867598
Abstract: A semantic analysis method, a semantic analysis system, and a non-transitory computer-readable medium are provided in this disclosure.
Type: Grant
Filed: December 10, 2018
Date of Patent: December 15, 2020
Assignee: INSTITUTE FOR INFORMATION INDUSTRY
Inventors: Yu-Shian Chiu, Wei-Jen Yang
-
Patent number: 10847137
Abstract: An approach to speech recognition, and in particular trigger word detection, replaces fixed feature extraction from waveform samples with a neural network (NN). For example, rather than computing Log Frequency Band Energies (LFBEs), a convolutional neural network is used. In some implementations, this NN waveform processing is combined with a trained secondary classification that makes use of phonetic segmentation of a possible trigger word occurrence.
Type: Grant
Filed: December 12, 2017
Date of Patent: November 24, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Arindam Mandal, Nikko Strom, Kenichi Kumatani, Sankaran Panchapagesan
-
Patent number: 10847147
Abstract: Automatic speech recognition systems can benefit from cues in user voice such as hyperarticulation. Traditional approaches typically attempt to define and detect an absolute state of hyperarticulation, which is very difficult, especially on short voice queries. This disclosure provides an approach for hyperarticulation detection using pair-wise comparisons on a real-world speech recognition system. The disclosed approach uses delta features extracted from a pair of repetitive user utterances. The improvements provided by the disclosed systems and methods include improvements in word error rate by using hyperarticulation information as a feature in a second pass N-best hypotheses rescoring setup.
Type: Grant
Filed: May 24, 2019
Date of Patent: November 24, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Ranjitha Gurunath Kulkarni, Ahmed Moustafa El Kholy, Ziad Al Bawab, Noha Alon, Imed Zitouni
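The delta features above come from comparing a repeated utterance against the original rather than judging either in isolation. A rough sketch, where the feature names ("duration_s", "energy") and thresholds are illustrative assumptions:

```python
# Sketch: compute per-feature deltas between a pair of repeated
# utterances, then apply a simple heuristic for hyperarticulation
# (a slower, louder repetition suggests the user is over-enunciating).

def delta_features(first, second):
    """Per-feature difference between the repetition and the original."""
    return {k: second[k] - first[k] for k in first}

def looks_hyperarticulated(delta, duration_gain=0.2, energy_gain=0.1):
    return delta["duration_s"] > duration_gain and delta["energy"] > energy_gain

first = {"duration_s": 1.0, "energy": 0.5}
second = {"duration_s": 1.4, "energy": 0.8}
assert looks_hyperarticulated(delta_features(first, second))
```

In the patented system such a signal would feed a second-pass rescorer rather than a hard threshold like this one.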
-
Patent number: 10841411
Abstract: Systems, methods, and devices for establishing communications sessions with contacts are disclosed. In some embodiments, a first request may be received from a first device. The first request may be to communicate with a contact name. A user account associated with the first device may then be identified, and a contact list associated with the user account may be accessed to determine contacts associated with the contact name. Based on the contact list, a first contact and a second contact associated with the contact name may be identified. It may be determined, from memory, that the first contact is a first preferred contact. However, based on an intervening event, the second contact, rather than the preferred contact, may be selected for communicating with the contact.
Type: Grant
Filed: November 9, 2017
Date of Patent: November 17, 2020
Assignee: Amazon Technologies, Inc.
Inventor: Aparna Nandyal
-
Patent number: 10839796
Abstract: Multi-turn conversation systems that are personalized to a user based on insights derived from big data are described. A method includes: receiving, by a computer device, input from a user; obtaining, by the computer device, insights about the user; generating, by the computer device, a response based on the insights and the input; and outputting, by the computer device, the response.
Type: Grant
Filed: December 15, 2017
Date of Patent: November 17, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Faried Abrahams, Lalit Agarwalla, Gandhi Sivakumar
-
Patent number: 10832664
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language models using domain-specific model components. In some implementations, context data for an utterance is obtained. A domain-specific model component is selected from among multiple domain-specific model components of a language model based on the non-linguistic context of the utterance. A score for a candidate transcription for the utterance is generated using the selected domain-specific model component and a baseline model component of the language model that is domain-independent. A transcription for the utterance is determined using the score, and the transcription is provided as output of an automated speech recognition system.
Type: Grant
Filed: August 21, 2017
Date of Patent: November 10, 2020
Assignee: Google LLC
Inventors: Fadi Biadsy, Diamantino Antionio Caseiro
-
Patent number: 10832658
Abstract: A method, program product and computer system to predict utterances in a dialog system includes receiving a set of utterances associated with a dialog between a client device and a dialog system, mapping the utterances to vector representations of the utterances, and identifying at least one cluster to which the utterances belong from among a plurality of possible clusters. A next cluster is predicted based upon a conditional probability of the next cluster following a set of a predetermined number of previous clusters using a language model. A next utterance is predicted from among a plurality of possible utterances within the predicted next cluster.
Type: Grant
Filed: March 8, 2018
Date of Patent: November 10, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Chulaka Gunasekara, David Nahamoo, Lazaros Polymenakos, Kshitij Fadnis, David Echeverria Ciaurri, Jatin Ganhotra
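Predicting the next cluster from a fixed number of previous clusters is essentially an n-gram model over cluster labels. A minimal sketch, with invented dialog data and cluster names:

```python
# Sketch: count how often each cluster follows each history of
# `order` previous clusters, then predict the most frequent successor.

from collections import Counter, defaultdict

def train(cluster_sequences, order=2):
    counts = defaultdict(Counter)
    for seq in cluster_sequences:
        for i in range(order, len(seq)):
            history = tuple(seq[i - order:i])
            counts[history][seq[i]] += 1
    return counts

def predict_next(counts, history):
    history = tuple(history)
    if history not in counts:
        return None  # unseen history
    return counts[history].most_common(1)[0][0]

dialogs = [["greet", "ask_balance", "give_balance"],
           ["greet", "ask_balance", "give_balance"],
           ["greet", "ask_balance", "escalate"]]
model = train(dialogs, order=2)
assert predict_next(model, ["greet", "ask_balance"]) == "give_balance"
```

In the patent, the next utterance is then chosen from within the predicted cluster; here only the cluster-level step is shown.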
-
Patent number: 10810472
Abstract: Techniques are provided for performing sentiment analysis on words in a first data set. An example embodiment includes generating a word embedding model including a first plurality of features. A value indicating sentiment for the words in the first data set can be determined using a convolutional neural network (CNN). A second plurality of features are generated based on bigrams identified in the data set. The bigrams can be generated using a co-occurrence graph. The model is updated to include the second plurality of features, and sentiment analysis can be performed on a second data set using the updated model.
Type: Grant
Filed: May 10, 2018
Date of Patent: October 20, 2020
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Michael Malak, Mark L. Kreider
-
Patent number: 10789946
Abstract: Systems and methods are provided for speech recognition. An example method may be implementable by a server. The method may comprise adding a key phrase into a dictionary comprising a plurality of dictionary phrases, and for each one or more of the dictionary phrases, obtaining a first probability that the dictionary phrase is after the key phrase in a phrase sequence. The key phrase and the dictionary phrase may each comprise one or more words. The first probability may be independent of the key phrase.
Type: Grant
Filed: December 27, 2019
Date of Patent: September 29, 2020
Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
Inventor: Chen Huang
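One simple way to make the probability of a phrase following a new key phrase independent of that key phrase is to back off to the phrase's context-free unigram probability. The sketch below takes that reading; the phrases and counts are invented for illustration:

```python
# Sketch: estimate unigram probabilities over dictionary phrases and
# use them as P(phrase | key phrase) for a newly added key phrase,
# so the value does not depend on which key phrase precedes it.

def unigram_probs(corpus_phrases):
    total = len(corpus_phrases)
    counts = {}
    for p in corpus_phrases:
        counts[p] = counts.get(p, 0) + 1
    return {p: c / total for p, c in counts.items()}

corpus = ["north gate", "south gate", "north gate", "main road"]
probs = unigram_probs(corpus)

key_phrase = "didi plaza"  # newly added, unseen in training data
# P("north gate" | key_phrase) is approximated by P("north gate"):
assert probs["north gate"] == 0.5
```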
-
Patent number: 10713519
Abstract: The present invention is directed towards providing automated workflows for the identification of a reading order from text segments extracted from a document. Ordering the text segments is based on trained natural language models. In some embodiments, the workflows are enabled to perform a method for identifying a sequence associated with a portable document. The method includes iteratively generating a probabilistic language model, receiving the portable document, and selectively extracting features (such as but not limited to text segments) from the document. The method may generate feature pairs from the extracted features. The method may further generate a score for each of the pairs based on the probabilistic language model and determine an order of the features based on the scores. The method may provide the extracted features in the determined order.
Type: Grant
Filed: June 22, 2017
Date of Patent: July 14, 2020
Assignee: ADOBE INC.
Inventors: Trung Huu Bui, Hung Hai Bui, Shawn Alan Gaither, Walter Wei-Tuh Chang, Michael Frank Kraley, Pranjal Daga
-
Patent number: 10714080
Abstract: A weighted finite-state transducer (WFST) decoding system is provided. The WFST decoding system includes a memory that stores WFST data and a WFST decoder including a data fetch logic. The WFST data has a structure including states, and arcs connecting the states with directivity. The WFST data is compressed in the memory. The WFST data includes body data, and header data including state information for each state, aligned discontinuously. The body data includes arc information of the arcs, aligned continuously. The state information includes an arc index of the arcs, a number of the arcs, and compression information of the arcs, and the data fetch logic de-compresses the WFST data using the compression information, and retrieves the WFST data from the memory.
Type: Grant
Filed: September 8, 2017
Date of Patent: July 14, 2020
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Jae Sung Yoon, Jun Seok Park
-
Patent number: 10713289
Abstract: Systems, methods, and devices for performing interactive question answering using data source credibility and conversation entropy are disclosed. A speech-controlled device captures audio including a spoken question, and sends audio data corresponding thereto to a server(s). The server(s) performs speech processing on the audio data, and determines various stored data that can be used to determine an answer to the question. The server(s) determines which stored data to use based on the credibility of the source from which the stored data was received. The server(s) may also determine a number of user interactions needed to obtain data in order to fully answer the question and may select a question for a dialog soliciting further data based on the number of user interactions.
Type: Grant
Filed: March 31, 2017
Date of Patent: July 14, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Nina Mishra, Yonatan Naamad
-
Patent number: 10714122
Abstract: Speech or non-speech detection techniques are discussed and include updating a speech pattern model using probability scores from an acoustic model to generate a score for each state of the speech pattern model, such that the speech pattern model includes a first non-speech state having multiple self loops each associated with a non-speech probability score of the probability scores, a plurality of speech states following the first non-speech state, and a second non-speech state following the speech states, and detecting speech based on a comparison of a score of the first non-speech state and a score of the last speech state of the multiple speech states.
Type: Grant
Filed: June 6, 2018
Date of Patent: July 14, 2020
Assignee: Intel Corporation
Inventors: Maciej Muchlinski, Tobias Bocklet
-
Patent number: 10706852
Abstract: The described technology provides arbitration between speech recognition results generated by different automatic speech recognition (ASR) engines, such as ASR engines trained according to different language or acoustic models. The system includes an arbitrator that selects between a first speech recognition result representing an acoustic utterance as transcribed by a first ASR engine and a second speech recognition result representing the acoustic utterance as transcribed by a second ASR engine. This selection is based on a set of confidence features that is initially used by the first ASR engine or the second ASR engine to generate the first and second speech recognition results.
Type: Grant
Filed: November 13, 2015
Date of Patent: July 7, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventors: Kshitiz Kumar, Hosam Khalil, Yifan Gong, Ziad Al-Bawab, Chaojun Liu
-
Patent number: 10699712
Abstract: An information processing method and an electronic device are provided. The method includes an electronic device obtaining input information through a second collection manner when the electronic device is in a speech collection state for obtaining speech information through a first collection manner, and determining a logic boundary position in relation to first speech information in accordance with the input information; the first speech information is obtained by the electronic device through the first collection manner, which is different from the second collection manner. An electronic device corresponding thereto is also disclosed.
Type: Grant
Filed: March 4, 2015
Date of Patent: June 30, 2020
Assignee: LENOVO (BEIJING) CO., LTD.
Inventors: Haisheng Dai, Zhepeng Wang
-
Patent number: 10699696
Abstract: The present disclosure provides a method and apparatus for correcting a speech recognition error based on artificial intelligence, and a storage medium, wherein the method comprises: obtaining a second speech recognition result of a second speech query input by the user; performing error-correcting intention recognition according to the second speech recognition result; extracting error-correcting information from the second speech recognition result when it is determined that the user has an error-correcting intention; screening error-correcting resources according to the error-correcting information, and using a selected best-matched error-correcting resource to perform error correction for the first speech recognition result, the first speech recognition result being a speech recognition result of a first speech query which is input before the second speech query.
Type: Grant
Filed: May 22, 2018
Date of Patent: June 30, 2020
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Hehan Li, Wensong He
-
Patent number: 10649850
Abstract: Techniques and systems for storing and retrieving data storage devices of a data storage system are disclosed. In some embodiments, inventory holders are used to store data storage devices used by a data storage system. When data is to be transacted with the data storage devices, mobile drive units locate appropriate inventory holders and transport them to a device reading station, where an appropriate device retrieval unit transacts the data. In some embodiments, each inventory holder includes a heterogeneous mix of data storage device types, the layout of which may be calculated according to the specific mix allocated to a given inventory holder. After the data has been transacted, the data storage devices are returned to the appropriate inventory holders, and the inventory holders are placed by the mobile drive units in locations where they may be accessed in response to further data transactions.
Type: Grant
Filed: June 29, 2015
Date of Patent: May 12, 2020
Assignee: Amazon Technologies, Inc.
Inventors: James Raymond Allard, Paul David Franklin, Samuel Rubin Barrett, Jeremiah Brazeau, Jeffrey Allen Dzado, James Caleb Kirschner, David Levy, Brent James Lutz, Andrew Brendan Tinka, Colin Laird Lazier
-
Patent number: 10652592
Abstract: Methods and systems are disclosed for enriching a viewing experience of a user watching video content on a screen of a client terminal by increasing the relevance of additional media content proposed or provided to the user. Disambiguation of named entities detected in a video content item being played is performed by identifying and accessing an information source directly associated with the video content item, and/or by analyzing visual content of a segment of the video content item. Selecting, proposing and/or providing an additional media content item is based on the information source and/or on the analyzing.
Type: Grant
Filed: March 25, 2018
Date of Patent: May 12, 2020
Assignee: Comigo Ltd.
Inventors: Guy Geva, Menahem Lasser
-
Patent number: 10643621
Abstract: An electronic device is provided. The electronic device includes a processor configured to perform automatic speech recognition (ASR) on a speech input by using a speech recognition model that is stored in a memory and a communication module configured to provide the speech input to a server and receive a speech instruction, which corresponds to the speech input, from the server. The electronic device may perform different operations according to a confidence score of a result of the ASR. In addition, various other embodiments contemplated in the specification are possible.
Type: Grant
Filed: September 11, 2018
Date of Patent: May 5, 2020
Assignee: Samsung Electronics Co., Ltd.
Inventors: Seok Yeong Jung, Kyung Tae Kim
-
Patent number: 10621282
Abstract: A computer-implemented method for providing agent-assisted transcriptions of user utterances. A user utterance is received in response to a prompt provided to the user at a remote client device. An automatic transcription is generated from the utterance using a language model based upon an application or context, and presented to a human agent. The agent reviews the transcription and may replace at least a portion of the transcription with a corrected transcription. As the agent inputs the corrected transcription, accelerants comprising suggested text to be inputted are presented to the agent. The accelerants may be determined based upon an agent input, an application or context of the transcription, the portion of the transcription being replaced, or any combination thereof. In some cases, the user provides textual input, to which the agent transcribes an intent associated with the input with the aid of one or more accelerants.
Type: Grant
Filed: April 26, 2018
Date of Patent: April 14, 2020
Assignee: Interactions LLC
Inventors: Ethan Selfridge, Michael Johnston, Robert Lifgren, James Dreher, John Leonard
-
Patent number: 10622009
Abstract: A system configured to improve double-talk detection. The system detects when double-talk is present in a voice conversation using two or more speaker models. The system extracts feature data from microphone audio data and compares the feature data to each speaker model. For example, the system may generate a first distance score indicating a likelihood that the feature data corresponds to a far-end speaker model and a second distance score indicating a likelihood that the feature data corresponds to a universal speaker model. The system may determine current system conditions based on the distance scores and may change settings to improve speech quality during the voice conversation. For example, during far-end single-talk the system may aggressively reduce an echo signal, whereas during near-end single-talk and double-talk the system may apply minimal echo cancellation to improve a quality of the local speech.
Type: Grant
Filed: September 10, 2018
Date of Patent: April 14, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Xianxian Zhang, Philip Ryan Hilmes, Trausti Thor Kristjansson
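The distance-score comparison above can be sketched with toy speaker models. Real systems compare statistical models over acoustic features; here each "model" is just a mean feature vector, and the margin and feature values are invented for illustration:

```python
# Sketch: classify the current frame by comparing its distance to a
# far-end speaker model versus a universal speaker model, then pick
# echo-cancellation behavior accordingly.

def distance(features, model):
    """Euclidean distance between a feature vector and a model mean."""
    return sum((f - m) ** 2 for f, m in zip(features, model)) ** 0.5

def classify(features, far_end_model, universal_model, margin=0.5):
    d_far = distance(features, far_end_model)
    d_uni = distance(features, universal_model)
    if d_far + margin < d_uni:
        return "far-end single-talk"   # aggressively reduce echo
    if d_uni + margin < d_far:
        return "near-end single-talk"  # minimal echo cancellation
    return "double-talk"               # minimal echo cancellation

far_end = [0.0, 0.0]
universal = [1.0, 1.0]
assert classify([0.1, 0.0], far_end, universal) == "far-end single-talk"
assert classify([1.0, 0.9], far_end, universal) == "near-end single-talk"
```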
-
Patent number: 10614121
Abstract: Content from multiple different stations can be divided into segments based on time. Matched segments associated with each station can be identified by comparing content included in a first segment associated with a first station, to content included in a second segment associated with a second station. Syndicated content can be identified and tagged based, at least in part, on a relationship between sequences of matched segments on different stations. Various embodiments also include identifying main sequences associated with each station under consideration, removing some of the main sequences, and consolidating remaining main sequences based on various threshold criteria.
Type: Grant
Filed: May 26, 2015
Date of Patent: April 7, 2020
Assignee: IHEARTMEDIA MANAGEMENT SERVICES, INC.
Inventors: Periklis Beltas, Philippe Generali, David C. Jellison, Jr.
-
Patent number: 10593329
Abstract: A system of multi-modal transmission of packetized data in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. Based on the input audio signal, a direct action application programming interface can generate a first action data structure, and a content selector component can select a content item. An interface management component can identify first and second candidate interfaces, and respective resource utilization values. The interface management component can select, based on the resource utilization values, the first candidate interface to present the content item. The interface management component can provide the first action data structure to the client computing device for rendering as audio output, and can transmit the content item converted for a first modality to deliver the content item for rendering from the selected interface.
Type: Grant
Filed: July 18, 2018
Date of Patent: March 17, 2020
Assignee: Google LLC
Inventors: Gaurav Bhaya, Robert Stets, Umesh Patil
-
Patent number: 10573300
Abstract: The invention provides a method of automatic speech recognition. The method includes receiving a speech signal, dividing the speech signal into time windows, for each time window determining acoustic parameters of the speech signal within that window, and identifying phonological features from the acoustic parameters, such that a sequence of phonological features is generated for the speech signal, separating the sequence of phonological features into a sequence of zones, and comparing the sequence of zones to lexical entries, each comprising a sequence of phonological segments, in a stored lexicon to identify one or more words in the speech signal.
Type: Grant
Filed: August 22, 2018
Date of Patent: February 25, 2020
Assignee: Oxford University Innovation Limited
Inventors: Aditi Lahiri, Henning Reetz, Philip Roberts
-
Patent number: 10535342
Abstract: Techniques and systems are disclosed for context-dependent speech recognition. The techniques and systems described enable accurate recognition of speech by accessing sub-libraries associated with the context of the speech to be recognized. These techniques translate audible input into audio data at a smart device and determine context for the speech, such as location-based, temporal-based, recipient-based, and application-based context. The smart device then accesses a context-dependent library to compare the audio data with phrase-associated translation data in one or more sub-libraries of the context-dependent library to determine a match. In this way, the techniques allow access to a large quantity of phrases while reducing incorrect matching of the audio data to translation data caused by organizing the phrases into context-dependent sub-libraries.
Type: Grant
Filed: April 10, 2017
Date of Patent: January 14, 2020
Assignee: Microsoft Technology Licensing, LLC
Inventor: Christian Liensberger
-
Patent number: 10535348
Abstract: A system of multi-modal transmission of packetized data in a voice activated data packet based computer network environment is provided. A natural language processor component can parse an input audio signal to identify a request and a trigger keyword. Based on the input audio signal, a direct action application programming interface can generate a first action data structure, and a content selector component can select a content item. An interface management component can identify first and second candidate interfaces, and respective resource utilization values. The interface management component can select, based on the resource utilization values, the first candidate interface to present the content item. The interface management component can provide the first action data structure to the client computing device for rendering as audio output, and can transmit the content item converted for a first modality to deliver the content item for rendering from the selected interface.
Type: Grant
Filed: July 18, 2018
Date of Patent: January 14, 2020
Assignee: Google LLC
Inventors: Gaurav Bhaya, Robert Stets
-
Patent number: 10529322
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for tagging during speech recognition. A word lattice that indicates probabilities for sequences of words in an utterance is obtained. A conditional probability transducer that indicates a frequency that sequences of both the words and semantic tags for the words appear is obtained. The word lattice and the conditional probability transducer are composed to construct a word lattice that indicates probabilities for sequences of both the words in the utterance and the semantic tags for the words. The word lattice that indicates probabilities for sequences of both the words in the utterance and the semantic tags for the words is used to generate a transcription that includes the words in the utterance and the semantic tags for the words.
Type: Grant
Filed: August 21, 2017
Date of Patent: January 7, 2020
Assignee: Google LLC
Inventors: Petar Aleksic, Michael D. Riley, Pedro J. Moreno Mengibar, Leonid Velikovich
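The composition above pairs each word sequence's lattice probability with the probability of a tag sequence for those words. Real systems compose weighted finite-state transducers (e.g. with OpenFst); the dict-based version below is only an illustrative stand-in, with invented words, tags, and probabilities:

```python
# Sketch: score (word, tag) sequences by multiplying the word
# lattice's sequence probability by the conditional probability of
# the tag sequence given those words, then pick the best.

word_lattice = {
    ("call", "mom"): 0.6,
    ("call", "tom"): 0.4,
}

# Stand-in for the conditional probability transducer: a lookup of
# P(tags | words) for each tagged word sequence.
tag_given_words = {
    (("call", "ACTION"), ("mom", "CONTACT")): 0.9,
    (("call", "ACTION"), ("tom", "CONTACT")): 0.7,
}

def tagged_scores():
    scores = {}
    for words_tags, p_tag in tag_given_words.items():
        words = tuple(w for w, _ in words_tags)
        if words in word_lattice:
            scores[words_tags] = word_lattice[words] * p_tag
    return scores

scores = tagged_scores()
best = max(scores, key=scores.get)
assert best == (("call", "ACTION"), ("mom", "CONTACT"))
```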
-
Patent number: 10518607
Abstract: A sound of a second vehicle engine is detected. Upon predicting a pollution event by comparing the detected sound to a stored sound model, a countermeasure is actuated in a first vehicle.
Type: Grant
Filed: August 28, 2017
Date of Patent: December 31, 2019
Assignee: FORD GLOBAL TECHNOLOGIES, LLC
Inventors: Howard E. Churchwell, II, Mahmoud Yousef Ghannam
-
Patent number: 10496758
Abstract: According to one embodiment, a machine translation apparatus includes a circuitry and a memory. The circuitry is configured to input a sentence of a first language, to segment the sentence to obtain a plurality of phrases, to search a translation model for translation options of a second language of each of the plurality of phrases, and to select top N translation options with high probabilities for decoding. N is an integer equal to or larger than 1. Furthermore, the circuitry is configured to combine the top N translation options of the plurality of phrases to obtain a plurality of translation hypotheses, to search user history phrase pairs for the translation hypotheses, and to increase a score of a translation hypothesis existing in the user history phrase pairs. The memory is configured to store the score of the translation hypothesis.
Type: Grant
Filed: August 31, 2017
Date of Patent: December 3, 2019
Assignee: Kabushiki Kaisha Toshiba
Inventors: Zhengshan Xue, Dakun Zhang, Jichong Guo, Jie Hao
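The history-based rescoring step described above can be sketched as boosting the score of any hypothesis whose phrase pairs appear in the user's history. The boost value, the history contents, and the data shapes here are all assumptions for illustration.

```python
# Hypothetical user history of (source phrase, target phrase) pairs.
USER_HISTORY = {("ni hao", "hello"), ("shi jie", "world")}
BOOST = 0.2  # assumed per-pair score increase

def score_hypothesis(phrase_pairs, base_score):
    """Increase the hypothesis score for each phrase pair found in history."""
    boost = sum(BOOST for pair in phrase_pairs if pair in USER_HISTORY)
    return base_score + boost
```

A hypothesis containing one remembered pair would thus outrank an otherwise equally scored hypothesis containing none.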
-
Patent number: 10496693
Abstract: Systems and methods provide for classification and ranking of features for a hierarchical dataset. A hierarchical schema of features from the dataset is accessed. A hierarchical rank is assigned to each feature based on its schema level in the hierarchical schema. Additionally, a semantic rank is assigned to each feature using a semantic model having ranked semantic contexts. The semantic rank of a feature is assigned by identifying a semantic context of the feature and assigning the rank of the semantic context as the semantic rank of the feature. A rank is computed for each feature as a function of its hierarchical rank and semantic rank.
Type: Grant
Filed: May 31, 2016
Date of Patent: December 3, 2019
Assignee: ADOBE INC.
Inventors: Shiladitya Bose, Wei Zhang, Arvind Heda
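The final step of this abstract, computing a rank as a function of hierarchical rank and semantic rank, can be sketched with a weighted average. The feature names, rank values, and the particular combining function are assumptions; the patent only requires "a function of" both ranks.

```python
def final_rank(hier_rank: int, sem_rank: int, alpha: float = 0.5) -> float:
    """Combine hierarchical and semantic ranks with a weighted average."""
    return alpha * hier_rank + (1 - alpha) * sem_rank

# Toy features: name -> (schema level, rank of matched semantic context).
features = {
    "visitor.country": (2, 1),
    "visitor.session.id": (3, 4),
}

# Lower combined rank = more important feature.
ranked = sorted(features, key=lambda f: final_rank(*features[f]))
```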
-
Patent number: 10475442
Abstract: A method and a device for recognition, and a method and a device for constructing a recognition model are disclosed. A device for constructing a recognition model includes a training data inputter configured to receive additional training data, a model learner configured to train a first recognition model constructed based on basic training data to learn the additional training data, and a model constructor configured to construct a final recognition model by integrating the first recognition model with a second recognition model generated by the training of the first recognition model.
Type: Grant
Filed: October 24, 2016
Date of Patent: November 12, 2019
Assignee: Samsung Electronics Co., Ltd.
Inventor: Ho Shik Lee
-
Patent number: 10453460
Abstract: Systems and methods for determining that artificial commands, in excess of a threshold value, are detected by multiple voice activated electronic devices are described herein. In some embodiments, numerous voice activated electronic devices may send audio data representing a phrase to a backend system at substantially the same time. Text data representing the phrase, and counts for instances of that text data, may be generated. If the number of counts exceeds a predefined threshold, the backend system may stop any remaining response generation for that particular command in excess of the predefined threshold, and return those devices to a sleep state. In some embodiments, a sound profile unique to the phrase that caused the excess of the predefined threshold may be generated such that future instances of the same phrase may be recognized before text data is generated, conserving the backend system's resources.
Type: Grant
Filed: March 30, 2016
Date of Patent: October 22, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Colin Wills Wightman, Naresh Narayanan, Daniel Robert Rashid
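The counting-and-threshold check described above can be sketched by tallying identical transcriptions received in the same time window and flagging any phrase whose count exceeds a predefined threshold (e.g. a command broadcast over television to many devices at once). The threshold value and data shapes are assumptions.

```python
from collections import Counter

THRESHOLD = 3  # assumed predefined threshold

def artificial_phrases(transcriptions):
    """Return phrases whose count in the window exceeds the threshold."""
    counts = Counter(transcriptions)
    return {phrase for phrase, n in counts.items() if n > THRESHOLD}

# Five devices report the same phrase at substantially the same time.
window = ["alexa order pizza"] * 5 + ["what time is it"]
```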
-
Patent number: 10417329
Abstract: A dialog act estimation method includes acquiring learning data including a first sentence to be estimated, in the form of text data of a first uttered sentence uttered at a first time point; a second sentence, which is text data of a second uttered sentence uttered, at a time point before the first time point, successively with the first uttered sentence; act information indicating an act associated with the first sentence; property information indicating a property associated with the first sentence; and dialog act information indicating a dialog act in the form of a combination of an act and a property associated with the first sentence. The method further includes making a particular model learn three or more tasks at the same time using the learning data, and storing a result of the learning as learning result information in a memory.
Type: Grant
Filed: August 1, 2017
Date of Patent: September 17, 2019
Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
Inventors: Takashi Ushio, Hongjie Shi, Mitsuru Endo, Katsuyoshi Yamagami
-
Patent number: 10417328
Abstract: Methods and processes evaluate a quality score of a text. The text includes a plurality of words. The methods compute first probability characteristics of groups of words in a reference text which is known to be a high-quality text. The methods also compute second probability characteristics of groups of words in a text to be scored. The methods also compute the quality score based on a difference between the first probability characteristics and the second probability characteristics.
Type: Grant
Filed: January 5, 2018
Date of Patent: September 17, 2019
Assignee: Searchmetrics GmbH
Inventors: Ahmet Anil Pala, Alexander Kagoshima, Marcus Tober
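The scoring idea above can be sketched by comparing word-group frequency distributions of a reference text and a candidate text. Using bigrams and total variation distance here is an assumption; the patent only specifies "a difference between the first probability characteristics and the second probability characteristics".

```python
from collections import Counter

def bigram_dist(text):
    """Normalized bigram frequencies of a text (a toy 'probability characteristic')."""
    words = text.lower().split()
    grams = Counter(zip(words, words[1:]))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}

def quality_score(reference, candidate):
    """1.0 when the two bigram distributions match; lower as they diverge."""
    p, q = bigram_dist(reference), bigram_dist(candidate)
    tv = 0.5 * sum(abs(p.get(g, 0) - q.get(g, 0)) for g in set(p) | set(q))
    return 1.0 - tv
```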
-
Patent number: 10409909
Abstract: Techniques are disclosed for building a dictionary of words from combinations of symbols generated based on input data. A neuro-linguistic behavior recognition system includes a neuro-linguistic module that generates a linguistic model that describes data input from a source (e.g., video data, SCADA data, etc.). To generate words for the linguistic model, a lexical analyzer component in the neuro-linguistic module receives a stream of symbols, each symbol generated based on an ordered stream of normalized vectors generated from input data. The lexical analyzer component determines words from combinations of the symbols based on a hierarchical learning model having one or more levels. Each level indicates a length of the words to be identified at that level. Statistics are evaluated for the words identified at each level. The lexical analyzer component identifies one or more of the words having statistical significance.
Type: Grant
Filed: December 12, 2014
Date of Patent: September 10, 2019
Assignee: Omni AI, Inc.
Inventors: Gang Xu, Ming-Jung Seow, Tao Yang, Wesley Kenneth Cobb
-
Patent number: 10409910
Abstract: Techniques are disclosed for generating a syntax for a neuro-linguistic model of input data obtained from one or more sources. A stream of words of a dictionary built from a sequence of symbols is received. The symbols are generated from an ordered stream of normalized vectors generated from input data. Statistics for combinations of words co-occurring in the stream are evaluated. The statistics include a frequency upon which the combinations of words co-occur. A model of combinations of words based on the evaluated statistics is updated. The model identifies statistically relevant words. A connected graph is generated. Each node in the connected graph represents one of the words in the stream. Edges connecting the nodes represent a probabilistic relationship between words in the stream. Phrases are identified based on the connected graph.
Type: Grant
Filed: December 12, 2014
Date of Patent: September 10, 2019
Assignee: Omni AI, Inc.
Inventors: Ming-Jung Seow, Gang Xu, Tao Yang, Wesley Kenneth Cobb
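The co-occurrence statistics step above can be sketched by counting adjacent word pairs in the stream and keeping pairs above a frequency cutoff as candidate phrase edges for the connected graph. The cutoff value and the restriction to adjacent pairs are assumptions.

```python
from collections import Counter

def candidate_phrases(stream, min_count=2):
    """Adjacent word pairs that co-occur at least min_count times."""
    pairs = Counter(zip(stream, stream[1:]))
    return {pair for pair, n in pairs.items() if n >= min_count}

# Toy word stream from a lexical analyzer.
stream = ["engine", "start", "engine", "start", "door", "open"]
```

Each surviving pair would become an edge between two word nodes, weighted by its co-occurrence frequency.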
-
Patent number: 10372737
Abstract: According to one embodiment, a method, computer system, and computer program product for retraining a classifier-based automatic dialog system with recorded user interactions is provided. The present invention may include receiving recorded interactions, where the interactions are between a user and an automatic dialog system; determining, based on the recorded interactions, whether to pair a given input with one or more classes; pairing inputs with one or more classes; assessing the reliability of the paired inputs and classes; determining whether the reliable paired inputs and classes can be consistently mapped; and merging all consistently mapped reliable pairs with an initial training set.
Type: Grant
Filed: November 16, 2017
Date of Patent: August 6, 2019
Assignee: International Business Machines Corporation
Inventors: Allen Ginsberg, Edward G. Katz, Alexander C. Tonetti
-
Patent number: 10375224
Abstract: A mobile device providing integrated management of message information and service provision through artificial intelligence is disclosed. The mobile device includes an integrated message management unit comprising a message monitoring unit configured to monitor voice call information and text message information in association with the voice call management part and the text message management part, a message information managing unit configured to generate integrated message information, which is to be provided to a user, based on the voice call information and the text message information, an interface managing unit configured to generate an integrated message management user interface displaying the integrated message information, and an artificial intelligence agent analyzing the voice call information and the text message information and providing a service associated with at least one additional function in association with the additional function process part based on the analyzed result.
Type: Grant
Filed: March 28, 2018
Date of Patent: August 6, 2019
Assignee: NHN Entertainment Corporation
Inventor: Dong Wook Kim
-
Patent number: 10366159
Abstract: A system for identifying address components includes an interface and a processor. The interface is to receive an address for parsing. The processor is to determine a matching model of a set of models based at least in part on a matching probability for each model for a tokenized address, which is based on the address for parsing, and associate each component of the tokenized address with an identifier based at least in part on the matching model, wherein each component of the set of components is associated with an identifier, and wherein probabilities of each component of the set of components are determined using training addresses.
Type: Grant
Filed: October 14, 2016
Date of Patent: July 30, 2019
Assignee: Workday, Inc.
Inventors: Parag Avinash Namjoshi, Shuangshuang Jiang, Mohammad Sabah
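The model-matching step described above can be sketched as scoring the tokenized address under each candidate model and letting the best-scoring model label the components. The toy models and their scoring rules below are invented for illustration; the patent says only that the matching model is chosen by matching probability.

```python
# Toy per-model scorers returning an assumed matching probability.
def score_us(tokens):
    # Crude heuristic: a trailing 5-digit token suggests a US ZIP code.
    return 0.9 if tokens[-1].isdigit() and len(tokens[-1]) == 5 else 0.1

def score_uk(tokens):
    # Crude heuristic: UK postcodes contain letters.
    return 0.8 if any(c.isalpha() for c in tokens[-1]) else 0.1

MODELS = {"us": score_us, "uk": score_uk}

def pick_model(tokens):
    """Return the name of the model with the highest matching probability."""
    return max(MODELS, key=lambda name: MODELS[name](tokens))

tokens = ["1", "Main", "St", "Pleasanton", "94566"]
```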
-
Patent number: 10360898
Abstract: A system and method are presented for predicting speech recognition performance using accuracy scores in speech recognition systems within the speech analytics field. A keyword set is selected. Figure of Merit (FOM) is computed for the keyword set. Relevant features that describe the word individually and in relation to other words in the language are computed. A mapping from these features to FOM is learned. This mapping can be generalized via a suitable machine learning algorithm and be used to predict FOM for a new keyword. In at least one embodiment, the predicted FOM may be used to adjust internals of the speech recognition engine to achieve consistent behavior for all inputs for various settings of confidence values.
Type: Grant
Filed: June 5, 2018
Date of Patent: July 23, 2019
Inventors: Aravind Ganapathiraju, Yingyi Tan, Felix Immanuel Wyss, Scott Allen Randal
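The learned mapping from keyword features to FOM can be sketched with a linear model standing in for the "suitable machine learning algorithm". The features, weights, and bias below are made up for illustration; in practice the weights would be learned from keywords with known FOM.

```python
# Assumed learned weights for two toy keyword features.
WEIGHTS = {"n_chars": 0.02, "n_vowels": 0.05}
BIAS = 0.4

def keyword_features(word):
    """Simple per-word features (length and vowel count)."""
    return {"n_chars": len(word), "n_vowels": sum(c in "aeiou" for c in word)}

def predict_fom(word):
    """Predict Figure of Merit for a new keyword from its features."""
    f = keyword_features(word)
    return BIAS + sum(WEIGHTS[k] * f[k] for k in WEIGHTS)
```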
-
Patent number: 10360904
Abstract: Methods and apparatus for performing speech recognition using a garbage model. The method comprises receiving audio comprising speech and processing at least some of the speech using a garbage model to produce a garbage speech recognition result. The garbage model includes a plurality of sub-words, each of which corresponds to a possible combination of phonemes in a particular language.
Type: Grant
Filed: May 9, 2014
Date of Patent: July 23, 2019
Assignee: Nuance Communications, Inc.
Inventors: Cosmin Popovici, Kenneth W. D. Smith, Petrus C. Cools