Abstract: Methods and apparatus to perform windowed sliding transforms are disclosed. An example apparatus includes a transformer to transform a first block of time-domain samples of an input signal into a first frequency-domain representation based on a second frequency-domain representation of a second block of time-domain samples of the input signal, and a windower to apply a third frequency-domain representation of a time-domain window function to the first frequency-domain representation.
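The recursive structure this abstract describes can be sketched as a sliding DFT: the spectrum of each new block is derived from the previous block's spectrum, and a window is then applied directly in the frequency domain. This is a minimal illustration, not the patented implementation; the function names and the choice of a Hann window are assumptions.

```python
import numpy as np

def sliding_dft(x, N):
    """Yield the DFT of each length-N block of x, computing each new
    spectrum recursively from the previous one (drop x[t], add x[t+N])."""
    twiddle = np.exp(2j * np.pi * np.arange(N) / N)
    X = np.fft.fft(x[:N])                 # full transform of the first block only
    yield X.copy()
    for t in range(len(x) - N):
        X = (X - x[t] + x[t + N]) * twiddle
        yield X.copy()

def hann_in_freq(X):
    """Apply a (periodic) Hann window in the frequency domain: time-domain
    multiplication becomes a 3-term circular convolution of the spectrum."""
    return 0.5 * X - 0.25 * (np.roll(X, 1) + np.roll(X, -1))
```

Because the Hann window has only three nonzero DFT coefficients, windowing costs a fixed three multiply-adds per bin regardless of block length, which is what makes frequency-domain windowing attractive in a sliding transform.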
Abstract: A system for translation from a first human language to a second language including one or more processors and one or more non-transitory memory units coupled to said one or more processors storing computer readable program instructions, wherein the computer readable program instructions configure the one or more processors to perform the steps of: receive an input representation of information in the first language, convert the input representation of information in the first language to one or more sets of one or more marked-lemma dependency trees (MDTs), convert said one or more sets of one or more MDTs to a representation of information in said second language, and output said representation of information in said second language, wherein the MDTs are represented in a mathematically-equivalent or isomorphic memory structure using one of word embeddings, sense embeddings, tree kernels, capsules, pose vectors, embeddings, and vectorizations.
Abstract: According to an example, with respect to artificial intelligence based service implementation, a voice call may be analyzed to generate voice data. The voice data may be converted to text data, which may be analyzed to identify keywords. Based on an analysis of the identified keywords, a user of a plurality of users may be identified. A user assistance flow of a plurality of user assistance flows that corresponds to a determined intent of the identified user may be ascertained. The voice call may be transferred to a digital assistant that may provide artificial intelligence based assistance to the identified user based on the user assistance flow that corresponds to the determined intent.
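The keyword-to-intent routing step this abstract outlines can be illustrated with a small sketch. The keyword tables, flow names, and scoring rule below are hypothetical assumptions for illustration only; a production system would use trained models rather than literal keyword sets.

```python
# Hypothetical keyword tables and assistance flows (illustrative only).
INTENT_KEYWORDS = {
    "billing": {"invoice", "charge", "refund"},
    "tech_support": {"error", "crash", "password"},
}
ASSISTANCE_FLOWS = {
    "billing": ["verify identity", "look up invoice", "offer refund options"],
    "tech_support": ["verify identity", "collect error details", "run diagnostics"],
}

def route_call(transcript_text):
    """Map keywords found in a call transcript to an intent and its
    user assistance flow; return (None, []) when nothing matches."""
    words = set(transcript_text.lower().split())
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    intent = max(scores, key=scores.get)
    if scores[intent] == 0:
        return None, []        # no keywords matched: escalate to a human agent
    return intent, ASSISTANCE_FLOWS[intent]
```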
Abstract: Provided are a speech recognition training processing method and an apparatus for performing the same. The speech recognition training processing method includes acquiring multi-talker mixed speech sequence data corresponding to a plurality of speakers, encoding the multi-talker mixed speech sequence data into embedded sequence data, generating speaker-specific context vectors at each frame based on the embedded sequence data, generating senone posteriors for each of the speakers based on the speaker-specific context vectors, and updating an acoustic model by performing permutation invariant training (PIT) based on the senone posteriors.
Type:
Grant
Filed:
July 31, 2018
Date of Patent:
June 30, 2020
Assignee:
TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
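The core of permutation invariant training in the abstract above is scoring every assignment of model output streams to reference speakers and optimizing the best one. A minimal sketch of that loss, using mean-squared error as a stand-in for the patent's senone-posterior criterion:

```python
from itertools import permutations
import numpy as np

def pit_loss(outputs, targets):
    """Permutation-invariant loss: try every pairing of output streams to
    reference speakers and keep the one with the lowest error."""
    n = len(outputs)
    best, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # average error under this output-to-speaker assignment
        loss = sum(np.mean((outputs[i] - targets[p]) ** 2)
                   for i, p in enumerate(perm)) / n
        if loss < best:
            best, best_perm = loss, perm
    return best, best_perm
```

The exhaustive search is factorial in the number of speakers, which is acceptable for the two- or three-talker mixtures typical of this setting.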
Abstract: A computer system associates one or more actions with an emoji. The computer system detects a selection of the emoji within an electronic communication by a user. In response to the detecting the selection of the emoji within the electronic communication, the computer system initiates performance of at least one action of the one or more actions based on determining that one or more contextual factors associated with the electronic communication satisfy a set of conditions associated with the at least one action.
Abstract: A computer selects a test set of sentences from among sentences applied to train a whole sentence recurrent neural network language model to estimate the likelihood that each whole sentence processed by natural language processing is correct. The computer generates imposter sentences from the test set of sentences by substituting one word in each sentence of the test set. The computer generates, through the whole sentence recurrent neural network language model, a first score for each sentence of the test set and at least one additional score for each of the imposter sentences. The computer evaluates the accuracy of the natural language processing system in performing sequential classification tasks based on how well the first score reflects a correct sentence and the at least one additional score reflects an incorrect sentence.
Type:
Grant
Filed:
August 23, 2019
Date of Patent:
June 23, 2020
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION
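The imposter-generation step in the abstract above, substituting exactly one word per sentence, can be sketched directly. The vocabulary source and sampling strategy below are illustrative assumptions; the patent does not specify how replacement words are chosen.

```python
import random

def make_imposters(sentence, vocabulary, n=3, seed=0):
    """Generate n imposter sentences, each differing from the original
    in exactly one word position."""
    rng = random.Random(seed)
    words = sentence.split()
    imposters = []
    for _ in range(n):
        i = rng.randrange(len(words))
        # pick a replacement distinct from the word being substituted
        replacement = rng.choice([w for w in vocabulary if w != words[i]])
        imposters.append(" ".join(words[:i] + [replacement] + words[i + 1:]))
    return imposters
```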
Abstract: Techniques are described herein for enabling an automated assistant to adjust its behavior depending on a detected vocabulary level or other vocal characteristics of an input utterance provided to an automated assistant. The estimated vocabulary level or other vocal characteristics may be used to influence various aspects of a data processing pipeline employed by the automated assistant. In some implementations, one or more tolerance thresholds associated with, for example, grammatical tolerances or vocabulary tolerances, may be adjusted based on the estimated vocabulary level or vocal characteristics of the input utterance.
Type:
Grant
Filed:
April 24, 2019
Date of Patent:
June 9, 2020
Assignee:
GOOGLE LLC
Inventors:
Pedro Gonnet Anders, Victor Carbune, Daniel Keysers, Thomas Deselaers, Sandro Feuz
Abstract: Electronic natural language processing in a natural language processing (NLP) system, such as a Question-Answering (QA) system. A computer receives electronic text input, in question form, and determines a readability level indicator in the question. The readability level indicator includes at least one of a grammatical error, a slang term, and a misspelling. The computer determines a readability level for the electronic text input based on the readability level indicator, and retrieves candidate answers based on the readability level.
Type:
Grant
Filed:
June 19, 2019
Date of Patent:
May 26, 2020
Assignee:
International Business Machines Corporation
Inventors:
Donna K. Byron, Devendra Goyal, Lakshminarayanan Krishnamurthy, Priscilla Santos Moraes, Michael C. Smith
Abstract: In one embodiment, a domain-name based framework implemented in a digital assistant ecosystem uses domain names as unique identifiers for request types, requesting entities, responders, and target entities embedded in a natural language request. Further, the framework enables interpreting natural language requests according to domain ontologies associated with different responders. A domain ontology operates as a keyword dictionary for a given responder and defines the keywords and corresponding allowable values to be used for request types and request parameters. The domain-name based framework thus enables the digital assistant to interact with any responder that supports a domain ontology to generate precise and complete responses to natural language based requests.
Type:
Grant
Filed:
December 12, 2017
Date of Patent:
May 26, 2020
Assignee:
VERISIGN, INC.
Inventors:
Andrew Fregly, Burton S. Kaliski, Jr., Swapneel Sheth
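The ontology-as-keyword-dictionary idea in the abstract above can be illustrated with a small validation sketch. The domain, request types, and parameter values below are hypothetical examples, and the patent's framework keys ontologies by domain name rather than a Python dictionary.

```python
# Hypothetical domain ontology for an illustrative pizza-ordering responder.
ONTOLOGY = {
    "request_types": {"order", "status"},
    "parameters": {
        "size": {"small", "medium", "large"},
        "topping": {"cheese", "mushroom", "pepperoni"},
    },
}

def validate_request(request_type, params, ontology=ONTOLOGY):
    """Check a parsed natural-language request against a responder's
    domain ontology: both the request type and every parameter value
    must be among the ontology's allowable values."""
    if request_type not in ontology["request_types"]:
        return False
    for key, value in params.items():
        allowed = ontology["parameters"].get(key)
        if allowed is None or value not in allowed:
            return False
    return True
```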
Abstract: A dialog content is generated using information that is unique to a user and information that is not unique. The processing executed by a dialog system includes a step of identifying a person based on a dialog with a user, a step of acquiring personal information, a step of analyzing the dialog, a step of extracting an event, a step of searching for a local episode and a global episode based on the personal information and the event, a step of generating dialog data using the search result, a step of outputting a dialog, and a step of accepting user evaluation.
Abstract: Exemplary embodiments relate to techniques to classify or detect the intent of content written in a language for which a classifier does not exist. These techniques involve building a code-switching corpus via machine translation, generating a universal embedding for words in the code-switching corpus, training a classifier on the universal embeddings to generate an embedding mapping/table, accessing new content written in a language for which a specific classifier may not exist, and mapping entries in the embedding mapping/table to the universal embeddings. Using these techniques, a classifier can be applied to the universal embedding without needing to be trained on a particular language. Exemplary embodiments may be applied to recognize similarities in two content items, make recommendations, find similar documents, perform deduplication, and perform topic tagging for stories in foreign languages.
Abstract: A system for extracting verifiable entities from a user-utterance received on an automated calling service is provided. The system may include a receiver configured to receive a user-utterance, a processor and a non-transitory computer-readable media comprising computer-executable instructions. The processor may be configured to execute the instructions, which canonicalize the user-utterance into a plurality of tokens, determine the number of tokens in the user-utterance, and generate, using a sliding-window protocol, a comprehensive number of n-gram sequences from the user-utterance. The processor may be configured to process a plurality of threads of execution that may include a series of actions executed on the n-gram sequences to identify and extract verified entities from the user-utterance.
Type:
Grant
Filed:
August 13, 2018
Date of Patent:
May 12, 2020
Assignee:
Bank of America Corporation
Inventors:
Viju Kothuvatiparambil, Maruthi Z. Shanmugam, Donatus Asumu
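The canonicalization and sliding-window n-gram step in the abstract above can be sketched as follows. The lowercase-and-split canonicalization is an illustrative assumption standing in for whatever normalization the patented system applies.

```python
def ngram_sequences(utterance, max_n=3):
    """Canonicalize an utterance into tokens, then emit every contiguous
    n-gram up to max_n via a sliding window over the token list."""
    tokens = utterance.lower().strip().split()   # simple canonicalization
    grams = []
    for n in range(1, min(max_n, len(tokens)) + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams
```

A length-T utterance yields T − n + 1 sequences for each n, so the comprehensive set stays small enough to fan out across the parallel threads of execution the abstract describes.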
Abstract: A robotic system for processing input, such as text data provided through a messaging system, spoken language data provided through a microphone, or any other such input data, which may function to process the input so as to be able to respond or reply to a user based on comprehension of the input sentences. An automated theorem prover (ATP) may operate as an underlying framework for the AI system that understands and responds to spoken or written statements translated into a proper format. An ATP formatter may be used to translate natural language processing (NLP) output from a NLP syntactical sentence parser into the proper format, such that the ATP system may be able to generate and populate an ontology from the NLP output. User queries may be mapped to this ontology in order to facilitate comprehension. If desired, the system may automatically populate the ontology through Internet searching.
Abstract: Disclosed is an LPC residual signal encoding/decoding apparatus of an MDCT-based unified voice and audio encoding device. The LPC residual signal encoding apparatus analyzes a property of an input signal, selects an encoding method for the LPC filtered signal, and encodes the LPC residual signal based on one of a real filterbank, a complex filterbank, and algebraic code excited linear prediction (ACELP).
Type:
Grant
Filed:
August 4, 2017
Date of Patent:
April 14, 2020
Assignee:
Electronics and Telecommunications Research Institute
Inventors:
Seung Kwon Beack, Tae Jin Lee, Min Je Kim, Kyeongok Kang, Dae Young Jang, Jin Woo Hong, Jeongil Seo, Chieteuk Ahn, Hochong Park, Young-cheol Park
Abstract: Systems and methods for determining whether a first electronic device detects a media item that is to be output by a second electronic device are described herein. In some embodiments, an individual may request, using a first electronic device, that a media item be played on a second electronic device. The backend system may send first audio data representing a first response to the first electronic device, along with instructions to delay outputting the first response, as well as to continue sending audio data of additional audio captured thereby. The backend system may also send second audio data representing a second response to the second electronic device along with the media item. Text data may be generated representing the captured audio, which may then be compared with text data representing the second response to determine whether or not they match.
Abstract: The present invention is directed to apparatus and methods for decoding Higher Order Ambisonics (HOA) audio signals. HOA audio signals may be decompressed based on perceptual decoding to determine at least an HOA representation corresponding to the HOA audio signals. A rotated transform may be determined based on a rotation of a spherical sample grid. A rotated HOA representation may be determined based on the rotated transform and the HOA representation. The rotated HOA representation may be rendered to output to a loudspeaker setup.
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating author vectors. One of the methods includes obtaining a set of sequences of words, the set of sequences of words comprising a plurality of first sequences of words and, for each first sequence of words, a respective second sequence of words that follows the first sequence of words, wherein each first sequence of words and each second sequence of words has been classified as being authored by a first author; and training a neural network system on the first sequences and the second sequences to determine an author vector for the first author, wherein the author vector characterizes the first author.
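The joint training of an author vector alongside a predictive model, as described in the abstract above, can be illustrated with a deliberately tiny sketch. This toy uses bag-of-words features and a linear predictor trained by gradient descent; the patent's system uses a neural sequence model, so everything below, including the function name and hyperparameters, is an illustrative assumption.

```python
import numpy as np

def train_author_vector(first_seqs, second_seqs, vocab, dim=4,
                        lr=0.05, epochs=200, seed=0):
    """Toy sketch: jointly learn an author vector `a` and linear weights
    that predict the bag-of-words of each second sequence from the
    corresponding first sequence. Returns (author_vector, loss_history)."""
    rng = np.random.default_rng(seed)
    index = {w: i for i, w in enumerate(vocab)}

    def bow(seq):
        v = np.zeros(len(vocab))
        for w in seq.split():
            v[index[w]] += 1.0
        return v

    X = np.array([bow(s) for s in first_seqs])   # inputs
    Y = np.array([bow(s) for s in second_seqs])  # targets
    a = rng.standard_normal(dim) * 0.01          # the author vector
    W = rng.standard_normal((len(vocab), len(vocab))) * 0.01
    U = rng.standard_normal((len(vocab), dim)) * 0.01
    losses = []
    for _ in range(epochs):
        pred = X @ W.T + a @ U.T                 # linear predictions
        err = pred - Y
        losses.append(0.5 * float(np.mean(err ** 2)))
        W -= lr * err.T @ X / len(X)             # gradient steps
        U -= lr * np.outer(err.mean(axis=0), a)
        a -= lr * err.mean(axis=0) @ U
    return a, losses
```

Even in this toy form, the author vector is updated by the same loss gradients as the model weights, which is the essential idea: the vector absorbs whatever about the sequence pairs is explained by authorship.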
Abstract: Embodiments of systems, apparatuses, and/or methods are disclosed for automatic speech imitation. An apparatus may include a machine learner to perform an analysis of tagged data that is to be generated based on a speech pattern and/or a speech context behavior in media content. The machine learner may further generate, based on the analysis, a trained speech model that is to be applied to the media content to transform speech data to mimic data. The apparatus may further include a data analyzer to perform an analysis of the speech pattern, the speech context behavior, and/or the tagged data. The data analyzer may further generate, based on the analysis, a programmed speech rule that is to be applied to transform the speech data to the mimic data.