Abstract: Some implementations relate to performing speech biasing, NLU biasing, and/or other biasing based on historical assistant interaction(s). It can be determined, for one or more given historical interactions of a given user, whether to affect future biasing for (1) the given user account, (2) additional user account(s), and/or (3) the shared assistant device as a whole. Some implementations disclosed herein additionally and/or alternatively relate to: determining, based on utterance(s) of a given user to a shared assistant device, an association of first data and second data; storing the association as accessible to a given user account of the given user; and determining whether to store the association as also accessible by additional user account(s) and/or the shared assistant device.
Abstract: A feature vector may be extracted from each frame of input digitized microphone audio data. The feature vector may include a power value for each frequency band of a plurality of frequency bands. A feature history data structure, including a plurality of feature vectors, may be formed. A normalized feature set that includes a normalized feature data structure may be produced by determining normalized power values for a plurality of frequency bands of each feature vector of the feature history data structure. A signal recognition or modification process may be based, at least in part, on the normalized feature data structure.
Abstract: In some aspects, systems and methods for rapidly building, managing, and sharing machine learning models are provided. Managing the lifecycle of machine learning models can include: receiving a set of unannotated data; requesting annotations of samples of the unannotated data to produce an annotated set of data; building a machine learning model based on the annotated set of data; deploying the machine learning model to a client system, wherein production annotations are generated; collecting the generated production annotations and generating a new machine learning model incorporating the production annotations; and selecting one of the machine learning model built based on the annotated set of data or the new machine learning model.
Abstract: Some implementations relate to performing speech biasing, NLU biasing, and/or other biasing based on historical assistant interaction(s). It can be determined, for one or more given historical interactions of a given user, whether to affect future biasing for (1) the given user account, (2) additional user account(s), and/or (3) the shared assistant device as a whole. Some implementations disclosed herein additionally and/or alternatively relate to: determining, based on utterance(s) of a given user to a shared assistant device, an association of first data and second data; storing the association as accessible to a given user account of the given user; and determining whether to store the association as also accessible by additional user account(s) and/or the shared assistant device.
Abstract: A method and apparatus for building a conversation understanding system based on artificial intelligence, a device and a computer-readable storage medium. In embodiments of the present disclosure, it is feasible to obtain the training feedback information provided by conversation service conducted by the user and the basic conversation understanding system, then according to the training feedback information, perform adjustment processing for a service state of the basic conversation understanding system, to obtain an adjustment state of the basic conversation understanding system. It is possible to perform data merging processing according to the training feedback information and the adjustment state of the basic conversation understanding system, to obtain model training data for building the model conversation understanding system.
Type:
Grant
Filed:
June 12, 2018
Date of Patent:
August 15, 2023
Assignee:
BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors:
Ke Sun, Shiqi Zhao, Dianhai Yu, Haifeng Wang
Abstract: A machine learning system includes a learning section and an operating section including a memory. The operating section holds a required accuracy, and an internal state and a weight value of a learner in the memory and executes calculation processing by using data input to the machine learning system and the weight value held in the memory to update the internal state. An accuracy of the internal state is calculated from a result of the calculation processing and an evaluation value is calculated using the data input to the machine learning system, the weight value, and the updated internal state held in the memory when the calculated accuracy is higher than the required accuracy. The evaluation value is transmitted to the learning section, which updates the weight value by using the evaluation value and notifies the number of times of updating the weight value to the operating section.
Abstract: Some implementations relate to performing speech biasing, NLU biasing, and/or other biasing based on historical assistant interaction(s). It can be determined, for one or more given historical interactions of a given user, whether to affect future biasing for (1) the given user account, (2) additional user account(s), and/or (3) the shared assistant device as a whole. Some implementations disclosed herein additionally and/or alternatively relate to: determining, based on utterance(s) of a given user to a shared assistant device, an association of first data and second data; storing the association as accessible to a given user account of the given user; and determining whether to store the association as also accessible by additional user account(s) and/or the shared assistant device.
Abstract: A system and method for adapting a language model to a specific environment by receiving interactions captured the specific environment, generating a collection of documents from documents retrieved from external resources, detecting in the collection of documents terms related to the environment that are not included in an initial language model and adapting the initial language model to include the terms detected.
Abstract: A model adaptation device includes a text database that stores a plurality of sentences containing predetermined phonemes; a sentence list that includes a plurality of sentences that describe the contents of the input voice; an input unit to which the input voice is input; a model adaptation unit that performs the model adaptation using the input voice and the sentence list and outputs adapting characteristic information, which is for making the model approximate to the input voice; a statistic database that stores the adapting characteristic information; a distance calculation unit that outputs a value of an acoustic distance between the adapting characteristic information and the model for each phoneme; a phoneme detection unit that outputs a distance value, among the distance values, which is greater than a threshold value as a detection result; and a label generation unit that extracts from the text database a sentence containing a phoneme associated with the detection result and outputs the sentence.
Abstract: A method for performing speech recognition relating to an object for the purpose of affecting automatic processing of the object by a processing system. The object carries information with at least a character string of processing information. The character string spoken by an operator is processed by way of a speech recognition procedure to generate a first result. Based on the need for more information of an element of the first result additional processing data is requested. An operator's response generates a second result. The first result is then modified to achieve consistency with the operator's response.
Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, and action classification engine, and a control module.
Abstract: A is recognized using a predefinable vocabulary that is partitioned in sections of phonetically similar words. In a recognition process, first oral input is associated with one of the sections, then the oral input is determined from the vocabulary of the associated section.