Patents by Inventor Caiming Xiong

Caiming Xiong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11657233
    Abstract: Systems and methods for unifying question answering and text classification via span extraction include a preprocessor for preparing a source text and an auxiliary text based on a task type of a natural language processing (NLP) task, an encoder for receiving the source text and the auxiliary text from the preprocessor and generating an encoded representation of a combination of the source text and the auxiliary text, and a span-extractive decoder for receiving the encoded representation and identifying a span of text within the source text that is a result of the NLP task. The task type is one of entailment, classification, or regression. In some embodiments, the source text includes one or more of text received as input when the task type is entailment, a list of classifications when the task type is entailment or classification, or a list of similarity options when the task type is regression.
    Type: Grant
    Filed: February 16, 2022
    Date of Patent: May 23, 2023
    Assignee: salesforce.com, inc.
    Inventors: Nitish Shirish Keskar, Bryan McCann, Richard Socher, Caiming Xiong
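To make the span-extractive framing concrete, here is a toy Python sketch of how a classification task can be recast as span extraction: the candidate labels are appended to the source text so the answer can be read off as a span. The function names and the stand-in scorer are illustrative assumptions, not the patented implementation.

```python
# Toy sketch: classification recast as span extraction. The scorer is a
# stand-in for the span-extractive decoder's learned span scores.

def build_source_text(document: str, labels: list[str]) -> str:
    """Append the candidate labels so the answer can be read off as a span."""
    return document + " choices: " + " ".join(labels)

def extract_label_span(labels: list[str], scores: list[float]) -> str:
    """Stand-in span decoder: return the label whose span scores highest."""
    best = max(range(len(labels)), key=lambda i: scores[i])
    return labels[best]

source = build_source_text("great film", ["positive", "negative"])
answer = extract_label_span(["positive", "negative"], [0.9, 0.1])
```

In the real system the scores would come from the encoder plus span-extractive decoder; here they are given directly to keep the framing visible.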
  • Patent number: 11657269
    Abstract: Verification of discriminative models includes receiving an input; receiving a prediction from a discriminative model for the input; encoding, using an encoder, a latent variable based on the input; decoding, using a decoder, a reconstructed input based on the prediction and the latent variable; and determining, using an anomaly detection module, whether the prediction is reliable based on the input, the reconstructed input, and the latent variable. The encoder and the decoder are jointly trained to maximize an evidence lower bound of the encoder and the decoder. In some embodiments, the encoder and the decoder are further trained using a disentanglement constraint between the prediction and the latent variable. In some embodiments, the encoder and the decoder are further trained without using inputs that are out of a distribution of inputs used to train the discriminative model or that are adversarial to the discriminative model.
    Type: Grant
    Filed: October 3, 2019
    Date of Patent: May 23, 2023
    Assignee: salesforce.com, inc.
    Inventors: Tong Che, Caiming Xiong
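As a rough illustration of the reliability check, the sketch below uses reconstruction error as the anomaly signal: a prediction is distrusted when the decoder cannot reconstruct the input. The names and the fixed threshold are illustrative stand-ins for the trained encoder/decoder and anomaly detection module.

```python
import math

def reconstruction_error(x, x_hat):
    """Euclidean distance between an input and its reconstruction."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))

def prediction_is_reliable(x, x_hat, threshold=1.0):
    """Treat a prediction as unreliable when the decoder cannot reconstruct
    the input well -- a proxy anomaly score for out-of-distribution inputs."""
    return reconstruction_error(x, x_hat) < threshold
```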
  • Publication number: 20230153542
    Abstract: Embodiments described herein provide a cross-lingual sentence alignment framework that is trained only on rich-resource language pairs. To obtain an accurate aligner, a pretrained multi-lingual language model is used, and a classifier is trained on parallel data from rich-resource language pairs. This trained classifier may then be used for cross-lingual transfer with low-resource languages.
    Type: Application
    Filed: January 21, 2022
    Publication date: May 18, 2023
    Inventors: Tong Niu, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong
  • Patent number: 11651158
    Abstract: A system performs conversations with users using chatbots customized for performing a set of tasks. The system may be a multi-tenant system that allows customization of the chatbots for each tenant. The system receives a task configuration that maps tasks to entity types and an entity configuration that specifies methods for determining entities of a particular entity type. The system receives a user utterance and determines the intent of the user using an intent detection model, for example, a neural network. The intent represents a task that the user is requesting. The system determines one or more entities corresponding to the task. The system performs tasks based on the determined intent and the entities and performs conversations with users based on the tasks.
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: May 16, 2023
    Assignee: Salesforce, Inc.
    Inventors: Xinyi Yang, Tian Xie, Caiming Xiong, Wenhao Liu, Huan Wang, Jin Qu, Soujanya Lanka, Chu Hong Hoi, Xugang Ye, Feihong Wu
  • Patent number: 11645509
    Abstract: Embodiments for training a neural network using sequential tasks are provided. A plurality of sequential tasks are received. For each task in the plurality of tasks, a copy of the neural network that includes a plurality of layers is generated. From the copy of the neural network, a task-specific neural network is generated by performing an architectural search on the plurality of layers in the copy of the neural network. The architectural search identifies a plurality of candidate choices in the layers of the task-specific neural network. Parameters in the task-specific neural network that correspond to the plurality of candidate choices and that maximize architectural weights at each layer are identified. The parameters are retrained and merged with the neural network. The neural network trained on the plurality of sequential tasks is a trained neural network.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: May 9, 2023
    Assignee: Salesforce.com, Inc.
    Inventors: Yingbo Zhou, Xilai Li, Caiming Xiong
  • Patent number: 11640527
    Abstract: Systems and methods are provided for a near-zero-cost (NZC) query framework for differentially private deep learning. To protect the privacy of training data during learning, the near-zero-cost query framework transfers knowledge from an ensemble of teacher models trained on partitions of the data to a student model. Privacy guarantees may be understood intuitively and expressed rigorously in terms of differential privacy. Other features are also provided.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: May 2, 2023
    Assignee: Salesforce.com, Inc.
    Inventors: Lichao Sun, Jia Li, Caiming Xiong, Yingbo Zhou
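The teacher-to-student transfer can be sketched under the assumption that it resembles PATE-style noisy vote aggregation; the function and its `epsilon` parameter are illustrative, not the claimed NZC mechanism.

```python
import random
from collections import Counter

def noisy_teacher_vote(teacher_predictions, epsilon=1.0, rng=None):
    """PATE-style aggregation sketch: add Laplace noise to the vote counts
    so the released label leaks little about any one data partition.
    `epsilon` plays the role of a per-query privacy budget."""
    rng = rng or random.Random(0)
    counts = Counter(teacher_predictions)
    # The difference of two Exponential(epsilon) draws is Laplace(0, 1/epsilon).
    noisy = {label: count + rng.expovariate(epsilon) - rng.expovariate(epsilon)
             for label, count in sorted(counts.items())}
    return max(noisy, key=noisy.get)
```

With a clear majority among the teachers, the noisy vote almost always returns the majority label while bounding what any single partition can reveal.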
  • Publication number: 20230120940
    Abstract: Embodiments described herein propose an approach for unsupervised structure extraction in task-oriented dialogues. Specifically, a Slot Boundary Detection (SBD) module is adopted, for which utterances from training domains are tagged with the conventional BIO schema but without the slot names. A transformer-based classifier is trained to detect the boundary of potential slot tokens in the test domain. Next, while the state number is usually unknown, it is more reasonable to assume the slot number is given when analyzing a dialogue system. The detected tokens are clustered into the given number of slot groups. Finally, the dialogue state is represented with a vector recording the modification times of every slot. The slot values are then tracked through each dialogue session in the corpus, and utterances are labeled with their dialogue states accordingly. The semantic structure is portrayed by computing the transition frequencies among the unique states.
    Type: Application
    Filed: January 31, 2022
    Publication date: April 20, 2023
    Inventors: Liang Qiu, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
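The slot-boundary step can be illustrated with a minimal BIO-span recovery routine. The names are hypothetical, and the patent's boundary detector is a trained transformer classifier; this only decodes the tag sequence such a classifier would emit.

```python
def spans_from_bio(tokens, tags):
    """Recover candidate slot-value spans from BIO tags that carry no slot
    names, mirroring the slot-boundary-detection output."""
    spans, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B":                 # a new span starts
            if current:
                spans.append(" ".join(current))
            current = [tok]
        elif tag == "I" and current:   # continue the open span
            current.append(tok)
        else:                          # "O" (or stray "I") closes any span
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans
```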
  • Patent number: 11631009
    Abstract: Approaches for multi-hop knowledge graph reasoning with reward shaping include a system and method of training a system to search relational paths in a knowledge graph. The method includes identifying, using a reasoning module, a plurality of first outgoing links from a current node in a knowledge graph, masking, using the reasoning module, one or more links from the plurality of first outgoing links to form a plurality of second outgoing links, rewarding the reasoning module with a reward of one when a node corresponding to an observed answer is reached, and rewarding the reasoning module with a reward identified by a reward shaping network when a node not corresponding to an observed answer is reached. In some embodiments, the reward shaping network is pre-trained.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: April 18, 2023
    Assignee: Salesforce.com, Inc.
    Inventors: Xi Victoria Lin, Caiming Xiong, Richard Socher
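A minimal sketch of the two mechanisms the abstract names, link masking and shaped rewards. The signatures are illustrative; in the claimed system the shaping score comes from a pretrained reward-shaping network.

```python
def mask_links(outgoing_links, masked):
    """Hide a subset of outgoing edges before the policy picks its next hop."""
    return [link for link in outgoing_links if link not in masked]

def shaped_reward(reached_answer: bool, shaping_score: float) -> float:
    """Hard reward of one at an observed answer node; otherwise fall back on
    the reward-shaping network's soft score for the reached node."""
    return 1.0 if reached_answer else shaping_score
```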
  • Publication number: 20230113750
    Abstract: A system performs group testing on a population of items. The group testing identifies items satisfying particular criteria from a population of items, for example, defective items from the population. The group testing may be performed for software or hardware testing, for testing a human population, for training of deep learning applications, and so on. The system trains a machine learning based model, for example, a reinforcement learning based model, to evaluate groups. The model may further determine system dynamics that may represent priors of items. An agent treats the population and groups of items being tested as the environment and performs actions, for example, adjusting the groups. The system also performs a non-adaptive strategy based on Monte Carlo simulation of tests and the resulting simulation outcomes.
    Type: Application
    Filed: October 11, 2021
    Publication date: April 13, 2023
    Inventors: Lav Raj Varshney, Yingbo Zhou, Caiming Xiong, Govardana Sachithanandam Ramachandran
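The basic primitive such simulations evaluate, a pooled test that is positive iff its group contains a defective item, can be sketched as follows (illustrative names, not the claimed model):

```python
def pooled_tests(defectives, groups):
    """A group test is positive iff the pool contains any defective item --
    the primitive a Monte Carlo simulation of testing strategies evaluates."""
    return [any(item in defectives for item in group) for group in groups]
```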
  • Patent number: 11625543
    Abstract: Embodiments described herein provide a composed variational natural language generation (CLANG) model that is configured to generate training samples for few-shot intents. Specifically, the CLANG model may build connections between existing training samples of many-shot intents and new training samples of few-shot intents by modeling an intent as a combination of a domain and an action. In this way, the CLANG model transfers knowledge from existing many-shot intents to few-shot intents in natural language generation by learning how to compose utterances with many-shot intents and transferring such knowledge to few-shot intents.
    Type: Grant
    Filed: September 2, 2020
    Date of Patent: April 11, 2023
    Assignee: salesforce.com, inc.
    Inventors: Congying Xia, Caiming Xiong
  • Publication number: 20230107640
    Abstract: Embodiments described herein provide methods and systems for effectively and efficiently summarizing long documents. A transformer is provided with bottom-up and top-down inference combined to effectively capture long-range dependency. In the bottom-up inference, each token only attends to nearby tokens within a window of a specified size. In the top-down inference, full self-attention is applied using units with coarser granularity. The bottom-up-inferred token representations are then updated with the top-down representations, which is achieved with cross-attention between the top and token levels. Multiple levels of top-down representations with increasingly coarser granularity can be used if documents are extremely long.
    Type: Application
    Filed: January 31, 2022
    Publication date: April 6, 2023
    Inventors: Bo Pang, Erik Nijkamp, Yingbo Zhou, Caiming Xiong
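The bottom-up windowed attention can be illustrated with the boolean mask it implies; this is a sketch of the attention pattern only, which the real model applies inside transformer layers.

```python
def local_attention_mask(n_tokens, window):
    """Bottom-up inference step: token i may attend only to tokens j with
    |i - j| <= window, so attention cost grows linearly with length."""
    return [[abs(i - j) <= window for j in range(n_tokens)]
            for i in range(n_tokens)]
```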
  • Publication number: 20230104662
    Abstract: Embodiments are directed to a training framework for reducing gender bias in a pre-trained language model. To reduce gender bias, a gender-neutral dataset is generated. Next, parameters of the pre-trained language model are frozen and do not change during a subsequent training phase. As all the pre-trained parameters are frozen, forgetting of information from the original training data is minimized. New parameters are added to the language model. The new parameters may be associated with gender-related terms, such as profession names. In a subsequent training phase, the new parameters of the language model are trained using the gender-neutral dataset.
    Type: Application
    Filed: January 27, 2022
    Publication date: April 6, 2023
    Inventors: Zahra Fatemi, Caiming Xiong, Wenhao Liu, Chen Xing
  • Publication number: 20230105322
    Abstract: Embodiments described herein provide a system and method for extracting information. The system receives, via a communication interface, a dataset of a plurality of data samples. The system determines, in response to an input data sample from the dataset, a set of feature vectors via a plurality of pre-trained feature extractors, respectively. The system retrieves a set of memory bank vectors that correspond to the input data sample. The system generates, via a plurality of Multi-Layer Perceptrons (MLPs), a mapped set of representations in response to an input of the set of memory bank vectors, respectively. The system determines a loss objective between the set of feature vectors and the combination of the mapped set of representations and a network of layers in the MLP. The system updates the parameters of the plurality of MLPs and the parameters of the memory bank vectors by minimizing the computed loss objective.
    Type: Application
    Filed: January 28, 2022
    Publication date: April 6, 2023
    Inventors: Bram Wallace, Devansh Arpit, Huan Wang, Caiming Xiong
  • Patent number: 11620515
    Abstract: Systems and methods are provided that employ knowledge distillation under a multi-task learning setting. In some embodiments, the systems and methods are implemented with a larger teacher model and a smaller student model, each of which comprise one or more shared layers and a plurality of task layers for performing multiple tasks. During training of the teacher model, its shared layers are initialized, and then the teacher model is multi-task refined. The teacher model predicts teacher logits. During training of the student model, its shared layers are initialized. Knowledge distillation is employed to transfer knowledge from the teacher model to the student model by the student model updating its shared layers and task layers, for example, according to the teacher logits of the teacher model. Other features are also provided.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: April 4, 2023
    Assignee: salesforce.com, inc.
    Inventors: Linqing Liu, Caiming Xiong
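The transfer signal can be sketched as the standard distillation loss, cross-entropy of the student against the teacher's temperature-softened logits. This is the generic formulation, assumed rather than quoted from the patent.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature that softens the distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened
    distribution -- the signal transferred in knowledge distillation."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))
```

The loss is minimized when the student reproduces the teacher's (softened) output distribution, which is how the teacher logits guide the student's shared and task layers.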
  • Patent number: 11615249
    Abstract: Approaches for multitask learning as question answering include an input layer for encoding a context and a question, a self-attention based transformer including an encoder and a decoder, a first bi-directional long short-term memory (biLSTM) for further encoding an output of the encoder, a long short-term memory (LSTM) for generating a context-adjusted hidden state from the output of the decoder and a hidden state, an attention network for generating first attention weights based on an output of the first biLSTM and an output of the LSTM, a vocabulary layer for generating a distribution over a vocabulary, a context layer for generating a distribution over the context, and a switch for generating a weighting between the distributions over the vocabulary and the context, generating a composite distribution based on the weighting, and selecting a word of an answer using the composite distribution.
    Type: Grant
    Filed: August 18, 2020
    Date of Patent: March 28, 2023
    Assignee: salesforce.com, inc.
    Inventors: Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher
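The switch's mixing step is, in generic form, a gated mixture of distributions. The sketch below is an assumed simplification with hypothetical names, not the patented architecture.

```python
def composite_distribution(p_vocab, p_context, gate):
    """Switch step: mix the vocabulary distribution and the copy (context)
    distribution with a learned weight `gate` in [0, 1]."""
    words = set(p_vocab) | set(p_context)
    return {w: gate * p_vocab.get(w, 0.0) + (1 - gate) * p_context.get(w, 0.0)
            for w in words}
```

Because both inputs are probability distributions and the gate weights sum to one, the mixture is again a valid distribution from which the answer word is selected.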
  • Patent number: 11605118
    Abstract: Embodiments described herein provide an attentive network framework that models dynamic attributes with item and feature interactions. Specifically, the attentive network framework first encodes basket item sequences and dynamic attribute sequences with time-aware padding and time/month encoding to capture seasonal patterns (e.g., in app recommendation, outdoor activity apps are more suitable for summer while indoor activity apps are better for winter). Then the attentive network framework applies time-level attention modules to the basket item sequences and dynamic user attribute sequences to capture item-to-item and attribute-to-attribute temporal sequential patterns. After that, an intra-basket attentive module is used on items in each basket to capture the correlation information among items.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: March 14, 2023
    Assignee: salesforce.com, inc.
    Inventors: Yongjun Chen, Jia Li, Chenxi Li, Markus Anderle, Caiming Xiong, Simo Arajarvi, Harshavardhan Utharavalli
  • Publication number: 20230073754
    Abstract: Embodiments described herein provide an intent prototypical contrastive learning framework that leverages intent similarities between users with different behavior sequences. Specifically, user behavior sequences are encoded into a plurality of user interest representations. The user interest representations are clustered into a plurality of clusters based on mutual distances among the user interest representations in a representation space. Intention prototypes are determined based on centroids of the clusters. A set of augmented views for user behavior sequences are created and encoded into a set of view representations. A contrastive loss is determined based on the set of augmented views and the plurality of intention prototypes. Model parameters are updated based at least in part on the contrastive loss.
    Type: Application
    Filed: January 27, 2022
    Publication date: March 9, 2023
    Inventors: Yongjun Chen, Zhiwei Liu, Jia Li, Caiming Xiong
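The prototype step can be sketched with plain centroid computation and nearest-prototype assignment. These helpers are illustrative stand-ins for the clustering step of the framework, not its claimed implementation.

```python
def centroid(vectors):
    """Intention prototype: the centroid of one cluster of interest vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def nearest_prototype(x, prototypes):
    """Assign a user-interest representation to its closest prototype."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(prototypes)), key=lambda k: dist2(x, prototypes[k]))
```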
  • Patent number: 11600194
    Abstract: Approaches for natural language processing include a multi-layer encoder for encoding words from a context and words from a question in parallel, a multi-layer decoder for decoding the encoded context and the encoded question, a pointer generator for generating distributions over the words from the context, the words from the question, and words in a vocabulary based on an output from the decoder, and a switch. The switch generates a weighting of the distributions over the words from the context, the words from the question, and the words in the vocabulary, generates a composite distribution based on the weighting of the distribution over the first words from the context, the distribution over the second words from the question, and the distribution over the words in the vocabulary, and selects words for inclusion in an answer using the composite distribution.
    Type: Grant
    Filed: June 12, 2018
    Date of Patent: March 7, 2023
    Assignee: Salesforce.com, Inc.
    Inventors: Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher
  • Patent number: 11599730
    Abstract: Embodiments described in this disclosure illustrate the use of self-/semi-supervised approaches for label-efficient dialogue state tracking (DST) in task-oriented dialogue systems. Conversational behavior is modeled by next response generation and turn utterance generation tasks. Prediction consistency is strengthened by augmenting data with stochastic word dropout and label guessing. Experimental results show that by exploiting self-supervision, the joint goal accuracy can be boosted with limited labeled data.
    Type: Grant
    Filed: May 8, 2020
    Date of Patent: March 7, 2023
    Assignee: Salesforce.com, Inc.
    Inventors: Chien-Sheng Wu, Chu Hong Hoi, Caiming Xiong
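Stochastic word dropout, one of the augmentations mentioned, can be sketched as follows; the placeholder token and function names are hypothetical.

```python
import random

def word_dropout(tokens, p=0.2, rng=None):
    """Stochastic word dropout for consistency training: each token is
    independently replaced by an <unk> placeholder with probability p."""
    rng = rng or random.Random(0)
    return [tok if rng.random() >= p else "<unk>" for tok in tokens]
```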
  • Patent number: 11588800
    Abstract: A system authenticates users using voice-based conversations. The system allows the authentication process to be customized using an authentication plan. For example, the system may be a multi-tenant system that allows customization of the authentication process for each tenant. The authentication plan is represented as an expression of phrase types, each phrase type associated with a phrase verification method. The system authenticates a user by executing the expression of an authentication plan for that user in response to a request from the user. The system performs a conversation with the user according to the authentication plan. The system determines whether to allow or deny the user request based on the result of evaluation of the expression of the authentication plan.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: February 21, 2023
    Assignee: Salesforce, Inc.
    Inventors: Tian Xie, Caiming Xiong