Patents by Inventor Caiming Xiong

Caiming Xiong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250139411
    Abstract: Embodiments described herein provide a large language model (LLM) based AI agent that adopts Monte-Carlo Tree Search (MCTS) to execute a task. The LLM is prompted with a task description, and it responds with its first attempted list of actions. Based on the success or failure of the first attempt, the LLM is prompted with an updated prompt that includes feedback from the first attempt based on a determined reward. The prompt may include a relative “score” for each action taken at each step. A numeric score may be mapped to a set of pre-defined text labels, such as “high” or “low,” putting the score in a form better suited for an LLM prompt. In this way, the LLM is iteratively given prompts updated with the scores from each action taken in each previous iteration, so that it traverses different paths on the tree in each iteration.
    Type: Application
    Filed: October 31, 2023
    Publication date: May 1, 2025
    Inventors: Rithesh Murthy, Shelby Heinecke, Juan Carlos Niebles Duque, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese
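    A minimal Python sketch of the score-labelling and prompt-update loop described in this abstract; `call_llm` and `execute_and_score` are hypothetical stand-ins, and the thresholds are illustrative assumptions, not the patent's method.
    ```python
    def score_to_label(score: float) -> str:
        """Map a numeric action value to a coarse text label (assumed thresholds)."""
        if score >= 0.66:
            return "high"
        if score >= 0.33:
            return "medium"
        return "low"

    def build_prompt(task: str, history: list[tuple[str, float]]) -> str:
        """Fold labelled feedback for previously tried actions into the prompt."""
        lines = [f"Task: {task}"]
        for action, value in history:
            lines.append(f"Tried action: {action} (value: {score_to_label(value)})")
        lines.append("Propose the next list of actions.")
        return "\n".join(lines)

    # Iterative loop: rewards from each attempt are folded into the next prompt,
    # steering the LLM down different branches of the search tree.
    # history = []
    # for _ in range(num_iterations):
    #     plan = call_llm(build_prompt(task, history))        # hypothetical client
    #     history.extend(zip(plan, execute_and_score(plan)))  # environment feedback
    ```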
  • Publication number: 20250131246
    Abstract: Embodiments provide an attention mechanism that computes attention weights for an input sequence by employing a set of multi-head learnable vectors (referred to as “binder vectors”) to attend to the input sequence.
    Type: Application
    Filed: October 23, 2023
    Publication date: April 24, 2025
    Inventors: Devansh Arpit, Huan Wang, Caiming Xiong
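    A hedged PyTorch sketch of this mechanism as the abstract describes it: a set of learnable, per-head “binder vectors” serves as the queries attending over the input sequence. Class and dimension names are assumptions, not the patent's notation.
    ```python
    import torch
    import torch.nn as nn

    class BinderAttention(nn.Module):
        def __init__(self, d_model: int, n_heads: int, n_binders: int):
            super().__init__()
            assert d_model % n_heads == 0
            self.n_heads, self.d_head = n_heads, d_model // n_heads
            # Learnable multi-head binder vectors acting as the queries.
            self.binders = nn.Parameter(torch.randn(n_heads, n_binders, self.d_head))
            self.k_proj = nn.Linear(d_model, d_model)
            self.v_proj = nn.Linear(d_model, d_model)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq_len, d_model)
            b, t, _ = x.shape
            k = self.k_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
            v = self.v_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
            # Attention weights of each binder vector over the input sequence.
            attn = torch.softmax(self.binders @ k.transpose(-2, -1) / self.d_head**0.5, dim=-1)
            return attn @ v  # (batch, n_heads, n_binders, d_head)
    ```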
  • Publication number: 20250124233
    Abstract: Systems and methods for editing a large language model are provided. The large language model generates a sequence of tokens, a first probability of a pre-edit output based on the sequence of tokens, and a second probability of a target output based on the sequence of tokens. A loss function is provided based on the first probability and the second probability. A plurality of gradients of the large language model with respect to the loss function is computed. An edit location of the large language model is determined based on the plurality of gradients. The large language model is edited by editing weights at the edit location, such that the updated large language model generates the target output for an input including the sequence of tokens.
    Type: Application
    Filed: January 31, 2024
    Publication date: April 17, 2025
    Inventors: Itai Izhak Feigenbaum, Devansh Arpit, Shelby Heinecke, Juan Carlos Niebles Duque, Weiran Yao, Huan Wang, Caiming Xiong, Silvio Savarese
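    A sketch of the gradient-based edit-localization step, assuming `model` is a torch module whose output log-probabilities are still attached to the graph; the contrastive loss form and the gradient-norm rule are plausible readings of the abstract, not the patent's exact method.
    ```python
    import torch

    def locate_edit(model: torch.nn.Module, logp_pre_edit: torch.Tensor,
                    logp_target: torch.Tensor) -> str:
        """Contrastive loss: lower the pre-edit output's probability, raise the
        target's. The parameter with the largest gradient norm is taken as the
        edit location."""
        model.zero_grad()
        loss = logp_pre_edit - logp_target
        loss.backward()
        grads = {name: p.grad.norm().item()
                 for name, p in model.named_parameters() if p.grad is not None}
        return max(grads, key=grads.get)
    ```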
  • Publication number: 20250111155
    Abstract: Embodiments described herein provide a method for mitigating toxic content in text generation by a neural network based framework. The method includes the following operations. A text input of a sequence of tokens is received via a communication interface. In response to the text input, a first output probability for a next token is generated by a first neural network model trained to generate tokens belonging to a prioritized category of vocabulary. A second output probability for the next token is generated by a second neural network model trained to generate tokens belonging to an indiscriminate vocabulary. The next token for a text output is then generated based on a combined output probability, computed using a correction term reflective of the first output probability and the second output probability.
    Type: Application
    Filed: January 18, 2024
    Publication date: April 3, 2025
    Inventors: Tong Niu, Yingbo Zhou, Silvio Savarese, Semih Yavuz, Caiming Xiong
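    A small sketch of the combined decoding rule described above; `alpha` and the additive combination are illustrative assumptions.
    ```python
    import torch

    def combined_next_token(logits_prioritized: torch.Tensor,
                            logits_indiscriminate: torch.Tensor,
                            alpha: float = 1.0) -> int:
        """Correct the indiscriminate model's next-token distribution with a term
        reflecting how much more the prioritized (safe-vocabulary) model favors
        each token."""
        correction = logits_prioritized - logits_indiscriminate
        combined = logits_indiscriminate + alpha * correction
        return int(torch.argmax(combined, dim=-1))
    ```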
  • Patent number: 12260185
    Abstract: Dialogue summarization is challenging due to its multi-speaker standpoints, casual spoken language style, and limited labelled data. The embodiments are directed to a coarse-to-fine dialogue summarization model that improves abstractive dialogue summarization quality and enables granular controllability. A summary draft that includes key words for turns in a dialogue conversation history is created. The summary draft includes pseudo-labelled interrogative pronoun categories and noisy key phrases. The dialogue conversation history is divided into segments. A generative language model is trained to generate a segment summary for each dialogue segment using a portion of the summary draft that corresponds to at least one dialogue turn in the dialogue segment. A dialogue summary is generated using the generative language model trained using the summary draft.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: March 25, 2025
    Assignee: Salesforce, Inc.
    Inventors: Chien-Sheng Wu, Wenhao Liu, Caiming Xiong, Linqing Liu
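    A sketch of the coarse-to-fine flow, assuming a `generate` callable for the trained generative model; the fixed segment length is an illustrative simplification of the patent's segmentation.
    ```python
    def summarize_dialogue(turns, draft, generate, seg_len=4):
        """Coarse step: divide the dialogue history into segments. Fine step:
        summarize each segment with its slice of the keyword summary draft,
        then concatenate the per-segment summaries."""
        pieces = []
        for i in range(0, len(turns), seg_len):
            segment, seg_draft = turns[i:i + seg_len], draft[i:i + seg_len]
            pieces.append(generate(segment, seg_draft))
        return " ".join(pieces)
    ```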
  • Publication number: 20250068901
    Abstract: Embodiments described herein provide a diffusion-based framework that is trained on a dataset with limited text labels, to generate a distribution of data samples in the dataset given a specific text description label. Specifically, unlabeled data is first used to train the diffusion model to generate a distribution of data samples. Text-labeled data samples are then used to finetune the diffusion model to generate a data distribution given a specific text description label, thus enhancing the controllability of training.
    Type: Application
    Filed: January 25, 2024
    Publication date: February 27, 2025
    Inventors: Shiyu Wang, Yihao Feng, Tian Lan, Ning Yu, Yu Bai, Ran Xu, Huan Wang, Caiming Xiong, Silvio Savarese
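    A sketch of the two-stage recipe, assuming a `diffusion_loss` denoising objective and a null text condition for the unlabeled stage; both are assumptions for illustration.
    ```python
    def train(model, unlabeled, labeled, opt, diffusion_loss, epochs=1):
        """Stage 1: pretrain on unlabeled samples with no text condition.
        Stage 2: finetune on the text-labeled subset, conditioned on labels."""
        for _ in range(epochs):
            for x in unlabeled:
                opt.zero_grad(); diffusion_loss(model, x, text=None).backward(); opt.step()
            for x, text in labeled:
                opt.zero_grad(); diffusion_loss(model, x, text=text).backward(); opt.step()
    ```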
  • Publication number: 20250053793
    Abstract: Embodiments described herein provide a method of predicting an action by a plurality of language model augmented agents (LAAs). In at least one embodiment, a controller receives a task instruction to be performed using an environment. The controller receives an observation of a first state from the environment. The controller selects an LAA from the plurality of LAAs based on the task instruction and the observation. The controller obtains an output from the selected LAA generated using an input combining the task instruction, the observation, and an LAA-specific prompt template. The controller determines the action based on the output. The controller causes the action to be performed on the environment, thereby causing the first state of the environment to change to a second state.
    Type: Application
    Filed: October 25, 2023
    Publication date: February 13, 2025
    Inventors: Zhiwei Liu, Weiran Yao, Jianguo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles Duque, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese
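    A sketch of one controller iteration over the pool of LM-augmented agents; every helper name here is an assumption for illustration, not the patent's API.
    ```python
    def controller_step(agents: dict, select_agent, parse_action, env, task: str):
        """Choose an LAA for the current observation, fill its LAA-specific
        prompt template, and apply the resulting action to the environment."""
        obs = env.observe()
        agent = select_agent(task, obs, agents)
        prompt = agent.template.format(task=task, observation=obs)
        action = parse_action(agent.llm(prompt))
        env.step(action)  # first state -> second state
        return action
    ```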
  • Publication number: 20250053787
    Abstract: Embodiments described herein provide a method for training a recommendation neural network model using multiple data sources. The method may include: receiving, via a data interface, time series data indicating a user-item interaction history; transforming the time series data into a user-item graph; encoding, by a neural network encoder, the user-item graph into user embeddings and item embeddings; generating a plurality of losses according to a plurality of training tasks performed based on the user embeddings and item embeddings; training the recommendation neural network model by updating the user embeddings and the item embeddings via backpropagation based on a weighted sum of gradients of the plurality of losses; and generating, by a neural network decoder, one or more recommended items for a given user based on the updated user embeddings and the updated item embeddings.
    Type: Application
    Filed: January 31, 2024
    Publication date: February 13, 2025
    Inventors: Liangwei Yang, Shelby Heinecke, Jianguo Zhang, Rithesh Murthy, Huan Wang, Caiming Xiong, Zhiwei Liu
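    A minimal sketch of the multi-task update step; note that backpropagating a weighted sum of losses yields exactly the weighted sum of per-task gradients the abstract mentions. The task weights are assumed inputs.
    ```python
    import torch

    def multitask_step(losses: list[torch.Tensor], weights: list[float],
                       opt: torch.optim.Optimizer):
        """One update over the shared user/item embeddings: backprop on a
        weighted sum of the per-task losses."""
        opt.zero_grad()
        total = sum(w * l for w, l in zip(weights, losses))
        total.backward()
        opt.step()
    ```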
  • Patent number: 12223270
    Abstract: Embodiments described herein provide a method of evaluating a natural language processing model. The method includes receiving an evaluation dataset that may include a plurality of unit tests, each unit test having an input context and a first candidate and a second candidate that are generated in response to the input context, where the first candidate is associated with a first quality notation and the second candidate is associated with a second quality notation. The method includes determining, via a model, a first likelihood of generating the first candidate and a second likelihood of generating the second candidate in response to the input context. The method also includes determining whether the first likelihood is greater than the second likelihood, and determining whether the model passed the unit test, where the first quality notation indicates a higher-quality candidate and the second quality notation indicates a lower-quality candidate.
    Type: Grant
    Filed: June 10, 2022
    Date of Patent: February 11, 2025
    Assignee: Salesforce, Inc.
    Inventors: Philippe Laban, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
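    A sketch of a single unit test and its aggregation, assuming a `model_loglik(context, candidate)` callable that returns log P(candidate | context).
    ```python
    def passes_unit_test(model_loglik, context: str, good: str, bad: str) -> bool:
        """The model passes if it assigns higher likelihood to the candidate
        carrying the higher-quality notation."""
        return model_loglik(context, good) > model_loglik(context, bad)

    def evaluate(model_loglik, unit_tests) -> float:
        """Fraction of (context, better, worse) unit tests the model passes."""
        results = [passes_unit_test(model_loglik, c, g, b) for c, g, b in unit_tests]
        return sum(results) / len(results)
    ```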
  • Publication number: 20250045567
    Abstract: Embodiments described herein provide for optimizing a language model (LM) agent. In at least one embodiment, an LM agent comprises an “actor” LM and a “retrospective” LM, which provides reflections on attempts by the actor LM. The reflections are used to update subsequent prompts to the actor LM. Optimizing the LM agent comprises fine-tuning parameters of the retrospective LM while keeping parameters of the actor LM frozen. A gradient may be determined by a change in reward from the environment based on actions taken by the actor LM with and without a reflection of the retrospective LM. Using this gradient, parameters of the retrospective LM may be updated via backpropagation.
    Type: Application
    Filed: October 31, 2023
    Publication date: February 6, 2025
    Inventors: Weiran Yao, Shelby Heinecke, Juan Carlos Niebles Duque, Zhiwei Liu, Yihao Feng, Le Xue, Rithesh Murthy, Zeyuan Chen, Jianguo Zhang, Devansh Arpit, Ran Xu, Lik Mui, Huan Wang, Caiming Xiong, Silvio Savarese
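    A sketch of the reward-difference signal described above; `run_actor` and `reward_fn` are assumed callables around the frozen actor LM and the environment.
    ```python
    def reflection_advantage(run_actor, reward_fn, task, reflection) -> float:
        """Change in environment reward from including the retrospective LM's
        reflection in the frozen actor's prompt; this difference drives the
        retrospective LM's parameter updates via backpropagation."""
        baseline = reward_fn(run_actor(task, reflection=None))
        reflected = reward_fn(run_actor(task, reflection=reflection))
        return reflected - baseline  # positive means the reflection helped
    ```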
  • Patent number: 12217146
    Abstract: A computer-implemented method for dual sequence inference using a neural network model includes generating a codependent representation based on a first input representation of a first sequence and a second input representation of a second sequence using an encoder of the neural network model and generating an inference based on the codependent representation using a decoder of the neural network model. The neural network model includes a plurality of model parameters learned according to a machine learning process. The encoder includes a plurality of coattention layers arranged sequentially, each coattention layer being configured to receive a pair of layer input representations and generate one or more summary representations, and an output layer configured to receive the one or more summary representations from a last layer among the plurality of coattention layers and generate the codependent representation.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: February 4, 2025
    Assignee: Salesforce, Inc.
    Inventors: Victor Zhong, Caiming Xiong, Richard Socher
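    The abstract above describes sequentially stacked coattention layers producing summary representations; a minimal sketch of one such layer, with illustrative shapes, might look like this.
    ```python
    import torch

    def coattend(a: torch.Tensor, b: torch.Tensor):
        """One coattention step over a dual-sequence pair.
        a: (batch, m, d), b: (batch, n, d)."""
        affinity = a @ b.transpose(1, 2)                 # (batch, m, n)
        # Summarize b for each position of a, and a for each position of b.
        summary_a = torch.softmax(affinity, dim=2) @ b
        summary_b = torch.softmax(affinity, dim=1).transpose(1, 2) @ a
        return summary_a, summary_b
    ```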
  • Patent number: 12204847
    Abstract: Embodiments described herein provide a method for text summarization. The method includes receiving a training dataset having at least an uncompressed text, a compressed text, and one or more information entities accompanying the compressed text. The method also includes generating, using a perturber model, a perturbed text with the one or more information entities being inserted into the compressed text. The method further includes training the perturber model based on a first training objective, and generating, using the trained perturber model, a perturbed summary in response to an input of a reference summary. The method further includes generating, via an editor model, a predicted summary by removing information from the perturbed summary conditioned on a source document of the reference summary, and training the editor model based on a second training objective.
    Type: Grant
    Filed: October 6, 2022
    Date of Patent: January 21, 2025
    Assignee: Salesforce, Inc.
    Inventors: Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong
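    A sketch of the data flow between the two models, assuming callables for each; the two training objectives themselves are described only in the prose above.
    ```python
    def editor_training_example(perturber, reference_summary, entities, source_doc):
        """The perturber inserts extra entities into a summary; the editor then
        learns to remove information unsupported by the source document."""
        perturbed = perturber(reference_summary, entities)  # entity-inserted summary
        editor_input = (perturbed, source_doc)              # condition on the source
        return editor_input, reference_summary              # target: the clean summary
    ```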
  • Patent number: 12198047
    Abstract: The technology disclosed provides a quasi-recurrent neural network (QRNN) encoder-decoder model that alternates convolutional layers, which apply in parallel across timesteps, and minimalist recurrent pooling layers that apply in parallel across feature dimensions.
    Type: Grant
    Filed: December 15, 2020
    Date of Patent: January 14, 2025
    Assignee: Salesforce, Inc.
    Inventors: James Bradbury, Stephen Joseph Merity, Caiming Xiong, Richard Socher
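    A compact sketch of the recurrent pooling half of the design: convolutions compute candidate and gate sequences in parallel across timesteps, and only this cheap elementwise recurrence runs sequentially in time (parallel across feature dimensions). The fo-pooling form below is one published QRNN variant, used here for illustration.
    ```python
    import torch

    def fo_pool(z: torch.Tensor, f: torch.Tensor) -> torch.Tensor:
        """Minimalist recurrent pooling over conv outputs.
        z, f: (batch, seq_len, hidden); f in (0, 1) from a sigmoid-gated conv."""
        h, outputs = torch.zeros_like(z[:, 0]), []
        for t in range(z.shape[1]):
            h = f[:, t] * h + (1 - f[:, t]) * z[:, t]
            outputs.append(h)
        return torch.stack(outputs, dim=1)
    ```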
  • Patent number: 12198060
    Abstract: Embodiments described herein combine both masked reconstruction and predictive coding. Specifically, unlike contrastive learning, the mutual information between past states and future states is directly estimated. The context information can also be directly captured via shifted masked reconstruction: unlike standard masked reconstruction, the target reconstructed observations are shifted slightly towards the future to incorporate more predictability. The estimated mutual information and shifted masked reconstruction loss can then be combined as the loss function to update the neural model.
    Type: Grant
    Filed: August 28, 2020
    Date of Patent: January 14, 2025
    Assignee: Salesforce, Inc.
    Inventors: Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong
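    A loss-level sketch, assuming `mi_estimate` and `recon_loss` callables; the shift amount and weighting are illustrative assumptions.
    ```python
    def combined_loss(mi_estimate, recon_loss, states, targets, mask, shift=2, lam=1.0):
        """Maximize estimated mutual information between past and future states
        while reconstructing masked targets shifted toward the future."""
        mi = mi_estimate(states[:, :-shift], states[:, shift:])
        recon = recon_loss(states[:, :-shift], targets[:, shift:], mask[:, shift:])
        return -mi + lam * recon
    ```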
  • Publication number: 20240428044
    Abstract: Embodiments described herein provide a framework that integrates a retriever model with an LLM, feeding retrieved passages to the LLM to generate an answer conditioned on those passages in response to a query. For example, in one embodiment, a single-round approach is implemented, which involves directly transmitting the retrieved passages to the LLM. In another embodiment, a multi-round approach is implemented, which involves initially presenting the retrieved passages to the LLM, collecting its responses, and then adjusting subsequent interactions with the LLM based on this feedback.
    Type: Application
    Filed: October 30, 2023
    Publication date: December 26, 2024
    Inventors: Ye Liu, Semih Yavuz, Meghana Moorthy Bhat, Rui Meng, Shafiq Joty, Caiming Xiong, Yingbo Zhou
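    A sketch of the single-round flow, assuming `retrieve` and `llm` callables; the prompt format is an illustrative assumption.
    ```python
    def answer(query: str, retrieve, llm, k: int = 5) -> str:
        """Retrieve top-k passages, then ask the LLM to answer conditioned on
        them. A multi-round variant would inspect the response and re-prompt
        or re-retrieve based on it."""
        passages = retrieve(query, k)
        context = "\n\n".join(passages)
        return llm(f"Passages:\n{context}\n\nQuestion: {query}\nAnswer:")
    ```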
  • Patent number: 12169698
    Abstract: Embodiments described herein provide a pipelined natural language question answering system that improves on a BERT-based system. Specifically, the natural language question answering system uses a pipeline of neural networks, each trained to perform a particular task. The context selection network identifies premium context from the context for the question. The question type network classifies the natural language question as a yes, no, or span question, and produces a yes or no answer when the question is a yes or no question. The span extraction model determines an answer span when the question is a span question.
    Type: Grant
    Filed: September 7, 2023
    Date of Patent: December 17, 2024
    Assignee: Salesforce, Inc.
    Inventors: Akari Asai, Kazuma Hashimoto, Richard Socher, Caiming Xiong
  • Publication number: 20240411992
    Abstract: Embodiments described herein provide a training framework for generative NLP models. Specifically, the training input, e.g., in the form of a sequence of tokens representing a user-agent dialogue, may be randomly masked over a few spans, each of which can be one or more tokens, words, sentences, or paragraphs. These masked spans are replaced with their embeddings generated from pre-trained large language models, and the resulting sequences are then used for training the NLP model.
    Type: Application
    Filed: June 15, 2023
    Publication date: December 12, 2024
    Inventors: Shiva Kumar Pentyala, Prafulla Kumar Choubey, Shashank Harinath, Sitaram Asur, Chien-Sheng Jason Wu, Zachary Alexander, Caiming Xiong
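    A sketch of the masking step, assuming an `embed_span` callable backed by a frozen pretrained LLM; the span-selection policy and span length are illustrative assumptions (the abstract allows spans of tokens, words, sentences, or paragraphs).
    ```python
    import random

    def mask_spans(tokens, embed_span, n_spans=2, span_len=3):
        """Replace randomly chosen spans with a single pretrained-LLM embedding
        each, producing the training input for the NLP model."""
        starts = sorted(random.sample(range(max(1, len(tokens) - span_len)), n_spans))
        out, i = [], 0
        for s in starts:
            if s < i:  # skip spans overlapping an already-masked region
                continue
            out.extend(tokens[i:s])
            out.append(embed_span(tokens[s:s + span_len]))  # span -> one embedding
            i = s + span_len
        out.extend(tokens[i:])
        return out
    ```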
  • Publication number: 20240411991
    Abstract: Embodiments described herein provide a training framework for generative NLP models that operate on previously learnt knowledge from pretrained large language models. Specifically, to train an NLP model to generate a response to a user utterance (e.g., “resolve login issue”), document embeddings of IT support documents encoded by a pretrained LLM are fed to an NLP decoder together with a training dialogue (e.g., a dialogue with the chat agent on how to “resolve login issue”). The NLP decoder can thus be trained with a causal language modeling loss computed from the predicted next token and the ground-truth token from the training dialogue.
    Type: Application
    Filed: June 6, 2023
    Publication date: December 12, 2024
    Inventors: Shiva Kumar Pentyala, Prafulla Kumar Choubey, Shashank Harinath, Sitaram Asur, Chien-Sheng Jason Wu, Zachary Alexander, Caiming Xiong
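    A sketch of one training step, assuming a `decoder` that accepts document embeddings as context; that signature and the tensor shapes are assumptions for illustration.
    ```python
    import torch
    import torch.nn.functional as F

    def train_step(decoder, doc_embeds, dialogue_ids, opt):
        """Causal LM step: the decoder consumes frozen-LLM document embeddings
        as context and predicts each next dialogue token.
        doc_embeds: (batch, n_docs, d); dialogue_ids: (batch, seq_len)."""
        logits = decoder(context=doc_embeds, input_ids=dialogue_ids[:, :-1])
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               dialogue_ids[:, 1:].reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()
    ```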
  • Patent number: 12164878
    Abstract: Embodiments described herein provide a cross-lingual sentence alignment framework that is trained only on rich-resource language pairs. To obtain an accurate aligner, a pretrained multi-lingual language model is used, and a classifier is trained on parallel data from rich-resource language pairs. This trained classifier may then be used for cross-lingual transfer with low-resource languages.
    Type: Grant
    Filed: January 21, 2022
    Date of Patent: December 10, 2024
    Assignee: Salesforce, Inc.
    Inventors: Tong Niu, Kazuma Hashimoto, Yingbo Zhou, Caiming Xiong
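    A sketch of the transfer recipe: a small classifier is trained on sentence embeddings from a frozen pretrained multilingual LM using only rich-resource parallel pairs, then applied unchanged to low-resource pairs. The head architecture is an assumption.
    ```python
    import torch
    import torch.nn as nn

    class AlignmentClassifier(nn.Module):
        """Scores whether two sentences are aligned (parallel), given sentence
        embeddings from a frozen pretrained multilingual LM."""
        def __init__(self, d_model: int):
            super().__init__()
            self.head = nn.Sequential(nn.Linear(2 * d_model, d_model),
                                      nn.ReLU(),
                                      nn.Linear(d_model, 1))

        def forward(self, emb_src: torch.Tensor, emb_tgt: torch.Tensor) -> torch.Tensor:
            return self.head(torch.cat([emb_src, emb_tgt], dim=-1))
    ```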
  • Patent number: 12165053
    Abstract: A method for using a neural network to generate an improved graph model includes receiving, by the neural network, a graph model. The graph model is based on data relating to an environment for allocating resources to a first group and a second group. The method further includes receiving, by the neural network, a budget for editing the graph model based on the cost of corresponding modifications to the environment, and determining, by the neural network, a fairness representation based on a fairness requirement between the first and second groups. The neural network also determines a utility function for the graph model based on first and second group utilities representing resource allocation to the first and second groups, respectively. Reinforcement learning is performed on the neural network to generate the improved graph model using the utility function and the fairness representation.
    Type: Grant
    Filed: November 17, 2020
    Date of Patent: December 10, 2024
    Assignee: Salesforce, Inc.
    Inventors: Govardana Sachithanandam Ramachandran, Ivan Brugere, Lav Varshney, Caiming Xiong
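    A minimal sketch of the kind of objective an RL graph editor could optimize under this description; the absolute-difference penalty is an assumed stand-in for the patent's fairness representation, not its actual formulation.
    ```python
    def edit_objective(u1: float, u2: float, lam: float = 1.0) -> float:
        """Reward for an RL-proposed graph edit: combined group utility minus a
        weighted fairness penalty between the two groups."""
        return (u1 + u2) - lam * abs(u1 - u2)
    ```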