Patents by Inventor Caiming Xiong

Caiming Xiong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

SYSTEMS AND METHODS FOR HIERARCHICAL MULTI-LABEL CONTRASTIVE LEARNING

Publication number: 20220300761

Abstract: Embodiments described herein provide a hierarchical multi-label framework to learn an embedding function that may capture the hierarchical relationship between classes at different levels in the hierarchy. Specifically, a supervised contrastive learning framework may be extended to the hierarchical multi-label setting. Each data point has multiple dependent labels, and the relationship between labels is represented as a hierarchy of labels. The relationship between the different levels of labels may then be learnt by a contrastive learning framework.

Type: Application

Filed: May 24, 2021

Publication date: September 22, 2022

Inventors: Shu Zhang, Chetan Ramaiah, Caiming Xiong, Ran Xu

MACHINE LEARNING BASED MODELS FOR AUTOMATIC CONVERSATIONS IN ONLINE SYSTEMS

Publication number: 20220293094

Abstract: A system uses conversation engines to process natural language requests and conduct automatic conversations with users. The system generates responses to users in an online conversation. The system ranks generated user responses for the online conversation. The system generates a context vector based on a sequence of utterances of the conversation and generates response vectors for generated user responses. The system ranks the user responses based on a comparison of the context vectors and user response vectors. The system uses a machine learning based model that uses a pretrained neural network that supports multiple languages. The system determines a context of an utterance based on utterances in the conversation. The system generates responses and ranks them based on the context. The ranked responses are used to respond to the user.

Type: Application

Filed: March 15, 2021

Publication date: September 15, 2022

Inventors: Yixin Mao, Zachary Alexander, Victor Winslow Yee, Joseph R. Zeimen, Na Cheng, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
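
As a rough illustration of the ranking step described in the abstract above, the sketch below orders candidate responses by how closely their vectors match a context vector. The abstract only says the ranking is "based on a comparison" of the vectors; the use of cosine similarity, the function names, and the toy vectors are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_responses(context_vector, response_vectors):
    """Return response ids sorted from most to least similar to the context."""
    scored = [(cosine(context_vector, vec), rid) for rid, vec in response_vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rid for _, rid in scored]

# Toy vectors (hypothetical); a real system would obtain them from a neural encoder.
context = [1.0, 0.0, 1.0]
candidates = {
    "a": [1.0, 0.0, 0.9],
    "b": [0.0, 1.0, 0.0],
    "c": [0.5, 0.5, 0.5],
}
ranking = rank_responses(context, candidates)
```

Here `ranking` lists the candidate ids from best to worst match; any other similarity measure could be substituted for `cosine` without changing the surrounding logic.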

NEURAL NETWORK BASED REPRESENTATION LEARNING FOR NATURAL LANGUAGE PROCESSING

Publication number: 20220277141

Abstract: A machine learning based model generates a feature representation of a text sequence, for example, a natural language sentence or phrase. The system trains the machine learning based model by receiving an input text sequence and perturbing the input text sequence by masking a subset of tokens. The machine learning based model is used to predict the masked tokens. A predicted text sequence is generated based on the predictions of the masked tokens. The system processes the predicted text sequence using the machine learning based model to determine whether a token was predicted or an original token. The parameters of the machine learning based model are adjusted to minimize an aggregate loss based on prediction of the correct word for a masked token and a classification of a word as original or replaced.

Type: Application

Filed: February 26, 2021

Publication date: September 1, 2022

Inventors: Erik Lennart Nijkamp, Caiming Xiong

SYSTEMS AND METHODS FOR CONTRASTIVE LEARNING WITH SELF-LABELING REFINEMENT

Publication number: 20220269946

Abstract: Embodiments described herein provide a contrastive learning mechanism with self-labeling refinement, which iteratively employs the network and data themselves to generate more accurate and informative soft labels for contrastive learning. Specifically, the contrastive learning framework includes a self-labeling refinery module to explicitly generate accurate labels, and a momentum mix-up module to increase similarity between a query and its positive, which in turn implicitly improves label accuracy.

Type: Application

Filed: July 14, 2021

Publication date: August 25, 2022

Inventors: Pan Zhou, Caiming Xiong, Chu Hong Hoi

Learning dialogue state tracking with limited labeled data

Patent number: 11416688

Abstract: Embodiments described in this disclosure illustrate the use of self-/semi-supervised approaches for label-efficient dialogue state tracking (DST) in task-oriented dialogue systems. Conversational behavior is modeled by next response generation and turn utterance generation tasks. Prediction consistency is strengthened by augmenting data with stochastic word dropout and label guessing. Experimental results show that by exploiting self-supervision the joint goal accuracy can be boosted with limited labeled data.

Type: Grant

Filed: May 8, 2020

Date of Patent: August 16, 2022

Assignee: salesforce.com, inc.

Inventors: Chien-Sheng Wu, Chu Hong Hoi, Caiming Xiong

Three-dimensional (3D) convolution with 3D batch normalization

Patent number: 11416747

Abstract: A method of classifying three-dimensional (3D) data includes receiving three-dimensional (3D) data and processing the 3D data using a neural network that includes a plurality of subnetworks arranged in a sequence and the data is processed through each of the subnetworks. Each of the subnetworks is configured to receive an output generated by a preceding subnetwork in the sequence, process the output through a plurality of parallel 3D convolution layer paths of varying convolution volume, process the output through a parallel pooling path, and concatenate output of the 3D convolution layer paths and the pooling path to generate an output representation from each of the subnetworks. Following processing the data through the subnetworks, the method includes processing the output of a last one of the subnetworks in the sequence through a vertical pooling layer to generate an output and classifying the received 3D data based upon the generated output.

Type: Grant

Filed: March 15, 2019

Date of Patent: August 16, 2022

Assignee: salesforce.com, inc.

Inventors: Richard Socher, Caiming Xiong, Kai Sheng Tai

Natural language processing using context-specific word vectors

Patent number: 11409945

Abstract: A system is provided for natural language processing. In some embodiments, the system includes an encoder for generating context-specific word vectors for at least one input sequence of words. The encoder is pre-trained using training data for performing a first natural language processing task. A neural network performs a second natural language processing task on the at least one input sequence of words using the context-specific word vectors. The first natural language processing task is different from the second natural language processing task and the neural network is separately trained from the encoder. In some embodiments, the first natural language processing task can be machine translation, and the second natural language processing task can be one of sentiment analysis, question classification, entailment classification, and question answering.

Type: Grant

Filed: September 21, 2020

Date of Patent: August 9, 2022

Assignee: salesforce.com, inc.

Inventors: Bryan McCann, Caiming Xiong, Richard Socher

Block-diagonal hessian-free optimization for recurrent and convolutional neural networks

Patent number: 11386327

Abstract: Embodiments for training a neural network are provided. A neural network is divided into a first block and a second block, and the parameters in the first block and second block are trained in parallel. To train the parameters, a gradient from a gradient mini-batch included in training data is generated. A curvature-vector product from a curvature mini-batch included in the training data is also generated. The gradient and the curvature-vector product generate a conjugate gradient. The conjugate gradient is used to determine a change in parameters in the first block in parallel with a change in parameters in the second block. The curvature matrix in the curvature-vector product includes zero values when the terms correspond to parameters from different blocks.

Type: Grant

Filed: May 18, 2018

Date of Patent: July 12, 2022

Assignee: salesforce.com, inc.

Inventors: Huishuai Zhang, Caiming Xiong
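
The block-diagonal structure mentioned at the end of the abstract above can be pictured with a small sketch: entries of a curvature matrix that couple parameters from different blocks are set to zero, so a curvature-vector product for one block never reads parameters from another. The matrix values, the `block_of` assignment, and the function names below are hypothetical, and the sketch does not attempt the conjugate-gradient solve itself.

```python
def block_diagonal(curvature, block_of):
    """Zero every entry whose row and column parameters belong to different blocks."""
    n = len(curvature)
    return [
        [curvature[i][j] if block_of[i] == block_of[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]

def matvec(matrix, vector):
    """Plain matrix-vector product, standing in for a curvature-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Toy symmetric curvature matrix over three parameters (hypothetical values).
full = [
    [4.0, 1.0, 0.5],
    [1.0, 3.0, 0.2],
    [0.5, 0.2, 2.0],
]
block_of = [0, 0, 1]  # parameters 0 and 1 form the first block, parameter 2 the second

approx = block_diagonal(full, block_of)
product = matvec(approx, [1.0, 1.0, 1.0])
```

Because the cross-block entries of `approx` are zero, the first two components of `product` depend only on the first block and the last component only on the second, which is the property that would allow the two blocks to be handled independently.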

IMAGE ANALYSIS BASED DOCUMENT PROCESSING FOR INFERENCE OF KEY-VALUE PAIRS IN NON-FIXED DIGITAL DOCUMENTS

Publication number: 20220215195

Abstract: An online system extracts information from non-fixed form documents. The online system receives an image of a form document and obtains a set of phrases and locations of the set of phrases on the form image. For at least one field, the online system determines key scores for the set of phrases. The online system identifies a set of candidate values for the field from the set of identified phrases and identifies a set of neighbors for each candidate value from the set of identified phrases. The online system determines neighbor scores, where a neighbor score for a candidate value and a respective neighbor is determined based on the key score for the neighbor and a spatial relationship of the neighbor to the candidate value. The online system selects a candidate value and a respective neighbor based on the neighbor score as the value and key for the field.

Type: Application

Filed: January 4, 2021

Publication date: July 7, 2022

Inventors: Mingfei Gao, Zeyuan Chen, Le Xue, Ran Xu, Caiming Xiong

SYSTEMS AND METHODS FOR UNIFYING QUESTION ANSWERING AND TEXT CLASSIFICATION VIA SPAN EXTRACTION

Publication number: 20220171943

Abstract: Systems and methods for unifying question answering and text classification via span extraction include a preprocessor for preparing a source text and an auxiliary text based on a task type of a natural language processing task, an encoder for receiving the source text and the auxiliary text from the preprocessor and generating an encoded representation of a combination of the source text and the auxiliary text, and a span-extractive decoder for receiving the encoded representation and identifying a span of text within the source text that is a result of the NLP task. The task type is one of entailment, classification, or regression. In some embodiments, the source text includes one or more of text received as input when the task type is entailment, a list of classifications when the task type is entailment or classification, or a list of similarity options when the task type is regression.

Type: Application

Filed: February 16, 2022

Publication date: June 2, 2022

Inventors: Nitish Shirish Keskar, Bryan McCann, Richard Socher, Caiming Xiong

System and method for unsupervised density based table structure identification

Patent number: 11347708

Abstract: Embodiments described herein provide unsupervised density-based clustering to infer table structure from a document. Specifically, a number of words are identified from a block of text in a noneditable document, and the spatial coordinates of each word relative to the rectangular region are identified. Based on the word density of the rectangular region, the words are grouped into clusters using a heuristic radius search method. Words that are grouped into the same cluster are determined to be the elements that belong to the same cell. In this way, the cells of the table structure can be identified. Once the cells are identified based on the word density of the block of text, the identified cells can be expanded horizontally or grouped vertically to identify rows or columns of the table structure.

Type: Grant

Filed: November 11, 2019

Date of Patent: May 31, 2022

Assignee: salesforce.com, inc.

Inventors: Ankit Chadha, Zeyuan Chen, Caiming Xiong, Ran Xu, Richard Socher
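
To make the idea of grouping words into cells by spatial proximity more concrete, here is a minimal sketch of radius-based clustering over word coordinates. It assumes a fixed radius and single-link merging; the abstract's "heuristic radius search" presumably chooses the radius from the word density, which this example does not attempt. The word tuples and the function name are hypothetical.

```python
import math

def cluster_words(words, radius):
    """Group (text, x, y) words so that each word is within `radius` of some
    other word in its cluster (single-link grouping with a fixed radius)."""
    clusters = []
    for word in words:
        near, far = [], []
        for cluster in clusters:
            close = any(math.dist(word[1:], other[1:]) <= radius for other in cluster)
            (near if close else far).append(cluster)
        # Merge every cluster the new word touches, then add the word itself.
        merged = [w for cluster in near for w in cluster] + [word]
        clusters = far + [merged]
    return clusters

# Hypothetical word positions on one text line of a scanned table.
words = [("Total", 0, 0), ("due", 12, 0), ("42.00", 100, 0), ("USD", 111, 0)]
cells = [[w[0] for w in cluster] for cluster in cluster_words(words, radius=15)]
```

With these toy coordinates the two left-hand words fall into one group and the two right-hand words into another, mirroring how nearby words would be assigned to the same table cell.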

Phone-based sub-word units for end-to-end speech recognition

Patent number: 11328731

Abstract: Systems and methods for identifying a text word from a spoken utterance are provided. An ensemble BPE system that includes a phone BPE system and a character BPE system receives a spoken utterance. Both BPE systems include a multi-level language model (LM) and an acoustic model. The phone BPE system identifies first words from the spoken utterance and determines a first score for each first word. The first words are converted into character sequences. The character BPE system converts the character sequences into second words and determines a second score for each second word. For each word from the first words that matches a word in the second words, the first and second scores are combined. The text word is the word with the highest score.

Type: Grant

Filed: June 17, 2020

Date of Patent: May 10, 2022

Assignee: salesforce.com, inc.

Inventors: Weiran Wang, Yingbo Zhou, Caiming Xiong
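
The final score-combination step in the abstract above can be sketched in a few lines. The example assumes the scores are log-probabilities combined by simple addition; the abstract only says the scores "are combined", so the combination rule, the function name, and the toy scores are illustrative assumptions.

```python
def pick_word(first_scores, second_scores):
    """Combine scores for words found by both systems and return the best word.

    Returns None when the two systems share no word.
    """
    combined = {
        word: first_scores[word] + second_scores[word]  # assumed rule: sum of log-scores
        for word in first_scores
        if word in second_scores
    }
    if not combined:
        return None
    return max(combined, key=combined.get)

# Hypothetical log-scores from the phone-level and character-level systems.
phone_scores = {"write": -1.2, "right": -0.9, "rite": -3.0}
char_scores = {"right": -0.7, "write": -0.8}
best = pick_word(phone_scores, char_scores)
```

Words proposed by only one system ("rite" here) are ignored, and among the shared words the one with the highest combined score is returned.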

SYSTEM AND METHODS FOR TRAINING TASK-ORIENTED DIALOGUE (TOD) LANGUAGE MODELS

Publication number: 20220139384

Abstract: Embodiments described herein provide methods and systems for training task-oriented dialogue (TOD) language models. In some embodiments, a TOD language model may receive a TOD dataset including a plurality of dialogues and a model input sequence may be generated from the dialogues using a first token prefixed to each user utterance and a second token prefixed to each system response of the dialogues. In some embodiments, the first token or the second token may be randomly replaced with a mask token to generate a masked training sequence and a masked language modeling (MLM) loss may be computed using the masked training sequence. In some embodiments, the TOD language model may be updated based on the MLM loss.

Type: Application

Filed: November 3, 2020

Publication date: May 5, 2022

Inventors: Chien-Sheng Wu, Chu Hong Hoi, Richard Socher, Caiming Xiong
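
The input construction described in the abstract above, with one token prefixed to each user utterance and another to each system response followed by random masking, might look roughly like the sketch below. The token strings `[USR]`, `[SYS]`, and `[MASK]`, the whitespace tokenization, and the masking probability are assumptions for illustration; the MLM loss computation is omitted.

```python
import random

# Hypothetical token names; the abstract only calls them the first, second, and mask tokens.
USR, SYS, MASK = "[USR]", "[SYS]", "[MASK]"

def build_sequence(dialogue):
    """Flatten (speaker, utterance) turns into one token list with role prefixes."""
    tokens = []
    for speaker, utterance in dialogue:
        tokens.append(USR if speaker == "user" else SYS)
        tokens.extend(utterance.split())  # whitespace split stands in for a real tokenizer
    return tokens

def mask_sequence(tokens, mask_prob, rng):
    """Replace each token (role tokens included) with MASK with probability mask_prob."""
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]

dialogue = [("user", "book a table"), ("system", "for how many")]
sequence = build_sequence(dialogue)
masked = mask_sequence(sequence, 0.15, random.Random(0))
```

A training loop would then ask the model to recover the original tokens at the masked positions of `masked` and update the model from the resulting loss.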

SYSTEMS AND METHODS FOR MULTI-SCALE PRE-TRAINING WITH DENSELY CONNECTED TRANSFORMER

Publication number: 20220129626

Abstract: Embodiments described herein propose a densely connected Transformer architecture in which each Transformer layer takes advantage of all previous layers. Specifically, the input for each Transformer layer comes from the outputs of all its preceding layers, and the output information of each layer is incorporated in all its subsequent layers. In this way, an L-layer Transformer network has L(L+1)/2 connections. The dense connection allows the linguistic information learned by the lower layers to be directly propagated to all upper layers and encourages feature reuse throughout the network. Each layer is thus directly optimized from the loss function in the fashion of implicit deep supervision.

Type: Application

Filed: October 26, 2020

Publication date: April 28, 2022

Inventors: Linqing Liu, Caiming Xiong
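
The L(L+1)/2 connection count quoted in the abstract above can be checked with a short enumeration. The sketch assumes that the embedding output counts as source 0 and that layers are numbered 1 through L, so layer t receives t inputs; that numbering is an interpretation chosen to reproduce the stated count, not something the abstract spells out.

```python
def dense_connections(num_layers):
    """List (source, target) pairs for a densely connected stack.

    Source 0 is the embedding output (an assumption); layers are numbered 1..num_layers,
    and each layer receives the outputs of everything that precedes it.
    """
    return [
        (source, target)
        for target in range(1, num_layers + 1)
        for source in range(target)
    ]

connections = dense_connections(4)
count = len(connections)  # compare with 4 * (4 + 1) // 2
```

For a four-layer stack the enumeration yields ten pairs, matching 4(4+1)/2, and the same agreement holds for any layer count under this numbering.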

SYSTEMS AND METHODS FOR UNSUPERVISED PARAPHRASE GENERATION

Publication number: 20220129629

Abstract: Embodiments described herein provide dynamic blocking, a decoding algorithm which enables large-scale pretrained language models to generate high-quality paraphrases in an unsupervised setting. Specifically, in order to obtain an alternative surface form, when the language model emits a token that is present in the source sequence, the language model is prevented, at the next time step, from generating the token that immediately follows it in the source sequence. In this way, the language model is forced to generate a paraphrased sequence of the input source sequence, but with mostly different wording.

Type: Application

Filed: January 28, 2021

Publication date: April 28, 2022

Inventors: Tong Niu, Semih Yavuz, Yingbo Zhou, Nitish Shirish Keskar, Huan Wang, Caiming Xiong
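
The blocking rule in the abstract above can be illustrated with a deterministic sketch: after the model emits a token that also occurs in the source, every token that directly follows that token in the source is forbidden at the next step. The function names and the greedy selection from a ranked candidate list are assumptions for this example; a real decoder would apply the block to the model's output distribution, and the abstract does not say whether blocking is applied every time or only some of the time.

```python
def blocked_tokens(source, last_emitted):
    """Tokens forbidden at the next step: every source token that directly
    follows an occurrence of last_emitted in the source sequence."""
    return {
        source[i + 1]
        for i in range(len(source) - 1)
        if source[i] == last_emitted
    }

def next_token(ranked_candidates, source, last_emitted):
    """Pick the highest-ranked candidate that is not blocked (None if all are blocked)."""
    blocked = blocked_tokens(source, last_emitted)
    for token in ranked_candidates:
        if token not in blocked:
            return token
    return None

source = ["the", "cat", "sat", "on", "the", "mat"]
blocked = blocked_tokens(source, "the")
choice = next_token(["cat", "kitten", "mat"], source, "the")
```

After emitting "the", both "cat" and "mat" are blocked because each follows "the" somewhere in the source, so the decoder falls through to the next candidate and is pushed toward different wording.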

SYSTEMS AND METHODS FOR COUNTERFACTUAL EXPLANATION IN MACHINE LEARNING MODELS

Publication number: 20220114464

Abstract: Embodiments described herein provide a two-stage model-agnostic approach for generating counterfactual explanations via counterfactual feature selection and counterfactual feature optimization. Given a query instance, counterfactual feature selection picks a subset of feature columns and values that can potentially change the prediction and then counterfactual feature optimization determines the best feature value for the selected feature as a counterfactual example.

Type: Application

Filed: January 29, 2021

Publication date: April 14, 2022

Inventors: Wenzhuo Yang, Jia Li, Chu Hong Hoi, Caiming Xiong

SYSTEMS AND METHODS FOR COUNTERFACTUAL EXPLANATION IN MACHINE LEARNING MODELS

Publication number: 20220114481

Abstract: Embodiments described herein provide a two-stage model-agnostic approach for generating counterfactual explanations via counterfactual feature selection and counterfactual feature optimization. Given a query instance, counterfactual feature selection picks a subset of feature columns and values that can potentially change the prediction and then counterfactual feature optimization determines the best feature value for the selected feature as a counterfactual example.

Type: Application

Filed: January 29, 2021

Publication date: April 14, 2022

Inventors: Wenzhuo Yang, Jia Li, Chu Hong Hoi, Caiming Xiong

COARSE-TO-FINE ABSTRACTIVE DIALOGUE SUMMARIZATION WITH CONTROLLABLE GRANULARITY

Publication number: 20220108086

Abstract: Dialogue summarization is challenging due to its multi-speaker standpoints, casual spoken language style, and limited labelled data. The embodiments are directed to a coarse-to-fine dialogue summarization model that improves abstractive dialogue summarization quality and enables granular controllability. A summary draft that includes key words for turns in a dialogue conversation history is created. The summary draft includes pseudo-labelled interrogative pronoun categories and noisy key phrases. The dialogue conversation history is divided into segments. A generative language model is trained to generate a segment summary for each dialogue segment using a portion of the summary draft that corresponds to at least one dialogue turn in the dialogue segment. A dialogue summary is generated using the generative language model trained using the summary draft.

Type: Application

Filed: January 27, 2021

Publication date: April 7, 2022

Inventors: Chien-Sheng Wu, Wenhao Liu, Caiming Xiong, Linqing Liu

CUSTOMIZING CHATBOTS BASED ON USER SPECIFICATION

Publication number: 20220103491

Abstract: A conversation engine performs conversations with users using chatbots customized for performing a set of tasks that can be performed using an online system. The conversation engine loads a chatbot configuration that specifies the behavior of a chatbot including the tasks that can be performed by the chatbot, the types of entities relevant to each task, and so on. The conversation may be voice based and use natural language. The conversation engine may load different chatbot configurations to implement different chatbots. The conversation engine receives a conversation engine configuration that specifies the behavior of the conversation engine across chatbots. The system may be a multi-tenant system that allows customization of the chatbots for each tenant.

Type: Application

Filed: September 29, 2020

Publication date: March 31, 2022

Inventors: Xinyi Yang, Tian Xie, Caiming Xiong, Wenhao Liu, Huan Wang, Kazuma Hashimoto, Jin Qu, Feihong Wu, Yingbo Zhou

CONFIGURABLE CONVERSATION ENGINE FOR EXECUTING CUSTOMIZABLE CHATBOTS

Publication number: 20220101844

Abstract: A conversation engine performs conversations with users using chatbots customized for performing a set of tasks that can be performed using an online system. The conversation engine loads a chatbot configuration that specifies the behavior of a chatbot including the tasks that can be performed by the chatbot, the types of entities relevant to each task, and so on. The conversation may be voice based and use natural language. The conversation engine may load different chatbot configurations to implement different chatbots. The conversation engine receives a conversation engine configuration that specifies the behavior of the conversation engine across chatbots. The system may be a multi-tenant system that allows customization of the chatbots for each tenant.

Type: Application

Filed: September 29, 2020

Publication date: March 31, 2022

Inventors: Xinyi Yang, Tian Xie, Caiming Xiong, Wenhao Liu, Huan Wang, Kazuma Hashimoto, Yingbo Zhou, Xugang Ye, Jin Qu, Feihong Wu
