Patents by Inventor Richard Socher

Richard Socher has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220083837
    Abstract: The technology disclosed provides a so-called “joint many-task neural network model” to solve a variety of increasingly complex natural language processing (NLP) tasks using a growing depth of layers in a single end-to-end model. The model is successively trained by considering linguistic hierarchies, directly connecting word representations to all model layers, explicitly using predictions in lower tasks, and applying a so-called “successive regularization” technique to prevent catastrophic forgetting. Three examples of lower level model layers are a part-of-speech (POS) tagging layer, a chunking layer, and a dependency parsing layer. Two examples of higher level model layers are a semantic relatedness layer and a textual entailment layer. The model achieves state-of-the-art results on chunking, dependency parsing, semantic relatedness, and textual entailment.
    Type: Application
    Filed: November 23, 2021
    Publication date: March 17, 2022
    Inventors: Kazuma Hashimoto, Caiming Xiong, Richard Socher
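    Illustrative sketch (not from the patent): a minimal PyTorch rendering of the layered idea, in which the chunking layer explicitly consumes the shared word representations together with the POS layer's predictions, and a successive-regularization penalty keeps parameters near their values from the previous training stage. All dimensions and the delta weight are invented.

      import torch
      import torch.nn as nn

      class JointManyTask(nn.Module):
          # Hypothetical sizes; the patent does not fix any of these.
          def __init__(self, vocab=1000, d=64, n_pos=45, n_chunk=23):
              super().__init__()
              self.emb = nn.Embedding(vocab, d)
              self.pos_lstm = nn.LSTM(d, d, batch_first=True)
              self.pos_out = nn.Linear(d, n_pos)
              # The chunking layer sees the word representations AND the
              # POS layer's predictions, per the "directly connecting" idea.
              self.chunk_lstm = nn.LSTM(d + n_pos, d, batch_first=True)
              self.chunk_out = nn.Linear(d, n_chunk)

          def forward(self, tokens):
              w = self.emb(tokens)               # shared word representations
              h_pos, _ = self.pos_lstm(w)
              pos_logits = self.pos_out(h_pos)
              h_chunk, _ = self.chunk_lstm(
                  torch.cat([w, pos_logits.softmax(-1)], dim=-1))
              return pos_logits, self.chunk_out(h_chunk)

      def successive_regularization(model, prev_params, delta=1e-2):
          # Penalize drift from the previous stage's parameters, one
          # reading of the anti-catastrophic-forgetting term.
          return delta * sum(((p - q) ** 2).sum()
                             for p, q in zip(model.parameters(), prev_params))

      model = JointManyTask()
      pos_logits, chunk_logits = model(torch.randint(0, 1000, (2, 12)))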
  • Patent number: 11276002
    Abstract: Hybrid training of deep networks includes a multi-layer neural network. The training includes setting a current learning algorithm for the multi-layer neural network to a first learning algorithm. The training further includes iteratively applying training data to the neural network, determining a gradient for parameters of the neural network based on the applying of the training data, updating the parameters based on the current learning algorithm, and determining whether the current learning algorithm should be switched to a second learning algorithm based on the updating. The training further includes, in response to the determining that the current learning algorithm should be switched to a second learning algorithm, changing the current learning algorithm to the second learning algorithm and initializing a learning rate of the second learning algorithm based on the gradient and a step used by the first learning algorithm to update the parameters of the neural network.
    Type: Grant
    Filed: March 20, 2018
    Date of Patent: March 15, 2022
    Assignee: salesforce.com, inc.
    Inventors: Nitish Shirish Keskar, Richard Socher
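    Illustrative sketch (not from the patent): the claim says only that the second algorithm's learning rate is initialized from the gradient and the first algorithm's step. One plausible instantiation, assuming an Adam-to-SGD switch, picks the SGD rate whose step matches the component of the last adaptive step along the gradient; the switch criterion below is invented.

      import numpy as np

      def estimate_sgd_lr(adam_step, grad):
          # Choose the SGD learning rate whose step matches the component
          # of the last Adam step along the gradient direction (one
          # plausible reading of "based on the gradient and a step").
          return float(np.dot(adam_step, adam_step) /
                       (abs(np.dot(adam_step, grad)) + 1e-12))

      current, lr = "adam", None
      for it in range(1000):
          grad = np.random.randn(10)        # stand-in for a real gradient
          if current == "adam":
              adam_step = -0.001 * grad     # stand-in for the Adam update
              if it == 500:                 # invented switch criterion
                  lr = estimate_sgd_lr(adam_step, grad)
                  current = "sgd"
          else:
              sgd_step = -lr * grad         # continue with plain SGD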
  • Patent number: 11270145
    Abstract: Approaches for interpretable counting for visual question answering include a digital image processor, a language processor, a scorer, and a counter. The digital image processor identifies objects in an image, maps the identified objects into an embedding space, generates bounding boxes for each of the identified objects, and outputs the embedded objects paired with their bounding boxes. The language processor embeds a question into the embedding space. The scorer determines scores for the identified objects. Each respective score indicates how well a corresponding one of the identified objects is responsive to the question. The counter determines a count of the objects in the digital image that are responsive to the question based on the scores. The count and a corresponding bounding box for each object included in the count are output. In some embodiments, the counter determines the count interactively based on interactions between counted and uncounted objects.
    Type: Grant
    Filed: February 4, 2020
    Date of Patent: March 8, 2022
    Assignee: salesforce.com, inc.
    Inventors: Alexander Richard Trott, Caiming Xiong, Richard Socher
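    Illustrative sketch (not from the patent): scoring embedded objects against the embedded question and counting those above a threshold. Cosine scoring and the threshold value are invented, and the interactive counter is not shown.

      import numpy as np

      def count_objects(obj_embs, boxes, q_emb, threshold=0.5):
          # Score each embedded object against the embedded question and
          # count the ones that respond to it; return their bounding boxes.
          obj_embs = obj_embs / np.linalg.norm(obj_embs, axis=1, keepdims=True)
          q = q_emb / np.linalg.norm(q_emb)
          scores = obj_embs @ q
          keep = scores > threshold         # invented decision rule
          return int(keep.sum()), [b for b, k in zip(boxes, keep) if k]

      objs = np.random.randn(5, 16)         # stand-in detector output
      count, kept = count_objects(objs, [(0, 0, 10, 10)] * 5,
                                  np.random.randn(16))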
  • Publication number: 20220067277
    Abstract: A natural language processing system that trains task models for particular natural language tasks programmatically generates additional utterances for inclusion in the training set, based on the existing utterances in the training set and the existing state of a task model as generated from the original (non-augmented) training set. More specifically, the training augmentation module identifies specific textual units of utterances and generates variants of the utterances based on those identified units. The identification is based on the determined importance of the textual units to the output of the task model, as well as on task rules that correspond to the natural language task for which the task model is being generated. The generation of the additional utterances improves the quality of the task model without the expense of manual labeling of utterances for training set inclusion.
    Type: Application
    Filed: August 25, 2020
    Publication date: March 3, 2022
    Inventors: Shiva Kumar Pentyala, Mridul Gupta, Ankit Chadha, Indira Iyer, Richard Socher
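    Illustrative sketch (not from the patent): leave-one-out importance of textual units followed by substitution-based variant generation. The model interface, synonym table, and selection rule are invented stand-ins for the task rules.

      def token_importance(task_model, utterance):
          # How much the model's confidence drops when a token is removed;
          # task_model returns a confidence in [0, 1] (invented interface).
          tokens = utterance.split()
          base = task_model(utterance)
          return [(t, base - task_model(" ".join(tokens[:i] + tokens[i+1:])))
                  for i, t in enumerate(tokens)]

      def augment(task_model, utterance, synonyms, top_k=2):
          # Substitute the most important tokens; naive string replacement
          # stands in for proper, task-rule-aware editing.
          ranked = sorted(token_importance(task_model, utterance),
                          key=lambda p: -p[1])[:top_k]
          variants = []
          for tok, _ in ranked:
              for alt in synonyms.get(tok, []):
                  variants.append(utterance.replace(tok, alt))
          return variants

      stub = lambda text: min(1.0, 0.1 * len(text.split()))   # stub model
      print(augment(stub, "book a flight to boston",
                    {"flight": ["trip"], "book": ["reserve"]}))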
  • Patent number: 11250311
    Abstract: The technology disclosed proposes using a combination of a computationally cheap, less-accurate bag of words (BoW) model and a computationally expensive, more-accurate long short-term memory (LSTM) model to perform natural language processing tasks such as sentiment analysis. Use of the cheap, less-accurate BoW model is referred to herein as “skimming”. Use of the expensive, more-accurate LSTM model is referred to herein as “reading”. The technology disclosed presents a probability-based guider (PBG). PBG combines the use of the BoW model and the LSTM model. PBG uses a probability thresholding strategy to determine, based on the results of the BoW model, whether to invoke the LSTM model for reliably classifying a sentence as positive or negative. The technology disclosed also presents a deep neural network-based decision network (DDN) that is trained to learn the relationship between the BoW model and the LSTM model and to invoke only one of the two models.
    Type: Grant
    Filed: December 22, 2017
    Date of Patent: February 15, 2022
    Assignee: salesforce.com, inc.
    Inventors: Alexander Rosenberg Johansen, Bryan McCann, James Bradbury, Richard Socher
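    Illustrative sketch (not from the patent): the probability-based guider, which skims with the cheap BoW model and only invokes the expensive LSTM when the BoW probability is not decisive. The models are stubs and the threshold tau is invented.

      def classify(sentence, bow_model, lstm_model, tau=0.9):
          # Skim: the BoW model gives a probability of 'positive'.
          p_pos = bow_model(sentence)
          if max(p_pos, 1.0 - p_pos) >= tau:   # confident enough: skim only
              return "positive" if p_pos >= 0.5 else "negative"
          return lstm_model(sentence)          # otherwise fall back to reading

      bow = lambda s: 0.97 if "great" in s else 0.55       # stub BoW model
      lstm = lambda s: "negative" if "not" in s else "positive"  # stub LSTM
      print(classify("a great movie", bow, lstm))   # skims
      print(classify("not that good", bow, lstm))   # reads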
  • Publication number: 20220044093
    Abstract: A computer-implemented method for dual sequence inference using a neural network model includes generating a codependent representation based on a first input representation of a first sequence and a second input representation of a second sequence using an encoder of the neural network model and generating an inference based on the codependent representation using a decoder of the neural network model. The neural network model includes a plurality of model parameters learned according to a machine learning process. The encoder includes a plurality of coattention layers arranged sequentially, each coattention layer being configured to receive a pair of layer input representations and generate one or more summary representations, and an output layer configured to receive the one or more summary representations from a last layer among the plurality of coattention layers and generate the codependent representation.
    Type: Application
    Filed: October 20, 2021
    Publication date: February 10, 2022
    Inventors: Victor Zhong, Caiming Xiong, Richard Socher
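    Illustrative sketch (not from the patent): one coattention layer over two sequences, stacked sequentially with a stand-in output layer. Depth, dimensions, and the pooling used for the codependent representation are invented.

      import torch

      def coattention(x, y):
          # One coattention layer over two sequences (batch, len, dim): an
          # affinity matrix attends each sequence over the other and
          # returns summary representations of both.
          a = torch.bmm(x, y.transpose(1, 2))        # affinities
          s_x = torch.bmm(a.softmax(dim=-1), y)      # x summarized with y
          s_y = torch.bmm(a.transpose(1, 2).softmax(dim=-1), x)
          return s_x, s_y

      x, y = torch.randn(2, 7, 32), torch.randn(2, 5, 32)
      for _ in range(3):                             # depth is invented
          x, y = coattention(x, y)
      codependent = torch.cat([x.mean(1), y.mean(1)], dim=-1)  # stand-in output layer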
  • Patent number: 11244111
    Abstract: The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.
    Type: Grant
    Filed: October 30, 2019
    Date of Patent: February 8, 2022
    Assignee: salesforce.com, inc.
    Inventors: Jiasen Lu, Caiming Xiong, Richard Socher
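    Illustrative sketch (not from the patent): adaptive attention that appends the visual sentinel to the image regions, so the attention weight on the sentinel acts as the gate for relying on linguistic rather than visual information. Dot-product scoring and the dimensions are simplifications.

      import torch

      def adaptive_attention(h, regions, sentinel):
          # h: decoder LSTM hidden state (batch, d); regions: CNN spatial
          # features (batch, k, d); sentinel: derived from the LSTM memory.
          cand = torch.cat([regions, sentinel.unsqueeze(1)], dim=1)  # (b, k+1, d)
          scores = torch.bmm(cand, h.unsqueeze(2)).squeeze(2)
          alpha = scores.softmax(dim=-1)
          beta = alpha[:, -1:]      # weight on the sentinel
          context = torch.bmm(alpha.unsqueeze(1), cand).squeeze(1)
          return context, beta      # beta near 1 => lean on language, not image

      h, v, s = torch.randn(2, 64), torch.randn(2, 9, 64), torch.randn(2, 64)
      ctx, beta = adaptive_attention(h, v, s)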
  • Publication number: 20220036884
    Abstract: Embodiments described herein provide safe policy improvement (SPI) in a batch reinforcement learning framework for a task-oriented dialogue. Specifically, a batch reinforcement learning framework for dialogue policy learning is provided, which improves the performance of the dialogue and learns to shape a reward that reasons about the intention behind the human response rather than just imitating the human demonstration.
    Type: Application
    Filed: October 13, 2021
    Publication date: February 3, 2022
    Inventors: Govardana Sachithanandam Ramachandran, Kazuma Hashimoto, Caiming Xiong, Richard Socher
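    Illustrative sketch (not from the patent): the abstract does not give the objective, so this shows a generic batch-RL flavor of safe policy improvement, i.e., maximizing an estimated, shaped return while keeping the new policy close to the behavior policy observed in the batch. All names and the KL-penalty form are assumptions.

      import numpy as np

      def spi_objective(pi, pi_b, q_hat, reward_shaping, eps=0.1):
          # Estimated return under the shaped reward, penalized when the
          # candidate policy strays too far from the behavior policy.
          ret = float(np.sum(pi * (q_hat + reward_shaping)))
          kl = float(np.sum(pi * np.log((pi + 1e-12) / (pi_b + 1e-12))))
          return ret if kl <= eps else ret - 1e3 * (kl - eps)

      pi_b = np.array([0.5, 0.3, 0.2])     # behavior policy over 3 actions
      pi = np.array([0.55, 0.3, 0.15])     # candidate improved policy
      print(spi_objective(pi, pi_b, np.array([1.0, 0.2, 0.1]), np.zeros(3)))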
  • Patent number: 11232308
    Abstract: Embodiments described herein provide a two-stage online detection of action start system including a classification module and a localization module. The classification module generates a set of action scores corresponding to a first video frame from the video, based on the first video frame and the video frames that precede it in the video. Each action score indicates a respective probability that the first video frame contains a respective action class. The localization module is coupled to the classification module for receiving the set of action scores from the classification module and generating an action-agnostic start probability that the first video frame contains an action start.
    Type: Grant
    Filed: April 25, 2019
    Date of Patent: January 25, 2022
    Assignee: salesforce.com, inc.
    Inventors: Mingfei Gao, Richard Socher, Caiming Xiong
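    Illustrative sketch (not from the patent): combining per-frame class scores with the action-agnostic start probability. The rule that fires a start when the most likely class changes and the start probability clears a threshold is an invented simplification.

      import numpy as np

      def detect_starts(action_scores, start_probs, p_min=0.5):
          # Emit (frame, class) starts online, frame by frame.
          starts, prev = [], None
          for t, (scores, p) in enumerate(zip(action_scores, start_probs)):
              cls = int(np.argmax(scores))
              if cls != prev and p > p_min and cls != 0:  # class 0 = background
                  starts.append((t, cls))
              prev = cls
          return starts

      scores = np.array([[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]])  # 3 frames, 2 classes
      print(detect_starts(scores, [0.1, 0.7, 0.4]))            # -> [(1, 1)]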
  • Patent number: 11227218
    Abstract: A natural language processing system that includes a sentence selector and a question answering module. The sentence selector receives a question and sentences that are associated with a context. For the question and each sentence, the sentence selector determines a score that represents whether the question is answerable with that sentence. From these scores, the sentence selector generates a minimum set of sentences. The question answering module generates an answer for the question from the minimum set of sentences.
    Type: Grant
    Filed: May 15, 2018
    Date of Patent: January 18, 2022
    Assignee: salesforce.com, inc.
    Inventors: Sewon Min, Victor Zhong, Caiming Xiong, Richard Socher
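    Illustrative sketch (not from the patent): picking a minimum set of sentences by answerability score before running the question answering module. The threshold and the single-sentence fallback are invented details.

      import numpy as np

      def minimal_sentence_set(scores, sentences, threshold=0.5, min_k=1):
          # Keep sentences whose answerability score clears the threshold,
          # falling back to the single best sentence.
          order = np.argsort(scores)[::-1]
          picked = [i for i in order if scores[i] >= threshold] or list(order[:min_k])
          return [sentences[i] for i in sorted(picked)]

      sents = ["Paris is in France.", "It rained.", "France is in Europe."]
      print(minimal_sentence_set(np.array([0.9, 0.1, 0.6]), sents))
      # The QA module then reads only these sentences to produce the answer.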
  • Patent number: 11222253
    Abstract: The technology disclosed provides a so-called “joint many-task neural network model” to solve a variety of increasingly complex natural language processing (NLP) tasks using a growing depth of layers in a single end-to-end model. The model is successively trained by considering linguistic hierarchies, directly connecting word representations to all model layers, explicitly using predictions in lower tasks, and applying a so-called “successive regularization” technique to prevent catastrophic forgetting. Three examples of lower level model layers are a part-of-speech (POS) tagging layer, a chunking layer, and a dependency parsing layer. Two examples of higher level model layers are a semantic relatedness layer and a textual entailment layer. The model achieves state-of-the-art results on chunking, dependency parsing, semantic relatedness, and textual entailment.
    Type: Grant
    Filed: January 31, 2017
    Date of Patent: January 11, 2022
    Assignee: salesforce.com, inc.
    Inventors: Kazuma Hashimoto, Caiming Xiong, Richard Socher
  • Publication number: 20210397799
    Abstract: Approaches for the translation of structured text include an embedding module for encoding and embedding source text in a first language, an encoder for encoding output of the embedding module, a decoder for iteratively decoding output of the encoder based on generated tokens in translated text from previous iterations, a beam module for constraining output of the decoder with respect to possible embedded tags to include in the translated text for a current iteration using a beam search, and a layer for selecting a token to be included in the translated text for the current iteration. The translated text is in a second language different from the first language. In some embodiments, the approach further includes scoring and pointer modules for selecting the token based on the output of the beam module or copied from the source text or reference text from a training pair best matching the source text.
    Type: Application
    Filed: August 31, 2021
    Publication date: December 23, 2021
    Inventors: Kazuma Hashimoto, Raffaella Buschiazzo, James Bradbury, Teresa Anna Marshall, Caiming Xiong, Richard Socher
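    Illustrative sketch (not from the patent): the beam-module constraint rendered as logit masking, so the decoder can only emit a tag token that is currently legal (here, the closer of the innermost open tag). The bookkeeping is an invented simplification of the described beam search.

      import numpy as np

      def mask_logits_for_tags(logits, open_tags, id_of_close, tag_token_ids):
          # Disallow all tag tokens except the one the tag stack permits;
          # ordinary (non-tag) tokens are left untouched.
          masked = logits.copy()
          allowed = id_of_close[open_tags[-1]] if open_tags else None
          for t in tag_token_ids:
              if t != allowed:
                  masked[t] = -np.inf
          return masked

      vocab_logits = np.zeros(10)
      # Token ids 7, 8, 9 are tags; tag "b" was opened last, its closer is id 8.
      print(mask_logits_for_tags(vocab_logits, ["b"], {"b": 8, "i": 9}, [7, 8, 9]))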
  • Publication number: 20210389736
    Abstract: A method for training parameters of a first domain adaptation model. The method includes evaluating a cycle consistency objective using a first task specific model associated with a first domain and a second task specific model associated with a second domain, and evaluating one or more first discriminator models to generate a first discriminator objective using the second task specific model. The one or more first discriminator models include a plurality of discriminators corresponding to a plurality of bands that correspond to domain variable ranges of the first and second domains, respectively. The method further includes updating, based on the cycle consistency objective and the first discriminator objective, one or more parameters of the first domain adaptation model for adapting representations from the first domain to the second domain.
    Type: Application
    Filed: August 30, 2021
    Publication date: December 16, 2021
    Inventors: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher
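    Illustrative sketch (not from the patent): routing samples to one discriminator per band of the domain variable, as the abstract describes. The band edges and stub discriminators are invented, and the cycle-consistency term is omitted.

      import numpy as np

      def band_index(value, edges):
          # Assign a sample to a band of the domain variable (e.g., ranges
          # of pitch or length); the edges here are an invented example.
          return int(np.searchsorted(edges, value))

      def discriminator_objective(samples, domain_values, discriminators, edges):
          # One discriminator per band; each returns a real/fake loss for
          # the samples routed to it (stub interface).
          loss = 0.0
          for x, v in zip(samples, domain_values):
              loss += discriminators[band_index(v, edges)](x)
          return loss

      stubs = [lambda x: 0.1, lambda x: 0.2, lambda x: 0.3]   # 3 bands
      print(discriminator_objective(np.random.randn(4, 8),
                                    [0.1, 0.5, 0.9, 0.5],
                                    stubs, edges=[0.33, 0.66]))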
  • Publication number: 20210383212
    Abstract: Embodiments described herein provide safe policy improvement (SPI) in a batch reinforcement learning framework for a task-oriented dialogue. Specifically, a batch reinforcement learning framework for dialogue policy learning is provided, which improves the performance of the dialogue and learns to shape a reward that reasons about the intention behind the human response rather than just imitating the human demonstration.
    Type: Application
    Filed: November 25, 2020
    Publication date: December 9, 2021
    Inventors: Govardana Sachithanandam Ramachandran, Kazuma Hashimoto, Caiming Xiong, Richard Socher
  • Publication number: 20210375269
    Abstract: Embodiments described herein utilize pre-trained masked language models as the backbone for dialogue act tagging and provide cross-domain generalization of the resulting dialogue act taggers. For example, the pre-trained MASK token of a BERT model may be used as a controllable mechanism for augmenting text input, e.g., generating tags for an input of unlabeled dialogue history. The tagger can be trained with semi-supervised learning, e.g., using multiple objectives from a supervised tagging loss, a masked tagging loss, a masked language model loss, and/or a disagreement loss.
    Type: Application
    Filed: August 21, 2020
    Publication date: December 2, 2021
    Inventors: Semih Yavuz, Kazuma Hashimoto, Wenhao Liu, Nitish Shirish Keskar, Richard Socher, Caiming Xiong
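    Illustrative sketch (not from the patent): BERT-style token masking as the augmentation mechanism, with a supervised tagging loss on labeled data and a disagreement penalty between tags predicted on the original and masked views of unlabeled data. The 15% rate and the stub tagger are assumptions.

      import random

      MASK = "[MASK]"

      def mask_utterance(tokens, rate=0.15, rng=random):
          # BERT-style random masking to create the augmented view.
          return [MASK if rng.random() < rate else t for t in tokens]

      def semi_supervised_loss(tagger, tokens, gold_tags=None):
          # Labeled data: supervised tagging loss on both views.
          # Unlabeled data: penalize disagreement between the two views.
          masked = mask_utterance(tokens)
          if gold_tags is not None:
              return (tagger.tag_loss(tokens, gold_tags)
                      + tagger.tag_loss(masked, gold_tags))
          return tagger.disagreement(tagger.tags(tokens), tagger.tags(masked))

      class StubTagger:                    # stand-in for the BERT tagger
          def tags(self, toks): return ["inform"] * len(toks)
          def tag_loss(self, toks, gold):
              return float(sum(t != g for t, g in zip(self.tags(toks), gold)))
          def disagreement(self, a, b):
              return float(sum(x != y for x, y in zip(a, b)))

      print(semi_supervised_loss(StubTagger(), "can i get a quote".split(),
                                 gold_tags=["request"] * 5))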
  • Publication number: 20210374353
    Abstract: An online system allows user interactions using natural language expressions. The online system uses a machine learning based model to infer an intent represented by a user expression. The machine learning based model takes as input a user expression and an example expression to compute a score indicating whether the user expression matches the example expression. Based on the scores, an intent inference module determines the most applicable intent for the expression. The online system determines a confidence threshold such that user expressions indicating a high confidence are assigned the most applicable intent and user expressions indicating a low confidence are assigned an out-of-scope intent. The online system encodes the example expressions using the machine learning based model. The online system may compare an encoded user expression with encoded example expressions to identify a subset of example expressions used to determine the most applicable intent.
    Type: Application
    Filed: August 28, 2020
    Publication date: December 2, 2021
    Inventors: Jianguo Zhang, Kazuma Hashimoto, Chien-Sheng Wu, Wenhao Liu, Richard Socher, Caiming Xiong
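    Illustrative sketch (not from the patent): matching the encoded user expression against encoded example expressions and falling back to an out-of-scope intent below a confidence threshold. Cosine scoring and the threshold value are invented.

      import numpy as np

      def infer_intent(user_vec, example_vecs, example_intents, conf_threshold=0.7):
          # Compare the encoded user expression with encoded examples.
          e = example_vecs / np.linalg.norm(example_vecs, axis=1, keepdims=True)
          u = user_vec / np.linalg.norm(user_vec)
          scores = e @ u
          best = int(np.argmax(scores))
          if scores[best] < conf_threshold:         # low confidence
              return "out_of_scope", float(scores[best])
          return example_intents[best], float(scores[best])

      examples = np.array([[1.0, 0.0], [0.0, 1.0]])
      print(infer_intent(np.array([0.9, 0.1]), examples,
                         ["book_flight", "cancel"]))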
  • Publication number: 20210374132
    Abstract: Embodiments are directed to a machine learning recommendation system. The system receives a user query for generating a recommendation for one or more items with an explanation associated with recommending the one or more items. The system obtains first features of at least one user and second features of a set of items. The system provides the first features and the second features to a first machine learning network for determining a predicted score for an item. The system provides a portion of the first features and a portion of the second features to second machine learning networks for determining explainability scores for an item and generating corresponding explanation narratives. The system provides the recommendation for one or more items and corresponding explanation narratives based on ranking predicted scores and explainability scores for the items.
    Type: Application
    Filed: November 10, 2020
    Publication date: December 2, 2021
    Inventors: Wenzhuo Yang, Jia Li, Chenxi Li, Latrice Barnett, Markus Anderle, Simo Arajarvi, Harshavardhan Utharavalli, Caiming Xiong, Richard Socher, Chu Hong Hoi
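    Illustrative sketch (not from the patent): ranking items by the main network's predicted score and attaching the narrative of the highest-scoring explainer network. The interfaces and the combination rule are invented.

      import numpy as np

      def recommend(user_feats, item_feats, predictor, explainers, top_n=3):
          # predictor: main score network; explainers: (network, narrative)
          # pairs, one per explanation type (stub interfaces).
          results = []
          for item in item_feats:
              pred = predictor(user_feats, item)
              ex_scores = [(ex(user_feats, item), text) for ex, text in explainers]
              best_ex = max(ex_scores)              # highest explainability
              results.append((pred, best_ex[1], item))
          results.sort(reverse=True, key=lambda r: r[0])
          return [(r[2], r[1]) for r in results[:top_n]]

      pred = lambda u, i: float(u @ i)              # stub score network
      explainers = [(lambda u, i: float(i[0]), "popular with similar users"),
                    (lambda u, i: float(i[1]), "matches your recent activity")]
      items = [np.array([0.9, 0.1]), np.array([0.2, 0.8])]
      print(recommend(np.array([1.0, 1.0]), items, pred, explainers, top_n=2))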
  • Publication number: 20210365740
    Abstract: Approaches to zero-shot learning include partitioning training data into first and second sets according to classes assigned to the training data, training a prediction module based on the first set to predict a cluster center based on a class label, training a correction module based on the second set and each of the class labels in the first set to generate a correction to a cluster center predicted by the prediction module, presenting a new class label for a new class to the prediction module to predict a new cluster center, presenting the new class label, the predicted new cluster center, and each of the class labels in the first set to the correction module to generate a correction for the predicted new cluster center, augmenting a classifier based on the corrected cluster center for the new class, and classifying input data into the new class using the classifier.
    Type: Application
    Filed: August 9, 2021
    Publication date: November 25, 2021
    Inventors: Lily Hu, Caiming Xiong, Richard Socher
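    Illustrative sketch (not from the patent): the predict-then-correct pipeline for a new class's cluster center, followed by nearest-centroid classification. Both modules are stubbed; in the patent they are trained on disjoint splits of the labeled classes.

      import numpy as np

      def classify_zero_shot(x, centers, labels):
          # Nearest-centroid classification once the corrected center for
          # the new class has been added to the classifier.
          d = np.linalg.norm(np.stack(centers) - x, axis=1)
          return labels[int(np.argmin(d))]

      # Stub prediction and correction modules:
      predict_center = lambda label_emb: label_emb * 2.0         # coarse guess
      correct_center = lambda pred, seen_label_embs: pred - seen_label_embs.mean(0)

      seen = np.array([[1.0, 0.0], [0.0, 1.0]])                  # first-split labels
      new_center = correct_center(predict_center(np.array([2.0, 2.0])), seen)
      print(classify_zero_shot(np.array([3.4, 3.6]),
                               [seen[0], seen[1], new_center],
                               ["cat", "dog", "zebra"]))         # -> zebra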
  • Patent number: 11170287
    Abstract: A computer-implemented method for dual sequence inference using a neural network model includes generating a codependent representation based on a first input representation of a first sequence and a second input representation of a second sequence using an encoder of the neural network model and generating an inference based on the codependent representation using a decoder of the neural network model. The neural network model includes a plurality of model parameters learned according to a machine learning process. The encoder includes a plurality of coattention layers arranged sequentially, each coattention layer being configured to receive a pair of layer input representations and generate one or more summary representations, and an output layer configured to receive the one or more summary representations from a last layer among the plurality of coattention layers and generate the codependent representation.
    Type: Grant
    Filed: January 26, 2018
    Date of Patent: November 9, 2021
    Assignee: salesforce.com, inc.
    Inventors: Victor Zhong, Caiming Xiong, Richard Socher
  • Publication number: 20210279551
    Abstract: The technology disclosed provides a so-called “joint many-task neural network model” to solve a variety of increasingly complex natural language processing (NLP) tasks using a growing depth of layers in a single end-to-end model. The model is successively trained by considering linguistic hierarchies, directly connecting word representations to all model layers, explicitly using predictions in lower tasks, and applying a so-called “successive regularization” technique to prevent catastrophic forgetting. Three examples of lower level model layers are a part-of-speech (POS) tagging layer, a chunking layer, and a dependency parsing layer. Two examples of higher level model layers are a semantic relatedness layer and a textual entailment layer. The model achieves state-of-the-art results on chunking, dependency parsing, semantic relatedness, and textual entailment.
    Type: Application
    Filed: May 26, 2021
    Publication date: September 9, 2021
    Inventors: Kazuma Hashimoto, Caiming Xiong, Richard Socher