Patents by Inventor Richard Socher

Richard Socher has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Global-to-Local Memory Pointer Networks for Task-Oriented Dialogue

Publication number: 20200105272

Abstract: A system and corresponding method are provided for generating responses for a dialogue between a user and a computer. The system includes a memory storing information for a dialogue history and a knowledge base. An encoder may receive a new utterance from the user and generate a global memory pointer used for filtering the knowledge base information in the memory. A decoder may generate at least one local memory pointer and a sketch response for the new utterance. The sketch response includes at least one sketch tag to be replaced by knowledge base information from the memory. The system generates the dialogue computer response using the local memory pointer to select a word from the filtered knowledge base information to replace the at least one sketch tag in the sketch response.

Type: Application

Filed: October 30, 2018

Publication date: April 2, 2020

Inventors: Chien-Sheng Wu, Caiming Xiong, Richard Socher

Interpretable counting in visual question answering

Patent number: 10592767

Abstract: Approaches for interpretable counting for visual question answering include a digital image processor, a language processor, a scorer, and a counter. The digital image processor identifies objects in an image, maps the identified objects into an embedding space, generates bounding boxes for each of the identified objects, and outputs the embedded objects paired with their bounding boxes. The language processor embeds a question into the embedding space. The scorer determines scores for the identified objects. Each respective score indicates how well a corresponding one of the identified objects is responsive to the question. The counter determines a count of the objects in the digital image that are responsive to the question based on the scores. The count and a corresponding bounding box for each object included in the count are output. In some embodiments, the counter determines the count interactively based on interactions between counted and uncounted objects.

Type: Grant

Filed: January 29, 2018

Date of Patent: March 17, 2020

Assignee: salesforce.com, inc.

Inventors: Alexander Richard Trott, Caiming Xiong, Richard Socher

Dense Video Captioning

Publication number: 20200084465

Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is backpropagated to the decoder stack, the encoder stack, and the proposal decoder.

Type: Application

Filed: November 18, 2019

Publication date: March 12, 2020

Inventors: Yingbo Zhou, Luowei Zhou, Caiming Xiong, Richard Socher

POINTER SENTINEL MIXTURE ARCHITECTURE

Publication number: 20200065651

Abstract: The technology disclosed provides a so-called “pointer sentinel mixture architecture” for neural network sequence models that has the ability to either reproduce a token from a recent context or produce a token from a predefined vocabulary. In one implementation, a pointer sentinel-LSTM architecture achieves state-of-the-art language modeling performance of 70.9 perplexity on the Penn Treebank dataset, while using far fewer parameters than a standard softmax LSTM.

Type: Application

Filed: October 25, 2019

Publication date: February 27, 2020

Inventors: Stephen Joseph Merity, Caiming Xiong, James Bradbury, Richard Socher
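
The choice described in the abstract above, between reproducing a token from the recent context and producing one from a fixed vocabulary, can be illustrated with a small mixture sketch. This is not taken from the patent text: the function names, the joint softmax over pointer scores plus a single sentinel score, and the toy inputs are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_sentinel_mixture(vocab, vocab_logits, context, pointer_scores, sentinel_score):
    """Mix a vocabulary softmax with a pointer distribution over context tokens.

    Assumption: the sentinel competes with the pointer scores in one softmax,
    and the probability mass it receives is the weight given to the vocabulary.
    """
    p_vocab = softmax(vocab_logits)
    joint = softmax(list(pointer_scores) + [sentinel_score])
    gate = joint[-1]  # share of probability handed to the vocabulary model

    mixed = {word: gate * p for word, p in zip(vocab, p_vocab)}
    # Pointer mass is added per context position, so repeated tokens accumulate.
    for token, p in zip(context, joint[:-1]):
        mixed[token] = mixed.get(token, 0.0) + p
    return mixed


# Hypothetical toy inputs: "zebra" is absent from the vocabulary but can
# still be produced because it appears in the recent context.
vocab = ["the", "cat", "<unk>"]
mixed = pointer_sentinel_mixture(
    vocab, [1.0, 0.5, 0.0], ["zebra", "the"], [2.0, 0.1], sentinel_score=1.0
)
print(max(mixed, key=mixed.get))
```

Because the pointer and sentinel share one normalization, the mixed probabilities sum to one without a separately computed gate.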

SPATIAL ATTENTION MODEL FOR IMAGE CAPTIONING

Publication number: 20200057805

Abstract: The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.

Type: Application

Filed: October 23, 2019

Publication date: February 20, 2020

Inventors: Jiasen Lu, Caiming Xiong, Richard Socher

Sentinel gate for modulating auxiliary information in a long short-term memory (LSTM) neural network

Patent number: 10565306

Abstract: The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.

Type: Grant

Filed: November 18, 2017

Date of Patent: February 18, 2020

Assignee: salesforce.com, inc.

Inventors: Jiasen Lu, Caiming Xiong, Richard Socher

Pointer sentinel mixture architecture

Patent number: 10565493

Abstract: The technology disclosed provides a so-called “pointer sentinel mixture architecture” for neural network sequence models that has the ability to either reproduce a token from a recent context or produce a token from a predefined vocabulary. In one implementation, a pointer sentinel-LSTM architecture achieves state-of-the-art language modeling performance of 70.9 perplexity on the Penn Treebank dataset, while using far fewer parameters than a standard softmax LSTM.

Type: Grant

Filed: January 31, 2017

Date of Patent: February 18, 2020

Assignee: salesforce.com, inc.

Inventors: Stephen Joseph Merity, Caiming Xiong, James Bradbury, Richard Socher

Adaptive attention model for image captioning

Patent number: 10565305

Abstract: The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.

Type: Grant

Filed: November 17, 2017

Date of Patent: February 18, 2020

Assignee: salesforce.com, inc.

Inventors: Jiasen Lu, Caiming Xiong, Richard Socher

Spatial attention model for image captioning

Patent number: 10558750

Abstract: The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short term visual and linguistic information.

Type: Grant

Filed: November 17, 2017

Date of Patent: February 11, 2020

Assignee: salesforce.com, inc.

Inventors: Jiasen Lu, Caiming Xiong, Richard Socher

Dense video captioning

Patent number: 10542270

Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is backpropagated to the decoder stack, the encoder stack, and the proposal decoder.

Type: Grant

Filed: January 18, 2018

Date of Patent: January 21, 2020

Assignee: salesforce.com, inc.

Inventors: Yingbo Zhou, Luowei Zhou, Caiming Xiong, Richard Socher

MULTI-HOP KNOWLEDGE GRAPH REASONING WITH REWARD SHAPING

Publication number: 20190362246

Abstract: Approaches for multi-hop knowledge graph reasoning with reward shaping include a system and method of training a system to search relational paths in a knowledge graph. The method includes identifying, using a reasoning module, a plurality of first outgoing links from a current node in a knowledge graph, masking, using the reasoning module, one or more links from the plurality of first outgoing links to form a plurality of second outgoing links, rewarding the reasoning module with a reward of one when a node corresponding to an observed answer is reached, and rewarding the reasoning module with a reward identified by a reward shaping network when a node not corresponding to an observed answer is reached. In some embodiments, the reward shaping network is pre-trained.

Type: Application

Filed: July 31, 2018

Publication date: November 28, 2019

Inventors: Xi Victoria Lin, Caiming Xiong, Richard Socher
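
The two steps named in the abstract above, masking outgoing links and choosing between a reward of one and a shaped reward, can be sketched as follows. The helper names, the random drop rule, the fallback that keeps at least one link, and the `shaping_score` callable standing in for the pre-trained reward shaping network are all hypothetical and are not drawn from the application.

```python
import random


def mask_links(links, drop_prob, rng):
    """Randomly mask outgoing links, keeping at least one (an assumption)."""
    if not links:
        return []
    kept = [link for link in links if rng.random() >= drop_prob]
    return kept or [rng.choice(links)]


def shaped_reward(reached_node, observed_answers, shaping_score):
    """Reward of one for an observed answer, otherwise the shaping estimate."""
    if reached_node in observed_answers:
        return 1.0
    return shaping_score(reached_node)


# Hypothetical toy graph fragment and a stand-in scoring function.
rng = random.Random(0)
links = [("born_in", "Berlin"), ("works_at", "Acme"), ("lives_in", "Paris")]
remaining = mask_links(links, drop_prob=0.3, rng=rng)


def toy_scores(node):
    """Stand-in for a pre-trained reward shaping network (hypothetical values)."""
    return {"Paris": 0.4}.get(node, 0.0)


print(shaped_reward("Berlin", {"Berlin"}, toy_scores))
print(shaped_reward("Paris", {"Berlin"}, toy_scores))
```

Under this reading, the shaped reward gives partial credit to plausible but unobserved answers instead of treating every miss as zero.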

Multitask Learning As Question Answering

Publication number: 20190355270

Abstract: Approaches for natural language processing include a multi-layer encoder for encoding words from a context and words from a question in parallel, a multi-layer decoder for decoding the encoded context and the encoded question, a pointer generator for generating distributions over the words from the context, the words from the question, and words in a vocabulary based on an output from the decoder, and a switch. The switch generates a weighting of the distributions over the words from the context, the words from the question, and the words in the vocabulary, generates a composite distribution based on the weighting of the distribution over the first words from the context, the distribution over the second words from the question, and the distribution over the words in the vocabulary, and selects words for inclusion in an answer using the composite distribution.

Type: Application

Filed: June 12, 2018

Publication date: November 21, 2019

Applicant: salesforce.com, inc.

Inventors: Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher
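
The switch described in the abstract above weights three distributions (over context words, question words, and vocabulary words) and selects answer words from the weighted combination. A minimal sketch of that combination follows; the dictionary representation, the function names, and the example probabilities and weights are assumptions for illustration only.

```python
def composite_distribution(context_dist, question_dist, vocab_dist, weights):
    """Combine three word distributions using switch weights.

    `weights` is a (context, question, vocabulary) triple assumed to sum to 1.
    """
    composite = {}
    for weight, dist in zip(weights, (context_dist, question_dist, vocab_dist)):
        for word, prob in dist.items():
            composite[word] = composite.get(word, 0.0) + weight * prob
    return composite


def select_word(composite):
    """Pick the most probable word from the composite distribution."""
    return max(composite, key=composite.get)


# Hypothetical example values.
context_dist = {"Paris": 0.9, "is": 0.1}
question_dist = {"capital": 0.5, "is": 0.5}
vocab_dist = {"yes": 0.6, "no": 0.4}
composite = composite_distribution(
    context_dist, question_dist, vocab_dist, weights=(0.7, 0.1, 0.2)
)
print(select_word(composite))  # prints "Paris"
```

A word that appears in more than one source, such as "is" here, accumulates probability from each source.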

UNSUPERVISED NON-PARALLEL SPEECH DOMAIN ADAPTATION USING A MULTI-DISCRIMINATOR ADVERSARIAL NETWORK

Publication number: 20190295530

Abstract: A system for domain adaptation includes a domain adaptation model configured to adapt a representation of a signal in a first domain to a second domain to generate an adapted representation and a plurality of discriminators corresponding to a plurality of bands of values of a domain variable. Each of the plurality of discriminators is configured to discriminate between the adapted representation and representations of one or more other signals in the second domain.

Type: Application

Filed: July 3, 2018

Publication date: September 26, 2019

Applicant: salesforce.com, inc.

Inventors: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher

SYSTEMS AND METHODS FOR LEARNING FOR DOMAIN ADAPTATION

Publication number: 20190286073

Abstract: A method for training parameters of a first domain adaptation model includes evaluating a cycle consistency objective using a first task specific model associated with a first domain and a second task specific model associated with a second domain. The evaluation of the cycle consistency objective is based on one or more first training representations adapted from the first domain to the second domain by a first domain adaptation model and from the second domain to the first domain by a second domain adaptation model, and one or more second training representations adapted from the second domain to the first domain by the second domain adaptation model and from the first domain to the second domain by the first domain adaptation model. The method further includes evaluating a learning objective based on the cycle consistency objective, and updating parameters of the first domain adaptation model based on the learning objective.

Type: Application

Filed: August 3, 2018

Publication date: September 19, 2019

Inventors: Ehsan Hosseini-Asl, Caiming Xiong, Yingbo Zhou, Richard Socher

Question Answering From Minimal Context Over Documents

Publication number: 20190258939

Abstract: A natural language processing system includes a sentence selector and a question answering module. The sentence selector receives a question and sentences that are associated with a context. For the question and each sentence, the sentence selector determines a score. Each score represents whether the question is answerable with the sentence. The sentence selector then generates a minimum set of sentences from the scores associated with the question and sentences. The question answering module generates an answer for the question from the minimum set of sentences.

Type: Application

Filed: May 15, 2018

Publication date: August 22, 2019

Inventors: Sewon Min, Victor Zhong, Caiming Xiong, Richard Socher
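
The abstract above says the sentence selector turns per-sentence scores into a minimum set of sentences but does not state the selection rule. The sketch below assumes one plausible rule: keep every sentence whose score meets a threshold and fall back to the single best sentence when none does. The function name, the threshold value, and the example scores are hypothetical.

```python
def select_minimal_sentences(sentences, scores, threshold=0.5):
    """Return the smallest set of sentences judged sufficient to answer.

    Assumed rule: keep sentences scoring at least `threshold`, in document
    order; if none qualifies, keep only the top-scoring sentence.
    """
    if not sentences:
        return []
    chosen = [i for i, score in enumerate(scores) if score >= threshold]
    if not chosen:
        chosen = [max(range(len(scores)), key=scores.__getitem__)]
    return [sentences[i] for i in chosen]


# Hypothetical context and scores for the question "Where was she born?"
sentences = [
    "She studied physics.",
    "She was born in Warsaw.",
    "She later moved to Paris.",
]
print(select_minimal_sentences(sentences, [0.1, 0.8, 0.3]))
```

The question answering module would then read only the returned sentences rather than the full context.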

Multitask Learning As Question Answering

Publication number: 20190251168

Abstract: Approaches for multitask learning as question answering include an input layer for encoding a context and a question, a self-attention based transformer including an encoder and a decoder, a first bi-directional long short-term memory (biLSTM) for further encoding an output of the encoder, a long short-term memory (LSTM) for generating a context-adjusted hidden state from the output of the decoder and a hidden state, an attention network for generating first attention weights based on an output of the first biLSTM and an output of the LSTM, a vocabulary layer for generating a distribution over a vocabulary, a context layer for generating a distribution over the context, and a switch for generating a weighting between the distributions over the vocabulary and the context, generating a composite distribution based on the weighting, and selecting a word of an answer using the composite distribution.

Type: Application

Filed: May 8, 2018

Publication date: August 15, 2019

Inventors: Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher

Multitask Learning As Question Answering

Publication number: 20190251431

Abstract: Approaches for multitask learning as question answering include a method for training that includes receiving a plurality of training samples including training samples from a plurality of task types, presenting the training samples to a neural model to generate an answer, determining an error between the generated answer and the natural language ground truth answer for each training sample presented, and adjusting parameters of the neural model based on the error. Each of the training samples includes a natural language context, question, and ground truth answer. An order in which the training samples are presented to the neural model includes initially selecting the training samples according to a first training strategy and switching to selecting the training samples according to a second training strategy. In some embodiments the first training strategy is a sequential training strategy and the second training strategy is a joint training strategy.

Type: Application

Filed: May 8, 2018

Publication date: August 15, 2019

Inventors: Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher

Three-Dimensional (3D) Convolution With 3D Batch Normalization

Publication number: 20190213482

Abstract: A method of classifying three-dimensional (3D) data includes receiving three-dimensional (3D) data and processing the 3D data using a neural network that includes a plurality of subnetworks arranged in a sequence and the data is processed through each of the subnetworks. Each of the subnetworks is configured to receive an output generated by a preceding subnetwork in the sequence, process the output through a plurality of parallel 3D convolution layer paths of varying convolution volume, process the output through a parallel pooling path, and concatenate output of the 3D convolution layer paths and the pooling path to generate an output representation from each of the subnetworks. Following processing the data through the subnetworks, the method includes processing the output of a last one of the subnetworks in the sequence through a vertical pooling layer to generate an output and classifying the received 3D data based upon the generated output.

Type: Application

Filed: March 15, 2019

Publication date: July 11, 2019

Inventors: Richard Socher, Caiming Xiong, Kai Sheng Tai

HYBRID TRAINING OF DEEP NETWORKS

Publication number: 20190188568

Abstract: Hybrid training of deep networks includes a multi-layer neural network. The training includes setting a current learning algorithm for the multi-layer neural network to a first learning algorithm. The training further includes iteratively applying training data to the neural network, determining a gradient for parameters of the neural network based on the applying of the training data, updating the parameters based on the current learning algorithm, and determining whether the current learning algorithm should be switched to a second learning algorithm based on the updating. The training further includes, in response to the determining that the current learning algorithm should be switched to a second learning algorithm, changing the current learning algorithm to the second learning algorithm and initializing a learning rate of the second learning algorithm based on the gradient and a step used by the first learning algorithm to update the parameters of the neural network.

Type: Application

Filed: March 20, 2018

Publication date: June 20, 2019

Inventors: Nitish Shirish Keskar, Richard Socher
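
The abstract above states that the second learning algorithm's learning rate is initialized from the gradient and the step taken by the first algorithm, without giving a formula. The sketch below assumes one possible choice: pick the rate so that a plain gradient step, projected onto the first algorithm's step, reproduces that step, and switch once a running average of this rate stabilizes. The function names, the averaging constant, and the tolerance are illustrative assumptions, not details from the application.

```python
def estimate_second_lr(step, grad):
    """Rate gamma such that -gamma * grad, projected onto `step`, equals `step`.

    Returns None when the step does not point downhill (step . grad >= 0).
    """
    denom = -sum(p * g for p, g in zip(step, grad))
    if denom <= 0:
        return None
    return sum(p * p for p in step) / denom


def update_switch_state(avg, gamma, t, beta=0.9, tol=1e-2):
    """Track a bias-corrected running average of gamma.

    Returns (new_avg, corrected_avg, should_switch). The switch fires when
    the corrected average agrees with the latest estimate (assumed criterion).
    """
    avg = beta * avg + (1.0 - beta) * gamma
    corrected = avg / (1.0 - beta ** t)
    return avg, corrected, t > 1 and abs(corrected - gamma) < tol


# Hypothetical step and gradient from the first learning algorithm.
gamma = estimate_second_lr(step=[-0.1, 0.0], grad=[2.0, 0.0])
avg = 0.0
for t in (1, 2):
    avg, lr, switch = update_switch_state(avg, gamma, t)
print(switch)
```

When the switch fires, `lr` would serve as the initial learning rate of the second algorithm.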

Dense Video Captioning

Publication number: 20190149834

Abstract: Systems and methods for dense captioning of a video include a multi-layer encoder stack configured to receive information extracted from a plurality of video frames, a proposal decoder coupled to the encoder stack and configured to receive one or more outputs from the encoder stack, a masking unit configured to mask the one or more outputs from the encoder stack according to one or more outputs from the proposal decoder, and a decoder stack coupled to the masking unit and configured to receive the masked one or more outputs from the encoder stack. The dense captioning is generated based on one or more outputs of the decoder stack. In some embodiments, the one or more outputs from the proposal decoder include a differentiable mask. In some embodiments, during training, error in the dense captioning is backpropagated to the decoder stack, the encoder stack, and the proposal decoder.

Type: Application

Filed: January 18, 2018

Publication date: May 16, 2019

Inventors: Yingbo Zhou, Luowei Zhou, Caiming Xiong, Richard Socher
