Patents by Inventor Hanjun Dai
Hanjun Dai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250252309
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes obtaining data specifying a trained neural network that includes a plurality of layers that include a particular layer; generating an adapted neural network, comprising generating, for the particular layer, an approximation of an adapter parameter matrix that includes fewer parameters than the adapter parameter matrix; and training the adapted neural network on a machine learning task, wherein the adapting comprises learning fine-tuned values of parameters of the approximation using training data while holding the trained values in the base parameter matrix fixed.
Type: Application
Filed: January 28, 2025
Publication date: August 7, 2025
Inventors: Hanjun Dai, Bo Dai, Mengjiao Yang, Azade Nova, Dale Eric Schuurmans, Sanjiv Kumar, Yixin Wang, Yuan Xue
-
Publication number: 20250045577
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing stochastic optimization using machine learning.
Type: Application
Filed: October 5, 2021
Publication date: February 6, 2025
Inventors: Bo Dai, Hanjun Dai, Yuan Xue, Zia Syed, Dale Eric Schuurmans
-
Publication number: 20240394545
Abstract: Aspects of the disclosure are directed to methods, systems, and computer readable media for universal self-adaptive prompting (USP), which includes an automatic prompt design approach specifically tailored for zero-shot learning, though still compatible with few-shot learning. To achieve universal prompting, USP categorizes a natural language processing (NLP) task into one of a plurality of possible task types and then uses a corresponding selector to select the most suitable queries and zero-shot model-generated responses as pseudo-demonstrations, thereby generalizing in-context learning to the zero-shot setup in a fully automated manner.
Type: Application
Filed: October 6, 2023
Publication date: November 28, 2024
Inventors: Julian Martin Eisenschlos, Xingchen Wan, Hootan Nakhost, Sercan Omer Arik, Ruoxi Sun, Hanjun Dai
-
Publication number: 20240362212
Abstract: Aspects of the disclosure are directed to methods, systems, and non-transitory computer readable media for automatically generating queries on a database from natural language text using in-context learning to leverage zero-shot and few-shot adaptation capabilities of large language models (LLMs). The methods, systems, and non-transitory computer readable media can consider database information, employ execution-based consistency decoding, and employ a mixture of prompts and/or LLMs.
Type: Application
Filed: July 24, 2023
Publication date: October 31, 2024
Inventors: Ruoxi Sun, Sercan Omer Arik, Rajarishi Sinha, Hootan Nakhost, Hanjun Dai, Pengcheng Yin
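The abstract above mentions execution-based consistency decoding without giving details. As a minimal sketch of one plausible reading (not the claimed method), several candidate SQL queries can be executed and the query whose result agrees with the most other candidates is kept. The function name `pick_by_execution_consistency`, the `employees` table, and the candidate queries are all hypothetical and exist only for illustration.

```python
import sqlite3
from collections import Counter


def pick_by_execution_consistency(conn, candidate_queries):
    """Return the candidate whose execution result is shared by the most candidates."""
    executed = []
    for sql in candidate_queries:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue  # candidates that fail to execute get no vote
        # Sort rows so that row order alone does not split equal results.
        executed.append((sql, tuple(sorted(rows, key=repr))))
    if not executed:
        return None
    votes = Counter(result for _, result in executed)
    winning_result = votes.most_common(1)[0][0]
    # Return the first candidate (in input order) that produced the winning result.
    return next(sql for sql, result in executed if result == winning_result)


# Hypothetical toy database and LLM-generated candidates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ada", "eng", 120), ("Bob", "eng", 100), ("Cy", "ops", 90)],
)

candidates = [
    "SELECT COUNT(*) FROM employees WHERE dept = 'eng'",
    "SELECT COUNT(name) FROM employees WHERE dept = 'eng'",
    "SELECT COUNT(*) FROM employees",
    "SELECT COUNT(*) FROM staff",  # fails: no such table
]
best = pick_by_execution_consistency(conn, candidates)
```

Here the first two candidates return the same count, so the first of them is selected; the failing query is ignored rather than treated as a vote.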
-
Publication number: 20240289619
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on a network input to generate a network output. One of the methods includes: obtaining data specifying an initial neural network configured to perform a machine learning task; determining a representativeness measure for each of a plurality of filters; determining a central tendency measure for the plurality of filters based on processing a batch of network inputs using the initial neural network; determining a cumulative importance score for each of the plurality of filters; selecting a proper subset of the plurality of filters; and generating a pruned neural network configured to perform the machine learning task.
Type: Application
Filed: January 26, 2024
Publication date: August 29, 2024
Inventors: Azade Nova, Hanjun Dai, Dale Eric Schuurmans
-
Publication number: 20240249080
Abstract: Aspects of the disclosure are directed to automatically selecting examples in a prompt for an LLM to demonstrate how to perform tasks. Aspects of the disclosure can select and build a set of examples from LLM zero-shot outputs via predetermined criteria that can combine consistency, diversity, and repetition. In the zero-shot setting for three different LLMs, using only LLM predictions, aspects of the disclosure can improve performance up to 15% compared to zero-shot baselines and can match or exceed few-shot baselines for a range of reasoning tasks.
Type: Application
Filed: March 30, 2023
Publication date: July 25, 2024
Inventors: Ruoxi Sun, Xingchen Wan, Hanjun Dai, Sercan Omer Arik, Tomas Pfister
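The selection criteria named in the abstract above (consistency, diversity, repetition) are not spelled out in this listing. The sketch below illustrates only a consistency criterion, under the assumption that each question has several sampled zero-shot answers and that agreement among samples is a proxy for reliability. The function `select_pseudo_demos` and the sample data are hypothetical.

```python
from collections import Counter


def select_pseudo_demos(samples_by_question, k):
    """Pick up to k (question, answer) pairs whose sampled answers agree most."""
    scored = []
    for question, answers in samples_by_question.items():
        if not answers:
            continue
        majority_answer, count = Counter(answers).most_common(1)[0]
        consistency = count / len(answers)  # fraction of samples that agree
        scored.append((consistency, question, majority_answer))
    # Highest consistency first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(q, a) for _, q, a in scored[:k]]


# Hypothetical zero-shot samples: four sampled answers per question.
samples = {
    "2 + 2 = ?": ["4", "4", "4", "4"],
    "Capital of France?": ["Paris", "Paris", "Lyon", "Paris"],
    "17 * 3 = ?": ["51", "41", "54", "51"],
}
demos = select_pseudo_demos(samples, k=2)
```

The selected pairs could then be placed in a prompt as pseudo-demonstrations. A fuller version would also need diversity and repetition terms, which this sketch leaves out.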
-
Patent number: 11960867
Abstract: Using a natural language (NL) latent representation in the automated conversion of source code from a base programming language (e.g., C++) to a target programming language (e.g., Python). A base-to-NL model can be used to generate an NL latent representation by processing a base source code snippet in the base programming language. Further, an NL-to-target model can be used to generate a target source code snippet in the target programming language (that is functionally equivalent to the base source code snippet), by processing the NL latent representation. In some implementations, output(s) from the NL-to-target model indicate canonical representation(s) of variables, and in generating the target source code snippet, technique(s) are used to match those canonical representation(s) to variable(s) of the base source code snippet. In some implementations, multiple candidate target source code snippets are generated, and a subset (e.g., one) is selected based on evaluation(s).
Type: Grant
Filed: May 17, 2023
Date of Patent: April 16, 2024
Assignee: Google LLC
Inventors: Rishabh Singh, Hanjun Dai, Manzil Zaheer, Artem Goncharuk, Karen Davis, David Andre
-
Publication number: 20240112013
Abstract: The present disclosure is directed to generative models for datasets constrained by marginal constraints. One method includes receiving a request to generate a target dataset based on a marginal constraint for a source dataset. A first object occurs at a source frequency in the source dataset. The marginal constraint indicates a target frequency for the first object. The source dataset encodes a set of co-occurrence frequencies for a plurality of object pairs. A source generative model is accessed. The source generative model includes a first module and a second module that are trained on the source dataset. The second module is updated based on the marginal constraint. An adapted generative model is generated that includes the first module and the updated second module. The target dataset is generated based on the adapted generative model. The first object occurs at the target frequency in the target dataset. The target dataset encodes the set of co-occurrence frequencies for the plurality of object pairs.
Type: Application
Filed: September 23, 2022
Publication date: April 4, 2024
Inventors: Hanjun Dai, Bo Dai, Mengjiao Yang, Yuan Xue, Dale Eric Schuurmans
-
Patent number: 11947503
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating data defining a graph. In one aspect, a method comprises: sequentially generating a respective edge set for each node in the graph, wherein for each of a plurality of nodes after a first node, generating the edge set for the node comprises: receiving a context embedding for the node that summarizes a respective edge set for each node that precedes the node; generating, based on the context embedding for the node: (i) a respective edge set for the node, and (ii) a respective embedding of the edge set for the node; generating a context embedding for a next node in the ordering of the nodes using the embedding of the edge set for the node; and adding the set of edges defined by the edge set for the node to the graph.
Type: Grant
Filed: June 17, 2021
Date of Patent: April 2, 2024
Assignee: Google LLC
Inventors: Hanjun Dai, Azade Nazi, Yujia Li, Bo Dai, Dale Eric Schuurmans
-
Publication number: 20230289626
Abstract: Provided are computing systems, methods, and platforms for negative sampling in knowledge graphs with improved efficiency. A knowledge graph comprising entities and links between the entities can be obtained. A query computation graph comprising nodes and edges can be generated based on the knowledge graph. The nodes of the query computation graph can include anchor nodes, a root node, and intermediate nodes positioned in paths between the anchor nodes and the root node. A node cut of a query of the query computation graph can be determined and can include at least one node that cuts at least one path between each anchor node and the root node of the query computation graph. Negative samples can be identified by bidirectionally traversing the query computation graph in a first direction from the anchor nodes to the node cut and in a second direction from the root node to the node cut.
Type: Application
Filed: March 14, 2023
Publication date: September 14, 2023
Inventors: Hanjun Dai, Dale Eric Schuurmans, Xinyun Chen, Dengyong Zhou, Bo Dai, Hongyu Ren
-
Publication number: 20230223112
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing retrosynthesis using a neural network. One of the methods includes generating a prediction of a set of a plurality of predicted reactants that are combinable to generate a target compound, the generating comprising processing, for each of a plurality of candidate sets of reactants, a network input characterizing the candidate set using a neural network, determining, for each candidate set of the plurality of candidate sets, a score using the generated probabilities; and selecting a particular candidate set of one or more reactants using the determined scores.
Type: Application
Filed: June 28, 2021
Publication date: July 13, 2023
Inventors: Ruoxi Sun, Li Li, Bo Dai, Hanjun Dai
-
Patent number: 11693637
Abstract: Using a natural language (NL) latent representation in the automated conversion of source code from a base programming language (e.g., C++) to a target programming language (e.g., Python). A base-to-NL model can be used to generate an NL latent representation by processing a base source code snippet in the base programming language. Further, an NL-to-target model can be used to generate a target source code snippet in the target programming language (that is functionally equivalent to the base source code snippet), by processing the NL latent representation. In some implementations, output(s) from the NL-to-target model indicate canonical representation(s) of variables, and in generating the target source code snippet, technique(s) are used to match those canonical representation(s) to variable(s) of the base source code snippet. In some implementations, multiple candidate target source code snippets are generated, and a subset (e.g., one) is selected based on evaluation(s).
Type: Grant
Filed: May 13, 2021
Date of Patent: July 4, 2023
Assignee: Google LLC
Inventors: Rishabh Singh, Hanjun Dai, Manzil Zaheer, Artem Goncharuk, Karen Davis, David Andre
-
Patent number: 11636347
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining a graph of nodes and edges that represents an interaction history of the agent with the environment; generating an encoded representation of the graph representing the interaction history of the agent with the environment; processing an input based on the encoded representation of the graph using an action selection neural network, in accordance with current values of action selection neural network parameters, to generate an action selection output; and selecting an action from a plurality of possible actions to be performed by the agent using the action selection output generated by the action selection neural network.
Type: Grant
Filed: January 22, 2020
Date of Patent: April 25, 2023
Assignee: DeepMind Technologies Limited
Inventors: Hanjun Dai, Yujia Li, Chenglong Wang, Rishabh Singh, Po-Sen Huang, Pushmeet Kohli
-
Publication number: 20230022151
Abstract: The present disclosure is directed to machine learning model architectures which provide full attention capability in each attention head while maintaining low computation and memory complexity. Specifically, according to one aspect of the present disclosure, example attention models provided herein can treat the self-attention mechanism as a conditional expectation over embeddings at each location and approximate the conditional distribution with a structured factorization. Each location can attend to all other locations, either via direct attention, or through indirect attention to group representations, which are again conditional expectations of embeddings from corresponding local regions.
Type: Application
Filed: July 8, 2022
Publication date: January 26, 2023
Inventors: Hanjun Dai, Bo Dai, Hongyu Ren, Dale Eric Schuurmans, Zihang Dai, Mengjiao Yang
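The view of self-attention as a conditional expectation, mentioned in the abstract above, can be illustrated with ordinary dot-product attention: the softmax weights form a probability distribution over locations, and the output is the expected value embedding under that distribution. The sketch below shows only this baseline view for a single query, with unscaled dot products as a simplifying assumption; it does not implement the structured factorization or group representations the abstract describes. The function name and toy vectors are hypothetical.

```python
import math


def attention_as_expectation(query, keys, values):
    """Return (probs, expected_value) for one query under softmax attention."""
    # Unnormalized log-probabilities: dot product of the query with each key.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    max_score = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - max_score) for s in scores]
    total = sum(weights)
    probs = [w / total for w in weights]  # distribution over locations
    dim = len(values[0])
    # Expectation of the value embeddings under that distribution.
    expected = [sum(p * v[d] for p, v in zip(probs, values)) for d in range(dim)]
    return probs, expected


# Toy example: locations 0 and 2 have identical keys and values.
probs, out = attention_as_expectation(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
    values=[[2.0, 0.0], [0.0, 4.0], [2.0, 0.0]],
)
```

Because every location receives nonzero probability, each query attends to all locations. Computing this directly costs time proportional to the number of locations per query, which is the cost that an approximation of the conditional distribution would aim to reduce.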
-
Publication number: 20220414067
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating data defining a graph. In one aspect, a method comprises: sequentially generating a respective edge set for each node in the graph, wherein for each of a plurality of nodes after a first node, generating the edge set for the node comprises: receiving a context embedding for the node that summarizes a respective edge set for each node that precedes the node; generating, based on the context embedding for the node: (i) a respective edge set for the node, and (ii) a respective embedding of the edge set for the node; generating a context embedding for a next node in the ordering of the nodes using the embedding of the edge set for the node; and adding the set of edges defined by the edge set for the node to the graph.
Type: Application
Filed: June 17, 2021
Publication date: December 29, 2022
Inventors: Hanjun Dai, Azade Nazi, Yujia Li, Bo Dai, Dale Eric Schuurmans
-
Publication number: 20220343152
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generative modelling of exchangeable sets. Methods can include obtaining a dataset of training observations. Each training observation is an exchangeable set that includes a plurality of data points. Each training observation is processed using a first neural network to generate parameters of a first probability distribution based on which a latent variable is sampled. The latent variable is processed using a second neural network to generate a new observation that includes a plurality of data points. The training observation and the new observation are processed using an energy neural network to generate an estimate of an energy of the training observation and the new observation. The energy neural network is then trained to optimize an objective function that measures the difference between the estimate of the energy of the training observation and the new observation.
Type: Application
Filed: April 23, 2021
Publication date: October 27, 2022
Inventors: Bo Dai, Mengjiao Yang, Hanjun Dai, Dale Eric Schuurmans
-
Publication number: 20200234145
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions to be performed by an agent interacting with an environment. In one aspect, a method comprises: obtaining a graph of nodes and edges that represents an interaction history of the agent with the environment; generating an encoded representation of the graph representing the interaction history of the agent with the environment; processing an input based on the encoded representation of the graph using an action selection neural network, in accordance with current values of action selection neural network parameters, to generate an action selection output; and selecting an action from a plurality of possible actions to be performed by the agent using the action selection output generated by the action selection neural network.
Type: Application
Filed: January 22, 2020
Publication date: July 23, 2020
Inventors: Hanjun Dai, Yujia Li, Chenglong Wang, Rishabh Singh, Po-Sen Huang, Pushmeet Kohli