Patents by Inventor Zihang Dai

Zihang Dai has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230359862
    Abstract: A computer-implemented method for performing computer vision with reduced computational cost and improved accuracy can include obtaining, by a computing system including one or more computing devices, input data comprising an input tensor having one or more dimensions, providing, by the computing system, the input data to a machine-learned convolutional attention network, the machine-learned convolutional attention network including two or more network stages, and, in response to providing the input data to the machine-learned convolutional attention network, receiving, by the computing system, a machine-learning prediction from the machine-learned convolutional attention network. The convolutional attention network can include at least one attention block, wherein the attention block includes a relative attention mechanism, the relative attention mechanism including the sum of a static convolution kernel with an adaptive attention matrix.
    Type: Application
    Filed: July 19, 2023
    Publication date: November 9, 2023
    Inventors: Zihang Dai, Mingxing Tan, Quoc V. Le, Hanxiao Liu
  • Patent number: 11755883
    Abstract: A computer-implemented method for performing computer vision with reduced computational cost and improved accuracy can include obtaining, by a computing system including one or more computing devices, input data comprising an input tensor having one or more dimensions, providing, by the computing system, the input data to a machine-learned convolutional attention network, the machine-learned convolutional attention network including two or more network stages, and, in response to providing the input data to the machine-learned convolutional attention network, receiving, by the computing system, a machine-learning prediction from the machine-learned convolutional attention network. The convolutional attention network can include at least one attention block, wherein the attention block includes a relative attention mechanism, the relative attention mechanism including the sum of a static convolution kernel with an adaptive attention matrix.
    Type: Grant
    Filed: May 27, 2022
    Date of Patent: September 12, 2023
    Assignee: GOOGLE LLC
    Inventors: Zihang Dai, Hanxiao Liu, Mingxing Tan, Quoc V. Le
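The relative attention mechanism claimed above sums a static, input-independent kernel (indexed by relative position) with the adaptive query-key attention logits before the softmax. A minimal one-head, 1-D sketch of that sum; names, shapes, and the random inputs are illustrative, not taken from the filing:

```python
import numpy as np

def relative_attention(q, k, v, rel_bias):
    """One-head 1-D relative attention: softmax(q k^T / sqrt(d) + w[i-j]) v.

    q, k, v: (L, d) arrays; rel_bias: (2L-1,) static kernel indexed by the
    relative position i - j (the "static convolution kernel" of the claims).
    """
    L, d = q.shape
    logits = q @ k.T / np.sqrt(d)                    # adaptive attention matrix
    idx = np.arange(L)
    logits += rel_bias[idx[:, None] - idx[None, :] + L - 1]  # static kernel
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
L, d = 6, 4
out = relative_attention(rng.normal(size=(L, d)),
                         rng.normal(size=(L, d)),
                         rng.normal(size=(L, d)),
                         rng.normal(size=(2 * L - 1,)))
```

Because `rel_bias` depends only on `i - j`, it acts like a depthwise convolution filter shared across positions, while the `q k^T` term adapts to the input.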
  • Publication number: 20230281400
    Abstract: Example embodiments of the present disclosure relate to systems and methods for pretraining image-processing models on weakly-supervised image-text pairs. The pretraining can include receiving a training sequence for the machine-learned image-processing model. The training sequence can include text tokens and image tokens. A prefix sequence can contain the image tokens. A remainder sequence can include a remainder set of the text tokens. The pretraining can include determining, using the prefix sequence as an input to the machine-learned image-processing model, an objective based on recovery of the remainder sequence. The pretraining can include updating one or more learnable parameters of the machine-learned image-processing model based on the objective.
    Type: Application
    Filed: March 3, 2022
    Publication date: September 7, 2023
    Inventors: Zirui Wang, Jiahui Yu, Yuan Cao, Wei Yu, Zihang Dai
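The prefix objective described here places the image tokens in a fully visible prefix and trains the model to recover the remainder text tokens. One common way to realize such a split is an attention mask that is bidirectional over the prefix and causal over the remainder; the sketch below shows only that masking logic, with an illustrative split point:

```python
import numpy as np

def prefix_lm_mask(prefix_len, total_len):
    """Attention mask for prefix-LM pretraining: positions in the prefix
    attend to the whole prefix bidirectionally; remainder positions attend
    to the prefix and causally to earlier remainder positions.
    Returns a (total_len, total_len) boolean array, True = may attend."""
    mask = np.tril(np.ones((total_len, total_len), dtype=bool))  # causal base
    mask[:prefix_len, :prefix_len] = True                        # bidirectional prefix
    return mask

# e.g. 3 image tokens as the prefix, 2 text tokens to recover
m = prefix_lm_mask(3, 5)
```

The recovery objective is then ordinary next-token cross-entropy computed only over the remainder positions.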
  • Publication number: 20230154161
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using memory-optimized contrastive learning to train image encoder and text encoder neural networks.
    Type: Application
    Filed: November 16, 2022
    Publication date: May 18, 2023
    Inventors: Hieu Hy Pham, Zihang Dai, Golnaz Ghiasi, Hanxiao Liu, Wei Yu, Mingxing Tan, Quoc V. Le
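The contrastive objective pairs each image embedding with its matching text embedding against the rest of the batch. A sketch of the symmetric InfoNCE-style loss only; the memory optimizations that are the subject of the filing are not reproduced here, and the temperature value is illustrative:

```python
import numpy as np

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired image/text
    embeddings: matching pairs sit on the diagonal of the similarity
    matrix and are treated as the correct class in both directions."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (B, B) similarity matrix
    labels = np.arange(len(logits))             # matching pairs on the diagonal

    def xent(l):                                # row-wise cross-entropy
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    return 0.5 * (xent(logits) + xent(logits.T))

B = 4
e = np.eye(B, 8)                      # toy orthonormal embeddings
aligned = contrastive_loss(e, e)      # perfectly matched pairs -> tiny loss
shuffled = contrastive_loss(e, e[::-1])  # mismatched pairs -> large loss
```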
  • Publication number: 20230022151
    Abstract: The present disclosure is directed to machine learning model architectures which provide full attention capability in each attention head while maintaining low computation and memory complexity. Specifically, according to one aspect of the present disclosure, example attention models provided herein can treat the self-attention mechanism as a conditional expectation over embeddings at each location and approximate the conditional distribution with a structured factorization. Each location can attend to all other locations, either via direct attention, or through indirect attention to group representations, which are again conditional expectations of embeddings from corresponding local regions.
    Type: Application
    Filed: July 8, 2022
    Publication date: January 26, 2023
    Inventors: Hanjun Dai, Bo Dai, Hongyu Ren, Dale Eric Schuurmans, Zihang Dai, Mengjiao Yang
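The factorization in this abstract lets each location attend directly within its local region and indirectly to other regions through group representations. The sketch below is a heavy simplification under assumed shapes: group summaries are plain mean-pooled keys/values rather than the learned conditional expectations the filing describes, but it preserves the structure of direct-plus-indirect attention:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def combiner_like_attention(q, k, v, region):
    """Structured-factorization attention sketch: each query i attends
    directly to positions in its own region and indirectly to every other
    region through a pooled group representation, so all locations are
    still (indirectly) reachable. region: length-L int array of region ids
    0..R-1 assigning each position to a region."""
    L, d = q.shape
    regions = np.unique(region)
    k_bar = np.stack([k[region == r].mean(0) for r in regions])  # group keys
    v_bar = np.stack([v[region == r].mean(0) for r in regions])  # group values
    out = np.zeros_like(v)
    for i in range(L):
        ri = region[i]
        local = np.where(region == ri)[0]
        others = np.where(regions != ri)[0]
        # one softmax over {local positions} ∪ {other regions' summaries}
        logits = np.concatenate([q[i] @ k[local].T,
                                 q[i] @ k_bar[others].T]) / np.sqrt(d)
        p = softmax(logits)
        out[i] = p[:len(local)] @ v[local] + p[len(local):] @ v_bar[others]
    return out

rng = np.random.default_rng(1)
L, d = 8, 4
region = np.repeat(np.arange(2), 4)   # two regions of four positions each
out = combiner_like_attention(rng.normal(size=(L, d)),
                              rng.normal(size=(L, d)),
                              rng.normal(size=(L, d)), region)
```

With R regions the per-query cost drops from O(L) to O(L/R + R) while keeping full (direct or indirect) connectivity.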
  • Publication number: 20220383119
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a machine learning task on a network input to generate a network output. One of the systems includes an attention neural network configured to perform the machine learning task. The attention neural network includes one or more attentions layers that each include a squared ReLU activation layer, a depth-wise convolution layer, or both.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Inventors: David Richard So, Quoc V. Le, Jr., Hanxiao Liu, Wojciech Andrzej Manke, Zihang Dai, Noam M. Shazeer
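The two layer variants named in this abstract are simple to state. A minimal sketch of both, with an illustrative causal depth-wise convolution (shapes and names are assumptions, not from the filing):

```python
import numpy as np

def squared_relu(x):
    """Squared ReLU activation described in the filing: max(x, 0) ** 2."""
    return np.maximum(x, 0.0) ** 2

def depthwise_conv1d(x, kernels):
    """Per-channel (depth-wise) causal 1-D convolution over a sequence.
    x: (L, d) sequence; kernels: (k, d), one length-k filter per channel."""
    k, d = kernels.shape
    pad = np.vstack([np.zeros((k - 1, d)), x])   # left padding keeps it causal
    return np.stack([(pad[i:i + k] * kernels).sum(0)
                     for i in range(x.shape[0])])
```

A kernel of `[[0.], [1.]]` picks out the current position per channel, so the convolution reduces to the identity on that channel, which is an easy sanity check.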
  • Publication number: 20220383069
    Abstract: A computer-implemented method for performing computer vision with reduced computational cost and improved accuracy can include obtaining, by a computing system including one or more computing devices, input data comprising an input tensor having one or more dimensions, providing, by the computing system, the input data to a machine-learned convolutional attention network, the machine-learned convolutional attention network including two or more network stages, and, in response to providing the input data to the machine-learned convolutional attention network, receiving, by the computing system, a machine-learning prediction from the machine-learned convolutional attention network. The convolutional attention network can include at least one attention block, wherein the attention block includes a relative attention mechanism, the relative attention mechanism including the sum of a static convolution kernel with an adaptive attention matrix.
    Type: Application
    Filed: May 27, 2022
    Publication date: December 1, 2022
    Inventors: Zihang Dai, Hanxiao Liu, Mingxing Tan, Quoc V. Le
  • Publication number: 20220367052
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input to generate a network output. In one aspect, one of the systems includes a neural network configured to perform the machine learning task, the neural network including one or more blocks that each include a feedforward spatial transformation unit.
    Type: Application
    Filed: May 16, 2022
    Publication date: November 17, 2022
    Inventors: Hanxiao Liu, David Richard So, Quoc V. Le, Zihang Dai
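The "feedforward spatial transformation unit" in this abstract mixes information across the sequence dimension inside an otherwise per-position feedforward block. A gMLP-style spatial gating sketch under assumed shapes (the parameter names are illustrative):

```python
import numpy as np

def spatial_gating_unit(x, W_s, b_s):
    """Feedforward spatial transformation sketch: split the channels, mix
    one half across the *sequence* dimension with a dense spatial
    projection, and use the result to gate the other half element-wise.
    x: (L, d) with d even; W_s: (L, L) spatial weights; b_s: (L,) bias."""
    u, v = np.split(x, 2, axis=-1)        # (L, d/2) each
    v = W_s @ v + b_s[:, None]            # token (spatial) mixing
    return u * v                          # multiplicative gating
```

Initializing `W_s` near zero and `b_s` near one makes the unit start close to an identity on the gated half, a common stabilization choice for such gates.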
  • Publication number: 20220215209
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a machine learning model. One of the methods includes receiving training data comprising a plurality of unlabeled training inputs and a plurality of labeled training inputs; generating augmented training data, comprising generating, for each of the plurality of unlabeled training inputs, a respective augmented training input by applying a data augmentation technique to the unlabeled training input; and training the machine learning model on the augmented training data. In particular, but not exclusively, the model may be trained for perceptual tasks (e.g. tasks relating to vision or speech).
    Type: Application
    Filed: April 24, 2020
    Publication date: July 7, 2022
    Inventors: Thang Minh Luong, Quoc V. Le, Qizhe Xie, Zihang Dai
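The abstract specifies augmenting unlabeled inputs and training on the augmented data. One standard way such augmented pairs are used is a consistency term alongside the supervised loss; the particular objective below is an assumption for illustration, not a claim of the filing:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def uda_loss(logits_lab, labels, logits_unlab, logits_aug, lam=1.0):
    """Consistency-training objective sketch: supervised cross-entropy on
    labeled inputs plus a KL term pushing the prediction on an augmented
    unlabeled input toward the prediction on the original input (which
    would be held fixed via stop-gradient during training)."""
    p_lab = softmax(logits_lab)
    sup = -np.log(p_lab[np.arange(len(labels)), labels]).mean()
    p, q = softmax(logits_unlab), softmax(logits_aug)
    kl = (p * (np.log(p) - np.log(q))).sum(-1).mean()
    return sup + lam * kl
```

When the model's predictions on original and augmented inputs agree, the KL term vanishes and only the supervised loss remains.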
  • Publication number: 20220188636
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network using meta pseudo-labels. One of the methods includes training a student neural network using pseudo-labels generated by a teacher neural network that is being trained jointly with the student neural network.
    Type: Application
    Filed: December 14, 2021
    Publication date: June 16, 2022
    Inventors: Hieu Hy Pham, Zihang Dai, Qizhe Xie, Quoc V. Le
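The joint teacher-student scheme above can be illustrated on a toy scale: the teacher pseudo-labels unlabeled data, the student updates on those pseudo-labels, and the teacher is then nudged in whatever direction improved the updated student's labeled-data loss. The skeleton below uses scalar logistic models, soft pseudo-labels, and a finite-difference feedback signal; all of these are simplifications for illustration, not the algorithm of the filing:

```python
import numpy as np

def mpl_step(teacher_w, student_w, x_unlab, x_lab, y_lab, lr=0.1):
    """One meta-pseudo-labels-style step on toy 1-D logistic models.
    1) Teacher produces (soft) pseudo-labels on unlabeled data.
    2) Student takes a gradient step toward the pseudo-labels.
    3) Teacher moves in the direction that reduces the *updated*
       student's loss on labeled data (finite-difference feedback)."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    def student_update(tw):
        pseudo = sigmoid(tw * x_unlab)                 # soft pseudo-labels
        p = sigmoid(student_w * x_unlab)
        grad = ((p - pseudo) * x_unlab).mean()         # logistic gradient
        return student_w - lr * grad

    def labeled_loss(sw):
        p = sigmoid(sw * x_lab)
        return -(y_lab * np.log(p) + (1 - y_lab) * np.log(1 - p)).mean()

    eps = 1e-3                                         # teacher feedback signal
    g = (labeled_loss(student_update(teacher_w + eps))
         - labeled_loss(student_update(teacher_w - eps))) / (2 * eps)
    return teacher_w - lr * g, student_update(teacher_w)

tw, sw = mpl_step(0.5, 0.0,
                  np.array([1.0, -1.0, 2.0]),   # unlabeled inputs
                  np.array([1.0, -2.0]),        # labeled inputs
                  np.array([1.0, 0.0]))         # labels
```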
  • Patent number: 10606846
    Abstract: Described herein are systems and methods for determining how to automatically answer questions like “Where did Harry Potter go to school?” Carefully built knowledge graphs provide rich sources of facts. However, it still remains a challenge to answer factual questions in natural language due to the tremendous variety of ways a question can be raised. Presented herein are embodiments of systems and methods for human inspired simple question answering (HISQA), a deep-neural-network-based methodology for automatic question answering using a knowledge graph. Inspired by human's natural actions in this task, embodiments first find the correct entity via entity linking, and then seek a proper relation to answer the question—both achieved by deep gated recurrent networks and neural embedding mechanism.
    Type: Grant
    Filed: May 23, 2016
    Date of Patent: March 31, 2020
    Assignee: Baidu USA LLC
    Inventors: Lei Li, Zihang Dai, Wei Xu
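The two-step pipeline in this abstract (entity linking, then relation selection over a knowledge graph) can be sketched with plain embedding dot-products; the filing's gated recurrent encoders, which would produce the question vector, are replaced here by a hand-picked toy vector, and the knowledge graph is hypothetical:

```python
import numpy as np

def answer(q_vec, entity_embs, relation_embs, kg):
    """Two-step KG question answering sketch: 1) link the question to its
    subject entity by embedding similarity, 2) pick the relation whose
    embedding best matches the question, 3) look up the object in the KG."""
    subject = max(entity_embs, key=lambda e: q_vec @ entity_embs[e])
    relation = max(relation_embs, key=lambda r: q_vec @ relation_embs[r])
    return kg.get((subject, relation))

entities = {"harry_potter": np.array([1.0, 0.0, 0.0]),
            "hermione": np.array([0.0, 1.0, 0.0])}
relations = {"educated_at": np.array([0.0, 0.0, 1.0]),
             "friend_of": np.array([0.0, 1.0, 0.0])}
kg = {("harry_potter", "educated_at"): "Hogwarts"}
q = np.array([1.0, 0.0, 1.0])   # stand-in for an encoded question vector
result = answer(q, entities, relations, kg)
```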
  • Publication number: 20170109355
    Abstract: Described herein are systems and methods for determining how to automatically answer questions like “Where did Harry Potter go to school?” Carefully built knowledge graphs provide rich sources of facts. However, it still remains a challenge to answer factual questions in natural language due to the tremendous variety of ways a question can be raised. Presented herein are embodiments of systems and methods for human inspired simple question answering (HISQA), a deep-neural-network-based methodology for automatic question answering using a knowledge graph. Inspired by human's natural actions in this task, embodiments first find the correct entity via entity linking, and then seek a proper relation to answer the question—both achieved by deep gated recurrent networks and neural embedding mechanism.
    Type: Application
    Filed: May 23, 2016
    Publication date: April 20, 2017
    Applicant: Baidu USA LLC
    Inventors: Lei Li, Zihang Dai, Wei Xu