Patents by Inventor Mostafa Dehghani
Mostafa Dehghani has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240169184
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
Type: Application
Filed: January 29, 2024
Publication date: May 23, 2024
Inventors: Tal Schuster, Adam Joshua Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Quoc Tran, Yi Tay, Donald Arthur Metzler, Jr.
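The idea in this abstract can be illustrated with a toy decoder: a prediction is made from each intermediate layer's hidden state, and decoding stops at the first layer whose prediction is confident enough. This is a minimal numpy sketch only; all names, dimensions, and the confidence rule are hypothetical, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_LAYERS, HIDDEN, VOCAB = 6, 16, 10
layers = [rng.normal(size=(HIDDEN, HIDDEN)) * 0.1 for _ in range(NUM_LAYERS)]
unembed = rng.normal(size=(HIDDEN, VOCAB)) * 0.1  # shared output head

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_token(h, threshold=0.9):
    """Run the decoder layers, exiting as soon as the prediction made
    from the intermediate hidden state is confident enough."""
    for depth, w in enumerate(layers, start=1):
        h = np.tanh(h @ w)
        probs = softmax(h @ unembed)   # predict from the intermediate state
        if probs.max() >= threshold:   # confident enough: skip the rest
            return probs.argmax(), depth
    return probs.argmax(), NUM_LAYERS  # no early exit: used the full stack

token, depth_used = decode_token(rng.normal(size=HIDDEN))
print(token, depth_used)
```

Lowering the threshold trades accuracy for speed: a threshold of 0 exits after one layer, while a threshold above 1 always runs the full stack.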
-
Patent number: 11983903
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using self-attention based neural networks. One of the methods includes obtaining one or more images comprising a plurality of pixels; determining, for each image of the one or more images, a plurality of image patches of the image, wherein each image patch comprises a different subset of the pixels of the image; processing, for each image of the one or more images, the corresponding plurality of image patches to generate an input sequence comprising a respective input element at each of a plurality of input positions, wherein a plurality of the input elements correspond to respective different image patches; and processing the input sequences using a neural network to generate a network output that characterizes the one or more images, wherein the neural network comprises one or more self-attention neural network layers.
Type: Grant
Filed: November 1, 2023
Date of Patent: May 14, 2024
Assignee: Google LLC
Inventors: Neil Matthew Tinmouth Houlsby, Sylvain Gelly, Jakob D. Uszkoreit, Xiaohua Zhai, Georg Heigold, Lucas Klaus Beyer, Alexander Kolesnikov, Matthias Johannes Lorenz Minderer, Dirk Weissenborn, Mostafa Dehghani, Alexey Dosovitskiy, Thomas Unterthiner
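The patch-to-sequence step in this abstract (each image patch becomes one input element of the sequence fed to the self-attention network) can be sketched with a single reshape. This is a minimal illustration under assumed shapes, not code from the patent.

```python
import numpy as np

def image_to_patch_sequence(image, patch):
    """Split an H x W x C image into non-overlapping patch x patch blocks,
    flattening each block into one input element of the sequence."""
    h, w, c = image.shape
    rows, cols = h // patch, w // patch
    seq = (image[:rows * patch, :cols * patch]
           .reshape(rows, patch, cols, patch, c)
           .swapaxes(1, 2)                       # (rows, cols, patch, patch, c)
           .reshape(rows * cols, patch * patch * c))
    return seq  # one row per image patch

image = np.arange(8 * 8 * 3).reshape(8, 8, 3).astype(float)
tokens = image_to_patch_sequence(image, patch=4)
print(tokens.shape)  # (4, 48): 4 patches, each holding 4*4*3 pixel values
```

The resulting sequence can then be processed by ordinary self-attention layers, exactly as a token sequence would be in text models.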
-
Publication number: 20240143691
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a sequence-to-sequence model that is recurrent in depth while employing self-attention to combine information from different parts of sequences.
Type: Application
Filed: December 18, 2023
Publication date: May 2, 2024
Inventors: Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob D. Uszkoreit, Lukasz Mieczyslaw Kaiser
-
Publication number: 20240062426
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using self-attention based neural networks. One of the methods includes obtaining one or more images comprising a plurality of pixels; determining, for each image of the one or more images, a plurality of image patches of the image, wherein each image patch comprises a different subset of the pixels of the image; processing, for each image of the one or more images, the corresponding plurality of image patches to generate an input sequence comprising a respective input element at each of a plurality of input positions, wherein a plurality of the input elements correspond to respective different image patches; and processing the input sequences using a neural network to generate a network output that characterizes the one or more images, wherein the neural network comprises one or more self-attention neural network layers.
Type: Application
Filed: November 1, 2023
Publication date: February 22, 2024
Inventors: Neil Matthew Tinmouth Houlsby, Sylvain Gelly, Jakob D. Uszkoreit, Xiaohua Zhai, Georg Heigold, Lucas Klaus Beyer, Alexander Kolesnikov, Matthias Johannes Lorenz Minderer, Dirk Weissenborn, Mostafa Dehghani, Alexey Dosovitskiy, Thomas Unterthiner
-
Patent number: 11886976
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
Type: Grant
Filed: July 14, 2023
Date of Patent: January 30, 2024
Assignee: Google LLC
Inventors: Tal Schuster, Adam Joshua Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Quoc Tran, Yi Tay, Donald Arthur Metzler, Jr.
-
Publication number: 20240020516
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using auto-regressive decoder neural networks. In particular, during generation, adaptive early exiting is used to reduce the time required to generate the output sequence.
Type: Application
Filed: July 14, 2023
Publication date: January 18, 2024
Inventors: Tal Schuster, Adam Joshua Fisch, Jai Prakash Gupta, Mostafa Dehghani, Dara Bahri, Vinh Quoc Tran, Yi Tay, Donald Arthur Metzler, Jr.
-
Patent number: 11860969
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a sequence-to-sequence model that is recurrent in depth while employing self-attention to combine information from different parts of sequences.
Type: Grant
Filed: August 10, 2020
Date of Patent: January 2, 2024
Assignee: Google LLC
Inventors: Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob D. Uszkoreit, Lukasz Mieczyslaw Kaiser
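"Recurrent in depth" here means the same block of weights is applied repeatedly over depth steps, rather than each layer having its own parameters, while self-attention mixes information across sequence positions at every step. A minimal numpy sketch, with all dimensions and the transition function chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
SEQ, DIM, STEPS = 5, 8, 4
wq, wk, wv = (rng.normal(size=(DIM, DIM)) * 0.1 for _ in range(3))
w_trans = rng.normal(size=(DIM, DIM)) * 0.1

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(h):
    """Combine information from different sequence positions."""
    q, k, v = h @ wq, h @ wk, h @ wv
    return softmax(q @ k.T / np.sqrt(DIM)) @ v

h = rng.normal(size=(SEQ, DIM))
for step in range(STEPS):          # recurrence in depth: the SAME weights
    h = h + self_attention(h)      # are reused at every depth step
    h = h + np.tanh(h @ w_trans)
print(h.shape)
```

Because the weights are shared across depth, the number of refinement steps can in principle be varied at inference time without changing the parameter count.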
-
Publication number: 20230409899
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing a network input using a computer vision neural network with learned tokenization.
Type: Application
Filed: June 21, 2022
Publication date: December 21, 2023
Inventors: Michael Sahngwon Ryoo, Anthony Jacob Piergiovanni, Anelia Angelova, Anurag Arnab, Mostafa Dehghani
-
Publication number: 20230244938
Abstract: An example method for pretraining a machine-learned model is provided. The example method includes obtaining a plurality of different combinations of configuration parameters of a pretraining objective framework. The example method includes generating, using the pretraining objective framework, a plurality of corrupted training examples from one or more training examples, wherein the plurality of corrupted training examples are respectively generated according to the plurality of different combinations. The example method includes inputting the plurality of corrupted training examples into the machine-learned model, wherein the machine-learned model is configured to generate uncorrupted subportions corresponding to corrupted subportions of the corrupted training examples. The example method includes obtaining, from the machine-learned model, a plurality of outputs respectively generated by the machine-learned model based on the plurality of corrupted training examples.
Type: Application
Filed: January 27, 2023
Publication date: August 3, 2023
Inventors: Jason Weng Wei, Dengyong Zhou, Xuezhi Wang, Dale Eric Schuurmans, Quoc V. Le, Maarten Paul Bosma, Ed Huai-Hsin Chi, Olivier Jean André Bousquet, Le Hou, Charles Aloysius Sutton, Nathanael Martin Schärli, Nathan Kemp Sekiguchi Scales, Augustus Quadrozzi Odena, Sharan Ajit Narang, Guy Gur-Ari Krakover, Aakanksha Chowdhery, David Martin Dohan, Aitor Lewkowycz, Henryk Michalewski, Jiageng Luan, David J. Bieber, Jacob Austin, Anders Johan Andreassen, Maxwell Isaac Nye, Yi Tay, Mostafa Dehghani
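The core mechanic in this abstract is span corruption under different configuration parameters: spans of the input are replaced with sentinels, and the masked-out subportions become the model's target. A toy sketch, where span length and span count play the role of the configuration parameters; the `<extra_id_N>` sentinel names are illustrative conventions, not taken from the patent.

```python
import random

def corrupt(tokens, span_len, num_spans, seed=0):
    """Replace num_spans non-overlapping spans of span_len tokens with
    sentinel markers; the masked-out tokens become the training target."""
    rng = random.Random(seed)
    slots = len(tokens) // span_len
    chosen = sorted(rng.sample(range(slots), num_spans))
    corrupted, target = [], []
    for i in range(slots):
        span = tokens[i * span_len:(i + 1) * span_len]
        if i in chosen:
            sentinel = f"<extra_id_{chosen.index(i)}>"
            corrupted.append(sentinel)   # the corrupted subportion...
            target.append(sentinel)
            target.extend(span)          # ...must be reproduced as output
        else:
            corrupted.extend(span)
    corrupted.extend(tokens[slots * span_len:])
    return corrupted, target

words = "the quick brown fox jumps over the lazy dog today".split()
# Different combinations of configuration parameters yield different
# corruption patterns from the same training example:
for span_len, num_spans in [(1, 2), (3, 1)]:
    print(corrupt(words, span_len, num_spans))
```

Varying the (span length, span count) combinations mixes short- and long-span denoising objectives over the same underlying training text.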
-
Publication number: 20230031702
Abstract: A method includes receiving, via a computing device, a screenshot of a display provided by a graphical user interface of the computing device. The method also includes generating, by an image-structure transformer of a neural network, a representation by fusing a first embedding based on the screenshot and a second embedding based on a layout of virtual objects in the screenshot. The method additionally includes predicting, by the neural network and based on the generated representation, a modeling task output associated with the graphical user interface. The method further includes providing, by the computing device, the predicted modeling task output.
Type: Application
Filed: July 13, 2022
Publication date: February 2, 2023
Inventors: Yang Li, Xin Zhou, Gang Li, Mostafa Dehghani, Alexey Alexeevich Gritsenko
-
Publication number: 20230017072
Abstract: A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of video tokens from the video data, the plurality of video tokens comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of video tokens as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.
Type: Application
Filed: July 8, 2021
Publication date: January 19, 2023
Inventors: Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lucic, Cordelia Luise Schmid
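One way to extract video tokens that carry spatiotemporal information is to cut the video into small blocks spanning both time and space, flattening each block into one token. The sketch below illustrates that idea for a single-channel video; the shapes and the tubelet terminology are assumptions for illustration, not specifics from the patent.

```python
import numpy as np

def video_to_tubelet_tokens(video, t_patch, s_patch):
    """Split a T x H x W video into t_patch x s_patch x s_patch
    spatio-temporal blocks ('tubelets'), one flattened token each."""
    t, h, w = video.shape
    nt, nh, nw = t // t_patch, h // s_patch, w // s_patch
    tokens = (video[:nt * t_patch, :nh * s_patch, :nw * s_patch]
              .reshape(nt, t_patch, nh, s_patch, nw, s_patch)
              .transpose(0, 2, 4, 1, 3, 5)    # group the block axes together
              .reshape(nt * nh * nw, t_patch * s_patch * s_patch))
    return tokens

video = np.zeros((8, 16, 16))  # 8 frames of 16x16 pixels
tokens = video_to_tubelet_tokens(video, t_patch=2, s_patch=8)
print(tokens.shape)  # (16, 128): 4*2*2 blocks, each 2*8*8 values
```

Because each token spans several frames, motion information is available to the transformer encoder from the very first layer, not only after frames are compared downstream.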
-
Publication number: 20220245432
Abstract: The present disclosure provides echo-attention layers, a new efficient method for increasing the expressiveness of self-attention layers without incurring significant parameter or training time costs. One intuition behind the proposed method is to learn to echo, i.e., attend once and then get N echo-ed attentions for free (or at a relatively cheap cost). As compared to stacking new layers, the proposed echoed attentions are targeted at providing similar representation power at a better cost efficiency.
Type: Application
Filed: February 3, 2022
Publication date: August 4, 2022
Inventors: Yi Tay, Donald Arthur Metzler, Jr., Dara Bahri, Mostafa Dehghani
-
Publication number: 20220245428
Abstract: Provided are machine-learned attention models that feature omnidirectional processing, example implementations of which can be referred to as Omnidirectional Representations from Transformers (OMNINET). In example models described in the present disclosure, instead of maintaining a strictly horizontal receptive field, each token is allowed to attend to all tokens in some or all of the other tokens across the entire network.
Type: Application
Filed: February 4, 2022
Publication date: August 4, 2022
Inventors: Yi Tay, Da-Cheng Juan, Dara Bahri, Donald Arthur Metzler, Jr., Jai Prakash Gupta, Mostafa Dehghani, Phillip Pham, Vamsi Krishna Aribandi, Zhen Qin
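The abstract contrasts a "strictly horizontal receptive field" (each layer attends only to the hidden states of its own layer) with attending across the entire network. A minimal way to sketch the latter is to keep the hidden states of every layer and let a final attention step read from all of them at once. This is a simplified illustration under assumed shapes, not the patent's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
SEQ, DIM, LAYERS = 4, 8, 3

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q_in, kv_in, wq, wk, wv):
    """Queries from q_in attend over keys/values built from kv_in."""
    q, k, v = q_in @ wq, kv_in @ wk, kv_in @ wv
    return softmax(q @ k.T / np.sqrt(DIM)) @ v

ws = [tuple(rng.normal(size=(DIM, DIM)) * 0.1 for _ in range(3))
      for _ in range(LAYERS + 1)]

# Ordinary stack: each layer attends horizontally, within its own layer.
h = rng.normal(size=(SEQ, DIM))
all_states = [h]
for wq, wk, wv in ws[:LAYERS]:
    h = h + attention(h, h, wq, wk, wv)
    all_states.append(h)

# Omnidirectional step: each token attends over the hidden states of
# ALL tokens at ALL layers of the network, not just the top layer.
memory = np.concatenate(all_states, axis=0)
wq, wk, wv = ws[LAYERS]
omni = h + attention(h, memory, wq, wk, wv)
print(omni.shape)
```

The memory here grows linearly with depth, which is why practical variants of this idea pair it with efficient attention mechanisms.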
-
Publication number: 20220108478
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using self-attention based neural networks. One of the methods includes obtaining one or more images comprising a plurality of pixels; determining, for each image of the one or more images, a plurality of image patches of the image, wherein each image patch comprises a different subset of the pixels of the image; processing, for each image of the one or more images, the corresponding plurality of image patches to generate an input sequence comprising a respective input element at each of a plurality of input positions, wherein a plurality of the input elements correspond to respective different image patches; and processing the input sequences using a neural network to generate a network output that characterizes the one or more images, wherein the neural network comprises one or more self-attention neural network layers.
Type: Application
Filed: October 1, 2021
Publication date: April 7, 2022
Inventors: Neil Matthew Tinmouth Houlsby, Sylvain Gelly, Jakob D. Uszkoreit, Xiaohua Zhai, Georg Heigold, Lucas Klaus Beyer, Alexander Kolesnikov, Matthias Johannes Lorenz Minderer, Dirk Weissenborn, Mostafa Dehghani, Alexey Dosovitskiy, Thomas Unterthiner
-
Publication number: 20210056162
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a sequence-to-sequence model that is recurrent in depth while employing self-attention to combine information from different parts of sequences.
Type: Application
Filed: August 10, 2020
Publication date: February 25, 2021
Inventors: Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob D. Uszkoreit, Lukasz Mieczyslaw Kaiser
-
Patent number: 10740433
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a sequence-to-sequence model that is recurrent in depth while employing self-attention to combine information from different parts of sequences.
Type: Grant
Filed: May 20, 2019
Date of Patent: August 11, 2020
Assignee: Google LLC
Inventors: Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob D. Uszkoreit, Lukasz Mieczyslaw Kaiser
-
Publication number: 20190354567
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a sequence-to-sequence model that is recurrent in depth while employing self-attention to combine information from different parts of sequences.
Type: Application
Filed: May 20, 2019
Publication date: November 21, 2019
Inventors: Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob D. Uszkoreit, Lukasz Mieczyslaw Kaiser