Patents by Inventor Joshua Timothy Ainslie

Joshua Timothy Ainslie has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250111210
    Abstract: Systems and methods for processing inputs using attention neural networks. In particular, one or more of the attention layers within the attention neural network compute relative position biases using functional interpolation. (A minimal illustrative sketch follows this entry.)
    Type: Application
    Filed: September 27, 2024
    Publication date: April 3, 2025
    Inventors: Chong You, Guru Guruganesh, Joshua Timothy Ainslie, Manzil Zaheer, Sanjiv Kumar, Santiago Ontañón, Shanda Li, Venkata Sesha Pavana Srinadh Bhojanapalli, Sumit Sanghai
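
A minimal sketch of the mechanism the abstract above describes, in Python/NumPy. It loosely follows the FIRE approach published by several of these inventors (a small MLP applied to a log-normalized relative distance); the function names, the choice of tanh, and the exact normalization are my assumptions, not the claimed implementation.

```python
import numpy as np

def fire_bias(seq_len, w1, b1, w2, b2, c=1.0, eps=1e-6):
    """Relative position bias via functional interpolation (illustrative).

    A tiny MLP (w1, b1, w2, b2) maps a *normalized* relative distance to a
    scalar bias for each (query, key) pair. Normalizing keeps the MLP input
    bounded, so the learned function interpolates smoothly across positions.
    """
    i = np.arange(seq_len, dtype=float)[:, None]   # query positions
    j = np.arange(seq_len, dtype=float)[None, :]   # key positions
    rel = np.maximum(i - j, 0.0)                   # causal relative distance
    psi = lambda x: np.log(c * x + 1.0)            # monotone log transform
    z = psi(rel) / np.maximum(psi(i), eps)         # normalize by query position
    h = np.tanh(z[..., None] @ w1 + b1)            # (seq, seq, hidden)
    return (h @ w2 + b2)[..., 0]                   # (seq, seq) bias matrix

# The bias would be added to the pre-softmax attention logits, e.g.:
L, H = 8, 16
rng = np.random.default_rng(0)
bias = fire_bias(L, rng.normal(size=(1, H)), np.zeros(H),
                 rng.normal(size=(H, 1)), np.zeros(1))
```

Because the MLP input is squashed into a bounded range, the learned bias function can be evaluated at sequence lengths longer than any seen during training, which is the practical point of interpolating a function rather than indexing a fixed bias table.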
  • Publication number: 20240386256
    Abstract: Improved multi-layer machine learning model architectures are provided that exhibit increased accuracy, decreased training time, decreased inference compute cost, and/or increased stability while training. These improved models include a plurality of sequential layers, each layer comprising a mixing layer that feeds into a feedforward layer. These improved models achieve these benefits by ‘enhancing’ a subset of the feedforward layers with mixture-of-experts or other sparse multi-network architectures while ‘degrading’ a subset of the mixing layers to be simple linear mixing layers (e.g., that multiply inputs by one or more mixing matrices) rather than more complicated attentional mixing mechanisms (e.g., including a number of matrix multiplications, dot products, and nonlinear operations). (A minimal illustrative sketch follows this entry.)
    Type: Application
    Filed: May 16, 2023
    Publication date: November 21, 2024
    Inventors: James Lee-Thorp, Joshua Timothy Ainslie
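
A minimal sketch of the alternating pattern described above, assuming NumPy, top-1 routing, and simple residual placement; none of these specifics come from the application itself.

```python
import numpy as np

def linear_mix(x, seq_mix):
    """'Degraded' mixing layer: a single matrix multiply along the sequence
    dimension -- no dot products, softmax, or other attention machinery.
    x: (seq_len, d_model); seq_mix: (seq_len, seq_len)."""
    return seq_mix @ x

def moe_ffn(x, expert_w, gate_w):
    """'Enhanced' feed-forward layer: a learned gate routes each token to
    its top-1 expert, so only one expert's weights run per token.
    expert_w: (n_experts, d_model, d_model); gate_w: (d_model, n_experts)."""
    choice = (x @ gate_w).argmax(axis=-1)          # expert index per token
    out = np.empty_like(x)
    for e in range(expert_w.shape[0]):
        sel = choice == e
        out[sel] = np.maximum(x[sel] @ expert_w[e], 0.0)  # ReLU expert
    return out

def sparse_mixer_layer(x, seq_mix, expert_w, gate_w):
    """One layer in the alternating pattern: cheap linear mixing feeding a
    sparse mixture-of-experts feed-forward block, each with a residual."""
    h = x + linear_mix(x, seq_mix)
    return h + moe_ffn(h, expert_w, gate_w)
```

The trade is cost for cost: the linear mixing layer is a single matrix multiply, while the mixture-of-experts feed-forward layer touches only one expert's weights per token, so model capacity grows without a matching growth in per-token compute.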
  • Publication number: 20230077928
    Abstract: Transformer systems and methods of using such transformer systems, including computer programs encoded on a computer storage medium, for performing a deep learning task on an input sequence to generate an encoded output. In one aspect, one of the transformer systems includes an encoder architecture block, comprising: a spectral transform mixing layer that receives input embeddings of input tokens and generates, as output, a spectral transform output along a sequence dimension of the input embeddings; and a feed forward layer that receives an input based on the input embeddings of input tokens and the spectral transform output and generates an output for a subsequent processing block. (A minimal illustrative sketch follows this entry.)
    Type: Application
    Filed: September 14, 2021
    Publication date: March 16, 2023
    Inventors: James Patrick Lee-Thorp, Joshua Timothy Ainslie, Ilya Eckstein, Santiago Ontañón
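
As a rough illustration of such an encoder block: the sketch below uses the real part of a discrete Fourier transform along the sequence dimension as the spectral transform, echoing the FNet model published by these inventors (which also transforms the hidden dimension); the residual placement and the omission of layer normalization are simplifying assumptions.

```python
import numpy as np

def spectral_encoder_block(x, w1, b1, w2, b2):
    """One encoder block: parameter-free spectral mixing followed by a
    position-wise feed-forward layer, each wrapped in a residual.
    x: (seq_len, d_model)."""
    # Spectral transform along the sequence dimension; keeping the real
    # part returns the activations to real values for the layers above.
    mixed = np.fft.fft(x, axis=0).real
    h = x + mixed                                   # input + spectral output
    ffn = np.maximum(h @ w1 + b1, 0.0) @ w2 + b2    # ReLU feed-forward
    return h + ffn                                  # output for the next block
```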
  • Publication number: 20220156553
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset. (A minimal illustrative sketch follows this entry.)
    Type: Application
    Filed: January 31, 2022
    Publication date: May 19, 2022
    Inventors: Joshua Timothy Ainslie, Santiago Ontañón, Philip Pham, Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Amr Ahmed
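
This abstract (shared verbatim by the granted patent and the earlier publication listed below, which belong to the same family) describes attention that treats a proper subset of positions differently from the rest. The sketch below shows one such pattern, a 'global' subset plus a local window, in the spirit of the ETC and BigBird models published by these inventors; the subset choice, window size, and dense-mask construction are illustrative only.

```python
import numpy as np

def sparse_attention_mask(seq_len, n_global=2, window=3):
    """Boolean mask: True where a query (row) may attend to a key (column).

    Positions in the 'global' proper subset attend to, and are attended by,
    every position; all remaining positions attend only within a local
    window (and to the global subset).
    """
    i = np.arange(seq_len)
    mask = np.abs(i[:, None] - i[None, :]) <= window   # local band
    mask[:n_global, :] = True                          # global rows see all
    mask[:, :n_global] = True                          # all rows see globals
    return mask

# Example: logits outside the mask would be set to -inf before the softmax.
print(sparse_attention_mask(8).astype(int))
```

A practical implementation would exploit the sparsity for roughly O(L·(window + n_global)) cost instead of O(L²); the dense boolean mask here is only for readability.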
  • Patent number: 11238332
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: February 1, 2022
    Assignee: Google LLC
    Inventors: Joshua Timothy Ainslie, Santiago Ontañón, Philip Pham, Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Amr Ahmed
  • Publication number: 20210383191
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing network inputs using an attention neural network that has one or more sparse attention sub-layers. Each sparse attention sub-layer is configured to apply a sparse attention mechanism that attends differently for input positions that are in a first proper subset of the input positions in the input to the sub-layer than for positions that are not in the first proper subset.
    Type: Application
    Filed: June 7, 2021
    Publication date: December 9, 2021
    Inventors: Joshua Timothy Ainslie, Santiago Ontañón, Philip Pham, Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Amr Ahmed