Patents by Inventor Anthony J. Piergiovanni

Anthony J. Piergiovanni has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240119713
    Abstract: Provided is an approach that aligns multi-modal tokens using cross-attention without losing the advantages of global self-attention. In contrast to previous works that concatenate the unimodal tokens along the sequence dimension, example approaches described herein align per-modality tokens by chaining them along the channels. Specifically, the tokens from one modality can be used to query the other modality and the output can be concatenated with the query tokens on the channels. An analogous process can also be repeated (or performed in parallel) where the roles of the two modalities are switched. The resulting sets of compound tokens can be concatenated and fed into a self-attention encoder, such as a transformer encoder.
    Type: Application
    Filed: September 27, 2023
    Publication date: April 11, 2024
    Inventors: Anthony J. Piergiovanni, Maxwell Mbabilla Aladago
  • Publication number: 20230394306
    Abstract: Provided is an efficient multi-modal processing model. The multi-modal processing model can process input data from multiple different domains to generate a prediction for a multi-modal processing task. A machine-learned multi-modal processing model can include an adaptive tokenization layer that is configured to adaptively tokenize features generated from the multi-modal inputs into sets of tokens. Specifically, the tokens may have a smaller data size relative to the features from the inputs, enabling a reduced number of processing operations to be performed overall and thereby improving the efficiency of the model.
    Type: Application
    Filed: June 2, 2023
    Publication date: December 7, 2023
    Inventors: Anthony J. Piergiovanni, Wei-Cheng Kuo, Anelia Angelova
  • Publication number: 20220366257
    Abstract: Generally, the present disclosure is directed to a neural architecture search process for finding small and fast video processing networks for understanding of video data. The neural architecture search process can automatically design networks that provide comparable video processing performance at a fraction of the computational and storage cost of larger existing models, thereby conserving computing resources such as memory and processor usage.
    Type: Application
    Filed: September 16, 2020
    Publication date: November 17, 2022
    Inventors: Anthony J. Piergiovanni, Anelia Angelova, Michael Sahngwon Ryoo
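
The abstract for publication 20240119713 above describes querying one modality with the other through cross-attention, concatenating the result with the query tokens along the channels, repeating with the roles switched, and concatenating the two sets of compound tokens before a self-attention encoder. The following is a minimal sketch of that data flow only, not the method as claimed in the application. It assumes single-head scaled dot-product attention with no learned projections, assumes both modalities share the same channel width, and omits the final encoder; the function names (`cross_attend`, `compound_tokens`, `fuse`) are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention where the other modality's tokens act as
    both keys and values. Assumption: no learned projections, one head."""
    d = len(queries[0])
    attended = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for k in keys_values
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(d)]
        )
    return attended


def compound_tokens(query_tokens, other_tokens):
    """Concatenate each query token with its cross-attention output along
    the channel dimension, doubling the channel width."""
    attended = cross_attend(query_tokens, other_tokens)
    return [q + a for q, a in zip(query_tokens, attended)]


def fuse(image_tokens, text_tokens):
    """Build compound tokens in both directions, then concatenate the two
    sets along the sequence dimension."""
    image_compound = compound_tokens(image_tokens, text_tokens)
    text_compound = compound_tokens(text_tokens, image_tokens)
    return image_compound + text_compound


if __name__ == "__main__":
    image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 tokens, 2 channels
    text_tokens = [[0.5, 0.5], [2.0, -1.0]]              # 2 tokens, 2 channels
    fused = fuse(image_tokens, text_tokens)
    print(len(fused), len(fused[0]))  # sequence length, channel width
```

In this sketch the fused sequence has one compound token per input token from either modality, and each compound token is twice as wide as an input token. A full model would presumably apply learned projections inside the attention and pass the fused sequence to a transformer encoder.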