Patents by Inventor Jacob Assa

Jacob Assa has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20260080168
    Abstract: Described herein are techniques for generating captions for augmented reality (AR) effects in videos using an adapted multimodal large language model (MLLM). The technique involves sampling frames from base and AR-applied videos, combining them into concatenated frames, and processing them with a hybrid vision encoder. Visual tokens are projected into a language model token space, reshaped, downsampled, and interleaved with text tokens. A fine-tuned large language model processes this input sequence to generate AR effect captions. Optical character recognition extracts text from AR frames, which is combined with generated captions and metadata to produce merged captions and content tags. This approach enables accurate description of temporal AR effects and facilitates downstream applications like search and ranking.
    Type: Application
    Filed: September 16, 2024
    Publication date: March 19, 2026
    Inventors: Kwot Sin Lee, Varnith Uttam Chordia, Jacob Assa, Maryna Diakonova
  • Publication number: 20260080636
    Abstract: The subject matter described herein relates to a system and method for creating augmented reality (AR) applications using artificial intelligence. The system comprises a multi-agent architecture including a lens or AR content creator agent, AR engineer agent, and designer agent that collaborate to generate AR application designs based on user input. A content management system stores reusable components and asset generators provide customized visual elements. The user interface allows natural language interactions to iteratively refine the AR application. The system leverages large language models and retrieval-augmented generation to construct appropriate prompts and select relevant components. Generated designs are assembled into executable AR applications using a plugin that interfaces with an AR development environment. This AI-assisted approach enables rapid creation of diverse, engaging AR experiences with minimal technical expertise required from users.
    Type: Application
    Filed: September 9, 2025
    Publication date: March 19, 2026
    Inventors: Jacob Assa, Avihay Assouline, Itamar Berger, Amir Fruchtman, Sergei Korolev, Gal Sasson, Roman Shtemberko, Jonathan Solichin, Trevor Stephenson, Aleksei Stoliar
  • Patent number: 12361934
    Abstract: Systems and methods are provided for performing automated speech recognition. The systems and methods perform operations comprising: accessing a language model that includes a plurality of n-grams, each of the plurality of n-grams comprising a respective sequence of words and corresponding LM score; selecting a target word to boost in the language model; receiving a boosting factor for the target word; identifying a target n-gram in the language model that includes the target word; identifying a subset of n-grams of the plurality of n-grams that include words in a portion of the target n-gram; and adjusting the LM score of the target n-gram based on the LM scores of the subset of n-grams and the boosting factor.
    Type: Grant
    Filed: July 14, 2022
    Date of Patent: July 15, 2025
    Assignee: Snap Inc.
    Inventors: Jacob Assa, Alan Bekker, Zach Moshe
  • Patent number: 12315495
    Abstract: Systems and methods are provided for extracting entities from received speech. The systems and methods perform operations comprising receiving an audio file comprising speech input and processing, by a speech recognition engine, the audio file comprising the speech input to generate an initial character-based representation of the speech input. The operations further comprise processing, by an entity extractor, the initial character-based representation of the speech input to generate an estimated set of entities of the speech input. The operations further comprise generating, by the speech recognition engine, a textual representation of the speech input based on the estimated set of entities of the speech input.
    Type: Grant
    Filed: December 17, 2021
    Date of Patent: May 27, 2025
    Assignee: Snap Inc.
    Inventors: Alan Bekker, Jacob Assa, Itamar Schen, Einav Itamar
  • Patent number: 12236946
    Abstract: Systems and methods are provided for performing automated speech recognition. The systems and methods access an LM that includes a plurality of n-grams, each of the plurality of n-grams comprising a respective sequence of words and corresponding LM score, and receive a list of words associated with a group classification, each word in the list of words being associated with a respective weight. The systems and methods compute, based on the LM scores of the plurality of n-grams, a probability that a given word in the list of words associated with the group classification appears in an n-gram in the LM comprising an individual sequence of words, and add one or more new n-grams to the LM comprising one or more words in the list of words in combination with the individual sequence of words and associated with a particular LM score based on the computed probability.
    Type: Grant
    Filed: August 22, 2022
    Date of Patent: February 25, 2025
    Inventors: Jacob Assa, Alan Bekker, Zach Moshe
  • Publication number: 20250029595
    Abstract: Systems and methods are provided for providing emotion-based text to speech. The systems and methods perform operations comprising accessing a text string; storing a plurality of embeddings associated with a plurality of speakers, a first embedding for a first speaker being associated with a first emotion and a second embedding for a second speaker of the plurality of speakers being associated with a second emotion; selecting the first speaker to speak one or more words of the text string; determining that the one or more words are associated with the second emotion; generating, based on the first embedding and the second embedding, a third embedding for the first speaker associated with the second emotion; and applying the third embedding and the text string to a vocoder to generate an audio stream comprising the one or more words being spoken by the first speaker with the second emotion.
    Type: Application
    Filed: October 4, 2024
    Publication date: January 23, 2025
    Inventors: Liron Harazi, Jacob Assa, Alan Bekker
  • Patent number: 12142257
    Abstract: Systems and methods are provided for providing emotion-based text to speech. The systems and methods perform operations comprising accessing a text string; storing a plurality of embeddings associated with a plurality of speakers, a first embedding for a first speaker being associated with a first emotion and a second embedding for a second speaker of the plurality of speakers being associated with a second emotion; selecting the first speaker to speak one or more words of the text string; determining that the one or more words are associated with the second emotion; generating, based on the first embedding and the second embedding, a third embedding for the first speaker associated with the second emotion; and applying the third embedding and the text string to a vocoder to generate an audio stream comprising the one or more words being spoken by the first speaker with the second emotion.
    Type: Grant
    Filed: February 8, 2022
    Date of Patent: November 12, 2024
    Assignee: Snap Inc.
    Inventors: Liron Harazi, Jacob Assa, Alan Bekker
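
The abstract of patent 12361934 above describes boosting a target word in an n-gram language model by adjusting the score of a target n-gram based on the scores of related n-grams and a boosting factor. The following is only an illustrative sketch of that general idea, not the claimed method. It assumes a hypothetical LM layout (a dict mapping word tuples to log10 probabilities), treats the "subset" as the n-grams that share the target n-gram's history, boosts only n-grams that end in the target word, and renormalizes so the probability mass of each history is unchanged. The function name `boost_word` and all data are invented for illustration.

```python
import math


def boost_word(lm, target_word, factor):
    """Return a new LM in which n-grams ending in target_word are boosted.

    lm maps non-empty word tuples to log10 probabilities (hypothetical layout).
    factor must be positive; values above 1 raise the target's probability.
    """
    boosted = dict(lm)
    for ngram, score in lm.items():
        if ngram[-1] != target_word:
            continue
        history = ngram[:-1]
        # Assumed subset: n-grams of the same order sharing the same history.
        siblings = [g for g in lm if len(g) == len(ngram) and g[:-1] == history]
        total = sum(10 ** lm[g] for g in siblings)
        p_target = 10 ** score
        # Scale the target by the factor, then renormalize the whole subset
        # so that its total probability mass stays what it was before.
        scale = total / (total - p_target + factor * p_target)
        for g in siblings:
            p = 10 ** lm[g]
            if g == ngram:
                p *= factor
            boosted[g] = math.log10(p * scale)
    return boosted


# Toy bigram scores, invented for this example.
toy_lm = {
    ("call", "mom"): math.log10(0.2),
    ("call", "bob"): math.log10(0.8),
    ("text", "mom"): math.log10(0.5),
    ("text", "dad"): math.log10(0.5),
}
boosted_lm = boost_word(toy_lm, "mom", 2.0)
for ngram, score in sorted(boosted_lm.items()):
    print(ngram, round(10 ** score, 3))
```

In this sketch, raising one n-gram's score lowers its siblings proportionally, which keeps each history's distribution consistent. A production LM would also need to handle backoff weights, which this example ignores.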