Patents by Inventor Jacob Assa

Jacob Assa has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

AUTOMATED CAPTIONING OF AUGMENTED REALITY EFFECTS IN VIDEOS

Publication number: 20260080168

Abstract: Described herein are techniques for generating captions for augmented reality (AR) effects in videos using an adapted multimodal large language model (MLLM). The technique involves sampling frames from base and AR-applied videos, combining them into concatenated frames, and processing them with a hybrid vision encoder. Visual tokens are projected into a language model token space, reshaped, downsampled, and interleaved with text tokens. A fine-tuned large language model processes this input sequence to generate AR effect captions. Optical character recognition extracts text from AR frames, which is combined with generated captions and metadata to produce merged captions and content tags. This approach enables accurate description of temporal AR effects and facilitates downstream applications like search and ranking.
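The frame sampling and concatenation step named in the abstract above can be illustrated with a minimal sketch. It assumes uniform temporal sampling and side-by-side (width-wise) concatenation, neither of which the abstract specifies. The helper names `sample_indices`, `concat_side_by_side`, and `build_paired_frames` are hypothetical, and frames are modeled as plain nested lists of pixel rows rather than real image tensors.

```python
def sample_indices(total_frames, num_samples):
    """Pick evenly spaced frame indices (assumed sampling strategy)."""
    if total_frames <= 0 or num_samples <= 0:
        raise ValueError("total_frames and num_samples must be positive")
    if num_samples == 1:
        return [0]
    step = (total_frames - 1) / (num_samples - 1)
    return [round(i * step) for i in range(num_samples)]


def concat_side_by_side(base_frame, ar_frame):
    """Join two frames row by row: base pixels on the left, AR pixels on the right."""
    if len(base_frame) != len(ar_frame):
        raise ValueError("frames must have the same height")
    return [b_row + a_row for b_row, a_row in zip(base_frame, ar_frame)]


def build_paired_frames(base_video, ar_video, num_samples):
    """Sample the same time steps from both videos and concatenate each pair."""
    if len(base_video) != len(ar_video):
        raise ValueError("videos must have the same number of frames")
    indices = sample_indices(len(base_video), num_samples)
    return [concat_side_by_side(base_video[i], ar_video[i]) for i in indices]


# Toy videos: 10 frames, each 2 rows by 2 pixels, pixel value encodes the frame index.
base_video = [[[t, t], [t, t]] for t in range(10)]
ar_video = [[[-t, -t], [-t, -t]] for t in range(10)]
paired = build_paired_frames(base_video, ar_video, 4)
```

Pairing the two frames in one image is one plausible way to let a vision encoder see the base and AR-applied content together; the encoder, projection, and language model stages described in the abstract are outside this sketch.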

Type: Application

Filed: September 16, 2024

Publication date: March 19, 2026

Inventors: Kwot Sin Lee, Varnith Uttam Chordia, Jacob Assa, Maryna Diakonova

AUTOMATED CREATION OF AUGMENTED REALITY EXPERIENCES USING MULTI-AGENT LANGUAGE MODELS

Publication number: 20260080636

Abstract: The subject matter described herein relates to a system and method for creating augmented reality (AR) applications using artificial intelligence. The system comprises a multi-agent architecture including a lens or AR content creator agent, AR engineer agent, and designer agent that collaborate to generate AR application designs based on user input. A content management system stores reusable components and asset generators provide customized visual elements. The user interface allows natural language interactions to iteratively refine the AR application. The system leverages large language models and retrieval-augmented generation to construct appropriate prompts and select relevant components. Generated designs are assembled into executable AR applications using a plugin that interfaces with an AR development environment. This AI-assisted approach enables rapid creation of diverse, engaging AR experiences with minimal technical expertise required from users.

Type: Application

Filed: September 9, 2025

Publication date: March 19, 2026

Inventors: Jacob Assa, Avihay Assouline, Itamar Berger, Amir Fruchtman, Sergei Korolev, Gal Sasson, Roman Shtemberko, Jonathan Solichin, Trevor Stephenson, Aleksei Stoliar

Boosting words in automated speech recognition

Patent number: 12361934

Abstract: Systems and methods are provided for performing automated speech recognition. The systems and methods perform operations comprising: accessing a language model that includes a plurality of n-grams, each of the plurality of n-grams comprising a respective sequence of words and corresponding LM score; selecting a target word to boost in the language model; receiving a boosting factor for the target word; identifying a target n-gram in the language model that includes the target word; identifying a subset of n-grams of the plurality of n-grams that include words in a portion of the target n-gram; and adjusting the LM score of the target n-gram based on the LM scores of the subset of n-grams and the boosting factor.
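As a rough illustration of the score adjustment described in the abstract above, the sketch below boosts every n-gram that ends in a target word and then rescales it together with the n-grams that share its context. Several points are assumptions rather than details from the patent: LM scores are treated as log10 probabilities, the "subset" is taken to be the n-grams with the same preceding words, and the sibling scores are renormalized so the context keeps its original probability mass. The function name `boost_word` is hypothetical.

```python
import math


def boost_word(ngrams, target_word, boost):
    """Return a copy of `ngrams` with n-grams ending in `target_word` boosted.

    `ngrams` maps word tuples to log10 probabilities (an assumed score format).
    """
    boosted = dict(ngrams)
    for target in ngrams:
        if target[-1] != target_word:
            continue
        context = target[:-1]
        # Assumed subset: n-grams of the same order sharing the target's context.
        siblings = [g for g in ngrams
                    if len(g) == len(target) and g[:-1] == context]
        total = sum(10 ** ngrams[g] for g in siblings)
        # Scale the target's probability by the boosting factor.
        scaled = {g: 10 ** ngrams[g] * (boost if g == target else 1.0)
                  for g in siblings}
        norm = sum(scaled.values())
        # Renormalize so the context's total probability mass is unchanged.
        for g in siblings:
            boosted[g] = math.log10(scaled[g] * total / norm)
    return boosted


lm = {
    ("call", "mom"): math.log10(0.2),
    ("call", "bob"): math.log10(0.8),
    ("text", "mom"): math.log10(0.5),
}
boosted_lm = boost_word(lm, "mom", 2.0)
```

In this toy model, "call mom" rises from 0.2 to about 0.33 while "call bob" drops from 0.8 to about 0.67. "text mom" is unchanged because it has no siblings to take mass from.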

Type: Grant

Filed: July 14, 2022

Date of Patent: July 15, 2025

Assignee: Snap Inc.

Inventors: Jacob Assa, Alan Bekker, Zach Moshe

Speech to entity

Patent number: 12315495

Abstract: Systems and methods are provided for extracting entities from received speech. The systems and methods perform operations comprising receiving an audio file comprising speech input and processing, by a speech recognition engine, the audio file comprising the speech input to generate an initial character-based representation of the speech input. The operations further comprise processing, by an entity extractor, the initial character-based representation of the speech input to generate an estimated set of entities of the speech input. The operations further comprise generating, by the speech recognition engine, a textual representation of the speech input based on the estimated set of entities of the speech input.

Type: Grant

Filed: December 17, 2021

Date of Patent: May 27, 2025

Assignee: Snap Inc.

Inventors: Alan Bekker, Jacob Assa, Itamar Schen, Einav Itamar

Grouping similar words in a language model

Patent number: 12236946

Abstract: Systems and methods are provided for performing automated speech recognition. The systems and methods access an LM that includes a plurality of n-grams, each of the plurality of n-grams comprising a respective sequence of words and corresponding LM score, and receive a list of words associated with a group classification, each word in the list of words being associated with a respective weight. The systems and methods compute, based on the LM scores of the plurality of n-grams, a probability that a given word in the list of words associated with the group classification appears in an n-gram in the LM comprising an individual sequence of words, and add one or more new n-grams to the LM comprising one or more words in the list of words in combination with the individual sequence of words and associated with a particular LM score based on the computed probability.
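One way to picture the grouping step in the abstract above is the following minimal sketch. It assumes that LM scores are log10 probabilities, that the group probability for a context is the summed probability of group words already seen after that context, and that this mass is shared among missing group words in proportion to their weights. None of these choices is stated in the abstract, and the name `add_group_ngrams` is hypothetical. The sketch does not renormalize the existing n-grams.

```python
import math


def add_group_ngrams(ngrams, group_weights):
    """Return a copy of `ngrams` with new n-grams for unseen group words.

    `ngrams` maps word tuples to log10 probabilities (assumed score format).
    `group_weights` maps each word in the group to a positive weight.
    """
    total_weight = sum(group_weights.values())

    # Probability mass of the group after each context (preceding words).
    group_mass = {}
    for ngram, score in ngrams.items():
        if len(ngram) > 1 and ngram[-1] in group_weights:
            context = ngram[:-1]
            group_mass[context] = group_mass.get(context, 0.0) + 10 ** score

    expanded = dict(ngrams)
    for context, mass in group_mass.items():
        for word, weight in group_weights.items():
            candidate = context + (word,)
            if candidate not in expanded:
                # Assumed scoring rule: share the group mass by relative weight.
                expanded[candidate] = math.log10(mass * weight / total_weight)
    return expanded


lm = {
    ("fly", "to", "paris"): math.log10(0.1),
    ("fly", "to", "work"): math.log10(0.3),
}
cities = {"paris": 1.0, "tokyo": 1.0}
expanded_lm = add_group_ngrams(lm, cities)
```

Here "fly to tokyo" is added with probability 0.05 because "tokyo" belongs to the same group as "paris", even though it never followed "fly to" in the original model.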

Type: Grant

Filed: August 22, 2022

Date of Patent: February 25, 2025

Inventors: Jacob Assa, Alan Bekker, Zach Moshe

EMOTION-BASED TEXT TO SPEECH

Publication number: 20250029595

Abstract: Systems and methods are provided for providing emotion-based text to speech. The systems and methods perform operations comprising accessing a text string; storing a plurality of embeddings associated with a plurality of speakers, a first embedding for a first speaker being associated with a first emotion and a second embedding for a second speaker of the plurality of speakers being associated with a second emotion; selecting the first speaker to speak one or more words of the text string; determining that the one or more words are associated with the second emotion; generating, based on the first embedding and the second embedding, a third embedding for the first speaker associated with the second emotion; and applying the third embedding and the text string to a vocoder to generate an audio stream comprising the one or more words being spoken by the first speaker with the second emotion.

Type: Application

Filed: October 4, 2024

Publication date: January 23, 2025

Inventors: Liron Harazi, Jacob Assa, Alan Bekker

Emotion-based text to speech

Patent number: 12142257

Abstract: Systems and methods are provided for providing emotion-based text to speech. The systems and methods perform operations comprising accessing a text string; storing a plurality of embeddings associated with a plurality of speakers, a first embedding for a first speaker being associated with a first emotion and a second embedding for a second speaker of the plurality of speakers being associated with a second emotion; selecting the first speaker to speak one or more words of the text string; determining that the one or more words are associated with the second emotion; generating, based on the first embedding and the second embedding, a third embedding for the first speaker associated with the second emotion; and applying the third embedding and the text string to a vocoder to generate an audio stream comprising the one or more words being spoken by the first speaker with the second emotion.

Type: Grant

Filed: February 8, 2022

Date of Patent: November 12, 2024

Assignee: Snap Inc.

Inventors: Liron Harazi, Jacob Assa, Alan Bekker
