Patents by Inventor Gal Chechik

Gal Chechik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

TEXT-TO-IMAGE PRODUCT PLACEMENT

Publication number: 20260120362

Abstract: Text-to-image transformers configured in one aspect to associate an input text token with a specific object and apply latent blending with attention to a combination of keys and values for the input text token and a background image upon which to add the object, and in another aspect to perform latent blending with attention to keys and values for the object to add, keys and values for the background, and keys and values for a text prompt.

Type: Application

Filed: October 21, 2025

Publication date: April 30, 2026

Applicant: NVIDIA Corp.

Inventors: Yoad Tewel, Gal Chechik

Text-to-image diffusion model with component locking and rank-one editing

Patent number: 12614316

Abstract: A text-to-image machine learning model takes a user input text and generates an image matching the given description. While text-to-image models currently exist, there is a desire to personalize these models on a per-user basis, including to configure the models to generate images of specific, unique user-provided concepts (via images of specific objects or styles) while allowing the user to use free text “prompts” to modify their appearance or compose them in new roles and novel scenes. Current personalization solutions either generate images with only coarse-grained resemblance to the provided concept(s) or require fine-tuning of the entire model, which is costly and can adversely affect the model.

Type: Grant

Filed: October 31, 2023

Date of Patent: April 28, 2026

Assignee: NVIDIA CORPORATION

Inventors: Yuval Atzmon, Yoad Tewel, Rinon Gal, Gal Chechik

CONTENT-BASED VIDEO COMPRESSION USING REINFORCEMENT LEARNING FOR VIDEO RATE CONTROL

Publication number: 20260101046

Abstract: A method for performing content-based video compression using reinforcement learning (RL) for video rate control for a downstream task is provided. The method includes processing frame information associated with a raw frame from a video using an RL agent to generate quantization parameter (QP) information that indicates one or more values associated with a compression level of the raw frame and encoding the raw frame into a bitstream based on the QP information. The method further includes reconstructing the raw frame using the bitstream to obtain a reconstructed frame and processing the raw frame as well as the reconstructed frame using a pre-trained downstream model to generate two outputs. The method then includes determining a downstream task reward based on the two outputs and training the RL agent based on the downstream task reward.
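
As a rough illustration of the downstream task reward described in this abstract, the following Python sketch compares a downstream model's output on a raw frame with its output on the reconstructed frame. Everything in it is an assumption made for illustration: `encode_decode` is a toy uniform quantizer standing in for a real encoder and decoder, `toy_downstream_model` is a hypothetical stand-in for the pre-trained downstream model, and the negative mean squared difference is only one possible way to turn the two outputs into a reward. None of these names or choices come from the application.

```python
from typing import Callable, List, Sequence


def encode_decode(frame: Sequence[float], qp: int) -> List[float]:
    """Toy stand-in for encoding and reconstruction (assumption).

    Quantizes each sample to a multiple of `qp`, so a larger QP
    means coarser reconstruction.
    """
    step = max(1, qp)
    return [round(x / step) * step for x in frame]


def toy_downstream_model(frame: Sequence[float]) -> List[float]:
    """Hypothetical downstream model: returns the mean and max sample."""
    return [sum(frame) / len(frame), max(frame)]


def downstream_reward(raw_out: Sequence[float], rec_out: Sequence[float]) -> float:
    """Negative mean squared difference between the two model outputs."""
    diffs = [(a - b) ** 2 for a, b in zip(raw_out, rec_out)]
    return -sum(diffs) / len(diffs)


def reward_for_qp(
    frame: Sequence[float],
    qp: int,
    model: Callable[[Sequence[float]], List[float]] = toy_downstream_model,
) -> float:
    """Reconstruct the frame at the given QP and score the downstream outputs."""
    reconstructed = encode_decode(frame, qp)
    return downstream_reward(model(frame), model(reconstructed))


if __name__ == "__main__":
    frame = [10, 23, 37, 52]
    for qp in (1, 5, 20):
        print(qp, reward_for_qp(frame, qp))
```

In a training loop, a value like the one returned by `reward_for_qp` would be fed back to the RL agent that proposed the QP; the agent and its update rule are not sketched here.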

Type: Application

Filed: April 29, 2025

Publication date: April 9, 2026

Inventors: Assaf Joseph Hallak, Uri Haim Gadot, Assaf Shoher, Dotan Levi, Eshed Ram, Dror Porat, Eyal Frishman, Shie Mannor, Gal Chechik

CONTENT-BASED VIDEO COMPRESSION USING REINFORCEMENT LEARNING FOR VIDEO RATE CONTROL

Publication number: 20260101045

Abstract: A method for performing content-based video compression using reinforcement learning (RL) is provided. The method includes obtaining frame information associated with a frame from a video. The frame information comprises quantization parameter (QP) information associated with the frame, and the QP information indicates an initial compression level for encoding aspects of the frame. The frame information and additional information are processed by an RL agent to generate a generated QP map indicating a plurality of updated values associated with a plurality of macro-blocks (MBs) of the frame. A bitstream is generated comprising a plurality of bits for the frame based on the generated QP map. Specifically, the plurality of updated values from the generated QP map indicates an amount of allocated bits from the bitstream to allocate for each of the plurality of MBs. The bitstream is provided to a downstream model.
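
To make the relationship between a per-macro-block QP map and bit allocation more concrete, here is a minimal Python sketch. It is not the method claimed in the application. It assumes an H.264-style quantizer, where the step size doubles for every increase of 6 in QP, and it further assumes, as a simplification, that the bits spent on a macro-block are inversely proportional to that step size. The function name `allocate_bits` and the frame budget are hypothetical.

```python
from typing import List


def allocate_bits(qp_map: List[List[int]], frame_budget: float) -> List[List[float]]:
    """Split a frame's bit budget across macro-blocks based on a QP map.

    Assumption: each block's share is proportional to 2 ** (-qp / 6),
    so a block with a QP that is 6 lower receives twice the bits.
    """
    weights = [[2.0 ** (-qp / 6.0) for qp in row] for row in qp_map]
    total = sum(sum(row) for row in weights)
    return [[frame_budget * w / total for w in row] for row in weights]


if __name__ == "__main__":
    # A 2x2 grid of macro-blocks; the top-left block has the lowest QP.
    qp_map = [[24, 30], [30, 36]]
    for row in allocate_bits(qp_map, frame_budget=9000):
        print(row)
```

Under these assumptions the lowest-QP block receives the largest share of the budget, which is the qualitative behavior the abstract describes for the generated QP map.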

Type: Application

Filed: April 29, 2025

Publication date: April 9, 2026

Inventors: Assaf Joseph Hallak, Uri Haim Gadot, Assaf Shoher, Dotan Levi, Eshed Ram, Dror Porat, Eyal Frishman, Shie Mannor, Gal Chechik

Scene-aware speech recognition using vision-language models

Patent number: 12573403

Abstract: A system to generate a latent space model of a scene or video and apply this latent space and candidate sentences formed from digital audio to a vision-language matching model to enhance the accuracy of speech-to-text conversion. A latent space embedding of the scene is generated in which similar features are represented in the space closer to one another. An embedding for the digital audio is also generated. The vision-language matching model utilizes the latent space embedding to enhance the accuracy of transcribing/interpreting the embedding of the digital audio.

Type: Grant

Filed: June 22, 2023

Date of Patent: March 10, 2026

Assignee: NVIDIA Corp.

Inventors: Gal Chechik, Shie Mannor

Method for fast and better tree search for reinforcement learning

Patent number: 12566801

Abstract: A method for performing a Tree-Search (TS) on an environment is provided. The method comprises generating a tree for a current state of the environment based on a TS policy, determining a corrected TS policy, and determining an action to apply to the environment based on the corrected TS policy. The tree comprises a plurality of nodes including a root node among the plurality of nodes corresponding to the current state of the environment. Each node other than the root node among the plurality of nodes corresponds to an estimated future state of the environment. The plurality of nodes in the tree are connected by a plurality of edges. Each edge among the plurality of edges is associated with an action causing a transition from a first state to a different state of the environment.
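
The tree structure in this abstract (a root node for the current state, child nodes for estimated future states, and edges labeled with actions) can be sketched in Python as follows. This is a generic depth-limited exhaustive search over a toy deterministic environment, shown only to illustrate the data structure. It does not implement the corrected TS policy, which the abstract does not detail. The names `Node`, `build_tree`, and `best_root_action`, and the toy environment, are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Sequence


@dataclass
class Node:
    state: int
    # Each edge is keyed by the action that leads to the child state.
    children: Dict[int, "Node"] = field(default_factory=dict)
    # Estimated return of taking each action from this node.
    q: Dict[int, float] = field(default_factory=dict)

    @property
    def value(self) -> float:
        """Best estimated return from this node (0.0 at a leaf)."""
        return max(self.q.values()) if self.q else 0.0


def build_tree(
    state: int,
    depth: int,
    actions: Sequence[int],
    step: Callable[[int, int], int],
    reward: Callable[[int], float],
) -> Node:
    """Expand every action up to `depth` and back up the returns."""
    node = Node(state)
    if depth > 0:
        for a in actions:
            child = build_tree(step(state, a), depth - 1, actions, step, reward)
            node.children[a] = child
            node.q[a] = reward(child.state) + child.value
    return node


def best_root_action(root: Node) -> int:
    """Pick the root action with the highest backed-up return."""
    return max(root.q, key=root.q.get)


if __name__ == "__main__":
    # Toy environment (assumption): walk on a line, reward for reaching state 3.
    root = build_tree(
        state=1,
        depth=2,
        actions=(-1, 1),
        step=lambda s, a: s + a,
        reward=lambda s: 1.0 if s == 3 else 0.0,
    )
    print(best_root_action(root), root.q)
```

A practical tree search would replace the exhaustive expansion with a learned or heuristic TS policy; the sketch keeps only the node, edge, and backup structure.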

Type: Grant

Filed: May 25, 2022

Date of Patent: March 3, 2026

Assignee: NVIDIA Corporation

Inventors: Shie Mannor, Assaf Joseph Hallak, Gal Dalal, Steven Tarence Dalton, Iuri Frosio, Gal Chechik

SCENE-AWARE SPEECH RECOGNITION USING VISION-LANGUAGE MODELS

Publication number: 20260057889

Abstract: A system to generate a latent space model of a scene or video and apply this latent space and candidate sentences formed from digital audio to a vision-language matching model to enhance the accuracy of speech-to-text conversion. A latent space embedding of the scene is generated in which similar features are represented in the space closer to one another. An embedding for the digital audio is also generated. The vision-language matching model utilizes the latent space embedding to enhance the accuracy of transcribing/interpreting the embedding of the digital audio.

Type: Application

Filed: October 31, 2025

Publication date: February 26, 2026

Applicant: NVIDIA Corp.

Inventors: Gal Chechik, Shie Mannor

Feedback based content generation in graphical interfaces

Patent number: 12554385

Abstract: Apparatuses, systems, and techniques to identify one or more modifications to objects within an environment. In at least one embodiment, objects are identified in an image, based on extracted feedback information, using one or more machine learning models, for example, using direct and/or implicit feedback of user interaction with one or more objects in an environment.

Type: Grant

Filed: August 9, 2023

Date of Patent: February 17, 2026

Assignee: NVIDIA CORPORATION

Inventors: Shie Mannor, Gal Chechik

Learning directable virtual agents through conditional adversarial latent models

Patent number: 12505599

Abstract: A conditional adversarial latent model (CALM) process can be used to generate reference motions from a set of original reference movements to create a library of new movements for an agent. The agent can be a virtual representation of various types of characters, animals, or objects. The CALM process can receive a set of reference movements and a requested movement. An encoder can be used to map the requested movement onto a latent space. A low-level policy can be employed to produce a series of latent space joint movements for the agent. A conditional discriminator can be used to provide feedback to the low-level policy to produce stationary distributions over the states of the agent. A high-level policy can be employed to provide a macro movement control over the low-level policy movements, such as providing direction in the environment. The high-level policy can utilize a reward or a finite-state machine function.

Type: Grant

Filed: August 3, 2023

Date of Patent: December 23, 2025

Assignee: NVIDIA Corporation

Inventors: Chen Tessler, Gal Chechik, Yoni Kasten, Shie Mannor, Jason Peng

DIFFUSION MODEL FOR OBJECT DRAGGING IN IMAGES

Publication number: 20250363690

Abstract: Seamlessly moving, or dragging, an object from one location in an image to another location in the image is, in practice, a challenge, especially for current generative image editing methods. Current methods that tackle this problem rely on time-consuming Low-Rank Adaptation (LoRA) training per image, training a designated model on a large dataset, or utilizing classifier-free guidance (CFG) with specific objectives. However, these methods are not robust and struggle to operate reliably in a real-world setting because they lack spatial reasoning. The present disclosure provides a diffusion model that can harness spatial understanding when relocating an object in an image, thereby resulting in a more seamless result (e.g., fewer visual artifacts).

Type: Application

Filed: February 27, 2025

Publication date: November 27, 2025

Inventors: Omri Avrahami, Weili Nie, Rinon Gal, Arash Vahdat, Gal Chechik

TECHNIQUES FOR UNIFIED PHYSICS-BASED CHARACTER CONTROL THROUGH MASKED MOTION INPAINTING

Publication number: 20250356186

Abstract: One embodiment of a method for animating characters includes receiving one or more goals specified in one or more modalities, generating, via a trained machine learning model and based on the one or more goals, a first action for a character to perform, where the trained machine learning model is trained to process inputs in multiple modalities, and causing the character to perform the first action within a computer-based or physical environment.

Type: Application

Filed: December 16, 2024

Publication date: November 20, 2025

Inventors: Chen TESSLER, Gal CHECHIK, Ofir NABATI, Jason PENG

TECHNIQUES FOR UNIFIED PHYSICS-BASED CHARACTER CONTROL THROUGH MASKED MOTION INPAINTING

Publication number: 20250356565

Abstract: One embodiment of a method for animating characters includes receiving one or more goals specified in one or more modalities, generating, via a trained machine learning model and based on the one or more goals, a first action for a character to perform, where the trained machine learning model is trained to process inputs in multiple modalities, and causing the character to perform the first action within a computer-based or physical environment.

Type: Application

Filed: December 16, 2024

Publication date: November 20, 2025

Inventors: Chen TESSLER, Gal CHECHIK, Ofir NABATI, Jason PENG

ADVANCED PROTECTION FROM LLM-POISONING

Publication number: 20250342389

Abstract: Systems and methods herein are for determining a poisoning in a machine learning (ML) model, which may be a pre-trained ML model that is subject to finetuning by a third-party. The system and method herein obtain first observations associated with the pre-trained ML model and may determine a distribution or classification of the first observations with respect to second observations obtained during the finetuning of the pre-trained ML model at different periods. Further, the determining of the poisoned ML model may be based in part on the distribution or classification being different than a predetermined threshold or being outside a predetermined threshold range.

Type: Application

Filed: May 2, 2024

Publication date: November 6, 2025

Inventors: Nir Rosen, Vadim Gechman, Shie Mannor, Gal Chechik

SKETCH-TO-3D OBJECT CREATION

Publication number: 20250316009

Abstract: Text-to-image generation generally refers to the process of generating an image from one or more text prompts input by a user and, in some cases, also a user-provided sample image. Existing text-to-image generation processes are configured to only generate content from text and usually non-original sample images (e.g., obtained from the Internet). This limits the customization options available to the user. The present disclosure provides a sketch-to-3D content generation process which allows users to generate 3D content from a given 3D human-generated, or free-form, sketch, which enables greater customization of computer-generated 3D content.

Type: Application

Filed: April 8, 2024

Publication date: October 9, 2025

Inventors: Chen Tessler, Yoni Kasten, Gal Chechik

TRAINING-FREE CONSISTENT TEXT-TO-VIDEO GENERATION

Publication number: 20250252614

Abstract: Embodiments of the present disclosure relate to training-free consistent text-to-image generation. A pre-trained text-to-image diffusion model is leveraged to generate images depicting a consistent subject for diverse prompts describing scenes. Inputs to the model are a text description of at least one subject with prompts (scene text descriptions) describing scenes, where each prompt is associated with a different generated image and the text description is used for all images that depict the subject. Internal activations (intermediate data) computed by the model during generation of the different images are shared for generation of the different images. A subject-driven shared attention block and correspondence-based feature injection are incorporated into the model to promote subject consistency within each image and/or between images. Additionally, layout diversity is encouraged while maintaining subject consistency.

Type: Application

Filed: April 30, 2024

Publication date: August 7, 2025

Inventors: Yoad Tewel, Rinon Pery Gal, Yehonatan Kasten, Yuval Atzmon, Gal Chechik

TRAINING-FREE CONSISTENT TEXT-TO-IMAGE GENERATION

Publication number: 20250252613

Abstract: Embodiments of the present disclosure relate to training-free consistent text-to-image generation. A pre-trained text-to-image diffusion model is leveraged to generate images depicting a consistent subject for diverse prompts describing scenes. Inputs to the model are a text description of at least one subject with prompts (scene text descriptions) describing scenes, where each prompt is associated with a different generated image and the text description is used for all images that depict the subject. Internal activations (intermediate data) computed by the model during generation of the different images are shared for generation of the different images. A subject-driven shared attention block and correspondence-based feature injection are incorporated into the model to promote subject consistency within each image and/or between images. Additionally, layout diversity is encouraged while maintaining subject consistency.

Type: Application

Filed: April 30, 2024

Publication date: August 7, 2025

Inventors: Yoad Tewel, Rinon Pery Gal, Yehonatan Kasten, Yuval Atzmon, Gal Chechik

Relevance-based image selection

Patent number: 12373490

Abstract: A system, computer readable storage medium, and computer-implemented method presents video search results responsive to a user keyword query. The video hosting system uses a machine learning process to learn a feature-keyword model associating features of media content from a labeled training dataset with keywords descriptive of their content. The system uses the learned model to provide video search results relevant to a keyword query based on features found in the videos. Furthermore, the system determines and presents one or more thumbnail images representative of the video using the learned model.

Type: Grant

Filed: May 22, 2023

Date of Patent: July 29, 2025

Assignee: GOOGLE LLC

Inventors: Gal Chechik, Samy Bengio

TECHNIQUES FOR CHARACTER MOTION PLANNING

Publication number: 20250238988

Abstract: One embodiment of a method for controlling a character includes receiving a state of the character, a path to follow, and first information about a scene, generating, via a trained machine learning model and based on the state of the character, the path, and the first information, a first action for the character to perform, wherein the first action comprises a first type of motion included in a plurality of types of motions for which the trained machine learning model is trained to generate actions, and causing the character to perform the first action.

Type: Application

Filed: July 24, 2024

Publication date: July 24, 2025

Inventors: Chen TESSLER, Assaf HALLAK, Gal DALAL, Gal CHECHIK, Shie MANNOR

TECHNIQUES FOR CHARACTER MOTION PLANNING

Publication number: 20250238989

Abstract: One embodiment of a method for controlling a character includes receiving a state of the character, a path to follow, and first information about a scene, generating, via a trained machine learning model and based on the state of the character, the path, and the first information, a first action for the character to perform, wherein the first action comprises a first type of motion included in a plurality of types of motions for which the trained machine learning model is trained to generate actions, and causing the character to perform the first action.

Type: Application

Filed: July 24, 2024

Publication date: July 24, 2025

Inventors: Chen TESSLER, Assaf HALLAK, Gal DALAL, Gal CHECHIK, Shie MANNOR

Content-aware, machine-learning-based rate control

Patent number: 12335486

Abstract: A system includes a processing device to receive video content, metadata related to the video content, and a target bit rate for encoding the video content. The processing device further detects a content type of the video content based on the metadata. The system further includes encoding hardware to perform frame encoding on the video content and a controller coupled between the processing device and the encoding hardware. The controller is programmed with machine instructions to generate first QP values on a per-frame basis using a frame machine learning model with a first plurality of weights. The first plurality of weights depends at least in part on the content type and the target bit rate. The controller further provides the first QP values to the encoding hardware for rate control of the frame encoding.

Type: Grant

Filed: January 12, 2023

Date of Patent: June 17, 2025

Assignee: Mellanox Technologies, Ltd.

Inventors: Eshed Ram, Dotan David Levi, Assaf Hallak, Shie Mannor, Gal Chechik, Eyal Frishman, Ohad Markus, Dror Porat, Assaf Weissman
