Patents by Inventor Gal Chechik

Gal Chechik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20260120362
    Abstract: Text-to-image transformers configured in one aspect to associate an input text token with a specific object and apply latent blending with attention to a combination of keys and values for the input text token and a background image upon which to add the object, and in another aspect to perform latent blending with attention to keys and values for the object to add, keys and values for the background, and keys and values for a text prompt.
    Type: Application
    Filed: October 21, 2025
    Publication date: April 30, 2026
    Applicant: NVIDIA Corp.
    Inventors: Yoad Tewel, Gal Chechik
  • Patent number: 12614316
    Abstract: A text-to-image machine learning model takes a user input text and generates an image matching the given description. While text-to-image models currently exist, there is a desire to personalize these models on a per-user basis, including to configure the models to generate images of specific, unique user-provided concepts (via images of specific objects or styles) while allowing the user to use free text “prompts” to modify their appearance or compose them in new roles and novel scenes. Current personalization solutions either generate images with only coarse-grained resemblance to the provided concept(s) or require fine tuning of the entire model which is costly and can adversely affect the model.
    Type: Grant
    Filed: October 31, 2023
    Date of Patent: April 28, 2026
    Assignee: NVIDIA CORPORATION
    Inventors: Yuval Atzmon, Yoad Tewel, Rinon Gal, Gal Chechik
  • Publication number: 20260101046
    Abstract: A method for performing content-based video compression using reinforcement learning (RL) for video rate control for a downstream task is provided. The method includes processing frame information associated with a raw frame from a video using an RL agent to generate quantization parameter (QP) information that indicates one or more values associated with a compression level of the raw frame and encoding the raw frame into a bitstream based on the QP information. The method further includes reconstructing the raw frame using the bitstream to obtain a reconstructed frame and processing the raw frame as well as the reconstructed frame using a pre-trained downstream model to generate two outputs. The method then includes determining a downstream task reward based on the two outputs and training the RL agent based on the downstream task reward.
    Type: Application
    Filed: April 29, 2025
    Publication date: April 9, 2026
    Inventors: Assaf Joseph Hallak, Uri Haim Gadot, Assaf Shoher, Dotan Levi, Eshed Ram, Dror Porat, Eyal Frishman, Shie Mannor, Gal Chechik
  • Publication number: 20260101045
    Abstract: A method for performing content-based video compression using reinforcement learning (RL) is provided. The method includes obtaining frame information associated with a frame from a video. The frame information comprises quantization parameter (QP) information associated with the frame, and the QP information indicates an initial compression level for encoding aspects of the frame. The frame information and additional information are processed by an RL agent to generate a generated QP map indicating a plurality of updated values associated with a plurality of macro-blocks (MBs) of the frame. A bitstream is generated comprising a plurality of bits for the frame based on the generated QP map. Specifically, the plurality of updated values from the generated QP map indicates an amount of allocated bits from the bitstream to allocate for each of the plurality of MBs. The bitstream is provided to a downstream model.
    Type: Application
    Filed: April 29, 2025
    Publication date: April 9, 2026
    Inventors: Assaf Joseph Hallak, Uri Haim Gadot, Assaf Shoher, Dotan Levi, Eshed Ram, Dror Porat, Eyal Frishman, Shie Mannor, Gal Chechik
  • Patent number: 12573403
    Abstract: A system to generate a latent space model of a scene or video and apply this latent space and candidate sentences formed from digital audio to a vision-language matching model to enhance the accuracy of speech-to-text conversion. A latent space embedding of the scene is generated in which similar features are represented in the space closer to one another. An embedding for the digital audio is also generated. The vision-language matching model utilizes the latent space embedding to enhance the accuracy of transcribing/interpreting the embedding of the digital audio.
    Type: Grant
    Filed: June 22, 2023
    Date of Patent: March 10, 2026
    Assignee: NVIDIA Corp.
    Inventors: Gal Chechik, Shie Mannor
  • Patent number: 12566801
    Abstract: A method for performing a Tree-Search (TS) on an environment is provided. The method comprises generating a tree for a current state of the environment based on a TS policy, determining a corrected TS policy, and determining an action to apply to the environment based on the corrected TS policy. The tree comprises a plurality of nodes including a root node among the plurality of nodes corresponding to the current state of the environment. Each node other than the root node among the plurality of nodes corresponds to an estimated future state of the environment. The plurality of nodes in the tree are connected by a plurality of edges. Each edge among the plurality of edges is associated with an action causing a transition from a first state to a different state of the environment.
    Type: Grant
    Filed: May 25, 2022
    Date of Patent: March 3, 2026
    Assignee: NVIDIA Corporation
    Inventors: Shie Mannor, Assaf Joseph Hallak, Gal Dalal, Steven Tarence Dalton, Iuri Frosio, Gal Chechik
  • Publication number: 20260057889
    Abstract: A system to generate a latent space model of a scene or video and apply this latent space and candidate sentences formed from digital audio to a vision-language matching model to enhance the accuracy of speech-to-text conversion. A latent space embedding of the scene is generated in which similar features are represented in the space closer to one another. An embedding for the digital audio is also generated. The vision-language matching model utilizes the latent space embedding to enhance the accuracy of transcribing/interpreting the embedding of the digital audio.
    Type: Application
    Filed: October 31, 2025
    Publication date: February 26, 2026
    Applicant: NVIDIA Corp.
    Inventors: Gal Chechik, Shie Mannor
  • Patent number: 12554385
    Abstract: Apparatuses, systems, and techniques to identify one or more modifications to objects within an environment. In at least one embodiment, objects are identified in an image, based on extracted feedback information, using one or more machine learning models, for example, using direct and/or implicit feedback of user interaction with one or more objects in an environment.
    Type: Grant
    Filed: August 9, 2023
    Date of Patent: February 17, 2026
    Assignee: NVIDIA CORPORATION
    Inventors: Shie Mannor, Gal Chechik
  • Patent number: 12505599
    Abstract: A conditional adversarial latent model (CALM) process can be used to generate reference motions from a set of original reference movements to create a library of new movements for an agent. The agent can be a virtual representation of various types of characters, animals, or objects. The CALM process can receive a set of reference movements and a requested movement. An encoder can be used to map the requested movement onto a latent space. A low-level policy can be employed to produce a series of latent space joint movements for the agent. A conditional discriminator can be used to provide feedback to the low-level policy to produce stationary distributions over the states of the agent. A high-level policy can be employed to provide a macro movement control over the low-level policy movements, such as providing direction in the environment. The high-level policy can utilize a reward or a finite-state machine function.
    Type: Grant
    Filed: August 3, 2023
    Date of Patent: December 23, 2025
    Assignee: NVIDIA Corporation
    Inventors: Chen Tessler, Gal Chechik, Yoni Kasten, Shie Mannor, Jason Peng
  • Publication number: 20250363690
    Abstract: Seamlessly moving, or dragging, an object from one location in an image to another location in the image is, in practice, a challenge, especially for current generative image editing methods. Current methods that tackle this problem rely on time-consuming Low-Rank Adaptation (LoRA) training per image, training a designated model on a large dataset, or utilizing classifier-free guidance (CFG) with specific objectives. However, these methods are not robust and struggle to operate reliably in a real-world setting because they lack spatial reasoning. The present disclosure provides a diffusion model that can harness spatial understanding when relocating an object in an image, thereby producing a more seamless result (e.g., fewer visual artifacts).
    Type: Application
    Filed: February 27, 2025
    Publication date: November 27, 2025
    Inventors: Omri Avrahami, Weili Nie, Rinon Gal, Arash Vahdat, Gal Chechik
  • Publication number: 20250356186
    Abstract: One embodiment of a method for animating characters includes receiving one or more goals specified in one or more modalities, generating, via a trained machine learning model and based on the one or more goals, a first action for a character to perform, where the trained machine learning model is trained to process inputs in multiple modalities, and causing the character to perform the first action within a computer-based or physical environment.
    Type: Application
    Filed: December 16, 2024
    Publication date: November 20, 2025
    Inventors: Chen TESSLER, Gal CHECHIK, Ofir NABATI, Jason PENG
  • Publication number: 20250356565
    Abstract: One embodiment of a method for animating characters includes receiving one or more goals specified in one or more modalities, generating, via a trained machine learning model and based on the one or more goals, a first action for a character to perform, where the trained machine learning model is trained to process inputs in multiple modalities, and causing the character to perform the first action within a computer-based or physical environment.
    Type: Application
    Filed: December 16, 2024
    Publication date: November 20, 2025
    Inventors: Chen TESSLER, Gal CHECHIK, Ofir NABATI, Jason PENG
  • Publication number: 20250342389
    Abstract: Systems and methods herein are for determining a poisoning in a machine learning (ML) model, which may be a pre-trained ML model that is subject to finetuning by a third-party. The system and method herein obtain first observations associated with the pre-trained ML model and may determine a distribution or classification of the first observations with respect to second observations obtained during the finetuning of the pre-trained ML model at different periods. Further, the determining of the poisoned ML model may be based in part on the distribution or classification being different than a predetermined threshold or being outside a predetermined threshold range.
    Type: Application
    Filed: May 2, 2024
    Publication date: November 6, 2025
    Inventors: Nir Rosen, Vadim Gechman, Shie Mannor, Gal Chechik
  • Publication number: 20250316009
    Abstract: Text-to-image generation generally refers to the process of generating an image from one or more text prompts input by a user and in some cases also a user provided sample image. Existing text-to-image generation processes are configured to only generate content from text and usually non-original sample images (e.g. obtained from the Internet). This limits the customization options available to the user. The present disclosure provides a sketch-to-3D content generation process which allows users to generate 3D content from a given 3D human generated, or free-form, sketch, which enables greater customization of computer generated 3D content.
    Type: Application
    Filed: April 8, 2024
    Publication date: October 9, 2025
    Inventors: Chen Tessler, Yoni Kasten, Gal Chechik
  • Publication number: 20250252614
    Abstract: Embodiments of the present disclosure relate to training-free consistent text-to-image generation. A pre-trained text-to-image diffusion model is leveraged to generate images depicting a consistent subject for diverse prompts describing scenes. Inputs to the model are a text description of at least one subject with prompts (scene text descriptions) describing scenes, where each prompt is associated with a different generated image and the text description is used for all images that depict the subject. Internal activations (intermediate data) computed by the model during generation of the different images are shared for generation of the different images. A subject-driven shared attention block and correspondence-based feature injection are incorporated into the model to promote subject consistency within each image and/or between images. Additionally, layout diversity is encouraged while maintaining subject consistency.
    Type: Application
    Filed: April 30, 2024
    Publication date: August 7, 2025
    Inventors: Yoad Tewel, Rinon Pery Gal, Yehonatan Kasten, Yuval Atzmon, Gal Chechik
  • Publication number: 20250252613
    Abstract: Embodiments of the present disclosure relate to training-free consistent text-to-image generation. A pre-trained text-to-image diffusion model is leveraged to generate images depicting a consistent subject for diverse prompts describing scenes. Inputs to the model are a text description of at least one subject with prompts (scene text descriptions) describing scenes, where each prompt is associated with a different generated image and the text description is used for all images that depict the subject. Internal activations (intermediate data) computed by the model during generation of the different images are shared for generation of the different images. A subject-driven shared attention block and correspondence-based feature injection are incorporated into the model to promote subject consistency within each image and/or between images. Additionally, layout diversity is encouraged while maintaining subject consistency.
    Type: Application
    Filed: April 30, 2024
    Publication date: August 7, 2025
    Inventors: Yoad Tewel, Rinon Pery Gal, Yehonatan Kasten, Yuval Atzmon, Gal Chechik
  • Patent number: 12373490
    Abstract: A system, computer readable storage medium, and computer-implemented method presents video search results responsive to a user keyword query. The video hosting system uses a machine learning process to learn a feature-keyword model associating features of media content from a labeled training dataset with keywords descriptive of their content. The system uses the learned model to provide video search results relevant to a keyword query based on features found in the videos. Furthermore, the system determines and presents one or more thumbnail images representative of the video using the learned model.
    Type: Grant
    Filed: May 22, 2023
    Date of Patent: July 29, 2025
    Assignee: GOOGLE LLC
    Inventors: Gal Chechik, Samy Bengio
  • Publication number: 20250238988
    Abstract: One embodiment of a method for controlling a character includes receiving a state of the character, a path to follow, and first information about a scene, generating, via a trained machine learning model and based on the state of the character, the path, and the first information, a first action for the character to perform, wherein the first action comprises a first type of motion included in a plurality of types of motions for which the trained machine learning model is trained to generate actions, and causing the character to perform the first action.
    Type: Application
    Filed: July 24, 2024
    Publication date: July 24, 2025
    Inventors: Chen TESSLER, Assaf HALLAK, Gal DALAL, Gal CHECHIK, Shie MANNOR
  • Publication number: 20250238989
    Abstract: One embodiment of a method for controlling a character includes receiving a state of the character, a path to follow, and first information about a scene, generating, via a trained machine learning model and based on the state of the character, the path, and the first information, a first action for the character to perform, wherein the first action comprises a first type of motion included in a plurality of types of motions for which the trained machine learning model is trained to generate actions, and causing the character to perform the first action.
    Type: Application
    Filed: July 24, 2024
    Publication date: July 24, 2025
    Inventors: Chen TESSLER, Assaf HALLAK, Gal DALAL, Gal CHECHIK, Shie MANNOR
  • Patent number: 12335486
    Abstract: A system includes a processing device to receive video content, metadata related to the video content, and a target bit rate for encoding the video content. The processing device further detects a content type of the video content based on the metadata. The system also includes encoding hardware to perform frame encoding on the video content. The system further includes a controller coupled between the processing device and the encoding hardware. The controller is programmed with machine instructions to generate first QP values on a per-frame basis using a frame machine learning model with a first plurality of weights. The first plurality of weights depends at least in part on the content type and the target bit rate. The controller further provides the first QP values to the encoding hardware for rate control of the frame encoding.
    Type: Grant
    Filed: January 12, 2023
    Date of Patent: June 17, 2025
    Assignee: Mellanox Technologies, Ltd.
    Inventors: Eshed Ram, Dotan David Levi, Assaf Hallak, Shie Mannor, Gal Chechik, Eyal Frishman, Ohad Markus, Dror Porat, Assaf Weissman