Patents by Inventor Adam Polyak

Adam Polyak has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20260127792
    Abstract: Methods and systems are provided to edit or update images or videos based on instructions. A system may analyze an input image and may determine an instruction associated with the input image. The instruction may include content to edit or update the input image. The system may select an edit task, among predetermined edit tasks associated with changes to images, based on a description of the content of the instruction. The system may generate an output image, based on implementing the selected edit task, including an update to the input image depicting the description of the content of the instruction.
    Type: Application
    Filed: November 3, 2025
    Publication date: May 7, 2026
    Inventors: Adam Polyak, Yuval Kirstain, Yaniv Nechemia Taigman, Shelly Sheynin, Uriel Singer, Amit Zohar, Devi Niru Parikh
  • Publication number: 20260100204
    Abstract: A method to generate synchronized audio for a video includes receiving the video including a sequence of frames and receiving a text input describing at least one of a scene, an event, or a mood to be reflected in an audio track. The method also includes generating a latent audio representation via an audio generation model conditioned jointly on video embeddings associated with the sequence of frames and text embeddings associated with the text input. The method also includes decoding the latent audio representation to produce an audio track temporally aligned with the video and semantically consistent with the text input.
    Type: Application
    Filed: October 2, 2025
    Publication date: April 9, 2026
    Inventors: Zecheng He, Samaneh Azadi, Bowen Shi, Apoorv Vyas, Ann Lee, Ishan Satish Misra, Peizhao Zhang, Roshan Rajesh Sumbaly, Yaniv Nechemia Taigman, Peter Vajda, Yi-Chiao Wu, Andros Tjandra, Wei-Ning Hsu, Amit Zohar, Animesh Sinha, Yuval Kirstain, Shelly Sheynin, Adam Polyak, Matthew Le, Juefei Xu, Haoyu Ma, Tingbo Hou
  • Publication number: 20260100203
    Abstract: A method to edit a video includes receiving an input video including a sequence of frames and receiving an editing instruction expressed in natural language. The method also includes generating a multimodal condition based on the textual editing instruction and the input video. The multimodal condition may include an embedding of the input video concatenated with an embedding of the textual editing instruction. The method also includes applying, via a video editing model, the multimodal condition to modify visual content of the input video. The method further includes generating an edited video including visual modifications corresponding to the textual editing instruction. The edited video preserves temporal coherence and overall visual fidelity of the input video.
    Type: Application
    Filed: October 2, 2025
    Publication date: April 9, 2026
    Inventors: Zecheng He, Samaneh Azadi, Bowen Shi, Apoorv Vyas, Ann Lee, Ishan Satish Misra, Peizhao Zhang, Roshan Rajesh Sumbaly, Yaniv Nechemia Taigman, Peter Vajda, Yi-Chiao Wu, Andros Tjandra, Wei-Ning Hsu, Amit Zohar, Animesh Sinha, Yuval Kirstain, Shelly Sheynin, Adam Polyak, Matthew Le, Juefei Xu, Haoyu Ma, Tingbo Hou
  • Publication number: 20260101081
    Abstract: A system and method to generate a video are provided. The method may include generating, based on a user input including a description of a desired video, a structured script including one or more of scene descriptions, dialogue, or explicit shot-level information. The method also includes generating, based on the structured script, a sequence of video frames representing one or more scenes. The method further includes generating, based on the structured script and the sequence of video frames, an audio track including one or more of ambient sounds, sound effects, or music. The generated audio track is temporally synchronized with the sequence of video frames. The method also includes combining the sequence of video frames with the audio track to generate a synchronized video output representing the desired video.
    Type: Application
    Filed: October 2, 2025
    Publication date: April 9, 2026
    Inventors: Zecheng He, Samaneh Azadi, Bowen Shi, Apoorv Vyas, Ann Lee, Ishan Satish Misra, Peizhao Zhang, Roshan Rajesh Sumbaly, Yaniv Nechemia Taigman, Peter Vajda, Yi-Chiao Wu, Andros Tjandra, Wei-Ning Hsu, Amit Zohar, Animesh Sinha, Yuval Kirstain, Shelly Sheynin, Adam Polyak, Matthew Le, Juefei Xu, Haoyu Ma, Tingbo Hou
  • Publication number: 20260099978
    Abstract: A method to generate a video includes receiving a text prompt describing a scene. The method also includes receiving a reference image depicting a character. The method further includes generating, via an encoder, embeddings of identity features of the reference image. The method also includes generating, via a video generation model, the video in which the character appears with consistent likeness across multiple frames in accordance with the embeddings and the text prompt.
    Type: Application
    Filed: October 2, 2025
    Publication date: April 9, 2026
    Inventors: Zecheng He, Samaneh Azadi, Bowen Shi, Apoorv Vyas, Ann Lee, Ishan Satish Misra, Peizhao Zhang, Roshan Rajesh Sumbaly, Yaniv Nechemia Taigman, Peter Vajda, Yi-Chiao Wu, Andros Tjandra, Wei-Ning Hsu, Amit Zohar, Animesh Sinha, Yuval Kirstain, Shelly Sheynin, Adam Polyak, Matthew Le, Juefei Xu, Haoyu Ma, Tingbo Hou
  • Publication number: 20260006286
    Abstract: Various systems, methods, and devices are described for utilizing an artificial intelligence (AI) bot (e.g., a chatbot) to fetch or create content associated with a third-party platform based on an input associated with an electronic device. In an example, systems and methods of AI bot fetching or creating content may include receiving an input, via a user device. The input may be textual, audible, or provided by any other suitable method. Based on the input, one or more content items may be fetched or created. A machine learning model may be utilized to determine context associated with the input. The machine learning model may determine a number of content items associated with the input and data sources related to the retrieval generators. A result may be presented to a user, where the result may comprise the one or more content items determined.
    Type: Application
    Filed: May 20, 2025
    Publication date: January 1, 2026
    Inventors: Hong Yan, Adam Polyak, Yaniv Nechemia Taigman, Devi Niru Parikh, Rakesh Ranjan, Hao Jiang, Shelly Sheynin, Uriel Singer, Yuval Kirstain, Jingqing Huang, Amit Zohar
  • Patent number: 12387388
    Abstract: In one embodiment, a method includes accessing a text input and a scene input corresponding to the text input, wherein the scene input comprises semantic segmentations, generating text tokens for the text input and scene tokens for the scene input by machine-learning models, generating predicted image tokens based on the text tokens and the scene tokens by the machine-learning models, and generating an image corresponding to the text input and the scene input based on the predicted image tokens by the machine-learning models.
    Type: Grant
    Filed: January 3, 2023
    Date of Patent: August 12, 2025
    Assignee: Meta Platforms, Inc.
    Inventors: Oran Gafni, Adam Polyak, Yaniv Nechemia Taigman
  • Publication number: 20240221235
    Abstract: In one embodiment, a method includes accessing a text input and a scene input corresponding to the text input, wherein the scene input comprises semantic segmentations, generating text tokens for the text input and scene tokens for the scene input by machine-learning models, generating predicted image tokens based on the text tokens and the scene tokens by the machine-learning models, and generating an image corresponding to the text input and the scene input based on the predicted image tokens by the machine-learning models.
    Type: Application
    Filed: January 3, 2023
    Publication date: July 4, 2024
    Inventors: Oran Gafni, Adam Polyak, Yaniv Nechemia Taigman
  • Publication number: 20240155071
    Abstract: A method and system for text-to-video generation. The method includes receiving a text input, generating a representation frame based on the text input using a model trained on text-image pairs, generating a set of frames based on the representation frame and a first frame rate, interpolating the set of frames to a higher frame rate, generating a first video based on the interpolated set of frames, increasing a resolution of the first video based on a first and second super-resolution model, and generating an output video based on a result of the super-resolution models.
    Type: Application
    Filed: September 29, 2023
    Publication date: May 9, 2024
    Inventors: Sonal Gupta, Adam Polyak, Thomas Falstad Hayes, Xi Yin, Jie An, Chao Yang, Oron Ashual, Oran Gafni, Devi Niru Parikh, Yaniv Nechemia Taigman, Uriel Singer, Songyang Zhang, Qiyuan Hu
  • Publication number: 20240112687
    Abstract: Methods, systems, and storage media for generating audio data include receiving a text input. The method also includes receiving a plurality of representative audio sources and encoding the plurality of representative audio sources into a plurality of audio tokens. The method includes encoding the text input into a plurality of text representations. The method comprises mapping each audio token of the plurality of audio tokens to a text representation of the plurality of text representations. The method also comprises determining a relationship score based on mapping each audio token to the text representation, wherein the relationship score identifies a distribution of audio tokens from the plurality of audio tokens. The method and systems can also comprise decoding the subgroup of audio tokens to yield a reconstructed audio source.
    Type: Application
    Filed: September 29, 2023
    Publication date: April 4, 2024
    Inventors: Yaniv Nechemia Taigman, Felix Kruk, Yossef Mordechay Adi, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Devi Niru Parikh, Alexandre Défossez, Jade Copet
  • Patent number: 11430424
    Abstract: Disclosed herein are a system, a method, and a device for generating a voice model for a user. A device can include an encoder and a decoder to generate a voice model for converting text to an audio output that resembles a voice of the person sending respective text. The encoder can include a neural network and can receive a plurality of audio samples from a user. The encoder can generate a sequence of values and provide the sequence of values to the decoder. The decoder can establish, using the sequence of values and one or more speaker embeddings of the user, a voice model corresponding to the plurality of audio samples of the user.
    Type: Grant
    Filed: November 13, 2019
    Date of Patent: August 30, 2022
    Assignee: Meta Platforms Technologies, LLC
    Inventors: Lior Wolf, David Vazquez, Tali Zvi, Yaniv Nechemia Taigman, Adam Polyak, Hyunbin Park
  • Publication number: 20210142782
    Abstract: Disclosed herein are a system, a method, and a device for generating a voice model for a user. A device can include an encoder and a decoder to generate a voice model for converting text to an audio output that resembles a voice of the person sending respective text. The encoder can include a neural network and can receive a plurality of audio samples from a user. The encoder can generate a sequence of values and provide the sequence of values to the decoder. The decoder can establish, using the sequence of values and one or more speaker embeddings of the user, a voice model corresponding to the plurality of audio samples of the user.
    Type: Application
    Filed: November 13, 2019
    Publication date: May 13, 2021
    Inventors: Lior Wolf, David Vazquez, Tali Zvi, Yaniv Nechemia Taigman, Adam Polyak, Hyunbin Park