Patents by Inventor Adam Polyak

Adam Polyak has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240221235
    Abstract: In one embodiment, a method includes accessing a text input and a scene input corresponding to the text input, wherein the scene input comprises semantic segmentations, generating text tokens for the text input and scene tokens for the scene input by machine-learning models, generating predicted image tokens based on the text tokens and the scene tokens by the machine-learning models, and generating an image corresponding to the text input and the scene input based on the predicted image tokens by the machine-learning models.
    Type: Application
    Filed: January 3, 2023
    Publication date: July 4, 2024
    Inventors: Oran Gafni, Adam Polyak, Yaniv Nechemia Taigman
  • Publication number: 20240155071
    Abstract: A method and system for text-to-video generation. The method includes receiving a text input, generating a representation frame based on the text input using a model trained on text-image pairs, generating a set of frames based on the representation frame and a first frame rate, interpolating the set of frames to a higher frame rate, generating a first video based on the interpolated set of frames, increasing a resolution of the first video based on a first and second super-resolution model, and generating an output video based on a result of the super-resolution models.
    Type: Application
    Filed: September 29, 2023
    Publication date: May 9, 2024
    Inventors: Sonal Gupta, Adam Polyak, Thomas Falstad Hayes, Xi Yin, Jie An, Chao Yang, Oron Ashual, Oran Gafni, Devi Niru Parikh, Yaniv Nechemia Taigman, Uriel Singer, Songyang Zhang, Qiyuan Hu
  • Publication number: 20240112687
    Abstract: Methods, systems, and storage media for generating audio data include receiving a text input. The method also includes receiving a plurality of representative audio sources and encoding the plurality of representative audio sources into a plurality of audio tokens. The method includes encoding the text input into a plurality of text representations. The method comprises mapping each audio token of the plurality of audio tokens to a text representation of the plurality of text representations. The method also comprises determining a relationship score based on mapping each audio token to the text representation, wherein the relationship score identifies a distribution of audio tokens from the plurality of audio tokens. The method and systems can also comprise decoding the subgroup of audio tokens to yield a reconstructed audio source.
    Type: Application
    Filed: September 29, 2023
    Publication date: April 4, 2024
    Inventors: Yaniv Nechemia Taigman, Felix Kruk, Yossef Mordechay Adi, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Devi Niru Parikh, Alexandre Défossez, Jade Copet
  • Patent number: 11430424
    Abstract: Disclosed herein are a system, a method and a device for generating a voice model for a user. A device can include an encoder and a decoder to generate a voice model for converting text to an audio output that resembles a voice of the person sending respective text. The encoder can include a neural network and can receive a plurality of audio samples from a user. The encoder can generate a sequence of values and provide the sequence of values to the decoder. The decoder can establish, using the sequence of values and one or more speaker embeddings of the user, a voice model corresponding to the plurality of audio samples of the user.
    Type: Grant
    Filed: November 13, 2019
    Date of Patent: August 30, 2022
    Assignee: Meta Platforms Technologies, LLC
    Inventors: Lior Wolf, David Vazquez, Tali Zvi, Yaniv Nechemia Taigman, Adam Polyak, Hyunbin Park
  • Publication number: 20210142782
    Abstract: Disclosed herein are a system, a method and a device for generating a voice model for a user. A device can include an encoder and a decoder to generate a voice model for converting text to an audio output that resembles a voice of the person sending respective text. The encoder can include a neural network and can receive a plurality of audio samples from a user. The encoder can generate a sequence of values and provide the sequence of values to the decoder. The decoder can establish, using the sequence of values and one or more speaker embeddings of the user, a voice model corresponding to the plurality of audio samples of the user.
    Type: Application
    Filed: November 13, 2019
    Publication date: May 13, 2021
    Inventors: Lior Wolf, David Vazquez, Tali Zvi, Yaniv Nechemia Taigman, Adam Polyak, Hyunbin Park