Patents by Inventor Paloma de Juan

Paloma de Juan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240038271
    Abstract: One or more computing devices, systems, and/or methods for generating a video in a target language are provided. In an example, a first video, in which a first speaker speaks in a first language, is identified. A translated transcript in a second language is determined. The translated transcript is indicative of a translation of speech spoken by the first speaker in the first video. Based upon the translated transcript and a speaker profile associated with a second speaker, first audio, including an auditory representation of the translated transcript being spoken in a voice of the second speaker, is generated. Based upon the first video and the first audio, a second video, in which mouth movements of the first speaker are aligned with speech of the auditory representation of the first audio, is generated.
    Type: Application
    Filed: July 29, 2022
    Publication date: February 1, 2024
    Inventors: Paloma de Juan, Alex J. Shaw, Eric M. Dodds, Benjamin J. Culpepper, Kapil Raj Thadani, Lakshmi V. Kesiraju, Praveen Mareedu, Sanika Shirwadkar, Xingyue Zhou, Yueh-Ning Ku
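
The abstract above outlines a multi-stage dubbing pipeline: transcribe, translate, synthesize speech in a target voice, then re-align mouth movements. Below is a minimal Python sketch of how such a pipeline might be composed; every function and data structure here (transcribe, translate, synthesize_speech, align_mouth_movements, SpeakerProfile) is a hypothetical stand-in, not an API named in the patent.

```python
from dataclasses import dataclass

# Hypothetical container; the patent does not specify this structure.
@dataclass
class SpeakerProfile:
    name: str
    voice_embedding: list  # assumed voice characteristics for synthesis

def transcribe(video_path: str, language: str) -> str:
    """Stub: extract a transcript of the first speaker's speech."""
    return "hello world"  # placeholder

def translate(transcript: str, target_language: str) -> str:
    """Stub: translate the transcript into the second language."""
    return "hola mundo"  # placeholder

def synthesize_speech(text: str, profile: SpeakerProfile) -> bytes:
    """Stub: render the translated text in the second speaker's voice."""
    return b"..."  # placeholder audio

def align_mouth_movements(video_path: str, audio: bytes) -> str:
    """Stub: re-render the video so lip movements match the new audio."""
    return "dubbed_video.mp4"  # placeholder output path

def dub_video(video_path: str, source_lang: str, target_lang: str,
              profile: SpeakerProfile) -> str:
    transcript = transcribe(video_path, source_lang)
    translated = translate(transcript, target_lang)
    audio = synthesize_speech(translated, profile)
    return align_mouth_movements(video_path, audio)

if __name__ == "__main__":
    print(dub_video("talk.mp4", "en", "es", SpeakerProfile("narrator", [0.1, 0.2])))
```
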
  • Patent number: 11886488
    Abstract: The present teaching relates to methods, systems, and programming for responding to an image-related query. Information related to each of a plurality of images is received, wherein the information represents concepts co-existing in the image. Visual semantics for each of the plurality of images are created based on the information related thereto. Representations of scenes of the plurality of images are obtained via machine learning, based on the visual semantics of the plurality of images, wherein the representations capture concepts associated with the scenes.
    Type: Grant
    Filed: October 21, 2022
    Date of Patent: January 30, 2024
    Assignee: YAHOO ASSETS LLC
    Inventors: Paloma de Juan, Aasish Pappu
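
One way to picture the claimed flow from concept annotations ("visual semantics") to learned scene representations is the toy sketch below; the example images, concepts, and the use of a truncated SVD as the machine-learning step are illustrative assumptions, not details from the patent.

```python
import numpy as np

# Toy concept annotations: the concepts said to "co-exist" in each image.
images = {
    "img1": ["beach", "sunset", "people"],
    "img2": ["beach", "surfboard", "ocean"],
    "img3": ["city", "night", "people"],
}

vocab = sorted({c for concepts in images.values() for c in concepts})
index = {c: i for i, c in enumerate(vocab)}

# Visual semantics as binary concept vectors (one row per image).
M = np.zeros((len(images), len(vocab)))
for row, concepts in enumerate(images.values()):
    for c in concepts:
        M[row, index[c]] = 1.0

# The patent says only that representations are "obtained via machine
# learning" -- a rank-2 SVD projection is one simple stand-in for that step.
U, S, Vt = np.linalg.svd(M, full_matrices=False)
scene_repr = U[:, :2] * S[:2]  # 2-d scene representation per image

for name, vec in zip(images, scene_repr):
    print(name, np.round(vec, 3))
```
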
  • Publication number: 20230260001
    Abstract: One or more systems and/or methods for product similarity detection and recommendation are provided. Users may view articles and/or other content that includes images depicting products that may be of interest to the users. These images are processed using image processing functionality such as computer vision to identify the products depicted by the images. A vector embedding model is used to generate product vector representations of the products. Catalog items that are available from a catalog to supplement the articles and other content may be processed to generate catalog item vector representations. When content (an article) with an image depicting a product is to be displayed to the user, similarity between a product vector representation of the product and the catalog item vector representations is determined in order to identify and display catalog items depicting products that are similar to the product depicted by the image in the content.
    Type: Application
    Filed: February 14, 2022
    Publication date: August 17, 2023
    Inventors: Paloma de Juan, Ritesh Kumar Shyam Sund Agrawal, Sricharanya Venkataramani, Eric McVoy Dodds, Gaurav Batra, Simao Herdade
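
The core matching step described above, comparing a product's vector representation against catalog item vectors, can be illustrated with a short cosine-similarity sketch. The vectors and catalog entries below are made up, since the abstract does not specify the embedding model or its dimensionality.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical vector representations; in the described system these would
# come from a vector embedding model applied to detected products.
product_vec = np.array([0.9, 0.1, 0.3])
catalog = {
    "red sneakers":   np.array([0.8, 0.2, 0.4]),
    "blue backpack":  np.array([0.1, 0.9, 0.2]),
    "white sneakers": np.array([0.85, 0.15, 0.35]),
}

# Rank catalog items by similarity to the product seen in the article image.
ranked = sorted(catalog.items(),
                key=lambda kv: cosine_similarity(product_vec, kv[1]),
                reverse=True)
for name, vec in ranked:
    print(f"{name}: {cosine_similarity(product_vec, vec):.3f}")
```
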
  • Publication number: 20230206614
    Abstract: Disclosed frameworks for generating an image including a salient object and a staged background include extracting a salient object from a source image and applying a generative model to the salient object to generate the image. According to some embodiments, extracting a salient object from a source image involves using a salient object detection method to identify the relevant portions of the source image corresponding to the salient object. In some embodiments, the generative model is a generative adversarial network trained using a domain-relevant dataset.
    Type: Application
    Filed: December 28, 2021
    Publication date: June 29, 2023
    Inventors: Yueh-Ning Ku, Mikhail Kuznetsov, Shaunak Mishra, Paloma de Juan
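
A minimal sketch of the extract-then-generate flow described above follows; both functions are placeholder stand-ins (the patent points to a salient object detection method and a GAN, neither of which is implemented here, so the "staging" step is faked with a flat backdrop).

```python
import numpy as np

def detect_salient_object(image: np.ndarray) -> np.ndarray:
    """Stub: return a binary mask marking the salient object.
    A real system would run a salient object detection model here."""
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[8:24, 8:24] = True  # pretend the object occupies this region
    return mask

def generate_staged_background(obj: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Stub for the generative step; the patent mentions a GAN trained on a
    domain-relevant dataset. Here we just composite onto a flat backdrop."""
    out = np.full_like(obj, 200)  # plain "staged" background
    out[mask] = obj[mask]         # paste the extracted object over it
    return out

source = np.random.randint(0, 255, (32, 32, 3), dtype=np.uint8)
mask = detect_salient_object(source)
staged = generate_staged_background(source, mask)
print(staged.shape, int(mask.sum()), "object pixels composited")
```
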
  • Publication number: 20230041472
    Abstract: The present teaching relates to methods, systems, and programming for responding to an image-related query. Information related to each of a plurality of images is received, wherein the information represents concepts co-existing in the image. Visual semantics for each of the plurality of images are created based on the information related thereto. Representations of scenes of the plurality of images are obtained via machine learning, based on the visual semantics of the plurality of images, wherein the representations capture concepts associated with the scenes.
    Type: Application
    Filed: October 21, 2022
    Publication date: February 9, 2023
    Inventors: Paloma de Juan, Aasish Pappu
  • Patent number: 11481575
    Abstract: The present teaching relates to methods, systems, and programming for responding to an image-related query. Information related to each of a plurality of images is received, wherein the information represents concepts co-existing in the image. Visual semantics for each of the plurality of images are created based on the information related thereto. Representations of scenes of the plurality of images are obtained via machine learning, based on the visual semantics of the plurality of images, wherein the representations capture concepts associated with the scenes.
    Type: Grant
    Filed: September 26, 2018
    Date of Patent: October 25, 2022
    Assignee: YAHOO ASSETS LLC
    Inventors: Paloma de Juan, Aasish Pappu
  • Patent number: 10965999
    Abstract: Multimodal multilabel tagging of video content may include labeling the video content with topical tags that are identified based on extracted features from two or more modalities of the video content. The two or more modalities may include (i) a video modality for the objects, images, and/or visual elements of the video content, (ii) a text modality for the speech, dialog, and/or text of the video content, and/or (iii) an audio modality for non-speech sounds and/or sound characteristics of the video content. Combinational multimodal multilabel tagging may include combining two or more features from the same or different modalities in order to increase the contextual understanding of the features and generate contextually relevant tags. Video content may be labeled with global tags relating to overall topics of the video content, and different sets of local tags relating to topics at different segments of the video content.
    Type: Grant
    Filed: March 2, 2020
    Date of Patent: March 30, 2021
    Assignee: Oath Inc.
    Inventors: Aasish Pappu, Akshay Soni, Paloma de Juan
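
To make the fusion idea concrete, here is a toy Python sketch of multimodal multilabel tagging; the per-modality extractors, the averaging fusion rule, and the threshold are all illustrative assumptions rather than details from the patent.

```python
# Hypothetical per-modality feature extractors returning tag scores.

def video_features(segment):   # visual modality: objects / visual elements
    return {"sports": 0.7, "news": 0.1}

def text_features(segment):    # text modality: speech, dialog, on-screen text
    return {"sports": 0.6, "politics": 0.4}

def audio_features(segment):   # audio modality: non-speech sounds
    return {"sports": 0.5, "music": 0.3}

def combine(*score_dicts):
    """Fuse scores across modalities by averaging where tags overlap."""
    tags = {}
    for scores in score_dicts:
        for tag, s in scores.items():
            tags.setdefault(tag, []).append(s)
    return {t: sum(v) / len(v) for t, v in tags.items()}

def multilabel_tags(segment, threshold=0.4):
    fused = combine(video_features(segment),
                    text_features(segment),
                    audio_features(segment))
    return [t for t, s in fused.items() if s >= threshold]

# Local tags per segment; global tags could aggregate over all segments.
segments = ["0:00-1:00", "1:00-2:00"]
local = {seg: multilabel_tags(seg) for seg in segments}
print(local)
```

Averaging is only one possible fusion rule; a real system could just as plausibly learn the combination jointly across modalities.
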
  • Publication number: 20200204879
    Abstract: Multimodal multilabel tagging of video content may include labeling the video content with topical tags that are identified based on extracted features from two or more modalities of the video content. The two or more modalities may include (i) a video modality for the objects, images, and/or visual elements of the video content, (ii) a text modality for the speech, dialog, and/or text of the video content, and/or (iii) an audio modality for non-speech sounds and/or sound characteristics of the video content. Combinational multimodal multilabel tagging may include combining two or more features from the same or different modalities in order to increase the contextual understanding of the features and generate contextually relevant tags. Video content may be labeled with global tags relating to overall topics of the video content, and different sets of local tags relating to topics at different segments of the video content.
    Type: Application
    Filed: March 2, 2020
    Publication date: June 25, 2020
    Applicant: Oath Inc.
    Inventors: Aasish Pappu, Akshay Soni, Paloma de Juan
  • Patent number: 10623829
    Abstract: Multimodal multilabel tagging of video content may include labeling the video content with topical tags that are identified based on extracted features from two or more modalities of the video content. The two or more modalities may include (i) a video modality for the objects, images, and/or visual elements of the video content, (ii) a text modality for the speech, dialog, and/or text of the video content, and/or (iii) an audio modality for non-speech sounds and/or sound characteristics of the video content. Combinational multimodal multilabel tagging may include combining two or more features from the same or different modalities in order to increase the contextual understanding of the features and generate contextually relevant tags. Video content may be labeled with global tags relating to overall topics of the video content, and different sets of local tags relating to topics at different segments of the video content.
    Type: Grant
    Filed: September 7, 2018
    Date of Patent: April 14, 2020
    Assignee: Oath Inc.
    Inventors: Aasish Pappu, Akshay Soni, Paloma de Juan
  • Publication number: 20200097764
    Abstract: The present teaching relates to methods, systems, and programming for responding to an image-related query. Information related to each of a plurality of images is received, wherein the information represents concepts co-existing in the image. Visual semantics for each of the plurality of images are created based on the information related thereto. Representations of scenes of the plurality of images are obtained via machine learning, based on the visual semantics of the plurality of images, wherein the representations capture concepts associated with the scenes.
    Type: Application
    Filed: September 26, 2018
    Publication date: March 26, 2020
    Inventors: Paloma de Juan, Aasish Pappu
  • Publication number: 20200084519
    Abstract: Multimodal multilabel tagging of video content may include labeling the video content with topical tags that are identified based on extracted features from two or more modalities of the video content. The two or more modalities may include (i) a video modality for the objects, images, and/or visual elements of the video content, (ii) a text modality for the speech, dialog, and/or text of the video content, and/or (iii) an audio modality for non-speech sounds and/or sound characteristics of the video content. Combinational multimodal multilabel tagging may include combining two or more features from the same or different modalities in order to increase the contextual understanding of the features and generate contextually relevant tags. Video content may be labeled with global tags relating to overall topics of the video content, and different sets of local tags relating to topics at different segments of the video content.
    Type: Application
    Filed: September 7, 2018
    Publication date: March 12, 2020
    Applicant: Oath Inc.
    Inventors: Aasish Pappu, Akshay Soni, Paloma de Juan
  • Patent number: 10560742
    Abstract: A method is provided that initiates with providing a video over a network to a plurality of client devices, wherein each client device is configured to render the video and track movements of a pointer during the rendering of the video. Movement data that is indicative of the tracked movements of the pointer is received over the network from each client device. The movement data from the plurality of client devices is processed to determine aggregate pointer movement versus elapsed time of the video. The aggregate pointer movement is analyzed to identify a region of interest of the video. A preview of the video is generated based on the identified region of interest.
    Type: Grant
    Filed: January 28, 2016
    Date of Patent: February 11, 2020
    Assignee: Oath Inc.
    Inventors: Paloma de Juan, Yale Song, Gloria Zen
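
The aggregation and region-of-interest steps lend themselves to a short sketch. The movement data below, the peak-activity heuristic, and the fixed preview window are illustrative assumptions; the patent leaves the analysis method open.

```python
from collections import defaultdict

# Hypothetical per-client movement data: (elapsed_second, movement_amount).
client_movements = [
    [(0, 1.0), (1, 0.5), (12, 4.0), (13, 5.0)],
    [(1, 0.2), (12, 3.5), (13, 4.5), (14, 2.0)],
    [(0, 0.8), (13, 6.0)],
]

# Aggregate pointer movement versus elapsed time across all clients.
aggregate = defaultdict(float)
for movements in client_movements:
    for t, amount in movements:
        aggregate[t] += amount

# Identify the region of interest as the peak of aggregate movement,
# then take a fixed window around it as the span to preview.
peak = max(aggregate, key=aggregate.get)
preview_span = (max(0, peak - 2), peak + 2)
print(f"peak activity at t={peak}s, preview window {preview_span}")
```
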
  • Patent number: 10032081
    Abstract: Methods and systems for classifying a video include analyzing an image captured in each frame of the video file to identify one or more elements. Each element identified in the image of each frame is matched to a corresponding term defined in a vocabulary list. The number of frames within the video file in which each element corresponding to a term in the vocabulary list appears is determined. A vector is generated for the video file identifying each term in the vocabulary list. The vector is represented as name-value pairs, with each name corresponding to a term in the vocabulary list and the value corresponding to the number of frames in which the element matching the term appears in the video file.
    Type: Grant
    Filed: February 9, 2016
    Date of Patent: July 24, 2018
    Assignee: Oath Inc.
    Inventor: Paloma De Juan
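
The frame-counting vector described above is straightforward to sketch. The vocabulary, the pre-labeled frames, and the detect_elements stub below are illustrative, since the patent leaves the per-frame recognition step open.

```python
from collections import Counter

vocabulary = ["dog", "ball", "person"]

def detect_elements(frame):
    """Stub: return the elements identified in one frame's image.
    A real system would run an image recognizer here."""
    return frame  # frames below are pre-labeled for illustration

# Illustrative video: each "frame" listed with the elements it contains.
frames = [
    ["dog", "person"],
    ["dog", "ball"],
    ["person"],
    ["dog"],
]

# Count, for each vocabulary term, the number of frames it appears in.
counts = Counter()
for frame in frames:
    for element in set(detect_elements(frame)):
        if element in vocabulary:
            counts[element] += 1

# The video's vector as name-value pairs over the vocabulary.
vector = {term: counts.get(term, 0) for term in vocabulary}
print(vector)  # {'dog': 3, 'ball': 1, 'person': 2}
```
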
  • Publication number: 20170228599
    Abstract: Methods and systems for classifying a video include analyzing an image captured in each frame of the video file to identify one or more elements. Each element identified in the image of each frame is matched to a corresponding term defined in a vocabulary list. The number of frames within the video file in which each element corresponding to a term in the vocabulary list appears is determined. A vector is generated for the video file identifying each term in the vocabulary list. The vector is represented as name-value pairs, with each name corresponding to a term in the vocabulary list and the value corresponding to the number of frames in which the element matching the term appears in the video file.
    Type: Application
    Filed: February 9, 2016
    Publication date: August 10, 2017
    Inventor: Paloma De Juan
  • Publication number: 20170223411
    Abstract: A method is provided that initiates with providing a video over a network to a plurality of client devices, wherein each client device is configured to render the video and track movements of a pointer during the rendering of the video. Movement data that is indicative of the tracked movements of the pointer is received over the network from each client device. The movement data from the plurality of client devices is processed to determine aggregate pointer movement versus elapsed time of the video. The aggregate pointer movement is analyzed to identify a region of interest of the video. A preview of the video is generated based on the identified region of interest.
    Type: Application
    Filed: January 28, 2016
    Publication date: August 3, 2017
    Inventors: Paloma de Juan, Yale Song, Gloria Zen