Patents by Inventor Seunghyun Yoon

Seunghyun Yoon has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Systems and methods for image processing using natural language

Patent number: 12293577

Abstract: Embodiments of the disclosure provide a machine learning model for generating a predicted executable command for an image. The learning model includes an interface configured to obtain an utterance indicating a request associated with the image, an utterance sub-model, a visual sub-model, an attention network, and a selection gate. The machine learning model generates a segment of the predicted executable command from weighted probabilities of each candidate token in a predetermined vocabulary determined based on the visual features, the concept features, current command features, and the utterance features extracted from the utterance or the image.

Type: Grant

Filed: February 18, 2022

Date of Patent: May 6, 2025

Assignee: Adobe Inc.

Inventors: Seunghyun Yoon, Trung Huu Bui, Franck Dernoncourt, Hyounghun Kim, Doo Soon Kim
UTILIZING MACHINE LEARNING MODELS TO GENERATE ASPECT-BASED TRANSCRIPT SUMMARIES

Publication number: 20250077775

Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for generating aspect-based summaries utilizing deep learning. In particular, in one or more embodiments, the disclosed systems access a transcript comprising sentences. The disclosed systems generate, utilizing a sentence classification machine learning model, aspect labels for the sentences of the transcript. The disclosed systems organize the sentences based on the aspect labels. The disclosed systems generate, utilizing a summary machine learning model, a summary of the transcript for each aspect of the plurality of aspects from the organized sentences.

Type: Application

Filed: August 29, 2023

Publication date: March 6, 2025

Inventors: Zhongfen Deng, Seunghyun Yoon, Trung Bui, Quan Tran, Franck Dernoncourt
Generating synthetic code-switched data for training language models

Patent number: 12242820

Abstract: Techniques for training a language model for code switching content are disclosed. Such techniques include, in some embodiments, generating a dataset, which includes identifying one or more portions within textual content in a first language, the identified one or more portions each including one or more of offensive content or non-offensive content; translating the identified one or more salient portions to a second language; and reintegrating the translated one or more portions into the textual content to generate code-switched textual content. In some cases, the textual content in the first language includes offensive content and non-offensive content, the identified one or more portions include the offensive content, and the translated one or more portions include a translated version of the offensive content. In some embodiments, the code-switched textual content is at least part of a synthetic dataset usable to train a language model, such as a multilingual classification model.

Type: Grant

Filed: February 17, 2022

Date of Patent: March 4, 2025

Assignee: Adobe Inc.

Inventors: Cesa Salaam, Seunghyun Yoon, Trung Huu Bui, Franck Dernoncourt
MULTILINGUAL SEMANTIC SEARCH UTILIZING META-DISTILLATION LEARNING

Publication number: 20250068924

Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for providing multilingual semantic search results utilizing meta-learning and knowledge distillation. For example, in some implementations, the disclosed systems perform a first inner learning loop for a monolingual to bilingual meta-learning task for a teacher model. Additionally, in some implementations, the disclosed systems perform a second inner learning loop for a bilingual to multilingual meta-learning task for a student model. In some embodiments, the disclosed systems perform knowledge distillation based on the first inner learning loop for the monolingual to bilingual meta-learning task and the second inner learning loop for the bilingual to multilingual meta-learning task.

Type: Application

Filed: August 14, 2023

Publication date: February 27, 2025

Inventors: Meryem M'hamdi, Seunghyun Yoon, Franck Dernoncourt, Trung Bui
Bi-directional recurrent encoders with multi-hop attention for speech emotion recognition

Patent number: 12236975

Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for determining speech emotion. In particular, a speech emotion recognition system generates an audio feature vector and a textual feature vector for a sequence of words. Further, the speech emotion recognition system utilizes a neural attention mechanism that intelligently blends together the audio feature vector and the textual feature vector to generate attention output. Using the attention output, which includes consideration of both audio and text modalities for speech corresponding to the sequence of words, the speech emotion recognition system can apply attention methods to one of the feature vectors to generate a hidden feature vector. Based on the hidden feature vector, the speech emotion recognition system can generate a speech emotion probability distribution of emotions among a group of candidate emotions, and then select one of the candidate emotions as corresponding to the sequence of words.

Type: Grant

Filed: November 15, 2021

Date of Patent: February 25, 2025

Assignee: Adobe Inc.

Inventors: Trung Bui, Subhadeep Dey, Seunghyun Yoon
Image captioning

Patent number: 12210825

Abstract: Systems and methods for image captioning are described. One or more aspects of the systems and methods include generating a training caption for a training image using an image captioning network; encoding the training caption using a multi-modal encoder to obtain an encoded training caption; encoding the training image using the multi-modal encoder to obtain an encoded training image; computing a reward function based on the encoded training caption and the encoded training image; and updating parameters of the image captioning network based on the reward function.

Type: Grant

Filed: November 18, 2021

Date of Patent: January 28, 2025

Assignee: ADOBE INC.

Inventors: Jaemin Cho, Seunghyun Yoon, Ajinkya Gorakhnath Kale, Trung Huu Bui, Franck Dernoncourt
PERFORMING VIDEO MOMENT RETRIEVAL UTILIZING DEEP LEARNING

Publication number: 20250028758

Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that learns parameters for a natural language video localization model utilizing a curated dataset. In particular, in some embodiments, the disclosed systems generate a set of similarity scores between a target query and a video dataset that includes a plurality of digital videos. For instance, the disclosed systems determines a false-negative threshold by utilizing the set of similarity scores to exclude a subset of false-negative samples from the plurality of digital videos. Further, the disclosed systems determines a negative sample distribution and generates a curated dataset that includes a subset of negative samples with the subset of false-negative samples excluded.

Type: Application

Filed: July 19, 2023

Publication date: January 23, 2025

Inventor: Seunghyun Yoon
IMPROVED TRAINING METHODS FOR LANGUAGE MODELS USING DATA GENERATION AND REINFORCEMENT LEARNING

Publication number: 20250022459

Abstract: The disclosed method generates helpful training data for a language model, for example, a model implementing a punctuation restoration task, for real-world ASR texts. The method uses a reinforcement learning method using a generative AI model to generate additional data to train the language model. The method allows the generative AI model to learn from real-world ASR text to generate more effective training examples based on gradient feedback from the language model.

Type: Application

Filed: July 12, 2023

Publication date: January 16, 2025

Applicant: Adobe Inc.

Inventors: Viet Dac Lai, Trung Bui, Seunghyun Yoon, Quan Tran, Hao Tan, Hanieh Deilamsalehy, Abel Salinas, Franck Dernoncourt
Intent detection

Patent number: 12182524

Abstract: Systems and methods for natural language processing are described. One or more aspects of a method, apparatus, and non-transitory computer readable medium include receiving a text phrase; encoding the text phrase using an encoder to obtain a hidden representation of the text phrase, wherein the encoder is trained during a first training phrase using self-supervised learning based on a first contrastive loss and during a second training phrase using supervised learning based on a second contrastive learning loss; identifying an intent of the text phrase from a predetermined set of intent labels using a classification network, wherein the classification network is jointly trained with the encoder in the second training phase; and generating a response to the text phrase based on the intent.

Type: Grant

Filed: November 4, 2021

Date of Patent: December 31, 2024

Assignee: ADOBE INC.

Inventors: Jianguo Zhang, Trung Huu Bui, Seunghyun Yoon, Xiang Chen, Quan Hung Tran, Walter W. Chang
SEGMENT IDENTIFICATION FROM LONG VIDEOS

Publication number: 20240355119

Abstract: One or more aspects of the method, apparatus, and non-transitory computer readable medium include receiving a query relating to a long video. One or more aspects of the method, apparatus, and non-transitory computer readable medium further include generating a segment of the long video corresponding to the query using a machine learning model trained to identify relevant segments from long videos. One or more aspects of the method, apparatus, and non-transitory computer readable medium further include responding to the query based on the generated segment.

Type: Application

Filed: April 24, 2023

Publication date: October 24, 2024

Inventors: Ioana Croitoru, Trung Huu Bui, Zhaowen Wang, Seunghyun Yoon, Franck Dernoncourt, Hailin Jin
Multimodal intent discovery system

Patent number: 12124508

Abstract: Systems and methods for intent discovery and video summarization are described. Embodiments of the present disclosure receive a video and a transcript of the video, encode the video to obtain a sequence of video encodings, encode the transcript to obtain a sequence of text encodings, apply a visual gate to the sequence of text encodings based on the sequence of video encodings to obtain gated text encodings, and generate an intent label for the transcript based on the gated text encodings.

Type: Grant

Filed: July 12, 2022

Date of Patent: October 22, 2024

Assignee: ADOBE INC.

Inventors: Adyasha Maharana, Quan Hung Tran, Seunghyun Yoon, Franck Dernoncourt, Trung Huu Bui, Walter W. Chang
PERTURBATION ROBUST METRIC FOR EVALUATING IMAGE CAPTIONS

Publication number: 20240304009

Abstract: Embodiments are disclosed for training an image caption evaluation system to perform evaluations of image captions. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a training image, a ground truth image caption for the training image, and a perturbed image caption for the training image, where the perturbed image caption includes modifications to the ground truth image caption. The disclosed systems and methods further comprise generating, by a visual encoder, a visual embedding representation of the training image and generating, by a perturbation-aware text encoder, a first text embedding for the ground truth image caption and a second text embedding for the perturbed image caption. The disclosed systems and methods further comprise computing losses between the visual embedding, the first text embedding, and the second text embedding and training the perturbation-aware text encoder based on the computed losses.

Type: Application

Filed: March 6, 2023

Publication date: September 12, 2024

Applicant: Adobe Inc.

Inventors: Seunghyun YOON, Trung BUI
Using neural networks to detect incongruence between headlines and body text of documents

Patent number: 12038960

Abstract: An incongruent headline detection system receives a request to determine a headline incongruence score for an electronic document. The incongruent headline detection system determines the headline incongruence score for the electronic document by applying a machine learning model to the electronic document. Applying the machine learning model to the electronic document includes generating a graph representing a textual similarity between a headline of the electronic document and each of a plurality of paragraphs of the electronic document and determining the headline incongruence score using the graph. The incongruent headline detection system transmits, responsive to the request, the headline incongruence score for the electronic document.

Type: Grant

Filed: November 17, 2021

Date of Patent: July 16, 2024

Assignee: Adobe Inc.

Inventor: Seunghyun Yoon
MULTIMODAL INTENT DISCOVERY SYSTEM

Publication number: 20240020337

Abstract: Systems and methods for intent discovery and video summarization are described. Embodiments of the present disclosure receive a video and a transcript of the video, encode the video to obtain a sequence of video encodings, encode the transcript to obtain a sequence of text encodings, apply a visual gate to the sequence of text encodings based on the sequence of video encodings to obtain gated text encodings, and generate an intent label for the transcript based on the gated text encodings.

Type: Application

Filed: July 12, 2022

Publication date: January 18, 2024

Inventors: Adyasha Maharana, Quan Hung Tran, Seunghyun Yoon, Franck Dernoncourt, Trung Huu Bui, Walter W. Chang
Multitask Machine-Learning Model Training and Training Data Augmentation

Publication number: 20230419164

Abstract: Multitask machine-learning model training and training data augmentation techniques are described. In one example, training is performed for multiple tasks simultaneously as part of training a multitask machine-learning model using question pairs. Examples of the multiple tasks include question summarization and recognizing question entailment. Further, a loss function is described that incorporates a parameter sharing loss that is configured to adjust an amount that parameters are shared between corresponding layers trained for the first and second tasks, respectively. In an implementation, training data augmentation techniques are also employed by synthesizing question pairs, automatically and without user intervention, to improve accuracy in model training.

Type: Application

Filed: June 22, 2022

Publication date: December 28, 2023

Applicant: Adobe Inc.

Inventors: Khalil Mrini, Franck Dernoncourt, Seunghyun Yoon, Trung Huu Bui, Walter W. Chang, Emilia Farcas, Ndapandula T. Nakashole
VIRTUAL KNOWLEDGE GRAPH CONSTRUCTION FOR ZERO-SHOT DOMAIN-SPECIFIC DOCUMENT RETRIEVAL

Publication number: 20230418868

Abstract: Systems and methods for text processing are described. Embodiments of the present disclosure receive a query comprising a natural language expression; extract a plurality of mentions from the query; generate a relation vector between a pair of the plurality of mentions using a relation encoder network, wherein the relation encoder network is trained using a contrastive learning process where mention pairs from a same document are labeled as positive samples and mention pairs from different documents are labeled as negative samples; combine the plurality of mentions with the relation vector to obtain a virtual knowledge graph of the query; identify a document corresponding to the query by comparing the virtual knowledge graph of the query to a virtual knowledge graph of the document; and transmit a response to the query, wherein the response includes a reference to the document.

Type: Application

Filed: June 24, 2022

Publication date: December 28, 2023

Inventors: Yeon Seonwoo, Seunghyun Yoon, Trung Huu Bui, Franck Dernoncourt, Roger K. Brooks, Mihir Naware
SYSTEMS AND METHODS FOR IMAGE PROCESSING USING NATURAL LANGUAGE

Publication number: 20230267726

Abstract: Embodiments of the disclosure provide a machine learning model for generating a predicted executable command for an image. The learning model includes an interface configured to obtain an utterance indicating a request associated with the image, an utterance sub-model, a visual sub-model, an attention network, and a selection gate. The machine learning model generates a segment of the predicted executable command from weighted probabilities of each candidate token in a predetermined vocabulary determined based on the visual features, the concept features, current command features, and the utterance features extracted from the utterance or the image.

Type: Application

Filed: February 18, 2022

Publication date: August 24, 2023

Inventors: Seunghyun Yoon, Trung Huu Bui, Franck Dernoncourt, Hyounghun Kim, Doo Soon Kim
SYSTEM AND METHODS FOR KEY-PHRASE EXTRACTION

Publication number: 20230259708

Abstract: Systems and methods for key-phrase extraction are described. The systems and methods include receiving a transcript including a text paragraph and generating key-phrase data for the text paragraph using a key-phrase extraction network. The key-phrase extraction network is trained to identify domain-relevant key-phrase data based on domain data obtained using a domain discriminator network. The systems and methods further include generating meta-data for the transcript based on the key-phrase data.

Type: Application

Filed: February 14, 2022

Publication date: August 17, 2023

Inventors: Amir Pouran Ben Veyseh, Franck Dernoncourt, Walter W. Chang, Trung Huu Bui, Hanieh Deilamsalehy, Seunghyun Yoon, Rajiv Bhawanji Jain, Quan Hung Tran, Varun Manjunatha
GENERATING SYNTHETIC CODE-SWITCHED DATA FOR TRAINING LANGUAGE MODELS

Publication number: 20230259718

Abstract: Techniques for training a language model for code switching content are disclosed. Such techniques include, in some embodiments, generating a dataset, which includes identifying one or more portions within textual content in a first language, the identified one or more portions each including one or more of offensive content or non-offensive content; translating the identified one or more salient portions to a second language; and reintegrating the translated one or more portions into the textual content to generate code-switched textual content. In some cases, the textual content in the first language includes offensive content and non-offensive content, the identified one or more portions include the offensive content, and the translated one or more portions include a translated version of the offensive content. In some embodiments, the code-switched textual content is at least part of a synthetic dataset usable to train a language model, such as a multilingual classification model.

Type: Application

Filed: February 17, 2022

Publication date: August 17, 2023

Inventors: Cesa Salaam, Seunghyun Yoon, Trung Huu Bui, Franck Dernoncourt
USING NEURAL NETWORKS TO DETECT INCONGRUENCE BETWEEN HEADLINES AND BODY TEXT OF DOCUMENTS

Publication number: 20230153341

Abstract: An incongruent headline detection system receives a request to determine a headline incongruence score for an electronic document. The incongruent headline detection system determines the headline incongruence score for the electronic document by applying a machine learning model to the electronic document. Applying the machine learning model to the electronic document includes generating a graph representing a textual similarity between a headline of the electronic document and each of a plurality of paragraphs of the electronic document and determining the headline incongruence score using the graph. The incongruent headline detection system transmits, responsive to the request, the headline incongruence score for the electronic document.

Type: Application

Filed: November 17, 2021

Publication date: May 18, 2023

Inventor: Seunghyun Yoon

1 2 next