Patents by Inventor Trung Bui

Trung Bui has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12652444
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for segmenting digital videos into topic chapters. In particular, in some embodiments, the disclosed systems generate, utilizing a text encoder, a text representation for a transcript sentence of a video transcript. In addition, in some embodiments, the disclosed systems generate, utilizing a frame encoder, a set of frame representations for a set of video frames associated with the transcript sentence. Moreover, in some embodiments, the disclosed systems generate, utilizing a cross-modal attention model, a text-aware visual representation from the text representation and the set of frame representations. Furthermore, in some embodiments, the disclosed systems determine a topic-boundary label for the transcript sentence from the text representation and the text-aware visual representation.
    Type: Grant
    Filed: September 10, 2024
    Date of Patent: June 9, 2026
    Assignee: Adobe Inc.
    Inventors: Fabian David Caba Heilbron, Franck Dernoncourt, Linzi Xing, Quan Tran, Seunghyun Yoon, Trung Bui, Zhaowen Wang
  • Patent number: 12645959
    Abstract: This disclosure describes methods, non-transitory computer readable storage media, and systems that provide a platform for on-demand selection of machine-learning models and on-demand learning of parameters for the selected machine-learning models via cloud-based systems. For instance, the disclosed system receives a request indicating a selection of a machine-learning model to perform a machine-learning task (e.g., a natural language task) utilizing a specific dataset (e.g., a user-defined dataset). The disclosed system utilizes a scheduler to monitor available computing devices on cloud-based storage systems for instantiating the selected machine-learning model. Using the indicated dataset at a determined cloud-based computing device, the disclosed system automatically trains the machine-learning model.
    Type: Grant
    Filed: May 26, 2021
    Date of Patent: June 2, 2026
    Assignee: Adobe Inc.
    Inventors: Nham Van Le, Tuan Manh Lai, Trung Bui, Doo Soon Kim
  • Patent number: 12646507
    Abstract: The disclosed method generates helpful training data for a language model, for example, a model implementing a punctuation restoration task, for real-world ASR texts. The method uses a reinforcement learning method using a generative AI model to generate additional data to train the language model. The method allows the generative AI model to learn from real-world ASR text to generate more effective training examples based on gradient feedback from the language model.
    Type: Grant
    Filed: July 12, 2023
    Date of Patent: June 2, 2026
    Assignee: Adobe Inc.
    Inventors: Viet Dac Lai, Trung Bui, Seunghyun Yoon, Quan Tran, Hao Tan, Hanieh Deilamsalehy, Abel Salinas, Franck Dernoncourt
  • Patent number: 12586392
    Abstract: Embodiments are disclosed for training an image caption evaluation system to perform evaluations of image captions. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a training image, a ground truth image caption for the training image, and a perturbed image caption for the training image, where the perturbed image caption includes modifications to the ground truth image caption. The disclosed systems and methods further comprise generating, by a visual encoder, a visual embedding representation of the training image and generating, by a perturbation-aware text encoder, a first text embedding for the ground truth image caption and a second text embedding for the perturbed image caption. The disclosed systems and methods further comprise computing losses between the visual embedding, the first text embedding, and the second text embedding and training the perturbation-aware text encoder based on the computed losses.
    Type: Grant
    Filed: March 6, 2023
    Date of Patent: March 24, 2026
    Assignee: Adobe Inc.
    Inventors: Seunghyun Yoon, Trung Bui
  • Patent number: 12586374
    Abstract: A method includes receiving a video input and a text transcription of the video input. The video input includes a plurality of frames and the text transcription includes a plurality of sentences. The method further includes determining, by a multimodal summarization model, a subset of key frames of the plurality of frames and a subset of key sentences of the plurality of sentences. The method further includes providing a summary of the video input and a summary of the text transcription based on the subset of key frames and the subset of key sentences.
    Type: Grant
    Filed: June 2, 2023
    Date of Patent: March 24, 2026
    Assignee: Adobe Inc.
    Inventors: Zhaowen Wang, Trung Bui, Bo He
  • Patent number: 12580003
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for segmenting digital videos into topic chapters utilizing a sliding window and video segmentation models. Specifically, the disclosed systems utilize a sliding window to divide a digital video into overlapping segments, each segment including a subset of sentences of a transcript of the video and corresponding video frames for a given time window of the digital video. Further, the disclosed systems generate, for each overlapping segment, topic-boundary label predictions for the subset of sentences. Specifically, the disclosed systems generate text representations for the sentences using a text encoder and frame representations for the corresponding video frames using a frame encoder. Moreover, the disclosed systems generate the topic-boundary label predictions based on the text representations and the frame representations.
    Type: Grant
    Filed: November 26, 2024
    Date of Patent: March 17, 2026
    Assignee: Adobe Inc.
    Inventors: Linzi Xing, Franck Dernoncourt, Fabian David Caba Heilbron, Seunghyun Yoon, Zhaowen Wang, Trung Bui
  • Publication number: 20260075295
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for segmenting digital videos into topic chapters. In particular, in some embodiments, the disclosed systems generate, utilizing a text encoder, a text representation for a transcript sentence of a video transcript. In addition, in some embodiments, the disclosed systems generate, utilizing a frame encoder, a set of frame representations for a set of video frames associated with the transcript sentence. Moreover, in some embodiments, the disclosed systems generate, utilizing a cross-modal attention model, a text-aware visual representation from the text representation and the set of frame representations. Furthermore, in some embodiments, the disclosed systems determine a topic-boundary label for the transcript sentence from the text representation and the text-aware visual representation.
    Type: Application
    Filed: September 10, 2024
    Publication date: March 12, 2026
    Inventors: Fabian David Caba Heilbron, Franck Dernoncourt, Linzi Xing, Quan Tran, Seunghyun Yoon, Trung Bui, Zhaowen Wang
  • Patent number: 12547616
    Abstract: Systems and methods for natural language processing are described. One or more embodiments of the present disclosure receive a query related to information in a table, compute an operation selector by combining the query with an operation embedding representing a plurality of table operations, compute a column selector by combining the query with a weighted operation embedding, compute a row selector based on the operation selector and the column selector, compute a probability value for a cell in the table based on the row selector and the column selector, where the probability value represents a probability that the cell provides an answer to the query, and transmit contents of the cell based on the probability value.
    Type: Grant
    Filed: May 11, 2021
    Date of Patent: February 10, 2026
    Assignee: Adobe Inc.
    Inventors: Dung Thai, Doo Soon Kim, Franck Dernoncourt, Trung Bui
  • Patent number: 12547901
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for providing multilingual semantic search results utilizing meta-learning and knowledge distillation. For example, in some implementations, the disclosed systems perform a first inner learning loop for a monolingual to bilingual meta-learning task for a teacher model. Additionally, in some implementations, the disclosed systems perform a second inner learning loop for a bilingual to multilingual meta-learning task for a student model. In some embodiments, the disclosed systems perform knowledge distillation based on the first inner learning loop for the monolingual to bilingual meta-learning task and the second inner learning loop for the bilingual to multilingual meta-learning task.
    Type: Grant
    Filed: August 14, 2023
    Date of Patent: February 10, 2026
    Assignee: Adobe Inc.
    Inventors: Meryem M'Hamdi, Seunghyun Yoon, Franck Dernoncourt, Trung Bui
  • Publication number: 20260017471
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for training a multilingual large language model to embed text into an embedding space of a vision language model comprising a text encoder for a first language and a vision encoder. In particular, in some embodiments, the disclosed systems generate, utilizing the vision encoder, image embeddings for images. Additionally, in some embodiments, the disclosed systems generate, utilizing the multilingual large language model, text embeddings for text in languages other than the first language. Furthermore, in some embodiments, the disclosed systems determine similarity metrics between the image embeddings for the images and the text embeddings for the text. Moreover, in some embodiments, the disclosed systems adjust parameters of the multilingual large language model to reduce an output of a contrastive loss function based on the similarity metrics without adjusting parameters of the vision encoder.
    Type: Application
    Filed: July 12, 2024
    Publication date: January 15, 2026
    Inventors: Handong Zhao, Tracy King, Kushal Kafle, Rohith Reddy Katikireddy, Sanat Sharma, Scott Cohen, Seunghyun Yoon, Trung Bui, Tushar Vatsa, Venkata Naveen Kumar Yadav Marri, Wei-ting Hsu, Hao Tan, Fangzheng Wu, Amine Ben Khalifa, Ajinkya Gorakhnath Kale
  • Patent number: 12518523
    Abstract: A method and a system for performing distance metric learning using proxies are provided. The method for performing distance metric learning assigns a proxy as an anchor to represent a class and associates the proxy with all data points in a training batch. The method allows data points to interact with each other via proxies during training. Additionally, the fine-grained data-to-data relation is actively considered, which is combined with a learnable margin parameter leading to intra-class compactness and inter-class separability.
    Type: Grant
    Filed: October 12, 2023
    Date of Patent: January 6, 2026
    Assignee: VINBRAIN JOINT STOCK COMPANY
    Inventors: Nguyen Phan, Sen Kim Tran, Huy Duc Ta, Soan Thi Minh Duong, Chanh Do Trung Nguyen, Trung Bui, Steven Q. H. Truong
  • Publication number: 20250342628
    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that perform text-to-image editing using executable code generated from natural language text input. For instance, in one or more embodiments, the disclosed systems receive, from a client device, a digital image and natural language text input providing instructions for modifying the digital image. The disclosed systems also generate, using a large language model, executable action code for modifying the digital image in accordance with the instructions of the natural language text input, the executable action code being compatible with an editing application. The disclosed systems further modify the digital image by executing the executable action code via the editing application and provide the modified digital image for display via a graphical user interface of the client device.
    Type: Application
    Filed: May 3, 2024
    Publication date: November 6, 2025
    Inventors: Handong Zhao, Qiucheng Wu, Trung Bui, Seunghyun Yoon, Quan Tran, Jing Shi
  • Patent number: 12451132
    Abstract: A computer-implemented method is disclosed for determining one or more characteristics of a dialog between a computer system and user. The method may comprise receiving a system utterance comprising one or more tokens defining one or more words generated by the computer system; receiving a user utterance comprising one or more tokens defining one or more words uttered by a user in response to the system utterance, the system utterance and the user utterance forming a dialog context; receiving one or more utterance candidates comprising one or more tokens; for each utterance candidate, generating an input sequence combining the one or more tokens of each of the system utterance, the user utterance, and the utterance candidate; and for each utterance candidate, evaluating the generated input sequence with a model to determine a probability that the utterance candidate is relevant to the dialog context.
    Type: Grant
    Filed: February 9, 2023
    Date of Patent: October 21, 2025
    Assignee: Adobe Inc.
    Inventors: Tuan Manh Lai, Trung Bui, Quan Tran
  • Publication number: 20250307606
    Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for generating digital images via a generative neural network with localized constraints. The disclosed system generates, utilizing one or more encoder neural networks, a sequence of embeddings comprising a prompt embedding representing a text prompt and an object text embedding representing a phrase indicating an object in the text prompt. The disclosed system generates, utilizing the one or more encoder neural networks, a visual embedding representing an object image corresponding to the object. The disclosed system determines a modified sequence of embeddings by replacing the object text embedding with the visual embedding in the sequence of embeddings. The disclosed system also generates, utilizing a generative neural network, a synthetic digital image from the modified sequence of embeddings comprising the visual embedding.
    Type: Application
    Filed: March 28, 2024
    Publication date: October 2, 2025
    Inventors: Weixi Feng, Yijun Li, Trung Bui, Tobias Hinz, Scott Cohen, Quan Tran, Jianming Zhang, Handong Zhao, Franck Dernoncourt
  • Publication number: 20250298487
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for modifying a digital design by performing a selective object-level undo operation. In one or more embodiments, the disclosed systems generate a modified object by performing a series of operations on an object depicted within the digital design. In some embodiments, the disclosed systems receive a selective object-level undo operation on the modified object, wherein the request specifies an operation to undo from among the series of operations performed on the object. In one or more embodiments, the disclosed systems modify the modified object by performing the selective object-level undo operation on the modified object to undo the operation from among the series of operations. In some embodiments, the disclosed systems provide an updated digital design depicting the modified object reflecting modifications from the series of operations excluding the operation undone by the selective object-level undo operation.
    Type: Application
    Filed: March 20, 2024
    Publication date: September 25, 2025
    Inventors: Nikita Soni, Trung Bui, Kevin Gary Smith
  • Patent number: 12399890
    Abstract: Systems and methods for natural language processing are described. Embodiments are configured to receive a structured representation of a search query, wherein the structured representation comprises a plurality of nodes and at least one edge connecting two of the nodes, receive a modification expression for the search query, wherein the modification expression comprises a natural language expression, generate a modified structured representation based on the structured representation and the modification expression using a neural network configured to combine structured representation features and natural language expression features, and perform a search based on the modified structured representation.
    Type: Grant
    Filed: November 3, 2020
    Date of Patent: August 26, 2025
    Assignee: Adobe Inc.
    Inventors: Quan Tran, Zhe Lin, Xuanli He, Walter Chang, Trung Bui, Franck Dernoncourt
  • Publication number: 20250252265
    Abstract: The present disclosure is directed toward systems, methods, and non-transitory computer readable media that provide a contextual query answering system that trains and implements a unique machine learning architecture to generate accurate domain-specific contextual responses. For example, the disclosed systems receive a contextual query indicating a software context of a computer application within a software-specific domain. The disclosed systems utilize a context retrieval model to generate query embeddings from the contextual query and data segment embeddings from data segments of stored digital documents. Further, the context retrieval model determines relevant digital documents from among the stored digital documents based on comparing the query embeddings and the data segment embeddings. The disclosed systems provide the relevant digital documents to a response generator model to generate a contextual response within the software-specific domain.
    Type: Application
    Filed: February 5, 2024
    Publication date: August 7, 2025
    Inventors: Varun Kumar Kotte, Trung Bui, Seunghyun Yoon, Sanat Sharma, Franck Dernoncourt, Dewang Sultania
  • Publication number: 20250209278
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for identifying speaker names in transcripts. In particular, in one or more embodiments, the disclosed systems determine, from a set of sentences in a textual transcript of a dialogue, a first sentence spoken by a first speaker and a second sentence spoken by a second speaker. Additionally, in some embodiments, the disclosed systems generate a first feature representation for the first sentence and a second feature representation for the second sentence. Moreover, in some embodiments, the disclosed systems determine a speaker name for at least one of the first sentence or the second sentence by comparing each of the first feature representation and the second feature representation with a name representation for a name spoken in at least one of the first sentence or the second sentence.
    Type: Application
    Filed: December 20, 2023
    Publication date: June 26, 2025
    Inventors: Minh Nguyen, Franck Dernoncourt, Hanieh Deilamsalehy, Hao Tan, Quan Tran, Seunghyun Yoon, Trung Bui
  • Publication number: 20250124698
    Abstract: A method and a system for performing distance metric learning using proxies are provided. The method for performing distance metric learning assigns a proxy as an anchor to represent a class and associates the proxy with all data points in a training batch. The method allows data points to interact with each other via proxies during training. Additionally, the fine-grained data-to-data relation is actively considered, which is combined with a learnable margin parameter leading to intra-class compactness and inter-class separability.
    Type: Application
    Filed: October 12, 2023
    Publication date: April 17, 2025
    Inventors: Nguyen Phan, Sen Kim Tran, Huy Duc Ta, Soan Thi Minh Duong, Chanh Do Trung Nguyen, Trung Bui, Steven Q. H. Truong
  • Publication number: 20250077775
    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods for generating aspect-based summaries utilizing deep learning. In particular, in one or more embodiments, the disclosed systems access a transcript comprising sentences. The disclosed systems generate, utilizing a sentence classification machine learning model, aspect labels for the sentences of the transcript. The disclosed systems organize the sentences based on the aspect labels. The disclosed systems generate, utilizing a summary machine learning model, a summary of the transcript for each aspect of the plurality of aspects from the organized sentences.
    Type: Application
    Filed: August 29, 2023
    Publication date: March 6, 2025
    Inventors: Zhongfen Deng, Seunghyun Yoon, Trung Bui, Quan Tran, Franck Dernoncourt