Patents by Inventor Nikolaos Barmpalios

Nikolaos Barmpalios has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Domain adaptation for machine learning models

Patent number: 11978272

Abstract: Adapting a machine learning model to process data that differs from training data used to configure the model for a specified objective is described. A domain adaptation system trains the model to process new domain data that differs from a training data domain by using the model to generate a feature representation for the new domain data, which describes different content types included in the new domain data. The domain adaptation system then generates a probability distribution for each discrete region of the new domain data, which describes a likelihood of the region including different content described by the feature representation. The probability distribution is compared to ground truth information for the new domain data to determine a loss function, which is used to refine model parameters. After determining that model outputs achieve a threshold similarity to the ground truth information, the model is output as a domain-agnostic model.

Type: Grant

Filed: August 9, 2022

Date of Patent: May 7, 2024

Assignee: Adobe Inc.

Inventors: Kai Li, Christopher Alan Tensmeyer, Curtis Michael Wigington, Handong Zhao, Nikolaos Barmpalios, Tong Sun, Varun Manjunatha, Vlad Ion Morariu
TRAINING LANGUAGE MODELS AND PRESERVING PRIVACY

Publication number: 20240135103

Abstract: In implementations of systems for training language models and preserving privacy, a computing device implements a privacy system to predict a next word after a last word in a sequence of words by processing input data using a machine learning model trained on training data to predict next words after last words in sequences of words. The training data describes a corpus of text associated with clients and including sensitive samples and non-sensitive samples. The machine learning model is trained by sampling a client of the clients and using a subset of the sensitive samples associated with the client and a subset of the non-sensitive samples associated with the client to update parameters of the machine learning model. The privacy system generates an indication of the next word after the last word in the sequence of words for display in a user interface.

Type: Application

Filed: February 23, 2023

Publication date: April 25, 2024

Applicant: Adobe Inc.

Inventors: Franck Dernoncourt, Tong Sun, Thi Kim Phung Lai, Rajiv Bhawanji Jain, Nikolaos Barmpalios, Jiuxiang Gu
LABEL INDUCTION

Publication number: 20240135096

Abstract: Systems and methods for document classification are described. Embodiments of the present disclosure generate classification data for a plurality of samples using a neural network trained to identify a plurality of known classes; select a set of samples for annotation from the plurality of samples using an open-set metric based on the classification data, wherein the annotation includes an unknown class; and train the neural network to identify the unknown class based on the annotation of the set of samples.

Type: Application

Filed: October 23, 2022

Publication date: April 25, 2024

Inventors: Rajiv Bhawanji Jain, Michelle Yuan, Vlad Ion Morariu, Ani Nenkova Nenkova, Smitha Bangalore Naresh, Nikolaos Barmpalios, Ruchi Deshpande, Ruiyi Zhang, Jiuxiang Gu, Varun Manjunatha, Nedim Lipka, Andrew Marc Greene
CONFIDENCE EVALUATION MODEL FOR STRUCTURE PREDICTION TASKS

Publication number: 20240028972

Abstract: Techniques for training for and determining a confidence of an output of a machine learning model are disclosed. Such techniques include, in some embodiments, receiving, from the machine learning model configured to receive information associated with a data object, information associated with a predicted structure for the data object; encoding, using a second machine learning model, the information associated with the predicted structure for the data object to produce encoded input channels; evaluating, using the second machine learning model, the information associated with the data object with the encoded input channels; and based on the evaluating, determining, using the second machine learning model, a probability of correctness of the predicted structure for the data object.

Type: Application

Filed: July 27, 2022

Publication date: January 25, 2024

Inventors: Christopher Tensmeyer, Nikolaos Barmpalios, Sruthi Madapoosi Ravi, Ruchi Deshpande, Varun Manjunatha, Smitha Bangalore Naresh, Priyank Mathur, Oghenetegiri Sido
MULTIMODAL EXTRACTION ACROSS MULTIPLE GRANULARITIES

Publication number: 20230376687

Abstract: Embodiments are provided for facilitating multimodal extraction across multiple granularities. In one implementation, a set of features of a document for a plurality of granularities of the document is obtained. Via a machine learning model, the set of features of the document is modified to generate a set of modified features using a set of self-attention values to determine relationships within a first type of feature and a set of cross-attention values to determine relationships between the first type of feature and a second type of feature. Thereafter, the set of modified features is provided to a second machine learning model to perform a classification task.

Type: Application

Filed: May 17, 2022

Publication date: November 23, 2023

Inventors: Vlad Ion Morariu, Tong Sun, Nikolaos Barmpalios, Zilong Wang, Jiuxiang Gu, Ani Nenkova Nenkova, Christopher Tensmeyer
Preserving user-entity differential privacy in natural language modeling

Patent number: 11816243

Abstract: Systems, methods, and non-transitory computer-readable media can generate a natural language model that provides user-entity differential privacy. For example, in one or more embodiments, a system samples sensitive data points from a natural language dataset. Using the sampled sensitive data points, the system determines gradient values corresponding to the natural language model. Further, the system generates noise for the natural language model. The system generates parameters for the natural language model using the gradient values and the noise, facilitating simultaneous protection of the users and sensitive entities associated with the natural language dataset. In some implementations, the system generates the natural language model through an iterative process (e.g., by iteratively modifying the parameters).

Type: Grant

Filed: August 9, 2021

Date of Patent: November 14, 2023

Assignee: Adobe Inc.

Inventors: Thi Kim Phung Lai, Tong Sun, Rajiv Jain, Nikolaos Barmpalios, Jiuxiang Gu, Franck Dernoncourt
Privacy Preserving Document Analysis

Publication number: 20230336532

Abstract: Systems and techniques for privacy preserving document analysis are described that derive insights pertaining to a digital document without communication of the content of the digital document. To do so, the privacy preserving document analysis techniques described herein capture visual or contextual features of the digital document and create a stamp representation that represents these features without including the content of the digital document. The stamp representation is projected into a stamp embedding space based on a stamp encoding model generated through machine learning techniques capturing feature patterns and interaction in the stamp representations. The stamp encoding model exploits these feature interactions to define similarity of source documents based on location within the stamp embedding space. Accordingly, the techniques described herein can determine a similarity of documents without having access to the documents themselves.

Type: Application

Filed: May 15, 2023

Publication date: October 19, 2023

Applicant: Adobe Inc.

Inventors: Nikolaos Barmpalios, Ruchi Rajiv Deshpande, Randy Lee Swineford, Nargol Rezvani, Andrew Marc Greene, Shawn Alan Gaither, Michael Kraley
Privacy preserving document analysis

Patent number: 11689507

Abstract: Systems and techniques for privacy preserving document analysis are described that derive insights pertaining to a digital document without communication of the content of the digital document. To do so, the privacy preserving document analysis techniques described herein capture visual or contextual features of the digital document and create a stamp representation that represents these features without including the content of the digital document. The stamp representation is projected into a stamp embedding space based on a stamp encoding model generated through machine learning techniques capturing feature patterns and interaction in the stamp representations. The stamp encoding model exploits these feature interactions to define similarity of source documents based on location within the stamp embedding space. Accordingly, the techniques described herein can determine a similarity of documents without having access to the documents themselves.

Type: Grant

Filed: November 26, 2019

Date of Patent: June 27, 2023

Assignee: Adobe Inc.

Inventors: Nikolaos Barmpalios, Ruchi Rajiv Deshpande, Randy Lee Swineford, Nargol Rezvani, Andrew Marc Greene, Shawn Alan Gaither, Michael Kraley
UNIFIED PRETRAINING FRAMEWORK FOR DOCUMENT UNDERSTANDING

Publication number: 20230154221

Abstract: The technology described includes methods for pretraining a document encoder model based on multimodal self cross-attention. One method includes receiving image data that encodes a set of pretraining documents. A set of sentences is extracted from the image data. A bounding box for each sentence is generated. For each sentence, a set of predicted features is generated by using an encoder machine-learning model. The encoder model performs cross-attention between a set of masked-textual features for the sentence and a set of masked-visual features for the sentence. The set of masked-textual features is based on a masking function and the sentence. The set of masked-visual features is based on the masking function and the corresponding bounding box. A document-encoder model is pretrained based on the set of predicted features for each sentence and pretraining tasks. The pretraining tasks include masked sentence modeling, visual contrastive learning, or visual-language alignment.

Type: Application

Filed: November 16, 2021

Publication date: May 18, 2023

Inventors: Jiuxiang Gu, Ani Nenkova Nenkova, Nikolaos Barmpalios, Vlad Ion Morariu, Tong Sun, Rajiv Bhawanji Jain, Jason Wen Yong Kuen, Handong Zhao
PRESERVING USER-ENTITY DIFFERENTIAL PRIVACY IN NATURAL LANGUAGE MODELING

Publication number: 20230059367

Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that generate a natural language model that provides user-entity differential privacy. For example, in one or more embodiments, the disclosed systems sample sensitive data points from a natural language dataset. Using the sampled sensitive data points, the disclosed systems determine gradient values corresponding to the natural language model. Further, the disclosed systems generate noise for the natural language model. The disclosed systems generate parameters for the natural language model using the gradient values and the noise, facilitating simultaneous protection of the users and sensitive entities associated with the natural language dataset. In some implementations, the disclosed systems generate the natural language model through an iterative process (e.g., by iteratively modifying the parameters).

Type: Application

Filed: August 9, 2021

Publication date: February 23, 2023

Inventors: Thi Kim Phung Lai, Tong Sun, Rajiv Jain, Nikolaos Barmpalios, Jiuxiang Gu, Franck Dernoncourt
Domain alignment for object detection domain adaptation tasks

Patent number: 11544503

Abstract: A domain alignment technique for cross-domain object detection tasks is introduced. During a preliminary pretraining phase, an object detection model is pretrained to detect objects in images associated with a source domain using a source dataset of images associated with the source domain. After completing the pretraining phase, a domain adaptation phase is performed using the source dataset and a target dataset to adapt the pretrained object detection model to detect objects in images associated with the target domain. The domain adaptation phase may involve the use of various domain alignment modules that, for example, perform multi-scale pixel/path alignment based on input feature maps or perform instance-level alignment based on input region proposals.

Type: Grant

Filed: May 27, 2020

Date of Patent: January 3, 2023

Assignee: Adobe Inc.

Inventors: Christopher Tensmeyer, Vlad Ion Morariu, Varun Manjunatha, Tong Sun, Nikolaos Barmpalios, Kai Li, Handong Zhao, Curtis Wigington
Domain Adaptation for Machine Learning Models

Publication number: 20220391768

Abstract: Adapting a machine learning model to process data that differs from training data used to configure the model for a specified objective is described. A domain adaptation system trains the model to process new domain data that differs from a training data domain by using the model to generate a feature representation for the new domain data, which describes different content types included in the new domain data. The domain adaptation system then generates a probability distribution for each discrete region of the new domain data, which describes a likelihood of the region including different content described by the feature representation. The probability distribution is compared to ground truth information for the new domain data to determine a loss function, which is used to refine model parameters. After determining that model outputs achieve a threshold similarity to the ground truth information, the model is output as a domain-agnostic model.

Type: Application

Filed: August 9, 2022

Publication date: December 8, 2022

Applicant: Adobe Inc.

Inventors: Kai Li, Christopher Alan Tensmeyer, Curtis Michael Wigington, Handong Zhao, Nikolaos Barmpalios, Tong Sun, Varun Manjunatha, Vlad Ion Morariu
Domain adaptation for machine learning models

Patent number: 11443193

Abstract: Adapting a machine learning model to process data that differs from training data used to configure the model for a specified objective is described. A domain adaptation system trains the model to process new domain data that differs from a training data domain by using the model to generate a feature representation for the new domain data, which describes different content types included in the new domain data. The domain adaptation system then generates a probability distribution for each discrete region of the new domain data, which describes a likelihood of the region including different content described by the feature representation. The probability distribution is compared to ground truth information for the new domain data to determine a loss function, which is used to refine model parameters. After determining that model outputs achieve a threshold similarity to the ground truth information, the model is output as a domain-agnostic model.

Type: Grant

Filed: May 4, 2020

Date of Patent: September 13, 2022

Assignee: Adobe Inc.

Inventors: Kai Li, Christopher Alan Tensmeyer, Curtis Michael Wigington, Handong Zhao, Nikolaos Barmpalios, Tong Sun, Varun Manjunatha, Vlad Ion Morariu
ASIDES DETECTION IN DOCUMENTS

Publication number: 20220172501

Abstract: Techniques are disclosed for identifying asides within a document, and detecting a display order of contents based on the identified asides. In a document, an “aside” represents a content region of the document that is distinct from the main content regions, and may be visually distinguishable from the main content region. In an example, a document is received, where the document lacks identification of asides. The document is analyzed to identify asides within the document. A display order of contents within the document is then determined, based on the identified asides. For example, in the display order, the asides are ordered between two segments of the main content and/or at a beginning or an end of the main content, but may not be ordered to be embedded in between a segment of the main content. The document is displayed in accordance with the display order.

Type: Application

Filed: February 17, 2022

Publication date: June 2, 2022

Applicant: Adobe Inc.

Inventors: Sanjeev Tagra, Shawn Alan Gaither, Shagun Kush, Samarth Gupta, Sachin Soni, Nikolaos Barmpalios, Abhishek Jain, Naqushab Neyazee
Asides detection in documents

Patent number: 11256913

Abstract: Techniques are disclosed for identifying asides within a document, and detecting a display order of contents based on the identified asides. In a document, an “aside” represents a content region of the document that is distinct from the main content regions, and may be visually distinguishable from the main content region. In an example, a document is received, where the document lacks identification of asides. The document is analyzed to identify asides within the document. A display order of contents within the document is then determined, based on the identified asides. For example, in the display order, the asides are ordered between two segments of the main content and/or at a beginning or an end of the main content, but may not be ordered to be embedded in between a segment of the main content. The document is displayed in accordance with the display order.

Type: Grant

Filed: October 10, 2019

Date of Patent: February 22, 2022

Assignee: Adobe Inc.

Inventors: Sanjeev Tagra, Shawn Alan Gaither, Shagun Kush, Samarth Gupta, Sachin Soni, Nikolaos Barmpalios, Abhishek Jain, Naqushab Neyazee
Automatically generating labeled synthetic documents

Patent number: 11238312

Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for generating diverse and realistic synthetic documents using deep learning. In particular, the disclosed systems can utilize a trained neural network to generate realistic image layouts comprising page elements that comply with layout parameters. The disclosed systems can also generate synthetic content corresponding to the page elements within the image layouts. The disclosed systems insert the synthetic content into the corresponding page elements of documents based on the image layouts to generate synthetic documents.

Type: Grant

Filed: November 21, 2019

Date of Patent: February 1, 2022

Assignee: Adobe Inc.

Inventors: Verena Kaynig-Fittkau, Sruthi Madapoosi Ravi, Richard Cohn, Nikolaos Barmpalios, Michael Kraley, Kanchana Sethu
Domain Adaptation for Machine Learning Models

Publication number: 20210334664

Abstract: Adapting a machine learning model to process data that differs from training data used to configure the model for a specified objective is described. A domain adaptation system trains the model to process new domain data that differs from a training data domain by using the model to generate a feature representation for the new domain data, which describes different content types included in the new domain data. The domain adaptation system then generates a probability distribution for each discrete region of the new domain data, which describes a likelihood of the region including different content described by the feature representation. The probability distribution is compared to ground truth information for the new domain data to determine a loss function, which is used to refine model parameters. After determining that model outputs achieve a threshold similarity to the ground truth information, the model is output as a domain-agnostic model.

Type: Application

Filed: May 4, 2020

Publication date: October 28, 2021

Applicant: Adobe Inc.

Inventors: Kai Li, Christopher Alan Tensmeyer, Curtis Michael Wigington, Handong Zhao, Nikolaos Barmpalios, Tong Sun, Varun Manjunatha, Vlad Ion Morariu
DOMAIN ALIGNMENT FOR OBJECT DETECTION DOMAIN ADAPTATION TASKS

Publication number: 20210312232

Abstract: A domain alignment technique for cross-domain object detection tasks is introduced. During a preliminary pretraining phase, an object detection model is pretrained to detect objects in images associated with a source domain using a source dataset of images associated with the source domain. After completing the pretraining phase, a domain adaptation phase is performed using the source dataset and a target dataset to adapt the pretrained object detection model to detect objects in images associated with the target domain. The domain adaptation phase may involve the use of various domain alignment modules that, for example, perform multi-scale pixel/path alignment based on input feature maps or perform instance-level alignment based on input region proposals.

Type: Application

Filed: May 27, 2020

Publication date: October 7, 2021

Inventors: Christopher Tensmeyer, Vlad Ion Morariu, Varun Manjunatha, Tong Sun, Nikolaos Barmpalios, Kai Li, Handong Zhao, Curtis Wigington
AUTOMATICALLY GENERATING LABELED SYNTHETIC DOCUMENTS

Publication number: 20210158093

Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for generating diverse and realistic synthetic documents using deep learning. In particular, the disclosed systems can utilize a trained neural network to generate realistic image layouts comprising page elements that comply with layout parameters. The disclosed systems can also generate synthetic content corresponding to the page elements within the image layouts. The disclosed systems insert the synthetic content into the corresponding page elements of documents based on the image layouts to generate synthetic documents.

Type: Application

Filed: November 21, 2019

Publication date: May 27, 2021

Inventors: Verena Kaynig-Fittkau, Sruthi Madapoosi Ravi, Richard Cohn, Nikolaos Barmpalios, Michael Kraley, Kanchana Sethu
Privacy Preserving Document Analysis

Publication number: 20210160221

Abstract: Systems and techniques for privacy preserving document analysis are described that derive insights pertaining to a digital document without communication of the content of the digital document. To do so, the privacy preserving document analysis techniques described herein capture visual or contextual features of the digital document and create a stamp representation that represents these features without including the content of the digital document. The stamp representation is projected into a stamp embedding space based on a stamp encoding model generated through machine learning techniques capturing feature patterns and interaction in the stamp representations. The stamp encoding model exploits these feature interactions to define similarity of source documents based on location within the stamp embedding space. Accordingly, the techniques described herein can determine a similarity of documents without having access to the documents themselves.

Type: Application

Filed: November 26, 2019

Publication date: May 27, 2021

Applicant: Adobe Inc.

Inventors: Nikolaos Barmpalios, Ruchi Rajiv Deshpande, Randy Lee Swineford, Nargol Rezvani, Andrew Marc Greene, Shawn Alan Gaither, Michael Kraley
