Patents by Inventor SHAWN ALAN GAITHER

SHAWN ALAN GAITHER has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Privacy Preserving Document Analysis

Publication number: 20230336532

Abstract: Systems and techniques for privacy preserving document analysis are described that derive insights pertaining to a digital document without communication of the content of the digital document. To do so, the privacy preserving document analysis techniques described herein capture visual or contextual features of the digital document and create a stamp representation that represents these features without including the content of the digital document. The stamp representation is projected into a stamp embedding space based on a stamp encoding model generated through machine learning techniques capturing feature patterns and interactions in the stamp representations. The stamp encoding model exploits these feature interactions to define similarity of source documents based on location within the stamp embedding space. Accordingly, the techniques described herein can determine a similarity of documents without having access to the documents themselves.
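
The idea of comparing documents through a content-free representation can be illustrated with a minimal sketch. It is not the patented method: the abstract describes a learned stamp encoding model, whereas this example swaps in a hand-picked feature vector and plain cosine similarity. The `Stamp` fields and the sample values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamp:
    """Content-free summary of a document (hypothetical feature set)."""
    page_count: int
    image_count: int
    table_count: int
    mean_font_size: float

    def as_vector(self) -> list[float]:
        # Only layout statistics are exposed, never the document text.
        return [float(self.page_count), float(self.image_count),
                float(self.table_count), self.mean_font_size]


def cosine_similarity(a: Stamp, b: Stamp) -> float:
    """Stand-in for distance in a learned embedding space."""
    va, vb = a.as_vector(), b.as_vector()
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm if norm else 0.0


# Hypothetical stamps: two invoices and one image-heavy brochure.
invoice_a = Stamp(page_count=2, image_count=1, table_count=3, mean_font_size=10.0)
invoice_b = Stamp(page_count=2, image_count=1, table_count=4, mean_font_size=10.5)
brochure = Stamp(page_count=12, image_count=30, table_count=0, mean_font_size=14.0)

print(cosine_similarity(invoice_a, invoice_b))
print(cosine_similarity(invoice_a, brochure))
```

In this toy setup the two invoices score closer to each other than either does to the brochure, even though no document text was ever read.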

Type: Application

Filed: May 15, 2023

Publication date: October 19, 2023

Applicant: Adobe Inc.

Inventors: Nikolaos Barmpalios, Ruchi Rajiv Deshpande, Randy Lee Swineford, Nargol Rezvani, Andrew Marc Greene, Shawn Alan Gaither, Michael Kraley

Probabilistic language models for identifying sequential reading order of discontinuous text segments

Patent number: 11769111

Abstract: The present invention is directed towards providing automated workflows for the identification of a reading order from text segments extracted from a document. Ordering the text segments is based on trained natural language models. In some embodiments, the workflows are enabled to perform a method for identifying a sequence associated with a portable document. The method includes iteratively generating a probabilistic language model, receiving the portable document, and selectively extracting features (such as but not limited to text segments) from the document. The method may generate pairs of features (or feature pairs) from the extracted features. The method may further generate a score for each of the pairs based on the probabilistic language model and determine an order of the features based on the scores. The method may provide the extracted features in the determined order.
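
The pairwise scoring step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented workflow: a tiny add-one-smoothed bigram model stands in for the probabilistic language model, each pair is scored only on the boundary words of its two segments, and the best order is found by brute force over permutations (practical only for a handful of segments). All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def train_bigram(corpus: list[str]):
    """Return log P(next | prev) from a toy corpus, with add-one smoothing."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    vocab_size = len(unigrams)

    def log_prob(prev: str, nxt: str) -> float:
        return math.log((bigrams[(prev, nxt)] + 1) / (unigrams[prev] + vocab_size))

    return log_prob


def pair_score(log_prob, first: str, second: str) -> float:
    """Score how plausibly `second` continues `first` (boundary words only)."""
    return log_prob(first.lower().split()[-1], second.lower().split()[0])


def reading_order(segments: list[str], log_prob) -> list[str]:
    """Pick the permutation whose adjacent pairs have the highest total score."""
    def total(order):
        return sum(pair_score(log_prob, a, b) for a, b in zip(order, order[1:]))
    return list(max(permutations(segments), key=total))


log_prob = train_bigram(["the quick brown fox jumps over the lazy dog"])
shuffled = ["jumps over the", "the quick brown fox", "lazy dog"]
print(reading_order(shuffled, log_prob))
# ['the quick brown fox', 'jumps over the', 'lazy dog']
```

A production system would need a far stronger language model and a search strategy that scales beyond a few segments; the sketch only shows how pair scores can drive the ordering.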

Type: Grant

Filed: June 18, 2020

Date of Patent: September 26, 2023

Assignee: Adobe Inc.

Inventors: Trung Huu Bui, Hung Hai Bui, Shawn Alan Gaither, Walter Wei-Tuh Chang, Michael Frank Kraley, Pranjal Daga

Privacy preserving document analysis

Patent number: 11689507

Abstract: Systems and techniques for privacy preserving document analysis are described that derive insights pertaining to a digital document without communication of the content of the digital document. To do so, the privacy preserving document analysis techniques described herein capture visual or contextual features of the digital document and create a stamp representation that represents these features without including the content of the digital document. The stamp representation is projected into a stamp embedding space based on a stamp encoding model generated through machine learning techniques capturing feature patterns and interactions in the stamp representations. The stamp encoding model exploits these feature interactions to define similarity of source documents based on location within the stamp embedding space. Accordingly, the techniques described herein can determine a similarity of documents without having access to the documents themselves.

Type: Grant

Filed: November 26, 2019

Date of Patent: June 27, 2023

Assignee: Adobe Inc.

Inventors: Nikolaos Barmpalios, Ruchi Rajiv Deshpande, Randy Lee Swineford, Nargol Rezvani, Andrew Marc Greene, Shawn Alan Gaither, Michael Kraley

Multiple channels of rasterized content for page decomposition using machine learning

Patent number: 11386685

Abstract: Techniques are provided for identifying structural elements of a document. One methodology includes generating a first channel of rasterized content by rasterizing a full page of the document and generating one or more additional channels of rasterized content from the page of the document by rasterizing one or more corresponding content types from the page of the document. Each of the one or more additional channels includes a specific type of content that is different from each of the other one or more additional channels. The methodology further includes inputting the first channel of rasterized content and the one or more additional channels of rasterized content into a machine learning (ML) model. The methodology continues with determining location and classification for each of a plurality of structural elements on the page of the document using the ML model.
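
The channel construction described here can be sketched in a few lines. This is a simplified illustration, assuming page content is already available as typed bounding boxes and that a binary occupancy grid is an acceptable stand-in for real rasterization. The `Box` type, the content-type names, and the grid size are hypothetical, and the ML model that would consume the channels is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    kind: str  # hypothetical content type, e.g. "text" or "image"
    x0: int
    y0: int
    x1: int  # exclusive
    y1: int  # exclusive


def rasterize(boxes: list[Box], width: int, height: int) -> list[list[int]]:
    """Paint each box as 1s onto a height x width grid of 0s."""
    grid = [[0] * width for _ in range(height)]
    for box in boxes:
        for y in range(max(box.y0, 0), min(box.y1, height)):
            for x in range(max(box.x0, 0), min(box.x1, width)):
                grid[y][x] = 1
    return grid


def build_channels(boxes, width, height, kinds):
    """First channel: the full page. Then one channel per content type."""
    channels = [rasterize(boxes, width, height)]
    for kind in kinds:
        only_kind = [box for box in boxes if box.kind == kind]
        channels.append(rasterize(only_kind, width, height))
    return channels


page = [Box("text", 0, 0, 2, 1), Box("image", 2, 1, 4, 2)]
channels = build_channels(page, width=4, height=2, kinds=["text", "image"])
for channel in channels:
    print(channel)
```

The resulting list of grids has the shape a multi-channel image input would take, with each extra channel isolating a single content type.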

Type: Grant

Filed: October 17, 2019

Date of Patent: July 12, 2022

Assignee: Adobe Inc.

Inventors: Verena Sabine Kaynig-Fittkau, Smitha Bangalore Naresh, Shawn Alan Gaither, Richard Cohn, Paul John Asente, Eylon Stroh, Emily Seminerio

ASIDES DETECTION IN DOCUMENTS

Publication number: 20220172501

Abstract: Techniques are disclosed for identifying asides within a document, and detecting a display order of contents based on the identified asides. In a document, an “aside” represents a content region of the document that is distinct from the main content regions, and may be visually distinguishable from the main content region. In an example, a document is received, where the document lacks identification of asides. The document is analyzed to identify asides within the document. A display order of contents within the document is then determined, based on the identified asides. For example, in the display order, the asides are ordered between two segments of the main content and/or at a beginning or an end of the main content, but may not be ordered so as to be embedded within a segment of the main content. The document is displayed in accordance with the display order.
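
The ordering constraint in this abstract (asides go before, between, or after main segments, never inside one) can be sketched as below. The example assumes the asides have already been identified, which is the harder part of the technique and is not shown. The `Block` type, its vertical-position fields, and the placement rule are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    label: str
    top: float      # vertical position where the block starts
    bottom: float   # vertical position where the block ends
    is_aside: bool  # assumed to come from an earlier detection step


def display_order(blocks: list[Block]) -> list[str]:
    """Order blocks so that asides never interrupt a main segment."""
    main = sorted((b for b in blocks if not b.is_aside), key=lambda b: b.top)
    asides = sorted((b for b in blocks if b.is_aside), key=lambda b: b.top)

    # slots[i] holds asides shown just before main[i];
    # the last slot holds asides shown after all main content.
    slots = [[] for _ in range(len(main) + 1)]
    for aside in asides:
        # An aside that starts alongside a main segment is pushed after it.
        index = sum(1 for segment in main if segment.top <= aside.top)
        slots[index].append(aside)

    ordered = []
    for i, segment in enumerate(main):
        ordered.extend(slots[i])
        ordered.append(segment)
    ordered.extend(slots[-1])
    return [block.label for block in ordered]


blocks = [
    Block("intro", 0, 100, is_aside=False),
    Block("sidebar", 40, 80, is_aside=True),   # sits beside the intro
    Block("body", 100, 300, is_aside=False),
    Block("footnote", 310, 330, is_aside=True),
]
print(display_order(blocks))
# ['intro', 'sidebar', 'body', 'footnote']
```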

Type: Application

Filed: February 17, 2022

Publication date: June 2, 2022

Applicant: Adobe Inc.

Inventors: Sanjeev Tagra, Shawn Alan Gaither, Shagun Kush, Samarth Gupta, Sachin Soni, Nikolaos Barmpalios, Abhishek Jain, Naqushab Neyazee

Asides detection in documents

Patent number: 11256913

Abstract: Techniques are disclosed for identifying asides within a document, and detecting a display order of contents based on the identified asides. In a document, an “aside” represents a content region of the document that is distinct from the main content regions, and may be visually distinguishable from the main content region. In an example, a document is received, where the document lacks identification of asides. The document is analyzed to identify asides within the document. A display order of contents within the document is then determined, based on the identified asides. For example, in the display order, the asides are ordered between two segments of the main content and/or at a beginning or an end of the main content, but may not be ordered so as to be embedded within a segment of the main content. The document is displayed in accordance with the display order.

Type: Grant

Filed: October 10, 2019

Date of Patent: February 22, 2022

Assignee: Adobe Inc.

Inventors: Sanjeev Tagra, Shawn Alan Gaither, Shagun Kush, Samarth Gupta, Sachin Soni, Nikolaos Barmpalios, Abhishek Jain, Naqushab Neyazee

Privacy Preserving Document Analysis

Publication number: 20210160221

Abstract: Systems and techniques for privacy preserving document analysis are described that derive insights pertaining to a digital document without communication of the content of the digital document. To do so, the privacy preserving document analysis techniques described herein capture visual or contextual features of the digital document and create a stamp representation that represents these features without including the content of the digital document. The stamp representation is projected into a stamp embedding space based on a stamp encoding model generated through machine learning techniques capturing feature patterns and interactions in the stamp representations. The stamp encoding model exploits these feature interactions to define similarity of source documents based on location within the stamp embedding space. Accordingly, the techniques described herein can determine a similarity of documents without having access to the documents themselves.

Type: Application

Filed: November 26, 2019

Publication date: May 27, 2021

Applicant: Adobe Inc.

Inventors: Nikolaos Barmpalios, Ruchi Rajiv Deshpande, Randy Lee Swineford, Nargol Rezvani, Andrew Marc Greene, Shawn Alan Gaither, Michael Kraley

MULTIPLE CHANNELS OF RASTERIZED CONTENT FOR PAGE DECOMPOSITION USING MACHINE LEARNING

Publication number: 20210117666

Abstract: Techniques are provided for identifying structural elements of a document. One methodology includes generating a first channel of rasterized content by rasterizing a full page of the document and generating one or more additional channels of rasterized content from the page of the document by rasterizing one or more corresponding content types from the page of the document. Each of the one or more additional channels includes a specific type of content that is different from each of the other one or more additional channels. The methodology further includes inputting the first channel of rasterized content and the one or more additional channels of rasterized content into a machine learning (ML) model. The methodology continues with determining location and classification for each of a plurality of structural elements on the page of the document using the ML model.

Type: Application

Filed: October 17, 2019

Publication date: April 22, 2021

Applicant: Adobe Inc.

Inventors: Verena Sabine Kaynig-Fittkau, Smitha Bangalore Naresh, Shawn Alan Gaither, Richard Cohn, Paul John Asente, Eylon Stroh, Emily Seminerio

ASIDES DETECTION IN DOCUMENTS

Publication number: 20210110151

Abstract: Techniques are disclosed for identifying asides within a document, and detecting a display order of contents based on the identified asides. In a document, an “aside” represents a content region of the document that is distinct from the main content regions, and may be visually distinguishable from the main content region. In an example, a document is received, where the document lacks identification of asides. The document is analyzed to identify asides within the document. A display order of contents within the document is then determined, based on the identified asides. For example, in the display order, the asides are ordered between two segments of the main content and/or at a beginning or an end of the main content, but may not be ordered so as to be embedded within a segment of the main content. The document is displayed in accordance with the display order.

Type: Application

Filed: October 10, 2019

Publication date: April 15, 2021

Applicant: Adobe Inc.

Inventors: Sanjeev Tagra, Shawn Alan Gaither, Shagun Kush, Samarth Gupta, Sachin Soni, Nikolaos Barmpalios, Abhishek Jain, Naqushab Neyazee

Heading Identification and Classification for a Digital Document

Publication number: 20210110153

Abstract: Techniques described herein implement heading identification and classification for a digital document in a digital medium environment. A document analysis system is leveraged to extract structural features from a digital document, identify heading candidates from among the structural features, validate the heading candidates, and classify validated headings into different heading types. The classified headings are then utilized to generate a sectioned version of the digital document (“sectioned document”) that is divided into different sections based on the headings. Further, a document directory is generated that includes the headings and that enables navigation to different sections of the sectioned document.
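
The candidate, validation, and classification stages named in this abstract can be sketched with simple rules. This is only an illustration: the heuristics (larger or bold text as candidates, short lines without trailing sentence punctuation as valid, font size rank as heading level) are assumptions chosen for the example, not the criteria used by the patented system. The `Line` type and `identify_headings` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    text: str
    font_size: float
    bold: bool


def identify_headings(lines: list[Line], body_size: float) -> list[tuple[int, str]]:
    """Return (level, text) pairs that can serve as a simple document directory."""
    # 1. Candidates: anything larger than body text, or bold.
    candidates = [ln for ln in lines if ln.font_size > body_size or ln.bold]

    # 2. Validation: headings are short and do not end like a sentence.
    validated = [
        ln for ln in candidates
        if len(ln.text.split()) <= 10
        and not ln.text.rstrip().endswith((".", ",", ";"))
    ]

    # 3. Classification: the largest font size becomes level 1, the next level 2, ...
    sizes = sorted({ln.font_size for ln in validated}, reverse=True)
    return [(sizes.index(ln.font_size) + 1, ln.text) for ln in validated]


document = [
    Line("Introduction", 18, True),
    Line("This is body text that ends with a period.", 11, False),
    Line("Background", 14, True),
    Line("Note that this bold sentence is body text.", 11, True),
]
directory = identify_headings(document, body_size=11)
print(directory)
# [(1, 'Introduction'), (2, 'Background')]
```

The returned pairs preserve document order, so they can be used both to split the document into sections and to build a navigable outline.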

Type: Application

Filed: October 9, 2019

Publication date: April 15, 2021

Applicant: Adobe Inc.

Inventors: Mohit Gupta, Uttam Dwivedi, Shawn Alan Gaither, Jayant Vaibhav Srivastava, Ashutosh Mehra

Heading identification and classification for a digital document

Patent number: 10956731

Abstract: Techniques described herein implement heading identification and classification for a digital document in a digital medium environment. A document analysis system is leveraged to extract structural features from a digital document, identify heading candidates from among the structural features, validate the heading candidates, and classify validated headings into different heading types. The classified headings are then utilized to generate a sectioned version of the digital document (“sectioned document”) that is divided into different sections based on the headings. Further, a document directory is generated that includes the headings and that enables navigation to different sections of the sectioned document.

Type: Grant

Filed: October 9, 2019

Date of Patent: March 23, 2021

Assignee: Adobe Inc.

Inventors: Mohit Gupta, Uttam Dwivedi, Shawn Alan Gaither, Jayant Vaibhav Srivastava, Ashutosh Mehra

PROBABILISTIC LANGUAGE MODELS FOR IDENTIFYING SEQUENTIAL READING ORDER OF DISCONTINUOUS TEXT SEGMENTS

Publication number: 20200320329

Abstract: The present invention is directed towards providing automated workflows for the identification of a reading order from text segments extracted from a document. Ordering the text segments is based on trained natural language models. In some embodiments, the workflows are enabled to perform a method for identifying a sequence associated with a portable document. The method includes iteratively generating a probabilistic language model, receiving the portable document, and selectively extracting features (such as but not limited to text segments) from the document. The method may generate pairs of features (or feature pairs) from the extracted features. The method may further generate a score for each of the pairs based on the probabilistic language model and determine an order of the features based on the scores. The method may provide the extracted features in the determined order.

Type: Application

Filed: June 18, 2020

Publication date: October 8, 2020

Inventors: Trung Huu Bui, Hung Hai Bui, Shawn Alan Gaither, Walter Wei-Tuh Chang, Michael Frank Kraley, Pranjal Daga

Automated workflows for identification of reading order from text segments using probabilistic language models

Patent number: 10713519

Abstract: The present invention is directed towards providing automated workflows for the identification of a reading order from text segments extracted from a document. Ordering the text segments is based on trained natural language models. In some embodiments, the workflows are enabled to perform a method for identifying a sequence associated with a portable document. The method includes iteratively generating a probabilistic language model, receiving the portable document, and selectively extracting features (such as but not limited to text segments) from the document. The method may generate pairs of features (or feature pairs) from the extracted features. The method may further generate a score for each of the pairs based on the probabilistic language model and determine an order of the features based on the scores. The method may provide the extracted features in the determined order.

Type: Grant

Filed: June 22, 2017

Date of Patent: July 14, 2020

Assignee: ADOBE INC.

Inventors: Trung Huu Bui, Hung Hai Bui, Shawn Alan Gaither, Walter Wei-Tuh Chang, Michael Frank Kraley, Pranjal Daga

AUTOMATED WORKFLOWS FOR IDENTIFICATION OF READING ORDER FROM TEXT SEGMENTS USING PROBABILISTIC LANGUAGE MODELS

Publication number: 20180373952

Abstract: The present invention is directed towards providing automated workflows for the identification of a reading order from text segments extracted from a document. Ordering the text segments is based on trained natural language models. In some embodiments, the workflows are enabled to perform a method for identifying a sequence associated with a portable document. The method includes iteratively generating a probabilistic language model, receiving the portable document, and selectively extracting features (such as but not limited to text segments) from the document. The method may generate pairs of features (or feature pairs) from the extracted features. The method may further generate a score for each of the pairs based on the probabilistic language model and determine an order of the features based on the scores. The method may provide the extracted features in the determined order.

Type: Application

Filed: June 22, 2017

Publication date: December 27, 2018

Inventors: Trung Huu Bui, Hung Hai Bui, Shawn Alan Gaither, Walter Wei-Tuh Chang, Michael Frank Kraley, Pranjal Daga

Form value prediction utilizing synonymous field recognition

Patent number: 10133813

Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed at predicting values for an electronic form. In embodiments, the method can include forming synonym groupings of form field labels for a number of users. The synonym groupings can be based on an analysis of the similarity of form field values that are associated with form field labels. In embodiments, a predictive model may be generated from these synonym groupings. The predictive model can correlate the synonym groupings of one user with synonym groupings of one or more additional users to enable a determination of one or more predicted form field values for the one user based on a queried form field label even though the one user may have never submitted an electronic form with the queried form field label. Other embodiments may be described and/or claimed.
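
The cross-user correlation described in this abstract can be sketched as follows. This is a deliberately small illustration, not the patented predictive model: labels are grouped as synonyms only when a user entered exactly the same (case-insensitive) value for them, and predictions are made by simple voting. The function names and sample form histories are hypothetical.

```python
from collections import defaultdict


def synonym_groups(history: dict[str, str]) -> list[set[str]]:
    """Group one user's field labels that were filled with the same value."""
    by_value = defaultdict(set)
    for label, value in history.items():
        by_value[value.strip().lower()].add(label)
    return list(by_value.values())


def predict(user_history, other_histories, queried_label):
    """Predict a value for a label the user may never have filled in."""
    if queried_label in user_history:
        return user_history[queried_label]

    votes = defaultdict(int)
    for other in other_histories:
        for group in synonym_groups(other):
            if queried_label not in group:
                continue
            # Another user treats these labels as synonyms; reuse our own
            # value for any label in the group that we have filled in.
            for label in group:
                if label in user_history:
                    votes[user_history[label]] += 1
    return max(votes, key=votes.get) if votes else None


alice = {"Surname": "Nguyen", "City": "Austin"}
bob = {"Surname": "Lee", "Last name": "Lee", "City": "Reno"}
carol = {"Family name": "Diaz", "Last name": "Diaz"}

print(predict(alice, [bob, carol], "Last name"))  # Nguyen
print(predict(alice, [bob, carol], "Phone"))      # None
```

Here Alice never filled in a "Last name" field, but Bob's history shows that "Surname" and "Last name" take the same value, so Alice's "Surname" value is offered as the prediction.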

Type: Grant

Filed: August 12, 2015

Date of Patent: November 20, 2018

Assignee: Adobe Systems Incorporated

Inventors: Shawn Alan Gaither, Eylon Stroh, Priyank Mathur, Randy Swineford

FORM VALUE PREDICTION UTILIZING SYNONYMOUS FIELD RECOGNITION

Publication number: 20170046622

Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed at predicting values for an electronic form. In embodiments, the method can include forming synonym groupings of form field labels for a number of users. The synonym groupings can be based on an analysis of the similarity of form field values that are associated with form field labels. In embodiments, a predictive model may be generated from these synonym groupings. The predictive model can correlate the synonym groupings of one user with synonym groupings of one or more additional users to enable a determination of one or more predicted form field values for the one user based on a queried form field label even though the one user may have never submitted an electronic form with the queried form field label. Other embodiments may be described and/or claimed.

Type: Application

Filed: August 12, 2015

Publication date: February 16, 2017

Inventors: SHAWN ALAN GAITHER, EYLON STROH, PRIYANK MATHUR, RANDY SWINEFORD