Patents by Inventor Christian Reisswig
Christian Reisswig has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250103815
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
Type: Application
Filed: December 10, 2024
Publication date: March 27, 2025
Applicant: SAP SE
Inventor: Christian Reisswig
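The pairwise classification step described in this abstract can be pictured with a minimal sketch. It enumerates every pair of tokens, asks a classifier for a relationship type, and records the results as graph edges. The lookup table `TOY_RELATIONS` and the function `classify_pair` are hypothetical stand-ins for the trained neural network classifier, and the tokens and relationship names are invented for illustration; none of this is taken from the patent itself.

```python
from itertools import combinations

# Hypothetical stand-in for the trained classifier: a fixed lookup table
# mapping an ordered token pair to a relationship type.
TOY_RELATIONS = {
    ("Total:", "42.00"): "key-value",  # directed: key -> value
    ("42.00", "EUR"): "peer",          # undirected peer relationship
}

def classify_pair(a, b):
    """Return a relationship type for the pair, or None if unrelated."""
    return TOY_RELATIONS.get((a, b))

def extract_graph(tokens):
    """Classify every pairwise combination of tokens and collect the edges."""
    edges = []
    for i, j in combinations(range(len(tokens)), 2):
        relation = classify_pair(tokens[i], tokens[j])
        if relation is not None:
            edges.append((i, j, relation))
    return edges

tokens = ["Total:", "42.00", "EUR"]
graph = extract_graph(tokens)  # edges as (vertex index, vertex index, type)
```

Because every pair is examined, an edge can link tokens regardless of where they sit on the page, which matches the abstract's point that relationships are not restricted by spatial proximity.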
-
Patent number: 12204860
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
Type: Grant
Filed: February 22, 2023
Date of Patent: January 21, 2025
Assignee: SAP SE
Inventor: Christian Reisswig
-
Publication number: 20230334309
Abstract: Systems, methods, and computer-readable media for generating a synthetic training data set from an original unstructured electronic document are disclosed. The synthetic training data set may be used to train a deep learning model to extract data from the original electronic document. The original electronic document may comprise annotated data fields. Each annotated data field may comprise a bounding box and a label. The original electronic document may comprise a header, a table, and a footer. Macro augmentation operations may be applied to the original electronic document to create sub-templates representative of distinct page layouts in the original electronic document. The synthetic training data set may be generated by applying geometric and semantic data augmentations to the sub-templates and the original electronic documents. The synthetic training data set may then be provided to the deep learning model for training.
Type: Application
Filed: April 14, 2022
Publication date: October 19, 2023
Inventors: Alexey Streltsov, Monit Shah Singh, Dhananjay Tomar, Christian Reisswig, Minh Duc Bui
-
Patent number: 11763094
Abstract: Natural language processing systems and methods are disclosed herein. In some embodiments, digital document information comprising text is received. The digital document information may be processed through word and character encoding operations to generate word and character vectors while retaining document location information for the words and characters. The data may then be processed by a series of convolution and maximum pooling operations to obtain maximum valued elements from the data. The document location information as well as the maximum valued element data may be further processed for semantic classification of the data using a semantic classifier and bounding box regression.
Type: Grant
Filed: May 13, 2021
Date of Patent: September 19, 2023
Assignee: SAP SE
Inventor: Christian Reisswig
-
Publication number: 20230206000
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
Type: Application
Filed: February 22, 2023
Publication date: June 29, 2023
Applicant: SAP SE
Inventor: Christian Reisswig
-
Patent number: 11615246
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
Type: Grant
Filed: June 3, 2020
Date of Patent: March 28, 2023
Assignee: SAP SE
Inventor: Christian Reisswig
-
Publication number: 20230075369
Abstract: Systems and methods include training of each of a plurality of models based on a first set of training data comprising a first plurality of pairs, each of the first plurality of pairs comprising a feature and a corresponding label, inputting of each of a plurality of features into each of the plurality of trained models to generate, for each feature of the plurality of features, a plurality of output labels, determining, for each of the plurality of features, a pseudo-label based on the plurality of output labels generated for the feature, determining a second set of training data comprising a second plurality of pairs, each of the second plurality of pairs comprising one of the plurality of features and a pseudo-label determined for the one of the plurality of features, and training an inference model to output an inferred label based on the first set of training data and the second set of training data.
Type: Application
Filed: September 8, 2021
Publication date: March 9, 2023
Inventors: Sohyeong Kim, Christian Reisswig
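The sequence of steps in this abstract can be sketched in a few lines. The example trains several toy models on labeled pairs, lets them label new features, derives one pseudo-label per feature, and trains a final model on both sets. The `ThresholdModel` class, its `offset` parameter, the sample data, and the use of a majority vote are all assumptions made for illustration; the abstract only says the pseudo-label is "based on" the models' output labels.

```python
from collections import Counter

class ThresholdModel:
    """Toy 1-D classifier: predicts 1 when the feature exceeds a threshold."""
    def __init__(self, offset):
        self.offset = offset  # hypothetical per-model shift, so models differ
        self.threshold = 0.0

    def fit(self, pairs):
        # Put the threshold midway between the class means, then shift it.
        zeros = [x for x, y in pairs if y == 0]
        ones = [x for x, y in pairs if y == 1]
        midpoint = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        self.threshold = midpoint + self.offset
        return self

    def predict(self, x):
        return 1 if x > self.threshold else 0

def pseudo_label(models, feature):
    """Majority vote over the labels output by the trained models (assumed rule)."""
    votes = Counter(m.predict(feature) for m in models)
    return votes.most_common(1)[0][0]

labeled = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]  # first set: (feature, label)
unlabeled = [0.5, 4.8, 5.6, 9.5]                    # features without labels

models = [ThresholdModel(o).fit(labeled) for o in (-0.5, 0.0, 0.5)]
pseudo = [(x, pseudo_label(models, x)) for x in unlabeled]  # second set
inference_model = ThresholdModel(0.0).fit(labeled + pseudo)
```

An odd number of binary models avoids tied votes here; a real ensemble would need an explicit tie-breaking rule.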
-
Patent number: 11557140
Abstract: Disclosed herein are system, method, and computer program product embodiments for correcting extracted document information based on generated confidence and correctness scores. In an embodiment, a document correcting system may receive a document and document information that represents information extracted from the document. The document correcting system may determine the correctness of the document information by processing the document to generate a character grid representing textual information and spatial arrangements for the text within the document. The document correcting system may apply a convolutional neural network to the character grid and the document information. The convolutional neural network may output corrected document information, a correctness value indicating the possible errors in the document information, and a confidence value indicating a likelihood of the possible errors.
Type: Grant
Filed: November 30, 2020
Date of Patent: January 17, 2023
Assignee: SAP SE
Inventor: Christian Reisswig
-
Patent number: 11514489
Abstract: Disclosed herein are various embodiments for targeted document information extraction. An embodiment operates by receiving a document associated with a particular customer of a plurality of customers. It is determined whether to use a global processor or template processor to analyze the document based on whether one or more customer templates are associated with the particular customer. Which of the one or more templates associated with the particular customer correspond to the document is identified. The document is compared to the identified template associated with the customer. Information is extracted from the document based on the identified template and the identified plurality of variations. The extracted information for the document is output.
Type: Grant
Filed: January 6, 2021
Date of Patent: November 29, 2022
Assignee: SAP SE
Inventors: Ying Jiang, Christian Reisswig
-
Publication number: 20220366301
Abstract: In an example embodiment, a confidence score is computed for a predicted label (from a first model) for information extracted from a document. The confidence score is computed using a machine learned model different than the first model which is based on a Sliding-Window method. The Sliding-Window method may be based on convolutional neural network classification, using sliding windows. It receives as input (1) the string of extracted information from an independent previous information extraction step (the “input text”), (2) the string's predicted class label, (3) the string's coordinate location in the document, and (4) the text of the document (for additional context information). The Sliding-Window method's task is to predict the confidence score to determine the correctness of the predicted label for the information.
Type: Application
Filed: June 22, 2021
Publication date: November 17, 2022
Inventors: Nurzat Rakhmanberdieva, Alexey Streltsov, Christian Reisswig
-
Publication number: 20220366144
Abstract: Natural language processing systems and methods are disclosed herein. In some embodiments, digital document information comprising text is received. The digital document information may be processed through word and character encoding operations to generate word and character vectors while retaining document location information for the words and characters. The data may then be processed by a series of convolution and maximum pooling operations to obtain maximum valued elements from the data. The document location information as well as the maximum valued element data may be further processed for semantic classification of the data using a semantic classifier and bounding box regression.
Type: Application
Filed: May 13, 2021
Publication date: November 17, 2022
Inventor: Christian Reisswig
-
Patent number: 11488020
Abstract: Technologies are described for performing adaptive high-resolution digital image processing using neural networks. For example, a number of different regions can be defined representing portions of a digital image. One of the regions covers the entire digital image at a reduced resolution. The other regions cover less than the entire digital image at resolutions higher than the region covering the entire digital image. Neural networks are then used to process each of the regions. The neural networks share information using prolongation and restriction operations. Prolongation operations propagate activations from a neural network operating on a lower resolution region to context zones of a neural network operating on a higher resolution region. Restriction operations propagate activations from the neural network operating on the higher resolution region back to the neural network operating on the lower resolution region.
Type: Grant
Filed: June 2, 2020
Date of Patent: November 1, 2022
Assignee: SAP SE
Inventors: Christian Reisswig, Shachar Klaiman
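To make the terms prolongation and restriction concrete, here is a minimal sketch of two common grid-transfer operators on a 2-D activation map. The abstract does not say which operators are used, so value replication for `prolong` and block averaging for `restrict` are assumptions, as are the function names and the factor of 2. The sketch also transfers whole maps and does not model the context zones mentioned in the abstract.

```python
def prolong(coarse, factor=2):
    """Move activations to a finer grid by repeating each value."""
    fine = []
    for row in coarse:
        wide = [v for v in row for _ in range(factor)]
        fine.extend([list(wide) for _ in range(factor)])
    return fine

def restrict(fine, factor=2):
    """Move activations to a coarser grid by averaging factor x factor blocks."""
    rows, cols = len(fine) // factor, len(fine[0]) // factor
    coarse = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [fine[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        coarse.append(out_row)
    return coarse

coarse = [[1.0, 2.0], [3.0, 4.0]]
fine = prolong(coarse)       # 4 x 4 map on the higher resolution grid
round_trip = restrict(fine)  # averaging the replicated blocks recovers coarse
```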
-
Publication number: 20220215446
Abstract: Disclosed herein are various embodiments for targeted document information extraction. An embodiment operates by receiving a document associated with a particular customer of a plurality of customers. It is determined whether to use a global processor or template processor to analyze the document based on whether one or more customer templates are associated with the particular customer. Which of the one or more templates associated with the particular customer correspond to the document is identified. The document is compared to the identified template associated with the customer. Information is extracted from the document based on the identified template and the identified plurality of variations. The extracted information for the document is output.
Type: Application
Filed: January 6, 2021
Publication date: July 7, 2022
Inventors: Ying Jiang, Christian Reisswig
-
Publication number: 20220171967
Abstract: Disclosed herein are system, method, and computer program product embodiments for correcting extracted document information based on generated confidence and correctness scores. In an embodiment, a document correcting system may receive a document and document information that represents information extracted from the document. The document correcting system may determine the correctness of the document information by processing the document to generate a character grid representing textual information and spatial arrangements for the text within the document. The document correcting system may apply a convolutional neural network to the character grid and the document information. The convolutional neural network may output corrected document information, a correctness value indicating the possible errors in the document information, and a confidence value indicating a likelihood of the possible errors.
Type: Application
Filed: November 30, 2020
Publication date: June 2, 2022
Inventor: Christian Reisswig
-
Publication number: 20220092328
Abstract: Disclosed herein are system, method, and computer program product embodiments for querying document terms and identifying target data from documents. In an embodiment, a document processing system may receive a document and a query string. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters of the document. The document processing system may generate a two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid and the query string to identify target data from the document corresponding to the query string. The convolutional neural network may then produce a segmentation mask and/or bounding boxes to identify the targeted data.
Type: Application
Filed: September 23, 2020
Publication date: March 24, 2022
Inventors: Johannes Hoehne, Christian Reisswig
-
Patent number: 11281928
Abstract: Disclosed herein are system, method, and computer program product embodiments for querying document terms and identifying target data from documents. In an embodiment, a document processing system may receive a document and a query string. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters of the document. The document processing system may generate a two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid and the query string to identify target data from the document corresponding to the query string. The convolutional neural network may then produce a segmentation mask and/or bounding boxes to identify the targeted data.
Type: Grant
Filed: September 23, 2020
Date of Patent: March 22, 2022
Assignee: SAP SE
Inventors: Johannes Hoehne, Christian Reisswig
-
Patent number: 11275934
Abstract: Disclosed herein are system, method, and computer program product embodiments for generating document labels using positional embeddings. In an embodiment, a label system may identify tokens, such as words, of a document image. The label system may apply a position vector neural network to the document image to analyze the pixels and determine positional embedding vectors corresponding to the words. The label system may then combine the positional embedding vectors with corresponding word vectors for use as an input to a neural network trained to generate document labels. This combination may embed the positional information with the corresponding word information in a serialized manner for processing by the document label neural network. Using this formatting, the label system may generate document labels in a light-weight and fast manner while still preserving spatial relationships between words.
Type: Grant
Filed: November 20, 2019
Date of Patent: March 15, 2022
Assignee: SAP SE
Inventors: Christian Reisswig, Stefan Klaus Baur
-
Patent number: 11244208
Abstract: Disclosed herein are system, method, and computer program product embodiments for processing a document. In an embodiment, a document processing system may receive a document. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters. The document processing system may generate a down-sampled two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid to obtain semantic meaning for the document. The convolutional neural network may produce a segmentation mask and bounding boxes to correspond to the document.
Type: Grant
Filed: December 12, 2019
Date of Patent: February 8, 2022
Assignee: SAP SE
Inventors: Christian Reisswig, Anoop Raveendra Katti, Steffen Bickel, Johannes Hoehne, Jean Baptiste Faddoul
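A down-sampled two-dimensional character grid of the kind this abstract mentions could be built roughly as follows. The sketch takes character boxes, as an OCR step might return them, and writes each character's code point into the coarse grid cells its box covers. The function name `build_char_grid`, the tuple layout of the OCR output, the 10-pixel cell size, and the use of `ord` as the character encoding are assumptions for illustration, not details from the patent.

```python
def build_char_grid(chars, page_w, page_h, cell=10):
    """Rasterize OCR characters into a down-sampled 2-D grid of character codes.

    chars: list of (character, x0, y0, x1, y1) boxes in page pixel coordinates.
    Each grid cell covers cell x cell pixels; 0 marks background.
    """
    cols = -(-page_w // cell)  # ceiling division
    rows = -(-page_h // cell)
    grid = [[0] * cols for _ in range(rows)]
    for ch, x0, y0, x1, y1 in chars:
        # Fill every grid cell overlapped by the character's bounding box.
        for r in range(y0 // cell, min(rows, -(-y1 // cell))):
            for c in range(x0 // cell, min(cols, -(-x1 // cell))):
                grid[r][c] = ord(ch)
    return grid

# Hypothetical OCR output: a tall "A" at the left and a small "b" further right.
ocr = [("A", 0, 0, 10, 20), ("b", 20, 0, 30, 10)]
grid = build_char_grid(ocr, page_w=40, page_h=20)
```

The resulting grid keeps each character's position on the page, so a convolutional network applied to it sees both the text and its layout.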
-
Publication number: 20210383067
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
Type: Application
Filed: June 3, 2020
Publication date: December 9, 2021
Applicant: SAP SE
Inventor: Christian Reisswig
-
Publication number: 20210374548
Abstract: Technologies are described for performing adaptive high-resolution digital image processing using neural networks. For example, a number of different regions can be defined representing portions of a digital image. One of the regions covers the entire digital image at a reduced resolution. The other regions cover less than the entire digital image at resolutions higher than the region covering the entire digital image. Neural networks are then used to process each of the regions. The neural networks share information using prolongation and restriction operations. Prolongation operations propagate activations from a neural network operating on a lower resolution region to context zones of a neural network operating on a higher resolution region. Restriction operations propagate activations from the neural network operating on the higher resolution region back to the neural network operating on the lower resolution region.
Type: Application
Filed: June 2, 2020
Publication date: December 2, 2021
Applicant: SAP SE
Inventors: Christian Reisswig, Shachar Klaiman