Patents by Inventor Christian Reisswig

Christian Reisswig has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250103815
    Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
    Type: Application
    Filed: December 10, 2024
    Publication date: March 27, 2025
    Applicant: SAP SE
    Inventor: Christian Reisswig
  • Patent number: 12204860
    Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
    Type: Grant
    Filed: February 22, 2023
    Date of Patent: January 21, 2025
    Assignee: SAP SE
    Inventor: Christian Reisswig
  • Publication number: 20230334309
    Abstract: Systems, methods, and computer-readable media for generating a synthetic training data set from an original unstructured electronic document are disclosed. The synthetic training data set may be used to train a deep learning model to extract data from the original electronic document. The original electronic document may comprise annotated data fields. Each annotated data field may comprise a bounding box and a label. The original electronic document may comprise a header, a table, and a footer. Macro augmentation operations may be applied to the original electronic document to create sub-templates representative of distinct page layouts in the original electronic document. The synthetic training data set may be generated by applying geometric and semantic data augmentations to the sub-templates and the original electronic documents. The synthetic training data set may then be provided to the deep learning model for training.
    Type: Application
    Filed: April 14, 2022
    Publication date: October 19, 2023
    Inventors: Alexey Streltsov, Monit Shah Singh, Dhananjay Tomar, Christian Reisswig, Minh Duc Bui
  • Patent number: 11763094
    Abstract: Natural language processing systems and methods are disclosed herein. In some embodiments, digital document information comprising text is received. The digital document information may be processed through word and character encoding operations to generate word and character vectors while retaining document location information for the words and characters. The data may then be processed by a series of convolution and maximum pooling operations to obtain maximum valued elements from the data. The document location information as well as the maximum valued element data may be further processed for semantic classification of the data using a semantic classifier and bounding box regression.
    Type: Grant
    Filed: May 13, 2021
    Date of Patent: September 19, 2023
    Assignee: SAP SE
    Inventor: Christian Reisswig
  • Publication number: 20230206000
    Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
    Type: Application
    Filed: February 22, 2023
    Publication date: June 29, 2023
    Applicant: SAP SE
    Inventor: Christian Reisswig
  • Patent number: 11615246
    Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
    Type: Grant
    Filed: June 3, 2020
    Date of Patent: March 28, 2023
    Assignee: SAP SE
    Inventor: Christian Reisswig
  • Publication number: 20230075369
    Abstract: Systems and methods include training of each of a plurality of models based on a first set of training data comprising a first plurality of pairs, each of the first plurality of pairs comprising a feature and a corresponding label, inputting of each of a plurality of features into each of the plurality of trained models to generate, for each feature of the plurality of features, a plurality of output labels, determining, for each of the plurality of features, a pseudo-label based on the plurality of output labels generated for the feature, determining a second set of training data comprising a second plurality of pairs, each of the second plurality of pairs comprising one of the plurality of features and a pseudo-label determined for the one of the plurality of features, and training an inference model to output an inferred label based on the first set of training data and the second set of training data.
    Type: Application
    Filed: September 8, 2021
    Publication date: March 9, 2023
    Inventors: Sohyeong Kim, Christian Reisswig
  • Patent number: 11557140
    Abstract: Disclosed herein are system, method, and computer program product embodiments for correcting extracted document information based on generated confidence and correctness scores. In an embodiment, a document correcting system may receive a document and document information that represents information extracted from the document. The document correcting system may determine the correctness of the document information by processing the document to generate a character grid representing textual information and spatial arrangements for the text within the document. The document correcting system may apply a convolutional neural network on the character grid and the document information. The convolutional neural network may output corrected document information, a correctness value indicating the possible errors in the document information, and a confidence value indicating a likelihood of the possible errors.
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: January 17, 2023
    Assignee: SAP SE
    Inventor: Christian Reisswig
  • Patent number: 11514489
    Abstract: Disclosed herein are various embodiments for targeted document information extraction. An embodiment operates by receiving a document associated with a particular customer of a plurality of customers. It is determined whether to use a global processor or template processor to analyze the document based on whether one or more customer templates are associated with the particular customer. Which of the one or more templates associated with the particular customer correspond to the document is identified. The document is compared to the identified template associated with the customer. Information is extracted from the document based on the identified template and the identified plurality of variations. The extracted information for the document is output.
    Type: Grant
    Filed: January 6, 2021
    Date of Patent: November 29, 2022
    Assignee: SAP SE
    Inventors: Ying Jiang, Christian Reisswig
  • Publication number: 20220366301
    Abstract: In an example embodiment, a confidence score is computed for a predicted label (from a first model) for information extracted from a document. The confidence score is computed using a machine-learned model, different from the first model, that is based on a Sliding-Window method. The Sliding-Window method may be based on convolutional neural network classification, using sliding windows. It receives as input (1) the string of extracted information from an independent previous information extraction step (the “input text”), (2) the string's predicted class label, (3) the string's coordinate location in the document, and (4) the text of the document (for additional context information). The Sliding-Window method's task is to predict the confidence score to determine the correctness of the predicted label for the information.
    Type: Application
    Filed: June 22, 2021
    Publication date: November 17, 2022
    Inventors: Nurzat Rakhmanberdieva, Alexey Streltsov, Christian Reisswig
  • Publication number: 20220366144
    Abstract: Natural language processing systems and methods are disclosed herein. In some embodiments, digital document information comprising text is received. The digital document information may be processed through word and character encoding operations to generate word and character vectors while retaining document location information for the words and characters. The data may then be processed by a series of convolution and maximum pooling operations to obtain maximum valued elements from the data. The document location information as well as the maximum valued element data may be further processed for semantic classification of the data using a semantic classifier and bounding box regression.
    Type: Application
    Filed: May 13, 2021
    Publication date: November 17, 2022
    Inventor: Christian Reisswig
  • Patent number: 11488020
    Abstract: Technologies are described for performing adaptive high-resolution digital image processing using neural networks. For example, a number of different regions can be defined representing portions of a digital image. One of the regions covers the entire digital image at a reduced resolution. The other regions cover less than the entire digital image at resolutions higher than the region covering the entire digital image. Neural networks are then used to process each of the regions. The neural networks share information using prolongation and restriction operations. Prolongation operations propagate activations from a neural network operating on a lower resolution region to context zones of a neural network operating on a higher resolution region. Restriction operations propagate activations from the neural network operating on the higher resolution region back to the neural network operating on the lower resolution region.
    Type: Grant
    Filed: June 2, 2020
    Date of Patent: November 1, 2022
    Assignee: SAP SE
    Inventors: Christian Reisswig, Shachar Klaiman
  • Publication number: 20220215446
    Abstract: Disclosed herein are various embodiments for targeted document information extraction. An embodiment operates by receiving a document associated with a particular customer of a plurality of customers. It is determined whether to use a global processor or template processor to analyze the document based on whether one or more customer templates are associated with the particular customer. Which of the one or more templates associated with the particular customer correspond to the document is identified. The document is compared to the identified template associated with the customer. Information is extracted from the document based on the identified template and the identified plurality of variations. The extracted information for the document is output.
    Type: Application
    Filed: January 6, 2021
    Publication date: July 7, 2022
    Inventors: Ying Jiang, Christian Reisswig
  • Publication number: 20220171967
    Abstract: Disclosed herein are system, method, and computer program product embodiments for correcting extracted document information based on generated confidence and correctness scores. In an embodiment, a document correcting system may receive a document and document information that represents information extracted from the document. The document correcting system may determine the correctness of the document information by processing the document to generate a character grid representing textual information and spatial arrangements for the text within the document. The document correcting system may apply a convolutional neural network on the character grid and the document information. The convolutional neural network may output corrected document information, a correctness value indicating the possible errors in the document information, and a confidence value indicating a likelihood of the possible errors.
    Type: Application
    Filed: November 30, 2020
    Publication date: June 2, 2022
    Inventor: Christian Reisswig
  • Publication number: 20220092328
    Abstract: Disclosed herein are system, method, and computer program product embodiments for querying document terms and identifying target data from documents. In an embodiment, a document processing system may receive a document and a query string. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters of the document. The document processing system may generate a two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid and the query string to identify target data from the document corresponding to the query string. The convolutional neural network may then produce a segmentation mask and/or bounding boxes to identify the targeted data.
    Type: Application
    Filed: September 23, 2020
    Publication date: March 24, 2022
    Inventors: Johannes Hoehne, Christian Reisswig
  • Patent number: 11281928
    Abstract: Disclosed herein are system, method, and computer program product embodiments for querying document terms and identifying target data from documents. In an embodiment, a document processing system may receive a document and a query string. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters of the document. The document processing system may generate a two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid and the query string to identify target data from the document corresponding to the query string. The convolutional neural network may then produce a segmentation mask and/or bounding boxes to identify the targeted data.
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: March 22, 2022
    Assignee: SAP SE
    Inventors: Johannes Hoehne, Christian Reisswig
  • Patent number: 11275934
    Abstract: Disclosed herein are system, method, and computer program product embodiments for generating document labels using positional embeddings. In an embodiment, a label system may identify tokens, such as words, of a document image. The label system may apply a position vector neural network to the document image to analyze the pixels and determine positional embedding vectors corresponding to the words. The label system may then combine the positional embedding vectors with corresponding word vectors for use as an input to a neural network trained to generate document labels. This combination may embed the positional information with the corresponding word information in a serialized manner for processing by the document label neural network. Using this formatting, the label system may generate document labels in a light-weight and fast manner while still preserving spatial relationships between words.
    Type: Grant
    Filed: November 20, 2019
    Date of Patent: March 15, 2022
    Assignee: SAP SE
    Inventors: Christian Reisswig, Stefan Klaus Baur
  • Patent number: 11244208
    Abstract: Disclosed herein are system, method, and computer program product embodiments for processing a document. In an embodiment, a document processing system may receive a document. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters. The document processing system may generate a down-sampled two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid to obtain semantic meaning for the document. The convolutional neural network may produce a segmentation mask and bounding boxes to correspond to the document.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: February 8, 2022
    Assignee: SAP SE
    Inventors: Christian Reisswig, Anoop Raveendra Katti, Steffen Bickel, Johannes Hoehne, Jean Baptiste Faddoul
  • Publication number: 20210383067
    Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
    Type: Application
    Filed: June 3, 2020
    Publication date: December 9, 2021
    Applicant: SAP SE
    Inventor: Christian Reisswig
  • Publication number: 20210374548
    Abstract: Technologies are described for performing adaptive high-resolution digital image processing using neural networks. For example, a number of different regions can be defined representing portions of a digital image. One of the regions covers the entire digital image at a reduced resolution. The other regions cover less than the entire digital image at resolutions higher than the region covering the entire digital image. Neural networks are then used to process each of the regions. The neural networks share information using prolongation and restriction operations. Prolongation operations propagate activations from a neural network operating on a lower resolution region to context zones of a neural network operating on a higher resolution region. Restriction operations propagate activations from the neural network operating on the higher resolution region back to the neural network operating on the lower resolution region.
    Type: Application
    Filed: June 2, 2020
    Publication date: December 2, 2021
    Applicant: SAP SE
    Inventors: Christian Reisswig, Shachar Klaiman
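
Several of the abstracts above (for example patent numbers 11557140, 11281928, and 11244208) mention generating a two-dimensional character grid from OCR character and position information before applying a convolutional neural network. The following is only a minimal sketch of that general idea, not the method claimed in any of the patents. The names `OcrChar` and `build_char_grid`, the use of Unicode code points as cell values, and the choice of 0 for empty cells are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OcrChar:
    """Hypothetical OCR result: one character and its top-left position."""
    char: str
    x: float  # horizontal position in page units
    y: float  # vertical position in page units


def build_char_grid(chars: List[OcrChar], page_width: float, page_height: float,
                    cols: int, rows: int) -> List[List[int]]:
    """Map OCR characters onto a down-sampled rows x cols grid.

    Assumption: each cell stores the Unicode code point of the character
    that falls into it, and 0 marks an empty cell. If several characters
    land in the same cell, the last one wins.
    """
    grid = [[0] * cols for _ in range(rows)]
    for c in chars:
        # Scale page coordinates to grid indices, clamping to the last cell.
        col = min(int(c.x / page_width * cols), cols - 1)
        row = min(int(c.y / page_height * rows), rows - 1)
        grid[row][col] = ord(c.char)
    return grid


# Usage: two characters on a 100 x 100 page, down-sampled to 2 rows x 4 columns.
example_grid = build_char_grid(
    [OcrChar("A", 0.0, 0.0), OcrChar("B", 50.0, 50.0)],
    page_width=100.0, page_height=100.0, cols=4, rows=2,
)
```

A grid like this keeps the spatial arrangement of the text, which is what lets a convolutional network treat the document like an image; a real system would likely use a learned or one-hot character encoding rather than raw code points.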