Patents by Inventor Christian Reisswig
Christian Reisswig has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20220092328
Abstract: Disclosed herein are system, method, and computer program product embodiments for querying document terms and identifying target data from documents. In an embodiment, a document processing system may receive a document and a query string. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters of the document. The document processing system may generate a two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid and the query string to identify target data from the document corresponding to the query string. The convolutional neural network may then produce a segmentation mask and/or bounding boxes to identify the targeted data.
Type: Application
Filed: September 23, 2020
Publication date: March 24, 2022
Inventors: Johannes Hoehne, Christian Reisswig
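The two-dimensional character grid this abstract describes can be illustrated with a minimal sketch (not taken from the patent): each OCR'd character is mapped to an integer index and painted into the grid cell covering its bounding-box center, with 0 meaning background. The function name and ord()-based indexing are illustrative; a real system would use a fixed character vocabulary.

```python
def build_char_grid(chars, page_w, page_h, grid_w, grid_h):
    """Build a down-sampled 2-D character grid from OCR output.

    chars: list of (character, x0, y0, x1, y1) tuples in page pixels.
    Returns a grid_h x grid_w grid of integer character indices (0 = empty).
    """
    grid = [[0] * grid_w for _ in range(grid_h)]
    for ch, x0, y0, x1, y1 in chars:
        # Paint the character index at the cell covering its box center.
        # ord() stands in for a real vocabulary lookup.
        cx = min(grid_w - 1, int((x0 + x1) / 2 / page_w * grid_w))
        cy = min(grid_h - 1, int((y0 + y1) / 2 / page_h * grid_h))
        grid[cy][cx] = ord(ch)
    return grid

# Example: two characters on a 100x100 page, mapped onto a 10x10 grid.
ocr_chars = [("A", 5, 5, 15, 15), ("B", 85, 85, 95, 95)]
grid = build_char_grid(ocr_chars, 100, 100, 10, 10)
```

A convolutional network can then consume this grid (typically one-hot or embedded per cell) in place of raw pixels, which is what lets the claimed system reason over text layout at low resolution.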
-
Patent number: 11281928
Abstract: Disclosed herein are system, method, and computer program product embodiments for querying document terms and identifying target data from documents. In an embodiment, a document processing system may receive a document and a query string. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters of the document. The document processing system may generate a two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid and the query string to identify target data from the document corresponding to the query string. The convolutional neural network may then produce a segmentation mask and/or bounding boxes to identify the targeted data.
Type: Grant
Filed: September 23, 2020
Date of Patent: March 22, 2022
Assignee: SAP SE
Inventors: Johannes Hoehne, Christian Reisswig
-
Patent number: 11275934
Abstract: Disclosed herein are system, method, and computer program product embodiments for generating document labels using positional embeddings. In an embodiment, a label system may identify tokens, such as words, of a document image. The label system may apply a position vector neural network to the document image to analyze the pixels and determine positional embedding vectors corresponding to the words. The label system may then combine the positional embedding vectors to corresponding word vectors for use as an input to a neural network trained to generate document labels. This combination may embed the positional information with the corresponding word information in a serialized manner for processing by the document label neural network. Using this formatting, the label system may generate document labels in a light-weight and fast manner while still preserving spatial relationships between words.
Type: Grant
Filed: November 20, 2019
Date of Patent: March 15, 2022
Assignee: SAP SE
Inventors: Christian Reisswig, Stefan Klaus Baur
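The combination of positional embedding vectors with word vectors can be sketched as follows. This is a simplified, hypothetical illustration (Transformer-style sinusoidal embeddings added element-wise); the patent does not specify this particular embedding.

```python
import math

def pos_embedding(coord, dim):
    """Fixed sinusoidal embedding of a single coordinate (illustrative)."""
    emb = []
    for i in range(dim):
        freq = 1.0 / (10000 ** (2 * (i // 2) / dim))
        angle = coord * freq
        emb.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return emb

def embed_token(word_vec, x, y):
    """Add x- and y-position embeddings onto the word vector, so each
    serialized token carries its 2-D page position into the label model."""
    dim = len(word_vec)
    px, py = pos_embedding(x, dim), pos_embedding(y, dim)
    return [w + a + b for w, a, b in zip(word_vec, px, py)]

# A zero word vector makes the positional contribution visible on its own.
tok = embed_token([0.0] * 8, x=12, y=34)
```

Because position is folded into each token vector, the downstream network can process the tokens as an ordinary sequence while still seeing where each word sat on the page.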
-
Patent number: 11244208
Abstract: Disclosed herein are system, method, and computer program product embodiments for processing a document. In an embodiment, a document processing system may receive a document. The document processing system may perform optical character recognition to obtain character information and positioning information for the characters. The document processing system may generate a down-sampled two-dimensional character grid for the document. The document processing system may apply a convolutional neural network to the character grid to obtain semantic meaning for the document. The convolutional neural network may produce a segmentation mask and bounding boxes corresponding to the document.
Type: Grant
Filed: December 12, 2019
Date of Patent: February 8, 2022
Assignee: SAP SE
Inventors: Christian Reisswig, Anoop Raveendra Katti, Steffen Bickel, Johannes Hoehne, Jean Baptiste Faddoul
-
Publication number: 20210383067
Abstract: Methods and apparatus are disclosed for extracting structured content, as graphs, from text documents. Graph vertices and edges correspond to document tokens and pairwise relationships between tokens. Undirected peer relationships and directed relationships (e.g. key-value or composition) are supported. Vertices can be identified with predefined fields, and thence mapped to database columns for automated storage of document content in a database. A trained neural network classifier determines relationship classifications for all pairwise combinations of input tokens. The relationship classification can differentiate multiple relationship types. A multi-level classifier extracts multi-level graph structure from a document. Disclosed embodiments support arbitrary graph structures with hierarchical and planar relationships. Relationships are not restricted by spatial proximity or document layout. Composite tokens can be identified interspersed with other content.
Type: Application
Filed: June 3, 2020
Publication date: December 9, 2021
Applicant: SAP SE
Inventor: Christian Reisswig
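The pairwise classification step above can be sketched with a toy rule-based stand-in for the trained neural classifier: every ordered pair of tokens is scored, and pairs classified as a relationship become graph edges. The rule, token format, and names here are hypothetical.

```python
from itertools import permutations

def classify_pair(a, b):
    """Toy stand-in for the trained pairwise relationship classifier:
    a token ending in ':' is treated as a key for a token to its right
    on the same line. A real system would use learned features."""
    if a["text"].endswith(":") and b["x"] > a["x"] and b["y"] == a["y"]:
        return "key_value"
    return "none"

def extract_edges(tokens):
    """Classify all ordered token pairs; keep non-'none' results as edges."""
    return [(a["text"], b["text"], rel)
            for a, b in permutations(tokens, 2)
            if (rel := classify_pair(a, b)) != "none"]

tokens = [{"text": "Total:", "x": 0, "y": 5},
          {"text": "42.00", "x": 10, "y": 5},
          {"text": "Invoice", "x": 0, "y": 0}]
edges = extract_edges(tokens)
```

Note that classifying all O(n²) pairs, rather than only spatially adjacent ones, is what lets the approach capture relationships unconstrained by layout, as the abstract emphasizes.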
-
Publication number: 20210374548
Abstract: Technologies are described for performing adaptive high-resolution digital image processing using neural networks. For example, a number of different regions can be defined representing portions of a digital image. One of the regions covers the entire digital image at a reduced resolution. The other regions cover less than the entire digital image at resolutions higher than the region covering the entire digital image. Neural networks are then used to process each of the regions. The neural networks share information using prolongation and restriction operations. Prolongation operations propagate activations from a neural network operating on a lower resolution region to context zones of a neural network operating on a higher resolution region. Restriction operations propagate activations from the neural network operating on the higher resolution region back to the neural network operating on the lower resolution region.
Type: Application
Filed: June 2, 2020
Publication date: December 2, 2021
Applicant: SAP SE
Inventors: Christian Reisswig, Shachar Klaiman
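The prolongation and restriction operations can be illustrated on plain 2-D activation grids. This is a deliberately minimal sketch, assuming nearest-neighbour upsampling for prolongation and average pooling for restriction; the patented system would apply these between actual network layers.

```python
def prolong(coarse, factor):
    """Prolongation: nearest-neighbour upsample a coarse activation grid
    so it can feed context into a finer-resolution region's network."""
    return [[coarse[i // factor][j // factor]
             for j in range(len(coarse[0]) * factor)]
            for i in range(len(coarse) * factor)]

def restrict(fine, factor):
    """Restriction: average-pool a fine activation grid back down to the
    coarse grid's resolution."""
    h, w = len(fine) // factor, len(fine[0]) // factor
    return [[sum(fine[i * factor + di][j * factor + dj]
                 for di in range(factor) for dj in range(factor)) / factor**2
             for j in range(w)]
            for i in range(h)]

coarse = [[1.0, 2.0], [3.0, 4.0]]
fine = prolong(coarse, 2)   # 2x2 -> 4x4
back = restrict(fine, 2)    # 4x4 -> 2x2, recovers the coarse grid here
```

The terminology mirrors multigrid methods in numerical analysis, where information is likewise shuttled between coarse and fine discretizations of the same domain.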
-
Patent number: 11138424
Abstract: Disclosed herein are system, method, and computer program product embodiments for analyzing contextual symbol information for document processing. In an embodiment, a language model system may generate a vector grid that incorporates contextual document information. The language model system may receive a document file and identify symbols of the document file to generate a symbol grid. The language model system may also identify position parameters corresponding to each of the symbols. The language model system may then analyze the symbols using an embedding function and neighboring symbols to determine contextual vector values corresponding to each of the symbols. The language model system may then generate a vector grid mapping the contextual vector values using the position parameters. The contextual information from the vector grid may provide increased document processing accuracy as well as faster processing convergence.
Type: Grant
Filed: November 20, 2019
Date of Patent: October 5, 2021
Assignee: SAP SE
Inventors: Timo Denk, Christian Reisswig
-
Publication number: 20210150202
Abstract: Disclosed herein are system, method, and computer program product embodiments for analyzing contextual symbol information for document processing. In an embodiment, a language model system may generate a vector grid that incorporates contextual document information. The language model system may receive a document file and identify symbols of the document file to generate a symbol grid. The language model system may also identify position parameters corresponding to each of the symbols. The language model system may then analyze the symbols using an embedding function and neighboring symbols to determine contextual vector values corresponding to each of the symbols. The language model system may then generate a vector grid mapping the contextual vector values using the position parameters. The contextual information from the vector grid may provide increased document processing accuracy as well as faster processing convergence.
Type: Application
Filed: November 20, 2019
Publication date: May 20, 2021
Inventors: Timo Denk, Christian Reisswig
-
Publication number: 20210150201
Abstract: Disclosed herein are system, method, and computer program product embodiments for generating document labels using positional embeddings. In an embodiment, a label system may identify tokens, such as words, of a document image. The label system may apply a position vector neural network to the document image to analyze the pixels and determine positional embedding vectors corresponding to the words. The label system may then combine the positional embedding vectors to corresponding word vectors for use as an input to a neural network trained to generate document labels. This combination may embed the positional information with the corresponding word information in a serialized manner for processing by the document label neural network. Using this formatting, the label system may generate document labels in a light-weight and fast manner while still preserving spatial relationships between words.
Type: Application
Filed: November 20, 2019
Publication date: May 20, 2021
Inventors: Christian Reisswig, Stefan Klaus Baur
-
Patent number: 11003861
Abstract: Various examples are directed to systems and methods for classifying text. A computing device may access, from a database, an input sample comprising a first set of ordered words. The computing device may generate a first language model feature vector for the input sample using a word level language model and a second language model feature vector for the input sample using a partial word level language model. The computing device may generate a descriptor of the input sample using a target model, the input sample, the first language model feature vector, and the second language model feature vector and write the descriptor of the input sample to the database.
Type: Grant
Filed: February 13, 2019
Date of Patent: May 11, 2021
Assignee: SAP SE
Inventors: Christian Reisswig, Darko Velkoski, Sohyeong Kim, Hung Tu Dinh
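The idea of deriving both a word-level and a partial-word-level feature vector from the same input can be sketched with trivially simple stand-ins (a bag-of-words count vector and a character-trigram count vector); the actual patent uses learned language models, so everything below is illustrative only.

```python
def word_features(words, vocab):
    """Word-level features: counts over a fixed word vocabulary."""
    return [words.count(w) for w in vocab]

def subword_features(words, ngrams, n=3):
    """Partial-word features: counts of character n-grams over the text,
    a crude stand-in for a subword-level language model."""
    text = " ".join(words)
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    return [grams.count(g) for g in ngrams]

vocab = ["invoice", "total", "paid"]
ngrams = ["inv", "tot", "aid"]
sample = ["invoice", "total", "total"]

# Both feature vectors are concatenated for the downstream target model.
features = word_features(sample, vocab) + subword_features(sample, ngrams)
```

Combining the two granularities lets the target model handle both in-vocabulary words and out-of-vocabulary strings (typos, OCR noise) that only the subword view can represent.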
-
Patent number: 10963645
Abstract: Various examples described herein are directed to systems and methods for analyzing text. A computing device may train an autoencoder language model using a plurality of language model training samples. The autoencoder language model may comprise a first convolutional layer. Also, a first language model training sample of the plurality of language model training samples may comprise a first set of ordered strings comprising a masked string, a first string preceding the masked string in the first set of ordered strings, and a second string after the masked string in the first set of ordered strings. The computing device may generate a first feature vector using an input sample and the autoencoder language model. The computing device may also generate a descriptor of the input sample using a target model, the input sample, and the first feature vector.
Type: Grant
Filed: February 7, 2019
Date of Patent: March 30, 2021
Assignee: SAP SE
Inventors: Christian Reisswig, Darko Velkoski, Sohyeong Kim, Hung Tu Dinh, Faisal El Hussein
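A training sample of the shape the abstract describes (an ordered string sequence with one string masked out and context on both sides) can be constructed as in this small sketch; the mask token and function names are made up for illustration.

```python
import random

def make_masked_sample(strings, mask_token="<MASK>", rng=random):
    """Return (masked sequence, original string, mask position), choosing
    an interior position so context exists before and after the mask."""
    i = rng.randrange(1, len(strings) - 1)
    masked = strings[:i] + [mask_token] + strings[i + 1:]
    return masked, strings[i], i

rng = random.Random(0)  # seeded for reproducibility
sample, target, pos = make_masked_sample(
    ["the", "invoice", "total", "is", "due"], rng=rng)
```

The language model is then trained to reconstruct `target` from `sample`, and its internal features are reused for the downstream target model.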
-
Patent number: 10915788
Abstract: Disclosed herein are system, method, and computer program product embodiments for optical character recognition using end-to-end deep learning. In an embodiment, an optical character recognition system may train a neural network to identify characters of pixel images and to assign index values to the characters. The neural network may also be trained to identify groups of characters and to generate bounding boxes to group these characters. The optical character recognition system may then analyze documents to identify character information based on the pixel data and produce a segmentation mask and one or more bounding box masks. The optical character recognition system may supply these masks as an output or may combine the masks to generate a version of the received document having optically recognized characters.
Type: Grant
Filed: September 6, 2018
Date of Patent: February 9, 2021
Assignee: SAP SE
Inventors: Johannes Hoehne, Anoop Raveendra Katti, Christian Reisswig
-
Patent number: 10915786
Abstract: Disclosed herein are system, method, and computer program product embodiments for providing object detection and filtering operations. An embodiment operates by receiving an image comprising a plurality of pixels and pixel information for each pixel. The pixel information indicates a bounding box corresponding to an object within the image associated with a respective pixel and a confidence score associated with the bounding box for the respective pixel. Pixels that do not correspond to a center of at least one of the bounding boxes are iteratively removed from the plurality of pixels until only a subset of pixels remains, each of which corresponds to a center of at least one of the bounding boxes. Based on the subset, a final bounding box associated with each object of the image is determined based on an overlapping of the bounding boxes of the subset of pixels and the corresponding confidence scores.
Type: Grant
Filed: February 28, 2019
Date of Patent: February 9, 2021
Assignee: SAP SE
Inventors: Johannes Hoehne, Anoop Raveendra Katti, Christian Reisswig, Marco Spinaci
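The final merging step (reducing many overlapping per-pixel box proposals to one box per object, using confidence scores) is closely related to standard greedy non-maximum suppression, sketched below as a stand-in; the patent's pixel-wise filtering differs in detail.

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x1 - x0) * max(0, y1 - y0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-confidence box, drop boxes that
    overlap an already-kept box by more than `thresh` IoU."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < thresh for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping proposals for one object, plus a separate object.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

The first and third boxes survive; the second is suppressed because it overlaps the higher-confidence first box.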
-
Patent number: 10846553
Abstract: Disclosed herein are system, method, and computer program product embodiments for optical character recognition using end-to-end deep learning. In an embodiment, an optical character recognition system may train a neural network to identify characters of pixel images, assign index values to the characters, and recognize different formatting of the characters, such as distinguishing between handwritten and typewritten characters. The neural network may also be trained to identify groups of characters and to generate bounding boxes to group these characters. The optical character recognition system may then analyze documents to identify character information based on the pixel data and produce segmentation masks, such as a type grid segmentation mask, and one or more bounding box masks. The optical character recognition system may supply these masks as an output or may combine the masks to generate a version of the received document having optically recognized characters.
Type: Grant
Filed: March 20, 2019
Date of Patent: November 24, 2020
Assignee: SAP SE
Inventors: Johannes Hoehne, Christian Reisswig, Anoop Raveendra Katti, Marco Spinaci
-
Patent number: 10824808
Abstract: Disclosed herein are system, method, and computer program product embodiments for robust key value extraction. In an embodiment, one or more hierarchical concept units (HCUs) may be configured to extract key value and hierarchical information from text inputs. The HCUs may use a convolutional neural network, a recurrent neural network, and feature selectors to analyze the text inputs using machine learning techniques to extract the key value and hierarchical information. Multiple HCUs may be used together and configured to identify different categories of hierarchical information. While multiple HCUs may be used, each may use a skip connection to transmit extracted information to a feature concatenation layer. This allows an HCU to directly send a concept that has been identified as important to the feature concatenation layer and bypass other HCUs.
Type: Grant
Filed: November 20, 2018
Date of Patent: November 3, 2020
Assignee: SAP SE
Inventors: Christian Reisswig, Eduardo Vellasques, Sohyeong Kim, Darko Velkoski, Hung Tu Dinh
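The stacking-with-skip-connections structure described above can be sketched with toy units: each unit feeds the next one and also sends its own extracted features straight to a final concatenation layer. The "HCUs" here are trivial tagging functions, not the patented CNN/RNN units; names are illustrative.

```python
def make_hcu(name):
    """A toy hierarchical concept unit: tags each feature with its name.
    Returns (features for the next unit, features for the skip connection)."""
    def hcu(features):
        extracted = [f"{name}:{f}" for f in features]
        return extracted, extracted
    return hcu

def run_stack(hcus, inputs):
    """Run HCUs in sequence; every unit's skip output bypasses the rest of
    the stack and lands in the feature concatenation layer."""
    skips, current = [], inputs
    for hcu in hcus:
        current, skip = hcu(current)
        skips.extend(skip)  # skip connection to the concatenation layer
    return skips            # the feature concatenation layer's input

out = run_stack([make_hcu("h1"), make_hcu("h2")], ["date"])
```

Because each unit's output reaches the concatenation layer directly, a concept identified early is not diluted or discarded by later units, which is the robustness argument in the abstract.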
-
Publication number: 20200302208
Abstract: Disclosed herein are system, method, and computer program product embodiments for optical character recognition using end-to-end deep learning. In an embodiment, an optical character recognition system may train a neural network to identify characters of pixel images, assign index values to the characters, and recognize different formatting of the characters, such as distinguishing between handwritten and typewritten characters. The neural network may also be trained to identify groups of characters and to generate bounding boxes to group these characters. The optical character recognition system may then analyze documents to identify character information based on the pixel data and produce segmentation masks, such as a type grid segmentation mask, and one or more bounding box masks. The optical character recognition system may supply these masks as an output or may combine the masks to generate a version of the received document having optically recognized characters.
Type: Application
Filed: March 20, 2019
Publication date: September 24, 2020
Inventors: Johannes Hoehne, Christian Reisswig, Anoop Raveendra Katti, Marco Spinaci
-
Publication number: 20200279128
Abstract: Disclosed herein are system, method, and computer program product embodiments for providing object detection and filtering operations. An embodiment operates by receiving an image comprising a plurality of pixels and pixel information for each pixel. The pixel information indicates a bounding box corresponding to an object within the image associated with a respective pixel and a confidence score associated with the bounding box for the respective pixel. Pixels that do not correspond to a center of at least one of the bounding boxes are iteratively removed from the plurality of pixels until only a subset of pixels remains, each of which corresponds to a center of at least one of the bounding boxes. Based on the subset, a final bounding box associated with each object of the image is determined based on an overlapping of the bounding boxes of the subset of pixels and the corresponding confidence scores.
Type: Application
Filed: February 28, 2019
Publication date: September 3, 2020
Inventors: Johannes Hoehne, Anoop Raveendra Katti, Christian Reisswig, Marco Spinaci
-
Publication number: 20200258498
Abstract: Various examples described herein are directed to systems and methods for analyzing text. A computing device may train an autoencoder language model using a plurality of language model training samples. The autoencoder language model may comprise a first convolutional layer. Also, a first language model training sample of the plurality of language model training samples may comprise a first set of ordered strings comprising a masked string, a first string preceding the masked string in the first set of ordered strings, and a second string after the masked string in the first set of ordered strings. The computing device may generate a first feature vector using an input sample and the autoencoder language model. The computing device may also generate a descriptor of the input sample using a target model, the input sample, and the first feature vector.
Type: Application
Filed: February 7, 2019
Publication date: August 13, 2020
Inventors: Christian Reisswig, Darko Velkoski, Sohyeong Kim, Hung Tu Dinh, Faisal El Hussein
-
Publication number: 20200257764
Abstract: Various examples are directed to systems and methods for classifying text. A computing device may access, from a database, an input sample comprising a first set of ordered words. The computing device may generate a first language model feature vector for the input sample using a word level language model and a second language model feature vector for the input sample using a partial word level language model. The computing device may generate a descriptor of the input sample using a target model, the input sample, the first language model feature vector, and the second language model feature vector and write the descriptor of the input sample to the database.
Type: Application
Filed: February 13, 2019
Publication date: August 13, 2020
Inventors: Christian Reisswig, Darko Velkoski, Sohyeong Kim, Hung Tu Dinh
-
Publication number: 20200159828
Abstract: Disclosed herein are system, method, and computer program product embodiments for robust key value extraction. In an embodiment, one or more hierarchical concept units (HCUs) may be configured to extract key value and hierarchical information from text inputs. The HCUs may use a convolutional neural network, a recurrent neural network, and feature selectors to analyze the text inputs using machine learning techniques to extract the key value and hierarchical information. Multiple HCUs may be used together and configured to identify different categories of hierarchical information. While multiple HCUs may be used, each may use a skip connection to transmit extracted information to a feature concatenation layer. This allows an HCU to directly send a concept that has been identified as important to the feature concatenation layer and bypass other HCUs.
Type: Application
Filed: November 20, 2018
Publication date: May 21, 2020
Inventors: Christian Reisswig, Eduardo Vellasques, Sohyeong Kim, Darko Velkoski, Hung Tu Dinh