Patents by Inventor Marek Polewczyk

Marek Polewczyk has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

NEURAL NETWORK WORD CLUSTERING SYSTEM

Publication number: 20240177011

Abstract: Various embodiments for a neural network clustering system are described herein. An embodiment operates by detecting a plurality of bounding boxes and identifying coordinates for each of the bounding boxes. An adjacency matrix is generated based on combining a key matrix and a query matrix. The plurality of words are clustered into a plurality of clusters, each cluster corresponding to a different line on the first document. A second document is generated in which the plurality of words corresponding to a respective cluster of the plurality of clusters is arranged on a same line on the second document. The second document is provided for display.
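
As a rough illustration of the idea in this abstract, the sketch below builds a query matrix and a key matrix from word bounding boxes, combines them into an adjacency matrix, and groups words into lines. It is not the patented method: the hand-built features stand in for whatever a trained network would learn, and the names `cluster_lines`, `matmul_t`, and `tol`, the dot-product combination, and the connected-components grouping are all assumptions made for this example.

```python
from typing import List, Tuple

# (text, x0, y0, x1, y1): a word and its bounding-box coordinates
Word = Tuple[str, float, float, float, float]


def matmul_t(q: List[List[float]], k: List[List[float]]) -> List[List[float]]:
    """Return q @ k^T for two lists of row vectors."""
    return [[sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]


def cluster_lines(words: List[Word], tol: float = 5.0) -> List[str]:
    """Group words into lines; tol is an assumed vertical tolerance."""
    centers = [(y0 + y1) / 2 for _, _, y0, _, y1 in words]

    # Assumption: hand-built stand-ins for learned projections, chosen so
    # that query_i . key_j = -(c_i - c_j)^2 for vertical centers c.
    query = [[2 * c, -c * c, -1.0] for c in centers]
    key = [[c, 1.0, c * c] for c in centers]

    # Combine query and key matrices, then threshold into an adjacency matrix.
    scores = matmul_t(query, key)
    adjacency = [[s >= -tol * tol for s in row] for row in scores]

    # Each connected component of the adjacency graph is one line of text.
    n = len(words)
    label = [-1] * n
    clusters: List[List[int]] = []
    for start in range(n):
        if label[start] != -1:
            continue
        label[start] = len(clusters)
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if adjacency[i][j] and label[j] == -1:
                    label[j] = label[start]
                    stack.append(j)
        clusters.append(members)

    # Order lines top to bottom and words left to right within each line.
    clusters.sort(key=lambda m: min(centers[i] for i in m))
    return [
        " ".join(words[i][0] for i in sorted(m, key=lambda i: words[i][1]))
        for m in clusters
    ]
```

Each returned string corresponds to one cluster, so writing the strings out one per line gives a second document in which the words of a cluster share a line.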

Type: Application

Filed: November 29, 2022

Publication date: May 30, 2024

Inventors: Marek Polewczyk, Marco Spinaci, Xiang Yu

Character encoding and decoding for optical character recognition

Patent number: 11816182

Abstract: The present disclosure provides techniques for encoding and decoding characters for optical character recognition. The techniques involve determining sets of numbers for encoding a character set where each number in a particular set of numbers for encoding a particular character is mapped to a graphical unit (e.g., radical) of the particular character. A mapping between each set of numbers in the possible encodings and the character set may be determined based on the closest character already encoded. A machine learning model may be trained to perform optical character recognition using training data labeled using the set of encodings and the mappings.
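
A toy sketch of the encoding idea described in this abstract: each character is represented by a set of numbers, one per graphical unit, and any predicted set of numbers is mapped back to the closest character already encoded. The radical IDs, the three-character table, the fixed code length of two, and the use of Hamming distance are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, Tuple

# Hypothetical radical vocabulary and character decompositions.
RADICAL_IDS: Dict[str, int] = {"女": 1, "子": 2, "马": 3, "日": 4, "月": 5}
DECOMPOSITION: Dict[str, Tuple[str, str]] = {
    "好": ("女", "子"),
    "妈": ("女", "马"),
    "明": ("日", "月"),
}


def encode(ch: str) -> Tuple[int, ...]:
    """Encode a character as one number per graphical unit."""
    return tuple(RADICAL_IDS[r] for r in DECOMPOSITION[ch])


ENCODINGS: Dict[str, Tuple[int, ...]] = {ch: encode(ch) for ch in DECOMPOSITION}


def decode(code: Tuple[int, ...]) -> str:
    """Map any code to the closest encoded character (Hamming distance)."""

    def distance(ch: str) -> int:
        return sum(a != b for a, b in zip(ENCODINGS[ch], code))

    return min(ENCODINGS, key=distance)
```

In this sketch a code that a recognizer emits with one wrong number, such as `(5, 5)`, still decodes to the nearest known character rather than failing.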

Type: Grant

Filed: June 7, 2021

Date of Patent: November 14, 2023

Assignee: SAP SE

Inventors: Marco Spinaci, Marek Polewczyk

MACHINE LEARNING ENABLED DOCUMENT DESKEWING

Publication number: 20230222632

Abstract: A method may include determining, based at least on an image of a document, a plurality of text bounding boxes enclosing lines of text present in the document. A machine learning model may be trained to determine, based at least on the coordinates defining the text bounding boxes, the coordinates of a document bounding box enclosing the text bounding boxes. The document bounding box may encapsulate the visual aberrations that are present in the image of the document. As such, one or more transformations may be determined based on the coordinates of the document bounding box. The image of the document may be deskewed by applying the transformations. One or more downstream tasks may be performed based on the deskewed image of the document. Related methods and articles of manufacture are also disclosed.
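
The transformation step in this abstract can be sketched as follows, assuming the document bounding box is already available as four corner coordinates. The machine learning model that would predict that box is not reproduced here. The corner order (top-left, top-right, bottom-right, bottom-left), the function names, and the choice of a pure rotation about the box center are assumptions for this example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def skew_angle(doc_box: List[Point]) -> float:
    """Angle in radians of the top edge; corners are TL, TR, BR, BL."""
    (x0, y0), (x1, y1) = doc_box[0], doc_box[1]
    return math.atan2(y1 - y0, x1 - x0)


def deskew_points(points: List[Point], doc_box: List[Point]) -> List[Point]:
    """Rotate points about the box center so the top edge becomes horizontal."""
    angle = skew_angle(doc_box)
    cx = sum(x for x, _ in doc_box) / 4
    cy = sum(y for _, y in doc_box) / 4
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    return [
        (
            cx + (x - cx) * cos_a - (y - cy) * sin_a,
            cy + (x - cx) * sin_a + (y - cy) * cos_a,
        )
        for x, y in points
    ]
```

A rotation only corrects in-plane skew; perspective distortion would need a fuller transformation, which this sketch leaves out.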

Type: Application

Filed: January 7, 2022

Publication date: July 13, 2023

Inventors: Marek Polewczyk, Marco Spinaci

CHARACTER ENCODING AND DECODING FOR OPTICAL CHARACTER RECOGNITION

Publication number: 20220391637

Abstract: The present disclosure provides techniques for encoding and decoding characters for optical character recognition. The techniques involve determining sets of numbers for encoding a character set where each number in a particular set of numbers for encoding a particular character is mapped to a graphical unit (e.g., radical) of the particular character. A mapping between each set of numbers in the possible encodings and the character set may be determined based on the closest character already encoded. A machine learning model may be trained to perform optical character recognition using training data labeled using the set of encodings and the mappings.

Type: Application

Filed: June 7, 2021

Publication date: December 8, 2022

Inventors: Marco Spinaci, Marek Polewczyk
