Patents by Inventor Konstantinos Bekas

Konstantinos Bekas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Dynamic in-memory construction of a knowledge graph

Patent number: 12260344

Abstract: In an approach for a dynamic in-memory construction of a knowledge graph structure, the knowledge graph structure comprising a plurality of nodes and edges linking selected nodes to each other, a processor receives a record comprising a plurality of strings. The plurality of strings relates to a command combined with a set of strings. A processor determines content records relating to nodes relating to each of the strings. A processor assigns node identifiers for respective determined content records. A processor appends the node identifiers to a dynamic in-memory knowledge graph structure. A processor modifies an edge between selected ones of the node identifiers based on the command combined with the set of strings. A processor builds the dynamic in-memory knowledge graph structure.

Type: Grant

Filed: July 20, 2020

Date of Patent: March 25, 2025

Assignee: International Business Machines Corporation

Inventors: Leonidas Georgopoulos, Peter Willem Jan Staar, Christoph Auer, Michele Dolfi, Konstantinos Bekas
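The flow in the abstract above (receive a record made of a command and a set of strings, assign node identifiers, append them to the in-memory structure, then modify edges) can be illustrated with a minimal sketch. This is not the patented implementation: the class name `InMemoryGraph`, the `link` and `unlink` commands, and the choice to connect consecutive strings are all assumptions made for illustration.

```python
class InMemoryGraph:
    """Toy in-memory graph: nodes are content records, edges are ID pairs."""

    def __init__(self):
        self.node_ids = {}   # content string -> node identifier
        self.contents = []   # node identifier -> content string
        self.edges = set()   # undirected edges stored as (low_id, high_id)

    def node_id(self, content):
        """Return the identifier for a content record, appending a new node if needed."""
        if content not in self.node_ids:
            self.node_ids[content] = len(self.contents)
            self.contents.append(content)
        return self.node_ids[content]

    def apply(self, record):
        """Process one record: a command followed by a set of strings."""
        command, *strings = record
        ids = [self.node_id(s) for s in strings]
        # Hypothetical rule: the command acts on each pair of consecutive strings.
        for a, b in zip(ids, ids[1:]):
            edge = (min(a, b), max(a, b))
            if command == "link":
                self.edges.add(edge)
            elif command == "unlink":
                self.edges.discard(edge)
            else:
                raise ValueError(f"unknown command: {command!r}")
        return ids


graph = InMemoryGraph()
graph.apply(["link", "IBM", "Zurich", "Switzerland"])
graph.apply(["unlink", "Zurich", "Switzerland"])
print(graph.contents)       # ['IBM', 'Zurich', 'Switzerland']
print(sorted(graph.edges))  # [(0, 1)]
```
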
Efficient ground truth annotation

Patent number: 11556852

Abstract: A computer-implemented method for determining a set of target items to be annotated for training a machine learning application. The method comprises providing a training data set with a set of data samples and an auto-encoder with a classifier. The auto-encoder comprises an embedding model that maps the set of data samples to a set of compressed feature vectors. The set of compressed feature vectors defines a compressed feature matrix. The method further comprises providing a definition of a graph associated with the compressed feature matrix, applying a clustering algorithm to identify node clusters of the graph, applying a centrality algorithm to identify central nodes of the node clusters, retrieving node labels for the central nodes from an annotator, propagating the annotated node labels to other nodes of the graph, and training the embedding model and the classifier with the annotated and the propagated node labels.

Type: Grant

Filed: March 6, 2020

Date of Patent: January 17, 2023

Assignee: International Business Machines Corporation

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Ralf Kaestner, Alexander Velizhev, Dal Noguer Hidalgo, Rita Kuznetsova, Konstantinos Bekas

Ground truth generation for image segmentation

Patent number: 11494588

Abstract: A method, system and computer program product to generate a training data set for image segmentation applications, comprising providing a set of input documents of a first format. The input documents each comprise one or more pages. The input documents are split into individual document pages and parsed. Parsing comprises identifying a predefined set of items including position information of the position of the predefined set of items in the individual document pages; generating a bitmap image of a second format for each individual document page of the first format. The bitmap image comprises a predefined number of pixels. A mask is generated for each individual document. The mask comprises the predefined number of pixels of the corresponding bitmap image. Generating the mask comprises assigning an encoded class label to each pixel of the mask based on the position information of identified items of the predefined set of items.

Type: Grant

Filed: March 6, 2020

Date of Patent: November 8, 2022

Assignee: International Business Machines Corporation

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
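The mask-generation step in the abstract above (assigning an encoded class label to each pixel from the position information of the identified items) can be sketched as follows. The class names, the numeric encoding in `CLASS_LABELS`, and the rectangular item format are assumptions for illustration and are not taken from the patent.

```python
# Hypothetical class encoding; 0 is reserved for background pixels.
CLASS_LABELS = {"text": 1, "table": 2, "figure": 3}


def build_mask(width, height, items):
    """Return a height x width grid of encoded class labels.

    Each item is (class_name, x0, y0, x1, y1) in pixel coordinates,
    with x1 and y1 exclusive. Later items overwrite earlier ones.
    """
    mask = [[0] * width for _ in range(height)]
    for class_name, x0, y0, x1, y1 in items:
        label = CLASS_LABELS[class_name]
        # Clip each box to the page so out-of-range coordinates are ignored.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = label
    return mask


page_items = [("text", 0, 0, 4, 1), ("table", 1, 1, 3, 3)]
mask = build_mask(width=4, height=3, items=page_items)
for row in mask:
    print(row)
# [1, 1, 1, 1]
# [0, 2, 2, 0]
# [0, 2, 2, 0]
```

The mask has the same pixel dimensions as the page bitmap, which matches the abstract's requirement that both comprise the same predefined number of pixels.
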
Digital image processing

Patent number: 11495038

Abstract: A computer-implemented method for processing a digital image. The digital image comprises one or more text cells, wherein each of the one or more text cells comprises a string and a bounding box. The method comprises receiving the digital image in a first format, the first format providing access to the strings and the bounding boxes of the one or more text cells. The method further comprises encoding the strings of the one or more text cells as a visual pattern according to a predefined string encoding scheme and providing the digital image in a second format. The second format comprises the visual pattern of the strings of the one or more text cells. A corresponding system and a related computer program product are provided.

Type: Grant

Filed: March 6, 2020

Date of Patent: November 8, 2022

Assignee: International Business Machines Corporation

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas

Multiplication of a matrix with an input vector

Patent number: 11416581

Abstract: Aspects of the present invention disclose a method, computer program product, and system for performing a multiplication of a matrix with an input vector. The method includes one or more processors subdividing a matrix into logical segments, the matrix being given in a sparse-matrix data format. The method further includes one or more processors obtaining one or more test vectors. The method further includes one or more processors performing an optimization cycle. In an additional aspect, performing the optimization cycle further comprises, for each of the test vectors, one or more processors performing a cache performance test.

Type: Grant

Filed: March 9, 2020

Date of Patent: August 16, 2022

Assignee: International Business Machines Corporation

Inventors: Leonidas Georgopoulos, Peter Staar, Michele Dolfi, Christoph Auer, Konstantinos Bekas
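As a rough illustration of multiplying a sparse-format matrix by a vector segment by segment, the sketch below uses the common CSR (compressed sparse row) layout and treats fixed-size blocks of rows as the logical segments. Both choices are assumptions: the abstract names neither a specific sparse format nor a segmentation rule, and the optimization cycle and cache performance test it describes are not modeled here.

```python
def csr_matvec_segmented(indptr, indices, data, x, rows_per_segment):
    """Multiply a CSR matrix by the vector x, one row segment at a time."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for start in range(0, n_rows, rows_per_segment):
        stop = min(start + rows_per_segment, n_rows)
        # Each segment touches a contiguous slice of `data` and `indices`.
        for row in range(start, stop):
            total = 0.0
            for k in range(indptr[row], indptr[row + 1]):
                total += data[k] * x[indices[k]]
            y[row] = total
    return y


# CSR form of [[1, 0, 2], [0, 3, 0], [4, 0, 5]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
result = csr_matvec_segmented(indptr, indices, data, [1.0, 1.0, 1.0], rows_per_segment=2)
print(result)  # [3.0, 3.0, 9.0]
```

The result does not depend on `rows_per_segment`; a segment size is simply the kind of parameter a tuning loop could vary while measuring performance.
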
Memory-efficient document processing

Patent number: 11361146

Abstract: The invention is notably directed to a computer-implemented method for processing a plurality of documents. The method comprises providing the plurality of documents in a first format and splitting each of the plurality of documents of the first format into one or more individual pages. The method further comprises individually parsing the one or more individual pages of the plurality of documents. The parsing comprises identifying a predefined set of items of the one or more individual pages. Further processing comprises gathering the predefined set of items of each of the one or more individual pages of the plurality of documents into individual page files of a second format and performing the document processing service with the individual page files of the second format. The invention further concerns a corresponding computing system and a related computer program product.

Type: Grant

Filed: March 6, 2020

Date of Patent: June 14, 2022

Assignee: International Business Machines Corporation

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas

Dynamic In-Memory Construction of a Knowledge Graph

Publication number: 20220019907

Abstract: In an approach for a dynamic in-memory construction of a knowledge graph structure, the knowledge graph structure comprising a plurality of nodes and edges linking selected nodes to each other, a processor receives a record comprising a plurality of strings. The plurality of strings relates to a command combined with a set of strings. A processor determines content records relating to nodes relating to each of the strings. A processor assigns node identifiers for respective determined content records. A processor appends the node identifiers to a dynamic in-memory knowledge graph structure. A processor modifies an edge between selected ones of the node identifiers based on the command combined with the set of strings. A processor builds the dynamic in-memory knowledge graph structure.

Type: Application

Filed: July 20, 2020

Publication date: January 20, 2022

Inventors: Leonidas Georgopoulos, Peter Willem Jan Staar, Christoph Auer, Michele Dolfi, Konstantinos Bekas

Automatic determination of cognitive models for deployment at computerized devices having various hardware constraints

Patent number: 11210578

Abstract: Determining cognitive models to be deployed at auxiliary devices may include maintaining relations, e.g., in a database. The relations map hardware characteristics of auxiliary devices and example datasets to cognitive models. Cognitive models are determined for auxiliary devices, based on said relations, e.g., for each of the auxiliary devices. An input dataset is accessed, which comprises data of interest, e.g., collected at a core computing system (CCS), and hardware characteristics of each of the auxiliary devices. An auxiliary cognitive model is determined based on a core cognitive model run on the input dataset accessed, wherein the core cognitive model has been trained to learn at least part of said relations. Parameters of the auxiliary model determined can be communicated to said each of the auxiliary devices for the latter to deploy the auxiliary model determined. The method may be implemented in a network having an edge computing architecture.

Type: Grant

Filed: December 12, 2018

Date of Patent: December 28, 2021

Assignee: International Business Machines Corporation

Inventors: Florian Michael Scheidegger, Roxana Istrate, Giovanni Mariani, Konstantinos Bekas, Adelmo Cristiano Innocenza Malossi

Hardware accelerator for executing a computation task

Patent number: 11175957

Abstract: The present disclosure relates to a hardware accelerator for executing a computation task composed of a set of operations. The hardware accelerator comprises a controller and a set of computation units. Each computation unit of the set of computation units is configured to receive input data of an operation of the set of operations and to perform the operation, wherein the input data is represented with a distinct bit length associated with each computation unit. The controller is configured to receive the input data represented with a certain bit length of the bit lengths and to select one of the set of computation units that can deliver a valid result and that is associated with a bit length smaller than or equal to the certain bit length.

Type: Grant

Filed: September 22, 2020

Date of Patent: November 16, 2021

Assignee: International Business Machines Corporation

Inventors: Dionysios Diamantopoulos, Florian Michael Scheidegger, Adelmo Cristiano Innocenza Malossi, Christoph Hagleitner, Konstantinos Bekas
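The controller's selection rule in the abstract above (pick a computation unit that can deliver a valid result and whose bit length does not exceed that of the input) can be modeled in software as a small sketch. The unit widths in `UNIT_BIT_LENGTHS`, the use of signed integer addition as the operation, the definition of "valid" as "no overflow", and the preference for the narrowest valid unit are all assumptions for illustration.

```python
UNIT_BIT_LENGTHS = (8, 16, 32)  # hypothetical computation units


def fits(value, bits):
    """True if value is representable as a signed two's-complement integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def select_unit(a, b, input_bits):
    """Pick the narrowest unit, no wider than input_bits, whose addition result is valid."""
    for bits in sorted(UNIT_BIT_LENGTHS):
        if bits > input_bits:
            break
        # The result is valid only if both operands and their sum fit in this width.
        if fits(a, bits) and fits(b, bits) and fits(a + b, bits):
            return bits
    raise ValueError("no unit can deliver a valid result")


print(select_unit(100, 20, input_bits=32))       # 8
print(select_unit(100, 100, input_bits=32))      # 16
print(select_unit(40000, 40000, input_bits=32))  # 32
```
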
GROUND TRUTH GENERATION FOR IMAGE SEGMENTATION

Publication number: 20210279516

Abstract: A method, system and computer program product to generate a training data set for image segmentation applications, comprising providing a set of input documents of a first format. The input documents each comprise one or more pages. The input documents are split into individual document pages and parsed. Parsing comprises identifying a predefined set of items including position information of the position of the predefined set of items in the individual document pages; generating a bitmap image of a second format for each individual document page of the first format. The bitmap image comprises a predefined number of pixels. A mask is generated for each individual document. The mask comprises the predefined number of pixels of the corresponding bitmap image. Generating the mask comprises assigning an encoded class label to each pixel of the mask based on the position information of identified items of the predefined set of items.

Type: Application

Filed: March 6, 2020

Publication date: September 9, 2021

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas

DIGITAL IMAGE PROCESSING

Publication number: 20210279532

Abstract: A computer-implemented method for processing a digital image. The digital image comprises one or more text cells, wherein each of the one or more text cells comprises a string and a bounding box. The method comprises receiving the digital image in a first format, the first format providing access to the strings and the bounding boxes of the one or more text cells. The method further comprises encoding the strings of the one or more text cells as a visual pattern according to a predefined string encoding scheme and providing the digital image in a second format. The second format comprises the visual pattern of the strings of the one or more text cells. A corresponding system and a related computer program product are provided.

Type: Application

Filed: March 6, 2020

Publication date: September 9, 2021

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas

MEMORY-EFFICIENT DOCUMENT PROCESSING

Publication number: 20210279400

Abstract: The invention is notably directed to a computer-implemented method for processing a plurality of documents. The method comprises providing the plurality of documents in a first format and splitting each of the plurality of documents of the first format into one or more individual pages. The method further comprises individually parsing the one or more individual pages of the plurality of documents. The parsing comprises identifying a predefined set of items of the one or more individual pages. Further processing comprises gathering the predefined set of items of each of the one or more individual pages of the plurality of documents into individual page files of a second format and performing the document processing service with the individual page files of the second format. The invention further concerns a corresponding computing system and a related computer program product.

Type: Application

Filed: March 6, 2020

Publication date: September 9, 2021

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas

EFFICIENT GROUND TRUTH ANNOTATION

Publication number: 20210279636

Abstract: A computer-implemented method for determining a set of target items to be annotated for training a machine learning application. The method comprises providing a training data set with a set of data samples and an auto-encoder with a classifier. The auto-encoder comprises an embedding model that maps the set of data samples to a set of compressed feature vectors. The set of compressed feature vectors defines a compressed feature matrix. The method further comprises providing a definition of a graph associated with the compressed feature matrix, applying a clustering algorithm to identify node clusters of the graph, applying a centrality algorithm to identify central nodes of the node clusters, retrieving node labels for the central nodes from an annotator, propagating the annotated node labels to other nodes of the graph, and training the embedding model and the classifier with the annotated and the propagated node labels.

Type: Application

Filed: March 6, 2020

Publication date: September 9, 2021

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Ralf Kaestner, Alexander Velizhev, Dal Noguer Hidalgo, Rita Kuznetsova, Konstantinos Bekas

MULTIPLICATION OF A MATRIX WITH AN INPUT VECTOR

Publication number: 20210279299

Abstract: Aspects of the present invention disclose a method, computer program product, and system for performing a multiplication of a matrix with an input vector. The method includes one or more processors subdividing a matrix into logical segments, the matrix being given in a sparse-matrix data format. The method further includes one or more processors obtaining one or more test vectors. The method further includes one or more processors performing an optimization cycle. In an additional aspect, performing the optimization cycle further comprises, for each of the test vectors, one or more processors performing a cache performance test.

Type: Application

Filed: March 9, 2020

Publication date: September 9, 2021

Inventors: Leonidas Georgopoulos, Peter Staar, Michele Dolfi, Christoph Auer, Konstantinos Bekas

Translating a natural language query into a formal data query

Patent number: 11086861

Abstract: A computer-implemented method for generating ground-truth for natural language querying may include providing a knowledge graph as a data model, receiving a natural language query from a user and translating the natural language query into a formal data query. The method can also include visualizing the formal data query to the user and receiving a feedback response from the user. The feedback response can include a verified and/or edited formal data query. The method can also include storing the natural language query and the corresponding feedback response as a ground-truth pair. A corresponding system and a related computer program product may be provided.

Type: Grant

Filed: June 20, 2019

Date of Patent: August 10, 2021

Assignee: International Business Machines Corporation

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Aleksandros Sobczyk, Tim Jan Baccaert, Konstantinos Bekas

Ground truth generation from scanned documents

Patent number: 11017498

Abstract: A plurality of electronic documents comprising one or more document pages are received. First position markers, second position markers and page identifiers are inserted into the pages. The plurality of electronic documents are printed, thereby generating a printed corpus comprising a plurality of printed documents. The plurality of printed documents are scanned, thereby generating a scanned corpus comprising a plurality of scanned images. Scanning frame positions of the first and the second position markers are detected and the detected scanning frame positions and the page positions are used to define affine transformations between the plurality of scanned images and the corresponding document pages. The affine transformations are applied to the plurality of scanned images to align the plurality of scanned images with the corresponding document pages of the plurality of electronic documents.

Type: Grant

Filed: March 14, 2019

Date of Patent: May 25, 2021

Assignee: International Business Machines Corporation

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
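The alignment step in the abstract above (using detected marker positions to define a transformation from scanned image coordinates back to page coordinates) can be sketched with two markers per page. With only two point pairs the sketch fits a similarity transform (rotation, uniform scale, translation), which is a restricted kind of affine transformation and is a simplification chosen here, not necessarily what the patent does. The function names and sample coordinates are hypothetical.

```python
def fit_similarity(page_pts, scan_pts):
    """Fit z -> a*z + b mapping two scanned marker positions onto their page positions."""
    (p1, p2), (s1, s2) = page_pts, scan_pts
    # Treat each (x, y) point as the complex number x + y*1j.
    p1, p2, s1, s2 = (complex(*pt) for pt in (p1, p2, s1, s2))
    a = (p2 - p1) / (s2 - s1)  # rotation and uniform scale
    b = p1 - a * s1            # translation
    return a, b


def apply_transform(a, b, point):
    """Map a scanned-image point into page coordinates."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Page markers and where the scanner saw them (scan = 2 * page + (5, 3)).
page_markers = [(10.0, 10.0), (90.0, 10.0)]
scan_markers = [(25.0, 23.0), (185.0, 23.0)]
a, b = fit_similarity(page_markers, scan_markers)
aligned = apply_transform(a, b, (105.0, 63.0))
print(aligned)  # (50.0, 30.0)
```

A general affine transformation has six free parameters and needs at least three non-collinear point pairs; the two-marker version above cannot represent shear or non-uniform scaling.
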
Digital image-based document digitization using a graph model

Patent number: 10885323

Abstract: A computer-implemented method for digitizing a document, wherein the document has an assigned classification scheme, may be provided. A digital image and an identifier of the classification scheme may be received, the image representing a portion of the document. A segmentation of the image into one or more image segments may be determined; for each of the image segments, content information may be captured from the image segment and a category may be assigned to the image segment, the category being selected from the classification scheme. One or more digitization segments may be selected from the segmentation. A graph model of the document may be populated, wherein each of the digitization segments is represented by a segment node of the graph model.

Type: Grant

Filed: February 28, 2019

Date of Patent: January 5, 2021

Assignee: International Business Machines Corporation

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas

TRANSLATING A NATURAL LANGUAGE QUERY INTO A FORMAL DATA QUERY

Publication number: 20200401590

Abstract: A computer-implemented method for generating ground-truth for natural language querying may include providing a knowledge graph as a data model, receiving a natural language query from a user and translating the natural language query into a formal data query. The method can also include visualizing the formal data query to the user and receiving a feedback response from the user. The feedback response can include a verified and/or edited formal data query. The method can also include storing the natural language query and the corresponding feedback response as a ground-truth pair. A corresponding system and a related computer program product may be provided.

Type: Application

Filed: June 20, 2019

Publication date: December 24, 2020

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Aleksandros Sobczyk, Tim Jan Baccaert, Konstantinos Bekas

Label propagation in graphs

Patent number: 10824674

Abstract: Each node in a subset of graph nodes has an associated label value indicating a characteristic of the corresponding item. Matrix data and graph label data are stored. The matrix data defines a matrix representing the graph. The graph label data defines a graph label vector indicating label values associated with nodes of the graph. For at least one set of nodes, test label data is generated defining a test label vector. A propagation function is defined, comprising a set of basis functions, having respective coefficients. The coefficients are calculated which minimize a difference function dependent on difference between the graph label vector and a result of applying the propagation function to the test label vector for said at least one set. New label values are calculated for nodes in K by applying the propagation function with the calculated coefficients to the graph label vector, thereby propagating labels.

Type: Grant

Filed: June 3, 2016

Date of Patent: November 3, 2020

Assignees: International Business Machines Corporation, UNIVERSITÈ LIBRE DE BRUXELLES BRUXELLES

Inventors: Konstantinos Bekas, Robin Devooght, Peter Willem Jan Staar

Collecting training data from TeX files

Patent number: 10824788

Abstract: A method of collecting training data of a document component may be provided. The documents have a structure and are coded in the typesetting language TeX. The method comprises receiving a TeX source file, compiling it into a PDF file and a related sync file, and analyzing the PDF file, thereby determining a non-text-only document component. The method also comprises determining first coordinates of the non-text-only document component and a corresponding page number, determining a typesetting command relating to a non-text-only document component and determining second coordinates of a bounding box and a corresponding page number from the sync file, determining text elements in the non-text-only document component of the PDF file for which the first coordinates and the second coordinates overlap, and combining the determined text elements and linking them to a type of a non-text document component determined in the non-text-only document component in the TeX source file.

Type: Grant

Filed: February 8, 2019

Date of Patent: November 3, 2020

Assignee: International Business Machines Corporation

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Aleksandros Sobczyk, Konstantinos Bekas
