Patents by Inventor Konstantinos Bekas
Konstantinos Bekas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11556852
Abstract: A computer-implemented method for determining a set of target items to be annotated for training a machine learning application. The method comprises providing a training data set with a set of data samples and an auto-encoder with a classifier. The auto-encoder comprises an embedding model that maps the set of data samples to a set of compressed feature vectors. The set of compressed feature vectors define a compressed feature matrix. Further provided are: a definition of a graph associated to the compressed feature matrix, applying a clustering-algorithm to identify node clusters of the graph and applying a centrality algorithm to identify central nodes of the node clusters, retrieving from an annotator node labels for the central nodes, propagating the annotated node labels to other nodes of the graph and performing a training of the embedding model and the classifier with the annotated and the propagated node labels.
Type: Grant
Filed: March 6, 2020
Date of Patent: January 17, 2023
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Ralf Kaestner, Alexander Velizhev, Dal Noguer Hidalgo, Rita Kuznetsova, Konstantinos Bekas
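The cluster, centrality, and propagation steps named in the abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the graph is built by linking feature vectors closer than a hypothetical `radius`, connected components stand in for the clustering algorithm, node degree stands in for the centrality algorithm, and the annotator is a plain callable. The auto-encoder and the final training step are left out.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

def build_graph(features: Sequence[Sequence[float]], radius: float) -> Dict[int, Set[int]]:
    """Link two samples when their feature vectors are within `radius` (assumed rule)."""
    adj: Dict[int, Set[int]] = {i: set() for i in range(len(features))}
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            dist2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            if dist2 <= radius ** 2:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def find_clusters(adj: Dict[int, Set[int]]) -> List[List[int]]:
    """Connected components, used here as a stand-in for the clustering algorithm."""
    seen: Set[int] = set()
    clusters: List[List[int]] = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters

def annotate_and_propagate(
    adj: Dict[int, Set[int]], annotator: Callable[[int], str]
) -> Tuple[Dict[int, str], List[int]]:
    """Ask the annotator only about each cluster's central node, then copy the label."""
    labels: Dict[int, str] = {}
    queried: List[int] = []
    for cluster in find_clusters(adj):
        # Degree centrality (assumed); ties go to the lowest node index.
        central = max(cluster, key=lambda n: len(adj[n]))
        queried.append(central)
        label = annotator(central)
        for node in cluster:
            labels[node] = label
    return labels, queried

# Five toy "compressed feature vectors" forming two well-separated groups.
features = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0)]
graph = build_graph(features, radius=0.15)
labels, queried = annotate_and_propagate(graph, lambda n: "A" if n < 3 else "B")
```

In this toy run only two of the five samples are sent to the annotator, which is the point of selecting central nodes before propagating labels.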
-
Patent number: 11494588
Abstract: A method, system and computer program product to generate a training data set for image segmentation applications, comprising providing a set of input documents of a first format. The input documents each comprise one or more pages. The input documents are split into individual document pages and parsed. Parsing comprises identifying a predefined set of items including position information of the position of the predefined set of items in the individual document pages; generating a bitmap image of a second format for each individual document page of the first format. The bitmap image comprises a predefined number of pixels. A mask is generated for each individual document. The mask comprises the predefined number of pixels of the corresponding bitmap image. Generating the mask comprises assigning an encoded class label to each pixel of the mask based on the position information of identified items of the predefined set of items.
Type: Grant
Filed: March 6, 2020
Date of Patent: November 8, 2022
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
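The mask-generation step described in the abstract above, assigning an encoded class label to each pixel from item positions, can be sketched as follows. The class codes, the rectangular item format, and the name `build_mask` are hypothetical choices made for illustration; the abstract does not specify an encoding or a data layout.

```python
from typing import List, Tuple

# Hypothetical class encoding; the abstract does not specify one.
CLASS_CODES = {"background": 0, "text": 1, "table": 2, "picture": 3}

# An item is (label, x0, y0, x1, y1) in pixel coordinates; x1 and y1 are exclusive.
Item = Tuple[str, int, int, int, int]

def build_mask(width: int, height: int, items: List[Item]) -> List[List[int]]:
    """Return a height x width mask holding one encoded class label per pixel."""
    mask = [[CLASS_CODES["background"]] * width for _ in range(height)]
    for label, x0, y0, x1, y1 in items:
        code = CLASS_CODES[label]
        # Clamp the box to the page so positions outside the bitmap are ignored.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = code
    return mask

# A 4 x 4 page with one text line on the top row and a table in the lower right.
page_items = [("text", 0, 0, 3, 1), ("table", 1, 2, 4, 4)]
mask = build_mask(4, 4, page_items)
```

The mask has the same pixel dimensions as the page bitmap, so each bitmap and its mask can be paired as one training example for a segmentation model.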
-
Patent number: 11495038
Abstract: A computer-implemented method for processing a digital image. The digital image comprises one or more text cells, wherein each of the one or more text cells comprises a string and a bounding box. The method comprises receiving the digital image in a first format, the first format providing access to the strings and the bounding boxes of the one or more text cells. The method further comprises encoding the strings of the one or more text cells as a visual pattern according to a predefined string encoding scheme and providing the digital image in a second format. The second format comprises the visual pattern of the strings of the one or more text cells. A corresponding system and a related computer program product are provided.
Type: Grant
Filed: March 6, 2020
Date of Patent: November 8, 2022
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Patent number: 11416581
Abstract: Aspects of the present invention disclose a method, computer program product, and system for performing a multiplication of a matrix with an input vector. The method includes one or more processors subdividing a matrix into logical segments, the matrix being given in a sparse-matrix data format. The method further includes one or more processors obtaining one or more test vectors. The method further includes one or more processors performing an optimization cycle. In an additional aspect, performing the optimization cycle further comprises, for each of the test vectors, one or more processors performing a cache performance test.
Type: Grant
Filed: March 9, 2020
Date of Patent: August 16, 2022
Assignee: International Business Machines Corporation
Inventors: Leonidas Georgopoulos, Peter Staar, Michele Dolfi, Christoph Auer, Konstantinos Bekas
-
Patent number: 11361146
Abstract: The invention is notably directed to a computer-implemented method for processing a plurality of documents. The method comprises providing the plurality of documents in a first format and splitting each of the plurality of documents of the first format into one or more individual pages. The method further comprises individually parsing the one or more individual pages of the plurality of documents. The parsing comprises identifying a predefined set of items of the one or more individual pages. Further processing comprises gathering the predefined set of items of each of the one or more individual pages of the plurality of documents into individual page files of a second format and performing the document processing service with the individual page files of the second format. The invention further concerns a corresponding computing system and a related computer program product.
Type: Grant
Filed: March 6, 2020
Date of Patent: June 14, 2022
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Publication number: 20220019907
Abstract: In an approach for a dynamic in-memory construction of a knowledge graph structure, the knowledge graph structure comprising a plurality of nodes and edges linking selected nodes to each other, a processor receives a record comprising a plurality of strings. The plurality of strings relates to a command combined with a set of strings. A processor determines content records relating to nodes relating to each of the strings. A processor assigns node identifiers for respective determined content records. A processor appends the node identifiers to a dynamic in-memory knowledge graph structure. A processor modifies an edge between selected ones of the node identifiers based on the command combined with the set of strings. A processor builds the dynamic in-memory knowledge graph structure.
Type: Application
Filed: July 20, 2020
Publication date: January 20, 2022
Inventors: Leonidas Georgopoulos, Peter Willem Jan Staar, Christoph Auer, Michele Dolfi, Konstantinos Bekas
-
Patent number: 11210578
Abstract: Determining cognitive models to be deployed at auxiliary devices may include maintaining relations, e.g., in a database. The relations map hardware characteristics of auxiliary devices and example datasets to cognitive models. Cognitive models are determined for auxiliary devices, based on said relations, e.g., for each of the auxiliary devices. An input dataset is accessed, which comprises data of interest, e.g., collected at a core computing system (CCS), and hardware characteristics of each of the auxiliary devices. An auxiliary cognitive model is determined based on a core cognitive model run on the input dataset accessed, wherein the core cognitive model has been trained to learn at least part of said relations. Parameters of the auxiliary model determined can be communicated to said each of the auxiliary devices for the latter to deploy the auxiliary model determined. The method may be implemented in a network having an edge computing architecture.
Type: Grant
Filed: December 12, 2018
Date of Patent: December 28, 2021
Assignee: International Business Machines Corporation
Inventors: Florian Michael Scheidegger, Roxana Istrate, Giovanni Mariani, Konstantinos Bekas, Adelmo Cristiano Innocenza Malossi
-
Patent number: 11175957
Abstract: The present disclosure relates to a hardware accelerator for executing a computation task composed of a set of operations. The hardware accelerator comprises a controller and a set of computation units. Each computation unit of the set of computation units is configured to receive input data of an operation of the set of operations and to perform the operation, wherein the input data is represented with a distinct bit length associated with each computation unit. The controller is configured to receive the input data represented with a certain bit length of the bit lengths and to select one of the set of computation units that can deliver a valid result and that is associated with a bit length smaller than or equal to the certain bit length.
Type: Grant
Filed: September 22, 2020
Date of Patent: November 16, 2021
Assignee: International Business Machines Corporation
Inventors: Dionysios Diamantopoulos, Florian Michael Scheidegger, Adelmo Cristiano Innocenza Malossi, Christoph Hagleitner, Konstantinos Bekas
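The controller's selection rule in the abstract above can be modelled in software as a rough sketch. The abstract does not define "valid result" or say which qualifying unit is preferred, so this example assumes signed-integer addition, treats a result as valid when the operands and their sum fit the unit's range, and picks the narrowest qualifying unit. The names `ComputationUnit` and `select_unit` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class ComputationUnit:
    """Software stand-in for one unit working at a fixed signed bit length."""
    bit_length: int

    def fits(self, value: int) -> bool:
        # Signed two's-complement range for this bit length.
        low = -(1 << (self.bit_length - 1))
        high = (1 << (self.bit_length - 1)) - 1
        return low <= value <= high

def select_unit(
    units: Sequence[ComputationUnit], operands: Sequence[int], input_bit_length: int
) -> ComputationUnit:
    """Pick the narrowest unit, no wider than the input, that adds without overflow."""
    result = sum(operands)
    candidates = sorted(
        (u for u in units if u.bit_length <= input_bit_length),
        key=lambda u: u.bit_length,
    )
    for unit in candidates:
        # Assumed validity test: every operand and the sum are representable.
        if all(unit.fits(v) for v in operands) and unit.fits(result):
            return unit
    raise ValueError("no computation unit can deliver a valid result")

UNITS = [ComputationUnit(8), ComputationUnit(16), ComputationUnit(32)]
small = select_unit(UNITS, (100, 20), input_bit_length=32)    # sum 120 fits 8 bits
medium = select_unit(UNITS, (100, 100), input_bit_length=32)  # sum 200 needs 16 bits
```

Preferring the narrowest unit is a design choice of this sketch only; the abstract requires just that the selected unit's bit length not exceed that of the input.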
-
Publication number: 20210279636
Abstract: A computer-implemented method for determining a set of target items to be annotated for training a machine learning application. The method comprises providing a training data set with a set of data samples and an auto-encoder with a classifier. The auto-encoder comprises an embedding model that maps the set of data samples to a set of compressed feature vectors. The set of compressed feature vectors define a compressed feature matrix. Further provided are: a definition of a graph associated to the compressed feature matrix, applying a clustering-algorithm to identify node clusters of the graph and applying a centrality algorithm to identify central nodes of the node clusters, retrieving from an annotator node labels for the central nodes, propagating the annotated node labels to other nodes of the graph and performing a training of the embedding model and the classifier with the annotated and the propagated node labels.
Type: Application
Filed: March 6, 2020
Publication date: September 9, 2021
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Ralf Kaestner, Alexander Velizhev, Dal Noguer Hidalgo, Rita Kuznetsova, Konstantinos Bekas
-
Publication number: 20210279299
Abstract: Aspects of the present invention disclose a method, computer program product, and system for performing a multiplication of a matrix with an input vector. The method includes one or more processors subdividing a matrix into logical segments, the matrix being given in a sparse-matrix data format. The method further includes one or more processors obtaining one or more test vectors. The method further includes one or more processors performing an optimization cycle. In an additional aspect, performing the optimization cycle further comprises, for each of the test vectors, one or more processors performing a cache performance test.
Type: Application
Filed: March 9, 2020
Publication date: September 9, 2021
Inventors: Leonidas Georgopoulos, Peter Staar, Michele Dolfi, Christoph Auer, Konstantinos Bekas
-
Publication number: 20210279400
Abstract: The invention is notably directed to a computer-implemented method for processing a plurality of documents. The method comprises providing the plurality of documents in a first format and splitting each of the plurality of documents of the first format into one or more individual pages. The method further comprises individually parsing the one or more individual pages of the plurality of documents. The parsing comprises identifying a predefined set of items of the one or more individual pages. Further processing comprises gathering the predefined set of items of each of the one or more individual pages of the plurality of documents into individual page files of a second format and performing the document processing service with the individual page files of the second format. The invention further concerns a corresponding computing system and a related computer program product.
Type: Application
Filed: March 6, 2020
Publication date: September 9, 2021
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Publication number: 20210279516
Abstract: A method, system and computer program product to generate a training data set for image segmentation applications, comprising providing a set of input documents of a first format. The input documents each comprise one or more pages. The input documents are split into individual document pages and parsed. Parsing comprises identifying a predefined set of items including position information of the position of the predefined set of items in the individual document pages; generating a bitmap image of a second format for each individual document page of the first format. The bitmap image comprises a predefined number of pixels. A mask is generated for each individual document. The mask comprises the predefined number of pixels of the corresponding bitmap image. Generating the mask comprises assigning an encoded class label to each pixel of the mask based on the position information of identified items of the predefined set of items.
Type: Application
Filed: March 6, 2020
Publication date: September 9, 2021
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Publication number: 20210279532
Abstract: A computer-implemented method for processing a digital image. The digital image comprises one or more text cells, wherein each of the one or more text cells comprises a string and a bounding box. The method comprises receiving the digital image in a first format, the first format providing access to the strings and the bounding boxes of the one or more text cells. The method further comprises encoding the strings of the one or more text cells as a visual pattern according to a predefined string encoding scheme and providing the digital image in a second format. The second format comprises the visual pattern of the strings of the one or more text cells. A corresponding system and a related computer program product are provided.
Type: Application
Filed: March 6, 2020
Publication date: September 9, 2021
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Patent number: 11086861
Abstract: A computer-implemented method for generating ground-truth for natural language querying may include providing a knowledge graph as a data model, receiving a natural language query from a user and translating the natural language query into a formal data query. The method can also include visualizing the formal data query to the user and receiving a feedback response from the user. The feedback response can include a verified and/or edited formal data query. The method can also include storing the natural language query and the corresponding feedback response as a ground-truth pair. A corresponding system and a related computer program product may be provided.
Type: Grant
Filed: June 20, 2019
Date of Patent: August 10, 2021
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Aleksandros Sobczyk, Tim Jan Baccaert, Konstantinos Bekas
-
Patent number: 11017498
Abstract: A plurality of electronic documents comprising one or more document pages are received. First position markers, second position markers and page identifiers are inserted into the pages. The plurality of electronic documents are printed, thereby generating a printed corpus comprising a plurality of printed documents. The plurality of printed documents are scanned, thereby generating a scanned corpus comprising a plurality of scanned images. Scanning frame positions of the first and the second position markers are detected and the detected scanning frame positions and the page positions are used to define affine transformations between the plurality of scanned images and the corresponding document pages. The affine transformations are applied to the plurality of scanned images to align the plurality of scanned images with the corresponding document pages of the plurality of electronic documents.
Type: Grant
Filed: March 14, 2019
Date of Patent: May 25, 2021
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Patent number: 10885323
Abstract: A computer-implemented method for digitizing a document, wherein the document has an assigned classification scheme, may be provided. A digital image and an identifier of the classification scheme may be received, the image representing a portion of the document. A segmentation of the image into one or more image segments may be determined; for each of the image segments, content information may be captured from the image segment and a category may be assigned to the image segment, the category being selected from the classification scheme. One or more digitization segments may be selected from the segmentation. A graph model of the document may be populated, wherein each of the digitization segments is represented by a segment node of the graph model.
Type: Grant
Filed: February 28, 2019
Date of Patent: January 5, 2021
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Publication number: 20200401590
Abstract: A computer-implemented method for generating ground-truth for natural language querying may include providing a knowledge graph as a data model, receiving a natural language query from a user and translating the natural language query into a formal data query. The method can also include visualizing the formal data query to the user and receiving a feedback response from the user. The feedback response can include a verified and/or edited formal data query. The method can also include storing the natural language query and the corresponding feedback response as a ground-truth pair. A corresponding system and a related computer program product may be provided.
Type: Application
Filed: June 20, 2019
Publication date: December 24, 2020
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Aleksandros Sobczyk, Tim Jan Baccaert, Konstantinos Bekas
-
Patent number: 10824674
Abstract: Each node in a subset of graph nodes has an associated label value indicating a characteristic of the corresponding item. Matrix data and graph label data are stored. The matrix data defines a matrix representing the graph. The graph label data defines a graph label vector indicating label values associated with nodes of the graph. For at least one set of nodes, test label data is generated defining a test label vector. A propagation function is defined, comprising a set of basis functions, having respective coefficients. The coefficients are calculated which minimize a difference function dependent on difference between the graph label vector and a result of applying the propagation function to the test label vector for said at least one set. New label values are calculated for nodes in K by applying the propagation function with the calculated coefficients to the graph label vector, thereby propagating labels.
Type: Grant
Filed: June 3, 2016
Date of Patent: November 3, 2020
Assignees: International Business Machines Corporation, UNIVERSITÈ LIBRE DE BRUXELLES BRUXELLES
Inventors: Konstantinos Bekas, Robin Devooght, Peter Willem Jan Staar
-
Patent number: 10824788
Abstract: A method of collecting training data of a document component may be provided. The documents have a structure and are coded in the typesetting language TeX. The method comprises receiving a TeX source file, compiling it into a PDF file and a related sync file, analyzing the PDF file, thereby determining a non-text-only document component. The method also comprises determining first coordinates of the non-text-only document component and a corresponding page number, determining a typesetting command relating to a non-text-only document component and determining second coordinates of a bounding box and a corresponding page number from the sync file, determining text elements in the non-text-only document component of the PDF file for which the first coordinates and the second coordinates overlap, and combining the determined text elements and linking them to a type of a non-text document component determined in the non-text-only document component in the TeX source file.
Type: Grant
Filed: February 8, 2019
Date of Patent: November 3, 2020
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Aleksandros Sobczyk, Konstantinos Bekas
-
Publication number: 20200302307
Abstract: Embodiments of the invention disclose a computer-implemented method for the automatic generation of a hypothesis from a graph. The method includes receiving an initial graph, wherein the initial graph includes a plurality of nodes and a plurality of edges between the plurality of nodes. A predefined property of the initial graph is computed, and one or more of the plurality of edges of the initial graph are amended, thereby creating an amended graph that includes a plurality of original edges and one or more amended edges. The predefined property of the amended graph is computed, and the predefined property of the initial graph is compared with the predefined property of the amended graph. The one or more amended edges are marked as hypothesis if a predefined measure of difference between the predefined property of the initial graph and the predefined property of the amended graph exceeds a predefined threshold.
Type: Application
Filed: March 21, 2019
Publication date: September 24, 2020
Inventors: Konstantinos Bekas, Peter Staar, Christoph Auer, Michele Dolfi, Alessandro Curioni
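The compare-and-threshold loop in the abstract above can be sketched in a few lines. The abstract leaves the graph property, the kind of edge amendment, and the difference measure open, so this example makes three assumptions for illustration: the property is the number of node pairs connected by some path, an amendment is the addition of a single candidate edge, and the difference measure is the absolute difference. The function names are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

Edge = Tuple[Hashable, Hashable]

def connected_pairs(nodes: Sequence[Hashable], edges: Sequence[Edge]) -> int:
    """Assumed graph property: number of node pairs joined by some path."""
    adj: Dict[Hashable, Set[Hashable]] = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen: Set[Hashable] = set()
    total = 0
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search to measure the size of this connected component.
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        total += size * (size - 1) // 2
    return total

def find_hypotheses(
    nodes: Sequence[Hashable],
    edges: Sequence[Edge],
    candidate_edges: Sequence[Edge],
    threshold: int,
) -> List[Edge]:
    """Mark a candidate edge as a hypothesis if adding it shifts the property enough."""
    baseline = connected_pairs(nodes, edges)
    hypotheses: List[Edge] = []
    for edge in candidate_edges:
        amended = list(edges) + [edge]
        if abs(connected_pairs(nodes, amended) - baseline) > threshold:
            hypotheses.append(edge)
    return hypotheses

nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
hypotheses = find_hypotheses(nodes, edges, [("a", "c"), ("c", "d")], threshold=2)
```

Here the edge between `a` and `c` changes nothing because both nodes are already connected, while the edge between `c` and `d` merges two components and is the only one marked.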