Patents by Inventor Michele Dolfi
Michele Dolfi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11934419
Abstract: The present disclosure relates to a method for displaying a search result in a user interface comprising a sequence of K table rows, a sequence of first type columns (named regular columns) and a sequence of B second type columns (named braided columns), the search result comprising values of attributes. The method includes creating a structure that includes a sequence table rows and a sequence of regular columns and a sequence of braided columns. The method selects a row of the sequence of rows and a braided column of the braided columns for displaying the search result. In each regular column a regular cell that is included in the selected row may be defined. From a braided column a braided cell may be defined that includes in the selected row and subsequent B−1 rows. The attributes values are displayed in the defined regular and braided cells.
Type: Grant
Filed: February 1, 2022
Date of Patent: March 19, 2024
Assignee: International Business Machines Corporation
Inventors: Kasper Dinkla, Michele Dolfi, Christoph Auer, Birgit Monika Pfitzmann, Peter Willem Jan Staar
-
Publication number: 20230252309
Abstract: A computer-implemented method, a computer program product, and a computer system for building a knowledge graph. A computer system converts user inputs as to a partial topology of a knowledge graph that a user wants to build into one or more initial nodes corresponding to respective natural language descriptions. A computer system interprets the respective natural language descriptions using natural language processing to match the one or more initial nodes against reference data. A computer system, based on matched reference data, obtains a valid topology of nodes and edges, wherein the nodes and edges are mapped onto the matched reference data. A computer system, based on the valid topology, generates a data flow linking to the matched reference data via associations of the nodes and edges and the matched reference data. A computer system builds an executable knowledge graph from the data flow.
Type: Application
Filed: February 7, 2022
Publication date: August 10, 2023
Inventors: Birgit Monika Pfitzmann, Christoph Auer, Kasper Dinkla, Michele Dolfi, Peter Willem Jan Staar
-
Publication number: 20230244682
Abstract: The present disclosure relates to a method for displaying a search result in a user interface comprising a sequence of K table rows, a sequence of first type columns (named regular columns) and a sequence of B second type columns (named braided columns), the search result comprising values of attributes. The method includes creating a structure that includes a sequence table rows and a sequence of regular columns and a sequence of braided columns. The method selects a row of the sequence of rows and a braided column of the braided columns for displaying the search result. In each regular column a regular cell that is included in the selected row may be defined. From a braided column a braided cell may be defined that includes in the selected row and subsequent B−1 rows. The attributes values are displayed in the defined regular and braided cells.
Type: Application
Filed: February 1, 2022
Publication date: August 3, 2023
Inventors: Kasper Dinkla, Michele Dolfi, Christoph Auer, Birgit Monika Pfitzmann, Peter Willem Jan Staar
-
Patent number: 11687700
Abstract: The present disclosure relates to a method for generating a structure of a PDF-document, wherein the PDF-document comprises elements. The method comprises detecting document cells of the PDF-document dependent on commands of a page description language for printing the elements of the PDF-document. The method comprises determining parts of the PDF-document dependent on the PDF-document by a machine learning module. The determining of the respective part comprises associating a respective portion of the elements of the PDF-document with the respective part. Furthermore, a respective label may be assigned to the respective part. The method may further comprise using a symbolic artificial intelligence module, wherein rules of the symbolic AI-module for reconciling the document cells with the parts may be applied. The elements of the structure of the PDF-document may be generated and labelled dependent on a result of the reconciling and dependent on the respective label to the respective part.
Type: Grant
Filed: February 1, 2022
Date of Patent: June 27, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Birgit Monika Pfitzmann, Christoph Auer, Michele Dolfi, Peter Willem Jan Staar, Ahmed Samy Nassar
-
Publication number: 20230132061
Abstract: Information extraction systems and computer-implemented methods for producing a searchable representation of information contained in a corpus of documents by generating a document structure graph for each document, the graph indicating a structural hierarchy of document items in that document based on a predefined hierarchy of predetermined item-types, and linking document items to a parent document item in the structural hierarchy, for each document, generating a knowledge graph including first nodes, representing document items in the corpus and second nodes representing language items identified in those document items, interconnecting the first nodes and second nodes by edges representing a defined relation between items represented by the nodes interconnected by that edge, storing the knowledge graph in a knowledge graph database, and producing the searchable representation by traversing edges of the graph in response to input search queries.
Type: Application
Filed: October 22, 2021
Publication date: April 27, 2023
Inventors: Birgit Monika Pfitzmann, Christoph Auer, Kasper Dinkla, Michele Dolfi, Peter Willem Jan Staar
-
Patent number: 11556852
Abstract: A computer-implemented method for determining a set of target items to be annotated for training a machine learning application. The method comprises providing a training data set with a set of data samples and an auto-encoder with a classifier. The auto-encoder comprises an embedding model that maps the set of data samples to a set of compressed feature vectors. The set of compressed feature vectors define a compressed feature matrix. Further provided are: a definition of a graph associated to the compressed feature matrix, applying a clustering-algorithm to identify node clusters of the graph and applying a centrality algorithm to identify central nodes of the node clusters, retrieving from an annotator node labels for the central nodes, propagating the annotated node labels to other nodes of the graph and performing a training of the embedding model and the classifier with the annotated and the propagated node labels.
Type: Grant
Filed: March 6, 2020
Date of Patent: January 17, 2023
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Ralf Kaestner, Alexander Velizhev, Dal Noguer Hidalgo, Rita Kuznetsova, Konstantinos Bekas
-
Patent number: 11495038
Abstract: A computer-implemented method for processing a digital image. The digital image comprises one or more text cells, wherein each of the one or more text cells comprises a string and a bounding box. The method comprises receiving the digital image in a first format, the first format providing access to the strings and the bounding boxes of the one more text cells. The methods further comprises encoding the strings of the one or more text cells as visual pattern according to a predefined string encoding scheme and providing the digital image in a second format. The second format comprises the visual pattern of the strings of the one or more text cells. A corresponding system and a related computer program product is provided.
Type: Grant
Filed: March 6, 2020
Date of Patent: November 8, 2022
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Patent number: 11494588
Abstract: A method, system and computer program product to generate a training data set for image segmentation applications, comprising providing a set of input documents of a first format. The input documents each comprise one or more pages. The input documents are split into individual document pages and parsed. Parsing comprises identifying a predefined set of items including position information of the position of the predefined set of items in the individual document pages; generating a bitmap image of a second format for each individual document page of the first format. The bitmap image comprises a predefined number of pixels. A mask is generated for each individual document. The mask comprises the predefined number of pixels of the corresponding bitmap image. Generating the mask comprises assigning an encoded class label to each pixel of the mask based on the position information of identified items of the predefined set of items.
Type: Grant
Filed: March 6, 2020
Date of Patent: November 8, 2022
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
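Illustrative note: the mask-generation step in this abstract (one encoded class label per pixel, derived from item positions) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the class names, the integer encoding in `CLASS_IDS`, the rectangular bounding boxes, and the function `build_mask` are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical label encoding; 0 is used for background pixels.
CLASS_IDS = {"background": 0, "text": 1, "table": 2, "picture": 3}

# (label, x0, y0, x1, y1) with x1 and y1 exclusive; an assumed item format.
Item = Tuple[str, int, int, int, int]


def build_mask(width: int, height: int, items: List[Item]) -> List[List[int]]:
    """Return a height x width grid holding one class id per pixel."""
    mask = [[CLASS_IDS["background"]] * width for _ in range(height)]
    for label, x0, y0, x1, y1 in items:
        class_id = CLASS_IDS[label]
        # Clip each box to the page so out-of-range boxes cannot fail.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = class_id  # later items overwrite earlier ones
    return mask


# Usage: an 8x6 "page" with one text line and one table region.
page_items = [("text", 1, 1, 5, 2), ("table", 2, 3, 6, 5)]
mask = build_mask(8, 6, page_items)
```

The mask has the same pixel count as the page bitmap, so the two can be paired directly as a training sample for a segmentation model.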
-
Publication number: 20220292121
Abstract: The present disclosure relates to a method for searching a graph representing content of digital objects. A set of operations for traversing the graph may be determined according to a search request. The set of operations may be executed, resulting in intermediate result vectors of nodes and a result vector of nodes, wherein the result vector of nodes is associated with a result set of one or more object units of the digital objects. Intermediate result vectors may be selected from of the intermediate result vectors. A set of result entities may be identified. The set of result entities are entities which are part of the object units and part of entities represented by nodes of said selected intermediate result vectors. The set of result entities and the result set of object units may be provided as a result of the search request.
Type: Application
Filed: March 12, 2021
Publication date: September 15, 2022
Inventors: Birgit Monika Pfitzmann, Kasper Dinkla, Michele Dolfi, Christoph Auer, Peter Willem Jan Staar, André Carvalho
-
Patent number: 11416581
Abstract: Aspects of the present invention disclose a method, computer program product, and system for performing a multiplication of a matrix with an input vector. The method includes one or more processors subdividing a matrix into logical segments, the matrix being given in a sparse-matrix data format. The method further includes one or more processors obtaining one or more test vectors. The method further includes one or more processors performing an optimization cycle. In an additional aspect, performing the optimization cycle further comprises, for each of the test vectors, one or more processors, performing a cache performance test.
Type: Grant
Filed: March 9, 2020
Date of Patent: August 16, 2022
Assignee: International Business Machines Corporation
Inventors: Leonidas Georgopoulos, Peter Staar, Michele Dolfi, Christoph Auer, Konstantinos Bekas
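Illustrative note: the general idea of multiplying a sparse-format matrix with a vector segment by segment can be sketched as below. This is only a sketch under stated assumptions: the abstract does not name the sparse format or how segments are chosen, so CSR storage, fixed-size row segments, and the function name `csr_matvec_segmented` are assumptions. The optimization cycle and cache performance test are not modelled here.

```python
from typing import List


def csr_matvec_segmented(
    indptr: List[int],
    indices: List[int],
    data: List[float],
    x: List[float],
    segment_rows: int,
) -> List[float]:
    """Compute y = A @ x for a CSR matrix, one row segment at a time."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    # Walk the matrix in blocks of `segment_rows` rows (assumed segmentation).
    for start in range(0, n_rows, segment_rows):
        end = min(start + segment_rows, n_rows)
        for row in range(start, end):
            acc = 0.0
            # Nonzeros of this row live in data[indptr[row]:indptr[row + 1]].
            for k in range(indptr[row], indptr[row + 1]):
                acc += data[k] * x[indices[k]]
            y[row] = acc
    return y


# Usage: A = [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in CSR form, x = [1, 2, 3].
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
y = csr_matvec_segmented(indptr, indices, data, [1.0, 2.0, 3.0], segment_rows=2)
```

The segment size does not change the result, only the order in which memory is touched, which is why it is a natural parameter to tune against measured performance.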
-
Patent number: 11361146
Abstract: The invention is notably directed to a computer-implemented method for processing a plurality of documents. The method comprises providing the plurality of documents in a first format and splitting each of the plurality of documents of the first format into one or more individual pages. The method further comprises individually parsing the one or more individual pages of the plurality of documents. The parsing comprises identifying a predefined set of items of the one or more individual pages. Further processing comprises gathering the predefined set of items of each of the one or more individual pages of the plurality of documents into individual page files of a second format and performing the document processing service with the individual page files of the second format. The invention further concerns a corresponding computing system and a related computer program product.
Type: Grant
Filed: March 6, 2020
Date of Patent: June 14, 2022
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Publication number: 20220019907
Abstract: In an approach for a dynamic in-memory construction of a knowledge graph structure, the knowledge graph structure comprising a plurality of nodes and edges linking selected nodes to each other, a processor receives a record comprising a plurality of strings. The plurality of strings relates to a command combined with a set of strings. A processor determines content records relating to nodes relating to each of the strings. A processor assigns node identifiers for respective determined content records. A processor appends the node identifiers to a dynamic in-memory knowledge graph structure. A processor modifies an edge between selected ones of the node identifiers based on the command combined with the set of strings. A processor builds the dynamic in-memory knowledge graph structure.
Type: Application
Filed: July 20, 2020
Publication date: January 20, 2022
Inventors: Leonidas Georgopoulos, Peter Willem Jan Staar, Christoph Auer, Michele Dolfi, Konstantinos Bekas
-
Patent number: 11205287
Abstract: Computer-implemented methods and apparatus are provided for annotating digital images of line plots with ground truth labels. For each digital image, such a method includes supplying image data defining the image of a line plot to a machine-learning model trained to generate a set of control points defining a spline corresponding to the line plot. The method further comprises displaying the spline, and the set of control points, superimposed on the image in a graphical user interface and, in response to user manipulation via the graphical user interface of one or more control points, dynamically adjusting the displayed spline in accordance with manipulated control points whereby the displayed spline can be adjusted for conformity with the line plot. The set of control points for the adjusted spline is then stored as a ground truth label for the image.
Type: Grant
Filed: March 27, 2020
Date of Patent: December 21, 2021
Assignee: International Business Machines Corporation
Inventors: Martin Rufli, Ralf Kaestner, Alexander Velizhev, Peter Willem Jan Staar, Michele Dolfi, Elliot Jacques Vincent, Christoph Auer
-
Publication number: 20210304463
Abstract: Computer-implemented methods and apparatus are provided for annotating digital images of line plots with ground truth labels. For each digital image, such a method includes supplying image data defining the image of a line plot to a machine-learning model trained to generate a set of control points defining a spline corresponding to the line plot. The method further comprises displaying the spline, and the set of control points, superimposed on the image in a graphical user interface and, in response to user manipulation via the graphical user interface of one or more control points, dynamically adjusting the displayed spline in accordance with manipulated control points whereby the displayed spline can be adjusted for conformity with the line plot. The set of control points for the adjusted spline is then stored as a ground truth label for the image.
Type: Application
Filed: March 27, 2020
Publication date: September 30, 2021
Inventors: Martin Rufli, Ralf Kaestner, Alexander Velizhev, Peter Willem Jan Staar, Michele Dolfi, Elliot Jacques Vincent, Christoph Auer
-
Publication number: 20210279516
Abstract: A method, system and computer program product to generate a training data set for image segmentation applications, comprising providing a set of input documents of a first format. The input documents each comprise one or more pages. The input documents are split into individual document pages and parsed. Parsing comprises identifying a predefined set of items including position information of the position of the predefined set of items in the individual document pages; generating a bitmap image of a second format for each individual document page of the first format. The bitmap image comprises a predefined number of pixels. A mask is generated for each individual document. The mask comprises the predefined number of pixels of the corresponding bitmap image. Generating the mask comprises assigning an encoded class label to each pixel of the mask based on the position information of identified items of the predefined set of items.
Type: Application
Filed: March 6, 2020
Publication date: September 9, 2021
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Publication number: 20210279299
Abstract: Aspects of the present invention disclose a method, computer program product, and system for performing a multiplication of a matrix with an input vector. The method includes one or more processors subdividing a matrix into logical segments, the matrix being given in a sparse-matrix data format. The method further includes one or more processors obtaining one or more test vectors. The method further includes one or more processors performing an optimization cycle. In an additional aspect, performing the optimization cycle further comprises, for each of the test vectors, one or more processors, performing a cache performance test.
Type: Application
Filed: March 9, 2020
Publication date: September 9, 2021
Inventors: Leonidas Georgopoulos, Peter Staar, Michele Dolfi, Christoph Auer, Konstantinos Bekas
-
Publication number: 20210279532
Abstract: A computer-implemented method for processing a digital image. The digital image comprises one or more text cells, wherein each of the one or more text cells comprises a string and a bounding box. The method comprises receiving the digital image in a first format, the first format providing access to the strings and the bounding boxes of the one more text cells. The methods further comprises encoding the strings of the one or more text cells as visual pattern according to a predefined string encoding scheme and providing the digital image in a second format. The second format comprises the visual pattern of the strings of the one or more text cells. A corresponding system and a related computer program product is provided.
Type: Application
Filed: March 6, 2020
Publication date: September 9, 2021
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Publication number: 20210279400
Abstract: The invention is notably directed to a computer-implemented method for processing a plurality of documents. The method comprises providing the plurality of documents in a first format and splitting each of the plurality of documents of the first format into one or more individual pages. The method further comprises individually parsing the one or more individual pages of the plurality of documents. The parsing comprises identifying a predefined set of items of the one or more individual pages. Further processing comprises gathering the predefined set of items of each of the one or more individual pages of the plurality of documents into individual page files of a second format and performing the document processing service with the individual page files of the second format. The invention further concerns a corresponding computing system and a related computer program product.
Type: Application
Filed: March 6, 2020
Publication date: September 9, 2021
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Konstantinos Bekas
-
Publication number: 20210279636
Abstract: A computer-implemented method for determining a set of target items to be annotated for training a machine learning application. The method comprises providing a training data set with a set of data samples and an auto-encoder with a classifier. The auto-encoder comprises an embedding model that maps the set of data samples to a set of compressed feature vectors. The set of compressed feature vectors define a compressed feature matrix. Further provided are: a definition of a graph associated to the compressed feature matrix, applying a clustering-algorithm to identify node clusters of the graph and applying a centrality algorithm to identify central nodes of the node clusters, retrieving from an annotator node labels for the central nodes, propagating the annotated node labels to other nodes of the graph and performing a training of the embedding model and the classifier with the annotated and the propagated node labels.
Type: Application
Filed: March 6, 2020
Publication date: September 9, 2021
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Ralf Kaestner, Alexander Velizhev, Dal Noguer Hidalgo, Rita Kuznetsova, Konstantinos Bekas
-
Patent number: 11086861
Abstract: A computer-implemented method for generating ground-truth for natural language querying may include providing a knowledge graph as data model, receiving a natural language query from a user and translating the natural language query into a formal data query. The method can also include visualizing the formal data query to the user and receiving a feedback response from the user. The feedback response can include a verified and/or edited formal data query. The method can also include storing the natural language query and the corresponding feedback response as ground-truth pair. Corresponding system and a related computer program product may be provided.
Type: Grant
Filed: June 20, 2019
Date of Patent: August 10, 2021
Assignee: International Business Machines Corporation
Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Aleksandros Sobczyk, Tim Jan Baccaert, Konstantinos Bekas