Patents by Inventor Christoph Adrian Miksovic Czasch

Christoph Adrian Miksovic Czasch has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Selection-based searching using concatenated word and context

Patent number: 11983208

Abstract: A method, computer system, and a computer program product for searching are provided. The method may include receiving a word and a context of the word. The context may include additional words. A first word embedding may be generated by inputting a sequence into a word embedding model that resultantly outputs the first word embedding. The sequence may include the word and the context that are concatenated to each other in the sequence. The first word embedding may be compared with other word embeddings. The other word embeddings may have been generated by inputting respective text portions of other texts into the word embedding model. A candidate match of the other texts may be presented. A respective word embedding of the candidate match may be, of the other word embeddings, most similar to the first word embedding according to the comparing.

Type: Grant

Filed: February 16, 2021

Date of Patent: May 14, 2024

Assignee: International Business Machines Corporation

Inventors: Richard Obinna Osuala, Christoph Adrian Miksovic Czasch
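
The search flow described in the abstract above (concatenate the word with its context, embed the sequence, compare against stored embeddings, present the most similar text) might be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: `toy_embed` is a hypothetical stand-in for a real word embedding model (it merely counts tokens), and all function names are invented for this sketch.

```python
import math
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for a word embedding model: a sparse token-count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(word, context, texts):
    """Embed the word concatenated with its context and return the most similar text."""
    query = toy_embed(word + " " + context)
    return max(texts, key=lambda t: cosine(query, toy_embed(t)))

texts = ["river bank erosion water", "bank loan interest money"]
best = search("bank", "money loan interest", texts)
```

With a real embedding model in place of `toy_embed`, the same structure would apply: the context shifts the query embedding so that the ambiguous word "bank" is matched against the financial text rather than the river text.
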
EXPLAINABLE PREDICTION MODELS BASED ON CONCEPTS

Publication number: 20240119276

Abstract: Generating a neural network model for producing explainable prediction outputs for input data samples is provided. A training dataset of data samples is provided, each sample having a prediction label indicating a desired prediction output from the model for that sample, and a set of concept vectors is defined, comprising a plurality of concept vectors which are associated with respective predefined concepts characterizing information content of the data samples. A set of input vectors is produced from each data sample. A neural network model is trained that includes a cross-attention module for producing a sample embedding for a data sample and a prediction module for producing a prediction output from the sample embedding.

Type: Application

Filed: September 30, 2022

Publication date: April 11, 2024

Inventors: Mattia Rigotti, Ioana Giurgiu, Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton

STRING SIMILARITY DETERMINATION

Publication number: 20230012602

Abstract: A system and a method for determining a similarity between a first string and a second string. A sequence of edit operations to be performed on the first string in order to obtain the second string may be determined. Each edit operation is of a first type or a second type. The first type operation comprises a character insertion operation or character removal operation. The second type operation comprises a character maintenance operation. The first type edit operation is associated with an operation score indicative of a cost for applying the edit operation. The first type edit operation is associated with a switching score indicative of whether it is immediately followed by a second type edit operation. The switching scores and/or operation scores associated with the sequence of edit operations are combined in order to obtain a combined score that is indicative of the similarity level between the first and second strings.

Type: Application

Filed: July 14, 2021

Publication date: January 19, 2023

Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton
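
One way to read the scoring scheme in the abstract above is as a dynamic program over insert, remove, and keep operations. The sketch below is an assumption-laden illustration, not the claimed method: the abstract does not say how scores are combined, so this version simply sums them and takes the cheapest sequence, and the names `switching_edit_score`, `op_cost`, and `switch_cost` are hypothetical.

```python
def switching_edit_score(a, b, op_cost=1.0, switch_cost=0.5):
    """Lowest combined score for turning string a into string b.

    Insertions and removals each cost op_cost. An insertion or removal that is
    immediately followed by a keep (character maintenance) adds switch_cost.
    Lower scores mean more similar strings (assumed convention).
    """
    INF = float("inf")
    KEEP, EDIT = 0, 1  # type of the most recent operation
    n, m = len(a), len(b)
    # dp[i][j][last] = best score after consuming a[:i] and producing b[:j]
    dp = [[[INF, INF] for _ in range(m + 1)] for _ in range(n + 1)]
    dp[0][0][KEEP] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            for last in (KEEP, EDIT):
                cur = dp[i][j][last]
                if cur == INF:
                    continue
                if i < n:  # remove a[i]
                    dp[i + 1][j][EDIT] = min(dp[i + 1][j][EDIT], cur + op_cost)
                if j < m:  # insert b[j]
                    dp[i][j + 1][EDIT] = min(dp[i][j + 1][EDIT], cur + op_cost)
                if i < n and j < m and a[i] == b[j]:  # keep the character
                    extra = switch_cost if last == EDIT else 0.0
                    dp[i + 1][j + 1][KEEP] = min(dp[i + 1][j + 1][KEEP], cur + extra)
    return min(dp[n][m])

scattered = switching_edit_score("axbxc", "abc")   # two separate removals
contiguous = switching_edit_score("abcxx", "abc")  # one trailing run of removals
```

Under these assumed costs, the same number of removals scores higher when the removals are scattered between kept characters than when they form one contiguous run, which is the effect the switching score appears intended to capture.
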
String similarity determination

Patent number: 11556593

Abstract: A system and a method for determining a similarity between a first string and a second string. A sequence of edit operations to be performed on the first string in order to obtain the second string may be determined. Each edit operation is of a first type or a second type. The first type operation comprises a character insertion operation or character removal operation. The second type operation comprises a character maintenance operation. The first type edit operation is associated with an operation score indicative of a cost for applying the edit operation. The first type edit operation is associated with a switching score indicative of whether it is immediately followed by a second type edit operation. The switching scores and/or operation scores associated with the sequence of edit operations are combined in order to obtain a combined score that is indicative of the similarity level between the first and second strings.

Type: Grant

Filed: July 14, 2021

Date of Patent: January 17, 2023

Assignee: International Business Machines Corporation

Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton

Determining metadata of a dataset

Patent number: 11550777

Abstract: The present disclosure relates to a method for enabling a processing of a dataset of records having a set of attributes. The method comprises: selecting a first attribute of the set of attributes and a subset of one or more second attributes of the set of attributes. Distinct values of the subset of second attributes may be determined from the dataset. For each distinct value of the determined distinct values, records of the dataset that have said distinct value may be identified, and a group of words may be formed from values of the first attribute of the identified records. Distinct word sequences may be identified in the formed groups, and a level of presence of each word sequence of the word sequences in each of the formed groups may be determined. At least part of the levels of presence may be provided as metadata.

Type: Grant

Filed: July 29, 2020

Date of Patent: January 10, 2023

Assignee: International Business Machines Corporation

Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton
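
The grouping and counting steps in the abstract above can be illustrated with a small sketch. Several choices here are assumptions rather than details from the patent: word sequences are taken to be n-grams of one or two words, and the "level of presence" is taken to be the fraction of a group's records that contain the sequence. The function and parameter names are hypothetical.

```python
from collections import defaultdict

def ngrams(tokens, max_n=2):
    """All contiguous word sequences of length 1..max_n, as tuples."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }

def presence_metadata(records, first_attr, second_attrs, max_n=2):
    """Map each distinct second-attribute value to {word sequence: level of presence}."""
    # Group the first attribute's words by the distinct values of the second attributes.
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[a] for a in second_attrs)
        groups[key].append(str(rec[first_attr]).lower().split())
    # Collect the distinct word sequences seen in any group.
    sequences = set()
    for docs in groups.values():
        for tokens in docs:
            sequences |= ngrams(tokens, max_n)
    # Level of presence (assumed): share of the group's records containing the sequence.
    return {
        key: {
            seq: sum(seq in ngrams(tokens, max_n) for tokens in docs) / len(docs)
            for seq in sequences
        }
        for key, docs in groups.items()
    }

records = [
    {"desc": "red apple", "cat": "fruit"},
    {"desc": "green apple", "cat": "fruit"},
    {"desc": "red car", "cat": "vehicle"},
]
meta = presence_metadata(records, "desc", ["cat"])
```

In this toy dataset, the sequence "apple" is present in every "fruit" record and in no "vehicle" record, so its levels of presence separate the two groups cleanly.
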
Multicriteria record linkage with surrogate blocking keys

Patent number: 11520764

Abstract: A computer-implemented method and a related system for record linkage of an incoming record to a reference data set may be provided. The method comprises providing a reference data set comprising a plurality of records, each record comprising a plurality of attributes. The method further comprises assigning each of the plurality of records an initial surrogate identifier value, assigning a plurality of block identifiers to each of the records by applying a locality sensitive hashing function to a predefined attribute of the records, resulting in the plurality of the block identifiers, and determining a final surrogate identifier value for each of the records assigned to one of the blocks such that the final surrogate identifier values in each block are uniformly distributed.

Type: Grant

Filed: June 27, 2019

Date of Patent: December 6, 2022

Assignee: International Business Machines Corporation

Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton
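
The block identifier step in the abstract above relies on locality sensitive hashing. The sketch below shows one common LSH construction, MinHash with banding over character trigrams, purely as an assumed example; the abstract does not name a specific hashing scheme, and the surrogate identifier assignment is not covered here. All names and parameter values are hypothetical.

```python
import hashlib

def shingles(text, k=3):
    """Set of character k-grams of the lowercased text."""
    text = text.lower()
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}

def stable_hash(seed, value):
    """Deterministic 64-bit hash of value under a given seed."""
    data = f"{seed}:{value}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")

def block_ids(value, bands=4, rows=2):
    """Return one block identifier per band of a MinHash signature."""
    grams = shingles(value)
    # MinHash signature: the minimum hash of all shingles for each seed.
    signature = [min(stable_hash(seed, g) for g in grams) for seed in range(bands * rows)]
    ids = []
    for b in range(bands):
        band = tuple(signature[b * rows:(b + 1) * rows])
        # Prefix with the band index so that different bands never share a block.
        ids.append(f"{b}-{stable_hash('band', band):016x}")
    return ids

ids = block_ids("International Business Machines")
```

Records whose attribute values share many trigrams are likely, though not guaranteed, to receive at least one identical block identifier, so an incoming record only needs to be compared against records in its own blocks.
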
LEVERAGING EXPLANATIONS FOR TRAINING OF AN AI SYSTEM

Publication number: 20220374703

Abstract: Computer-implemented methods, computer program products, and computer systems for training of an explaining machine-learning model are disclosed. The computer-implemented method may include one or more processors configured for providing an untrained machine-learning model, providing training data for the machine-learning model comprising training input data elements, wherein each of the training input data elements relates to a prediction label representing an expected prediction value as well as to a concept label, wherein the concept label relates to a reason why the expected prediction label is expected given the training input data elements, and simultaneously updating, during a supervised training of the machine-learning model, prediction parameter values as well as concept parameter values, thereby building the explaining machine-learning model.

Type: Application

Filed: May 18, 2021

Publication date: November 24, 2022

Inventors: Mattia Rigotti, Christoph Adrian Miksovic Czasch, Paolo Scotton, Thomas Gschwind, Adelmo Cristiano Innocenza Malossi, Thomas Frick, Filip Michal Janicki

SELECTION-BASED SEARCHING USING CONCATENATED WORD AND CONTEXT

Publication number: 20220261428

Abstract: A method, computer system, and a computer program product for searching are provided. The method may include receiving a word and a context of the word. The context may include additional words. A first word embedding may be generated by inputting a sequence into a word embedding model that resultantly outputs the first word embedding. The sequence may include the word and the context that are concatenated to each other in the sequence. The first word embedding may be compared with other word embeddings. The other word embeddings may have been generated by inputting respective text portions of other texts into the word embedding model. A candidate match of the other texts may be presented. A respective word embedding of the candidate match may be, of the other word embeddings, most similar to the first word embedding according to the comparing.

Type: Application

Filed: February 16, 2021

Publication date: August 18, 2022

Inventors: Richard Obinna Osuala, Christoph Adrian Miksovic Czasch

KNOWLEDGE GRAPH EMBEDDING USING GRAPH CONVOLUTIONAL NETWORKS WITH RELATION-AWARE ATTENTION

Publication number: 20220245425

Abstract: A knowledge graph embedding method, system, and computer program product using a computing device to embed a knowledge graph using a graph convolutional network, the method including learning, by the computing device, an embedding of the knowledge graph that includes entities, relations, and edges, weighing, by the computing device, initial feature vectors of nodes and a convolutional layer output to compute a weight and modifying the embedding based on the weight, and using, by the computing device, the modified embedding to perform a task related to the knowledge graph.

Type: Application

Filed: January 29, 2021

Publication date: August 4, 2022

Inventors: Nasrullah Sheikh, Xiao Qin, Berthold Reinwald, Christoph Adrian Miksovic Czasch, Thomas Gschwind, Paolo Scotton

DETERMINING METADATA OF A DATASET

Publication number: 20220035792

Abstract: The present disclosure relates to a method for enabling a processing of a dataset of records having a set of attributes. The method comprises: selecting a first attribute of the set of attributes and a subset of one or more second attributes of the set of attributes. Distinct values of the subset of second attributes may be determined from the dataset. For each distinct value of the determined distinct values, records of the dataset that have said distinct value may be identified, and a group of words may be formed from values of the first attribute of the identified records. Distinct word sequences may be identified in the formed groups, and a level of presence of each word sequence of the word sequences in each of the formed groups may be determined. At least part of the levels of presence may be provided as metadata.

Type: Application

Filed: July 29, 2020

Publication date: February 3, 2022

Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton

Similarity matching systems and methods for record linkage

Patent number: 11182395

Abstract: A given query entity of a query database and a set of reference entities from a master database are accessed; each entity accessed corresponds to an entry in a respective database, which is mapped to a set of words that are decomposed into tokens. For each reference entity, a closest token is identified therein for each token of the given query entity, via a given string metric. A number of closest tokens are thus respectively associated with highest scores of similarity between tokens of the query entity and tokens of each reference entity. An entity similarity score is computed based on said highest scores. A reference entity of the master database is identified, which is closest to said given query entity, based on the entity similarity score. Records of the given query entity are linked to records of the master database, based on the closest reference entity identified.

Type: Grant

Filed: May 15, 2018

Date of Patent: November 23, 2021

Assignee: International Business Machines Corporation

Inventors: Katsiaryna Mirylenka, Paolo Scotton, Christoph Adrian Miksovic Czasch, Andreas Schade
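
The token-level matching in the abstract above can be sketched in a few lines. The choices below are assumptions for illustration only: tokens are whitespace-separated lowercase words, the string metric is the ratio from Python's standard `difflib.SequenceMatcher`, and the entity similarity score is the mean of the per-token highest scores. The function names are hypothetical.

```python
from difflib import SequenceMatcher

def tokenize(entity):
    """Decompose an entity name into lowercase word tokens (assumed tokenization)."""
    return entity.lower().split()

def token_similarity(a, b):
    """String metric between two tokens, in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def entity_score(query, reference):
    """Mean, over query tokens, of the score of the closest reference token."""
    q_tokens, r_tokens = tokenize(query), tokenize(reference)
    if not q_tokens or not r_tokens:
        return 0.0
    best = [max(token_similarity(q, r) for r in r_tokens) for q in q_tokens]
    return sum(best) / len(best)

def link(query, references):
    """Return the reference entity closest to the query entity."""
    return max(references, key=lambda ref: entity_score(query, ref))

references = ["ibm research zurich lab", "zurich insurance group"]
match = link("IBM Reserch Zurich", references)
```

Because each query token is matched to its closest reference token independently, a misspelled token such as "Reserch" still contributes a high score, and extra tokens in the reference entity (here "lab") do not penalize the match.
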
MULTICRITERIA RECORD LINKAGE WITH SURROGATE BLOCKING KEYS

Publication number: 20200409922

Abstract: A computer-implemented method and a related system for record linkage of an incoming record to a reference data set may be provided. The method comprises providing a reference data set comprising a plurality of records, each record comprising a plurality of attributes. The method further comprises assigning each of the plurality of records an initial surrogate identifier value, assigning a plurality of block identifiers to each of the records by applying a locality sensitive hashing function to a predefined attribute of the records, resulting in the plurality of the block identifiers, and determining a final surrogate identifier value for each of the records assigned to one of the blocks such that the final surrogate identifier values in each block are uniformly distributed.

Type: Application

Filed: June 27, 2019

Publication date: December 31, 2020

Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton

SIMILARITY MATCHING SYSTEMS AND METHODS FOR RECORD LINKAGE

Publication number: 20190354596

Abstract: A given query entity of a query database and a set of reference entities from a master database are accessed; each entity accessed corresponds to an entry in a respective database, which is mapped to a set of words that are decomposed into tokens. For each reference entity, a closest token is identified therein for each token of the given query entity, via a given string metric. A number of closest tokens are thus respectively associated with highest scores of similarity between tokens of the query entity and tokens of each reference entity. An entity similarity score is computed based on said highest scores. A reference entity of the master database is identified, which is closest to said given query entity, based on the entity similarity score. Records of the given query entity are linked to records of the master database, based on the closest reference entity identified.

Type: Application

Filed: May 15, 2018

Publication date: November 21, 2019

Inventors: Katsiaryna Mirylenka, Paolo Scotton, Christoph Adrian Miksovic Czasch, Andreas Schade