Patents by Inventor Christoph Adrian Miksovic Czasch
Christoph Adrian Miksovic Czasch has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11983208
Abstract: A method, computer system, and a computer program product for searching are provided. The method may include receiving a word and a context of the word. The context may include additional words. A first word embedding may be generated by inputting a sequence into a word embedding model that resultantly outputs the first word embedding. The sequence may include the word and the context that are concatenated to each other in the sequence. The first word embedding may be compared with other word embeddings. The other word embeddings may have been generated by inputting respective text portions of other texts into the word embedding model. A candidate match of the other texts may be presented. A respective word embedding of the candidate match may be, of the other word embeddings, most similar to the first word embedding according to the comparing.
Type: Grant
Filed: February 16, 2021
Date of Patent: May 14, 2024
Assignee: International Business Machines Corporation
Inventors: Richard Obinna Osuala, Christoph Adrian Miksovic Czasch
-
Publication number: 20240119276
Abstract: Generating a neural network model for producing explainable prediction outputs for input data samples is provided. Training dataset of data samples are provided, each having a prediction label indicating a desired prediction output from the model for that sample, and a set of concept vectors are defined comprising a plurality of concept vectors which are associated with respective predefined concepts characterizing information content of the data samples. A set of input vectors are produced from each data sample. A neural network model is trained that includes a cross-attention module for producing a sample embedding for a data sample and a prediction module for producing a prediction output from the sample embedding.
Type: Application
Filed: September 30, 2022
Publication date: April 11, 2024
Inventors: MATTIA RIGOTTI, IOANA GIURGIU, THOMAS GSCHWIND, CHRISTOPH ADRIAN MIKSOVIC CZASCH, PAOLO SCOTTON
-
Publication number: 20230012602
Abstract: A system and a method for determining a similarity between a first string and a second string. A sequence of edit operations are performed on the first string in order to obtain the second string may be determined. The edit operation is of a first type or a second type. The first type operation comprises a character insertion operation or character removal operation. The second type operation comprises a character maintenance operation. The first type edit operation is associated with an operation score indicative of a cost for applying the edit operation. The first type edit operation is associated with a switching score indicative whether it is immediately followed by a second type edit operation. The switching scores and/or operation scores associated with the sequence of edit operations are combined in order to obtain a combined score that is indicative of the similarity level between the first and second strings.
Type: Application
Filed: July 14, 2021
Publication date: January 19, 2023
Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton
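As a rough illustration of the scoring idea in the abstract above, the following Python sketch derives one insert/remove/keep sequence from a longest-common-subsequence alignment and then adds an operation score for every insertion or removal plus a switching score whenever such an operation is immediately followed by a keep. The function names, the LCS-based alignment, and the cost values (1.0 and 0.5) are assumptions made for this example and are not taken from the patent; the sketch scores a single edit sequence and does not search for the sequence with the best combined score.

```python
def edit_sequence(a: str, b: str) -> list[str]:
    """Return one insert/remove/keep sequence that turns a into b (LCS-based)."""
    n, m = len(a), len(b)
    # lcs[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    ops: list[str] = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            ops.append("keep")      # second type: character maintenance
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            ops.append("remove")    # first type: character removal
            i += 1
        else:
            ops.append("insert")    # first type: character insertion
            j += 1
    ops.extend(["remove"] * (n - i))
    ops.extend(["insert"] * (m - j))
    return ops


def combined_score(ops: list[str], op_cost: float = 1.0,
                   switch_cost: float = 0.5) -> float:
    """Sum operation scores and switching scores (hypothetical cost values)."""
    total = 0.0
    for k, op in enumerate(ops):
        if op == "keep":
            continue
        total += op_cost
        # Switching score: a first-type operation directly followed by a keep.
        if k + 1 < len(ops) and ops[k + 1] == "keep":
            total += switch_cost
    return total
```

Under these assumed costs a lower combined score means the two strings are more similar, and identical strings score 0.0.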
-
Patent number: 11556593
Abstract: A system and a method for determining a similarity between a first string and a second string. A sequence of edit operations are performed on the first string in order to obtain the second string may be determined. The edit operation is of a first type or a second type. The first type operation comprises a character insertion operation or character removal operation. The second type operation comprises a character maintenance operation. The first type edit operation is associated with an operation score indicative of a cost for applying the edit operation. The first type edit operation is associated with a switching score indicative whether it is immediately followed by a second type edit operation. The switching scores and/or operation scores associated with the sequence of edit operations are combined in order to obtain a combined score that is indicative of the similarity level between the first and second strings.
Type: Grant
Filed: July 14, 2021
Date of Patent: January 17, 2023
Assignee: International Business Machines Corporation
Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton
-
Patent number: 11550777
Abstract: The present disclosure relates to a method for enabling a processing of a dataset of records having a set of attributes. The method comprises: selecting a first attribute of the set of attributes and a subset of one or more second attributes of the set of attributes. Distinct values of the subset of second attributes may be determined from the dataset. For each distinct value of the determined distinct values records of the dataset that have said each distinct value may be identified, and a group of words may be formed from values of the first attribute of the identified records. Distinct word sequences may be identified in the formed groups and a level of presence of each word sequence of the word sequences in each of the formed groups may be determined. At least part of the levels of presence may be provided as metadata.
Type: Grant
Filed: July 29, 2020
Date of Patent: January 10, 2023
Assignee: International Business Machines Corporation
Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton
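The grouping steps in the abstract above can be pictured with a small Python sketch. It groups records by the values of the second attributes, collects the words of the first attribute in each group, and reports, for every word sequence, the fraction of the group's records that contain it. Treating word sequences as n-grams of up to two words, defining the level of presence as that fraction, and the names `presence_levels` and `ngrams` are all assumptions made for this example; the abstract does not fix any of these choices.

```python
from collections import defaultdict


def ngrams(words, max_n):
    """Yield every contiguous word sequence of length 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield tuple(words[i:i + n])


def presence_levels(records, first_attr, second_attrs, max_n=2):
    """Map each distinct second-attribute value to {word sequence: level}."""
    # One group of word lists per distinct combination of second-attribute values.
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[a] for a in second_attrs)
        groups[key].append(rec[first_attr].lower().split())

    levels = {}
    for key, word_lists in groups.items():
        counts = defaultdict(int)
        for words in word_lists:
            # Count each sequence at most once per record.
            for seq in set(ngrams(words, max_n)):
                counts[seq] += 1
        # Assumed definition: share of the group's records containing the sequence.
        levels[key] = {seq: c / len(word_lists) for seq, c in counts.items()}
    return levels


# Hypothetical records, used only to exercise the sketch.
records = [
    {"name": "IBM Research Zurich", "country": "CH"},
    {"name": "IBM Research", "country": "CH"},
    {"name": "IBM Corp", "country": "US"},
]
metadata = presence_levels(records, "name", ["country"])
```

With these example records, `metadata[("CH",)]` reports the sequence `("ibm", "research")` in every record of the group and `("zurich",)` in half of them.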
-
Patent number: 11520764
Abstract: A computer-implemented method and a related system for record linkage of an incoming record to a reference data set may be provided. The method comprises providing a reference data set comprising a plurality of records, each record comprising a plurality of attributes. The method comprises further assigning each of the plurality of records an initial surrogate identifier value, assigning a plurality of block identifiers to each of the records by applying a locality sensitive hashing function to a predefined attribute of the records, resulting in the plurality of the block identifiers, and determining a final surrogate identifier value to each of the records assigned to one of the blocks such that the final surrogate identifier values in each block are uniformly distributed.
Type: Grant
Filed: June 27, 2019
Date of Patent: December 6, 2022
Assignee: International Business Machines Corporation
Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton
-
Publication number: 20220374703
Abstract: Computer-implemented methods, computer program products, and computer systems for training of an explaining machine-learning model is disclosed. The computer-implemented method may include one or more processors configured for providing an untrained machine-learning model, providing training data for the machine-learning model comprising training input data elements, wherein each of the training input data elements relates to a prediction label representing an expected prediction value as well as to a concept label, wherein the concept label relates to a reason why the expected prediction label is expected given the training input data elements, and simultaneously updating, during a supervised training of the machine-learning model, prediction parameter values as well as concept parameter values, thereby building the explaining machine-learning model.
Type: Application
Filed: May 18, 2021
Publication date: November 24, 2022
Inventors: Mattia Rigotti, Christoph Adrian Miksovic Czasch, Paolo Scotton, Thomas Gschwind, Adelmo Cristiano Innocenza Malossi, Thomas Frick, Filip Michal Janicki
-
Publication number: 20220261428
Abstract: A method, computer system, and a computer program product for searching are provided. The method may include receiving a word and a context of the word. The context may include additional words. A first word embedding may be generated by inputting a sequence into a word embedding model that resultantly outputs the first word embedding. The sequence may include the word and the context that are concatenated to each other in the sequence. The first word embedding may be compared with other word embeddings. The other word embeddings may have been generated by inputting respective text portions of other texts into the word embedding model. A candidate match of the other texts may be presented. A respective word embedding of the candidate match may be, of the other word embeddings, most similar to the first word embedding according to the comparing.
Type: Application
Filed: February 16, 2021
Publication date: August 18, 2022
Inventors: Richard Obinna Osuala, Christoph Adrian Miksovic Czasch
-
Publication number: 20220245425
Abstract: A knowledge graph embedding method, system, and computer program product using a computing device to embed a knowledge graph using a graph convolutional network, the method including learning, by the computing device, an embedding of the knowledge graph that includes entities, relations, and edges, weighing, by the computing device, initial feature vectors of nodes and a convolutional layer output to compute a weight and modifying the embedding based on the weight, and using, by the computing device, the modified embedding to perform a task related to the knowledge graph.
Type: Application
Filed: January 29, 2021
Publication date: August 4, 2022
Inventors: Nasrullah Sheikh, Xiao Qin, Berthold Reinwald, Christoph Adrian Miksovic Czasch, Thomas Gschwind, Paolo Scotton
-
Publication number: 20220035792
Abstract: The present disclosure relates to a method for enabling a processing of a dataset of records having a set of attributes. The method comprises: selecting a first attribute of the set of attributes and a subset of one or more second attributes of the set of attributes. Distinct values of the subset of second attributes may be determined from the dataset. For each distinct value of the determined distinct values records of the dataset that have said each distinct value may be identified, and a group of words may be formed from values of the first attribute of the identified records. Distinct word sequences may be identified in the formed groups and a level of presence of each word sequence of the word sequences in each of the formed groups may be determined. At least part of the levels of presence may be provided as metadata.
Type: Application
Filed: July 29, 2020
Publication date: February 3, 2022
Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton
-
Patent number: 11182395
Abstract: A given query entity of a query database and a set of reference entities from a master database are accessed; each entity accessed corresponds to an entry in a respective database, which is mapped to a set of words that are decomposed into tokens. For each reference entity, a closest token is identified therein for each token of the given query entity, via a given string metric. A number of closest tokens are thus respectively associated with highest scores of similarity between tokens of the query entity and tokens of each reference entity. An entity similarity score is computed based on said highest scores. A reference entity of the master database is identified, which is closest to said given query entity, based on the entity similarity score. Records of the given query entity are linked to records of the master database, based on the closest reference entity identified.
Type: Grant
Filed: May 15, 2018
Date of Patent: November 23, 2021
Assignee: International Business Machines Corporation
Inventors: Katsiaryna Mirylenka, Paolo Scotton, Christoph Adrian Miksovic Czasch, Andreas Schade
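The token-matching procedure in the abstract above can be sketched in a few lines of Python. For each token of the query entity, the sketch finds the best-scoring token in a reference entity, averages those best scores into an entity similarity, and picks the reference entity with the highest value. The whitespace tokenizer, the use of `difflib.SequenceMatcher.ratio()` as the string metric, the averaging step, and all function names are assumptions made for this example; the abstract only says that a given string metric is used and that the entity score is based on the highest token scores.

```python
from difflib import SequenceMatcher


def tokens(name: str) -> list[str]:
    """Decompose an entity name into lowercase whitespace-separated tokens."""
    return name.lower().split()


def token_similarity(a: str, b: str) -> float:
    """Assumed string metric: difflib's similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def entity_similarity(query: str, reference: str) -> float:
    """Average, over query tokens, of the best match among reference tokens."""
    q, r = tokens(query), tokens(reference)
    if not q or not r:
        return 0.0
    best = [max(token_similarity(t, u) for u in r) for t in q]
    return sum(best) / len(best)


def closest_reference(query: str, references: list[str]) -> str:
    """Return the reference entity with the highest entity similarity."""
    return max(references, key=lambda ref: entity_similarity(query, ref))


# Hypothetical master-database entries, used only to exercise the sketch.
master = ["International Business Machines Corp", "Business Objects", "Machine Zone"]
match = closest_reference("Intl Business Machines", master)
```

Linking records then amounts to attaching the query entity's records to the entry returned by `closest_reference`; a production system would likely also apply a minimum-score threshold, which this sketch omits.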
-
Publication number: 20200409922
Abstract: A computer-implemented method and a related system for record linkage of an incoming record to a reference data set may be provided. The method comprises providing a reference data set comprising a plurality of records, each record comprising a plurality of attributes. The method comprises further assigning each of the plurality of records an initial surrogate identifier value, assigning a plurality of block identifiers to each of the records by applying a locality sensitive hashing function to a predefined attribute of the records, resulting in the plurality of the block identifiers, and determining a final surrogate identifier value to each of the records assigned to one of the blocks such that the final surrogate identifier values in each block are uniformly distributed.
Type: Application
Filed: June 27, 2019
Publication date: December 31, 2020
Inventors: Thomas Gschwind, Christoph Adrian Miksovic Czasch, Paolo Scotton
-
Publication number: 20190354596
Abstract: A given query entity of a query database and a set of reference entities from a master database are accessed; each entity accessed corresponds to an entry in a respective database, which is mapped to a set of words that are decomposed into tokens. For each reference entity, a closest token is identified therein for each token of the given query entity, via a given string metric. A number of closest tokens are thus respectively associated with highest scores of similarity between tokens of the query entity and tokens of each reference entity. An entity similarity score is computed based on said highest scores. A reference entity of the master database is identified, which is closest to said given query entity, based on the entity similarity score. Records of the given query entity are linked to records of the master database, based on the closest reference entity identified.
Type: Application
Filed: May 15, 2018
Publication date: November 21, 2019
Inventors: Katsiaryna Mirylenka, Paolo Scotton, Christoph Adrian Miksovic Czasch, Andreas Schade