Patents by Inventor Mike W. Grasselt

Mike W. Grasselt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method for weighting a graph

Patent number: 11941056

Abstract: The present disclosure relates to a method for weighting a graph comprising nodes representing entities and edges representing relationships between entities in accordance with one or more domains. The method comprises: pre-processing the graph comprising assigning weights to the nodes and/or the edges of the graph in accordance with a specific domain of the domains, wherein the weight indicates a domain specific data quality problem of attribute values representing an edge of the edges and/or an entity involved in that edge. The weighted graph may be provided for enabling a processing of the graph in accordance with the specific domain.

Type: Grant

Filed: April 20, 2021

Date of Patent: March 26, 2024

Assignee: International Business Machines Corporation

Inventors: Martin Oberhofer, Mike W. Grasselt, Claudio Andrea Fanconi, Thuany Karoline Stuart, Yannick Saillet, Basem Elasioty, Hemanth Kumar Babu, Robert Kern
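
The weighting step described in the abstract above can be illustrated with a small sketch. It is not the patented method: the `Edge` class, the `REQUIRED_BY_DOMAIN` table, and the "fraction of missing attributes" rule are all assumptions chosen only to show one way a domain-specific data quality weight might be attached to edges.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    source: str
    target: str
    attributes: dict
    # Hypothetical storage: one weight per domain name.
    weights: dict = field(default_factory=dict)


# Assumption: each domain lists the edge attributes it needs to be filled in.
REQUIRED_BY_DOMAIN = {
    "fraud": ["since", "verified_by"],
    "marketing": ["since"],
}


def weight_edges(edges, domain):
    """Give each edge a weight in [0, 1] for `domain`.

    The weight is the fraction of required attributes that are missing
    or empty, so a higher weight signals a worse quality problem.
    """
    required = REQUIRED_BY_DOMAIN[domain]
    for edge in edges:
        missing = sum(1 for name in required if not edge.attributes.get(name))
        edge.weights[domain] = missing / len(required)
    return edges


edges = [
    Edge("alice", "acme", {"since": "2019", "verified_by": "registry"}),
    Edge("bob", "acme", {"since": "2020"}),
]
weight_edges(edges, "fraud")
```

Because the weights are keyed by domain, the same graph could carry a different weight for each domain without being copied.
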
Data classification

Patent number: 11748382

Abstract: A method provides for classifying data fields of a dataset. A classifier configured for determining confidence values for a plurality of data classes for the data fields may be applied. Using the confidence values, data class candidates may be identified. Data fields may be determined for which a plurality of data class candidates is identifiable. Using previous user-selected data class assignments, a probability may be determined for the data class candidates that the respective data class candidate is a data class to which the respective data field is to be assigned. The data fields may be classified using the probabilities to select for the data fields a data class from the data class candidates. The dataset may be provided with metadata identifying for the data fields the data classes to which the respective data fields are assigned.

Type: Grant

Filed: May 18, 2020

Date of Patent: September 5, 2023

Assignee: International Business Machines Corporation

Inventors: Yannick Saillet, Namit Kabra, Mike W. Grasselt, Krishna Kishore Bonagiri
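
As a rough illustration of the abstract above, the sketch below picks a data class for one field when several candidates pass a confidence cutoff. The function name `classify_field`, the 0.5 threshold, and the smoothed frequency estimate are assumptions made for this example; the patent itself does not specify them here.

```python
from collections import Counter


def classify_field(confidences, past_choices, threshold=0.5):
    """Pick a data class for one field.

    confidences:  {data_class: classifier confidence}
    past_choices: data classes users selected earlier for comparable fields
    """
    # Candidates are the classes whose confidence reaches the threshold.
    candidates = [c for c, value in confidences.items() if value >= threshold]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]

    # Several candidates: estimate a probability for each one from the
    # earlier user selections (add-one smoothing avoids zero estimates).
    counts = Counter(c for c in past_choices if c in candidates)
    total = sum(counts.values())

    def probability(data_class):
        return (counts[data_class] + 1) / (total + len(candidates))

    # Ties on probability fall back to the classifier confidence.
    return max(candidates, key=lambda c: (probability(c), confidences[c]))


chosen = classify_field(
    {"phone": 0.8, "fax": 0.8, "zip": 0.1},
    past_choices=["phone", "phone", "fax"],
)
```

Here `phone` and `fax` tie on classifier confidence, so the earlier user selections decide the outcome.
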
DETECT DATA STANDARDIZATION GAPS

Publication number: 20230185786

Abstract: A computer-implemented method for detecting reference data standardization gaps in data sets is disclosed. The method comprises identifying at least one reference data candidate in a data set, using an index for values of the identified at least one reference data candidate, and determining a difference between an earlier version of a reference data set relating to the reference data candidate and a current version of the reference data set. Furthermore, the method comprises comparing the determined difference with values of the index, and identifying entries in the at least one reference data candidate having a value identical to a value of the difference as a reference data standardization gap.

Type: Application

Filed: December 13, 2021

Publication date: June 15, 2023

Inventors: Albert Maier, Dennis Butterstein, Alexandre Luz Xavier Da Costa, Mike W. Grasselt, Timo Kussmaul, Yevgen Karpenko
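
The comparison described in the abstract above can be sketched as follows. This is a simplified reading, not the claimed method: it assumes the "difference" means values present in the earlier reference set but absent from the current one, and the names `build_index` and `find_gaps` are hypothetical.

```python
from collections import defaultdict


def build_index(column_values):
    """Map each distinct value to the row positions where it occurs."""
    index = defaultdict(list)
    for row, value in enumerate(column_values):
        index[value].append(row)
    return index


def find_gaps(column_values, old_reference, new_reference):
    """Return {value: rows} for column values that were dropped from the
    reference data between the earlier and the current version."""
    index = build_index(column_values)
    removed = set(old_reference) - set(new_reference)
    return {value: index[value] for value in sorted(removed) if value in index}


country_column = ["DE", "GB", "UK", "DE", "UK"]
gaps = find_gaps(
    country_column,
    old_reference={"DE", "GB", "UK"},
    new_reference={"DE", "GB"},
)
```

Looking the removed values up in the index avoids rescanning the whole column for each value.
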
CONDITIONAL ACCESS TO DATA

Publication number: 20230177193

Abstract: A database system can comprise records, each record including a set of attributes. The database system can further comprise database views, each database view representing a subset of the set of attributes. Data purpose objects indicating a subset of attributes of the set of attributes and a processing purpose can be stored. Each processing purpose can be associated with one or more entities that authorized access to the subset of attributes of the processing purpose. A request for data for a specific processing purpose and a selected view of the database views can be received. A data purpose object that indicates the specific processing purpose can be retrieved. The subset of attributes represented by the selected view can be compared with the subset of the attributes indicated in the retrieved data purpose object. Values of the subset of attributes of the selected view can be provided.

Type: Application

Filed: December 8, 2021

Publication date: June 8, 2023

Inventors: Lars Bremer, Albert Maier, Mike W. Grasselt, Yannick Saillet, Michael Baessler
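
A minimal sketch of the purpose check in the abstract above, under stated assumptions: data purpose objects are modeled as a plain dictionary from purpose name to permitted attributes, and only the attributes found in both the selected view and the purpose object are returned. The function name `read_for_purpose` and this intersection rule are illustrative choices, not taken from the application.

```python
def read_for_purpose(record, view_attributes, purpose_objects, purpose):
    """Return the values of a view that the given purpose permits.

    record:          {attribute: value} for one database record
    view_attributes: attributes exposed by the selected database view
    purpose_objects: {purpose name: set of permitted attributes}
    """
    allowed = purpose_objects.get(purpose)
    if allowed is None:
        raise PermissionError(f"no data purpose object for {purpose!r}")
    # Compare the view's attributes with those listed for the purpose.
    permitted = [name for name in view_attributes if name in allowed]
    return {name: record[name] for name in permitted}


record = {"name": "Ada", "email": "ada@example.com", "salary": 100}
purposes = {"newsletter": {"name", "email"}}
visible = read_for_purpose(
    record, ["name", "email", "salary"], purposes, "newsletter"
)
```
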
Measuring data quality of data in a graph database

Patent number: 11651055

Abstract: Methods, computer program products and/or systems are provided that perform the following operations: obtaining a first graph comprising first nodes representing first entities and first edges representing relationships between first entities, the first nodes being associated with first entity attributes descriptive of the first entities represented by the first nodes, the first edges being associated with first edge attributes descriptive of the relationships represented by the first edges; determining a first subgraph for a certain node of the first nodes of the first graph, the first subgraph including the certain node and at least one neighboring node of the certain node; and determining a data quality issue regarding the certain node based, at least in part, on applying one or more applicable rules of a set of data quality rules to first entity attribute values and first edge attribute values of the first subgraph.

Type: Grant

Filed: October 29, 2020

Date of Patent: May 16, 2023

Assignee: International Business Machines Corporation

Inventors: Yannick Saillet, Claudio Andrea Fanconi, Martin Oberhofer, Hemanth Kumar Babu, Basem Elasioty, Mike W. Grasselt, Robert Kern, Thuany Karoline Stuart
Standardization in the context of data integration

Patent number: 11550813

Abstract: Techniques are described relating to automatic data standardization in a managed services domain of a cloud computing environment. An associated computer-implemented method includes receiving a dataset during a data onboarding procedure and classifying datapoints within the dataset. The method further includes applying a machine learning data standardization model to each classified datapoint within the dataset and deriving a proposed set of data standardization rules for the dataset based upon any standardization modification determined consequent to model application. Optionally, the method includes presenting the proposed set of data standardization rules for client review and, responsive to acceptance of the proposed set of data standardization rules, applying the proposed set of data standardization rules to the dataset. The method further includes, responsive to acceptance of the proposed set of data standardization rules, updating the machine learning data standardization model accordingly.

Type: Grant

Filed: February 24, 2021

Date of Patent: January 10, 2023

Assignee: International Business Machines Corporation

Inventors: Namit Kabra, Krishna Kishore Bonagiri, Mike W. Grasselt, Yannick Saillet
Rule generation in a data governance framework

Patent number: 11537552

Abstract: A computer system, computer program product, and a computer-implemented method for supplementing a data governance framework with one or more new data governance technical rules is disclosed. The method comprises providing a plurality of expressions and a first mapping. The expressions assign natural language patterns to technical language patterns. The first mapping maps first terms to data sources. A rule generator receives a new natural language (NL) rule comprising one or more natural-language patterns and one or more first terms. The rule generator resolves the new NL rule into one or more new technical rules interpretable by a respective rule engine and stores the one or more technical rules in a rule repository.

Type: Grant

Filed: May 11, 2020

Date of Patent: December 27, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Mike W. Grasselt, Yannick Saillet, Marvin Schaefer
CONDENSING HIERARCHIES IN A GOVERNANCE SYSTEM BASED ON USAGE

Publication number: 20220391848

Abstract: Embodiments of the present invention provide methods, computer program products, and systems. Embodiments of the present invention can condense a hierarchy in a data governance system, wherein the hierarchy comprises a root node and at least one child node comprising related sub-trees by determining, for a parent node in the hierarchy of the governance system, governance terms and respective assignment relationships from a plurality of information assets, determining usage of the governance term in at least one of a plurality of governance rules, and marking a governance term of the plurality of governance terms for elimination based on the determined assignment relationships and the determined usage of the governance term in the plurality of governance rules. Embodiments of the present invention can then delete the governance term from the hierarchy if the governance term is marked for elimination.

Type: Application

Filed: June 7, 2021

Publication date: December 8, 2022

Inventors: Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer, Michael Baessler
Method for duplicate determination in a graph

Patent number: 11500876

Abstract: Embodiments of the present invention determine duplicates in a graph. The graph comprises nodes representing entities and edges representing relationships between the entities. The method comprises: identifying at least two nodes in the graph. A neighborhood subgraph may be determined for each of the two nodes. The neighborhood subgraph includes the respective node. The method further comprises determining whether the two nodes are duplicates with respect to each other, based on a result of a comparison between the two subgraphs.

Type: Grant

Filed: December 8, 2020

Date of Patent: November 15, 2022

Assignee: International Business Machines Corporation

Inventors: Thuany Karoline Stuart, Basem Elasioty, Claudio Andrea Fanconi, Mike W. Grasselt, Hemanth Kumar Babu, Yannick Saillet, Robert Kern, Martin Oberhofer, Lars Bremer, Jonathan Roesner, Jason Allen Woods
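
One simple way to compare two neighborhood subgraphs, as the abstract above describes, is to measure how much their neighbor sets overlap. The sketch below uses one-hop neighborhoods and Jaccard similarity with a 0.6 cutoff; these are assumptions for illustration, and the patent may use a different comparison.

```python
def neighborhood(graph, node):
    """One-hop neighborhood of `node`; `graph` maps node -> set of neighbors."""
    return set(graph.get(node, ()))


def are_duplicates(graph, a, b, threshold=0.6):
    """Treat two nodes as duplicates when their neighborhoods overlap enough.

    Overlap is the Jaccard similarity of the neighbor sets, ignoring the
    two nodes themselves.
    """
    left = neighborhood(graph, a) - {a, b}
    right = neighborhood(graph, b) - {a, b}
    union = left | right
    if not union:
        return False  # no evidence either way for isolated nodes
    return len(left & right) / len(union) >= threshold


graph = {
    "person_1": {"acme", "berlin"},
    "person_2": {"acme", "berlin", "golf_club"},
    "person_3": {"initech"},
}
same = are_duplicates(graph, "person_1", "person_2")
different = are_duplicates(graph, "person_1", "person_3")
```

In this toy graph, `person_1` and `person_2` share two of three distinct neighbors, which clears the cutoff, while `person_3` shares none.
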
Sorting data elements of a given set of data elements

Patent number: 11487770

Abstract: A computer implemented method is used for sorting data elements of a given set. The method includes performing an evaluation of a first type of usage of each data element. The method includes determining a set of data element candidates dependent on the evaluation of the first type of usage. The method includes performing an evaluation of a second type of usage of each data element of the set of data element candidates. The method includes sorting the data elements of the set of data element candidates dependent on the evaluation of the second type of usage of each data element of the set of data element candidates. The method includes providing the sorted data elements of the set of data element candidates, and in response, receiving a request for a data processing based on the provided sorted data elements of the set of data element candidates.

Type: Grant

Filed: May 18, 2020

Date of Patent: November 1, 2022

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer, Michael Baessler
AUTOMATIC APPLICATION DEPENDENCY MANAGEMENT

Publication number: 20220318028

Abstract: A database of deployed configurations, as well as attempted configurations that failed, is maintained and used as a reference to compare against configurations of attempted software deployments. Upon detecting a failed deployment, disclosed embodiments search the database for working configurations that most closely resemble the failed configuration, and rank the configurations based on various criteria. Disclosed embodiments may then automatically select a highest ranked working configuration, and perform an automatic upgrade of the necessary components to create a working configuration.

Type: Application

Filed: April 6, 2021

Publication date: October 6, 2022

Inventors: Krishna Kishore Bonagiri, Namit Kabra, Yannick Saillet, Mike W. Grasselt
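
The search-and-rank idea in the abstract above can be sketched in a few lines. The ranking criterion here (number of components whose version matches the failed configuration) and the name `closest_working` are assumptions for illustration only; the application mentions "various criteria" without fixing one.

```python
def closest_working(failed, working_configs):
    """Pick the known-good configuration most similar to the failed one.

    Configurations are {component: version} dictionaries. Returns the best
    match and the component versions that would have to change.
    """
    def score(config):
        # Count components whose version matches the failed configuration.
        return sum(1 for comp, ver in failed.items() if config.get(comp) == ver)

    ranked = sorted(working_configs, key=score, reverse=True)
    best = ranked[0]
    changes = {comp: ver for comp, ver in best.items() if failed.get(comp) != ver}
    return best, changes


failed = {"app": "2.0", "db": "11", "runtime": "3.8"}
known_good = [
    {"app": "1.0", "db": "10", "runtime": "3.7"},
    {"app": "2.0", "db": "12", "runtime": "3.8"},
]
best, changes = closest_working(failed, known_good)
```

The `changes` dictionary lists only the components that differ, which is what an automatic upgrade step would need to act on.
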
STANDARDIZATION IN THE CONTEXT OF DATA INTEGRATION

Publication number: 20220277017

Abstract: Techniques are described relating to automatic data standardization in a managed services domain of a cloud computing environment. An associated computer-implemented method includes receiving a dataset during a data onboarding procedure and classifying datapoints within the dataset. The method further includes applying a machine learning data standardization model to each classified datapoint within the dataset and deriving a proposed set of data standardization rules for the dataset based upon any standardization modification determined consequent to model application. Optionally, the method includes presenting the proposed set of data standardization rules for client review and, responsive to acceptance of the proposed set of data standardization rules, applying the proposed set of data standardization rules to the dataset. The method further includes, responsive to acceptance of the proposed set of data standardization rules, updating the machine learning data standardization model accordingly.

Type: Application

Filed: February 24, 2021

Publication date: September 1, 2022

Inventors: Namit Kabra, Krishna Kishore Bonagiri, Mike W. Grasselt, Yannick Saillet
MEASURING DATA QUALITY OF DATA IN A GRAPH DATABASE

Publication number: 20220138512

Abstract: Methods, computer program products and/or systems are provided that perform the following operations: obtaining a first graph comprising first nodes representing first entities and first edges representing relationships between first entities, the first nodes being associated with first entity attributes descriptive of the first entities represented by the first nodes, the first edges being associated with first edge attributes descriptive of the relationships represented by the first edges; determining a first subgraph for a certain node of the first nodes of the first graph, the first subgraph including the certain node and at least one neighboring node of the certain node; and determining a data quality issue regarding the certain node based, at least in part, on applying one or more applicable rules of a set of data quality rules to first entity attribute values and first edge attribute values of the first subgraph.

Type: Application

Filed: October 29, 2020

Publication date: May 5, 2022

Inventors: Yannick Saillet, Claudio Andrea Fanconi, Martin Oberhofer, Hemanth Kumar Babu, Basem Elasioty, Mike W. Grasselt, Robert Kern, Thuany Karoline Stuart
MASKING SENSITIVE INFORMATION IN A DOCUMENT

Publication number: 20220123935

Abstract: The exemplary embodiments disclose a method, a computer program product, and a computer system for protecting sensitive information. The exemplary embodiments may include using an inverted text index for evaluating one or more statistical measures of an index token of the inverted text index, using the one or more statistical measures for selecting a set of candidate tokens, extracting metadata from the inverted text index, associating the set of candidate tokens with respective token metadata, tokenizing at least one document resulting in one or more document tokens, comparing the one or more document tokens with the set of candidate tokens, selecting a set of document tokens to be masked, selecting at least part of the set of document tokens that comprises sensitive information according to the associated token metadata, masking the at least part of the set of document tokens, and providing one or more masked documents.

Type: Application

Filed: October 19, 2020

Publication date: April 21, 2022

Inventors: Michael Baessler, Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer
PROTECTING SENSITIVE DATA IN DOCUMENTS

Publication number: 20220100899

Abstract: In an approach, a processor receives a request for a document. A processor identifies a set of datasets comprising a sensitive dataset, the set of datasets being interrelated in accordance with a relational model. A processor extracts attribute values of the document. A processor determines that a set of one or more attribute values of the extracted attribute values is in the set of datasets, the set of attribute values being values of a set of attributes. A processor determines that one or more entities of the sensitive dataset can be identified based on relations of the relational model between the set of attributes, where at least part of attribute values of the one or more entities comprises sensitive information. A processor, responsive to determining that the one or more entities can be identified, masks at least part of the set of one or more attribute values in the document.

Type: Application

Filed: September 25, 2020

Publication date: March 31, 2022

Inventors: Yannick Saillet, Albert Maier, Mike W. Grasselt, Michael Baessler, Lars Bremer
DATA QUALITY ASSESSMENT FOR DATA ANALYTICS

Publication number: 20210357699

Abstract: The invention relates to an approach for data quality assessment for data analytics, the approach comprising providing a data set, the data set comprising multiple data fields, predicting by a first trained machine learning model at least one usage type of the data set using characteristics of the data fields as input, for each usage type of the at least one usage type, determining a usage specific data quality score of each of the predicted usage types, and using the data set based on the at least one usage type and associated data quality score.

Type: Application

Filed: May 14, 2020

Publication date: November 18, 2021

Inventors: Yannick Saillet, Mike W. Grasselt, Namit Kabra, Krishna Kishore Bonagiri
SORTING DATA ELEMENTS OF A GIVEN SET OF DATA ELEMENTS

Publication number: 20210357183

Abstract: A computer implemented method is used for sorting data elements of a given set. The method includes performing an evaluation of a first type of usage of each data element. The method includes determining a set of data element candidates dependent on the evaluation of the first type of usage. The method includes performing an evaluation of a second type of usage of each data element of the set of data element candidates. The method includes sorting the data elements of the set of data element candidates dependent on the evaluation of the second type of usage of each data element of the set of data element candidates. The method includes providing the sorted data elements of the set of data element candidates, and in response, receiving a request for a data processing based on the provided sorted data elements of the set of data element candidates.

Type: Application

Filed: May 18, 2020

Publication date: November 18, 2021

Inventors: Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer, Michael Baessler
METHOD FOR WEIGHTING A GRAPH

Publication number: 20210342397

Abstract: The present disclosure relates to a method for weighting a graph comprising nodes representing entities and edges representing relationships between entities in accordance with one or more domains. The method comprises: pre-processing the graph comprising assigning weights to the nodes and/or the edges of the graph in accordance with a specific domain of the domains, wherein the weight indicates a domain specific data quality problem of attribute values representing an edge of the edges and/or an entity involved in that edge. The weighted graph may be provided for enabling a processing of the graph in accordance with the specific domain.

Type: Application

Filed: April 20, 2021

Publication date: November 4, 2021

Inventors: Martin Oberhofer, Mike W. Grasselt, Claudio Andrea Fanconi, Thuany Karoline Stuart, Yannick Saillet, Basem Elasioty, Hemanth Kumar Babu, Robert Kern
METHOD FOR DUPLICATE DETERMINATION IN A GRAPH

Publication number: 20210342352

Abstract: Embodiments of the present invention determine duplicates in a graph. The graph comprises nodes representing entities and edges representing relationships between the entities. The method comprises: identifying at least two nodes in the graph. A neighborhood subgraph may be determined for each of the two nodes. The neighborhood subgraph includes the respective node. The method further comprises determining whether the two nodes are duplicates with respect to each other, based on a result of a comparison between the two subgraphs.

Type: Application

Filed: December 8, 2020

Publication date: November 4, 2021

Inventors: Thuany Karoline Stuart, Basem Elasioty, Claudio Andrea Fanconi, Mike W. Grasselt, Hemanth Kumar Babu, Yannick Saillet, Robert Kern, Martin Oberhofer, Lars Bremer, Jonathan Roesner, Jason Allen Woods
Data classification

Patent number: 11023497

Abstract: Data classification includes tracking classification of columns of data into data classes of a collection of classes available for classifying the columns, obtaining a target column of data, of a target dataset, to be classified into a data class of the collection of candidate classes, and classifying the target column of data into a data class of the collection of classes based on historical data classification characteristics provided by the tracking. The classifying includes selecting a group of candidate data classes of the collection of classes to compare to value(s) of the target column, the selecting excludes at least some candidate data classes of the collection from comparison to the value(s), and establishing a priority between the candidate data classes of the group of candidate classes in comparing the value(s) of the target column of data to the selected group of candidate classes.

Type: Grant

Filed: September 12, 2019

Date of Patent: June 1, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Namit Kabra, Krishna Kishore Bonagiri, Yannick Saillet, Mike W. Grasselt
