Patents by Inventor Mike W. Grasselt

Mike W. Grasselt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11941056
    Abstract: The present disclosure relates to a method for a weighting graph comprising nodes representing entities and edges representing relationships between entities in accordance with one or more domains. The method comprises: pre-processing the graph comprising assigning weights to the nodes and/or the edges of the graph in accordance with a specific domain of the domains, wherein the weight indicates a domain specific data quality problem of attribute values representing an edge of the edges and/or an entity involved in that edge. The weighted graph may be provided for enabling a processing of the graph in accordance with the specific domain.
    Type: Grant
    Filed: April 20, 2021
    Date of Patent: March 26, 2024
    Assignee: International Business Machines Corporation
    Inventors: Martin Oberhofer, Mike W. Grasselt, Claudio Andrea Fanconi, Thuany Karoline Stuart, Yannick Saillet, Basem Elasioty, Hemanth Kumar Babu, Robert Kern
  • Patent number: 11748382
    Abstract: A method provides for classifying data fields of a dataset. A classifier configured for determining confidence values for a plurality of data classes for the data fields may be applied. Using the confidence values, data class candidates may be identified. Data fields may be determined for which a plurality of data class candidates is identifiable. Using previous user-selected data class assignments, a probability may be determined for the data class candidates that the respective data class candidate is a data class to which the respective data field is to be assigned. The data fields may be classified using the probabilities to select for the data fields a data class from the data class candidates. The dataset may be provided with metadata identifying for the data fields the data classes to which the respective data fields are assigned.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: September 5, 2023
    Assignee: International Business Machines Corporation
    Inventors: Yannick Saillet, Namit Kabra, Mike W. Grasselt, Krishna Kishore Bonagiri
  • Publication number: 20230185786
    Abstract: A computer-implemented method for detecting reference data standardization gaps in data sets is disclosed. The method comprises identifying at least one reference data candidate in a data set, using an index for values of the identified at least one reference data candidate, and determining a difference between an earlier version of a reference data set relating to the reference data candidate and a current version of the reference data set. Furthermore, the method comprises comparing the determined difference with values of the index, and identifying entries in the at least one reference data candidate having a value identical to a value of the difference as reference data standardization gap.
    Type: Application
    Filed: December 13, 2021
    Publication date: June 15, 2023
    Inventors: Albert Maier, Dennis Butterstein, Alexandre Luz Xavier Da Costa, Mike W. Grasselt, Timo Kussmaul, Yevgen Karpenko
  • Publication number: 20230177193
    Abstract: A database system can comprise records, each record including a set of attributes. The database system can further comprise database views, each database view representing a subset of the set of attributes. Data purpose objects indicating a subset of attributes of the set of attributes and a processing purpose can be stored. Each processing purpose can be associated with one or more entities that authorized access to the subset of attributes of the processing purpose. A request for data for a specific processing purpose and a selected view of the database views can be received. A data purpose object that indicates the specific processing purpose can be retrieved. The subset of attributes represented by the selected view can be compared with the subset of the attributes indicated in the retrieved data purpose object. Values of the subset of attributes of the selected view can be provided.
    Type: Application
    Filed: December 8, 2021
    Publication date: June 8, 2023
    Inventors: Lars Bremer, Albert Maier, Mike W. Grasselt, Yannick Saillet, Michael Baessler
  • Patent number: 11651055
    Abstract: Methods, computer program products and/or systems are provided that perform the following operations: obtaining a first graph comprising first nodes representing first entities and first edges representing relationships between first entities, the first nodes being associated with first entity attributes descriptive of the first entities represented by the first nodes, the first edges being associated with first edge attributes descriptive of the relationships represented by the first edges; determining a first subgraph for a certain node of the first nodes of the first graph, the first subgraph including the certain node and at least one neighboring node of the certain node; and determining a data quality issue regarding the certain node based, at least in part, on applying one or more applicable rules of a set of data quality rules to first entity attribute values and first edge attribute values of the first subgraph.
    Type: Grant
    Filed: October 29, 2020
    Date of Patent: May 16, 2023
    Assignee: International Business Machines Corporation
    Inventors: Yannick Saillet, Claudio Andrea Fanconi, Martin Oberhofer, Hemanth Kumar Babu, Basem Elasioty, Mike W. Grasselt, Robert Kern, Thuany Karoline Stuart
  • Patent number: 11550813
    Abstract: Techniques are described relating to automatic data standardization in a managed services domain of a cloud computing environment. An associated computer-implemented method includes receiving a dataset during a data onboarding procedure and classifying datapoints within the dataset. The method further includes applying a machine learning data standardization model to each classified datapoint within the dataset and deriving a proposed set of data standardization rules for the dataset based upon any standardization modification determined consequent to model application. Optionally, the method includes presenting the proposed set of data standardization rules for client review and, responsive to acceptance of the proposed set of data standardization rules, applying the proposed set of data standardization rules to the dataset. The method further includes, responsive to acceptance of the proposed set of data standardization rules, updating the machine learning data standardization model accordingly.
    Type: Grant
    Filed: February 24, 2021
    Date of Patent: January 10, 2023
    Assignee: International Business Machines Corporation
    Inventors: Namit Kabra, Krishna Kishore Bonagiri, Mike W. Grasselt, Yannick Saillet
  • Patent number: 11537552
    Abstract: A computer system, computer program product, and a computer-implemented method for supplementing a data governance framework with one or more new data governance technical rules is disclosed. The method comprises providing a plurality of expressions and a first mapping. The expressions assign natural language patterns to technical language patterns. The first mapping maps first terms to data sources. A rule generator receives a new natural language (NL) rule comprising one or more natural-language patterns and one or more first terms. The rule generator resolves the new NL rule into one or more new technical rules interpretable by a respective rule engine and stores the one or more technical rules in a rule repository.
    Type: Grant
    Filed: May 11, 2020
    Date of Patent: December 27, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mike W. Grasselt, Yannick Saillet, Marvin Schaefer
  • Publication number: 20220391848
    Abstract: Embodiments of the present invention provide methods, computer program products, and systems. Embodiments of the present invention can condense a hierarchy in a data governance system, wherein the hierarchy comprises a root node and at least one child node comprising related sub-trees by determining, for a parent node in the hierarchy of governance system, governance terms and respective assignment relationships from a plurality of information assets, determining usage of the governance term in at least one of a plurality of governance rules, and marking a governance term of the plurality of governance terms for elimination based on the determined assignment relationships and the determined usage of the governance term in the plurality of governance rules. Embodiments of the present invention can then delete the governance term from the hierarchy if the governance term is marked for elimination.
    Type: Application
    Filed: June 7, 2021
    Publication date: December 8, 2022
    Inventors: Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer, Michael Baessler
  • Patent number: 11500876
    Abstract: Embodiments of the present invention determines duplicates in a graph. The graph comprises nodes representing entities and edges representing relationships between the entities. The method comprises: identifying at least two nodes in the graph. A neighborhood subgraph may be determined for each of the two nodes. The neighborhood subgraph includes the respective node. The method further comprises determining whether the two nodes are duplicates with respect to each other, based on a result of a comparison between the two subgraphs.
    Type: Grant
    Filed: December 8, 2020
    Date of Patent: November 15, 2022
    Assignee: International Business Machines Corporation
    Inventors: Thuany Karoline Stuart, Basem Elasioty, Claudio Andrea Fanconi, Mike W. Grasselt, Hemanth Kumar Babu, Yannick Saillet, Robert Kern, Martin Oberhofer, Lars Bremer, Jonathan Roesner, Jason Allen Woods
  • Patent number: 11487770
    Abstract: A computer implemented method is used for sorting data elements of a given set. The method includes performing an evaluation of a first type of usage of each data element. The method includes determining a set of data element candidates dependent on the evaluation of the first type of usage. The method includes performing an evaluation of a second type of usage of each data element of the set of data element candidates. The method includes sorting the data elements of the set of data element candidates dependent on the evaluation of the second type of usage of each data element of the set of data element candidates. The method includes providing the sorted data elements of the set of data element candidates, and in response, receiving a request for a data processing based on the provided sorted data elements of the set of data element candidates.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: November 1, 2022
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer, Michael Baessler
  • Publication number: 20220318028
    Abstract: A database of deployed configurations, as well as attempted configurations that failed is maintained and used as reference to compare against configurations of attempted software deployments. Upon detecting a failed deployment, disclosed embodiments search the database for working configurations that most closely resemble the failed configuration, and rank the configurations based on various criteria. Disclosed embodiments may then automatically select a highest ranked working configuration, and perform an automatic upgrade of the necessary components to create a working configuration.
    Type: Application
    Filed: April 6, 2021
    Publication date: October 6, 2022
    Inventors: Krishna Kishore Bonagiri, Namit Kabra, Yannick Saillet, Mike W. Grasselt
  • Publication number: 20220277017
    Abstract: Techniques are described relating to automatic data standardization in a managed services domain of a cloud computing environment. An associated computer-implemented method includes receiving a dataset during a data onboarding procedure and classifying datapoints within the dataset. The method further includes applying a machine learning data standardization model to each classified datapoint within the dataset and deriving a proposed set of data standardization rules for the dataset based upon any standardization modification determined consequent to model application. Optionally, the method includes presenting the proposed set of data standardization rules for client review and, responsive to acceptance of the proposed set of data standardization rules, applying the proposed set of data standardization rules to the dataset. The method further includes, responsive to acceptance of the proposed set of data standardization rules, updating the machine learning data standardization model accordingly.
    Type: Application
    Filed: February 24, 2021
    Publication date: September 1, 2022
    Inventors: Namit Kabra, Krishna Kishore Bonagiri, Mike W. Grasselt, Yannick Saillet
  • Publication number: 20220138512
    Abstract: Methods, computer program products and/or systems are provided that perform the following operations: obtaining a first graph comprising first nodes representing first entities and first edges representing relationships between first entities, the first nodes being associated with first entity attributes descriptive of the first entities represented by the first nodes, the first edges being associated with first edge attributes descriptive of the relationships represented by the first edges; determining a first subgraph for a certain node of the first nodes of the first graph, the first subgraph including the certain node and at least one neighboring node of the certain node; and determining a data quality issue regarding the certain node based, at least in part, on applying one or more applicable rules of a set of data quality rules to first entity attribute values and first edge attribute values of the first subgraph.
    Type: Application
    Filed: October 29, 2020
    Publication date: May 5, 2022
    Inventors: Yannick Saillet, Claudio Andrea Fanconi, Martin Oberhofer, Hemanth Kumar Babu, Basem Elasioty, Mike W. Grasselt, Robert Kern, Thuany Karoline Stuart
  • Publication number: 20220123935
    Abstract: The exemplary embodiments disclose a method, a computer program product, and a computer system for protecting sensitive information. The exemplary embodiments may include using an inverted text index for evaluating one or more statistical measures of an index token of the inverted text index, using the one or more statistical measures for selecting a set of candidate tokens, extracting metadata from the inverted text index, associating the set of candidate tokens with respective token metadata, tokenizing at least one document resulting in one or more document tokens, comparing the one or more document tokens with the set of candidate tokens, selecting a set of document tokens to be masked, selecting at least part of the set of document tokens that comprises sensitive information according to the associated token metadata, masking the at least part of the set of document tokens, and providing one or more masked documents.
    Type: Application
    Filed: October 19, 2020
    Publication date: April 21, 2022
    Inventors: Michael Baessler, Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer
  • Publication number: 20220100899
    Abstract: In an approach, a processor receives a request of a document. A processor identifies a set of datasets comprising a sensitive dataset, the set of datasets being interrelated in accordance with a relational model. A processor extracts attribute values of the document. A processor determines that a set of one or more attribute values of the extracted attribute values is in the set of datasets, the set of attribute values being values of a set of attributes. A processor determines that one or more entities of the sensitive dataset can be identified based on relations of the relational model between the set of attributes, where at least part of attribute values of the one or more entities comprises sensitive information. A processor, responsive to determining that the one or more entities can be identified, masks at least part of the set of one or more attribute values in the document.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Inventors: Yannick Saillet, Albert Maier, Mike W. Grasselt, Michael Baessler, Lars Bremer
  • Publication number: 20210357699
    Abstract: The invention relates to an approach for data quality assessment for data analytics, the approach comprising providing a data set, the data set comprising multiple data fields, predicting by a first trained machine learning model at least one usage type of the data set using characteristics of the data fields as input, for each usage type of the at least one usage type, determining a usage specific data quality score of each of the predicted usage types, and using of the data set based on the at least one usage type and associated data quality score.
    Type: Application
    Filed: May 14, 2020
    Publication date: November 18, 2021
    Inventors: Yannick Saillet, Mike W. Grasselt, Namit Kabra, Krishna Kishore Bonagiri
  • Publication number: 20210357183
    Abstract: A computer implemented method is used for sorting data elements of a given set. The method includes performing an evaluation of a first type of usage of each data element. The method includes determining a set of data element candidates dependent on the evaluation of the first type of usage. The method includes performing an evaluation of a second type of usage of each data element of the set of data element candidates. The method includes sorting the data elements of the set of data element candidates dependent on the evaluation of the second type of usage of each data element of the set of data element candidates. The method includes providing the sorted data elements of the set of data element candidates, and in response, receiving a request for a data processing based on the provided sorted data elements of the set of data element candidates.
    Type: Application
    Filed: May 18, 2020
    Publication date: November 18, 2021
    Inventors: Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer, Michael Baessler
  • Publication number: 20210342397
    Abstract: The present disclosure relates to a method for a weighting graph comprising nodes representing entities and edges representing relationships between entities in accordance with one or more domains. The method comprises: pre-processing the graph comprising assigning weights to the nodes and/or the edges of the graph in accordance with a specific domain of the domains, wherein the weight indicates a domain specific data quality problem of attribute values representing an edge of the edges and/or an entity involved in that edge. The weighted graph may be provided for enabling a processing of the graph in accordance with the specific domain.
    Type: Application
    Filed: April 20, 2021
    Publication date: November 4, 2021
    Inventors: Martin Oberhofer, Mike W. Grasselt, Claudio Andrea Fanconi, Thuany Karoline Stuart, Yannick Saillet, Basem Elasioty, Hemanth Kumar Babu, Robert Kern
  • Publication number: 20210342352
    Abstract: Embodiments of the present invention determines duplicates in a graph. The graph comprises nodes representing entities and edges representing relationships between the entities. The method comprises: identifying at least two nodes in the graph. A neighborhood subgraph may be determined for each of the two nodes. The neighborhood subgraph includes the respective node. The method further comprises determining whether the two nodes are duplicates with respect to each other, based on a result of a comparison between the two subgraphs.
    Type: Application
    Filed: December 8, 2020
    Publication date: November 4, 2021
    Inventors: Thuany Karoline Stuart, Basem Elasioty, Claudio Andrea Fanconi, Mike W. Grasselt, Hemanth Kumar Babu, Yannick Saillet, Robert Kern, Martin Oberhofer, Lars Bremer, Jonathan Roesner, Jason Allen Woods
  • Patent number: 11023497
    Abstract: Data classification includes tracking classification of columns of data into data classes of a collection of classes available for classifying the columns, obtaining a target column of data, of a target dataset, to be classified into a data class of the collection of candidate classes, and classifying the target column of data into a data class of the collection of classes based on historical data classification characteristics provided by the tracking. The classifying includes selecting a group of candidate data classes of the collection of classes to compare to value(s) of the target column, the selecting excludes at least some candidate data classes of the collection from comparison to the value(s), and establishing a priority between the candidate data classes of the group of candidate classes in comparing the value(s) of the target column of data to the selected group of candidate classes.
    Type: Grant
    Filed: September 12, 2019
    Date of Patent: June 1, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Namit Kabra, Krishna Kishore Bonagiri, Yannick Saillet, Mike W. Grasselt