Patents by Inventor Michael Baessler

Michael Baessler has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11921676
    Abstract: Techniques are described relating to unstructured document processing. An associated computer-implemented method includes identifying a plurality of deduplicated data blocks associated with a collection of unstructured documents. The method further includes sorting the plurality of deduplicated data blocks in descending order based upon at least one block frequency metric, selecting a highest sorted unprocessed deduplicated data block, applying text analytics to the selected deduplicated data block, and applying at least one result of the text analytics to any document among the collection of unstructured documents including the selected deduplicated data block. The method is terminated responsive to satisfaction of at least one stopping condition.
    Type: Grant
    Filed: November 29, 2021
    Date of Patent: March 5, 2024
    Assignee: International Business Machines Corporation
    Inventors: Michael Baessler, Thomas Hampp-Bahnmueller, Yannick Saillet
  • Publication number: 20240004939
    Abstract: A method for providing one or more random sample documents from a corpus of documents using a search engine is provided. The providing of each of the random sample documents comprises selecting randomly a time window from a set of time windows. A search query is sent to the search engine defining a search for documents of the corpus with time-stamps within the time window defined by the randomly selected time window. In response to the sending of the search query, a search result is received from the search engine. The search result comprises a set of the documents of the corpus with time-stamps within the time window. One of the documents comprised by the received set of documents is then selected randomly.
    Type: Application
    Filed: September 19, 2023
    Publication date: January 4, 2024
    Inventors: Michael Baessler, Thomas Hampp-Bahnmueller, Jojo Joseph, Pavlo Petrenko
  • Patent number: 11860904
    Abstract: Aspects of the present invention disclose a method, computer program product, and system for governing a set of information assets using an information governance system. The method includes one or more processors applying one or more high-level classification assignment rules to one or more information assets of the set of information assets. Furthermore, the method includes one or more processors applying one or more high-level classification propagation rules to the one or more information assets provided with the high-level classification assignments for propagating the respective high-level classification assignments upwards within a containment hierarchy formed by the set of information assets to one or more superordinate information assets of the set of information assets.
    Type: Grant
    Filed: December 1, 2020
    Date of Patent: January 2, 2024
    Assignee: International Business Machines Corporation
    Inventors: Oliver Suhre, Albert Maier, Peter Gerstl, Thomas Schwarz, Michael Baessler
  • Patent number: 11797615
    Abstract: A method for providing one or more random sample documents from a corpus of documents using a search engine is provided. The providing of each of the random sample documents comprises selecting randomly a time window from a set of time windows. A search query is sent to the search engine defining a search for documents of the corpus with time-stamps within the time window defined by the randomly selected time window. In response to the sending of the search query, a search result is received from the search engine. The search result comprises a set of the documents of the corpus with time-stamps within the time window. One of the documents comprised by the received set of documents is then selected randomly.
    Type: Grant
    Filed: January 7, 2020
    Date of Patent: October 24, 2023
    Assignee: International Business Machines Corporation
    Inventors: Michael Baessler, Thomas Hampp-Bahnmueller, Jojo Joseph, Pavlo Petrenko
  • Patent number: 11783088
    Abstract: A method for processing electronic documents comprises an iteration including: (i) applying, by a computer device, a first statistical test process to a first subset of the documents, the first statistical test process estimating whether or not content of the documents of the first subset complies with a predefined criterion; (ii) in response to a result of the first statistical test process, estimating, by the computer device, that the documents of the first subset do not comply with the criterion, selecting, by the computer device, a part of the documents of the first subset, and moving, by the computer device, the part of the documents to a second subset of the documents; and (iii) applying, by the computer device, a second statistical test process to the second subset of the documents, the second statistical test process calculating at least one statistical metric related to the documents of the second subset.
    Type: Grant
    Filed: February 1, 2019
    Date of Patent: October 10, 2023
    Assignee: International Business Machines Corporation
    Inventors: Michael Bässler, Amir Jaibaji, Jojo Joseph, Thomas Hampp-Bahnmueller
  • Patent number: 11687574
    Abstract: A computer-implemented method comprises processing the unstructured objects of each record of records of a database for identifying a set of one or more values of attributes in the unstructured objects of each record. The sets of unstructured attribute values of two records of the database may be compared for determining a similarity level between the two sets. It may be determined whether the two records are representing a same entity based on the comparison result.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: June 27, 2023
    Assignee: International Business Machines Corporation
    Inventors: Lars Bremer, Martin Oberhofer, Karin Steckler, Mariya Chkalova, Michael Baessler, Holger Koenig
  • Publication number: 20230177193
    Abstract: A database system can comprise records, each record including a set of attributes. The database system can further comprise database views, each database view representing a subset of the set of attributes. Data purpose objects indicating a subset of attributes of the set of attributes and a processing purpose can be stored. Each processing purpose can be associated with one or more entities that authorized access to the subset of attributes of the processing purpose. A request for data for a specific processing purpose and a selected view of the database views can be received. A data purpose object that indicates the specific processing purpose can be retrieved. The subset of attributes represented by the selected view can be compared with the subset of the attributes indicated in the retrieved data purpose object. Values of the subset of attributes of the selected view can be provided.
    Type: Application
    Filed: December 8, 2021
    Publication date: June 8, 2023
    Inventors: Lars Bremer, Albert Maier, Mike W. Grasselt, Yannick Saillet, Michael Baessler
  • Publication number: 20230169041
    Abstract: Techniques are described relating to unstructured document processing. An associated computer-implemented method includes identifying a plurality of deduplicated data blocks associated with a collection of unstructured documents. The method further includes sorting the plurality of deduplicated data blocks in descending order based upon at least one block frequency metric, selecting a highest sorted unprocessed deduplicated data block, applying text analytics to the selected deduplicated data block, and applying at least one result of the text analytics to any document among the collection of unstructured documents including the selected deduplicated data block. The method is terminated responsive to satisfaction of at least one stopping condition.
    Type: Application
    Filed: November 29, 2021
    Publication date: June 1, 2023
    Inventors: Michael Baessler, Thomas Hampp-Bahnmueller, Yannick Saillet
  • Patent number: 11606351
    Abstract: In an approach for authentication of a username, a processor maintains a mapping of usernames and realms. A processor receives a username and a time-based one-time password code (TOTP code) for the username based on an authentication application. A processor, upon receiving the TOTP code: determines a realm from the mapping based on the received username and the received TOTP; and requests an entry of a credential relating to the username in the realm. A processor, upon receiving the requested credential, authenticates the username by determining that the received credential matches an expected credential for the realm.
    Type: Grant
    Filed: December 15, 2020
    Date of Patent: March 14, 2023
    Assignee: International Business Machines Corporation
    Inventors: Thomas Dürr, Michael Baessler, Holger Koenig, Oliver Koeth, Thomas Schwarz
  • Patent number: 11593417
    Abstract: In an approach, a processor groups documents into a plurality of groups based on similarity, where: documents of each group have a same document structure; and the document structure is defined by coordinates of text blocks. A processor, for each group of the plurality of groups and for each document of the respective group: retrieves a value of each text block of the respective document in accordance with a document structure of the group; and assigns to each text block of the respective document an attribute that represents the retrieved value of the text block. A processor assigns a first document of the documents to an entity of a database that matches the first document based on the group of text block values and the assigned attributes of the document.
    Type: Grant
    Filed: January 21, 2021
    Date of Patent: February 28, 2023
    Assignee: International Business Machines Corporation
    Inventors: Thomas Schwarz, Albert Maier, Michael Baessler, Oliver Suhre, Peter Gerstl, Werner Schuetz, Jonathan Roesner, Mariya Chkalova
  • Publication number: 20220391848
    Abstract: Embodiments of the present invention provide methods, computer program products, and systems. Embodiments of the present invention can condense a hierarchy in a data governance system, wherein the hierarchy comprises a root node and at least one child node comprising related sub-trees by determining, for a parent node in the hierarchy of the governance system, governance terms and respective assignment relationships from a plurality of information assets, determining usage of the governance term in at least one of a plurality of governance rules, and marking a governance term of the plurality of governance terms for elimination based on the determined assignment relationships and the determined usage of the governance term in the plurality of governance rules. Embodiments of the present invention can then delete the governance term from the hierarchy if the governance term is marked for elimination.
    Type: Application
    Filed: June 7, 2021
    Publication date: December 8, 2022
    Inventors: Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer, Michael Baessler
  • Patent number: 11487770
    Abstract: A computer implemented method is used for sorting data elements of a given set. The method includes performing an evaluation of a first type of usage of each data element. The method includes determining a set of data element candidates dependent on the evaluation of the first type of usage. The method includes performing an evaluation of a second type of usage of each data element of the set of data element candidates. The method includes sorting the data elements of the set of data element candidates dependent on the evaluation of the second type of usage of each data element of the set of data element candidates. The method includes providing the sorted data elements of the set of data element candidates, and in response, receiving a request for a data processing based on the provided sorted data elements of the set of data element candidates.
    Type: Grant
    Filed: May 18, 2020
    Date of Patent: November 1, 2022
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer, Michael Baessler
  • Publication number: 20220309084
    Abstract: A computer-implemented method comprises processing the unstructured objects of each record of records of a database for identifying a set of one or more values of attributes in the unstructured objects of each record. The sets of unstructured attribute values of two records of the database may be compared for determining a similarity level between the two sets. It may be determined whether the two records are representing a same entity based on the comparison result.
    Type: Application
    Filed: March 29, 2021
    Publication date: September 29, 2022
    Inventors: Lars Bremer, Martin Oberhofer, Karin Steckler, Mariya Chkalova, Michael Baessler, Holger Koenig
  • Publication number: 20220229863
    Abstract: In an approach, a processor groups documents into a plurality of groups based on similarity, where: documents of each group have a same document structure; and the document structure is defined by coordinates of text blocks. A processor, for each group of the plurality of groups and for each document of the respective group: retrieves a value of each text block of the respective document in accordance with a document structure of the group; and assigns to each text block of the respective document an attribute that represents the retrieved value of the text block. A processor assigns a first document of the documents to an entity of a database that matches the first document based on the group of text block values and the assigned attributes of the document.
    Type: Application
    Filed: January 21, 2021
    Publication date: July 21, 2022
    Inventors: Thomas Schwarz, Albert Maier, Michael Baessler, Oliver Suhre, Peter Gerstl, Werner Schuetz, Jonathan Roesner, Mariya Chkalova
  • Publication number: 20220188512
    Abstract: A system may receive a data glossary comprising a list of terms. The system may then measure a usage dimension for a set of the terms from the list of terms. The system may select a candidate term from the set based on the usage dimension and perform a maintenance action on the candidate term.
    Type: Application
    Filed: December 13, 2020
    Publication date: June 16, 2022
    Inventors: Albert Maier, Michael Baessler, Peter Gerstl, Oliver Suhre, Thomas Schwarz
  • Publication number: 20220191192
    Abstract: In an approach for authentication of a username, a processor maintains a mapping of usernames and realms. A processor receives a username and a time-based one-time password code (TOTP code) for the username based on an authentication application. A processor, upon receiving the TOTP code: determines a realm from the mapping based on the received username and the received TOTP; and requests an entry of a credential relating to the username in the realm. A processor, upon receiving the requested credential, authenticates the username by determining that the received credential matches an expected credential for the realm.
    Type: Application
    Filed: December 15, 2020
    Publication date: June 16, 2022
    Inventors: Thomas Dürr, Michael Baessler, Holger Koenig, Oliver Koeth, Thomas Schwarz
  • Publication number: 20220171793
    Abstract: Aspects of the present invention disclose a method, computer program product, and system for governing a set of information assets using an information governance system. The method includes one or more processors applying one or more high-level classification assignment rules to one or more information assets of the set of information assets. Furthermore, the method includes one or more processors applying one or more high-level classification propagation rules to the one or more information assets provided with the high-level classification assignments for propagating the respective high-level classification assignments upwards within a containment hierarchy formed by the set of information assets to one or more superordinate information assets of the set of information assets.
    Type: Application
    Filed: December 1, 2020
    Publication date: June 2, 2022
    Inventors: Oliver Suhre, Albert Maier, Peter Gerstl, Thomas Schwarz, Michael Baessler
  • Publication number: 20220123935
    Abstract: The exemplary embodiments disclose a method, a computer program product, and a computer system for protecting sensitive information. The exemplary embodiments may include using an inverted text index for evaluating one or more statistical measures of an index token of the inverted text index, using the one or more statistical measures for selecting a set of candidate tokens, extracting metadata from the inverted text index, associating the set of candidate tokens with respective token metadata, tokenizing at least one document resulting in one or more document tokens, comparing the one or more document tokens with the set of candidate tokens, selecting a set of document tokens to be masked, selecting at least part of the set of document tokens that comprises sensitive information according to the associated token metadata, masking the at least part of the set of document tokens, and providing one or more masked documents.
    Type: Application
    Filed: October 19, 2020
    Publication date: April 21, 2022
    Inventors: Michael Baessler, Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer
  • Publication number: 20220114189
    Abstract: Embodiments of the present invention provide methods, computer program products, and systems. Embodiments of the present invention can extract structured information for unstructured document analysis by identifying tables and columns of a database that correspond to business terms of a business glossary. Embodiments of the present invention can then receive a specification of business terms of interest for recognizing in an unstructured document. Embodiments of the present invention can then generate an analysis module based on the identified tables and columns that enables identification or recognition of attribute values of attributes of the tables and columns. Embodiments of the present invention can then use the analysis module for automatic extraction of values of at least part of the attributes from the unstructured document based on the specification of business terms of interest.
    Type: Application
    Filed: October 14, 2020
    Publication date: April 14, 2022
    Inventors: Michael Baessler, Albert Maier, Dirk Jahn, Thomas Hampp-Bahnmueller
  • Publication number: 20220108126
    Abstract: A computing device identifies a set of documents for classification. The computing device classifies documents of a first subset of the set of documents based, at least in part, on a text analysis of the documents of the first subset. The computing device trains a document classifier using, as training data: (i) results of the classifying of the documents of the first subset, and (ii) metadata associated with the documents of the first subset. The computing device classifies documents of a second subset of the set of documents by providing metadata of the documents of the second subset to the trained document classifier.
    Type: Application
    Filed: October 7, 2020
    Publication date: April 7, 2022
    Inventors: Dieter Hans Schieber, Holger Koenig, Hemanth Kumar Babu, Peter Gerstl, Werner Schuetz, Robert Kern, Lars Bremer, Michael Baessler
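
Illustrative sketch: the random-sampling abstract listed above (patent 11797615, publication 20240004939) describes picking a random time window, querying a search engine for documents time-stamped within that window, and then picking one returned document at random. The Python below is a minimal sketch of that flow under stated assumptions, not the patented implementation. The names `Document`, `ToySearchEngine`, `make_time_windows`, and `sample_one` are hypothetical, the search engine is replaced by an in-memory list, and the retry on empty windows is an added assumption that the abstract does not mention.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Document:
    doc_id: str
    timestamp: datetime


class ToySearchEngine:
    """Hypothetical in-memory stand-in for a real search engine."""

    def __init__(self, documents):
        self._documents = list(documents)

    def search_by_time_window(self, start, end):
        # Return every document with start <= timestamp < end.
        return [d for d in self._documents if start <= d.timestamp < end]


def make_time_windows(start, end, width):
    """Split [start, end) into consecutive windows of at most `width`."""
    windows = []
    cursor = start
    while cursor < end:
        windows.append((cursor, min(cursor + width, end)))
        cursor += width
    return windows


def sample_one(engine, windows, rng, max_attempts=100):
    """Pick a random window, query it, then pick a random hit.

    Retrying when a window is empty is an assumption of this sketch.
    Returns None if no document is found within max_attempts.
    """
    for _ in range(max_attempts):
        start, end = rng.choice(windows)
        hits = engine.search_by_time_window(start, end)
        if hits:
            return rng.choice(hits)
    return None


if __name__ == "__main__":
    base = datetime(2023, 1, 1)
    corpus = [Document(f"doc-{i}", base + timedelta(hours=i)) for i in range(48)]
    engine = ToySearchEngine(corpus)
    windows = make_time_windows(base, base + timedelta(hours=48), timedelta(hours=6))
    print(sample_one(engine, windows, random.Random()))
```

Choosing windows uniformly gives a uniform sample over documents only when the windows hold similar numbers of documents. The abstract does not say how window sizes are chosen, and this sketch makes no attempt to correct for uneven windows.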