Patents by Inventor Albert Maier

Albert Maier has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11106820
    Abstract: The present disclosure relates to a method for data anonymization of a database system. The method comprises: determining if a first dataset and second dataset of the database system have a relationship indicative of an entity having values in the two datasets. A request may be received from a user for at least one of the first and second datasets. In case the first dataset and second dataset have the relationship, at least one of the first and second datasets may be modified such that the indication of the entity is not accessible to the user. The requested dataset may then be provided.
    Type: Grant
    Filed: March 19, 2018
    Date of Patent: August 31, 2021
    Assignee: International Business Machines Corporation
    Inventors: Martin Oberhofer, Albert Maier, Yannick Saillet
  • Patent number: 11036701
    Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.
    Type: Grant
    Filed: January 6, 2020
    Date of Patent: June 15, 2021
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Yannick Saillet, Damir Spisic
  • Publication number: 20210144218
    Abstract: The present disclosure relates to a method for accessing data of one or more data sources using a discovery engine. The method comprises: determining a discovery space content from initial metadata of a data source indicated in a data exploration request. The discovery space content may be rendered. The rendered content may be used for determining a set of one or more tasks for generating further metadata from at least part of the data of the data source, wherein the set of tasks comprises a combination of API calls. The API calls may be issued to the discovery engine. Discovery results of the issued API calls may be received. A data discovery status may be evaluated using the discovery results. The discovery space content may be augmented using the further metadata and the data discovery status. The augmented discovery space content may be rendered for receiving further API calls.
    Type: Application
    Filed: September 16, 2020
    Publication date: May 13, 2021
    Inventors: Albert Maier, Bernhard Mitschang, Peter Gerstl, Kunjavihari Madhav Kashalikar
  • Patent number: 10740488
    Abstract: A computer-implemented method for data anonymization comprises: receiving a request for data that needs anonymization. The request comprises at least one field descriptor of data to be retrieved and a usage scenario of a user for the requested data. Then, based on the usage scenario, an anonymization algorithm to be applied to the data that is referred to by the field descriptor is determined. Subsequently, the determined anonymization algorithm is applied to the data that is referred to by the field descriptor. A test is performed as to whether the degree of anonymization fulfills a requirement that is related to the usage scenario. In case the requirement is fulfilled, access to the anonymized data is provided.
    Type: Grant
    Filed: November 17, 2017
    Date of Patent: August 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet
  • Patent number: 10719627
    Abstract: A computer-implemented method for data anonymization comprises: receiving a request for data that needs anonymization. The request comprises at least one field descriptor of data to be retrieved and a usage scenario of a user for the requested data. Then, based on the usage scenario, an anonymization algorithm to be applied to the data that is referred to by the field descriptor is determined. Subsequently, the determined anonymization algorithm is applied to the data that is referred to by the field descriptor. A test is performed as to whether the degree of anonymization fulfills a requirement that is related to the usage scenario. In case the requirement is fulfilled, access to the anonymized data is provided.
    Type: Grant
    Filed: April 23, 2019
    Date of Patent: July 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet
  • Publication number: 20200142870
    Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.
    Type: Application
    Filed: January 6, 2020
    Publication date: May 7, 2020
    Inventors: Albert Maier, Yannick Saillet, Damir Spisic
  • Patent number: 10635486
    Abstract: The invention provides for a method for processing a plurality of data sets (105; 106; 108; 110-113; DB1; DB2) in a data repository (104) for storing at least unstructured data, the method comprising: providing (302) a set of agents (150-168), each agent being operable to trigger the processing of one or more of the data sets, the execution of each of said agents being automatically triggered in case one or more conditions assigned to said agent are met, at least one of the conditions relating to the existence, structure, content and/or annotations of the data set whose processing can be triggered by said agent; executing (304) a first one of the agents; updating (306) the annotations (115) of the first data set by the first agent; and executing (308) a second one of the agents, said execution being triggered by the updated annotations of the first data set meeting the conditions of the second agent, thereby triggering a further updating of the annotations of the first data set.
    Type: Grant
    Filed: August 14, 2018
    Date of Patent: April 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Yannick Saillet, Harald C. Smith, Daniel C. Wolfson
  • Patent number: 10534763
    Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.
    Type: Grant
    Filed: May 10, 2019
    Date of Patent: January 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Yannick Saillet, Damir Spisic
  • Patent number: 10534762
    Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.
    Type: Grant
    Filed: May 10, 2019
    Date of Patent: January 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Yannick Saillet, Damir Spisic
  • Patent number: 10467206
    Abstract: A method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.
    Type: Grant
    Filed: March 8, 2017
    Date of Patent: November 5, 2019
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Yannick Saillet, Damir Spisic
  • Patent number: 10467204
    Abstract: A method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.
    Type: Grant
    Filed: February 18, 2016
    Date of Patent: November 5, 2019
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Yannick Saillet, Damir Spisic
  • Publication number: 20190286849
    Abstract: The present disclosure relates to a method for data anonymization of a database system. The method comprises: determining if a first dataset and second dataset of the database system have a relationship indicative of an entity having values in the two datasets. A request may be received from a user for at least one of the first and second datasets. In case the first dataset and second dataset have the relationship, at least one of the first and second datasets may be modified such that the indication of the entity is not accessible to the user. The requested dataset may then be provided.
    Type: Application
    Filed: March 19, 2018
    Publication date: September 19, 2019
    Inventors: Martin Oberhofer, Albert Maier, Yannick Saillet
  • Publication number: 20190266137
    Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.
    Type: Application
    Filed: May 10, 2019
    Publication date: August 29, 2019
    Inventors: Albert Maier, Yannick Saillet, Damir Spisic
  • Publication number: 20190266136
    Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.
    Type: Application
    Filed: May 10, 2019
    Publication date: August 29, 2019
    Inventors: Albert Maier, Yannick Saillet, Damir Spisic
  • Patent number: 10387236
    Abstract: Processing data errors in a data processing system includes a computer receiving one or more patterns and a data set. The one or more patterns describe characteristics of an erroneous data record and are associated with a root cause. The root cause includes a description of a technical deficiency causing the data error in the erroneous data record. Responsive to the computer determining that a first set of data records in the received data set have characteristics that match a first pattern of the one or more patterns, the computer assigns the first set of data records of the received data set having characteristics that match the first pattern to a first error group.
    Type: Grant
    Filed: September 2, 2015
    Date of Patent: August 20, 2019
    Assignee: International Business Machines Corporation
    Inventors: Peter Gerstl, Mike Grasselt, Albert Maier, Thomas Schwarz, Oliver Suhre
  • Publication number: 20190251107
    Abstract: The invention relates to a computer-implemented method for classifying a set of data values. For each of the data values of the set of data values, a set of one or more terms associated with the respective data value is determined using one or more first knowledge bases. A set of common terms is determined. The set of common terms comprises terms present in more than one of the sets of terms. For each of the common terms, a number of hits for a lookup query against one or more second knowledge data bases is determined. One or more common terms of the set of common terms with the smallest number of hits are determined and a result is returned. The result comprises the one or more common terms with the smallest number of hits as one or more candidate classes for classifying the set of data values.
    Type: Application
    Filed: April 23, 2019
    Publication date: August 15, 2019
    Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet
  • Publication number: 20190251290
    Abstract: A computer-implemented method for data anonymization comprises: receiving a request for data that needs anonymization. The request comprises at least one field descriptor of data to be retrieved and a usage scenario of a user for the requested data. Then, based on the usage scenario, an anonymization algorithm to be applied to the data that is referred to by the field descriptor is determined. Subsequently, the determined anonymization algorithm is applied to the data that is referred to by the field descriptor. A test is performed as to whether the degree of anonymization fulfills a requirement that is related to the usage scenario. In case the requirement is fulfilled, access to the anonymized data is provided.
    Type: Application
    Filed: April 23, 2019
    Publication date: August 15, 2019
    Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet
  • Patent number: 10338960
    Abstract: The invention provides for a method for processing a plurality of data sets (105; 106; 108; 110-113; DB1; DB2) in a data repository (104) for storing at least unstructured data, the method comprising: providing (302) a set of agents (150-168), each agent being operable to trigger the processing of one or more of the data sets, the execution of each of said agents being automatically triggered in case one or more conditions assigned to said agent are met, at least one of the conditions relating to the existence, structure, content and/or annotations of the data set whose processing can be triggered by said agent; executing (304) a first one of the agents; updating (306) the annotations (115) of the first data set by the first agent; and executing (308) a second one of the agents, said execution being triggered by the updated annotations of the first data set meeting the conditions of the second agent, thereby triggering a further updating of the annotations of the first data set.
    Type: Grant
    Filed: February 18, 2015
    Date of Patent: July 2, 2019
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Yannick Saillet, Harald C. Smith, Daniel C. Wolfson
  • Publication number: 20190179949
    Abstract: A method, system and computer program product for classifying a data collection of data of a predefined domain. A hierarchical representation scheme describing terms of the domain and one or more relationships between the terms is provided. At least one classifier may be applied on the data collection, resulting in a set of term assignments. Each term assignment of the term assignments associates a term candidate with a respective confidence value to the collection or to one or more data items of the collection. At least one of the term assignments may be refined based on the representation scheme and the set of term assignments.
    Type: Application
    Filed: December 11, 2017
    Publication date: June 13, 2019
    Inventors: Peter Gerstl, Robert Kern, Albert Maier, Thomas Schwarz, Oliver Suhre
  • Publication number: 20190156060
    Abstract: A computer-implemented method for data anonymization comprises: receiving a request for data that needs anonymization. The request comprises at least one field descriptor of data to be retrieved and a usage scenario of a user for the requested data. Then, based on the usage scenario, an anonymization algorithm to be applied to the data that is referred to by the field descriptor is determined. Subsequently, the determined anonymization algorithm is applied to the data that is referred to by the field descriptor. A test is performed as to whether the degree of anonymization fulfills a requirement that is related to the usage scenario. In case the requirement is fulfilled, access to the anonymized data is provided.
    Type: Application
    Filed: November 17, 2017
    Publication date: May 23, 2019
    Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet
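
Illustrative sketch (not taken from any of the patents listed above): the data sampling abstracts describe a single scan that splits records into a random sample and a score-ranked buffer, then merges the two. The Python below is one possible reading of that flow under stated assumptions. The function name `subsample`, its parameters, the caller-supplied `score` function, and the eviction rule (drop the lowest-scoring buffered record when a higher-scoring one arrives) are all hypothetical; the abstracts only say that part of the buffer may be freed up and do not specify how.

```python
import heapq
import random
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

Record = Dict[str, Any]


def subsample(
    records: Iterable[Record],
    sample_rate: float,
    buffer_size: int,
    score: Callable[[Record], float],
    seed: Optional[int] = None,
) -> List[Record]:
    """Scan records once; return the random sample merged with buffered records."""
    rng = random.Random(seed)
    random_part: List[Record] = []  # the "first set of records"
    # Min-heap of (score, sequence number, record); the sequence number
    # breaks ties so the record dicts themselves are never compared.
    buffer: List[Tuple[float, int, Record]] = []

    for seq, record in enumerate(records):
        if rng.random() < sample_rate:
            # The record belongs to the random sample.
            random_part.append(record)
            continue
        # Otherwise compute a storage score from the record's attribute values.
        s = score(record)
        if len(buffer) < buffer_size:
            # The buffer still has room.
            heapq.heappush(buffer, (s, seq, record))
        elif buffer and s > buffer[0][0]:
            # Assumed policy: free space by evicting the lowest-scoring entry.
            heapq.heapreplace(buffer, (s, seq, record))

    # Merge the random sample with the buffered records (in scan order).
    buffered = [r for _, _, r in sorted(buffer, key=lambda e: e[1])]
    return random_part + buffered
```

For example, calling `subsample(rows, sample_rate=0.01, buffer_size=100, score=my_score)` would return roughly 1% of `rows` chosen at random plus up to 100 additional records that `my_score` ranks highest; `rows` and `my_score` are placeholders for the caller's own data and scoring rule.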