Patents by Inventor Albert Maier

Albert Maier has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Data anonymization

Patent number: 11106820

Abstract: The present disclosure relates to a method for data anonymization of a database system. The method comprises determining whether a first dataset and a second dataset of the database system have a relationship indicative of an entity having values in the two datasets. A request may be received from a user for at least one of the first and second datasets. In case the first dataset and second dataset have the relationship, at least one of the first and second datasets may be modified such that the indication of the entity is not accessible to the user. The requested dataset may then be provided.
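
As an illustration only, the flow in this abstract might be sketched in Python as follows. The sketch assumes that datasets are lists of dictionaries, that a "relationship" means a shared value in a candidate key column, and that the modification is to drop that column. The function names and these choices are hypothetical and are not taken from the patent.

```python
def have_entity_relationship(first, second, key):
    """Return True if some value of `key` occurs in both datasets."""
    first_values = {row[key] for row in first if key in row}
    return any(row[key] in first_values for row in second if key in row)


def provide_dataset(requested, other, key):
    """Return a copy of `requested`, without `key` when it links to `other`.

    Dropping the linking column is an assumed, simplistic modification;
    a real system could mask or generalize the values instead.
    """
    if have_entity_relationship(requested, other, key):
        return [{k: v for k, v in row.items() if k != key} for row in requested]
    return [dict(row) for row in requested]
```

For example, a patient table and a billing table that share an `id` column would be returned without `id`, so a user holding both cannot join them on that column.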

Type: Grant

Filed: March 19, 2018

Date of Patent: August 31, 2021

Assignee: International Business Machines Corporation

Inventors: Martin Oberhofer, Albert Maier, Yannick Saillet

Data sampling in a storage system

Patent number: 11036701

Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.
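
A minimal Python sketch of the scan described in this abstract is given below, for illustration only. The abstract does not say how the buffer is freed up; evicting the lowest-scoring buffered record is an assumption made here. The names `sample_dataset`, `sample_rate`, `buffer_size` and `score` are likewise hypothetical.

```python
import heapq
import random


def sample_dataset(records, sample_rate, buffer_size, score, seed=None):
    """Scan `records` once and return a random sample plus buffered records."""
    rng = random.Random(seed)
    random_sample = []  # the "first set of records"
    buffer = []         # min-heap of (score, position, record)
    for position, record in enumerate(records):
        if rng.random() < sample_rate:
            random_sample.append(record)
            continue
        entry = (score(record), position, record)
        if len(buffer) < buffer_size:
            # The buffer still has room: store the record.
            heapq.heappush(buffer, entry)
        elif buffer and entry[0] > buffer[0][0]:
            # Assumed eviction policy: drop the lowest-scoring record.
            heapq.heapreplace(buffer, entry)
    # Merge the random sample with the buffered records in scan order.
    buffered = [rec for _, _, rec in sorted(buffer, key=lambda e: e[1])]
    return random_sample + buffered
```

The scan position is stored in each heap entry so that ties in score never force Python to compare the records themselves.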

Type: Grant

Filed: January 6, 2020

Date of Patent: June 15, 2021

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Yannick Saillet, Damir Spisic

Method and system for a discovery engine

Publication number: 20210144218

Abstract: The present disclosure relates to a method for accessing data of one or more data sources using a discovery engine. The method comprises: determining a discovery space content from initial metadata of a data source indicated in a data exploration request. The discovery space content may be rendered. The rendered content may be used for determining a set of one or more tasks for generating further metadata from at least part of the data of the data source, wherein the set of tasks comprises a combination of API calls. The API calls may be issued to the discovery engine. Discovery results of the issued API calls may be received. A data discovery status may be evaluated using the discovery results. The discovery space content may be augmented using the further metadata and the data discovery status. The augmented discovery space content may be rendered for receiving further API calls.

Type: Application

Filed: September 16, 2020

Publication date: May 13, 2021

Inventors: Albert Maier, Bernhard Mitschang, Peter Gerstl, Kunjavihari Madhav Kashalikar

Cognitive data anonymization

Patent number: 10740488

Abstract: A computer-implemented method for data anonymization comprises receiving a request for data that needs anonymization. The request comprises at least one field descriptor of data to be retrieved and a usage scenario of a user for the requested data. Then, based on the usage scenario, an anonymization algorithm to be applied to the data that is referred to by the field descriptor is determined. Subsequently, the determined anonymization algorithm is applied to the data that is referred to by the field descriptor. A test is performed as to whether the degree of anonymization fulfills a requirement that is related to the usage scenario. If the requirement is fulfilled, access to the anonymized data is provided.
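
The request flow in this abstract can be sketched roughly as below, for illustration only. The scenario table, the two toy algorithms, and the use of a minimum group size (in the spirit of k-anonymity on a single field) as the "degree of anonymization" test are all assumptions. None of these names or choices come from the patent.

```python
from collections import Counter


def generalize_age(value):
    """Replace an exact age with its decade bucket, e.g. 37 becomes '30-39'."""
    low = (value // 10) * 10
    return f"{low}-{low + 9}"


def redact(value):
    """Replace any value with a fixed placeholder."""
    return "***"


# Hypothetical mapping: usage scenario -> (algorithm, required group size).
SCENARIOS = {
    "analytics": (generalize_age, 2),
    "external_sharing": (redact, 3),
}


def handle_request(rows, field, scenario):
    """Anonymize `field` for `scenario`; return rows only if the test passes."""
    algorithm, required_k = SCENARIOS[scenario]
    anonymized = [{**row, field: algorithm(row[field])} for row in rows]
    group_sizes = Counter(row[field] for row in anonymized)
    smallest = min(group_sizes.values(), default=0)
    if smallest >= required_k:
        return anonymized  # requirement fulfilled: provide access
    return None            # requirement not fulfilled: deny access
```

Here a request is denied whenever some anonymized value is shared by fewer rows than the scenario requires.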

Type: Grant

Filed: November 17, 2017

Date of Patent: August 11, 2020

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet

Cognitive data anonymization

Patent number: 10719627

Abstract: A computer-implemented method for data anonymization comprises receiving a request for data that needs anonymization. The request comprises at least one field descriptor of data to be retrieved and a usage scenario of a user for the requested data. Then, based on the usage scenario, an anonymization algorithm to be applied to the data that is referred to by the field descriptor is determined. Subsequently, the determined anonymization algorithm is applied to the data that is referred to by the field descriptor. A test is performed as to whether the degree of anonymization fulfills a requirement that is related to the usage scenario. If the requirement is fulfilled, access to the anonymized data is provided.

Type: Grant

Filed: April 23, 2019

Date of Patent: July 21, 2020

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet

Data sampling in a storage system

Publication number: 20200142870

Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.

Type: Application

Filed: January 6, 2020

Publication date: May 7, 2020

Inventors: Albert Maier, Yannick Saillet, Damir Spisic

Processing data sets in a big data repository

Patent number: 10635486

Abstract: The invention provides for a method for processing a plurality of data sets (105; 106; 108; 110-113; DB1; DB2) in a data repository (104) for storing at least unstructured data, the method comprising: providing (302) a set of agents (150-168), each agent being operable to trigger the processing of one or more of the data sets, the execution of each of said agents being automatically triggered in case one or more conditions assigned to said agent are met, at least one of the conditions relating to the existence, structure, content and/or annotations of the data set whose processing can be triggered by said agent; executing (304) a first one of the agents; updating (306) the annotations (115) of the first data set by the first agent; and executing (308) a second one of the agents, said execution being triggered by the updated annotations of the first data set meeting the conditions of the second agent, thereby triggering a further updating of the annotations of the first data set.
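
The chaining of agents through annotations might be sketched as follows, for illustration only. The sketch assumes that annotations are a plain dictionary, that each agent is a (name, condition, action) tuple, and that each agent runs at most once per data set. The function name `run_agents` and the round limit are hypothetical.

```python
def run_agents(annotations, agents, max_rounds=100):
    """Run agents whose conditions hold until none is ready.

    `agents` is a list of (name, condition, action) tuples. `condition`
    inspects the annotations; `action` returns a dict of annotation updates.
    Returns the names of the executed agents in execution order.
    """
    executed = []
    for _round in range(max_rounds):
        ready = [
            agent for agent in agents
            if agent[0] not in executed and agent[1](annotations)
        ]
        if not ready:
            break
        name, _condition, action = ready[0]
        # The update may satisfy the condition of another agent.
        annotations.update(action(annotations))
        executed.append(name)
    return executed
```

With one agent that detects a file format and a second that profiles only data sets annotated as CSV, the first agent's annotation update is what triggers the second.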

Type: Grant

Filed: August 14, 2018

Date of Patent: April 28, 2020

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Yannick Saillet, Harald C. Smith, Daniel C. Wolfson

Data sampling in a storage system

Patent number: 10534763

Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.

Type: Grant

Filed: May 10, 2019

Date of Patent: January 14, 2020

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Yannick Saillet, Damir Spisic

Data sampling in a storage system

Patent number: 10534762

Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.

Type: Grant

Filed: May 10, 2019

Date of Patent: January 14, 2020

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Yannick Saillet, Damir Spisic

Data sampling in a storage system

Patent number: 10467206

Abstract: A method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.

Type: Grant

Filed: March 8, 2017

Date of Patent: November 5, 2019

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Yannick Saillet, Damir Spisic

Data sampling in a storage system

Patent number: 10467204

Abstract: A method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.

Type: Grant

Filed: February 18, 2016

Date of Patent: November 5, 2019

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Yannick Saillet, Damir Spisic

Data anonymization

Publication number: 20190286849

Abstract: The present disclosure relates to a method for data anonymization of a database system. The method comprises determining whether a first dataset and a second dataset of the database system have a relationship indicative of an entity having values in the two datasets. A request may be received from a user for at least one of the first and second datasets. In case the first dataset and second dataset have the relationship, at least one of the first and second datasets may be modified such that the indication of the entity is not accessible to the user. The requested dataset may then be provided.

Type: Application

Filed: March 19, 2018

Publication date: September 19, 2019

Inventors: Martin Oberhofer, Albert Maier, Yannick Saillet

Data sampling in a storage system

Publication number: 20190266137

Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.

Type: Application

Filed: May 10, 2019

Publication date: August 29, 2019

Inventors: Albert Maier, Yannick Saillet, Damir Spisic

Data sampling in a storage system

Publication number: 20190266136

Abstract: A computer-implemented method, computer program product and system for data sampling in a storage system. The storage system includes a dataset comprising records and a buffer. The dataset is scanned record-by-record to determine whether the current record belongs to a random sample. If so, then the current record may be added to a first set of records. Otherwise, at least one storage score may be calculated or determined for the current record using attribute values of the current record. Next, it may be determined whether the buffer includes available size for storing the current record. In case the buffer comprises the available size, the current record may be stored in the buffer. Otherwise, at least part of the buffer may be freed up. A subsample of the dataset may be provided as a result of merging the first set of records and at least part of the buffered records.

Type: Application

Filed: May 10, 2019

Publication date: August 29, 2019

Inventors: Albert Maier, Yannick Saillet, Damir Spisic

Processing data errors for a data processing system

Patent number: 10387236

Abstract: Processing data errors in a data processing system includes a computer receiving one or more patterns and a data set. The one or more patterns describe characteristics of an erroneous data record and are associated with a root cause. The root cause includes a description of a technical deficiency causing the data error in the erroneous data record. Responsive to the computer determining that a first set of data records in the received data set have characteristics that match a first pattern of the one or more patterns, the computer assigns the first set of data records of the received data set having characteristics that match the first pattern to a first error group.
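
As a rough illustration of the grouping step in this abstract, the sketch below models each pattern as a (root cause, predicate) pair and assigns every record to the group of the first pattern it matches. Representing patterns as Python predicates and using first-match semantics are assumptions. The name `group_errors` is hypothetical.

```python
def group_errors(records, patterns):
    """Group erroneous records by the root cause of the first matching pattern.

    `patterns` is a list of (root_cause, predicate) pairs. Records that
    match no pattern are left out of the result.
    """
    groups = {root_cause: [] for root_cause, _predicate in patterns}
    for record in records:
        for root_cause, predicate in patterns:
            if predicate(record):
                groups[root_cause].append(record)
                break
    return groups
```

A pattern such as "postal code is empty" could, for instance, be tied to a root cause like "address form does not validate postal codes".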

Type: Grant

Filed: September 2, 2015

Date of Patent: August 20, 2019

Assignee: International Business Machines Corporation

Inventors: Peter Gerstl, Mike Grasselt, Albert Maier, Thomas Schwarz, Oliver Suhre

Data classification

Publication number: 20190251107

Abstract: The invention relates to a computer-implemented method for classifying a set of data values. For each of the data values of the set of data values, a set of one or more terms associated with the respective data value is determined using one or more first knowledge bases. A set of common terms is determined. The set of common terms comprises terms present in more than one of the sets of terms. For each of the common terms, a number of hits for a lookup query against one or more second knowledge data bases is determined. One or more common terms of the set of common terms with the smallest number of hits are determined and a result is returned. The result comprises the one or more common terms with the smallest number of hits as one or more candidate classes for classifying the set of data values.
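
The selection of candidate classes described in this abstract could look roughly like the sketch below, for illustration only. The first knowledge base is modeled as a dictionary from data value to terms, and the second as a callable that returns a hit count for a term. Both representations and the name `candidate_classes` are assumptions.

```python
from collections import Counter


def candidate_classes(values, term_lookup, hit_count):
    """Return the common terms with the fewest hits as candidate classes."""
    # One set of terms per data value, from the first knowledge base.
    term_sets = [set(term_lookup.get(value, ())) for value in values]
    # Common terms are those present in more than one of the sets.
    occurrences = Counter(term for terms in term_sets for term in terms)
    common = [term for term, count in occurrences.items() if count > 1]
    if not common:
        return []
    # Look up each common term in the second knowledge base.
    hits = {term: hit_count(term) for term in common}
    fewest = min(hits.values())
    return sorted(term for term, h in hits.items() if h == fewest)
```

One way to read the "smallest number of hits" rule is that a rarer term tends to be more specific, so it is preferred over a broad term such as "place".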

Type: Application

Filed: April 23, 2019

Publication date: August 15, 2019

Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet

Cognitive data anonymization

Publication number: 20190251290

Abstract: A computer-implemented method for data anonymization comprises receiving a request for data that needs anonymization. The request comprises at least one field descriptor of data to be retrieved and a usage scenario of a user for the requested data. Then, based on the usage scenario, an anonymization algorithm to be applied to the data that is referred to by the field descriptor is determined. Subsequently, the determined anonymization algorithm is applied to the data that is referred to by the field descriptor. A test is performed as to whether the degree of anonymization fulfills a requirement that is related to the usage scenario. If the requirement is fulfilled, access to the anonymized data is provided.

Type: Application

Filed: April 23, 2019

Publication date: August 15, 2019

Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet

Processing data sets in a big data repository by executing agents to update annotations of the data sets

Patent number: 10338960

Abstract: The invention provides for a method for processing a plurality of data sets (105; 106; 108; 110-113; DB1; DB2) in a data repository (104) for storing at least unstructured data, the method comprising: providing (302) a set of agents (150-168), each agent being operable to trigger the processing of one or more of the data sets, the execution of each of said agents being automatically triggered in case one or more conditions assigned to said agent are met, at least one of the conditions relating to the existence, structure, content and/or annotations of the data set whose processing can be triggered by said agent; executing (304) a first one of the agents; updating (306) the annotations (115) of the first data set by the first agent; and executing (308) a second one of the agents, said execution being triggered by the updated annotations of the first data set meeting the conditions of the second agent, thereby triggering a further updating of the annotations of the first data set.

Type: Grant

Filed: February 18, 2015

Date of Patent: July 2, 2019

Assignee: International Business Machines Corporation

Inventors: Albert Maier, Yannick Saillet, Harald C. Smith, Daniel C. Wolfson

Refining classification results based on glossary relationships

Publication number: 20190179949

Abstract: A method, system and computer program product for classifying a data collection of data of a predefined domain. A hierarchical representation scheme describing terms of the domain and one or more relationships between the terms is provided. At least one classifier may be applied on the data collection, resulting in a set of term assignments. Each term assignment of the term assignments associates a term candidate with a respective confidence value to the collection or to one or more data items of the collection. At least one of the term assignments may be refined based on the representation scheme and the set of term assignments.

Type: Application

Filed: December 11, 2017

Publication date: June 13, 2019

Inventors: Peter Gerstl, Robert Kern, Albert Maier, Thomas Schwarz, Oliver Suhre

Cognitive data anonymization

Publication number: 20190156060

Abstract: A computer-implemented method for data anonymization comprises receiving a request for data that needs anonymization. The request comprises at least one field descriptor of data to be retrieved and a usage scenario of a user for the requested data. Then, based on the usage scenario, an anonymization algorithm to be applied to the data that is referred to by the field descriptor is determined. Subsequently, the determined anonymization algorithm is applied to the data that is referred to by the field descriptor. A test is performed as to whether the degree of anonymization fulfills a requirement that is related to the usage scenario. If the requirement is fulfilled, access to the anonymized data is provided.

Type: Application

Filed: November 17, 2017

Publication date: May 23, 2019

Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet
