Patents by Inventor Jens P. Seifert

Jens P. Seifert has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Cognitive recommendations for data preparation

Patent number: 11429878

Abstract: A method, computer system, and computer program product for providing recommendations about processing datasets. A set of machine learning models are provided for use in respectively determining a data processing action performable on a dataset based on a respective set of features of the dataset. A current dataset is received. A set of features of the current dataset are determined. One or more data processing actions are generated to be executed on the current dataset, which are determined by at least two machine learning models of the provided set, based on the determined set of features of the current dataset. One or more of the data processing actions are performed on the current dataset.

Type: Grant

Filed: September 22, 2017

Date of Patent: August 30, 2022

Assignee: International Business Machines Corporation

Inventors: Yannick Saillet, Martin A. Oberhofer, Jens P. Seifert
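
As a rough illustration of the flow in this abstract (features are computed for a dataset, and at least two models each propose processing actions from those features), a minimal sketch follows. Everything in it is an assumption made for illustration: the feature names, the thresholds, the action names, and the two plain functions standing in for trained machine learning models. It does not come from the patent text.

```python
def extract_features(rows):
    """Compute two toy features for a dataset given as a list of dicts."""
    columns = list(rows[0].keys())
    cell_count = len(rows) * len(columns)
    missing = sum(1 for row in rows for col in columns if row[col] is None)
    distinct = len({tuple(sorted(row.items())) for row in rows})
    return {
        "missing_ratio": missing / cell_count,
        "duplicate_ratio": 1 - distinct / len(rows),
    }


def missing_value_model(features):
    # Hypothetical stand-in for a trained model: a fixed threshold rule.
    return ["impute_missing"] if features["missing_ratio"] > 0.1 else []


def duplicate_model(features):
    # Hypothetical stand-in for a second trained model.
    return ["drop_duplicates"] if features["duplicate_ratio"] > 0 else []


def recommend(rows, models):
    """Collect the actions proposed by every model, without repeats."""
    features = extract_features(rows)
    actions = []
    for model in models:
        for action in model(features):
            if action not in actions:
                actions.append(action)
    return actions


rows = [
    {"a": 1, "b": None},
    {"a": 1, "b": None},
    {"a": 2, "b": 3},
]
recommended = recommend(rows, [missing_value_model, duplicate_model])
```

The sketch stops at the recommendation step; performing the chosen actions on the dataset, which the abstract also mentions, is left out.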
Model-driven profiling job generator for data sources

Patent number: 11023483

Abstract: Embodiments of the present invention disclose generating data profiling jobs for source data in a data processing system, the source data being described by at least one source functional data model. A target functional data model is provided, for describing target data that can be generated from the source data. One or more source functional data models are identified that correspond to the target functional data model. At least one functional source-to-target model mapping is associated to at least one source-target pair based on the target functional data model and identified source functional data models. A physical source-to-target model mapping for at least one source-target pair based on the logical source-to-target model mapping is calculated. For all physical source attributes, the needed data profiling jobs are generated based on the target attribute for analyzing the physical source attributes.

Type: Grant

Filed: August 4, 2016

Date of Patent: June 1, 2021

Assignee: International Business Machines Corporation

Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens P. Seifert
Model-driven profiling job generator for data sources

Patent number: 11023484

Abstract: Embodiments of the present invention disclose generating data profiling jobs for source data in a data processing system, the source data being described by at least one source functional data model. A target functional data model is provided, for describing target data that can be generated from the source data. One or more source functional data models are identified that correspond to the target functional data model. At least one functional source-to-target model mapping is associated to at least one source-target pair based on the target functional data model and identified source functional data models. A physical source-to-target model mapping for at least one source-target pair based on the logical source-to-target model mapping is calculated. For all physical source attributes, the needed data profiling jobs are generated based on the target attribute for analyzing the physical source attributes.

Type: Grant

Filed: December 6, 2017

Date of Patent: June 1, 2021

Assignee: International Business Machines Corporation

Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens P. Seifert
Logging process in a data storage system

Patent number: 10970173

Abstract: A logging process in a data storage system having a set of storage tiers, each storage tier of the set of storage tiers having different performance characteristics, wherein the set of storage tiers is divided into a plurality of subsets of storage tiers using the performance characteristics, may include initiating the logging process for creating a separate log file for each of the plurality of subsets of storage tiers for maintaining a history of data changes in the subset of storage tiers, thereby creating a plurality of log files. In response to a change in data stored in at least one storage tier of a subset of storage tiers of the plurality of subsets of storage tiers, one or more log records including information about the change may be generated and written into respective log files.

Type: Grant

Filed: November 30, 2018

Date of Patent: April 6, 2021

Assignee: International Business Machines Corporation

Inventors: Martin Oberhofer, Jens P. Seifert, Kostas Rakopoulos, Stephen Rees
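
The idea in this abstract (storage tiers are grouped into subsets by a performance characteristic, and each subset gets its own change log) can be sketched as below. This is a minimal sketch under stated assumptions: the tier names, the use of latency as the performance characteristic, the single threshold, and the in-memory lists standing in for log files are all hypothetical.

```python
from collections import defaultdict


def divide_tiers(latency_ms, threshold_ms):
    """Split tiers into 'fast' and 'slow' subsets by a latency threshold."""
    subsets = defaultdict(list)
    for tier, latency in latency_ms.items():
        name = "fast" if latency <= threshold_ms else "slow"
        subsets[name].append(tier)
    return dict(subsets)


class TieredChangeLog:
    """Keeps one log (here: a list of records) per subset of tiers."""

    def __init__(self, subsets):
        self.subset_of = {
            tier: name for name, tiers in subsets.items() for tier in tiers
        }
        self.logs = {name: [] for name in subsets}

    def record_change(self, tier, key, old, new):
        # Write the change into the log of the subset that owns the tier.
        record = {"tier": tier, "key": key, "old": old, "new": new}
        self.logs[self.subset_of[tier]].append(record)
        return record


# Assumed example tiers with made-up latencies in milliseconds.
subsets = divide_tiers(
    {"ram": 0.0001, "ssd": 0.1, "hdd": 8.0, "tape": 5000.0}, threshold_ms=1.0
)
change_log = TieredChangeLog(subsets)
change_log.record_change("ssd", "k1", 1, 2)
change_log.record_change("tape", "k2", None, "x")
```

Keeping the logs separate means a change on a slow tier never has to be written to the log that serves the fast tiers.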
Aligning most significant bits of different sized elements in comparison result vectors

Patent number: 10740098

Abstract: A method, computer program product, and computer system for providing a comparison result vector of a predefined number of elements w resulting from comparison of multiple vectors of compressed data within a processor comprising registers of same size m is provided. Vector elements of the comparison result vector are stored in a register of the registers. Zero bits are padded between vector elements of each of the comparison result vectors. A compare bit result vector indicative of the vector elements is generated for accessing the results of the comparison in the comparison result vector.

Type: Grant

Filed: February 6, 2018

Date of Patent: August 11, 2020

Assignee: International Business Machines Corporation

Inventors: Cedric Lichtenau, Silvia M. Mueller, Jens P. Seifert, Jörg-Stephan Vogt, Markus Lachenmayr, L'Emir Salim Chehab, Pavankrishna Ellore Ramesh, Sourabh Chougule
METHOD FOR CREATING RUN-TIME EXECUTABLES FOR DATA ANALYSIS FUNCTIONS

Publication number: 20200225941

Abstract: The present disclosure relates to a method for creating run-time executables for data analysis functions. The method comprises, in response to receiving a data analysis request from a user, selecting from a repository of data analysis functions a set of data analysis functions for execution in a hosting environment or on premises of the user. Usage conditions of the set of data analysis functions by the user may be determined. An additional code for applying the determined usage conditions may be created. The selected data analysis functions and the additional code may be compiled, resulting in an executable code. The executable code may be certified. The certified executable code may be deployed or provided for download to a run-time environment for certified executable codes.

Type: Application

Filed: January 15, 2019

Publication date: July 16, 2020

Inventors: Martin Oberhofer, Mike W. Grasselt, Yannick Saillet, Jens P. Seifert
METHOD FOR CREATING RUN-TIME EXECUTABLES FOR DATA ANALYSIS FUNCTIONS

Publication number: 20200225942

Abstract: The present disclosure relates to a method for creating run-time executables for data analysis functions. The method comprises, in response to receiving a data analysis request from a user, selecting from a repository of data analysis functions a set of data analysis functions for execution in a hosting environment or on premises of the user. Usage conditions of the set of data analysis functions by the user may be determined. An additional code for applying the determined usage conditions may be created. The selected data analysis functions and the additional code may be compiled, resulting in an executable code. The executable code may be certified. The certified executable code may be deployed or provided for download to a run-time environment for certified executable codes.

Type: Application

Filed: July 2, 2019

Publication date: July 16, 2020

Inventors: Martin Oberhofer, Mike W. Grasselt, Yannick Saillet, Jens P. Seifert
Multiple record linkage algorithm selector

Patent number: 10621492

Abstract: The present disclosure relates to a method for centrally processing data records using a record linkage algorithm. The method comprises providing a centralized master repository for storing data records in a predefined data structure having a set of attributes. At least one clustering metric is provided. Clusters of records may be determined using a clustering function that is based on the at least one clustering metric. For each particular cluster, a set of configuration data for the record linkage algorithm may be defined based on a value of the clustering metric within that particular cluster. The individual data records may be assigned to one or more clusters of the clusters using the clustering metric values and the record linkage algorithm may be applied to a set of two or more individual data records assigned to at least one common cluster using the set of configuration data for the common cluster.

Type: Grant

Filed: October 21, 2016

Date of Patent: April 14, 2020

Assignee: International Business Machines Corporation

Inventors: Martin Oberhofer, Yannick Saillet, Scott Schumacher, Jens P. Seifert
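
To make the steps in this abstract concrete (records are clustered by a clustering metric, each cluster carries its own linkage configuration, and linkage is applied only among records sharing a cluster), here is a minimal sketch. The attribute names, the completeness metric, the two cluster names, and the `min_agreeing` settings are all assumptions chosen for illustration. The sketch also places each record in exactly one cluster, whereas the abstract allows more than one.

```python
from collections import defaultdict
from itertools import combinations

ATTRIBUTES = ("name", "city", "phone")

# Hypothetical per-cluster configuration; the values are arbitrary.
CONFIG = {
    "complete": {"min_agreeing": 2},
    "sparse": {"min_agreeing": 1},
}


def cluster_of(record):
    """Clustering metric: are all attributes populated?"""
    populated = sum(1 for attr in ATTRIBUTES if record.get(attr))
    return "complete" if populated == len(ATTRIBUTES) else "sparse"


def agreeing(a, b):
    """Count populated attributes on which two records agree exactly."""
    return sum(
        1 for attr in ATTRIBUTES if a.get(attr) and a.get(attr) == b.get(attr)
    )


def link_records(records):
    """Return index pairs linked under their cluster's configuration."""
    clusters = defaultdict(list)
    for idx, record in enumerate(records):
        clusters[cluster_of(record)].append(idx)
    links = []
    for name, members in clusters.items():
        needed = CONFIG[name]["min_agreeing"]
        for i, j in combinations(members, 2):
            if agreeing(records[i], records[j]) >= needed:
                links.append((i, j))
    return sorted(links)


records = [
    {"name": "Ann", "city": "Bonn", "phone": "1"},
    {"name": "Ann", "city": "Bonn", "phone": "2"},
    {"name": "Bob", "city": "", "phone": ""},
    {"name": "Bob", "city": "Ulm", "phone": ""},
    {"name": "Ann", "city": "Kiel", "phone": "3"},
]
links = link_records(records)
```

Records 0 and 4 share only a name, which is not enough under the assumed setting for the "complete" cluster, so they stay unlinked.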
Multiple record linkage algorithm selector

Patent number: 10621493

Abstract: The present disclosure relates to a method for centrally processing data records using a record linkage algorithm. The method comprises providing a centralized master repository for storing data records in a predefined data structure having a set of attributes. At least one clustering metric is provided. Clusters of records may be determined using a clustering function that is based on the at least one clustering metric. For each particular cluster, a set of configuration data for the record linkage algorithm may be defined based on a value of the clustering metric within that particular cluster. The individual data records may be assigned to one or more clusters of the clusters using the clustering metric values and the record linkage algorithm may be applied to a set of two or more individual data records assigned to at least one common cluster using the set of configuration data for the common cluster.

Type: Grant

Filed: January 2, 2018

Date of Patent: April 14, 2020

Assignee: International Business Machines Corporation

Inventors: Martin Oberhofer, Yannick Saillet, Scott Schumacher, Jens P. Seifert
Preventing staleness in query results when using asynchronously updated indexes

Patent number: 10614070

Abstract: A method, computer program product, and computer system for optimizing query processing is provided. An asynchronously updated index is provided for a main dataset. A time-sequenced log of data modifications to the main dataset is provided. A query of the main dataset is received. The main dataset is joined with the time-sequenced log data resulting in a first intermediate result. The query is processed by keeping one or more entries satisfying the query by emulating a function of the asynchronously updated index resulting in a second intermediate result. Updated and deleted dataset entries are deleted from the asynchronously updated index. The query is processed resulting in a third intermediate result. A union of the second intermediate result and third intermediate result is built defining a final result.

Type: Grant

Filed: October 27, 2015

Date of Patent: April 7, 2020

Assignee: International Business Machines Corporation

Inventors: Marion E. Behnen, Joern Klauke, Jens P. Seifert, Calisto P. Zuzarte
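
One way to read the steps in this abstract is sketched below. It is a loose interpretation, not the patented method: the main dataset is a dict from row id to value, the stale index maps a value to the row ids it knew about at its last refresh, and the log lists the row ids modified since then. All of these structures and names are hypothetical.

```python
def query_with_stale_index(main, index, log, wanted):
    """Return ids of rows whose current value equals `wanted`."""
    touched = {row_id for row_id, _operation in log}

    # First intermediate result: join the log with the main dataset.
    # Deleted rows are no longer in `main`, so they drop out here.
    first = {row_id: main[row_id] for row_id in touched if row_id in main}

    # Second intermediate result: evaluate the predicate directly on the
    # recently modified rows, emulating what a fresh index would return.
    second = {row_id for row_id, value in first.items() if value == wanted}

    # Third intermediate result: use the index, but ignore every entry
    # whose row was modified after the index was last refreshed.
    third = {
        row_id for row_id in index.get(wanted, set()) if row_id not in touched
    }

    return second | third


# Row 2 was updated from "a" to "b", row 3 was deleted, row 4 was inserted;
# the index has not yet seen any of these changes.
main = {1: "a", 2: "b", 4: "a"}
stale_index = {"a": {1, 2, 3}}
log = [(2, "update"), (3, "delete"), (4, "insert")]
result = query_with_stale_index(main, stale_index, log, "a")
```

Using the stale index alone would return rows 1, 2, and 3; combining it with the log yields rows 1 and 4, which matches the current state of the main dataset.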
Preventing staleness in query results when using asynchronously updated indexes

Patent number: 10606839

Abstract: A method, computer program product, and computer system for optimizing query processing is provided. An asynchronously updated index is provided for a main dataset. A time-sequenced log of data modifications to the main dataset is provided. A query of the main dataset is received. The main dataset is joined with the time-sequenced log data resulting in a first intermediate result. The query is processed by keeping one or more entries satisfying the query by emulating a function of the asynchronously updated index resulting in a second intermediate result. Updated and deleted dataset entries are deleted from the asynchronously updated index. The query is processed resulting in a third intermediate result. A union of the second intermediate result and third intermediate result is built defining a final result.

Type: Grant

Filed: May 23, 2017

Date of Patent: March 31, 2020

Assignee: International Business Machines Corporation

Inventors: Marion E. Behnen, Joern Klauke, Jens P. Seifert, Calisto P. Zuzarte
Method to build reconfigurable variable length comparators

Patent number: 10579375

Abstract: The present disclosure relates to performing comparisons between a first and a second vector. The memory location has a size or length of m bits. A compare block to compare two single bits is used. The compare block comprises: two input bits associated to one of the bits from the first and the second vector respectively; a greater than input bit and a lower than input bit; a cascade enable input bit to control if the greater than input bit and the lower than input bit are considered; a greater than result bit, a lower than result bit, and an equal result bit. A daisy chaining of m of the one-bit compare blocks is performed such that the result bits of one compare block represent the compare result of the previous compare blocks in the chain.

Type: Grant

Filed: February 6, 2018

Date of Patent: March 3, 2020

Assignee: International Business Machines Corporation

Inventors: Cedric Lichtenau, Silvia M. Mueller, Jens P. Seifert, Jörg-Stephan Vogt, Markus Lachenmayr, L'Emir Salim Chehab, Pavankrishna Ellore Ramesh, Sourabh Chougule
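
The one-bit compare block and the daisy chain described in this abstract can be simulated in software. The sketch below is an assumption about how such a block might behave, not the circuit from the patent: bits are processed from most significant to least significant, and clearing the cascade enable bit at the first bit of an element makes the block ignore the incoming result, which is what lets one chain be split into elements of different widths.

```python
def compare_block(a, b, gt_in, lt_in, cascade_enable):
    """One-bit compare block; all inputs and outputs are 0 or 1."""
    # The incoming result only counts when cascading is enabled.
    prev_gt = gt_in & cascade_enable
    prev_lt = lt_in & cascade_enable
    decided = prev_gt | prev_lt
    # A decision made on a higher bit wins; otherwise compare this bit.
    gt = prev_gt | ((1 - decided) & a & (1 - b))
    lt = prev_lt | ((1 - decided) & (1 - a) & b)
    eq = 1 - (gt | lt)
    return gt, lt, eq


def compare_chain(x_bits, y_bits, widths):
    """Compare packed elements; bits are MSB first, widths are positive."""
    assert sum(widths) == len(x_bits) == len(y_bits)
    results = []
    position = 0
    gt = lt = 0
    for width in widths:
        for offset in range(width):
            # Cascade is disabled on the first bit of every element.
            enable = 0 if offset == 0 else 1
            gt, lt, eq = compare_block(
                x_bits[position], y_bits[position], gt, lt, enable
            )
            position += 1
        results.append((gt, lt, eq))
    return results


x = [1, 0, 1, 1, 0, 0, 1, 0]
y = [1, 0, 0, 1, 0, 0, 1, 0]
as_two_nibbles = compare_chain(x, y, [4, 4])
as_one_byte = compare_chain(x, y, [8])
```

Each tuple is (greater, lower, equal) for one element. The same eight blocks give two 4-bit results or one 8-bit result depending only on where the cascade enable bits are cleared.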
METHOD TO REDUCE EFFORT IN VARIABLE WIDTH COMPARATORS

Publication number: 20190243649

Abstract: The present disclosure relates to a method, computer program product, and computer system to provide a comparison result vector of a predefined number of elements w resulting from comparison of multiple vectors of compressed data within a processor comprising registers of same size m. Vector elements of the comparison result vector are stored in a register of the registers. Zero bits are padded between vector elements of each of the comparison result vectors. A compare bit result vector indicative of the vector elements is generated for accessing the results of the comparison in the comparison result vector.

Type: Application

Filed: February 6, 2018

Publication date: August 8, 2019

Inventors: Cedric Lichtenau, Silvia M. Mueller, Jens P. Seifert, Jörg-Stephan Vogt, Markus Lachenmayr, L'Emir Salim Chehab, Pavankrishna Ellore Ramesh, Sourabh Chougule
METHOD TO BUILD RECONFIGURABLE VARIABLE LENGTH COMPARATORS

Publication number: 20190243650

Abstract: The present disclosure relates to performing comparisons between a first and a second vector. The memory location has a size or length of m bits. A compare block to compare two single bits is used. The compare block comprises: two input bits associated to one of the bits from the first and the second vector respectively; a greater than input bit and a lower than input bit; a cascade enable input bit to control if the greater than input bit and the lower than input bit are considered; a greater than result bit, a lower than result bit, and an equal result bit. A daisy chaining of m of the one-bit compare blocks is performed such that the result bits of one compare block represent the compare result of the previous compare blocks in the chain.

Type: Application

Filed: February 6, 2018

Publication date: August 8, 2019

Inventors: Cedric Lichtenau, Silvia M. Mueller, Jens P. Seifert, Jörg-Stephan Vogt, Markus Lachenmayr, L'Emir Salim Chehab, Pavankrishna Ellore Ramesh, Sourabh Chougule
LOGGING PROCESS IN A DATA STORAGE SYSTEM

Publication number: 20190102259

Abstract: A logging process in a data storage system having a set of storage tiers, each storage tier of the set of storage tiers having different performance characteristics, wherein the set of storage tiers is divided into a plurality of subsets of storage tiers using the performance characteristics, may include initiating the logging process for creating a separate log file for each of the plurality of subsets of storage tiers for maintaining a history of data changes in the subset of storage tiers, thereby creating a plurality of log files. In response to a change in data stored in at least one storage tier of a subset of storage tiers of the plurality of subsets of storage tiers, one or more log records including information about the change may be generated and written into respective log files.

Type: Application

Filed: November 30, 2018

Publication date: April 4, 2019

Inventors: Martin Oberhofer, Jens P. Seifert, Kostas Rakopoulos, Stephen Rees
COGNITIVE RECOMMENDATIONS FOR DATA PREPARATION

Publication number: 20190095801

Abstract: A method, computer system, and computer program product for providing recommendations about processing datasets. A set of machine learning models are provided for use in respectively determining a data processing action performable on a dataset based on a respective set of features of the dataset. A current dataset is received. A set of features of the current dataset are determined. One or more data processing actions are generated to be executed on the current dataset, which are determined by at least two machine learning models of the provided set, based on the determined set of features of the current dataset. One or more of the data processing actions are performed on the current dataset.

Type: Application

Filed: September 22, 2017

Publication date: March 28, 2019

Inventors: Yannick Saillet, Martin A. Oberhofer, Jens P. Seifert
Logging process in a data storage system

Patent number: 10176049

Abstract: A logging process in a data storage system having a set of storage tiers, each storage tier of the set of storage tiers having different performance characteristics, wherein the set of storage tiers is divided into a plurality of subsets of storage tiers using the performance characteristics, may include initiating the logging process for creating a separate log file for each of the plurality of subsets of storage tiers for maintaining a history of data changes in the subset of storage tiers, thereby creating a plurality of log files. In response to a change in data stored in at least one storage tier of a subset of storage tiers of the plurality of subsets of storage tiers, one or more log records including information about the change may be generated and written into respective log files.

Type: Grant

Filed: July 7, 2014

Date of Patent: January 8, 2019

Assignee: International Business Machines Corporation

Inventors: Martin Oberhofer, Jens P. Seifert, Kostas Rakopoulos, Stephen Rees
MULTIPLE RECORD LINKAGE ALGORITHM SELECTOR

Publication number: 20180121535

Abstract: The present disclosure relates to a method for centrally processing data records using a record linkage algorithm. The method comprises providing a centralized master repository for storing data records in a predefined data structure having a set of attributes. At least one clustering metric is provided. Clusters of records may be determined using a clustering function that is based on the at least one clustering metric. For each particular cluster, a set of configuration data for the record linkage algorithm may be defined based on a value of the clustering metric within that particular cluster. The individual data records may be assigned to one or more clusters of the clusters using the clustering metric values and the record linkage algorithm may be applied to a set of two or more individual data records assigned to at least one common cluster using the set of configuration data for the common cluster.

Type: Application

Filed: January 2, 2018

Publication date: May 3, 2018

Inventors: Martin Oberhofer, Yannick Saillet, Scott Schumacher, Jens P. Seifert
MULTIPLE RECORD LINKAGE ALGORITHM SELECTOR

Publication number: 20180113928

Abstract: The present disclosure relates to a method for centrally processing data records using a record linkage algorithm. The method comprises providing a centralized master repository for storing data records in a predefined data structure having a set of attributes. At least one clustering metric is provided. Clusters of records may be determined using a clustering function that is based on the at least one clustering metric. For each particular cluster, a set of configuration data for the record linkage algorithm may be defined based on a value of the clustering metric within that particular cluster. The individual data records may be assigned to one or more clusters of the clusters using the clustering metric values and the record linkage algorithm may be applied to a set of two or more individual data records assigned to at least one common cluster using the set of configuration data for the common cluster.

Type: Application

Filed: October 21, 2016

Publication date: April 26, 2018

Inventors: Martin Oberhofer, Yannick Saillet, Scott Schumacher, Jens P. Seifert
MODEL-DRIVEN PROFILING JOB GENERATOR FOR DATA SOURCES

Publication number: 20180096038

Abstract: Embodiments of the present invention disclose generating data profiling jobs for source data in a data processing system, the source data being described by at least one source functional data model. A target functional data model is provided, for describing target data that can be generated from the source data. One or more source functional data models are identified that correspond to the target functional data model. At least one functional source-to-target model mapping is associated to at least one source-target pair based on the target functional data model and identified source functional data models. A physical source-to-target model mapping for at least one source-target pair based on the logical source-to-target model mapping is calculated. For all physical source attributes, the needed data profiling jobs are generated based on the target attribute for analyzing the physical source attributes.

Type: Application

Filed: December 6, 2017

Publication date: April 5, 2018

Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens P. Seifert
