Patents by Inventor Christoph Lingenfelder

Christoph Lingenfelder has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20120158624
    Abstract: A predictive analysis generates a predictive model (Padj(Y|X)) based on two separate pieces of information, a set of original training data (Dorig), and a “true” distribution of indicators (Ptrue(X)). The predictive analysis begins by generating a base model distribution (Pgen(Y|X)) from the original training data set (Dorig) containing tuples (x,y) of indicators (x) and corresponding labels (y). Using the “true” distribution (Ptrue(X)) of indicators, a random data set (D′) of indicator records (x) is generated reflecting this “true” distribution (Ptrue(X)). Subsequently, the base model (Pgen(Y|X)) is applied to said random data set (D′), thus assigning a label (y) or a distribution of labels to each indicator record (x) in said random data set (D′) and generating an adjusted training set (Dadj). Finally, an adjusted predictive model (Padj(Y|X)) is trained based on said adjusted training set (Dadj).
    Type: Application
    Filed: August 19, 2011
    Publication date: June 21, 2012
    Applicant: International Business Machines Corporation
    Inventors: Christoph Lingenfelder, Pascal Pompey, Michael Wurst
  • Publication number: 20120084251
    Abstract: A first data mining model and a second data mining model are compared. A first data mining model M1 represents results of a first data mining task on a first data set D1 and provides a set of first prediction values. A second data mining model M2 represents results of a second data mining task on a second data set D2 and provides a set of second prediction values. A relation R is determined between said sets of prediction values. For at least a first record of an input data set, a first and second probability distribution is created based on the first and second data mining models applied to the first record. A distance measure d is calculated for said first record using the first and second probability distributions and the relation. At least one region of interest is determined based on said distance measure d.
    Type: Application
    Filed: August 19, 2011
    Publication date: April 5, 2012
    Applicant: International Business Machines Corporation
    Inventors: Christoph Lingenfelder, Pascal Pompey, Michael Wurst
  • Publication number: 20120047099
    Abstract: A method, system, and computer usable program product for non-intrusive event-driven prediction of a metric in a data processing environment are provided in the illustrative embodiments. At least one set of events is observed in the data processing environment, the set of events being generated by several processes executing in the data processing environment. A subset of the set of events is tracked for an observation period, the tracking resulting in bookkeeping information about the subset of events. A pattern of events is detected in the bookkeeping information. The pattern is formed as a tuple representing a process in the several processes, the metric corresponding to the process. A prediction model is selected for the tuple. The prediction model is supplied with the tuple and executed to generate a predicted value of the metric.
    Type: Application
    Filed: August 18, 2010
    Publication date: February 23, 2012
    Applicant: International Business Machines Corporation
    Inventors: Hung-Yang Chang, Joachim H. Frank, Christoph Lingenfelder, Liangzhao Zeng
  • Patent number: 8122045
    Abstract: The invention relates to a method for mapping at least one data column from a database source to at least one data column of a data target, the method comprising: defining at least one reference column of the data target and at least one database source column; performing a comparison of data contained in the data column(s) with the reference column(s); and determining mapping candidates between the data column(s) and the reference column(s).
    Type: Grant
    Filed: January 9, 2008
    Date of Patent: February 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Christoph Lingenfelder, Stefan Raspl, Yannick Saillet
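The comparison step described in the abstract above could be sketched roughly as follows. This is an illustration only: the Jaccard overlap score, the function names, and the 0.5 threshold are assumptions made for the example and are not taken from the patent.

```python
def jaccard(a, b):
    """Overlap of two value collections, |A & B| / |A | B| (assumed score)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def mapping_candidates(source_columns, reference_columns, threshold=0.5):
    """Return (source, reference, score) triples whose score meets the
    threshold, best score first. Both arguments map a column name to the
    list of values found in that column."""
    candidates = []
    for s_name, s_values in source_columns.items():
        for r_name, r_values in reference_columns.items():
            score = jaccard(s_values, r_values)
            if score >= threshold:
                candidates.append((s_name, r_name, score))
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Hypothetical source and reference columns.
source = {"col_a": ["DE", "FR", "US"], "col_b": ["red", "green"]}
reference = {"country": ["DE", "FR", "US", "IT"],
             "colour": ["red", "green", "blue"]}
result = mapping_candidates(source, reference)
```

Here `col_a` is proposed as a candidate for `country` and `col_b` for `colour`; a real system would likely combine several comparison measures.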
  • Patent number: 7937350
    Abstract: The invention relates to a method for determining a time for retraining a data mining model, including the steps of: calculating multivariate statistics of a training model during a training phase; storing the multivariate statistics in the data mining model; evaluating reliability of the data mining model based on the multivariate statistics and at least one distribution parameter, and deciding to retrain the data mining model based on an arbitrary measure of one or more statistical parameters including an F-test statistical analysis.
    Type: Grant
    Filed: November 6, 2007
    Date of Patent: May 3, 2011
    Assignee: International Business Machines Corporation
    Inventors: Christoph Lingenfelder, Stefan Raspl, Yannick Saillet
  • Patent number: 7882128
    Abstract: Methods and apparatus, including computer program products, implementing and using techniques for pattern detection in input data containing several transactions, each transaction having at least one item. Filter conditions for interesting patterns are received, and a first set of filter conditions applicable in connection with generation of candidate patterns is determined. An evaluated candidate pattern is selected as a parent candidate pattern, and evaluation information about the parent candidate pattern is maintained. Child candidate patterns are generated by extending the parent candidate pattern and taking into account the first set of filter conditions. The child candidate patterns are evaluated with respect to the input data together in sets of similar candidate patterns and based on the evaluation information about the parent candidate pattern. At least one child candidate pattern successfully passing the evaluation step is recursively used as a parent candidate pattern.
    Type: Grant
    Filed: February 6, 2007
    Date of Patent: February 1, 2011
    Assignee: International Business Machines Corporation
    Inventors: Toni Bollinger, Ansgar Dorneich, Christoph Lingenfelder
  • Publication number: 20090292743
    Abstract: Embodiments of the invention provide a method for detecting changes in behavior of authorized users of computer resources and reporting the detected changes to the relevant individuals. The method includes evaluating actions performed by each user against user behavioral models and business rules. As a result of the analysis, a subset of users may be identified and reported as having unusual or suspicious behavior. In response, the management may provide feedback indicating that the user behavior is due to the normal expected business needs or that the behavior warrants further review. The management feedback is available for use by machine learning algorithms to improve the analysis of user actions over time. Consequently, investigation of user actions regarding computer resources is facilitated and data loss is prevented more efficiently relative to the prior art approaches with only minimal disruption to the ongoing business processes.
    Type: Application
    Filed: May 21, 2008
    Publication date: November 26, 2009
    Inventors: Joseph P. Bigus, Leon Gong, Christoph Lingenfelder
  • Publication number: 20090293121
    Abstract: Embodiments of the invention provide a method for detecting changes in behavior of authorized users of computer resources and reporting the detected changes to the relevant individuals. The method includes evaluating actions performed by each user against user behavioral models and business rules. As a result of the analysis, a subset of users may be identified and reported as having unusual or suspicious behavior. In response, the management may provide feedback indicating that the user behavior is due to the normal expected business needs or that the behavior warrants further review. The management feedback is available for use by machine learning algorithms to improve the analysis of user actions over time. Consequently, investigation of user actions regarding computer resources is facilitated and data loss is prevented more efficiently relative to the prior art approaches with only minimal disruption to the ongoing business processes.
    Type: Application
    Filed: May 21, 2008
    Publication date: November 26, 2009
    Inventors: Joseph P. Bigus, Leon Gong, Christoph Lingenfelder
  • Patent number: 7567972
    Abstract: A computerized method and system for analyzing a multitude of items in a high dimensional (n-dimensional) data space Dn each described by n item features. The method uses a mining function f with at least one control parameter Pi controlling the target of the data mining function. The method selects a transformation function T for reducing dimensions of the n-dimensional space by space-filling curves mapping said n-dimensional space to an m-dimensional space (m<n). The method determines a transformed control parameter PTi controlling the target of the data mining function in the m-dimensional space. The method applies the selected transformation function T on the multitude Dn of items to create a transformed multitude Dm of items, executes the mining function f controlled by the transformed control parameter PTi on the transformed multitude of items Dm, and stores the result.
    Type: Grant
    Filed: February 26, 2004
    Date of Patent: July 28, 2009
    Assignee: International Business Machines Corporation
    Inventors: Reinhold Geiselhart, Christoph Lingenfelder, Janna Orechkina
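As an illustration of how a space-filling curve can map an n-dimensional space to fewer dimensions, the sketch below uses the Z-order (Morton) curve to map 2-D integer coordinates to a single key. The abstract does not say which curve is used, so the choice of Z-order and the name `morton_encode` are assumptions for this example.

```python
def morton_encode(x, y, bits=16):
    """Interleave the bits of x and y (non-negative ints below 2**bits)
    into a single Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key


# Hypothetical 2-D items mapped to 1-D keys.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)]
keys = [morton_encode(x, y) for x, y in points]
```

Points that are close in the plane tend to receive nearby keys, which is why a mining function can often be run on the reduced representation with suitably adjusted control parameters.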
  • Publication number: 20090182554
    Abstract: A list of reference terms can be provided. Text and the list of reference terms can be broken down into tokens. At least one candidate can be generated in the text for mapping to at least one of the reference terms. Characters of the candidate can be compared to characters of the reference term according to one or more mapping rules. A confidence value of the mapping can be generated based on the comparison of characters. Candidates can be ranked according to their confidence value.
    Type: Application
    Filed: January 8, 2009
    Publication date: July 16, 2009
    Applicant: International Business Machines Corporation
    Inventors: Stefan Abraham, Christoph Lingenfelder
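The tokenize, compare, and rank steps in the abstract above might look roughly like the sketch below. It is only an illustration: using `difflib.SequenceMatcher` as the character comparison, lower-casing as the sole mapping rule, and the 0.6 cut-off are all assumptions, not details from the application.

```python
from difflib import SequenceMatcher


def rank_candidates(text, reference_terms, min_confidence=0.6):
    """Return (token, reference term, confidence) triples, best first.

    Tokens are whitespace-separated words; the confidence is the
    character-level similarity ratio between token and reference term."""
    tokens = text.lower().split()
    terms = [t.lower() for t in reference_terms]
    ranked = []
    for token in tokens:
        for term in terms:
            confidence = SequenceMatcher(None, token, term).ratio()
            if confidence >= min_confidence:
                ranked.append((token, term, confidence))
    return sorted(ranked, key=lambda r: r[2], reverse=True)


# Hypothetical input with two misspelled country names.
ranked = rank_candidates("shipped to Germny and Frnce", ["Germany", "France"])
```

In this example the misspelled tokens are mapped to their reference terms, while unrelated words such as "shipped" fall below the cut-off.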
  • Publication number: 20090112927
    Abstract: A process of transforming data residing in databases, such as relational databases, into forms suitable as input to data analysis tools, such as predictive modeling tools, includes the steps of defining a business process problem to be solved and identifying data requirements. For example, the business process problem may relate to predicting a customer's propensity to make purchases in the future or a store's requirements for inventory in the future. In the process, a computer implemented method is used for automatically transforming data for data analysis such as predictive modeling. Database metadata that describe database tables, their interrelationships, dimensional information, fact tables and measures are accessed. A mining transformation profile is created to encapsulate aggregations and transformations on data stored in relational databases in order to convert the data to forms suitable for predictive mining tools.
    Type: Application
    Filed: October 26, 2007
    Publication date: April 30, 2009
    Inventors: Upendra Chitnis, Christoph Lingenfelder, Edwin Peter Dawson Pednault
  • Publication number: 20080208855
    Abstract: The invention relates to a method for mapping at least one data column from a database source to at least one data column of a data target, the method comprising: defining at least one reference column of the data target and at least one database source column; performing a comparison of data contained in the data column(s) with the reference column(s); and determining mapping candidates between the data column(s) and the reference column(s).
    Type: Application
    Filed: January 9, 2008
    Publication date: August 28, 2008
    Inventors: Christoph Lingenfelder, Stefan Raspl, Yannick Saillet
  • Publication number: 20080195650
    Abstract: The invention relates to a method for determining a time for retraining a data mining model, comprising the steps of: calculating multivariate statistics of a training model during a training phase; storing the multivariate statistics in the data mining model; monitoring at least one distribution parameter of application data as a function of time during a deployment phase; and evaluating reliability of the data mining model based on the multivariate statistics and at least one distribution parameter.
    Type: Application
    Filed: November 6, 2007
    Publication date: August 14, 2008
    Inventors: Christoph Lingenfelder, Stefan Raspl, Yannick Saillet
  • Publication number: 20070219992
    Abstract: Methods and apparatus, including computer program products, implementing and using techniques for pattern detection in input data containing several transactions, each transaction having at least one item. Filter conditions for interesting patterns are received, and a first set of filter conditions applicable in connection with generation of candidate patterns is determined. An evaluated candidate pattern is selected as a parent candidate pattern, and evaluation information about the parent candidate pattern is maintained. Child candidate patterns are generated by extending the parent candidate pattern and taking into account the first set of filter conditions. The child candidate patterns are evaluated with respect to the input data together in sets of similar candidate patterns and based on the evaluation information about the parent candidate pattern. At least one child candidate pattern successfully passing the evaluation step is recursively used as a parent candidate pattern.
    Type: Application
    Filed: February 6, 2007
    Publication date: September 20, 2007
    Applicant: International Business Machines Corporation
    Inventors: Toni Bollinger, Ansgar Dorneich, Christoph Lingenfelder
  • Publication number: 20070220030
    Abstract: Methods and apparatus, including computer program products, implementing and using techniques for compressing data included in several transactions. Each transaction has at least one item. A unique identifier is assigned to each different item and, if taxonomy is defined, to each different taxonomy parent. Sets of transactions are formed from the several transactions. The sets of transactions are stored using a computer data structure including: a list of identifiers of different items in the set of transactions, information indicating number of identifiers in the list, and bit field information indicating presence of the different items in the set of transactions, said bit field information being organized in accordance with the list for facilitating evaluation of patterns with respect to the set of transactions. A data structure for compressing data included in a set of transactions is also provided.
    Type: Application
    Filed: February 6, 2007
    Publication date: September 20, 2007
    Applicant: International Business Machines Corporation
    Inventors: Toni Bollinger, Ansgar Dorneich, Christoph Lingenfelder
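The data structure described in the abstract above (a list of item identifiers, their count, and bit fields marking presence) can be pictured with the following sketch. The layout chosen here, one integer bit field per item with one bit per transaction, and the function names are assumptions for illustration; the application may organize the bits differently.

```python
def compress(transactions):
    """Pack a set of transactions into (items, count, bitfields).

    items:     sorted list of the distinct item identifiers in the set
    count:     number of identifiers in that list
    bitfields: one int per item; bit t is set if transaction t contains it
    """
    items = sorted({item for t in transactions for item in t})
    bitfields = []
    for item in items:
        bits = 0
        for t_index, t in enumerate(transactions):
            if item in t:
                bits |= 1 << t_index
        bitfields.append(bits)
    return items, len(items), bitfields


def support(pattern, items, bitfields):
    """Count the transactions that contain every item of a non-empty
    pattern by AND-ing the items' bit fields."""
    position = {item: i for i, item in enumerate(items)}
    if any(item not in position for item in pattern):
        return 0
    masks = [bitfields[position[item]] for item in pattern]
    combined = masks[0]
    for mask in masks[1:]:
        combined &= mask
    return bin(combined).count("1")


# Hypothetical set of four transactions over items 1, 2 and 3.
items, count, bitfields = compress([{1, 2, 3}, {1, 3}, {2, 3}, {1, 2}])
```

With this layout, evaluating a candidate pattern reduces to a few bitwise AND operations, which is the kind of pattern evaluation the bit field organization is meant to facilitate.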
  • Patent number: 7272590
    Abstract: A system and method determine numerical representations for categorical data fields by taking advantage of the redundancy of the data records to allow automatic discovery of an order of the categories. A categorical data field is recoded by creating separate tables for each numerical data field occurring in the data records. The separate tables are sorted according to the numerical values of the respective data fields. The recoding of the categories is performed based on the average sort order of occurrences of the category in a specific sorted table. The standard deviation of the numerical codes provided by the categories is calculated for each of the separate recoding tables. The recoding table with the maximum standard deviation is selected as the recoding table to perform the recoding of the categories contained in the respective categorical data field of the data records.
    Type: Grant
    Filed: March 6, 2003
    Date of Patent: September 18, 2007
    Assignee: International Business Machines Corporation
    Inventors: Andreas Arning, Christoph Lingenfelder, Gregor Meyer, Dieter Roller, Swen Wohland
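The recoding procedure in the abstract above can be sketched as follows: sort the records by each numerical field, give each category the average sort position of its occurrences, and keep the numerical field whose codes have the largest standard deviation. The function name, the record layout, and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import pstdev


def recode(records, cat_field, num_fields):
    """Return (chosen numerical field, {category: numeric code})."""
    best = None
    for field in num_fields:
        # One "recoding table" per numerical field: records sorted by it.
        ordered = sorted(records, key=lambda r: r[field])
        positions = {}
        for rank, rec in enumerate(ordered):
            positions.setdefault(rec[cat_field], []).append(rank)
        # Code of a category = average sort position of its occurrences.
        codes = {c: sum(p) / len(p) for c, p in positions.items()}
        spread = pstdev(list(codes.values()))
        if best is None or spread > best[0]:
            best = (spread, field, codes)
    return best[1], best[2]


# Hypothetical records: "size" is categorical, the other fields numerical.
records = [
    {"size": "S", "price": 10, "noise": 5},
    {"size": "S", "price": 12, "noise": 1},
    {"size": "M", "price": 20, "noise": 4},
    {"size": "M", "price": 22, "noise": 2},
    {"size": "L", "price": 30, "noise": 3},
    {"size": "L", "price": 32, "noise": 0},
]
field, codes = recode(records, "size", ["price", "noise"])
```

In this example `price` separates the categories cleanly, so it yields the more spread-out codes and is selected over the unrelated `noise` field.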
  • Patent number: 7177863
    Abstract: A system and associated method for tuning a data clustering program to a clustering task determine at least one internal parameter of a data clustering program. The determination of one or more of the internal parameters of the data clustering program occurs before the clustering begins. Consequently, clustering does not need to be performed iteratively, thus improving clustering program performance in terms of the required processing time and processing resources. The system provides pairs of data records; the user indicates whether or not these data records should belong to the same cluster. The similarity values of the records of the selected pairs are calculated based on the default parameters of the clustering program. From the resulting similarity values, an optimal similarity threshold is determined. When the optimization criterion does not yield a single optimal similarity threshold range, equivalent candidate ranges are selected.
    Type: Grant
    Filed: March 14, 2003
    Date of Patent: February 13, 2007
    Assignee: International Business Machines Corporation
    Inventors: Boris Charpiot, Barbara Hartel, Christoph Lingenfelder, Thilo Maier
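One way to picture the threshold determination in the abstract above is the sketch below. It assumes the optimization criterion is simply the number of user-labelled pairs classified consistently with the user's answers; that criterion, the midpoint cut points, and the function name are assumptions for illustration, not details from the patent.

```python
def best_thresholds(labelled_pairs):
    """labelled_pairs: list of (similarity, same_cluster) tuples.

    Pairs with similarity >= threshold are treated as 'same cluster'.
    Returns (correct, thresholds): the best achievable number of
    consistently classified pairs and every cut point that achieves it."""
    sims = sorted({s for s, _ in labelled_pairs})
    # Candidate cut points: below all values, between neighbours, above all.
    cuts = ([sims[0] - 1.0]
            + [(a + b) / 2 for a, b in zip(sims, sims[1:])]
            + [sims[-1] + 1.0])
    scored = [(sum((s >= cut) == same for s, same in labelled_pairs), cut)
              for cut in cuts]
    best = max(score for score, _ in scored)
    return best, [cut for score, cut in scored if score == best]


# Hypothetical user answers for six record pairs.
pairs = [(0.9, True), (0.8, True), (0.6, False),
         (0.4, True), (0.3, False), (0.1, False)]
correct, thresholds = best_thresholds(pairs)
```

In this example two cut points are equally good, which mirrors the case in which no single optimal threshold range exists and equivalent candidates are kept.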
  • Patent number: 7099880
    Abstract: A system and associated method for data mining prediction is presented according to which the user selects a database table by means of a graphical user interface. Some records in the table are complete, while other records are incomplete. A subset of records of the database table is determined wherein each record of the subset contains a data value in the column selected for prediction. This subset of records is used to generate a model by means of a data mining algorithm, such as linear regression, radial basis function, decision tree or neural network methods. The resulting model is then utilized to predict the empty data fields in the column. After completing the prediction, the predicted values are entered into the column for display to the user.
    Type: Grant
    Filed: December 6, 2002
    Date of Patent: August 29, 2006
    Assignee: International Business Machines Corporation
    Inventors: Andreas Arning, Martin Keller, Christoph Lingenfelder, Gregor Meyer
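The prediction flow in the abstract above (train on the complete records, then fill the empty fields) can be illustrated with a minimal sketch using simple linear regression, one of the algorithms the abstract names. The single predictor column, the column names, and the helper functions are assumptions for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_missing(rows, x_col, y_col):
    """Train on rows where y_col is present, then predict the empty ones.
    Returns new rows; the input rows are left unchanged."""
    complete = [r for r in rows if r[y_col] is not None]
    slope, intercept = fit_line([r[x_col] for r in complete],
                                [r[y_col] for r in complete])
    filled = []
    for r in rows:
        row = dict(r)
        if row[y_col] is None:
            row[y_col] = slope * row[x_col] + intercept
        filled.append(row)
    return filled


# Hypothetical table: the last record has no value in the "price" column.
table = [
    {"area": 50, "price": 100},
    {"area": 100, "price": 200},
    {"area": 150, "price": 300},
    {"area": 80, "price": None},
]
filled = fill_missing(table, "area", "price")
```

The complete records form the training subset, and the fitted model supplies the value for the incomplete record, which could then be written back into the column for display.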
  • Publication number: 20040225638
    Abstract: The proposed computerized method and system is adapted for analyzing a multitude of items in a high dimensional (n-dimensional) data space Dn each described by n item features. The method uses a mining function f with at least one control parameter Pi controlling the target of the data mining function.
    Type: Application
    Filed: February 26, 2004
    Publication date: November 11, 2004
    Applicant: International Business Machines Corporation
    Inventors: Reinhold Geiselhart, Christoph Lingenfelder, Janna Orechkina
  • Publication number: 20030204483
    Abstract: A system and method determine numerical representations for categorical data fields by taking advantage of the redundancy of the data records to allow automatic discovery of an order of the categories. A categorical data field is recoded by creating separate tables for each numerical data field occurring in the data records. The separate tables are sorted according to the numerical values of the respective data fields. The recoding of the categories is performed based on the average sort order of occurrences of the category in a specific sorted table. The standard deviation of the numerical codes provided by the categories is calculated for each of the separate recoding tables. The recoding table with the maximum standard deviation is selected as the recoding table to perform the recoding of the categories contained in the respective categorical data field of the data records.
    Type: Application
    Filed: March 6, 2003
    Publication date: October 30, 2003
    Applicant: International Business Machines Corporation
    Inventors: Andreas Arning, Christoph Lingenfelder, Gregor Meyer, Dieter Roller, Swen Wohland