Patents by Inventor Yannick Saillet
Yannick Saillet has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20160092494
Abstract: A method, executed by a computer, for de-duplicating data includes receiving a dataset, pivoting the dataset along a set of columns that have a common domain to provide a pivoted dataset, de-duplicating the pivoted dataset to provide a de-duplicated dataset, and using the de-duplicated dataset. De-duplicating the pivoted dataset may include computing similarity scores for records that have different primary keys and merging records that have a similarity score that exceeds a selected threshold value. The method may include determining the set of columns having a common domain by referencing a business catalog and/or conducting a data classification operation on some or all of the columns of the dataset. The method may also include pivoting the dataset along another set of columns that have a different common domain. A computer system and computer program product corresponding to the method are also disclosed herein.
Type: Application
Filed: September 30, 2014
Publication date: March 31, 2016
Inventors: Namit Kabra, Yannick Saillet
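The pivot-then-compare idea in this abstract can be illustrated with a small sketch. It is not taken from the application: the column names, the sample rows, the 0.9 threshold, and the use of `difflib.SequenceMatcher` as the similarity score are all assumptions made for illustration, and the merge step is reduced to reporting which primary keys would be merged.

```python
from difflib import SequenceMatcher
from itertools import combinations

def pivot(rows, key, columns):
    """Turn every non-empty value in `columns` into its own (key, value) record."""
    return [(row[key], row[col]) for row in rows for col in columns if row.get(col)]

def find_duplicates(pivoted, threshold=0.9):
    """Return pairs of different primary keys whose pivoted values score above `threshold`."""
    matches = set()
    for (k1, v1), (k2, v2) in combinations(pivoted, 2):
        if k1 != k2 and SequenceMatcher(None, v1, v2).ratio() > threshold:
            matches.add((min(k1, k2), max(k1, k2)))
    return matches

# Hypothetical dataset: both phone columns share the same domain.
rows = [
    {"id": 1, "home_phone": "555-0100", "work_phone": "555-0199"},
    {"id": 2, "home_phone": "555-0199", "work_phone": None},
    {"id": 3, "home_phone": "555-0142", "work_phone": None},
]
pivoted = pivot(rows, "id", ["home_phone", "work_phone"])
duplicates = find_duplicates(pivoted)
```

Without the pivot, records 1 and 2 would not match column by column, because the shared number sits in `work_phone` for one record and in `home_phone` for the other.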
-
Publication number: 20160092453
Abstract: A processor receives statistical information about a data set included in a column of a data table. The processor receives additional information about the data set that indicates a data format utilized by the data set and a type of information represented by the data set. The processor generates a data dictionary for compression of the data set based, at least in part, on the statistical information and the additional information. The data dictionary is created such that the data dictionary is capable of compressing data that is statistically predicted to be received at a future point.
Type: Application
Filed: September 30, 2014
Publication date: March 31, 2016
Inventors: Martin A. Oberhofer, Yannick Saillet, Jens Seifert
-
Patent number: 9292478
Abstract: Methods and apparatus, including computer program products, implementing and using techniques for providing a visual editor allowing graphical editing of expressions in an expression language. A graphical user interface is displayed. A first user input of an expression is received. The expression is defined in a logical or textual form, and each component of the expression is represented by a graphical element on the graphical user interface. A syntax of the first user input is verified and an alert is provided to the user in response to detecting a syntax error or an inconsistency of the first user input when verifying the syntax.
Type: Grant
Filed: December 22, 2008
Date of Patent: March 22, 2016
Assignee: International Business Machines Corporation
Inventors: Frederick Charles Ernest Briden, Yannick Saillet
-
Patent number: 9256827
Abstract: Embodiments for methods, systems, and computer program products for creating and managing a portable data rule using an electronic computing device are presented including: causing the electronic computing device to create a rule definition including, defining an expression by a user, where the expression defines a logic of a rule, causing the electronic computing device to parse the expression into a logical variable associated with the expression, causing the electronic computing device to identify the logical variable, and causing the electronic computing device to store the rule definition, where the rule definition includes the expression and the logical variable. In some embodiments, the causing the electronic computing device to identify the logical variable includes: causing the electronic computing device to return a name of the logical variable; and causing the electronic computing device to return an expected type of the logical variable.
Type: Grant
Filed: March 9, 2012
Date of Patent: February 9, 2016
Assignee: International Business Machines Corporation
Inventors: David T. Meeks, Yannick Saillet
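As a rough illustration of parsing a rule expression into logical variables and storing both together, the sketch below uses Python expression syntax and the standard `ast` module. The expression language, the function names, the column names, and the binding step are all assumptions and not the patented design. It returns variable names only and does not infer expected types.

```python
import ast

def logical_variables(expression):
    """Return the sorted names of the logical variables used in a rule expression."""
    tree = ast.parse(expression, mode="eval")
    return sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})

def make_rule_definition(expression):
    """Store the expression together with the variables parsed from it."""
    return {"expression": expression, "variables": logical_variables(expression)}

def evaluate(rule, binding, row):
    """Evaluate a rule on one row; `binding` maps each logical variable to a column name."""
    values = {var: row[binding[var]] for var in rule["variables"]}
    code = compile(ast.parse(rule["expression"], mode="eval"), "<rule>", "eval")
    # eval is used only to keep the sketch short; it is not safe for untrusted input.
    return bool(eval(code, {"__builtins__": {}}, values))

rule = make_rule_definition("age >= 18 and country == 'US'")
# Hypothetical column names: the same rule can be bound to a different table later.
binding = {"age": "CUST_AGE", "country": "CUST_COUNTRY"}
result = evaluate(rule, binding, {"CUST_AGE": 42, "CUST_COUNTRY": "US"})
```

Because the definition refers only to logical variables, binding it to another set of columns requires no change to the stored expression, which is one way to read "portable" here.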
-
Patent number: 9253053
Abstract: Method, system, and computer program product to facilitate selection of data nodes configured to satisfy a set of requirements for processing client data in a distributed computing environment by providing, for each data node of a plurality of data nodes in the distributed computing environment, nodal data describing the respective data node of the plurality of data nodes, receiving a request to process the client data, the client data being identified in the request, retrieving the set of requirements for processing the client data, and analyzing the retrieved data policy and the nodal data describing at least one of the data nodes, to select a first data node of the plurality of data nodes as a delegation target, the first data node selected based on having a higher suitability level for satisfying the set of requirements than a second data node of the plurality of data nodes.
Type: Grant
Filed: October 11, 2012
Date of Patent: February 2, 2016
Assignee: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin A. Oberhofer, Yannick Saillet, Jens Seifert
-
Patent number: 9253055
Abstract: Method, system, and computer program product to facilitate selection of data nodes configured to satisfy a set of requirements for processing client data in a distributed computing environment by providing, for each data node of a plurality of data nodes in the distributed computing environment, nodal data describing the respective data node of the plurality of data nodes, receiving a request to process the client data, the client data being identified in the request, retrieving the set of requirements for processing the client data, and analyzing the retrieved data policy and the nodal data describing at least one of the data nodes, to select a first data node of the plurality of data nodes as a delegation target, the first data node selected based on having a higher suitability level for satisfying the set of requirements than a second data node of the plurality of data nodes.
Type: Grant
Filed: May 2, 2013
Date of Patent: February 2, 2016
Assignee: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin A. Oberhofer, Yannick Saillet, Jens Seifert
-
Patent number: 9209992
Abstract: An improved method for handling instant messaging sessions in an instant messaging server is disclosed. The method comprises providing global annotators for annotating instant messaging communications, wherein instant messaging users are being able to select for a private enhancement stack at least one of the following: annotators and look-up services; providing instant messaging users with a capability to obtain contextual information by activating enhancement functions provided by said private enhancement stack; establishing an instant messaging session between a set of instant messaging users; and supporting sharing said contextual information among said set of instant messaging users as part of the instant messaging session.
Type: Grant
Filed: November 2, 2010
Date of Patent: December 8, 2015
Assignee: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
-
Publication number: 20150339360
Abstract: Embodiments relate to processing a data set stored in a computer system. In one aspect, a method of processing a data set stored in a computer system includes providing one or more parameters for quantifying data quality of the data set. A processor generates, for each parameter of the one or more parameters, a reference pattern indicating a dysfunctional behavior of the values of the parameter. The data set is processed to obtain values of the one or more parameters. A parameter of the one or more parameters is identified whose obtained values match a corresponding reference pattern of the generated reference patterns. The identified parameter is assigned a resource weight value indicating the amount of processing resources required to fix the dysfunctional behavior of the identified parameter.
Type: Application
Filed: May 13, 2015
Publication date: November 26, 2015
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
-
Publication number: 20150254474
Abstract: The invention provides for a data processing system comprising an application server comprising at least one processor. Execution of the instructions cause the processor to: receive an analysis request, the analysis request comprising multiple data analysis commands for generating an analysis report descriptive of a structured data file; divide the commands into private analysis commands and public analysis commands; send the private analysis commands to a trusted distributed file system; send a portion of the public analysis commands to a public distributed file system; send a remainder of the public analysis commands to the trusted distributed file system; and generate the analysis report using public analysis results from the public distributed file system and trusted analysis results from the trusted distributed file system.
Type: Application
Filed: February 26, 2015
Publication date: September 10, 2015
Inventors: Sebastian Nelke, Martin A. Oberhofer, Yannick Saillet, Jens Seifert
-
Patent number: 9104784
Abstract: An aspect includes a computer-implemented method for detecting one or more multi-column composite key column sets. The method includes accessing a plurality of first columns, each first column representing a parameter, each first column including a set of distinct parameter values of its respective parameter, each distinct parameter value being stored in association with one or more object identifiers. Two or more of the first columns are selected for use as a current candidate column set, the current candidate column set including at least a first and a second candidate column, the current candidate column set being of a current cardinality. The method also includes determining, by comparing object-identifiers, whether for the current candidate column set at least one tuple of parameter values exists with parameter values respectively stored in association with two or more shared ones of the object identifiers to identify a multi-column composite key column set.
Type: Grant
Filed: July 22, 2013
Date of Patent: August 11, 2015
Assignee: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
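One possible reading of the object-identifier comparison is sketched below. It assumes each column is held as a map from distinct value to the set of row identifiers carrying that value. The function names, the sample data, and the strategy of growing candidate sets by cardinality are illustrative assumptions rather than the claimed method.

```python
from itertools import combinations, product

def is_unique(columns, candidate):
    """True if no tuple of values across `candidate` is shared by two or more row ids."""
    id_sets_per_column = [columns[name].values() for name in candidate]
    for id_sets in product(*id_sets_per_column):
        # Row ids that carry this exact combination of values.
        if len(set.intersection(*id_sets)) >= 2:
            return False
    return True

def find_composite_keys(columns, max_cardinality=3):
    """Test candidate column sets of increasing size and collect minimal unique ones."""
    keys = []
    for size in range(2, max_cardinality + 1):
        for candidate in combinations(sorted(columns), size):
            if any(set(key) <= set(candidate) for key in keys):
                continue  # a smaller key already covers this candidate
            if is_unique(columns, candidate):
                keys.append(candidate)
    return keys

# Hypothetical columns: distinct value -> ids of the rows holding that value.
columns = {
    "first": {"Ann": {1, 2}, "Bob": {3}},
    "last": {"Lee": {1, 3}, "Kim": {2}},
    "city": {"Oslo": {1, 2, 3}},
}
composite_keys = find_composite_keys(columns)
```

Here `("city", "first")` is rejected because rows 1 and 2 share the tuple (Oslo, Ann), while `("first", "last")` passes because every value tuple maps to at most one row.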
-
Patent number: 9092468
Abstract: A computer implemented method, computer program product and system for data quality monitoring includes measuring a data quality of loaded data relative to a predefined data quality metric. The measuring the data quality includes identifying delta changes in at least one of the loaded data and the data quality rules relative to a previous measurement of the data quality of the loaded data. Logical calculus defined in the data quality rules is applied to the identified delta changes.
Type: Grant
Filed: June 29, 2012
Date of Patent: July 28, 2015
Assignee: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
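A minimal sketch of the delta idea follows, covering changes in the loaded data only, not changes in the rules. The stored per-row state, the function name, the sample rule, and the pass-rate metric are assumptions made for illustration. The point is only that a rule is re-applied to the rows that changed since the previous measurement.

```python
def update_quality(state, rule, changed_rows, deleted_ids=()):
    """Re-evaluate `rule` only for changed rows and return the updated pass rate.

    `state` maps row id -> last rule result and persists between measurements.
    """
    for row_id in deleted_ids:
        state.pop(row_id, None)
    for row_id, row in changed_rows.items():
        state[row_id] = rule(row)
    return sum(state.values()) / len(state) if state else 1.0

# Hypothetical rule: ages must not be negative.
def age_is_valid(row):
    return row["age"] >= 0

state = {}
initial = update_quality(state, age_is_valid, {1: {"age": 30}, 2: {"age": -5}})
after_delta = update_quality(state, age_is_valid, {2: {"age": 5}})
```

The second call touches one row instead of rescanning the whole table, which is where the saving would come from on large loads.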
-
Patent number: 9043294
Abstract: Overflow access records (OARs) are managed in a database system. An OAR is created in response to receiving an update command for a data record and to the updated data record generated by the update command not fitting onto the page in the table where the data record was stored. The OAR that is created includes an index counter that indicates a number of indexes associated with the table. When an OAR is accessed in response to a query command, an identifier of the accessed OAR is replaced in the index by an identifier of a data record pointed to by the OAR, and the index counter in the accessed OAR is changed by a predefined amount. When the index counter reaches a predefined value, the accessed OAR is removed from the table.
Type: Grant
Filed: March 2, 2012
Date of Patent: May 26, 2015
Assignee: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert, Knut Stolze
-
Publication number: 20150066987
Abstract: Embodiments relate to accessing a set of data tables in a source database. A set of table categories is provided for tables in the source database and a set of metrics is provided. For each table of the set of the data tables: the set of metrics is evaluated, the evaluated set of metrics is analyzed, and the table is categorized into one of the set of table categories using the result of the analysis. Information indicative of the table category of each table of the set of tables is output, and in response, a request to select data tables of the set of data tables is received according to a part of the table categories for data processing. A subset of data tables of the set of data tables is selected using the table categories for performing the data processing on the subset of data tables.
Type: Application
Filed: September 3, 2014
Publication date: March 5, 2015
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
-
Publication number: 20150058280
Abstract: A computer implemented method, computer program product and system for data quality monitoring includes measuring a data quality of loaded data relative to a predefined data quality metric. The measuring the data quality includes identifying delta changes in at least one of the loaded data and the data quality rules relative to a previous measurement of the data quality of the loaded data. Logical calculus defined in the data quality rules is applied to the identified delta changes.
Type: Application
Filed: October 21, 2014
Publication date: February 26, 2015
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
-
Patent number: 8949166
Abstract: A data rule is created and processed by receiving an expression defining a logic of a rule and at least one logical variable, creating a rule definition including the expression and the at least one logical variable for binding each logical variable of the rule with at least one column, associating a characteristic enabling comparison of columns with a first logical variable of the rule definition, and storing the characteristic as part of the rule definition.
Type: Grant
Filed: August 25, 2011
Date of Patent: February 3, 2015
Assignee: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
-
Publication number: 20140379667
Abstract: According to one embodiment of the present invention, a system assesses the quality of column data. The system assigns a pre-defined domain to one or more columns of the data based on a validity condition for the domain, applies the validity condition for the domain assigned to a column to data values in the column to compute a data quality metric for the column, and computes and displays a metric for a group of columns based on the computed data quality metric of at least one column in the group. Embodiments of the present invention further include a method and computer program product for assessing the quality of column data in substantially the same manners described above.
Type: Application
Filed: September 9, 2014
Publication date: December 25, 2014
Inventors: Thomas Hollifield, Yannick Saillet
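To make the column-level and group-level metrics concrete, here is a small sketch. The domains, their validity conditions, the sample table, and the choice of a plain average for the group metric are all assumptions. The abstract does not say how the group metric is derived, and the automatic assignment of domains to columns is replaced by a fixed mapping.

```python
def column_quality(values, is_valid):
    """Fraction of values in a column that satisfy the domain's validity condition."""
    values = list(values)
    if not values:
        return 1.0
    return sum(1 for value in values if is_valid(value)) / len(values)

# Hypothetical pre-defined domains, each with a validity condition.
DOMAINS = {
    "email": lambda v: isinstance(v, str) and v.count("@") == 1,
    "zip": lambda v: isinstance(v, str) and len(v) == 5 and v.isdigit(),
}

table = {
    "contact_email": ["a@example.com", "broken", "b@example.com", "c@example.com"],
    "postal_code": ["12345", "1234"],
}
# Domain assignment is given here; the abstract describes deriving it from the data.
assignment = {"contact_email": "email", "postal_code": "zip"}

column_scores = {
    column: column_quality(table[column], DOMAINS[domain])
    for column, domain in assignment.items()
}
# Assumed group metric: unweighted mean of the column metrics.
group_score = sum(column_scores.values()) / len(column_scores)
```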
-
Patent number: 8719271
Abstract: A data profile request is handled by utilizing data in a distributed file system. Tabular data is extracted from a data source and stored in a distributed file system. Each table in the tabular data is split by columns, which are each stored in separate files in a set of physical nodes of the distributed file system. In response to a data profiling request, a master node determines, based on the profiling request, which groups of files are needed to be on a same physical node in order to perform the profiling analysis. The master node creates jobs using physical nodes that contain the requisite files needed for each job.
Type: Grant
Filed: October 5, 2012
Date of Patent: May 6, 2014
Assignee: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
-
Publication number: 20140108639
Abstract: Method, system, and computer program product to facilitate selection of data nodes configured to satisfy a set of requirements for processing client data in a distributed computing environment by providing, for each data node of a plurality of data nodes in the distributed computing environment, nodal data describing the respective data node of the plurality of data nodes, receiving a request to process the client data, the client data being identified in the request, retrieving the set of requirements for processing the client data, and analyzing the retrieved data policy and the nodal data describing at least one of the data nodes, to select a first data node of the plurality of data nodes as a delegation target, the first data node selected based on having a higher suitability level for satisfying the set of requirements than a second data node of the plurality of data nodes.
Type: Application
Filed: October 11, 2012
Publication date: April 17, 2014
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Sebastian Nelke, Martin A. Oberhofer, Yannick Saillet, Jens Seifert
-
Patent number: 8666998
Abstract: A method, system and computer program product provides a first characteristic associated with a first data set and a single data value, and a second characteristic associated with a second data set; and calculates at least one of: 1) the similarity of the first data set with the second data set based on the first and second characteristics, 2) the similarity of the first data set with the single data value based on the first characteristic and the single data value, 3) confidence indicating how well the first characteristic reflects properties of the first data set based on the first characteristic, and 4) confidence indicating how well the similarity of the first data set with the single data value reflects properties of the single data value based on the first characteristic and the single data value.
Type: Grant
Filed: June 30, 2011
Date of Patent: March 4, 2014
Assignee: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin A Oberhofer, Yannick Saillet, Jens Seifert
-
Publication number: 20140046927
Abstract: An aspect includes a computer-implemented method for detecting one or more multi-column composite key column sets. The method includes accessing a plurality of first columns, each first column representing a parameter, each first column including a set of distinct parameter values of its respective parameter, each distinct parameter value being stored in association with one or more object identifiers. Two or more of the first columns are selected for use as a current candidate column set, the current candidate column set including at least a first and a second candidate column, the current candidate column set being of a current cardinality. The method also includes determining, by comparing object-identifiers, whether for the current candidate column set at least one tuple of parameter values exists with parameter values respectively stored in association with two or more shared ones of the object identifiers to identify a multi-column composite key column set.
Type: Application
Filed: July 22, 2013
Publication date: February 13, 2014
Applicant: International Business Machines Corporation
Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert