Patents by Inventor Yannick Saillet

Yannick Saillet has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20160092494
    Abstract: A method, executed by a computer, for de-duplicating data includes receiving a dataset, pivoting the dataset along a set of columns that have a common domain to provide a pivoted dataset, de-duplicating the pivoted dataset to provide a de-duplicated dataset, and using the de-duplicated dataset. De-duplicating the pivoted dataset may include computing similarity scores for records that have different primary keys and merging records that have a similarity score that exceeds a selected threshold value. The method may include determining the set of columns having a common domain by referencing a business catalog and/or conducting a data classification operation on some or all of the columns of the dataset. The method may also include pivoting the dataset along another set of columns that have a different common domain. A computer system and computer program product corresponding to the method are also disclosed herein.
    Type: Application
    Filed: September 30, 2014
    Publication date: March 31, 2016
    Inventors: Namit Kabra, Yannick Saillet
  • Publication number: 20160092453
    Abstract: A processor receives statistical information about a data set included in a column of a data table. The processor receives additional information about the data set that indicates a data format utilized by the data set and a type of information represented by the data set. The processor generates a data dictionary for compression of the data set based, at least in part, on the statistical information and the additional information. The data dictionary is created such that the data dictionary is capable of compressing data that is statistically predicted to be received at a future point.
    Type: Application
    Filed: September 30, 2014
    Publication date: March 31, 2016
    Inventors: Martin A. Oberhofer, Yannick Saillet, Jens Seifert
  • Patent number: 9292478
    Abstract: Methods and apparatus, including computer program products, implementing and using techniques for providing a visual editor allowing graphical editing of expressions in an expression language. A graphical user interface is displayed. A first user input of an expression is received. The expression is defined in a logical or textual form, and each component of the expression is represented by a graphical element on the graphical user interface. A syntax of the first user input is verified and an alert is provided to the user in response to detecting a syntax error or an inconsistency of the first user input when verifying the syntax.
    Type: Grant
    Filed: December 22, 2008
    Date of Patent: March 22, 2016
    Assignee: International Business Machines Corporation
    Inventors: Frederick Charles Ernest Briden, Yannick Saillet
  • Patent number: 9256827
    Abstract: Embodiments for methods, systems, and computer program products for creating and managing a portable data rule using an electronic computing device are presented including: causing the electronic computing device to create a rule definition including, defining an expression by a user, where the expression defines a logic of a rule, causing the electronic computing device to parse the expression into a logical variable associated with the expression, causing the electronic computing device to identify the logical variable, and causing the electronic computing device to store the rule definition, where the rule definition includes the expression and the logical variable. In some embodiments, the causing the electronic computing device to identify the logical variable includes: causing the electronic computing device to return a name of the logical variable; and causing the electronic computing device to return an expected type of the logical variable.
    Type: Grant
    Filed: March 9, 2012
    Date of Patent: February 9, 2016
    Assignee: International Business Machines Corporation
    Inventors: David T. Meeks, Yannick Saillet
  • Patent number: 9253053
    Abstract: Method, system, and computer program product to facilitate selection of data nodes configured to satisfy a set of requirements for processing client data in a distributed computing environment by providing, for each data node of a plurality of data nodes in the distributed computing environment, nodal data describing the respective data node of the plurality of data nodes, receiving a request to process the client data, the client data being identified in the request, retrieving the set of requirements for processing the client data, and analyzing the retrieved data policy and the nodal data describing at least one of the data nodes, to select a first data node of the plurality of data nodes as a delegation target, the first data node selected based on having a higher suitability level for satisfying the set of requirements than a second data node of the plurality of data nodes.
    Type: Grant
    Filed: October 11, 2012
    Date of Patent: February 2, 2016
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin A. Oberhofer, Yannick Saillet, Jens Seifert
  • Patent number: 9253055
    Abstract: Method, system, and computer program product to facilitate selection of data nodes configured to satisfy a set of requirements for processing client data in a distributed computing environment by providing, for each data node of a plurality of data nodes in the distributed computing environment, nodal data describing the respective data node of the plurality of data nodes, receiving a request to process the client data, the client data being identified in the request, retrieving the set of requirements for processing the client data, and analyzing the retrieved data policy and the nodal data describing at least one of the data nodes, to select a first data node of the plurality of data nodes as a delegation target, the first data node selected based on having a higher suitability level for satisfying the set of requirements than a second data node of the plurality of data nodes.
    Type: Grant
    Filed: May 2, 2013
    Date of Patent: February 2, 2016
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin A. Oberhofer, Yannick Saillet, Jens Seifert
  • Patent number: 9209992
    Abstract: An improved method for handling instant messaging sessions in an instant messaging server is disclosed. The method comprises providing global annotators for annotating instant messaging communications, wherein instant messaging users are able to select for a private enhancement stack at least one of the following: annotators and look-up services; providing instant messaging users with a capability to obtain contextual information by activating enhancement functions provided by said private enhancement stack; establishing an instant messaging session between a set of instant messaging users; and supporting sharing said contextual information among said set of instant messaging users as part of the instant messaging session.
    Type: Grant
    Filed: November 2, 2010
    Date of Patent: December 8, 2015
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
  • Publication number: 20150339360
    Abstract: Embodiments relate to processing a data set stored in a computer system. In one aspect, a method of processing a data set stored in a computer system includes providing one or more parameters for quantifying data quality of the data set. A processor generates, for each parameter of the one or more parameters, a reference pattern indicating a dysfunctional behavior of the values of the parameter. The data set is processed to obtain values of the one or more parameters. A parameter of the one or more parameters is identified whose obtained values match a corresponding reference pattern of the generated reference patterns. The identified parameter is assigned a resource weight value indicating the amount of processing resources required to fix the dysfunctional behavior of the identified parameter.
    Type: Application
    Filed: May 13, 2015
    Publication date: November 26, 2015
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
  • Publication number: 20150254474
    Abstract: The invention provides for a data processing system comprising an application server comprising at least one processor. Execution of the instructions causes the processor to: receive an analysis request, the analysis request comprising multiple data analysis commands for generating an analysis report descriptive of a structured data file; divide the commands into private analysis commands and public analysis commands; send the private analysis commands to a trusted distributed file system; send a portion of the public analysis commands to a public distributed file system; send a remainder of the public analysis commands to the trusted distributed file system; and generate the analysis report using public analysis results from the public distributed file system and trusted analysis results from the trusted distributed file system.
    Type: Application
    Filed: February 26, 2015
    Publication date: September 10, 2015
    Inventors: Sebastian Nelke, Martin A. Oberhofer, Yannick Saillet, Jens Seifert
  • Patent number: 9104784
    Abstract: An aspect includes a computer-implemented method for detecting one or more multi-column composite key column sets. The method includes accessing a plurality of first columns, each first column representing a parameter, each first column including a set of distinct parameter values of its respective parameter, each distinct parameter value being stored in association with one or more object identifiers. Two or more of the first columns are selected for use as a current candidate column set, the current candidate column set including at least a first and a second candidate column, the current candidate column set being of a current cardinality. The method also includes determining, by comparing object-identifiers, whether for the current candidate column set at least one tuple of parameter values exists with parameter values respectively stored in association with two or more shared ones of the object identifiers to identify a multi-column composite key column set.
    Type: Grant
    Filed: July 22, 2013
    Date of Patent: August 11, 2015
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
  • Patent number: 9092468
    Abstract: A computer implemented method, computer program product and system for data quality monitoring includes measuring a data quality of loaded data relative to a predefined data quality metric. The measuring the data quality includes identifying delta changes in at least one of the loaded data and the data quality rules relative to a previous measurement of the data quality of the loaded data. Logical calculus defined in the data quality rules is applied to the identified delta changes.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: July 28, 2015
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
  • Patent number: 9043294
    Abstract: Overflow access records (OARs) are managed in a database system. An OAR is created in response to receiving an update command for a data record and to the updated data record generated by the update command not fitting onto the page in the table where the data record was stored. The OAR that is created includes an index counter that indicates a number of indexes associated with the table. When an OAR is accessed in response to a query command, an identifier of the accessed OAR is replaced in the index by an identifier of a data record pointed to by the OAR, and the index counter in the accessed OAR is changed by a predefined amount. When the index counter reaches a predefined value, the accessed OAR is removed from the table.
    Type: Grant
    Filed: March 2, 2012
    Date of Patent: May 26, 2015
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert, Knut Stolze
  • Publication number: 20150066987
    Abstract: Embodiments relate to accessing a set of data tables in a source database. A set of table categories is provided for tables in the source database and a set of metrics is provided. For each table of the set of the data tables: the set of metrics is evaluated, the evaluated set of metrics is analyzed, and the table is categorized into one of the set of table categories using the result of the analysis. Information indicative of the table category of each table of the set of tables is output, and in response, a request to select data tables of the set of data tables is received according to a part of the table categories for data processing. A subset of data tables of the set of data tables is selected using the table categories for performing the data processing on the subset of data tables.
    Type: Application
    Filed: September 3, 2014
    Publication date: March 5, 2015
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
  • Publication number: 20150058280
    Abstract: A computer implemented method, computer program product and system for data quality monitoring includes measuring a data quality of loaded data relative to a predefined data quality metric. The measuring the data quality includes identifying delta changes in at least one of the loaded data and the data quality rules relative to a previous measurement of the data quality of the loaded data. Logical calculus defined in the data quality rules is applied to the identified delta changes.
    Type: Application
    Filed: October 21, 2014
    Publication date: February 26, 2015
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
  • Patent number: 8949166
    Abstract: A data rule is created and processed by receiving an expression defining a logic of a rule and at least one logical variable, creating a rule definition including the expression and the at least one logical variable for binding each logical variable of the rule with at least one column, associating a characteristic enabling comparison of columns with a first logical variable of the rule definition, and storing the characteristic as part of the rule definition.
    Type: Grant
    Filed: August 25, 2011
    Date of Patent: February 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
  • Publication number: 20140379667
    Abstract: According to one embodiment of the present invention, a system assesses the quality of column data. The system assigns a pre-defined domain to one or more columns of the data based on a validity condition for the domain, applies the validity condition for the domain assigned to a column to data values in the column to compute a data quality metric for the column, and computes and displays a metric for a group of columns based on the computed data quality metric of at least one column in the group. Embodiments of the present invention further include a method and computer program product for assessing the quality of column data in substantially the same manners described above.
    Type: Application
    Filed: September 9, 2014
    Publication date: December 25, 2014
    Inventors: Thomas Hollifield, Yannick Saillet
  • Patent number: 8719271
    Abstract: A data profile request is handled by utilizing data in a distributed file system. Tabular data is extracted from a data source and stored in a distributed file system. Each table in the tabular data is split by columns, which are each stored in separate files in a set of physical nodes of the distributed file system. In response to a data profiling request, a master node determines, based on the profiling request, which groups of files are needed to be on a same physical node in order to perform the profiling analysis. The master node creates jobs using physical nodes that contain the requisite files needed for each job.
    Type: Grant
    Filed: October 5, 2012
    Date of Patent: May 6, 2014
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
  • Publication number: 20140108639
    Abstract: Method, system, and computer program product to facilitate selection of data nodes configured to satisfy a set of requirements for processing client data in a distributed computing environment by providing, for each data node of a plurality of data nodes in the distributed computing environment, nodal data describing the respective data node of the plurality of data nodes, receiving a request to process the client data, the client data being identified in the request, retrieving the set of requirements for processing the client data, and analyzing the retrieved data policy and the nodal data describing at least one of the data nodes, to select a first data node of the plurality of data nodes as a delegation target, the first data node selected based on having a higher suitability level for satisfying the set of requirements than a second data node of the plurality of data nodes.
    Type: Application
    Filed: October 11, 2012
    Publication date: April 17, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Sebastian Nelke, Martin A. Oberhofer, Yannick Saillet, Jens Seifert
  • Patent number: 8666998
    Abstract: A method, system and computer program product provides a first characteristic associated with a first data set and a single data value, and a second characteristic associated with a second data set; and calculates at least one of: 1) the similarity of the first data set with the second data set based on the first and second characteristics, 2) the similarity of the first data set with the single data value based on the first characteristic and the single data value, 3) confidence indicating how well the first characteristic reflects properties of the first data set based on the first characteristic, and 4) confidence indicating how well the similarity of the first data set with the single data value reflects properties of the single data value based on the first characteristic and the single data value.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: March 4, 2014
    Assignee: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin A. Oberhofer, Yannick Saillet, Jens Seifert
  • Publication number: 20140046927
    Abstract: An aspect includes a computer-implemented method for detecting one or more multi-column composite key column sets. The method includes accessing a plurality of first columns, each first column representing a parameter, each first column including a set of distinct parameter values of its respective parameter, each distinct parameter value being stored in association with one or more object identifiers. Two or more of the first columns are selected for use as a current candidate column set, the current candidate column set including at least a first and a second candidate column, the current candidate column set being of a current cardinality. The method also includes determining, by comparing object-identifiers, whether for the current candidate column set at least one tuple of parameter values exists with parameter values respectively stored in association with two or more shared ones of the object identifiers to identify a multi-column composite key column set.
    Type: Application
    Filed: July 22, 2013
    Publication date: February 13, 2014
    Applicant: International Business Machines Corporation
    Inventors: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
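
The composite key check described in the abstract of publication 20140046927 (also granted as patent 9104784) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each column is held as a mapping from each distinct value to the set of object identifiers (row ids) carrying that value, and it treats a candidate column set as a possible composite key when no tuple of values shares two or more object identifiers. The function names and the sample data are hypothetical.

```python
def has_duplicate_tuple(columns, candidate):
    """Return True if some tuple of values across the candidate columns
    is shared by two or more object identifiers.

    columns:   dict mapping column name -> {distinct value -> set of object ids}
               (assumed storage layout, chosen for illustration)
    candidate: sequence of at least two column names
    """
    if len(candidate) < 2:
        raise ValueError("a candidate column set needs at least two columns")

    def search(idx, shared):
        # 'shared' holds the object ids common to the values chosen so far.
        if idx == len(candidate):
            return len(shared) >= 2
        for ids in columns[candidate[idx]].values():
            remaining = ids if shared is None else shared & ids
            # Prune: fewer than two shared ids can never yield a duplicate.
            if len(remaining) >= 2 and search(idx + 1, remaining):
                return True
        return False

    return search(0, None)


def is_composite_key_candidate(columns, candidate):
    """A column set is a composite key candidate if no value tuple repeats."""
    return not has_duplicate_tuple(columns, candidate)


# Hypothetical sample data: three rows with object ids 1, 2 and 3.
sample_columns = {
    "first": {"Ann": {1, 2}, "Bob": {3}},
    "last": {"Lee": {1, 3}, "Kim": {2}},
    "city": {"Paris": {1, 2, 3}},
}
```

With this sample data, `("first", "last")` passes the check because every pair of values identifies at most one row, while `("first", "city")` fails because rows 1 and 2 both carry the tuple `("Ann", "Paris")`. Intersecting identifier sets column by column lets the search stop early once fewer than two rows remain.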