Patents by Inventor Yannick Saillet

Yannick Saillet has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11461135
    Abstract: In an approach to dynamically identifying and modifying the parallelism of a particular task in a pipeline, the optimal execution time of each stage in a dynamic pipeline is calculated. The actual execution time of each stage in the dynamic pipeline is measured. Whether the actual time of completion of the data processing job will exceed a threshold is determined. If it is determined that the actual time of completion of the data processing job will exceed the threshold, then additional instances of the stages are created.
    Type: Grant
    Filed: October 25, 2019
    Date of Patent: October 4, 2022
    Assignee: International Business Machines Corporation
    Inventors: Yannick Saillet, Namit Kabra, Ritesh Kumar Gupta
  • Publication number: 20220277017
    Abstract: Techniques are described relating to automatic data standardization in a managed services domain of a cloud computing environment. An associated computer-implemented method includes receiving a dataset during a data onboarding procedure and classifying datapoints within the dataset. The method further includes applying a machine learning data standardization model to each classified datapoint within the dataset and deriving a proposed set of data standardization rules for the dataset based upon any standardization modification determined consequent to model application. Optionally, the method includes presenting the proposed set of data standardization rules for client review and, responsive to acceptance of the proposed set of data standardization rules, applying the proposed set of data standardization rules to the dataset. The method further includes, responsive to acceptance of the proposed set of data standardization rules, updating the machine learning data standardization model accordingly.
    Type: Application
    Filed: February 24, 2021
    Publication date: September 1, 2022
    Inventors: Namit Kabra, Krishna Kishore Bonagiri, Mike W. Grasselt, Yannick Saillet
  • Patent number: 11429878
    Abstract: A method, computer system, and computer program product for providing recommendations about processing datasets. A set of machine learning models are provided for use in respectively determining data processing action performable on a dataset based on a respective set of features of the dataset. A current dataset is received. A set of features of the current dataset are determined. One or more data processing actions are generated to be executed on the current dataset, which are determined by at least two machine learning models of the provided set, based on the determined set of features of the current dataset. One or more of the data processing actions are performed on the current dataset.
    Type: Grant
    Filed: September 22, 2017
    Date of Patent: August 30, 2022
    Assignee: International Business Machines Corporation
    Inventors: Yannick Saillet, Martin A. Oberhofer, Jens P. Seifert
  • Patent number: 11397855
    Abstract: A method for generating data standardization rules includes receiving a training data set containing tokenized and tagged data values. A set of machine mining models is built using different learning algorithms for identifying tags and tag patterns using the training set. For each data value in a further data set: a tokenization is applied on the data value, resulting in a set of tokens. For each token of the set of tokens one or more tag candidates are determined using a lookup dictionary of tags and tokens and/or at least part of the set of machine mining models, resulting for each token of the set of tokens in a list of possible tags. Unique combinations of the sets of tags of the further data set having highest aggregated confidence values are provided for use as standardization rules.
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: July 26, 2022
    Assignee: International Business Machines Corporation
    Inventors: Yannick Saillet, Martin Oberhofer, Namit Kabra
  • Patent number: 11393171
    Abstract: Aspects of the present disclosure relate to controlling virtual reality (VR) content displayed on a VR head mounted display (HMD). Communication can be established between a computer system, a VR HMD, and a mobile device. A user input configured to control VR content displayed on a display of the VR HMD can be received on the mobile device. The VR content displayed on the VR HMD can then be controlled based on the user input received on the mobile device.
    Type: Grant
    Filed: July 21, 2020
    Date of Patent: July 19, 2022
    Assignee: International Business Machines Corporation
    Inventors: Namit Kabra, Smitkumar Narotambhai Marvaniya, Yannick Saillet, Kunjavihari Madhav Kashalikar
  • Patent number: 11366843
    Abstract: The invention relates to a computer-implemented method for classifying a set of data values. For each of the data values of the set of data values, a set of one or more terms associated with the respective data value is determined using one or more first knowledge bases. A set of common terms is determined. The set of common terms comprises terms present in more than one of the sets of terms. For each of the common terms, a number of hits for a lookup query against one or more second knowledge data bases is determined. One or more common terms of the set of common terms with the smallest number of hits are determined and a result is returned. The result comprises the one or more common terms with the smallest number of hits as one or more candidate classes for classifying the set of data values.
    Type: Grant
    Filed: April 23, 2019
    Date of Patent: June 21, 2022
    Assignee: International Business Machines Corporation
    Inventors: Albert Maier, Martin Oberhofer, Yannick Saillet
  • Patent number: 11354282
    Abstract: A computer implemented method for classifying at least one source dataset of a computer system. The method may include providing a plurality of associated reference tables organized and associated in accordance with a reference storage model in the computer system. The method may also include calculating, by a data classifier application of the computer system, a first similarity score between the source dataset and a first reference table of the reference tables based on common attributes in the source dataset and a join of the first reference table with at least one further reference table of the reference tables having a relationship with the first reference table. The method may further include classifying, by the data classifier application, the source dataset by determining using at least the calculated first similarity score whether the source dataset is organized as the first reference table in accordance to the reference storage model.
    Type: Grant
    Filed: January 10, 2020
    Date of Patent: June 7, 2022
    Assignee: International Business Machines Corporation
    Inventors: Martin Oberhofer, Adapala S. Reddy, Yannick Saillet, Jens Seifert
  • Patent number: 11334603
    Abstract: A method, system and computer program product for finding groups of potential duplicates in attribute values. Each attribute value of the attribute values is converted to a respective set of bigrams. All bigrams present in the attribute values may be determined. Bigrams present in the attribute values may be represented as bits. This may result in a bitmap representing the presence of the bigrams in the attribute values. The attribute values may be grouped using bitwise operations on the bitmap, where each group includes attribute values that are determined based on pairwise bigram-based similarity scores. The pairwise bigram-based similarity score reflects the number of common bigrams between two attribute values.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: May 17, 2022
    Assignee: International Business Machines Corporation
    Inventors: Namit Kabra, Yannick Saillet
  • Publication number: 20220138512
    Abstract: Methods, computer program products and/or systems are provided that perform the following operations: obtaining a first graph comprising first nodes representing first entities and first edges representing relationships between first entities, the first nodes being associated with first entity attributes descriptive of the first entities represented by the first nodes, the first edges being associated with first edge attributes descriptive of the relationships represented by the first edges; determining a first subgraph for a certain node of the first nodes of the first graph, the first subgraph including the certain node and at least one neighboring node of the certain node; and determining a data quality issue regarding the certain node based, at least in part, on applying one or more applicable rules of a set of data quality rules to first entity attribute values and first edge attribute values of the first subgraph.
    Type: Application
    Filed: October 29, 2020
    Publication date: May 5, 2022
    Inventors: Yannick Saillet, Claudio Andrea Fanconi, Martin Oberhofer, Hemanth Kumar Babu, Basem Elasioty, Mike W. Grasselt, Robert Kern, Thuany Karoline Stuart
  • Publication number: 20220123935
    Abstract: The exemplary embodiments disclose a method, a computer program product, and a computer system for protecting sensitive information. The exemplary embodiments may include using an inverted text index for evaluating one or more statistical measures of an index token of the inverted text index, using the one or more statistical measures for selecting a set of candidate tokens, extracting metadata from the inverted text index, associating the set of candidate tokens with respective token metadata, tokenizing at least one document resulting in one or more document tokens, comparing the one or more document tokens with the set of candidate tokens, selecting a set of document tokens to be masked, selecting at least part of the set of document tokens that comprises sensitive information according to the associated token metadata, masking the at least part of the set of document tokens, and providing one or more masked documents.
    Type: Application
    Filed: October 19, 2020
    Publication date: April 21, 2022
    Inventors: Michael Baessler, Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer
  • Publication number: 20220100899
    Abstract: In an approach, a processor receives a request of a document. A processor identifies a set of datasets comprising a sensitive dataset, the set of datasets being interrelated in accordance with a relational model. A processor extracts attribute values of the document. A processor determines that a set of one or more attribute values of the extracted attribute values is in the set of datasets, the set of attribute values being values of a set of attributes. A processor determines that one or more entities of the sensitive dataset can be identified based on relations of the relational model between the set of attributes, where at least part of attribute values of the one or more entities comprises sensitive information. A processor, responsive to determining that the one or more entities can be identified, masks at least part of the set of one or more attribute values in the document.
    Type: Application
    Filed: September 25, 2020
    Publication date: March 31, 2022
    Inventors: Yannick Saillet, Albert Maier, Mike W. Grasselt, Michael Baessler, Lars Bremer
  • Publication number: 20220075762
    Abstract: A computer implemented method for classifying at least one source dataset of a computer system. The method may include providing a plurality of associated reference tables organized and associated in accordance with a reference storage model in the computer system. The method may also include calculating, by a data classifier application of the computer system, a first similarity score between the source dataset and a first reference table of the reference tables based on common attributes in the source dataset and a join of the first reference table with at least one further reference table of the reference tables having a relationship with the first reference table. The method may further include classifying, by the data classifier application, the source dataset by determining using at least the calculated first similarity score whether the source dataset is organized as the first reference table in accordance to the reference storage model.
    Type: Application
    Filed: November 16, 2021
    Publication date: March 10, 2022
    Inventors: Martin Oberhofer, Adapala S. Reddy, Yannick Saillet, Jens Seifert
  • Patent number: 11243924
    Abstract: A method, system and computer program product for determining a data standardization score for an attribute of a dataset. A data standardization score is calculated, which reflects whether data quality of attribute values would increase if a standardization rule is applied to the attribute values. Based on attribute metadata, it may be determined whether an indication to carry or not to carry out standardization is available for at least part of the attribute values of the dataset. In response to finding the indication, a respective value may be set for the data standardization score. In response to not finding the indication, a data standardization score algorithm may be run on the at least part of the attribute values of the dataset. The data standardization score value may be compared to a predefined criterion to determine whether data standardization is to be applied on the attribute.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: February 8, 2022
    Assignee: International Business Machines Corporation
    Inventors: Namit Kabra, Yannick Saillet
  • Patent number: 11243923
    Abstract: A method, system and computer program product for determining a data standardization score for an attribute of a dataset. A data standardization score is calculated, which reflects whether data quality of attribute values would increase if a standardization rule is applied to the attribute values. Based on attribute metadata, it may be determined whether an indication to carry or not to carry out standardization is available for at least part of the attribute values of the dataset. In response to finding the indication, a respective value may be set for the data standardization score. In response to not finding the indication, a data standardization score algorithm may be run on the at least part of the attribute values of the dataset. The data standardization score value may be compared to a predefined criterion to determine whether data standardization is to be applied on the attribute.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: February 8, 2022
    Assignee: International Business Machines Corporation
    Inventors: Namit Kabra, Yannick Saillet
  • Publication number: 20220035667
    Abstract: According to a computer-implemented method, an available amount of each of multiple computing resources is determined by machine logic over a period of time at a computing device. The machine logic also determines an expected usage of each computing resource to execute each workflow in a queue. The machine logic also determines a time of execution of each workflow in the queue based on the available amount of each of the multiple computing resources over time and the expected usage of each computing resource to execute each workflow in the queue.
    Type: Application
    Filed: October 19, 2021
    Publication date: February 3, 2022
    Inventors: Yannick Saillet, Namit Kabra
  • Publication number: 20220028168
    Abstract: Aspects of the present disclosure relate to controlling virtual reality (VR) content displayed on a VR head mounted display (HMD). Communication can be established between a computer system, a VR HMD, and a mobile device. A user input configured to control VR content displayed on a display of the VR HMD can be received on the mobile device. The VR content displayed on the VR HMD can then be controlled based on the user input received on the mobile device.
    Type: Application
    Filed: July 21, 2020
    Publication date: January 27, 2022
    Inventors: Namit Kabra, Smitkumar Narotambhai Marvaniya, Yannick Saillet, Kunjavihari Madhav Kashalikar
  • Patent number: 11200215
    Abstract: Methods and systems for data quality evaluation are disclosed. A method includes: receiving, by a computing device, at least one data set and a list of rule expressions to bind; building, by the computing device, candidate binding combinations between columns of the at least one data set and variables of each rule expression in the list of rule expressions; building, by the computing device, a new bound rule expression candidate based on the candidate binding combinations; generating, by the computing device, a new bound rule expression based on the new bound rule expression candidate and a data transformation applied to at least one of the columns of the at least one data set; and storing, by the computing device, the new bound rule expression.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: December 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kunjavihari Madhav Kashalikar, Yannick Saillet, Ketki Ramesh Purandare
  • Publication number: 20210357183
    Abstract: A computer implemented method is used for sorting data elements of a given set. The method includes performing an evaluation of a first type of usage of each data element. The method includes determining a set of data element candidates dependent on the evaluation of the first type of usage. The method includes performing an evaluation of a second type of usage of each data element of the set of data element candidates. The method includes sorting the data elements of the set of data element candidates dependent on the evaluation of the second type of usage of each data element of the set of data element candidates. The method includes providing the sorted data elements of the set of data element candidates, and in response, receiving a request for a data processing based on the provided sorted data elements of the set of data element candidates.
    Type: Application
    Filed: May 18, 2020
    Publication date: November 18, 2021
    Inventors: Albert Maier, Mike W. Grasselt, Yannick Saillet, Lars Bremer, Michael Baessler
  • Publication number: 20210357699
    Abstract: The invention relates to an approach for data quality assessment for data analytics, the approach comprising providing a data set, the data set comprising multiple data fields, predicting by a first trained machine learning model at least one usage type of the data set using characteristics of the data fields as input, for each usage type of the at least one usage type, determining a usage specific data quality score of each of the predicted usage types, and using the data set based on the at least one usage type and associated data quality score.
    Type: Application
    Filed: May 14, 2020
    Publication date: November 18, 2021
    Inventors: Yannick Saillet, Mike W. Grasselt, Namit Kabra, Krishna Kishore Bonagiri
  • Patent number: 11175951
    Abstract: According to a computer-implemented method, an available amount of each of multiple computing resources is determined by machine logic over a period of time at a computing device. The machine logic also determines an expected usage of each computing resource to execute each workflow in a queue. The machine logic also determines a time of execution of each workflow in the queue based on the available amount of each of the multiple computing resources over time and the expected usage of each computing resource to execute each workflow in the queue.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: November 16, 2021
    Assignee: International Business Machines Corporation
    Inventors: Yannick Saillet, Namit Kabra
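
Illustrative sketch for patent number 11334603 (groups of potential duplicates in attribute values): the abstract describes converting each attribute value to a set of bigrams, representing bigram presence as bits, and grouping values with bitwise operations on the resulting bitmap. The Python below is only one possible reading of that description, not the patented method. The function names, the Jaccard-style normalization of the common-bigram count, the 0.5 threshold, and the union-find grouping are all assumptions introduced here for illustration.

```python
from itertools import combinations


def bigrams(value):
    """Return the set of two-character substrings of a lowercased value."""
    text = value.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}


def build_bitmaps(values):
    """Give every distinct bigram a bit position, then encode each value as an int."""
    positions = {}
    for value in values:
        for bigram in sorted(bigrams(value)):
            positions.setdefault(bigram, len(positions))
    bitmaps = []
    for value in values:
        mask = 0
        for bigram in bigrams(value):
            mask |= 1 << positions[bigram]
        bitmaps.append(mask)
    return bitmaps


def similarity(a, b):
    """Common bigrams (bitwise AND) divided by all bigrams (bitwise OR).

    The normalization is an assumption; the abstract only says the score
    reflects the number of common bigrams.
    """
    union = a | b
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / bin(union).count("1")


def group_duplicates(values, threshold=0.5):
    """Group values whose pairwise similarity reaches the (hypothetical) threshold."""
    bitmaps = build_bitmaps(values)
    parent = list(range(len(values)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(values)), 2):
        if similarity(bitmaps[i], bitmaps[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for index, value in enumerate(values):
        groups.setdefault(find(index), []).append(value)
    # Only groups with more than one member are potential duplicates.
    return [members for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    print(group_duplicates(["Jonathan Smith", "Jonathon Smith", "Acme Corp"]))
```

Encoding each value as a single integer keeps the pairwise comparison down to one AND, one OR, and two bit counts, which is presumably the appeal of the bitmap representation. The all-pairs loop here is quadratic and would need blocking or indexing for large attribute columns.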