Patents by Inventor Anastasios Kementsietsidis

Anastasios Kementsietsidis has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and apparatus for smart and extensible schema matching framework

Patent number: 12045209

Abstract: A method may include (i) obtaining first data records structured in accordance with a first schema, (ii) determining, for the first schema, one or more first schema property values for each schema property in a set of pre-defined schema properties, (iii) determining, for a second schema, one or more second schema property values for each schema property in the set of pre-defined schema properties, (iv) providing, to a schema matching engine, first and second schema property values, where the schema matching engine contains schema mapping techniques and rules, where each rule suggests a schema mapping technique based on schema properties from the set of pre-defined schema properties, (v) applying the rules to select a schema mapping technique, (vi) transforming the first data records in accordance with the selected schema mapping technique, and (vii) providing the transformed first data records in a data structure in accordance with the second schema.

Type: Grant

Filed: November 6, 2019

Date of Patent: July 23, 2024

Assignee: Google LLC

Inventors: Anastasios Kementsietsidis, Jay Pandya, Chrysovalantis Anastasiou
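
Illustration for patent 12045209: the rule-driven selection described in the abstract above can be sketched roughly as follows. This is a minimal sketch, not the patented method. The two schema properties, the two mapping techniques, and every function name here are hypothetical choices made only for illustration.

```python
# Hypothetical sketch: rules inspect schema property values and suggest
# a mapping technique, which is then applied to the source records.

def schema_properties(schema):
    """Compute values for a small, assumed set of pre-defined schema properties."""
    return {
        "field_count": len(schema),
        "field_names": frozenset(name.lower() for name in schema),
    }

def map_by_exact_name(record, target_schema):
    """Technique 1 (assumed): match fields by case-insensitive name."""
    lowered = {key.lower(): value for key, value in record.items()}
    return {field: lowered.get(field.lower()) for field in target_schema}

def map_by_position(record, target_schema):
    """Technique 2 (assumed): match fields by their order in the record."""
    return dict(zip(target_schema, record.values()))

# Each rule pairs a condition over both property sets with a suggested technique.
RULES = [
    (lambda p1, p2: p1["field_names"] == p2["field_names"], map_by_exact_name),
    (lambda p1, p2: p1["field_count"] == p2["field_count"], map_by_position),
]

def select_technique(first_props, second_props):
    """Apply the rules in order and return the first suggested technique."""
    for condition, technique in RULES:
        if condition(first_props, second_props):
            return technique
    raise LookupError("no rule matched these schema properties")

def transform(records, first_schema, second_schema):
    """Transform records from the first schema into the second schema."""
    technique = select_technique(
        schema_properties(first_schema), schema_properties(second_schema)
    )
    return [technique(record, second_schema) for record in records]
```

For example, `transform([{"Name": "Ada", "Age": 36}], ["Name", "Age"], ["name", "age"])` selects the name-based technique, while schemas that share only a field count fall through to the positional one.
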
Systems and methods for generation and application of schema-agnostic query templates

Patent number: 11983173

Abstract: The present disclosure provides systems and methods that generate query templates that are expressed in a generic schema-agnostic language. The query templates can be generated “from scratch” or can be automatically generated from existing queries, a process which may be referred to as “templatizing” the existing queries. As one example, generation of query templates can be performed through an iterative process that iteratively generates candidate templates over time to optimize a coverage over a set of existing queries. After generation of the schema-agnostic query templates, the systems and methods described herein can automatically translate/map the templatized queries into “concrete,” schema-specific queries that can be evaluated over specific customer schemas/datasets. In this manner, a query template for a given semantic query (e.g., “return the names of all employees”), is required to be written only once.

Type: Grant

Filed: December 16, 2019

Date of Patent: May 14, 2024

Assignee: GOOGLE LLC

Inventors: Anastasios Kementsietsidis, Jay Yogeshbhai Pandya, Tingting Tang, Laurren Kanner
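
Illustration for patent 11983173: a rough sketch of writing a query once as a schema-agnostic template and then translating it into schema-specific queries. The `$placeholder` syntax, the two customer mappings, and the function names are assumptions made for this example; the abstract does not specify a template language.

```python
import re
from string import Template

# Hypothetical mappings from generic concepts to each customer's identifiers.
CUSTOMER_A = {"employee": "staff", "person_name": "full_name"}
CUSTOMER_B = {"employee": "hr_employees", "person_name": "emp_name"}

def templatize(query, mapping):
    """Replace a customer's concrete identifiers with generic placeholders."""
    inverse = {concrete: generic for generic, concrete in mapping.items()}
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, inverse)) + r")\b")
    return pattern.sub(lambda match: "$" + inverse[match.group(1)], query)

def concretize(template_text, mapping):
    """Translate a template into a schema-specific query.

    Raises KeyError if the mapping lacks a concept used by the template.
    """
    return Template(template_text).substitute(mapping)

# "Return the names of all employees", written once against customer A's schema.
template_text = templatize("SELECT full_name FROM staff", CUSTOMER_A)
query_for_b = concretize(template_text, CUSTOMER_B)
```

Here `template_text` is `SELECT $person_name FROM $employee`, and `query_for_b` is `SELECT emp_name FROM hr_employees`.
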
Systems and Methods for Generation and Application of Schema-Agnostic Query Templates

Publication number: 20230037412

Abstract: The present disclosure provides systems and methods that generate query templates that are expressed in a generic schema-agnostic language. The query templates can be generated “from scratch” or can be automatically generated from existing queries, a process which may be referred to as “templatizing” the existing queries. As one example, generation of query templates can be performed through an iterative process that iteratively generates candidate templates over time to optimize a coverage over a set of existing queries. After generation of the schema-agnostic query templates, the systems and methods described herein can automatically translate/map the templatized queries into “concrete,” schema-specific queries that can be evaluated over specific customer schemas/datasets. In this manner, a query template for a given semantic query (e.g., “return the names of all employees”), is required to be written only once.

Type: Application

Filed: December 16, 2019

Publication date: February 9, 2023

Inventors: Anastasios Kementsietsidis, Jay Yogeshbhai Pandya, Tingting Tang, Laurren Kanner
Determining the schema of a graph dataset

Patent number: 11573935

Abstract: A schema for a dataset is identified by identifying a dataset comprising data and relationships between data pairs. An original schema is identified for the dataset. This original schema comprises an organizational structure. An initial fit between the dataset and the original schema is determined. The initial fit quantifies a conformity of the data in the dataset to the organizational structure of the original schema. A plurality of additional schemas are identified. Each additional schema is a distinct organizational schema. The dataset is partitioned into a plurality of subsets. Each subset comprises a modified fit quantifying a modified conformity of subset data in each subset to one of the original schema and the additional schemas. The modified fit is greater than the initial fit.

Type: Grant

Filed: March 27, 2017

Date of Patent: February 7, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Marcelo Arenas, Gonzalo Diaz, Achille Fokoue, Anastasios Kementsietsidis, Kavitha Srinivas
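
Illustration for patent 11573935: a small sketch of measuring how well a dataset fits a schema and of partitioning the dataset so that each subset fits better. The fit measure used here (the fraction of populated entity/property cells) and the partitioning rule (group entities by their exact property set) are assumptions chosen for simplicity; the abstract does not define either.

```python
from collections import defaultdict

def fit(entities):
    """Fraction of (entity, property) cells that are populated.

    `entities` maps an entity name to the set of properties it has.
    The schema's columns are taken to be the union of those properties.
    """
    if not entities:
        return 1.0
    columns = set().union(*entities.values())
    if not columns:
        return 1.0
    filled = sum(len(props) for props in entities.values())
    return filled / (len(entities) * len(columns))

def partition_by_signature(entities):
    """Group entities that share exactly the same set of properties."""
    groups = defaultdict(dict)
    for name, props in entities.items():
        groups[frozenset(props)][name] = props
    return list(groups.values())

# Hypothetical dataset: two people and one company sharing a single schema.
dataset = {
    "alice": {"name", "email"},
    "bob": {"name", "email"},
    "acme": {"name", "founded"},
}
initial_fit = fit(dataset)                 # 6 of 9 cells are populated
subsets = partition_by_signature(dataset)  # people in one subset, company in another
modified_fits = [fit(subset) for subset in subsets]
```

With this data the initial fit is 2/3, and each of the two subsets has a fit of 1.0, so the modified fit exceeds the initial one.
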
METHOD AND APPARATUS FOR SMART AND EXTENSIBLE SCHEMA MATCHING FRAMEWORK

Publication number: 20220374399

Abstract: A method may include (i) obtaining first data records structured in accordance with a first schema, (ii) determining, for the first schema, one or more first schema property values for each schema property in a set of pre-defined schema properties, (iii) determining, for a second schema, one or more second schema property values for each schema property in the set of pre-defined schema properties, (iv) providing, to a schema matching engine, first and second schema property values, where the schema matching engine contains schema mapping techniques and rules, where each rule suggests a schema mapping technique based on schema properties from the set of pre-defined schema properties, (v) applying the rules to select a schema mapping technique, (vi) transforming the first data records in accordance with the selected schema mapping technique, and (vii) providing the transformed first data records in a data structure in accordance with the second schema.

Type: Application

Filed: November 6, 2019

Publication date: November 24, 2022

Inventors: Anastasios KEMENTSIETSIDIS, Jay PANDYA, Chrysovalantis ANASTASIOU
Method and apparatus for identifying semantically related records

Patent number: 11227002

Abstract: An apparatus and method of identifying semantically related records, including receiving input data from an input device, splitting the input data into a plurality of clusters according to semantic relationship, each of the clusters including a plurality of source terms and a plurality of target terms, transforming each of the plurality of clusters based on the transformation which includes tokenization of the plurality of clusters, for each of the plurality of clusters that are transformed, finding relatedness scores of a plurality of semantic relatedness measures with the plurality of target terms, building a vector of similarity scores for each of the plurality of target terms, and for each of the plurality of source terms, selecting a predetermined number of the plurality of target terms according to the similarity scores.

Type: Grant

Filed: November 30, 2015

Date of Patent: January 18, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Oktie Hassanzadeh, Anastasios Kementsietsidis
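
Illustration for patent 11227002: a rough sketch of tokenizing terms, scoring each source/target pair with several relatedness measures, building a score vector, and keeping the best-scoring targets. The two measures (token Jaccard and a character-level ratio from Python's `difflib`) and the ranking by summed score are assumptions for this example, not the measures claimed in the patent.

```python
import re
from difflib import SequenceMatcher

def tokenize(term):
    """Lowercase a term and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", term.lower()))

def jaccard(a, b):
    """Token-level overlap between two terms."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0

def char_ratio(a, b):
    """Character-level similarity from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Assumed set of relatedness measures; more could be appended here.
MEASURES = (jaccard, char_ratio)

def score_vector(source, target):
    """One similarity score per measure for a source/target pair."""
    return [measure(source, target) for measure in MEASURES]

def top_k(source, targets, k=2):
    """Select the k targets with the highest summed similarity scores."""
    ranked = sorted(targets, key=lambda t: sum(score_vector(source, t)), reverse=True)
    return ranked[:k]
```

For example, `top_k("heart attack", ["Heart Attack (acute)", "broken leg", "attack of the heart"], k=1)` selects `"Heart Attack (acute)"`.
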
Method and apparatus for identifying the optimal schema to store graph data in a relational store

Patent number: 10949464

Abstract: A system for identifying a schema for storing graph data includes a database containing a graph dataset of data and relationships between data pairs and a list of storage methods that each are a distinct structural arrangement of the data and relationships from the graph data set. An analyzer module collects statistics for the graph dataset, and a data classification module uses the collected statistics to calculate metrics describing the data and relationships in the graph dataset, uses the calculated metrics to group the data and relationships into a plurality of graph dataset subsets and associates each graph dataset subset with one of the plurality of storage methods. The resulting group of storage methods associated with the plurality of graph dataset subsets includes a unique storage method for each graph dataset subset. The data and relationships in each graph dataset subset are arranged in accordance with associated storage methods.

Type: Grant

Filed: March 23, 2016

Date of Patent: March 16, 2021

Assignee: International Business Machines Corporation

Inventors: Mihaela Ancuta Bornea, Julian Timothy Dolby, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Kavitha Srinivas
Data virtualization across heterogeneous formats

Patent number: 10740304

Abstract: Various embodiments virtualize data across heterogeneous formats. In one embodiment, a plurality of heterogeneous data sources is received as input. A local schema graph including a set of attribute nodes and a set of type nodes is generated for each of the plurality of heterogeneous data sources. A global schema graph is generated based on each local schema graph that has been generated. The global schema graph comprises each of the local schema graphs and at least one edge between at least one of two or more attribute nodes and two or more type nodes from different local schema graphs. The edge indicates a relationship between the data sources represented by the different local schema graphs comprising the two or more attribute nodes based on a computed similarity between at least one value associated with each of the two or more attribute nodes.

Type: Grant

Filed: August 25, 2014

Date of Patent: August 11, 2020

Assignee: International Business Machines Corporation

Inventors: Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward
Method and apparatus for storing sparse graph data as multi-dimensional cluster

Patent number: 10509804

Abstract: A system for storing graph data as a multi-dimensional cluster has a database with a graph dataset containing data and relationships between data pairs and a schema list of storage methods that use a table with columns and rows associated with data or relationships. An analyzer module collects statistics of a graph dataset, and a dimension identification module identifies a plurality of dimensions that each represent a column in the table. A schema creation and loading module creates a modified storage method having a plurality of distinct table blocks and a plurality of table block indexes, one index for each table block, and arranges the data and relationships in the given graph dataset in accordance with the modified storage method to create the multi-dimensional cluster.

Type: Grant

Filed: March 24, 2016

Date of Patent: December 17, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Mihaela Ancuta Bornea, Julian Timothy Dolby, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Kavitha Srinivas
Storing graph data in a relational database

Patent number: 10387496

Abstract: Embodiments include methods, systems and computer program products for storing graph data for a directed graph in a relational database. Aspects include creating a plurality of relational tables for the graph data, using a processor on a computer, the plurality of relational tables including adjacency tables and attribute tables. Each row of the attribute tables is dedicated to a subject of the graph data in the dataset and stores a JavaScript Object Notation (JSON) object corresponding to the subject. Each row of the adjacency tables includes a hashtable containing properties and values of the subject for that row.

Type: Grant

Filed: May 21, 2015

Date of Patent: August 20, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Achille B. Fokoue-Nkoutche, Gang Hu, Anastasios Kementsietsidis, Kavitha Srinivas, Wen B. Sun, Guo Tong Xie
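
Illustration for patent 10387496: a minimal sketch of keeping one row per subject in an attribute table and in an adjacency table. Everything concrete here is an assumption: the use of SQLite, the table and column names, JSON text as the encoding for both tables, and the rule that an object counts as an edge target only when it also appears as a subject.

```python
import json
import sqlite3
from collections import defaultdict

TABLES = ("attributes", "adjacency")

def store(triples):
    """Load (subject, property, object) triples into two one-row-per-subject tables."""
    subjects = {s for s, _, _ in triples}
    attrs, edges = defaultdict(dict), defaultdict(dict)
    for s, p, o in triples:
        # Assumed rule: objects that are also subjects are graph edges.
        target = edges if o in subjects else attrs
        target[s].setdefault(p, []).append(o)

    conn = sqlite3.connect(":memory:")
    for table in TABLES:
        conn.execute(f"CREATE TABLE {table} (subject TEXT PRIMARY KEY, props TEXT)")
    for s in sorted(subjects):
        conn.execute("INSERT INTO attributes VALUES (?, ?)",
                     (s, json.dumps(attrs.get(s, {}), sort_keys=True)))
        conn.execute("INSERT INTO adjacency VALUES (?, ?)",
                     (s, json.dumps(edges.get(s, {}), sort_keys=True)))
    return conn

def properties_of(conn, table, subject):
    """Read back the property map stored for one subject."""
    if table not in TABLES:
        raise ValueError(f"unknown table: {table}")
    row = conn.execute(
        f"SELECT props FROM {table} WHERE subject = ?", (subject,)
    ).fetchone()
    return json.loads(row[0]) if row else {}
```

With the triples `("alice", "name", "Alice")`, `("alice", "knows", "bob")`, and `("bob", "name", "Bob")`, the attribute row for `alice` holds her name and the adjacency row holds the `knows` edge to `bob`.
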
Storing graph data in a relational database

Patent number: 10387497

Abstract: Embodiments include methods, systems and computer program products for storing graph data for a directed graph in a relational database. Aspects include creating a plurality of relational tables for the graph data, using a processor on a computer, the plurality of relational tables including adjacency tables and attribute tables. Each row of the attribute tables is dedicated to a subject of the graph data in the dataset and stores a JavaScript Object Notation (JSON) object corresponding to the subject. Each row of the adjacency tables includes a hashtable containing properties and values of the subject for that row.

Type: Grant

Filed: June 18, 2015

Date of Patent: August 20, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Achille B. Fokoue-Nkoutche, Gang Hu, Anastasios Kementsietsidis, Kavitha Srinivas, Wen B. Sun, Guo Tong Xie
Optimizing sparse schema-less data in data stores

Patent number: 10360262

Abstract: Various embodiments of the invention relate to optimizing storage of schema-less data. At least one of a schema-less dataset including a plurality of resources and one or more query workloads associated with the plurality of resources is received. Each resource is associated with at least a plurality of properties. At least one set of co-occurring properties from the plurality of properties is identified. A graph including a plurality of nodes is generated. Each of the nodes represents a unique property in the set of co-occurring properties. The graph further includes an edge connecting each node representing a pair of co-occurring properties. A schema is generated based on the graph that assigns a column identifier from a table to each unique property represented by one of the nodes in the graph.

Type: Grant

Filed: June 23, 2017

Date of Patent: July 23, 2019

Assignee: International Business Machines Corporation

Inventors: Mihaela Ancuta Bornea, Julian Dolby, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Kavitha Srinivas
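
Illustration for patent 10360262: a rough sketch of building a co-occurrence graph over properties and deriving a column assignment from it. The abstract says only that the schema is generated based on the graph; the greedy graph coloring used here, which lets properties that never co-occur share a column, is an assumed technique, and all names are hypothetical.

```python
from itertools import combinations

def cooccurrence_graph(resources):
    """Connect every pair of properties that appear together on some resource."""
    graph = {}
    for props in resources.values():
        for prop in props:
            graph.setdefault(prop, set())
        for a, b in combinations(sorted(props), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def assign_columns(graph):
    """Greedy coloring: co-occurring properties never share a column id."""
    columns = {}
    # Visit high-degree properties first; break ties by name for determinism.
    for prop in sorted(graph, key=lambda p: (-len(graph[p]), p)):
        used = {columns[n] for n in graph[prop] if n in columns}
        column = 0
        while column in used:
            column += 1
        columns[prop] = column
    return columns

# Hypothetical schema-less resources and their properties.
resources = {
    "r1": {"name", "email"},
    "r2": {"name", "phone"},
    "r3": {"title"},
}
columns = assign_columns(cooccurrence_graph(resources))
```

Here `name` and `title` land in column 0, while `email` and `phone`, which never appear on the same resource, share column 1.
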
Linking data elements based on similarity data values and semantic annotations

Patent number: 10229200

Abstract: Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements are maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.

Type: Grant

Filed: June 8, 2012

Date of Patent: March 12, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael James Ward
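
Illustration for patent 10229200: a minimal sketch of fixed-size signatures, locality-sensitive candidate generation, and threshold-based linking. MinHash signatures, band-based bucketing, Jaccard similarity, and the constants below are assumptions standing in for the hash functions and similarity measure that the abstract leaves unspecified. Value sets are assumed to be non-empty.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 16  # assumed signature length
BANDS = 8        # assumed number of LSH bands (2 rows per band)

def _hash(seed, value):
    """Deterministic seeded hash of a value."""
    digest = hashlib.sha1(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def signature(values):
    """MinHash signature: one minimum per seeded hash function."""
    return tuple(min(_hash(seed, v) for v in values) for seed in range(NUM_HASHES))

def candidate_pairs(signatures):
    """Elements whose signatures agree on any whole band become candidates."""
    rows = NUM_HASHES // BANDS
    buckets = defaultdict(set)
    for name, sig in signatures.items():
        for band in range(BANDS):
            buckets[(band, sig[band * rows:(band + 1) * rows])].add(name)
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs

def jaccard(a, b):
    """Similarity index computed on the original value sets."""
    return len(a & b) / len(a | b)

def link(elements, threshold=0.5):
    """Link candidate pairs whose similarity index meets the threshold."""
    sigs = {name: signature(values) for name, values in elements.items()}
    return {
        pair for pair in candidate_pairs(sigs)
        if jaccard(elements[pair[0]], elements[pair[1]]) >= threshold
    }
```

Elements with identical value sets always share a signature and are linked, while elements with disjoint value sets are never linked, because the final check runs on the original values.
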
Scalable multi-query optimization for SPARQL

Patent number: 10095742

Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.

Type: Grant

Filed: November 28, 2016

Date of Patent: October 9, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
Systems and methods for query evaluation over distributed linked data stores

Patent number: 10031922

Abstract: A method for query evaluation comprises receiving a query over a set of distributed data sources, decomposing the query into a set of sub-queries of the query, evaluating each sub-query in the set of sub-queries with respect to each data source in the set of distributed data sources, wherein evaluating comprises determining which data sources in the set of distributed data sources are capable of answering each sub-query and at what cost, computing a set of distributed plans by composing one or more of the sub-queries in one or more of the data sources, evaluating each plan in the set of distributed plans, selecting a sub-set of plans from the set of distributed plans to be executed for responding to the query, executing the selected sub-set of plans, and returning results of the query.

Type: Grant

Filed: July 10, 2015

Date of Patent: July 24, 2018

Assignee: International Business Machines Corporation

Inventors: Achille B. Fokoue-Nkoutche, Anastasios Kementsietsidis, Spyros Kotoulas, Muhammad Mustafa Rafique
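
Illustration for patent 10031922: a rough sketch of checking which sources can answer each sub-query and at what cost, enumerating distributed plans, and selecting the cheapest. The source catalog, the cost figures, and the exhaustive enumeration are assumptions for this example; the sketch selects a single plan and stops before execution.

```python
from itertools import product

# Hypothetical catalog: each source lists the sub-queries it can answer and a cost.
SOURCES = {
    "dbpedia": {"bornIn": 5, "name": 2},
    "geo": {"locatedIn": 1, "name": 4},
}

def capable_sources(sub_query, sources):
    """Return {source: cost} for the sources able to answer a sub-query."""
    return {s: costs[sub_query] for s, costs in sources.items() if sub_query in costs}

def enumerate_plans(sub_queries, sources):
    """Yield (total_cost, plan) for every assignment of sub-queries to sources."""
    options = [list(capable_sources(q, sources).items()) for q in sub_queries]
    for choice in product(*options):
        plan = dict(zip(sub_queries, (source for source, _ in choice)))
        yield sum(cost for _, cost in choice), plan

def best_plan(sub_queries, sources):
    """Select the cheapest plan; raises ValueError if some sub-query has no source."""
    return min(enumerate_plans(sub_queries, sources), key=lambda item: item[0])

cost, plan = best_plan(["name", "bornIn", "locatedIn"], SOURCES)
```

With this catalog the cheapest plan costs 8: `name` and `bornIn` go to `dbpedia`, and `locatedIn` goes to `geo`.
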
Systems and methods for query evaluation over distributed linked data stores

Patent number: 10025795

Abstract: A method for query evaluation comprises receiving a query over a set of distributed data sources, decomposing the query into a set of sub-queries of the query, evaluating each sub-query in the set of sub-queries with respect to each data source in the set of distributed data sources, wherein evaluating comprises determining which data sources in the set of distributed data sources are capable of answering each sub-query and at what cost, computing a set of distributed plans by composing one or more of the sub-queries in one or more of the data sources, evaluating each plan in the set of distributed plans, selecting a sub-set of plans from the set of distributed plans to be executed for responding to the query, executing the selected sub-set of plans, and returning results of the query.

Type: Grant

Filed: March 24, 2015

Date of Patent: July 17, 2018

Assignee: International Business Machines Corporation

Inventors: Achille B. Fokoue-Nkoutche, Anastasios Kementsietsidis, Spyros Kotoulas, Muhammad Mustafa Rafique
Annotating schema elements based on associating data instances with knowledge base entities

Patent number: 9959326

Abstract: Methods and systems for determining schema element types are shown that include pooling potential annotations for an element of an unlabeled schema from a plurality of heterogeneous sources, scoring the pool of potential annotations according to relevancy using instance information from the plurality of heterogeneous sources to produce a relevancy score, and annotating the element of the unlabeled schema using the most relevant potential annotations.

Type: Grant

Filed: March 23, 2011

Date of Patent: May 1, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Songyun Duan, Achille B. Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward
Finding optimal query plans

Patent number: 9785673

Abstract: Systems and methods are provided for optimizing a query and, more particularly, for finding optimal plans for graph queries by casting the task of finding the optimal plan as an integer linear programming (ILP) problem. A method for optimizing a query comprises building a data structure for a query, the data structure including a plurality of components, wherein each of the plurality of components corresponds to at least one graph pattern, determining a plurality of flows of query variables between the plurality of components, and determining a combination of the plurality of flows between the plurality of components that results in a minimum cost to execute the query.

Type: Grant

Filed: June 29, 2016

Date of Patent: October 10, 2017

Assignee: International Business Machines Corporation

Inventors: Mihaela A. Bornea, Julian Dolby, Achille B. Fokoue-Nkoutche, Anastasios Kementsietsidis, Kavitha Srinivas
OPTIMIZING SPARSE SCHEMA-LESS DATA IN DATA STORES

Publication number: 20170286566

Abstract: Various embodiments of the invention relate to optimizing storage of schema-less data. At least one of a schema-less dataset including a plurality of resources and one or more query workloads associated with the plurality of resources is received. Each resource is associated with at least a plurality of properties. At least one set of co-occurring properties from the plurality of properties is identified. A graph including a plurality of nodes is generated. Each of the nodes represents a unique property in the set of co-occurring properties. The graph further includes an edge connecting each node representing a pair of co-occurring properties. A schema is generated based on the graph that assigns a column identifier from a table to each unique property represented by one of the nodes in the graph.

Type: Application

Filed: June 23, 2017

Publication date: October 5, 2017

Applicant: International Business Machines Corporation

Inventors: Mihaela Ancuta BORNEA, Julian DOLBY, Achille Belly FOKOUE-NKOUTCHE, Anastasios KEMENTSIETSIDIS, Kavitha SRINIVAS
Optimizing sparse schema-less data in data stores

Patent number: 9715560

Abstract: Various embodiments of the invention relate to optimizing storage of schema-less data. At least one of a schema-less dataset including a plurality of resources and one or more query workloads associated with the plurality of resources is received. Each resource is associated with at least a plurality of properties. At least one set of co-occurring properties from the plurality of properties is identified. A graph including a plurality of nodes is generated. Each of the nodes represents a unique property in the set of co-occurring properties. The graph further includes an edge connecting each node representing a pair of co-occurring properties. A schema is generated based on the graph that assigns a column identifier from a table to each unique property represented by one of the nodes in the graph.

Type: Grant

Filed: June 27, 2013

Date of Patent: July 25, 2017

Assignee: International Business Machines Corporation

Inventors: Mihaela Ancuta Bornea, Julian Dolby, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Kavitha Srinivas
