Patents by Inventor Songyun Duan
Songyun Duan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10229200Abstract: Data elements from data sources and having a data value set are linked by using hash functions to determine a dimensionally reduced instance signature for each data element based on all data values associated with that data element to yield a plurality of dimensionally reduced instance signatures of equivalent fixed size such that similarities among the data values in the data value sets across all data elements is maintained among the plurality of instance signatures. Candidate pairs of data elements to link are identified using the plurality of instance signatures in locality sensitive hash functions, and a similarity index is generated for each candidate pair using a pre-determined measure of similarity. Candidate pairs of data elements having a similarity index above a given threshold are linked.Type: GrantFiled: June 8, 2012Date of Patent: March 12, 2019Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Mihaela Ancuta Bornea, Songyun Duan, Achille Belly Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael James Ward
-
Patent number: 10095742Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.Type: GrantFiled: November 28, 2016Date of Patent: October 9, 2018Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
-
Patent number: 9959326Abstract: Methods and systems for determining schema element types are shown that include pooling potential annotations for an element of an unlabeled schema from a plurality of heterogeneous sources, scoring the pool of potential annotations according to relevancy using information using instance information from the plurality of heterogeneous sources to produce a relevancy score, and annotating the element of the unlabeled schema using the most relevant potential annotations.Type: GrantFiled: March 23, 2011Date of Patent: May 1, 2018Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Songyun Duan, Achille B. Fokoue-Nkoutche, Oktie Hassanzadeh, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward
-
Publication number: 20170083577Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common substructures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.Type: ApplicationFiled: November 28, 2016Publication date: March 23, 2017Inventors: Songyun DUAN, Anastasios KEMENTSIETSIDIS, Wangchao LE, Feifei LI
-
Patent number: 9542444Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.Type: GrantFiled: January 28, 2016Date of Patent: January 10, 2017Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
-
Publication number: 20160162549Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.Type: ApplicationFiled: January 28, 2016Publication date: June 9, 2016Applicant: International Business Machines CorporationInventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
-
Patent number: 9317409Abstract: According to an aspect of the present principles, a method is provided for generating resource description framework benchmarks. The method includes deriving a resultant benchmark dataset with a user specified size and a user specified coherence from and with respect to an input dataset of a given size and a given coherence by determining which triples of subject-property-object to add to the input dataset or remove from the input dataset to derive the resultant benchmark dataset.Type: GrantFiled: September 10, 2012Date of Patent: April 19, 2016Assignee: International Business Machines CorporationInventors: Songyun Duan, Anastasios Kementsietsidis, Kavitha Srinivas, Octavian Udrea
-
Patent number: 9280583Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.Type: GrantFiled: November 30, 2012Date of Patent: March 8, 2016Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
-
Patent number: 9244820Abstract: According to an aspect of the present principles, a method is provided for generating resource description framework benchmarks. The method includes deriving a resultant benchmark dataset with a user specified size and a user specified coherence from and with respect to an input dataset of a given size and a given coherence by determining which triples of subject-property-object to add to the input dataset or remove from the input dataset to derive the resultant benchmark dataset.Type: GrantFiled: January 28, 2011Date of Patent: January 26, 2016Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Songyun Duan, Anastasios Kementsietsidis, Kavitha Srinivas, Octavian Udrea
-
Patent number: 9037615Abstract: A computer-implemented method, system, and article of manufacture for querying and integrating structured and unstructured data. The method includes: receiving entity information that is extracted from a first set of unstructured data using an open domain information extraction system, wherein the entity in-formation comprises relationship information between a first entity and a second entity of the first set of unstructured data; recognizing a pattern based on the relationship information and creating a schema for the first set of unstructured data based on the pattern; and associating an element of the created schema with (i) an entity of a second set of unstructured data or (ii) a schema element of an existing set of structured data if there is sufficient overall similarity between the created schema element and either the second unstructured data entity or the schema element of the existing structured data.Type: GrantFiled: June 11, 2012Date of Patent: May 19, 2015Assignee: International Business Machines CorporationInventors: Mihaela Ancuta Bornea, Songyun Duan, James J. Fan, Achille Fokoue-Nkoutche, Alfio M. Gliozzo, Aditya Kalyanpur, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward
-
Patent number: 8984019Abstract: Keyword searching is used to explore and search large Resource Description Framework datasets having unknown or constantly changing structures. A succinct and effective summarization is built from the underlying resource description framework data. Given a keyword query, the summarization lends significant pruning powers to exploratory keyword searches and leads to much better efficiency compared to previous work. The summarization returns exact results and can be updated incrementally and efficiently.Type: GrantFiled: November 20, 2012Date of Patent: March 17, 2015Assignee: International Business Machines CorporationInventors: Songyun Duan, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Wangchao Le, Feifei Li, Kavitha Srinivas
-
Patent number: 8983990Abstract: A method of performing a graph query issued by a user is provided. The method includes performing on a processor, receiving a user graph query. The method includes rewriting the user graph query as a new query based on a query policy expressed in a graph query language. The method includes performing the new query on graph data to obtain a result.Type: GrantFiled: August 17, 2010Date of Patent: March 17, 2015Assignee: International Business Machines CorporationInventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Min Wang
-
Patent number: 8977650Abstract: Keyword searching is used to explore and search large Resource Description Framework datasets having unknown or constantly changing structures. A succinct and effective summarization is built from the underlying resource description framework data. Given a keyword query, the summarization lends significant pruning powers to exploratory keyword searches and leads to much better efficiency compared to previous work. The summarization returns exact results and can be updated incrementally and efficiently.Type: GrantFiled: November 21, 2012Date of Patent: March 10, 2015Assignee: International Business Machines CorporationInventors: Songyun Duan, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Wangchao Le, Feifei Li, Kavitha Srinivas
-
Publication number: 20140156633Abstract: Multiquery optimization is performed in the context of RDF/SPARQL. Heuristic algorithms partition an input batch of queries into groups such that each group of queries can be optimized together. The optimization incorporates an efficient algorithm to discover the common sub-structures of multiple SPARQL queries and an effective cost model to compare candidate execution plans. No assumptions are made about the underlying SPARQL query engine. This provides portability across different RDF stores.Type: ApplicationFiled: November 30, 2012Publication date: June 5, 2014Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Songyun Duan, Anastasios Kementsietsidis, Wangchao Le, Feifei Li
-
Publication number: 20140143281Abstract: Keyword searching is used to explore and search large Resource Description Framework datasets having unknown or constantly changing structures. A succinct and effective summarization is built from the underlying resource description framework data. Given a keyword query, the summarization lends significant pruning powers to exploratory keyword searches and leads to much better efficiency compared to previous work. The summarization returns exact results and can be updated incrementally and efficiently.Type: ApplicationFiled: November 21, 2012Publication date: May 22, 2014Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Songyun Duan, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Wangchao Le, Feifei Li, Kavitha Srinivas
-
Publication number: 20140143280Abstract: Keyword searching is used to explore and search large Resource Description Framework datasets having unknown or constantly changing structures. A succinct and effective summarization is built from the underlying resource description framework data. Given a keyword query, the summarization lends significant pruning powers to exploratory keyword searches and leads to much better efficiency compared to previous work. The summarization returns exact results and can be updated incrementally and efficiently.Type: ApplicationFiled: November 20, 2012Publication date: May 22, 2014Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Songyun Duan, Achille Belly Fokoue-Nkoutche, Anastasios Kementsietsidis, Wangchao Le, Feifei Li, Kavitha Srinivas
-
Publication number: 20140075278Abstract: Aspects of the present invention provide a tool for extracting schema from a spreadsheet. In an embodiment, a set of data that is stored in an uncataloged tabular format, such as a spreadsheet, is retrieved. The structure of the retrieved set of data is surveyed to determine the dataset schema thereof. Then, data elements within the dataset schema are analyzed to obtain information regarding the data elements. Based on dataset schema and the element information, an interface can be constructed that allows remote access to the set of data.Type: ApplicationFiled: September 12, 2012Publication date: March 13, 2014Applicant: INTERNATIONAL BUSINESS MACHINES COPORATIONInventors: Mihaela A. Bornea, Songyun Duan, Achille B. Fokoue-Nkoutche, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward
-
Publication number: 20140074878Abstract: Aspects of the present invention provide a tool for extracting schema from a spreadsheet. In an embodiment, a set of data that is stored in an uncataloged tabular format, such as a spreadsheet, is retrieved. The structure of the retrieved set of data is surveyed to determine the dataset schema thereof. Then, data elements within the dataset schema are analyzed to obtain information regarding the data elements. Based on dataset schema and the element information, an interface can be constructed that allows remote access to the set of data.Type: ApplicationFiled: September 14, 2012Publication date: March 13, 2014Applicant: INTERNATIONAL BUSINESS MACHINES COPORATIONInventors: Mihaela A. Bornea, Songyun Duan, Achille B. Fokoue-Nkoutche, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward
-
Publication number: 20140039972Abstract: Systems and methods are provided for the automatic detection of different types of changes in a business process. A system includes a transformer for performing a transformation on data derived from process traces or models extracted from the processes traces to generate transformed data. The process traces are for a business process corresponding to a set of related tasks for a specified goal. Each of the models has at least a transition matrix of dimension N×N, where N is a total number of the related tasks. The system further includes a change detector for performing change detection on the transformed data to identify at least one of when a change occurs in the business process and a degree of the change.Type: ApplicationFiled: October 15, 2013Publication date: February 6, 2014Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: SONGYUN DUAN, PAUL T. KEYSER, GEETIKA T. LAKSHMANAN, DAVOOD SHAMSI
-
Publication number: 20130332478Abstract: A computer-implemented method, system, and article of manufacture for querying and integrating structured and unstructured data. The method includes: receiving entity information that is extracted from a first set of unstructured data using an open domain information extraction system, wherein the entity information comprises relationship information between a first entity and a second entity of the first set of unstructured data; recognizing a pattern based on the relationship information and creating a schema for the first set of unstructured data based on the pattern; and associating an element of the created schema with (i) an entity of a second set of unstructured data or (ii) a schema element of an existing set of structured data if there is sufficient overall similarity between the created schema element and either the second unstructured data entity or the schema element of the existing structured data.Type: ApplicationFiled: June 11, 2012Publication date: December 12, 2013Applicant: International Business Machines CorporationInventors: Mihaela Ancuta Bornea, Songyun Duan, James J. Fan, Achille Fokoue-Nkoutche, Alfio M. Gliozzo, Aditya Kalyanpur, Anastasios Kementsietsidis, Kavitha Srinivas, Michael J. Ward