Patents by Inventor Kubilay Atasu

Kubilay Atasu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PROCESSING GRAPHS USING GRAPH PATTERNS

Publication number: 20240330959

Abstract: In an approach, a processor identifies subgraphs of predefined patterns in a first graph, the first graph: (i) representing a specific ontology and (ii) comprising nodes representing first entities and first edges representing relationships between the first entities. A processor represents the identified subgraphs by respective second graphs, thereby forming multi-relational graphs, each second graph comprising nodes representing second entities and second edges representing relationships between the second entities, where: the second entities are the respective nodes of the first graph; and each second edge indicates that the two second entities linked by the second edge are part of a pattern of the predefined patterns. A processor inputs the multi-relational graphs to a multi-relational graph neural network for generating output in accordance with the specific ontology. A processor provides the output.

Type: Application

Filed: April 3, 2023

Publication date: October 3, 2024

Inventors: Jovan Blanuša, Maximo Cravero Baraja, Kubilay Atasu, Charalampos Pozidis
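
The pattern-to-graph step described in the abstract above can be illustrated with a minimal sketch. It assumes a single predefined pattern (a triangle) and an undirected first graph stored as an adjacency map; the function names, the toy graph, and the "triangle" label are hypothetical and not taken from the application. The graph neural network step is not shown.

```python
from itertools import combinations

def find_triangles(adj):
    """Return each triangle of the first graph once, as a sorted 3-tuple."""
    triangles = set()
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                triangles.add(tuple(sorted((u, v, w))))
    return sorted(triangles)

def pattern_graph(instances, label):
    """Build a 'second graph': an edge (a, b, label) means that nodes a and b
    of the first graph occur together in one instance of the pattern."""
    edges = set()
    for nodes in instances:
        for a, b in combinations(sorted(nodes), 2):
            edges.add((a, b, label))
    return sorted(edges)

# Hypothetical toy first graph (undirected adjacency map).
first_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}
triangles = find_triangles(first_graph)
multi_relational_edges = pattern_graph(triangles, "triangle")
```

Adding further patterns (cycles, fan-outs, and so on) would contribute edges with other labels, which is what would make the resulting graph multi-relational.
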
Low-complexity methods for assessing distances between pairs of documents

Patent number: 11222054

Abstract: Two sets X2 and X1 of histograms of words, and a vocabulary V are accessed. Each of the two sets is representable as a sparse matrix, each row of which corresponds to a histogram. Each histogram is representable as a sparse vector, whose dimension is determined by a dimension of the vocabulary. Two phases compute distances between pairs of histograms. The first phase includes computations performed for each histogram and for each word in the vocabulary to obtain a dense, floating-point vector y. The second phase includes computing, for each histogram, a sparse-matrix, dense-vector multiplication between a matrix-representation of the set X1 of histograms and the vector y. The multiplication is performed to obtain distances between all histograms of the set X1 and each histogram X2[j]. Distances between all pairs of histograms are obtained, based on which distances between documents can subsequently be assessed.

Type: Grant

Filed: March 12, 2018

Date of Patent: January 11, 2022

Assignee: International Business Machines Corporation

Inventors: Kubilay Atasu, Cesar Berrospi Ramis, Nikolas Ioannou, Thomas Patrick Parnell, Charalampos Pozidis, Vasileios Vasileiadis
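
The two phases in the abstract above can be sketched as follows. The abstract does not say what the first phase computes for each vocabulary word; this sketch assumes it is the Euclidean distance to the closest word of the target histogram X2[j]. The embeddings, the dict-based sparse rows, and all function names are hypothetical.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def phase_one(embeddings, target_hist):
    """Assumed phase 1: for every vocabulary word, the distance to the closest
    word occurring in the target histogram. Returns a dense vector y of
    length |V|."""
    return [
        min(euclidean(embeddings[w], embeddings[t]) for t in target_hist)
        for w in range(len(embeddings))
    ]

def phase_two(x1_rows, y):
    """Phase 2: sparse-matrix times dense-vector product, giving one distance
    per histogram (row) of X1."""
    return [sum(weight * y[w] for w, weight in row.items()) for row in x1_rows]

# Hypothetical toy data: 4 vocabulary words with 2-D embeddings.
embeddings = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
X1 = [{0: 0.5, 1: 0.5}, {3: 1.0}]    # sparse rows: word index -> weight
X2 = [{0: 1.0}, {1: 0.5, 2: 0.5}]

# distances[j][i] is the distance between X2[j] and X1[i].
distances = [phase_two(X1, phase_one(embeddings, x2_row)) for x2_row in X2]
```

Because y is computed once per target histogram and then reused for every row of X1, the per-pair work reduces to a sparse dot product.
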
Construing similarities between datasets with explainable cognitive methods

Patent number: 11176186

Abstract: In an approach for construing similarities between datasets, a processor accesses a pair of sets of feature weights, wherein the sets of feature weights include a query dataset and comprises first weights associated to first features and a reference dataset and comprises second weights associated to second features. Based on similarities between the first features and the second features, a processor discovers flows from the first features to the second features, wherein the flows maximize an overall similarity between the pair of sets of feature weights. Based on the similarities and the flows, a processor computes pair contributions to the overall similarity in order to obtain contributive elements, wherein the pair contributions are contributions of pairs joining the first features to the second features. A processor ranks the contributive elements to obtain respective ranks. A processor returns a result comprising the contributive elements and indications to the respective ranks.

Type: Grant

Filed: March 27, 2020

Date of Patent: November 16, 2021

Assignee: International Business Machines Corporation

Inventors: Kubilay Atasu, Cesar Berrospi Ramis
CONSTRUING SIMILARITIES BETWEEN DATASETS WITH EXPLAINABLE COGNITIVE METHODS

Publication number: 20210303609

Abstract: In an approach for construing similarities between datasets, a processor accesses a pair of sets of feature weights, wherein the sets of feature weights include a query dataset and comprises first weights associated to first features and a reference dataset and comprises second weights associated to second features. Based on similarities between the first features and the second features, a processor discovers flows from the first features to the second features, wherein the flows maximize an overall similarity between the pair of sets of feature weights. Based on the similarities and the flows, a processor computes pair contributions to the overall similarity in order to obtain contributive elements, wherein the pair contributions are contributions of pairs joining the first features to the second features. A processor ranks the contributive elements to obtain respective ranks. A processor returns a result comprising the contributive elements and indications to the respective ranks.

Type: Application

Filed: March 27, 2020

Publication date: September 30, 2021

Inventors: Kubilay Atasu, Cesar Berrospi Ramis
Assessing distances between pairs of histograms based on relaxed flow constraints

Patent number: 11042604

Abstract: The example embodiments of the invention notably are directed to a computer-implemented method for assessing distances between pairs of histograms. Each of the histograms is a representation of a digital object; said representation comprises bins associating weights to respective vectors. Such vectors represent respective features of said digital object. This method basically revolves around computing distances between pairs of histograms. That is, for each pair {p, q} of histograms p and q of said pairs of histograms, the method computes a distance between p and q of said each pair {p, q}. In more detail, said distance is computed according to a cost of moving p into q, so as to obtain a flow matrix F, whose matrix elements Fi,j indicate, for each pair {i,j} of bins of p and q, how much weight of a bin i of p has to flow to a bin j of q to move p into q. This is achieved by minimizing a quantity Σi,j Fi,j·Ci,j, where Ci,j is a matrix element of a cost matrix C representing said cost.

Type: Grant

Filed: December 4, 2018

Date of Patent: June 22, 2021

Assignee: International Business Machines Corporation

Inventors: Kubilay Atasu, Thomas Mittelholzer
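
As an illustration of minimizing the sum of Fi,j·Ci,j under relaxed constraints, here is a minimal sketch of a common one-sided relaxation. The abstract does not state which constraints the patent relaxes, so this choice is an assumption: the out-flow constraints (row i of F sums to p[i]) are kept and the in-flow constraints on q are dropped. The function name and toy values are hypothetical.

```python
def relaxed_flow(p, q_size, cost):
    """One-sided relaxation of the transport problem: with the in-flow
    constraints dropped, each bin i of p sends all of its weight to its
    cheapest bin j of q. Returns the flow matrix F and sum(F[i][j] * C[i][j])."""
    flow = [[0.0] * q_size for _ in p]
    for i, weight in enumerate(p):
        j = min(range(q_size), key=lambda k: cost[i][k])
        flow[i][j] = weight
    total = sum(flow[i][j] * cost[i][j]
                for i in range(len(p)) for j in range(q_size))
    return flow, total

# Hypothetical toy data: two bins in p, two bins in q.
p = [0.5, 0.5]
C = [[1.0, 3.0], [2.0, 4.0]]
F, lower_bound = relaxed_flow(p, 2, C)
```

Dropping constraints can only lower the minimum, so the value returned is a lower bound on the fully constrained transport cost. In this toy case with q = [0.5, 0.5], the constrained cost is 2.5 and the relaxed value is 1.5.
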
Load-balancing training of recommender system for heterogeneous systems

Patent number: 10839255

Abstract: A method for parallelizing a training of a model using a matrix-factorization-based collaborative filtering algorithm may be provided. The model can be used in a recommender system for a plurality of users and a plurality of items. The method includes providing a sparse training data matrix, selecting a number of user-item co-clusters, and building a user model data matrix by matrix factorization such that a computational load for executing the determining updated elements of the factorized sparse training data matrix is evenly distributed across the heterogeneous computing resources.

Type: Grant

Filed: May 15, 2017

Date of Patent: November 17, 2020

Assignee: International Business Machines Corporation

Inventors: Kubilay Atasu, Celestine Duenner, Thomas Mittelholzer, Thomas Parnell, Charalampos Pozidis, Michail Vlachos
Hardware compilation of cascaded grammars

Patent number: 10803346

Abstract: A cascaded finite-state-transducer array includes a plurality of finite-state-transducers, the finite-state-transducers being distributed in space. The finite-state-transducer array is configured with dedicated data transfer channels between the finite-state-transducers to transfer specific data types. Each data stream on a dedicated data transfer channel may transmit a particular data type, which may be sorted in increasing order of start offsets or token IDs.

Type: Grant

Filed: December 28, 2018

Date of Patent: October 13, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Kubilay Atasu, Akihiro Nakayama, Raphael Polig, Tong Xu
Assessing Distances Between Pairs Of Histograms Based On Relaxed Flow Constraints

Publication number: 20200175092

Abstract: The example embodiments of the invention notably are directed to a computer-implemented method for assessing distances between pairs of histograms. Each of the histograms is a representation of a digital object; said representation comprises bins associating weights to respective vectors. Such vectors represent respective features of said digital object. This method basically revolves around computing distances between pairs of histograms. That is, for each pair {p, q} of histograms p and q of said pairs of histograms, the method computes a distance between p and q of said each pair {p, q}. In more detail, said distance is computed according to a cost of moving p into q, so as to obtain a flow matrix F, whose matrix elements Fi,j indicate, for each pair {i,j} of bins of p and q, how much weight of a bin i of p has to flow to a bin j of q to move p into q. This is achieved by minimizing a quantity Σi,j Fi,j·Ci,j, where Ci,j is a matrix element of a cost matrix C representing said cost.

Type: Application

Filed: December 4, 2018

Publication date: June 4, 2020

Inventors: Kubilay Atasu, Thomas Mittelholzer
Detecting longest regular expression matches

Patent number: 10474707

Abstract: In one embodiment, a computer-implemented method includes receiving a regular expression (regex) and input data. One or more spans are identified representing one or more matches in which the regex matches at least a portion of the input data. Each span corresponds to a corresponding match and includes a start offset of the corresponding match in the input data and an end offset of the corresponding match in the input data. The one or more matches are identified in a sequence. An order of the sequence of the one or more spans is modified. One or more filtered spans are generated, by a computer processor, by filtering out a subset of the one or more spans that are each contained by at least one other span in the one or more spans. The identifying, the modifying, and the filtering are performed at streaming rate.

Type: Grant

Filed: September 21, 2015

Date of Patent: November 12, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Kubilay Atasu
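
The identify, reorder, and filter steps in the abstract above can be sketched in software. This is not a streaming implementation and makes no claim to match the patented one: spans are found by brute force with Python's `re` module, and the function names are hypothetical.

```python
import re

def all_spans(pattern, text):
    """Every (start, end) span at which the pattern matches, found by trying
    each start and end offset (quadratic, for illustration only)."""
    regex = re.compile(pattern)
    return [(s, e)
            for s in range(len(text))
            for e in range(s + 1, len(text) + 1)
            if regex.fullmatch(text, s, e)]

def filter_contained(spans):
    """Drop spans contained in another span. After sorting by start ascending
    and end descending, a span is contained exactly when its end does not
    exceed the largest end seen so far."""
    kept, max_end = [], -1
    for start, end in sorted(set(spans), key=lambda s: (s[0], -s[1])):
        if end > max_end:
            kept.append((start, end))
            max_end = end
    return kept

spans = all_spans("a+b?", "xaab")
longest = filter_contained(spans)
```

The reordering is what makes the filter a single pass: once spans are sorted this way, one running maximum of end offsets is enough to decide containment.
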
Detecting longest regular expression matches

Patent number: 10467272

Abstract: In one embodiment, a computer-implemented method includes receiving a regular expression (regex) and input data. One or more spans are identified representing one or more matches in which the regex matches at least a portion of the input data. Each span corresponds to a corresponding match and includes a start offset of the corresponding match in the input data and an end offset of the corresponding match in the input data. The one or more matches are identified in a sequence. An order of the sequence of the one or more spans is modified. One or more filtered spans are generated, by a computer processor, by filtering out a subset of the one or more spans that are each contained by at least one other span in the one or more spans. The identifying, the modifying, and the filtering are performed at streaming rate.

Type: Grant

Filed: November 30, 2015

Date of Patent: November 5, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Kubilay Atasu
LOW-COMPLEXITY METHODS FOR ASSESSING DISTANCES BETWEEN PAIRS OF DOCUMENTS

Publication number: 20190278850

Abstract: Two sets X2 and X1 of histograms of words, and a vocabulary V are accessed. Each of the two sets is representable as a sparse matrix, each row of which corresponds to a histogram. Each histogram is representable as a sparse vector, whose dimension is determined by a dimension of the vocabulary. Two phases compute distances between pairs of histograms. The first phase includes computations performed for each histogram and for each word in the vocabulary to obtain a dense, floating-point vector y. The second phase includes computing, for each histogram, a sparse-matrix, dense-vector multiplication between a matrix-representation of the set X1 of histograms and the vector y. The multiplication is performed to obtain distances between all histograms of the set X1 and each histogram X2[j]. Distances between all pairs of histograms are obtained, based on which distances between documents can subsequently be assessed.

Type: Application

Filed: March 12, 2018

Publication date: September 12, 2019

Inventors: Kubilay Atasu, Cesar Berrospi Ramis, Nikolas Ioannou, Thomas Patrick Parnell, Charalampos Pozidis, Vasileios Vasileiadis
HARDWARE COMPILATION OF CASCADED GRAMMARS

Publication number: 20190163999

Abstract: A cascaded finite-state-transducer array includes a plurality of finite-state-transducers, the finite-state-transducers being distributed in space. The finite-state-transducer array is configured with dedicated data transfer channels between the finite-state-transducers to transfer specific data types. Each data stream on a dedicated data transfer channel may transmit a particular data type, which may be sorted in increasing order of start offsets or token IDs.

Type: Application

Filed: December 28, 2018

Publication date: May 30, 2019

Inventors: Kubilay Atasu, Akihiro Nakayama, Raphael Polig, Tong Xu
Hardware compilation of cascaded grammars

Patent number: 10198646

Abstract: A cascaded finite-state-transducer array includes a plurality of finite-state-transducers, the finite-state-transducers being distributed in space. The finite-state-transducer array is configured with dedicated data transfer channels between the finite-state-transducers to transfer specific data types. Each data stream on a dedicated data transfer channel may transmit a particular data type, which may be sorted in increasing order of start offsets or token IDs.

Type: Grant

Filed: July 1, 2016

Date of Patent: February 5, 2019

Assignee: International Business Machines Corporation

Inventors: Kubilay Atasu, Akihiro Nakayama, Raphael Polig, Tong Xu
Graph data representation and pre-processing for efficient parallel search tree traversal

Patent number: 10169487

Abstract: One or more embodiments may provide the capability to enumerate maximal cliques of graph data by constructing and traversing a search tree through a single sequential pass on an adjacency list. The adjacency list may be generated so as to enable the at least one maximal clique to be generated in one single sequential pass.

Type: Grant

Filed: April 4, 2016

Date of Patent: January 1, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Kubilay Atasu, Silvio Dragone, Christoph Hagleitner, Robert R. McCune
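
For readers unfamiliar with maximal-clique enumeration by search-tree traversal, the sketch below uses the classic Bron-Kerbosch recursion as a stand-in. It does not reproduce the adjacency-list layout or the single sequential pass described in the abstract above; the function name and toy graph are hypothetical.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion.
    Each recursive call corresponds to one node of the search tree."""
    cliques = []

    def expand(clique, candidates, excluded):
        # A clique is maximal when no vertex can extend it (no candidates)
        # and it is not part of an already reported clique (no excluded).
        if not candidates and not excluded:
            cliques.append(sorted(clique))
            return
        for v in sorted(candidates):
            expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)

# Hypothetical toy graph: a triangle 1-2-3 with a pendant vertex 4 on 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
result = maximal_cliques(graph)
```
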
Load-Balancing Training of Recommender System for Heterogeneous Systems

Publication number: 20180330192

Abstract: A method for parallelizing a training of a model using a matrix-factorization-based collaborative filtering algorithm may be provided. The model can be used in a recommender system for a plurality of users and a plurality of items. The method includes providing a sparse training data matrix, selecting a number of user-item co-clusters, and building a user model data matrix by matrix factorization such that a computational load for executing the determining updated elements of the factorized sparse training data matrix is evenly distributed across the heterogeneous computing resources.

Type: Application

Filed: May 15, 2017

Publication date: November 15, 2018

Inventors: Kubilay Atasu, Celestine Duenner, Thomas Mittelholzer, Thomas Parnell, Charalampos Pozidis, Michail Vlachos
Method for detecting cliques in graphs

Patent number: 10055510

Abstract: A method is provided for searching a graph to identify cliques using a set of processing elements (PEs), a first PE of the set of PEs having access to an adjacency list of a seed vertex of the graph, the adjacency list of the seed vertex including a set of vertices. The method includes: generating a data structure for each intermediate vertex of the set of vertices, the data structure indicating the respective intermediate vertex and an additional list of intermediate vertices of the set of vertices; storing the generated data structures; for each buffered data structure, receiving the buffered data structure and configuring the available PE to receive an adjacency list of the intermediate vertex indicated in the respective data structure and to select from the adjacency list a set of further vertices that are adjacent to the seed vertex and are part of the additional list.

Type: Grant

Filed: November 4, 2015

Date of Patent: August 21, 2018

Assignee: International Business Machines Corporation

Inventors: Kubilay Atasu, Silvio Dragone
Non-deterministic finite state machine module for use in a regular expression matching system

Patent number: 9983876

Abstract: A non-deterministic finite state machine module for use in a regular expression matching system. The system includes a computational unit implementing a non-deterministic finite state machine representing a regular expression, wherein the computational unit is configured to: receive an input data stream, wherein an occurrence of the regular expression is determined, and an activation signal; process the input data stream with respect to the non-deterministic finite state machine depending on the activation signal; and provide at least one branch data output for initializing an additional non-deterministic finite state machine module if the processing of an element of the input data stream according to the non-deterministic finite state machine results in a branching of a processing thread.

Type: Grant

Filed: February 20, 2014

Date of Patent: May 29, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Kubilay Atasu, Christoph Hagleitner, Raphael Polig, Frederick R Reiss
Regular expression matching with back-references using backtracking

Patent number: 9875045

Abstract: A device for matching, in input data, a regular expression with back-references, represented by a finite-state machine (FSM). The device comprises a plurality of parallel processing elements (PPEs), an interconnection network for interconnecting the PPEs with each other, and a memory for receiving and storing input data. The PPEs process the input data stored in the memory, based on backtracking to process the back-references, and implement FA next state logic to generate new active FA configurations or mark themselves as available to receive active FA configurations. The interconnection network retrieves active FA configurations from the PPEs and allocates the active FA configurations to available PPEs. The PPEs are configured to match a regular expression in the input data.

Type: Grant

Filed: July 27, 2015

Date of Patent: January 23, 2018

Assignee: International Business Machines Corporation

Inventors: Kubilay Atasu, Silvio Dragone
ACCELERATED CONTENT ANALYTICS BASED ON A HIERARCHICAL DATA-FLOW-GRAPH REPRESENTATION

Publication number: 20180018152

Abstract: A system and method to hardware-accelerate finite state transducer libraries and their compilation toolchains. In an embodiment, a computer-implemented method for partitioning an UIMA-PEAR file into software-based and hardware-accelerated components may comprise creating a data-flow graph representation of the UIMA-PEAR-file, flattening hierarchies of the data-flow graph representation, and selecting the components to be hardware accelerated from the flattened hierarchies of the data-flow graph representation based on data dependencies of data types produced and consumed by each component of the flattened data-flow graph.

Type: Application

Filed: July 18, 2016

Publication date: January 18, 2018

Inventors: Kubilay Atasu, Akihiro Nakayama, Raphael Polig, Tong Xu
HARDWARE COMPILATION OF CASCADED GRAMMARS

Publication number: 20180005060

Abstract: A cascaded finite-state-transducer array includes a plurality of finite-state-transducers, the finite-state-transducers being distributed in space. The finite-state-transducer array is configured with dedicated data transfer channels between the finite-state-transducers to transfer specific data types. Each data stream on a dedicated data transfer channel may transmit a particular data type, which may be sorted in increasing order of start offsets or token IDs.

Type: Application

Filed: July 1, 2016

Publication date: January 4, 2018

Inventors: Kubilay Atasu, Akihiro Nakayama, Raphael Polig, Tong Xu
