Patents by Inventor Lucian Popa

Lucian Popa has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

MODEL-INDEPENDENT DATA SUBSETS

Publication number: 20250181908

Abstract: Embodiments of the invention provide a computer-implemented method that uses a processor system to perform processor system operations. The processor system operations include executing a model-independent selection (MIS) algorithm to select a first data subset from a first dataset based at least in part on one or more data quality metrics. The one or more data quality metrics include a first data quality metric that results from using a first function to map first data points of the first dataset to the first data quality metric. Executing the MIS algorithm to select the first data subset is further based at least in part on sampling the first data subset from the first dataset based on a probability distribution. The processor system operations further include providing the first data subset to a to-be-trained (TBT) model. The first function is independent of a type of the TBT model.
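
The selection step in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical quality metric (`distinct_token_count`) and a hypothetical helper (`select_subset`) that samples without replacement with probability proportional to the metric; neither name comes from the application, and the filing may define its metrics and distribution differently. The point shown is only that the scoring function looks at the data and never at the model that will be trained.

```python
import random
from typing import Callable, List, Sequence


def distinct_token_count(text: str) -> float:
    """Hypothetical quality metric: distinct whitespace tokens, plus one so every score is positive."""
    return float(len(set(text.split())) + 1)


def select_subset(
    dataset: Sequence[str],
    metric: Callable[[str], float],
    k: int,
    seed: int = 0,
) -> List[str]:
    """Sample up to k items without replacement, weighting each item by its metric score."""
    rng = random.Random(seed)
    weights = [metric(x) for x in dataset]
    if any(w <= 0 for w in weights):
        raise ValueError("metric must return positive scores")
    remaining = list(range(len(dataset)))  # indices still available for sampling
    chosen = []
    for _ in range(min(k, len(dataset))):
        # Draw one position according to the current (unnormalized) distribution.
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return [dataset[i] for i in sorted(chosen)]


corpus = ["a a a a", "the quick brown fox", "lorem ipsum dolor sit amet", "ok"]
subset = select_subset(corpus, distinct_token_count, k=2, seed=0)
print(subset)  # this subset would then be handed to whatever model is to be trained
```

Because the metric is computed from the data alone, the same subset could be passed to any model type without re-running the selection.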

Type: Application

Filed: November 30, 2023

Publication date: June 5, 2025

Inventors: Krishnateja Killamsetty, Alexandre Evfimievski, Tejaswini Pedapati, Kiran A Kate, Lucian Popa, Rishabh Krishnan Iyer

Deep learning of entity resolution rules

Patent number: 12266077

Abstract: A method, system, and computer program product for learning entity resolution rules for determining whether entities are matching. The method may include receiving historical pairs of entities. The method may also include determining a set of rules for determining whether a pair of entities are matching, where the set of rules comprises a plurality of conditions. The method may also include developing, using a deep neural network, an entity resolution model based on the historical pairs of entities. The method may also include receiving a new pair of entities. The method may also include applying the entity resolution model to the new pair of entities. The method may also include determining whether one or more rules from the set of rules are satisfied for the new pair of entities. The method may also include categorizing the new pair of entities as matching or not matching.

Type: Grant

Filed: December 14, 2020

Date of Patent: April 1, 2025

Assignee: International Business Machines Corporation

Inventors: Sheshera Mysore, Sairam Gurajada, Lucian Popa, Kun Qian, Prithviraj Sen

Low-resource entity resolution with transfer learning

Patent number: 11875253

Abstract: Methods, systems, and computer program products for low-resource entity resolution with transfer learning are provided herein. A computer-implemented method includes processing input data via a first entity resolution model, wherein the input data comprise labeled input data and unlabeled input data; identifying one or more portions of the unlabeled input data to be used in training a neural network entity resolution model, wherein said identifying comprises applying one or more active learning algorithms to the first entity resolution model; training, using (i) the one or more portions of the unlabeled input data and (ii) one or more deep learning techniques, the neural network entity resolution model; and performing one or more entity resolution tasks by applying the trained neural network entity resolution model to one or more datasets.

Type: Grant

Filed: June 17, 2019

Date of Patent: January 16, 2024

Assignee: International Business Machines Corporation

Inventors: Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa

INTER-OPERATOR BACKPROPAGATION IN AUTOML FRAMEWORKS

Publication number: 20230120658

Abstract: Systems, computer-implemented methods, and computer program products to facilitate inter-operator backpropagation in AutoML frameworks are provided. According to an embodiment, a system can comprise a processor that executes computer executable components stored in memory. The computer executable components comprise a selection component that selects a subset of deep learning and non-deep learning operators. The computer executable components further comprise a training component which trains the subset of deep learning and non-deep learning operators, wherein deep learning operators in the subset of deep learning and non-deep learning operators are trained using backpropagation across at least two deep learning operators of the subset of deep learning and non-deep learning operators.

Type: Application

Filed: October 20, 2021

Publication date: April 20, 2023

Inventors: Kiran A. Kate, Sairam Gurajada, Tejaswini Pedapati, Martin Hirzel, Lucian Popa, Yunyao Li, Jason Tsay

Discovery of linkage points between data sources

Patent number: 11531717

Abstract: Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.

Type: Grant

Filed: February 19, 2020

Date of Patent: December 20, 2022

Assignee: International Business Machines Corporation

Inventors: Oktie Hassanzadeh, Mauricio A. Hernandez-Sherrington, Ching-Tien Ho, Lucian Popa

MULTI-RELATIONAL GRAPH CONVOLUTIONAL NETWORK (GCN) IN RISK PREDICTION

Publication number: 20220366231

Abstract: A graph neural network can be built and trained to predict a risk of an entity. A multi-relational graph network can include a first graph network and a second graph network. The first graph network can include a first set of nodes and a first set of edges connecting some of the nodes in the first set. The second graph network can include a second set of nodes and a second set of edges connecting some of the nodes in the second set. The first set of nodes and the second set of nodes can represent entities, the first set of edges can represent a first relationship between the entities and the second set of edges can represent a second relationship between the entities. A graph convolutional network (GCN) can be structured to incorporate the multi-relational graph network, and trained to predict a risk associated with a given entity.

Type: Application

Filed: April 27, 2021

Publication date: November 17, 2022

Inventors: Yada Zhu, Sijia Liu, Aparna Gupta, Sai Radhakrishna Manikant Sarma Palepu, Koushik Kar, Lucian Popa, Kumar Bhaskaran, Nitin Gaur

Learning models for entity resolution using active learning

Patent number: 11501111

Abstract: Methods, systems, and computer program products for learning models for entity resolution using active learning are provided herein. A computer-implemented method includes determining a set of data items related to a task associated with structured knowledge base creation, and outputting the set of data items to a user for labeling. Such a method also includes generating, based on a user-labeled version of the set of data items, a candidate model for executing the task, and one or more generalized versions of the candidate model. Additionally, such a method can also include generating a final model based on one or more iterations of analysis of the candidate model and analysis of the one or more generalized versions of the candidate model, and performing the task by executing the final model on one or more datasets.

Type: Grant

Filed: April 6, 2018

Date of Patent: November 15, 2022

Assignee: International Business Machines Corporation

Inventors: Kun Qian, Lucian Popa, Prithviraj Sen, Min Li

Neuro-Symbolic Approach for Entity Linking

Publication number: 20220300799

Abstract: A system, computer program product, and method are provided for entity linking in a logical neural network (LNN). A set of features are generated for one or more entity-mention pairs in an annotated dataset. The generated set of features is evaluated against an entity linking LNN rule template having one or more logically connected rules and corresponding connective weights organized in a tree structure. An artificial neural network is leveraged along with a corresponding machine learning algorithm to learn the connective weights. The connective weights associated with the logically connected rules are selectively updated and a learned model is generated with learned thresholds and the learned weights for the logically connected rules.

Type: Application

Filed: March 16, 2021

Publication date: September 22, 2022

Applicant: International Business Machines Corporation

Inventors: Hang Jiang, Sairam Gurajada, Lucian Popa, Prithviraj Sen, Alexander Gray, Yunyao Li

DEEP LEARNING OF ENTITY RESOLUTION RULES

Publication number: 20220188974

Abstract: A method, system, and computer program product for learning entity resolution rules for determining whether entities are matching. The method may include receiving historical pairs of entities. The method may also include determining a set of rules for determining whether a pair of entities are matching, where the set of rules comprises a plurality of conditions. The method may also include developing, using a deep neural network, an entity resolution model based on the historical pairs of entities. The method may also include receiving a new pair of entities. The method may also include applying the entity resolution model to the new pair of entities. The method may also include determining whether one or more rules from the set of rules are satisfied for the new pair of entities. The method may also include categorizing the new pair of entities as matching or not matching.

Type: Application

Filed: December 14, 2020

Publication date: June 16, 2022

Inventors: Sheshera Mysore, Sairam Gurajada, Lucian Popa, Kun Qian, Prithviraj Sen

USING META-LEARNING TO OPTIMIZE AUTOMATIC SELECTION OF MACHINE LEARNING PIPELINES

Publication number: 20220051049

Abstract: A computer automatically selects a machine learning model pipeline using a meta-learning machine learning model. The computer receives ground truth data and pipeline preference metadata. The computer determines a group of pipelines appropriate for the ground truth data, and each of the pipelines includes an algorithm. The pipelines may include data preprocessing routines. The computer generates hyperparameter sets for the pipelines. The computer applies preprocessing routines to ground truth data to generate a group of preprocessed sets of said ground truth data and ranks hyperparameter set performance for each pipeline to establish a preferred set of hyperparameters for each pipeline. The computer selects favored data features and applies each of the pipelines, with associated sets of preferred hyperparameters, to score the favored data features of the preprocessed ground truth data. The computer ranks pipeline performance and selects a candidate pipeline according to the ranking.

Type: Application

Filed: August 11, 2020

Publication date: February 17, 2022

Inventors: Dakuo Wang, Chuang Gan, Gregory Bramble, Lisa Amini, Horst Cornelius Samulowitz, Kiran A. Kate, Bei Chen, Martin Wistuba, Alexandre Evfimievski, Ioannis Katsis, Yunyao Li, Adelmo Cristiano Innocenza Malossi, Andrea Bartezzaghi, Ban Kawas, Sairam Gurajada, Lucian Popa, Tejaswini Pedapati, Alexander Gray

Low-Resource Entity Resolution with Transfer Learning

Publication number: 20200394511

Abstract: Methods, systems, and computer program products for low-resource entity resolution with transfer learning are provided herein. A computer-implemented method includes processing input data via a first entity resolution model, wherein the input data comprise labeled input data and unlabeled input data; identifying one or more portions of the unlabeled input data to be used in training a neural network entity resolution model, wherein said identifying comprises applying one or more active learning algorithms to the first entity resolution model; training, using (i) the one or more portions of the unlabeled input data and (ii) one or more deep learning techniques, the neural network entity resolution model; and performing one or more entity resolution tasks by applying the trained neural network entity resolution model to one or more datasets.

Type: Application

Filed: June 17, 2019

Publication date: December 17, 2020

Inventors: Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa

Two level compute memoing for large scale entity resolution

Patent number: 10776269

Abstract: One embodiment provides for a method that includes performing, by a processor, active learning of large scale entity resolution using a distributed compute memoing cache to eliminate redundant computation. Link feature vector tables are determined for intermediate results of the active learning of large scale entity resolution. The link feature vector tables are managed by a two-level cache hierarchy.

Type: Grant

Filed: July 24, 2018

Date of Patent: September 15, 2020

Assignee: International Business Machines Corporation

Inventors: Min Li, Lucian Popa, Prithviraj Sen

DISCOVERY OF LINKAGE POINTS BETWEEN DATA SOURCES

Publication number: 20200183995

Abstract: Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.

Type: Application

Filed: February 19, 2020

Publication date: June 11, 2020

Inventors: Oktie Hassanzadeh, Mauricio A. Hernandez-Sherrington, Ching-Tien Ho, Lucian Popa

Methods and systems for discovery of linkage points between data sources

Patent number: 10599732

Abstract: Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.

Type: Grant

Filed: February 23, 2017

Date of Patent: March 24, 2020

Assignee: International Business Machines Corporation

Inventors: Oktie Hassanzadeh, Mauricio A Hernandez, Ching-Tien Ho, Lucian Popa

TWO LEVEL COMPUTE MEMOING FOR LARGE SCALE ENTITY RESOLUTION

Publication number: 20200034293

Abstract: One embodiment provides for a method that includes performing, by a processor, active learning of large scale entity resolution using a distributed compute memoing cache to eliminate redundant computation. Link feature vector tables are determined for intermediate results of the active learning of large scale entity resolution. The link feature vector tables are managed by a two-level cache hierarchy.

Type: Application

Filed: July 24, 2018

Publication date: January 30, 2020

Inventors: Min Li, Lucian Popa, Prithviraj Sen

Learning Models For Entity Resolution Using Active Learning

Publication number: 20190311229

Abstract: Methods, systems, and computer program products for learning models for entity resolution using active learning are provided herein. A computer-implemented method includes determining a set of data items related to a task associated with structured knowledge base creation, and outputting the set of data items to a user for labeling. Such a method also includes generating, based on a user-labeled version of the set of data items, a candidate model for executing the task, and one or more generalized versions of the candidate model. Additionally, such a method can also include generating a final model based on one or more iterations of analysis of the candidate model and analysis of the one or more generalized versions of the candidate model, and performing the task by executing the final model on one or more datasets.

Type: Application

Filed: April 6, 2018

Publication date: October 10, 2019

Inventors: Kun Qian, Lucian Popa, Prithviraj Sen, Min Li

Priority assessment of network traffic to conserve bandwidth guarantees in a data center

Patent number: 10110460

Abstract: Example embodiments relate to work-conserving bandwidth guarantees using priority, and a method for determining VM-to-VM bandwidth guarantees between a source virtual machine (VM) and at least one destination VM, including a particular VM-to-VM bandwidth guarantee between the source VM and a particular destination VM. The method includes monitoring outbound network traffic flow from the source VM to the particular destination VM. The method includes comparing the outbound network traffic flow to the particular VM-to-VM bandwidth guarantee. When the outbound network traffic flow is less than the particular VM-to-VM bandwidth guarantee, packets of the flow are directed according to a first priority. When the outbound network traffic flow is greater than the particular VM-to-VM bandwidth guarantee, packets of the flow are directed according to a second priority.
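
The comparison described in this abstract can be sketched in a few lines. The function and constant names below (`flow_rate_mbps`, `packet_priority`, `FIRST_PRIORITY`, `SECOND_PRIORITY`) are hypothetical, the rate is assumed to be measured over a fixed interval, and a flow exactly equal to its guarantee is assumed to keep the first priority, since the abstract does not cover that case.

```python
FIRST_PRIORITY = "first"    # used while the flow stays within its guarantee
SECOND_PRIORITY = "second"  # used once the flow exceeds its guarantee


def flow_rate_mbps(bytes_sent: int, interval_s: float) -> float:
    """Outbound rate of one VM-to-VM flow over a monitoring interval, in Mbit/s."""
    return bytes_sent * 8 / 1_000_000 / interval_s


def packet_priority(rate_mbps: float, guarantee_mbps: float) -> str:
    """Pick the priority for a flow's packets by comparing its rate to the guarantee."""
    if rate_mbps > guarantee_mbps:
        return SECOND_PRIORITY
    # Assumption: a rate equal to the guarantee is treated as within it.
    return FIRST_PRIORITY


rate = flow_rate_mbps(bytes_sent=25_000_000, interval_s=1.0)  # 200 Mbit/s
print(packet_priority(rate, guarantee_mbps=100.0))  # prints "second"
```

Traffic above the guarantee is still forwarded, only under the other priority, which is what lets spare bandwidth be used without displacing guaranteed traffic.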

Type: Grant

Filed: July 23, 2013

Date of Patent: October 23, 2018

Assignee: Hewlett Packard Enterprise Development LP

Inventors: Praveen Yalagandula, Lucian Popa, Sujata Banerjee

Resource allocator

Patent number: 10009285

Abstract: An example method for allocating resources in accordance with aspects of the present disclosure includes collecting proposals from a plurality of modules, the proposals assigning the resources to the plurality of modules and resulting in topology changes in a computer network environment, identifying a set of proposals in the proposals, the set of proposals complying with policies associated with the plurality of modules, instructing the plurality of modules to evaluate the set of proposals, selecting a proposal from the set of proposals, and instructing at least one module associated with the selected proposal to instantiate the selected proposal.

Type: Grant

Filed: July 30, 2013

Date of Patent: June 26, 2018

Assignee: Hewlett Packard Enterprise Development LP

Inventors: Jeffrey Clifford Mogul, Alvin Auyoung, Sujata Banerjee, Jung Gun Lee, Jean Tourrilhes, Michael Schlansker, Puneet Sharma, Lucian Popa

Entity resolution between datasets

Patent number: 9996607

Abstract: Described herein are methods, systems and computer program products for entity resolution. Entity resolution, also known as entity matching or record linkage, seeks to identify equivalent data objects between or among datasets. An example method includes creating a deterministic model by defining an entity to be resolved, selecting two datasets for comparison, defining matching predicates for attributes of the datasets to select a set of candidate matches, and defining a precedence rule for the candidate matches to select a subset of the candidate matches. The method includes running the deterministic model on the two datasets. Running the deterministic model includes applying the matching predicates and the precedence rule to data in the datasets that correspond to the attributes. The method also includes applying a cardinality rule to results of the running, and outputting the matching candidates for which the cardinality rule is satisfied.
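
The three stages named in this abstract (matching predicates, a precedence rule, a cardinality rule) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two predicates, the choice that earlier predicates take precedence, the one-to-one cardinality rule, and the sample records. The patent may define each of these differently.

```python
from collections import Counter


def norm(s: str) -> str:
    return s.strip().lower()


def surname(name: str) -> str:
    parts = norm(name).split()
    return parts[-1] if parts else ""


# Matching predicates in precedence order (assumption: earlier is stronger).
PREDICATES = [
    lambda a, b: norm(a["name"]) == norm(b["name"]),
    lambda a, b: surname(a["name"]) == surname(b["name"])
    and norm(a["city"]) == norm(b["city"]),
]


def resolve(left, right):
    # 1. Matching predicates: record each pair with the strongest predicate it satisfies.
    candidates = []
    for a in left:
        for b in right:
            for rank, pred in enumerate(PREDICATES):
                if pred(a, b):
                    candidates.append((a["id"], b["id"], rank))
                    break
    # 2. Precedence rule: per left record, keep only its strongest candidates.
    best = {}
    for l, _, rank in candidates:
        best[l] = min(rank, best.get(l, rank))
    kept = [(l, r) for l, r, rank in candidates if rank == best[l]]
    # 3. Cardinality rule (assumed one-to-one): drop records matched more than once.
    left_count = Counter(l for l, _ in kept)
    right_count = Counter(r for _, r in kept)
    return [(l, r) for l, r in kept if left_count[l] == 1 and right_count[r] == 1]


left = [
    {"id": "L1", "name": "Ann Lee", "city": "Austin"},
    {"id": "L2", "name": "Bob Ray", "city": "Boston"},
]
right = [
    {"id": "R1", "name": "ann lee", "city": "Austin"},
    {"id": "R2", "name": "A. Lee", "city": "Austin"},
    {"id": "R3", "name": "Robert Ray", "city": "Boston"},
    {"id": "R4", "name": "B. Ray", "city": "Boston"},
]
print(resolve(left, right))  # [('L1', 'R1')]
```

In this example L1 keeps only its exact-name match because that predicate outranks the surname-and-city match, while L2 is dropped because its two equally ranked candidates violate the one-to-one rule.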

Type: Grant

Filed: October 31, 2014

Date of Patent: June 12, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bogdan Alexe, Douglas R. Burdick, Mauricio A. Hernandez-Sherrington, Hima P. Karanam, Rajasekar Krishnamurthy, Lucian Popa, Shivakumar Vaithyanathan

Entity integration using high-level scripting languages

Patent number: 9971804

Abstract: Embodiments of the present invention relate to a new method of entity integration using high-level scripting languages. In one embodiment, a method of and computer product for entity integration is provided. An entity declaration is read from a machine readable medium. The entity declaration describes an entity including at least one nested entity. An index declaration is read from a machine readable medium. The index declaration describes an index of nested entities. An entity population rule is read from a machine readable medium. The entity population rule describes a mapping from an input schema to an output schema. The output schema conforms to the entity declaration. A plurality of input records is read from a first data store. The input records conform to the input schema. The entity population rule applies to the plurality of records to create a plurality of output records complying with the output schema. An index of nested entities is populated. The index complies with the index declaration.

Type: Grant

Filed: October 28, 2016

Date of Patent: May 15, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Armageddon R. Brown, Mauricio A. Hernandez, Georgia Koutrika, Rajasekar Krishnamurthy, Lucian Popa, Suresh Thalamati, Ryan Wisnesky
