Patents by Inventor Lucian Popa
Lucian Popa has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11875253
Abstract: Methods, systems, and computer program products for low-resource entity resolution with transfer learning are provided herein. A computer-implemented method includes processing input data via a first entity resolution model, wherein the input data comprise labeled input data and unlabeled input data; identifying one or more portions of the unlabeled input data to be used in training a neural network entity resolution model, wherein said identifying comprises applying one or more active learning algorithms to the first entity resolution model; training, using (i) the one or more portions of the unlabeled input data and (ii) one or more deep learning techniques, the neural network entity resolution model; and performing one or more entity resolution tasks by applying the trained neural network entity resolution model to one or more datasets.
Type: Grant
Filed: June 17, 2019
Date of Patent: January 16, 2024
Assignee: International Business Machines Corporation
Inventors: Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa
-
Publication number: 20230120658
Abstract: Systems, computer-implemented methods, and computer program products to facilitate inter-operator backpropagation in AutoML frameworks are provided. According to an embodiment, a system can comprise a processor that executes computer executable components stored in memory. The computer executable components comprise a selection component that selects a subset of deep learning and non-deep learning operators. The computer executable components further comprise a training component which trains the subset of deep learning and non-deep learning operators, wherein deep learning operators in the subset of deep learning and non-deep learning operators are trained using backpropagation across at least two deep learning operators of the subset of deep learning and non-deep learning operators.
Type: Application
Filed: October 20, 2021
Publication date: April 20, 2023
Inventors: Kiran A. Kate, Sairam Gurajada, Tejaswini Pedapati, Martin Hirzel, Lucian Popa, Yunyao Li, Jason Tsay
-
Patent number: 11531717
Abstract: Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.
Type: Grant
Filed: February 19, 2020
Date of Patent: December 20, 2022
Assignee: International Business Machines Corporation
Inventors: Oktie Hassanzadeh, Mauricio A. Hernandez-Sherrington, Ching-Tien Ho, Lucian Popa
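The linkage-point idea in the abstract above can be illustrated with a minimal sketch. This is not the patented method: the function names, the use of Python's `difflib.SequenceMatcher` as the similarity measure, and the 0.9 threshold are all assumptions made for illustration. A "linkage point" is modeled here as a pair of attribute names, one from each dataset, whose values are similar enough for at least one pair of records.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (an assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_linkage_points(left, right, threshold=0.9):
    """Map each (left_attr, right_attr) pair to the record pairs it links.

    `left` and `right` are lists of dicts (attribute name -> value).
    A record pair (i, j) is attached to an attribute pair when the two
    attribute values meet the similarity threshold.
    """
    points = {}
    for i, left_record in enumerate(left):
        for j, right_record in enumerate(right):
            for left_attr, left_value in left_record.items():
                for right_attr, right_value in right_record.items():
                    if similarity(str(left_value), str(right_value)) >= threshold:
                        points.setdefault((left_attr, right_attr), []).append((i, j))
    return points


# Hypothetical example data: two datasets with different schemas.
companies = [{"name": "IBM Corp", "city": "Armonk"}]
filings = [{"company": "IBM Corp.", "hq": "Armonk"}]
linkage_points = find_linkage_points(companies, filings)
```

The nested loops compare every value against every other value, which is quadratic in both records and attributes; a practical system would need blocking or indexing, which this sketch leaves out.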
-
Publication number: 20220366231
Abstract: A graph neural network can be built and trained to predict a risk of an entity. A multi-relational graph network can include a first graph network and a second graph network. The first graph network can include a first set of nodes and a first set of edges connecting some of the nodes in the first set. The second graph network can include a second set of nodes and a second set of edges connecting some of the nodes in the second set. The first set of nodes and the second set of nodes can represent entities, the first set of edges can represent a first relationship between the entities and the second set of edges can represent a second relationship between the entities. A graph convolutional network (GCN) can be structured to incorporate the multi-relational graph network, and trained to predict a risk associated with a given entity.
Type: Application
Filed: April 27, 2021
Publication date: November 17, 2022
Inventors: Yada Zhu, Sijia Liu, Aparna Gupta, Sai Radhakrishna Manikant Sarma Palepu, Koushik Kar, Lucian Popa, Kumar Bhaskaran, Nitin Gaur
-
Patent number: 11501111
Abstract: Methods, systems, and computer program products for learning models for entity resolution using active learning are provided herein. A computer-implemented method includes determining a set of data items related to a task associated with structured knowledge base creation, and outputting the set of data items to a user for labeling. Such a method also includes generating, based on a user-labeled version of the set of data items, a candidate model for executing the task, and one or more generalized versions of the candidate model. Additionally, such a method can also include generating a final model based on one or more iterations of analysis of the candidate model and analysis of the one or more generalized versions of the candidate model, and performing the task by executing the final model on one or more datasets.
Type: Grant
Filed: April 6, 2018
Date of Patent: November 15, 2022
Assignee: International Business Machines Corporation
Inventors: Kun Qian, Lucian Popa, Prithviraj Sen, Min Li
-
Publication number: 20220300799
Abstract: A system, computer program product, and method are provided for entity linking in a logical neural network (LNN). A set of features are generated for one or more entity-mention pairs in an annotated dataset. The generated set of features is evaluated against an entity linking LNN rule template having one or more logically connected rules and corresponding connective weights organized in a tree structure. An artificial neural network is leveraged along with a corresponding machine learning algorithm to learn the connective weights. The connective weights associated with the logically connected rules are selectively updated and a learned model is generated with learned thresholds and the learned weights for the logically connected rules.
Type: Application
Filed: March 16, 2021
Publication date: September 22, 2022
Applicant: International Business Machines Corporation
Inventors: Hang Jiang, Sairam Gurajada, Lucian Popa, Prithviraj Sen, Alexander Gray, Yunyao Li
-
Publication number: 20220188974
Abstract: A method, system, and computer program product for learning entity resolution rules for determining whether entities are matching. The method may include receiving historical pairs of entities. The method may also include determining a set of rules for determining whether a pair of entities are matching, where the set of rules comprises a plurality of conditions. The method may also include developing, using a deep neural network, an entity resolution model based on the historical pairs of entities. The method may also include receiving a new pair of entities. The method may also include applying the entity resolution model to the new pair of entities. The method may also include determining whether one or more rules from the set of rules are satisfied for the new pair of entities. The method may also include categorizing the new pair of entities as matching or not matching.
Type: Application
Filed: December 14, 2020
Publication date: June 16, 2022
Inventors: Sheshera Mysore, Sairam Gurajada, Lucian Popa, Kun Qian, Prithviraj Sen
-
Publication number: 20220051049
Abstract: A computer automatically selects a machine learning model pipeline using a meta-learning machine learning model. The computer receives ground truth data and pipeline preference metadata. The computer determines a group of pipelines appropriate for the ground truth data, and each of the pipelines includes an algorithm. The pipelines may include data preprocessing routines. The computer generates hyperparameter sets for the pipelines. The computer applies preprocessing routines to ground truth data to generate a group of preprocessed sets of said ground truth data and ranks hyperparameter set performance for each pipeline to establish a preferred set of hyperparameters for each pipeline. The computer selects favored data features and applies each of the pipelines, with associated sets of preferred hyperparameters, to score the favored data features of the preprocessed ground truth data. The computer ranks pipeline performance and selects a candidate pipeline according to the ranking.
Type: Application
Filed: August 11, 2020
Publication date: February 17, 2022
Inventors: Dakuo Wang, Chuang Gan, Gregory Bramble, Lisa Amini, Horst Cornelius Samulowitz, Kiran A. Kate, Bei Chen, Martin Wistuba, Alexandre Evfimievski, Ioannis Katsis, Yunyao Li, Adelmo Cristiano Innocenza Malossi, Andrea Bartezzaghi, Ban Kawas, Sairam Gurajada, Lucian Popa, Tejaswini Pedapati, Alexander Gray
-
Publication number: 20200394511
Abstract: Methods, systems, and computer program products for low-resource entity resolution with transfer learning are provided herein. A computer-implemented method includes processing input data via a first entity resolution model, wherein the input data comprise labeled input data and unlabeled input data; identifying one or more portions of the unlabeled input data to be used in training a neural network entity resolution model, wherein said identifying comprises applying one or more active learning algorithms to the first entity resolution model; training, using (i) the one or more portions of the unlabeled input data and (ii) one or more deep learning techniques, the neural network entity resolution model; and performing one or more entity resolution tasks by applying the trained neural network entity resolution model to one or more datasets.
Type: Application
Filed: June 17, 2019
Publication date: December 17, 2020
Inventors: Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa
-
Patent number: 10776269
Abstract: One embodiment provides for a method that includes performing, by a processor, active learning of large scale entity resolution using a distributed compute memoing cache to eliminate redundant computation. Link feature vector tables are determined for intermediate results of the active learning of large scale entity resolution. The link feature vector tables are managed by a two-level cache hierarchy.
Type: Grant
Filed: July 24, 2018
Date of Patent: September 15, 2020
Assignee: International Business Machines Corporation
Inventors: Min Li, Lucian Popa, Prithviraj Sen
-
Publication number: 20200183995
Abstract: Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.
Type: Application
Filed: February 19, 2020
Publication date: June 11, 2020
Inventors: Oktie Hassanzadeh, Mauricio A. Hernandez-Sherrington, Ching-Tien Ho, Lucian Popa
-
Patent number: 10599732
Abstract: Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.
Type: Grant
Filed: February 23, 2017
Date of Patent: March 24, 2020
Assignee: International Business Machines Corporation
Inventors: Oktie Hassanzadeh, Mauricio A Hernandez, Ching-Tien Ho, Lucian Popa
-
Publication number: 20200034293
Abstract: One embodiment provides for a method that includes performing, by a processor, active learning of large scale entity resolution using a distributed compute memoing cache to eliminate redundant computation. Link feature vector tables are determined for intermediate results of the active learning of large scale entity resolution. The link feature vector tables are managed by a two-level cache hierarchy.
Type: Application
Filed: July 24, 2018
Publication date: January 30, 2020
Inventors: Min Li, Lucian Popa, Prithviraj Sen
-
Publication number: 20190311229
Abstract: Methods, systems, and computer program products for learning models for entity resolution using active learning are provided herein. A computer-implemented method includes determining a set of data items related to a task associated with structured knowledge base creation, and outputting the set of data items to a user for labeling. Such a method also includes generating, based on a user-labeled version of the set of data items, a candidate model for executing the task, and one or more generalized versions of the candidate model. Additionally, such a method can also include generating a final model based on one or more iterations of analysis of the candidate model and analysis of the one or more generalized versions of the candidate model, and performing the task by executing the final model on one or more datasets.
Type: Application
Filed: April 6, 2018
Publication date: October 10, 2019
Inventors: Kun Qian, Lucian Popa, Prithviraj Sen, Min Li
-
Patent number: 10110460
Abstract: Example embodiments relate to work conserving bandwidth guarantees using priority, and a method for determining VM-to-VM bandwidth guarantees between a source virtual machine (VM) and at least one destination VM, including a particular VM-to-VM bandwidth guarantee between the source VM and a particular destination VM. The method includes monitoring outbound network traffic flow from the source VM to the particular destination VM. The method includes comparing the outbound network traffic flow to the particular VM-to-VM bandwidth guarantee. When the outbound network traffic flow is less than the particular VM-to-VM bandwidth guarantee, packets of the flow are directed according to a first priority. When the outbound network traffic flow is greater than the particular VM-to-VM bandwidth guarantee, packets of the flow are directed according to a second priority.
Type: Grant
Filed: July 23, 2013
Date of Patent: October 23, 2018
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Praveen Yalagandula, Lucian Popa, Sujata Banerjee
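The comparison step in the abstract above can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the `FlowMonitor` class, its method names, the fixed measurement window, and the choice to treat a rate exactly equal to the guarantee as within the guarantee are assumptions (the abstract specifies only the "less than" and "greater than" cases).

```python
from dataclasses import dataclass


@dataclass
class FlowMonitor:
    """Tracks one source-VM-to-destination-VM flow (hypothetical helper)."""

    guarantee_mbps: float       # the VM-to-VM bandwidth guarantee for this flow
    bytes_in_window: int = 0    # bytes observed in the current measurement window

    def record(self, packet_bytes: int) -> None:
        """Account for one outbound packet."""
        self.bytes_in_window += packet_bytes

    def rate_mbps(self, window_seconds: float) -> float:
        """Measured outbound rate over the window, in megabits per second."""
        return self.bytes_in_window * 8 / (window_seconds * 1_000_000)

    def priority(self, window_seconds: float) -> str:
        """Return 'first' while within the guarantee, 'second' once above it.

        Assumption: a rate exactly equal to the guarantee counts as within it.
        """
        if self.rate_mbps(window_seconds) <= self.guarantee_mbps:
            return "first"
        return "second"


# Hypothetical usage: a 100 Mbps guarantee measured over a one-second window.
monitor = FlowMonitor(guarantee_mbps=100.0)
monitor.record(1_000_000)              # 8 Mbps so far
within = monitor.priority(1.0)
monitor.record(20_000_000)             # 168 Mbps in total
above = monitor.priority(1.0)
```

Because traffic above the guarantee is demoted rather than dropped, it can still use otherwise idle capacity, which is what makes the scheme work conserving.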
-
Patent number: 10009285
Abstract: An example method for allocating resources in accordance with aspects of the present disclosure includes collecting proposals from a plurality of modules, the proposals assigning the resources to the plurality of modules and resulting in topology changes in a computer network environment, identifying a set of proposals in the proposals, the set of proposals complying with policies associated with the plurality of modules, instructing the plurality of modules to evaluate the set of proposals, selecting a proposal from the set of proposals, and instructing at least one module associated with the selected proposal to instantiate the selected proposal.
Type: Grant
Filed: July 30, 2013
Date of Patent: June 26, 2018
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Jeffrey Clifford Mogul, Alvin Auyoung, Sujata Banerjee, Jung Gun Lee, Jean Tourrilhes, Michael Schlansker, Puneet Sharma, Lucian Popa
-
Patent number: 9996607
Abstract: Described herein are methods, systems and computer program products for entity resolution. Entity resolution, also known as entity matching or record linkage, seeks to identify equivalent data objects between or among datasets. An example method includes creating a deterministic model by defining an entity to be resolved, selecting two datasets for comparison, defining matching predicates for attributes of the datasets to select a set of candidate matches, and defining a precedence rule for the candidate matches to select a subset of the candidate matches. The method includes running the deterministic model on the two datasets. Running the deterministic model includes applying the matching predicates and the precedence rule to data in the datasets that correspond to the attributes. The method also includes applying a cardinality rule to results of the running, and outputting the matching candidates for which the cardinality rule is satisfied.
Type: Grant
Filed: October 31, 2014
Date of Patent: June 12, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bogdan Alexe, Douglas R. Burdick, Mauricio A. Hernandez-Sherrington, Hima P. Karanam, Rajasekar Krishnamurthy, Lucian Popa, Shivakumar Vaithyanathan
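The three stages named in the abstract above (matching predicates, a precedence rule, a cardinality rule) can be sketched roughly as below. This is one possible reading and not the patented method: the `resolve` function, the record layout with an `id` key, the interpretation of precedence as "keep only matches from the highest-ranked predicate that fired for a record", and the interpretation of cardinality as "at most N matches per left-hand record" are all assumptions.

```python
from collections import defaultdict


def resolve(left, right, predicates, cardinality=1):
    """Deterministic entity resolution sketch.

    `predicates` is a list of (name, function) pairs ordered from highest
    to lowest precedence; each function takes a left and a right record
    and returns True when they match on that predicate.
    """
    # 1. Matching predicates: record the best-ranked predicate per pair.
    candidates = []  # (left_id, right_id, rank)
    for left_record in left:
        for right_record in right:
            for rank, (_, predicate) in enumerate(predicates):
                if predicate(left_record, right_record):
                    candidates.append((left_record["id"], right_record["id"], rank))
                    break

    # 2. Precedence rule: per left record, keep only the best-ranked matches.
    best_rank = {}
    for left_id, _, rank in candidates:
        best_rank[left_id] = min(rank, best_rank.get(left_id, rank))
    kept = [(l, r) for l, r, rank in candidates if rank == best_rank[l]]

    # 3. Cardinality rule: drop left records with too many remaining matches.
    by_left = defaultdict(list)
    for left_id, right_id in kept:
        by_left[left_id].append(right_id)
    return [(l, r) for l, rs in by_left.items() if len(rs) <= cardinality for r in rs]


# Hypothetical example data.
people = [
    {"id": "L1", "name": "Ann Lee", "email": "ann@x.org"},
    {"id": "L2", "name": "Bob Ray", "email": "bob@x.org"},
]
accounts = [
    {"id": "R1", "name": "Ann Lee", "email": "ann@x.org"},
    {"id": "R2", "name": "Ann Lee", "email": "other@x.org"},
    {"id": "R3", "name": "Bob Ray", "email": "b1@x.org"},
    {"id": "R4", "name": "Bob Ray", "email": "b2@x.org"},
]
rules = [
    ("same_email", lambda l, r: l["email"] == r["email"]),
    ("same_name", lambda l, r: l["name"] == r["name"]),
]
matches = resolve(people, accounts, rules)
```

In this example, L1 matches R1 by email and R2 by name only, so precedence keeps R1; L2 matches two accounts by name alone, so the one-to-one cardinality rule rejects both.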
-
Patent number: 9971804
Abstract: Embodiments of the present invention relate to a new method of entity integration using high-level scripting languages. In one embodiment, a method of and computer product for entity integration is provided. An entity declaration is read from a machine readable medium. The entity declaration describes an entity including at least one nested entity. An index declaration is read from a machine readable medium. The index declaration describes an index of nested entities. An entity population rule is read from a machine readable medium. The entity population rule describes a mapping from an input schema to an output schema. The output schema conforms to the entity declaration. A plurality of input records is read from a first data store. The input records conform to the input schema. The entity population rule applies to the plurality of records to create a plurality of output records complying with the output schema. An index of nested entities is populated. The index complies with the index declaration.
Type: Grant
Filed: October 28, 2016
Date of Patent: May 15, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Armageddon R. Brown, Mauricio A. Hernandez, Georgia Koutrika, Rajasekar Krishnamurthy, Lucian Popa, Suresh Thalamati, Ryan Wisnesky
-
Patent number: 9794185
Abstract: According to an example, a method for bandwidth guarantee and work conservation includes determining virtual machine (VM) bandwidth guarantees assigned to VMs in a network including a source VM that communicates with destination VMs. The method further includes assigning minimum bandwidth guarantees to communications between the source VM with the destination VMs by dividing a VM bandwidth guarantee assigned to the source VM between the destination VMs based on active VM-to-VM communications between the source VM and the destination VMs. The method also includes allocating, by a processor, spare bandwidth capacity in the network to a communication between the source VM and a destination VM based on the assigned minimum bandwidth guarantees.
Type: Grant
Filed: July 31, 2012
Date of Patent: October 17, 2017
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Lucian Popa, Praveen Yalagandula, Sujata Banerjee, Jeffrey C. Mogul, Yoshio Turner, Jose Renato G. Santos
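The two allocation steps in the abstract above can be illustrated with a small sketch. The abstract does not say how the guarantee is divided or how spare capacity is shared, so both policies here are assumptions: an equal split among currently active destinations, and spare capacity shared in proportion to each communication's minimum guarantee. The function names are hypothetical.

```python
def split_guarantee(source_guarantee_mbps, active_destinations):
    """Divide a source VM's guarantee among its active destinations.

    Assumption: an equal split. Inactive destinations receive nothing,
    so their share is not wasted.
    """
    if not active_destinations:
        return {}
    share = source_guarantee_mbps / len(active_destinations)
    return {destination: share for destination in active_destinations}


def allocate_spare(spare_mbps, min_guarantees):
    """Add spare capacity on top of each minimum guarantee.

    Assumption: spare bandwidth is shared in proportion to the minimum
    guarantees, so no communication drops below its guarantee.
    """
    total = sum(min_guarantees.values())
    if total == 0:
        return {destination: 0.0 for destination in min_guarantees}
    return {
        destination: guarantee + spare_mbps * guarantee / total
        for destination, guarantee in min_guarantees.items()
    }


# Hypothetical usage: a 300 Mbps source guarantee, three active destinations.
per_destination = split_guarantee(300.0, ["vm_b", "vm_c", "vm_d"])
with_spare = allocate_spare(60.0, {"vm_b": 100.0, "vm_c": 200.0})
```

Recomputing the split whenever the set of active destinations changes is what lets an idle pair's share flow to the pairs that are actually communicating.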
-
Patent number: 9710534
Abstract: Data records are linked across a plurality of datasets. Each dataset contains at least one data record, and each data record is associated with an entity and includes one or more attributes of that entity and a value for each attribute. Values associated with attributes are compared across datasets, and matching attributes having values that satisfy a predetermined similarity threshold are identified. In addition, linkage points between pairs of datasets are identified. Each linkage point links one or more pairs of data records. Each data record in the pair of data records is contained in one of a given pair of datasets, and each pair of data records is associated with a common entity having matching attributes in the given pair of datasets. Data records associated with the common entities are linked across datasets using the identified linkage points.
Type: Grant
Filed: May 7, 2013
Date of Patent: July 18, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Oktie Hassanzadeh, Mauricio A. Hernandez, Ching-Tien Ho, Lucian Popa