Patents by Inventor Shyam R. Mudambi
Shyam R. Mudambi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10248694
Abstract: A computer-implemented method includes inserting a bloom filter creation stage after an inner data source identification stage, wherein a join operation is to be performed to join an outer data source with the inner data source. The method inserts a bloom filter search stage after an outer data source identification stage, wherein each row of data from the outer data source is searched against a bloom filter for the inner data source during the bloom filter search stage. The method initializes a read on the inner data source. Subsequent to determining the bloom filter creation stage is complete, the method initializes a read on the outer data source. The method performs the join operation at a join stage.
Type: Grant
Filed: August 31, 2015
Date of Patent: April 2, 2019
Assignee: International Business Machines Corporation
Inventors: Manish A. Bhide, Shyam R. Mudambi, Sriram K. Padmanabhan, Vivek S. Tirumalaraju
-
Patent number: 10242063
Abstract: A computer-implemented method includes inserting a bloom filter creation stage after an inner data source identification stage, wherein a join operation is to be performed to join an outer data source with the inner data source. The method inserts a bloom filter search stage after an outer data source identification stage, wherein each row of data from the outer data source is searched against a bloom filter for the inner data source during the bloom filter search stage. The method initializes a read on the inner data source. Subsequent to determining the bloom filter creation stage is complete, the method initializes a read on the outer data source. The method performs the join operation at a join stage.
Type: Grant
Filed: July 20, 2016
Date of Patent: March 26, 2019
Assignee: International Business Machines Corporation
Inventors: Manish A. Bhide, Shyam R. Mudambi, Sriram K. Padmanabhan, Vivek S. Tirumalaraju
-
Patent number: 10114878
Abstract: A computer manages methods for utilizing an index to manage access to data in a dataset stored in one or more file locations in an ETL tool by receiving a request to access a dataset associated with one or more file locations, wherein the dataset is stored in the one or more file locations. The computer queries an index for the one or more file locations associated with the dataset, wherein the dataset has another index for data in the dataset. The computer receives the one or more file locations associated with the dataset. The computer determines to cache the request to access the one or more file locations for the dataset until one or more thresholds are met, wherein the cached request is part of a total number of cached requests.
Type: Grant
Filed: December 16, 2013
Date of Patent: October 30, 2018
Assignee: International Business Machines Corporation
Inventors: Manish A. Bhide, Jean-Claude Mamou, Shyam R. Mudambi
-
Patent number: 9594637
Abstract: System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system.
Type: Grant
Filed: March 25, 2014
Date of Patent: March 14, 2017
Assignee: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Publication number: 20170060967
Abstract: A computer-implemented method includes inserting a bloom filter creation stage after an inner data source identification stage, wherein a join operation is to be performed to join an outer data source with the inner data source. The method inserts a bloom filter search stage after an outer data source identification stage, wherein each row of data from the outer data source is searched against a bloom filter for the inner data source during the bloom filter search stage. The method initializes a read on the inner data source. Subsequent to determining the bloom filter creation stage is complete, the method initializes a read on the outer data source. The method performs the join operation at a join stage.
Type: Application
Filed: August 31, 2015
Publication date: March 2, 2017
Inventors: Manish A. Bhide, Shyam R. Mudambi, Sriram K. Padmanabhan, Vivek S. Tirumalaraju
-
Publication number: 20170060970
Abstract: A computer-implemented method includes inserting a bloom filter creation stage after an inner data source identification stage, wherein a join operation is to be performed to join an outer data source with the inner data source. The method inserts a bloom filter search stage after an outer data source identification stage, wherein each row of data from the outer data source is searched against a bloom filter for the inner data source during the bloom filter search stage. The method initializes a read on the inner data source. Subsequent to determining the bloom filter creation stage is complete, the method initializes a read on the outer data source. The method performs the join operation at a join stage.
Type: Application
Filed: July 20, 2016
Publication date: March 2, 2017
Inventors: Manish A. Bhide, Shyam R. Mudambi, Sriram K. Padmanabhan, Vivek S. Tirumalaraju
-
Patent number: 9477511
Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
Type: Grant
Filed: August 14, 2013
Date of Patent: October 25, 2016
Assignee: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Patent number: 9477512
Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
Type: Grant
Filed: September 12, 2014
Date of Patent: October 25, 2016
Assignee: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Patent number: 9424160
Abstract: Data flow disruptions over a series of data processing operators can be detected by a computer system that generates a profile for data flow at an operator. The profile can include data input, processing, and output wait times. Using the profile, the system can detect potential flow disruptions. If the potential disruption satisfies a rule, it is considered a data flow disruption and a recommendation associated with the satisfied rule is identified. The recommendation and the operator identity are displayed.
Type: Grant
Filed: March 27, 2015
Date of Patent: August 23, 2016
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Lawrence A. Greene, Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu, Dong J. Wei
-
Patent number: 9401835
Abstract: Techniques are disclosed for data integration on retargetable engines in a networked environment. The networked environment includes data processing engines of different types and having different sets of characteristics. A request is received to execute a data flow model in the networked environment. The data flow model includes data flow objects. A first data processing engine is programmatically selected based on a predefined set of criteria and the sets of characteristics of the data processing engines. The data flow model is executed using the selected data processing engine and responsive to the request.
Type: Grant
Filed: March 15, 2013
Date of Patent: July 26, 2016
Assignee: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Patent number: 9323619
Abstract: System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system.
Type: Grant
Filed: March 15, 2013
Date of Patent: April 26, 2016
Assignee: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Patent number: 9262205
Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
Type: Grant
Filed: March 25, 2014
Date of Patent: February 16, 2016
Assignee: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Patent number: 9256460
Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
Type: Grant
Filed: March 15, 2013
Date of Patent: February 9, 2016
Assignee: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Publication number: 20150269006
Abstract: Data flow disruptions over a series of data processing operators can be detected by a computer system that generates a profile for data flow at an operator. The profile can include data input, processing, and output wait times. Using the profile, the system can detect potential flow disruptions. If the potential disruption satisfies a rule, it is considered a data flow disruption and a recommendation associated with the satisfied rule is identified. The recommendation and the operator identity are displayed.
Type: Application
Filed: March 27, 2015
Publication date: September 24, 2015
Inventors: Brian K. Caufield, Lawrence A. Greene, Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu, Dong J. Wei
-
Publication number: 20150169712
Abstract: A computer manages methods for utilizing an index to manage access to data in a dataset stored in one or more file locations in an ETL tool by receiving a request to access a dataset associated with one or more file locations, wherein the dataset is stored in the one or more file locations. The computer queries an index for the one or more file locations associated with the dataset, wherein the dataset has another index for data in the dataset. The computer receives the one or more file locations associated with the dataset. The computer determines to cache the request to access the one or more file locations for the dataset until one or more thresholds are met, wherein the cached request is part of a total number of cached requests.
Type: Application
Filed: December 16, 2013
Publication date: June 18, 2015
Applicant: International Business Machines Corporation
Inventors: Manish A. Bhide, Jean-Claude Mamou, Shyam R. Mudambi
-
Publication number: 20150074669
Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
Type: Application
Filed: September 12, 2014
Publication date: March 12, 2015
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Publication number: 20150052530
Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
Type: Application
Filed: August 14, 2013
Publication date: February 19, 2015
Applicant: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Publication number: 20140281704
Abstract: System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system.
Type: Application
Filed: March 25, 2014
Publication date: September 18, 2014
Applicant: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Publication number: 20140282604
Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
Type: Application
Filed: March 15, 2013
Publication date: September 18, 2014
Applicant: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
-
Publication number: 20140282605
Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
Type: Application
Filed: March 25, 2014
Publication date: September 18, 2014
Applicant: International Business Machines Corporation
Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
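
Illustrative note on the bloom filter join abstracts (patents 10248694 and 10242063 and their published applications): the staged flow they describe, in which a bloom filter is built from the inner data source before the outer data source is read and each outer row is searched against the filter ahead of the join, might be sketched roughly as follows. This is a minimal illustration based only on the abstract text, not the patented implementation. The names BloomFilter, bloom_join, num_bits, and num_hashes, the use of SHA-256 for hashing, and the in-memory hash join are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter (hypothetical helper; sizes chosen arbitrarily)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_join(inner_rows, outer_rows, key):
    """Inner join of two lists of dicts on `key`, pre-filtered by a Bloom filter."""
    # Creation stage: read the inner source first and build the filter.
    bloom = BloomFilter()
    inner_by_key = {}
    for row in inner_rows:
        bloom.add(row[key])
        inner_by_key.setdefault(row[key], []).append(row)

    # Search stage: only after the filter is complete, read the outer
    # source and drop rows whose key cannot match any inner row.
    candidates = [row for row in outer_rows if bloom.might_contain(row[key])]

    # Join stage: the exact lookup discards any Bloom false positives.
    joined = []
    for outer in candidates:
        for inner in inner_by_key.get(outer[key], []):
            joined.append({**inner, **outer})
    return joined


inner = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
outer = [{"id": 2, "qty": 5}, {"id": 3, "qty": 7}]
result = bloom_join(inner, outer, "id")
```

In this sketch the filter only reduces how many outer rows reach the join stage. Correctness does not depend on it, because a Bloom filter can report false positives but never false negatives, and the exact key lookup in the join stage removes any false positives.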