Patents by Inventor Shyam R. Mudambi

Shyam R. Mudambi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10248694
    Abstract: A computer-implemented method includes inserting a bloom filter creation stage after an inner data source identification stage, wherein a join operation is to be performed to join an outer data source with the inner data source. The method inserts a bloom filter search stage after an outer data source identification stage, wherein each row of data from the outer data source is searched against a bloom filter for the inner data source during the bloom filter search stage. The method initializes a read on the inner data source. Subsequent to determining the bloom filter creation stage is complete, the method initializes a read on the outer data source. The method performs the join operation at a join stage.
    Type: Grant
    Filed: August 31, 2015
    Date of Patent: April 2, 2019
    Assignee: International Business Machines Corporation
    Inventors: Manish A. Bhide, Shyam R. Mudambi, Sriram K. Padmanabhan, Vivek S. Tirumalaraju
  • Patent number: 10242063
    Abstract: A computer-implemented method includes inserting a bloom filter creation stage after an inner data source identification stage, wherein a join operation is to be performed to join an outer data source with the inner data source. The method inserts a bloom filter search stage after an outer data source identification stage, wherein each row of data from the outer data source is searched against a bloom filter for the inner data source during the bloom filter search stage. The method initializes a read on the inner data source. Subsequent to determining the bloom filter creation stage is complete, the method initializes a read on the outer data source. The method performs the join operation at a join stage.
    Type: Grant
    Filed: July 20, 2016
    Date of Patent: March 26, 2019
    Assignee: International Business Machines Corporation
    Inventors: Manish A. Bhide, Shyam R. Mudambi, Sriram K. Padmanabhan, Vivek S. Tirumalaraju
  • Patent number: 10114878
    Abstract: A computer manages methods for utilizing an index to manage access to data in a dataset stored in one or more file locations in an ETL tool by receiving a request to access a dataset associated with one or more file locations, wherein the dataset is stored in the one or more file locations. The computer queries an index for the one or more file locations associated with the dataset, wherein the dataset has another index for data in the dataset. The computer receives the one or more file locations associated with the dataset. The computer determines to cache the request to access the one or more file locations for the dataset until one or more thresholds are met, wherein the cached request is part of a total number of cached requests.
    Type: Grant
    Filed: December 16, 2013
    Date of Patent: October 30, 2018
    Assignee: International Business Machines Corporation
    Inventors: Manish A. Bhide, Jean-Claude Mamou, Shyam R. Mudambi
  • Patent number: 9594637
    Abstract: System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system.
    Type: Grant
    Filed: March 25, 2014
    Date of Patent: March 14, 2017
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Publication number: 20170060967
    Abstract: A computer-implemented method includes inserting a bloom filter creation stage after an inner data source identification stage, wherein a join operation is to be performed to join an outer data source with the inner data source. The method inserts a bloom filter search stage after an outer data source identification stage, wherein each row of data from the outer data source is searched against a bloom filter for the inner data source during the bloom filter search stage. The method initializes a read on the inner data source. Subsequent to determining the bloom filter creation stage is complete, the method initializes a read on the outer data source. The method performs the join operation at a join stage.
    Type: Application
    Filed: August 31, 2015
    Publication date: March 2, 2017
    Inventors: Manish A. Bhide, Shyam R. Mudambi, Sriram K. Padmanabhan, Vivek S. Tirumalaraju
  • Publication number: 20170060970
    Abstract: A computer-implemented method includes inserting a bloom filter creation stage after an inner data source identification stage, wherein a join operation is to be performed to join an outer data source with the inner data source. The method inserts a bloom filter search stage after an outer data source identification stage, wherein each row of data from the outer data source is searched against a bloom filter for the inner data source during the bloom filter search stage. The method initializes a read on the inner data source. Subsequent to determining the bloom filter creation stage is complete, the method initializes a read on the outer data source. The method performs the join operation at a join stage.
    Type: Application
    Filed: July 20, 2016
    Publication date: March 2, 2017
    Inventors: Manish A. Bhide, Shyam R. Mudambi, Sriram K. Padmanabhan, Vivek S. Tirumalaraju
  • Patent number: 9477511
    Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
    Type: Grant
    Filed: August 14, 2013
    Date of Patent: October 25, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Patent number: 9477512
    Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
    Type: Grant
    Filed: September 12, 2014
    Date of Patent: October 25, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Patent number: 9424160
    Abstract: Data flow disruptions over a series of data processing operators can be detected by a computer system that generates a profile for data flow at an operator. The profile can include data input, processing, and output wait times. Using the profile, the system can detect potential flow disruptions. If the potential disruption satisfies a rule, it is considered a data flow disruption and a recommendation associated with the satisfied rule is identified. The recommendation and the operator identity are displayed.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: August 23, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian K. Caufield, Lawrence A. Greene, Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu, Dong J. Wei
  • Patent number: 9401835
    Abstract: Techniques are disclosed for data integration on retargetable engines in a networked environment. The networked environment includes data processing engines of different types and having different sets of characteristics. A request is received to execute a data flow model in the networked environment. The data flow model includes data flow objects. A first data processing engine is programmatically selected based on a predefined set of criteria and the sets of characteristics of the data processing engines. The data flow model is executed using the selected data processing engine and responsive to the request.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: July 26, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Patent number: 9323619
    Abstract: System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: April 26, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Patent number: 9262205
    Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
    Type: Grant
    Filed: March 25, 2014
    Date of Patent: February 16, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Patent number: 9256460
    Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: February 9, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Publication number: 20150269006
    Abstract: Data flow disruptions over a series of data processing operators can be detected by a computer system that generates a profile for data flow at an operator. The profile can include data input, processing, and output wait times. Using the profile, the system can detect potential flow disruptions. If the potential disruption satisfies a rule, it is considered a data flow disruption and a recommendation associated with the satisfied rule is identified. The recommendation and the operator identity are displayed.
    Type: Application
    Filed: March 27, 2015
    Publication date: September 24, 2015
    Inventors: Brian K. Caufield, Lawrence A. Greene, Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu, Dong J. Wei
  • Publication number: 20150169712
    Abstract: A computer manages methods for utilizing an index to manage access to data in a dataset stored in one or more file locations in an ETL tool by receiving a request to access a dataset associated with one or more file locations, wherein the dataset is stored in the one or more file locations. The computer queries an index for the one or more file locations associated with the dataset, wherein the dataset has another index for data in the dataset. The computer receives the one or more file locations associated with the dataset. The computer determines to cache the request to access the one or more file locations for the dataset until one or more thresholds are met, wherein the cached request is part of a total number of cached requests.
    Type: Application
    Filed: December 16, 2013
    Publication date: June 18, 2015
    Applicant: International Business Machines Corporation
    Inventors: Manish A. Bhide, Jean-Claude Mamou, Shyam R. Mudambi
  • Publication number: 20150074669
    Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
    Type: Application
    Filed: September 12, 2014
    Publication date: March 12, 2015
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Publication number: 20150052530
    Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
    Type: Application
    Filed: August 14, 2013
    Publication date: February 19, 2015
    Applicant: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Publication number: 20140281704
    Abstract: System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system.
    Type: Application
    Filed: March 25, 2014
    Publication date: September 18, 2014
    Applicant: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Publication number: 20140282604
    Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
    Type: Application
    Filed: March 15, 2013
    Publication date: September 18, 2014
    Applicant: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Publication number: 20140282605
    Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
    Type: Application
    Filed: March 25, 2014
    Publication date: September 18, 2014
    Applicant: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
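
Several abstracts in this listing (for example, patent 10248694) describe a join in which a bloom filter is built from the inner data source and each outer row is searched against that filter before the join stage. The following is a minimal in-memory sketch of that general idea in Python, not the patented method itself. The names BloomFilter, bloom_join, num_bits, and num_hashes are hypothetical, the filter sizes are arbitrary, and plain Python lists of dicts stand in for the data sources and pipeline stages.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter (hypothetical helper, sized arbitrarily)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter packs bits.
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive num_hashes positions by salting a SHA-256 hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_join(inner_rows, outer_rows, key):
    """Inner-join two lists of dicts on `key`, pre-filtering outer rows."""
    # Creation stage: read the inner source fully and build the filter.
    bloom = BloomFilter()
    inner_by_key = {}
    for row in inner_rows:
        bloom.add(row[key])
        inner_by_key.setdefault(row[key], []).append(row)

    # Search stage: only after the filter is complete, read the outer
    # source and drop rows whose key is definitely not in the inner source.
    candidates = [row for row in outer_rows if bloom.might_contain(row[key])]

    # Join stage: an exact match removes any bloom-filter false positives.
    return [
        {**outer, **inner}
        for outer in candidates
        for inner in inner_by_key.get(outer[key], [])
    ]


if __name__ == "__main__":
    inner = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    outer = [{"id": 2, "qty": 5}, {"id": 3, "qty": 7}]
    print(bloom_join(inner, outer, "id"))
```

Because a bloom filter can report false positives but never false negatives, the search stage can only discard outer rows that cannot match; the exact lookup in the join stage keeps the final result correct.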