Patents by Inventor Xiaoyan Pu

Xiaoyan Pu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20180101583
    Abstract: A user-defined function (UDF) is received in a central computer system, which causes registration of the UDF and distributes the UDF to a cluster of computer system nodes configured for performing, in volatile memory of the nodes, extract-transform-load processing of data cached in the volatile memory of the nodes. First and second job specifications that include the UDF are received by the central computer system, and the central computer system distributes instructions for the job specifications to the nodes including at least one instruction that invokes the UDF for loading and executing the UDF in the volatile memory of at least one of the nodes during runtime of the jobs. The central computer system does not cause registration of the UDF again after receiving the first job specification.
    Type: Application
    Filed: October 11, 2016
    Publication date: April 12, 2018
    Inventors: Yong Li, Ryan Pham, Xiaoyan Pu
  • Patent number: 9898337
    Abstract: An approach for deploying workload in a multi-tenancy computing environment is provided. The approach generates, by one or more computer processors, a tenant ID and a plan ID for a tenant based, at least in part, on a tenant registration request. The approach stores, by one or more computer processors, the tenant ID and the plan ID into a shared system record. The approach receives, by one or more computer processors, a request to update a first tenant service plan. The approach determines, by one or more computer processors, one or more resource pools supporting a second tenant service plan based, at least in part, on an association between the tenant ID and the plan ID. The approach deploys, by one or more computer processors, one or more resources from the one or more resource pools supporting the second tenant service plan.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: February 20, 2018
    Assignee: International Business Machines Corporation
    Inventors: Yong Li, Jean-Claude Mamou, David T. Meeks, Xiaoyan Pu
  • Patent number: 9832081
    Abstract: Provided are a computer program product, system, and method for allocating physical nodes for processes in an execution plan. An execution plan is generated indicating a plurality of processes. A resource requirement is generated indicating requested physical nodes and an assignment of the processes to execute on the requested physical nodes. A determination is made from the resource requirement of a resource allocation of physical nodes for the requested physical nodes and the processes. The execution plan is updated to generate an updated execution plan indicating the physical nodes on which the processes will execute according to the received resource allocation.
    Type: Grant
    Filed: May 29, 2015
    Date of Patent: November 28, 2017
    Assignee: International Business Machines Corporation
    Inventors: Krishna K. Bonagiri, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Patent number: 9805326
    Abstract: Embodiments presented herein provide task management capabilities for designing a complex data integration workflow in an integrated design environment (IDE). A task management tool of the IDE allows a developer to tag various stages of a data integration workflow in a non-linear manner. When the task management tool receives a tag for a given stage, the task management tool identifies incomplete tasks associated with the stage and generates a task list that includes the incomplete tasks. The developer may return to completing any of the tasks in the workflow in any sequence as desired.
    Type: Grant
    Filed: April 24, 2014
    Date of Patent: October 31, 2017
    Assignee: International Business Machines Corporation
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Publication number: 20170310564
    Abstract: Provided are a computer program product, system, and method for allocating physical nodes for processes in an execution plan. An execution plan is generated indicating a plurality of processes. A resource requirement is generated indicating requested physical nodes and an assignment of the processes to execute on the requested physical nodes. A determination is made from the resource requirement of a resource allocation of physical nodes for the requested physical nodes and the processes. The execution plan is updated to generate an updated execution plan indicating the physical nodes on which the processes will execute according to the received resource allocation.
    Type: Application
    Filed: July 12, 2017
    Publication date: October 26, 2017
    Inventors: Krishna K. Bonagiri, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Patent number: 9787761
    Abstract: Provided are a computer program product, system, and method for allocating physical nodes for processes in an execution plan. An execution plan is generated indicating a plurality of processes. A resource requirement is generated indicating requested physical nodes and an assignment of the processes to execute on the requested physical nodes. A determination is made from the resource requirement of a resource allocation of physical nodes for the requested physical nodes and the processes. The execution plan is updated to generate an updated execution plan indicating the physical nodes on which the processes will execute according to the received resource allocation.
    Type: Grant
    Filed: September 29, 2014
    Date of Patent: October 10, 2017
    Assignee: International Business Machines Corporation
    Inventors: Krishna K. Bonagiri, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Patent number: 9762672
    Abstract: Provided are techniques for improving data locality for parallel applications running in a big data distributed file system with a dynamic node group. In response to a consumer job starting to read one or more files in a big data distributed file system having multiple nodes, node group information for the one or more files to be read is retrieved, wherein the node group information identifies nodes from the multiple nodes on which a producer job wrote the one or more files, and the consumer job is assigned to the nodes identified by the node group information to allow for local reading of the one or more files by the consumer job.
    Type: Grant
    Filed: June 15, 2015
    Date of Patent: September 12, 2017
    Assignee: International Business Machines Corporation
    Inventors: Krishna K. Bonagiri, Eric A. Jacobson, Yong Li, Ron E. Liu, Xiaoyan Pu
  • Publication number: 20170140014
    Abstract: A method for extract transform load (ETL) input suggestions for an ETL system in which a current job is being created. A method includes: determining when a new input is made in the current job in the ETL system and dynamically receiving the new input which includes a connection between stages input or a property of a stage input; updating stored information relating to the current job with the new input; accessing rules which apply to the current job; analyzing and applying the rules based on the new input and the current job stored information to generate one or more suggested next inputs in the current job; providing a weighting for the one or more suggested next inputs based on the analysis and application of the rules; and providing a prompt in the current job in the ETL system with the suggested one or more next inputs and their weightings.
    Type: Application
    Filed: August 26, 2016
    Publication date: May 18, 2017
    Inventors: Joseph Bangs, Leonard D. Greenwood, Arron J. Harden, Xiaoyan Pu, Julian J. Vizor
  • Patent number: 9652308
    Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
    Type: Grant
    Filed: September 5, 2014
    Date of Patent: May 16, 2017
    Assignee: International Business Machines Corporation
    Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
  • Publication number: 20170075966
    Abstract: A system includes at least one processor and processes an ETL job. The system analyzes a specification of the ETL job including one or more functional expressions to load data from one or more source data stores, process the data in memory, and store the processed data to one or more target data stores. One or more data flows are produced from the specification based on the one or more functional expressions. The one or more data flows utilize in-memory distributed data sets generated to accommodate parallel processing for loading and processing the data. The one or more data flows are optimized to assign operations to be performed on the one or more source data stores. The optimized data flows are executed to load the data to the one or more target data stores in accordance with the specification. Present invention embodiments further include methods and computer program products.
    Type: Application
    Filed: August 3, 2016
    Publication date: March 16, 2017
    Inventors: Lawrence A. Greene, Yong Li, Xiaoyan Pu, Yeh-Heng Sheng
  • Publication number: 20170075964
    Abstract: A system includes at least one processor and processes an ETL job. The system analyzes a specification of the ETL job including one or more functional expressions to load data from one or more source data stores, process the data in memory, and store the processed data to one or more target data stores. One or more data flows are produced from the specification based on the one or more functional expressions. The one or more data flows utilize in-memory distributed data sets generated to accommodate parallel processing for loading and processing the data. The one or more data flows are optimized to assign operations to be performed on the one or more source data stores. The optimized data flows are executed to load the data to the one or more target data stores in accordance with the specification. Present invention embodiments further include methods and computer program products.
    Type: Application
    Filed: September 11, 2015
    Publication date: March 16, 2017
    Inventors: Lawrence A. Greene, Yong Li, Xiaoyan Pu, Yeh-Heng Sheng
  • Patent number: 9594637
    Abstract: System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system.
    Type: Grant
    Filed: March 25, 2014
    Date of Patent: March 14, 2017
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Patent number: 9542246
    Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
    Type: Grant
    Filed: May 20, 2015
    Date of Patent: January 10, 2017
    Assignee: International Business Machines Corporation
    Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
  • Publication number: 20160366224
    Abstract: Provided are techniques for improving data locality for parallel applications running in a big data distributed file system with a dynamic node group. In response to a consumer job starting to read one or more files in a big data distributed file system having multiple nodes, node group information for the one or more files to be read is retrieved, wherein the node group information identifies nodes from the multiple nodes on which a producer job wrote the one or more files, and the consumer job is assigned to the nodes identified by the node group information to allow for local reading of the one or more files by the consumer job.
    Type: Application
    Filed: June 15, 2015
    Publication date: December 15, 2016
    Inventors: Krishna K. Bonagiri, Eric A. Jacobson, Yong Li, Ron E. Liu, Xiaoyan Pu
  • Publication number: 20160350201
    Abstract: The present disclosure provides a method and apparatus for re-using existing data flow design jobs in a data integration design environment (IDE). An example method generally includes receiving input placing one or more data flow operators on a design canvas of the IDE, searching a database of existing data flow job designs for existing data flow job designs that include the one or more data flow operators, and displaying a list of the existing data flow job designs that include the one or more data flow operators.
    Type: Application
    Filed: May 27, 2015
    Publication date: December 1, 2016
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Patent number: 9507592
    Abstract: A request for analysis of a data integration job is received that includes one or more features and criteria for the analysis. Each feature is extracted from a job model representing the job by invoking a corresponding analytical rule for each feature. The analytical rule includes one or more operations and invoking the analytical rule performs the operations to analyze one or more job components associated with the corresponding feature as represented in the job model and to extract information pertaining to that feature.
    Type: Grant
    Filed: April 28, 2015
    Date of Patent: November 29, 2016
    Assignee: International Business Machines Corporation
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Patent number: 9501377
    Abstract: Methods of managing performance of data integration are described. A performance analyzer may receive data about a data integration job execution. The performance analyzer may determine whether there is a performance issue of the data integration job execution. The performance analyzer analyzes the data about the data integration job execution when there is a performance issue. The performance analyzer generates a job execution design recommendation based on the analysis of the data and a set of predefined recommendation rules. The performance analyzer then displays the data about the data integration job execution and, when there is a generated job execution design recommendation, displays the job execution design recommendation.
    Type: Grant
    Filed: March 18, 2014
    Date of Patent: November 22, 2016
    Assignee: International Business Machines Corporation
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Publication number: 20160314056
    Abstract: At least one application in a computing environment is executed and one or more performance metrics of the application are measured. The measured performance metrics are analyzed and an operational performance regression is detected. The detected operational performance regression is correlated with one or more recorded changes and the correlated changes are identified as a cause of the operational performance regression. The elements of the computing environment are altered in accordance with the identified changes to adjust operational performance.
    Type: Application
    Filed: April 23, 2015
    Publication date: October 27, 2016
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Patent number: 9477511
    Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
    Type: Grant
    Filed: August 14, 2013
    Date of Patent: October 25, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Patent number: 9477512
    Abstract: System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment.
    Type: Grant
    Filed: September 12, 2014
    Date of Patent: October 25, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu