Patents by Inventor Xiaoyan Pu

Xiaoyan Pu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9471652
    Abstract: Methods and systems are provided for extract transform load (ETL) input suggestion for an ETL system in which a current job is being created. A method includes: determining when a new input is made in the current job in the ETL system and dynamically receiving the new input; updating stored information relating to the current job with the new input; accessing rules which apply to the current job; analyzing and applying the rules based on the new input and the current job stored information to generate one or more suggested next inputs in the current job; providing a weighting for the one or more suggested next inputs based on the analysis and application of the rules; and providing a prompt in the current job in the ETL system with the suggested one or more next inputs and their weightings.
    Type: Grant
    Filed: November 18, 2015
    Date of Patent: October 18, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Joseph Bangs, Leonard D. Greenwood, Arron J. Harden, Xiaoyan Pu, Julian J. Vizor
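
    A minimal, hypothetical sketch of the rule-based next-input suggestion with weighting described in patent 9471652 above; the Rule structure, stage names, and weights are invented for illustration and are not the patented implementation.

```python
# Hypothetical sketch: rule-based suggestion of the next ETL input, with weighting.
from dataclasses import dataclass

@dataclass
class Rule:
    """A suggestion rule: if the last stage matches a trigger, suggest a next stage."""
    trigger: str      # stage type that activates the rule
    suggestion: str   # stage type to suggest next
    weight: float     # base confidence weight for the suggestion

def suggest_next_inputs(current_stages, rules):
    """Return suggested next stages with weights, highest weight first."""
    if not current_stages:
        return []
    last = current_stages[-1]
    scores = {}
    for rule in rules:
        if rule.trigger == last:
            # Accumulate weight if several rules agree on the same suggestion.
            scores[rule.suggestion] = scores.get(rule.suggestion, 0.0) + rule.weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    rules = [
        Rule("db2_connector", "transformer", 0.6),
        Rule("db2_connector", "lookup", 0.3),
        Rule("transformer", "sequential_file", 0.7),
    ]
    current_job = ["db2_connector"]
    print(suggest_next_inputs(current_job, rules))  # [('transformer', 0.6), ('lookup', 0.3)]
```
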
  • Publication number: 20160283273
    Abstract: An approach for deploying workload in a multi-tenancy computing environment is provided. The approach generates, by one or more computer processors, a tenant ID and a plan ID for a tenant based, at least in part, on a tenant registration request. The approach stores, by one or more computer processors, the tenant ID and the plan ID into a shared system record. The approach receives, by one or more computer processors, a request to update a first tenant service plan. The approach determines, by one or more computer processors, one or more resource pools supporting a second tenant service plan based, at least in part, on an association between the tenant ID and the plan ID. The approach deploys, by one or more computer processors, one or more resources from the one or more resource pools supporting the second tenant service plan.
    Type: Application
    Filed: March 27, 2015
    Publication date: September 29, 2016
    Inventors: Yong Li, Jean-Claude Mamou, David T. Meeks, Xiaoyan Pu
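
    A minimal sketch of the registration and plan-based deployment flow described in this application; the shared_system_record dictionary, plan names, and resource pool contents are illustrative placeholders, not part of the patent.

```python
# Hypothetical sketch: tenant registration and plan-based resource deployment.
import uuid

shared_system_record = {}          # tenant_id -> plan_id (stands in for the shared record)
resource_pools = {                 # plan_id -> resource pool supporting that plan
    "basic":   ["node-small-1", "node-small-2"],
    "premium": ["node-large-1", "node-large-2", "node-large-3"],
}

def register_tenant(plan_id):
    """Generate a tenant ID, associate it with a plan ID, and store the pair."""
    tenant_id = str(uuid.uuid4())
    shared_system_record[tenant_id] = plan_id
    return tenant_id

def update_service_plan(tenant_id, new_plan_id):
    """Switch a tenant to a new plan and deploy resources from that plan's pools."""
    shared_system_record[tenant_id] = new_plan_id
    pool = resource_pools[new_plan_id]
    return pool[:2]                # deploy some resources from the supporting pool

tenant = register_tenant("basic")
print(update_service_plan(tenant, "premium"))
```
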
  • Publication number: 20160283275
    Abstract: An approach for deploying workload in a multi-tenancy computing environment is provided. The approach generates, by one or more computer processors, a tenant ID and a plan ID for a tenant based, at least in part, on a tenant registration request. The approach stores, by one or more computer processors, the tenant ID and the plan ID into a shared system record. The approach receives, by one or more computer processors, a request to update a first tenant service plan. The approach determines, by one or more computer processors, one or more resource pools supporting a second tenant service plan based, at least in part, on an association between the tenant ID and the plan ID. The approach deploys, by one or more computer processors, one or more resources from the one or more resource pools supporting the second tenant service plan.
    Type: Application
    Filed: February 29, 2016
    Publication date: September 29, 2016
    Inventors: Yong Li, Jean-Claude Mamou, David T. Meeks, Xiaoyan Pu
  • Patent number: 9424160
    Abstract: Data flow disruptions over a series of data processing operators can be detected by a computer system that generates a profile for data flow at an operator. The profile can include data input, processing, and output wait times. Using the profile, the system can detect potential flow disruptions. If the potential disruption satisfies a rule, it is considered a data flow disruption and a recommendation associated with the satisfied rule is identified. The recommendation and the operator identity are displayed.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: August 23, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian K. Caufield, Lawrence A. Greene, Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu, Dong J. Wei
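
    A rough illustration of the profiling and rule-matching idea in patent 9424160 above, not the patented system; the thresholds and recommendation strings below are assumptions.

```python
# Hypothetical sketch: per-operator wait-time profiles checked against disruption rules.
from dataclasses import dataclass

@dataclass
class OperatorProfile:
    name: str
    input_wait: float        # seconds spent waiting for upstream data
    processing_time: float   # seconds spent doing work
    output_wait: float       # seconds spent waiting on downstream consumers

# Each rule pairs a predicate over the profile with a recommendation to display.
RULES = [
    (lambda p: p.input_wait > 2 * p.processing_time,
     "Upstream bottleneck: consider repartitioning or adding upstream parallelism."),
    (lambda p: p.output_wait > 2 * p.processing_time,
     "Downstream back-pressure: consider buffering or speeding up the consumer."),
]

def detect_disruptions(profiles):
    """Return (operator name, recommendation) pairs for every satisfied rule."""
    findings = []
    for profile in profiles:
        for predicate, recommendation in RULES:
            if predicate(profile):
                findings.append((profile.name, recommendation))
    return findings

profiles = [OperatorProfile("join_1", input_wait=9.0, processing_time=2.0, output_wait=0.5)]
for operator, advice in detect_disruptions(profiles):
    print(f"{operator}: {advice}")
```
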
  • Patent number: 9401835
    Abstract: Techniques are disclosed for data integration on retargetable engines in a networked environment. The networked environment includes data processing engines of different types and having different sets of characteristics. A request is received to execute a data flow model in the networked environment. The data flow model includes data flow objects. A first data processing engine is programmatically selected based on a predefined set of criteria and the sets of characteristics of the data processing engines. The data flow model is executed using the selected data processing engine and responsive to the request.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: July 26, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
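
    The engine-selection step might look like the sketch below; the engine names, characteristic keys, and criteria are invented placeholders rather than the actual engine registry.

```python
# Hypothetical sketch: choose a data processing engine whose characteristics satisfy the criteria.
ENGINES = {
    "parallel_engine":  {"max_data_gb": 500,   "supports_streaming": False},
    "hadoop_engine":    {"max_data_gb": 50000, "supports_streaming": False},
    "streaming_engine": {"max_data_gb": 100,   "supports_streaming": True},
}

def select_engine(criteria):
    """Pick the first engine whose characteristics satisfy every criterion."""
    for name, traits in ENGINES.items():
        if traits["max_data_gb"] >= criteria["data_gb"] and \
           traits["supports_streaming"] == criteria["streaming"]:
            return name
    raise RuntimeError("no suitable engine for the requested data flow")

def run_flow(flow_objects, criteria):
    """Select an engine, then execute the data flow objects on it."""
    engine = select_engine(criteria)
    print(f"executing {len(flow_objects)} data flow objects on {engine}")

run_flow(["extract", "transform", "load"], {"data_gb": 2000, "streaming": False})
```
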
  • Patent number: 9372690
    Abstract: A request for analysis of a data integration job is received that includes one or more features and criteria for the analysis. Each feature is extracted from a job model representing the job by invoking a corresponding analytical rule for each feature. The analytical rule includes one or more operations and invoking the analytical rule performs the operations to analyze one or more job components associated with the corresponding feature as represented in the job model and to extract information pertaining to that feature.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: June 21, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
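
    A small sketch of feature extraction by analytical rules, assuming a toy job model; the rule names (stage_count, connector_types, fan_out) and the job-model layout are illustrative only, not the patented rule set.

```python
# Hypothetical sketch: invoke one analytical rule per requested feature of a job model.
job_model = {
    "stages": [
        {"name": "src", "type": "db2_connector"},
        {"name": "xfm", "type": "transformer"},
        {"name": "tgt", "type": "sequential_file"},
    ],
    "links": [("src", "xfm"), ("xfm", "tgt")],
}

# Each analytical rule is a set of operations over the job model for one feature.
ANALYTICAL_RULES = {
    "stage_count": lambda model: len(model["stages"]),
    "connector_types": lambda model: sorted({s["type"] for s in model["stages"]}),
    "fan_out": lambda model: max(
        sum(1 for a, _ in model["links"] if a == s["name"]) for s in model["stages"]),
}

def analyze_job(model, features):
    """Invoke the corresponding analytical rule for each requested feature."""
    return {feature: ANALYTICAL_RULES[feature](model) for feature in features}

print(analyze_job(job_model, ["stage_count", "connector_types"]))
```
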
  • Publication number: 20160127382
    Abstract: A method includes a workload management (WLM) server that receives a first CHECK WORKLOAD command for a workload in a queue of the WLM server. It may be determined whether the workload is ready to run on a WLM client. If the workload is not ready to run, a wait time for the workload with the WLM server is dynamically estimated. The wait time is sent to the WLM client. If the workload is ready to run, then a response is sent to the WLM client that the workload is ready to run.
    Type: Application
    Filed: January 15, 2016
    Publication date: May 5, 2016
    Inventors: Yong Li, Hanson Lieu, Ron Liu, Xiaoyan Pu
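
    One plausible reading of the CHECK WORKLOAD exchange is sketched below; the queueing policy and the wait-time formula (queued batches ahead times average runtime) are assumptions made for illustration, not the patented estimator.

```python
# Hypothetical sketch: CHECK WORKLOAD handling with dynamic wait-time estimation.
from collections import deque

class WLMServer:
    def __init__(self, avg_runtime_sec=60.0, run_slots=4):
        self.queue = deque()               # queued workload IDs, FIFO
        self.avg_runtime_sec = avg_runtime_sec
        self.run_slots = run_slots

    def submit(self, workload_id):
        self.queue.append(workload_id)

    def check_workload(self, workload_id):
        """Return ('RUN', 0) if ready to run, else ('WAIT', estimated seconds)."""
        position = list(self.queue).index(workload_id)
        if position < self.run_slots:
            self.queue.remove(workload_id)
            return "RUN", 0.0
        # Dynamically estimate the wait: batches ahead of this workload times average runtime.
        batches_ahead = (position - self.run_slots) // self.run_slots + 1
        return "WAIT", batches_ahead * self.avg_runtime_sec

server = WLMServer()
for i in range(10):
    server.submit(f"job-{i}")
print(server.check_workload("job-7"))   # ('WAIT', 60.0)
print(server.check_workload("job-0"))   # ('RUN', 0.0)
```
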
  • Patent number: 9323619
    Abstract: System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: April 26, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
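
    A compact sketch of turning an execution plan into an ordered set of configuration and execution tasks and assigning them to compute nodes; the round-robin assignment and the task naming are illustrative assumptions, not the claimed method.

```python
# Hypothetical sketch: ordered configuration/execution tasks assigned across compute nodes.
from itertools import cycle

def build_ordered_tasks(execution_plan):
    """Configuration tasks come first, then execution tasks, preserving plan order."""
    config_tasks = [f"configure:{step}" for step in execution_plan]
    exec_tasks = [f"execute:{step}" for step in execution_plan]
    return config_tasks + exec_tasks

def assign_tasks(tasks, compute_nodes):
    """Round-robin assignment of the ordered tasks across the compute nodes."""
    assignment = {node: [] for node in compute_nodes}
    for task, node in zip(tasks, cycle(compute_nodes)):
        assignment[node].append(task)
    return assignment

plan = ["read_hdfs", "sort", "aggregate", "write_hdfs"]
tasks = build_ordered_tasks(plan)
print(assign_tasks(tasks, ["node1", "node2"]))
```
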
  • Patent number: 9304814
    Abstract: A method includes a workload management (WLM) server that receives a first CHECK WORKLOAD command for a workload in a queue of the WLM server. It may be determined whether the workload is ready to run on a WLM client. If the workload is not ready to run, a wait time for the workload with the WLM server is dynamically estimated. The wait time is sent to the WLM client. If the workload is ready to run, then a response is sent to the WLM client that the workload is ready to run.
    Type: Grant
    Filed: April 30, 2013
    Date of Patent: April 5, 2016
    Assignee: International Business Machines Corporation
    Inventors: Yong Li, Hanson Lieu, Ron Liu, Xiaoyan Pu
  • Publication number: 20160094417
    Abstract: Provided are a computer program product, system, and method for allocating physical nodes for processes in an execution plan. An execution plan is generated indicating a plurality of processes. A resource requirement is generated indicating requested physical nodes and an assignment of the processes to execute on the requested physical nodes. A determination is made from the resource requirement of a resource allocation of physical nodes for the requested physical nodes and the processes. The execution plan is updated to generate an updated execution plan indicating the physical nodes on which the processes will execute according to the received resource allocation.
    Type: Application
    Filed: May 29, 2015
    Publication date: March 31, 2016
    Inventors: Krishna K. Bonagiri, Eric A. Jacobson, Yong Li, Xiaoyan Pu
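
    The allocation flow could be sketched as below, with assumed shapes for the execution plan, resource requirement, and allocation; the slot naming and round-robin assignment are invented for the example.

```python
# Hypothetical sketch: update an execution plan with physical nodes from a resource allocation.
def build_resource_requirement(processes, nodes_requested):
    """Request physical nodes and assign processes round-robin to logical node slots."""
    return {
        "nodes_requested": nodes_requested,
        "assignment": {p: f"slot{i % nodes_requested}" for i, p in enumerate(processes)},
    }

def allocate(requirement, available_nodes):
    """Map each requested logical slot to a physical node from the cluster."""
    slots = sorted({slot for slot in requirement["assignment"].values()})
    return dict(zip(slots, available_nodes))

def update_execution_plan(plan, requirement, allocation):
    """Rewrite the plan so every process names the physical node it will execute on."""
    return {p: allocation[requirement["assignment"][p]] for p in plan}

plan = ["import", "partition", "transform", "export"]
requirement = build_resource_requirement(plan, nodes_requested=2)
allocation = allocate(requirement, ["host-a.example.com", "host-b.example.com"])
print(update_execution_plan(plan, requirement, allocation))
```
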
  • Publication number: 20160094415
    Abstract: Provided are a computer program product, system, and method for allocating physical nodes for processes in an execution plan. An execution plan is generated indicating a plurality of processes. A resource requirement is generated indicating requested physical nodes and an assignment of the processes to execute on the requested physical nodes. A determination is made from the resource requirement of a resource allocation of physical nodes for the requested physical nodes and the processes. The execution plan is updated to generate an updated execution plan indicating the physical nodes on which the processes will execute according to the received resource allocation.
    Type: Application
    Filed: September 29, 2014
    Publication date: March 31, 2016
    Inventors: Krishna K. Bonagiri, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Patent number: 9294122
    Abstract: According to one embodiment of the present invention, a system selectively compresses data fields in a parallel data flow. The system identifies within an execution plan for the parallel data flow a first instance of a data field within a stage of the parallel data flow. The system traces the identified data field through stages of the parallel data flow and determines a score value for the identified data field based on operations performed on the identified data field during traversal of the stages. The system compresses the identified data field based on the score value indicating a performance gain with respect to the compressed data field. Embodiments of the present invention further include a method and computer program product for selectively compressing data fields in a parallel data flow in substantially the same manners described above.
    Type: Grant
    Filed: March 11, 2015
    Date of Patent: March 22, 2016
    Assignee: International Business Machines Corporation
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
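
    A hedged sketch of the field-tracing and scoring idea in patent 9294122 above; the per-operation scores and the compression threshold are invented values, chosen only to show how a score could indicate a likely performance gain.

```python
# Hypothetical sketch: trace a field through the stages, score it, and decide on compression.
# Assumed scores: operations that only move the field favor compression,
# operations that must read or rewrite its value favor leaving it uncompressed.
OPERATION_SCORES = {
    "copy": +2,              # field is passed through untouched
    "sort_other_key": +1,    # field carried along but not inspected
    "filter_on_field": -2,   # field value must be inspected
    "transform_field": -3,   # field value must be rewritten
}

def trace_field(execution_plan, field_name):
    """Collect the operations applied to the field across all stages."""
    return [op for stage in execution_plan for (fld, op) in stage if fld == field_name]

def should_compress(execution_plan, field_name, threshold=0):
    """Compress only if the accumulated score indicates a performance gain."""
    score = sum(OPERATION_SCORES[op] for op in trace_field(execution_plan, field_name))
    return score > threshold

plan = [
    [("description", "copy")],
    [("description", "sort_other_key")],
    [("description", "copy")],
]
print(should_compress(plan, "description"))   # True: the field is mostly passed through
```
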
  • Publication number: 20160070607
    Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
    Type: Application
    Filed: September 5, 2014
    Publication date: March 10, 2016
    Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
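
    The producer/consumer handshake through a descriptor registry might look like this minimal sketch; the in-memory registry dict, file paths, and descriptor fields are placeholders for whatever shared metadata service the patent contemplates.

```python
# Hypothetical sketch: share a partitioned data set via a registered descriptor.
registry = {}    # stands in for a shared metadata registry

def produce(data, name, partitions):
    """Producer: partition the data, note where each partition lives, register a descriptor."""
    parts = [data[i::partitions] for i in range(partitions)]
    files = [f"/shared/{name}.part{i}" for i in range(partitions)]
    descriptor = {"name": name, "partitions": partitions,
                  "files": files, "partitioning": "round_robin"}
    registry[name] = descriptor
    return parts, descriptor

def consume(name):
    """Consumer: look up the descriptor to learn how to process the partitions."""
    descriptor = registry[name]
    print(f"reading {descriptor['partitions']} partitions "
          f"({descriptor['partitioning']}) from {descriptor['files']}")

produce(list(range(10)), "orders", partitions=2)
consume("orders")
```
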
  • Publication number: 20160070608
    Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
    Type: Application
    Filed: May 20, 2015
    Publication date: March 10, 2016
    Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
  • Publication number: 20160062790
    Abstract: A request for analysis of a data integration job is received that includes one or more features and criteria for the analysis. Each feature is extracted from a job model representing the job by invoking a corresponding analytical rule for each feature. The analytical rule includes one or more operations and invoking the analytical rule performs the operations to analyze one or more job components associated with the corresponding feature as represented in the job model and to extract information pertaining to that feature.
    Type: Application
    Filed: April 28, 2015
    Publication date: March 3, 2016
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Publication number: 20160062767
    Abstract: A request for analysis of a data integration job is received that includes one or more features and criteria for the analysis. Each feature is extracted from a job model representing the job by invoking a corresponding analytical rule for each feature. The analytical rule includes one or more operations and invoking the analytical rule performs the operations to analyze one or more job components associated with the corresponding feature as represented in the job model and to extract information pertaining to that feature.
    Type: Application
    Filed: September 3, 2014
    Publication date: March 3, 2016
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
  • Patent number: 9262210
    Abstract: A method, computer program product and system for workload management for an Extract, Transform, and Load (ETL) system. A priority of each workload in a set of workloads is determined using a priority rule. In response to determining that the priority of a workload to be checked has a highest priority, it is indicated that the workload has the highest priority. It is determined whether at least one logical resource representing an ETL metric is available for executing the workload. In response to determining that the workload has the highest priority and that the at least one logical resource is available, it is determined that the workload is runnable.
    Type: Grant
    Filed: June 29, 2012
    Date of Patent: February 16, 2016
    Assignee: International Business Machines Corporation
    Inventors: Brian K. Caufield, Yong Li, Xiaoyan Pu
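
    A small illustration of combining a priority rule with logical-resource checks to decide runnability; the specific priority rule (business tier, then deadline) and the logical resource names are assumptions, not the claimed rule.

```python
# Hypothetical sketch: a workload is runnable only if it has the highest priority
# and its logical ETL resources are available.
def priority(workload):
    """Illustrative priority rule: higher business tier first, then tighter deadline."""
    return (workload["tier"], -workload["deadline_hours"])

def is_runnable(workload, workloads, logical_resources):
    """Check highest priority, then availability of each required logical resource."""
    highest = max(workloads, key=priority)
    if workload is not highest:
        return False
    return all(logical_resources.get(r, 0) > 0 for r in workload["needs"])

logical_resources = {"db_connections": 3, "sort_memory_gb": 0}
workloads = [
    {"name": "nightly_load", "tier": 2, "deadline_hours": 4,  "needs": ["db_connections"]},
    {"name": "adhoc_report", "tier": 1, "deadline_hours": 24, "needs": ["sort_memory_gb"]},
]
print(is_runnable(workloads[0], workloads, logical_resources))   # True
```
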
  • Patent number: 9262205
    Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
    Type: Grant
    Filed: March 25, 2014
    Date of Patent: February 16, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
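
    A minimal sketch of qualified checkpointing in which only links meeting a criterion receive a checkpoint, so at least one link runs without one; the row-count criterion and the memory/disk checkpoint types are illustrative assumptions.

```python
# Hypothetical sketch: assign checkpoints only to links that satisfy the checkpoint criteria.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Link:
    source: str
    target: str
    estimated_rows: int
    checkpoint: Optional[str] = None    # None means no checkpoint on this link

def assign_checkpoints(links, row_threshold=1_000_000):
    """Checkpoint only expensive links; cheap links run with no checkpoint assigned."""
    for link in links:
        if link.estimated_rows >= row_threshold:
            # Choose a checkpoint type for the qualifying link.
            link.checkpoint = "disk" if link.estimated_rows >= 10_000_000 else "memory"
    return links

links = [
    Link("extract", "transform", estimated_rows=20_000_000),
    Link("transform", "load", estimated_rows=50_000),
]
for link in assign_checkpoints(links):
    print(link.source, "->", link.target, "checkpoint:", link.checkpoint)
```
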
  • Patent number: 9256460
    Abstract: Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: February 9, 2016
    Assignee: International Business Machines Corporation
    Inventors: Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu
  • Publication number: 20150311915
    Abstract: According to one embodiment of the present invention, a system selectively compresses data fields in a parallel data flow. The system identifies within an execution plan for the parallel data flow a first instance of a data field within a stage of the parallel data flow. The system traces the identified data field through stages of the parallel data flow and determines a score value for the identified data field based on operations performed on the identified data field during traversal of the stages. The system compresses the identified data field based on the score value indicating a performance gain with respect to the compressed data field. Embodiments of the present invention further include a method and computer program product for selectively compressing data fields in a parallel data flow in substantially the same manners described above.
    Type: Application
    Filed: March 11, 2015
    Publication date: October 29, 2015
    Inventors: Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu