Patents by Inventor Brian K. Caufield
Brian K. Caufield has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9996389
Abstract: Embodiments presented herein provide techniques for optimizing parallel data flows of a batch processing job using a profile of the processing job. An application retrieves a job profile for a processing job. The processing job has a plurality of processing stages specified in an execution profile. The job profile includes statistical data for at least one of the processing stages obtained during prior executions of the job. The application modifies properties of the execution profile based on the job profile to optimize the execution of the job. The application executes the processing job with the modified execution profile.
Type: Grant
Filed: March 11, 2014
Date of Patent: June 12, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Brian K. Caufield, Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
-
Patent number: 9990412
Abstract: A data driven parallel sorting method includes distributing input data records to n partitions one by one in a circular manner. Each partition corresponds to a parallel sorting process with an allocated memory chunk sized to store m data records. The method also includes sorting, in parallel, current data records in respective memory chunks in respective partitions. The method also includes in response to distribution of data records of ?m/n? rounds, circularly controlling one of the n partitions, and writing data records that have been sorted in the memory chunk of the partition into a mass storage as an ordered data chunk, and emptying the memory chunk. The method also includes in response to all data records being distributed, writing data chunks that have been sorted in respective memory chunks into the mass storage, and performing a merge sort on all ordered data chunks in the mass storage.
Type: Grant
Filed: April 28, 2014
Date of Patent: June 5, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Brian K. Caufield, Ron E. Liu, Dong J. Wei, Xin Ying Yang
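The steps in this abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the patented implementation: the function name `chunked_round_robin_sort` is hypothetical, the "mass storage" is an in-memory list, the partitions are processed one after another rather than in parallel, and the rounding of m/n (garbled in the listing) is assumed to be a floor.

```python
import heapq


def chunked_round_robin_sort(records, n, m):
    """Sequentially simulate the sorting scheme described in the abstract.

    n: number of partitions; m: records each memory chunk can hold.
    Assumption: one partition is flushed every floor(m/n) rounds.
    """
    if n < 1 or m < n:
        raise ValueError("this sketch requires m >= n >= 1")
    chunks = [[] for _ in range(n)]  # one memory chunk per partition
    spilled = []                     # ordered chunks "written to mass storage"
    interval = m // n                # assumed rounding: floor(m / n)
    next_to_flush = 0                # partitions are flushed in circular order
    for i, record in enumerate(records):
        chunks[i % n].append(record)  # circular, one-by-one distribution
        if (i + 1) % n == 0:          # every partition got one record: a round ended
            rounds_done = (i + 1) // n
            if rounds_done % interval == 0:
                if chunks[next_to_flush]:
                    spilled.append(sorted(chunks[next_to_flush]))
                    chunks[next_to_flush] = []  # empty the memory chunk
                next_to_flush = (next_to_flush + 1) % n
    # All records distributed: write out whatever is still in memory.
    for chunk in chunks:
        if chunk:
            spilled.append(sorted(chunk))
    # Merge all ordered chunks into one sorted sequence.
    return list(heapq.merge(*spilled))
```

Because the flushes are staggered across partitions, no chunk in this sketch grows beyond m records, and the final merge only has to combine chunks that are already ordered.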
-
Patent number: 9983906
Abstract: Embodiments presented herein provide techniques for optimizing parallel data flows of a batch processing job using a profile of the processing job. An application retrieves a job profile for a processing job. The processing job has a plurality of processing stages specified in an execution profile. The job profile includes statistical data for at least one of the processing stages obtained during prior executions of the job. The application modifies properties of the execution profile based on the job profile to optimize the execution of the job. The application executes the processing job with the modified execution profile.
Type: Grant
Filed: February 13, 2015
Date of Patent: May 29, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Brian K. Caufield, Lawrence A. Greene, Eric A. Jacobson, Yong Li, Xiaoyan Pu
-
Patent number: 9817700
Abstract: A method, computer program product, and system for dynamically distributing data for parallel processing in a computing system, comprising allocating a data buffer to each of a plurality of data partitions, where each data buffer stores data to be processed by its corresponding data partition, distributing data in multiple rounds to the data buffers for processing by the data partitions, where in each round the data is distributed based on a determined data processing capacity for each data partition, and where a greater amount of data is distributed to the data partitions with higher determined processing capacities, and periodically monitoring usage of each data buffer and re-determining the determined data processing capacity of each data partition based on its corresponding data buffer usage.
Type: Grant
Filed: April 26, 2011
Date of Patent: November 14, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Brian K. Caufield, Fan Ding, Mi Wan Shum, Dong Jie Wei, Samuel H K Wong
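A rough sketch of capacity-weighted distribution may help make this abstract concrete. Everything below is illustrative: the names `distribute_round` and `reestimate` are hypothetical, and the re-estimation formula (scaling capacity by how empty the buffer is) is an assumption chosen for simplicity, since the abstract does not say how capacity is derived from buffer usage.

```python
def distribute_round(records, capacities):
    """Split one round of records across buffers in proportion to capacity."""
    total = sum(capacities)
    shares = [len(records) * c // total for c in capacities]
    # Integer division can leave a few records over; give them to the
    # partitions with the highest capacity first.
    leftover = len(records) - sum(shares)
    by_capacity = sorted(range(len(capacities)), key=lambda i: -capacities[i])
    for i in by_capacity[:leftover]:
        shares[i] += 1
    buffers, start = [], 0
    for share in shares:
        buffers.append(records[start:start + share])
        start += share
    return buffers


def reestimate(capacities, buffer_usage):
    """Re-determine capacities from buffer usage (assumed rule).

    buffer_usage holds, per partition, the fraction (0.0 to 1.0) of its
    buffer still unprocessed. A fuller buffer suggests a slower partition,
    so its capacity is scaled down; an emptier one is scaled up.
    """
    return [max(1, round(c * (1.5 - u))) for c, u in zip(capacities, buffer_usage)]
```

For example, with capacities `[3, 1, 1]` a round of ten records is split into buffers of six, two, and two records, so the partition with the highest determined capacity receives the most data.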
-
Patent number: 9811384
Abstract: A method, computer program product, and system for dynamically distributing data for parallel processing in a computing system, comprising allocating a data buffer to each of a plurality of data partitions, where each data buffer stores data to be processed by its corresponding data partition, distributing data in multiple rounds to the data buffers for processing by the data partitions, where in each round the data is distributed based on a determined data processing capacity for each data partition, and where a greater amount of data is distributed to the data partitions with higher determined processing capacities, and periodically monitoring usage of each data buffer and re-determining the determined data processing capacity of each data partition based on its corresponding data buffer usage.
Type: Grant
Filed: June 27, 2012
Date of Patent: November 7, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Brian K. Caufield, Fan Ding, Mi Wan Shum, Dong Jie Wei, Samuel H K Wong
-
Patent number: 9733984
Abstract: Provided are techniques for multiple stage workload management. A staging queue and a run queue are provided. A workload is received. In response to determining that application resources are not available and that the workload has not been previously semi-started, the workload is added to the staging queue. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that run resources are available, the workload is started. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that the run resources are not available, the workload is added to the run queue.
Type: Grant
Filed: March 2, 2016
Date of Patent: August 15, 2017
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Mi W. Shum, Chun H. Sun, DongJie Wei
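The three cases in this abstract amount to a small decision procedure, sketched below. The function name `handle_workload`, the dictionary layout of a workload, and the returned labels are all hypothetical; the case where application resources are available is not described in the abstract, so the sketch deliberately reports it as unspecified rather than guessing.

```python
from collections import deque


def handle_workload(workload, app_resources_available, run_resources_available,
                    staging_queue, run_queue):
    """Route a workload according to the three cases stated in the abstract."""
    if app_resources_available:
        # The abstract does not describe this case; no behavior is assumed.
        return "unspecified"
    if not workload["semi_started"]:
        staging_queue.append(workload)  # not yet semi-started: stage it
        return "staged"
    if run_resources_available:
        return "started"                # semi-started and run resources free
    run_queue.append(workload)          # semi-started but must wait to run
    return "queued_to_run"


# Usage: a fresh workload is staged; a semi-started one waits in the run queue.
staging, running = deque(), deque()
first = handle_workload({"id": 1, "semi_started": False}, False, True, staging, running)
second = handle_workload({"id": 2, "semi_started": True}, False, False, staging, running)
```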
-
Patent number: 9652308
Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
Type: Grant
Filed: September 5, 2014
Date of Patent: May 16, 2017
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
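As a minimal illustration of the producer, descriptor, and registry roles named in this abstract, the sketch below uses an in-process dictionary as the registry. The names `REGISTRY`, `produce`, and `consume`, the descriptor fields, and the modulo partitioning scheme are assumptions made for the example; the abstract does not specify any of them.

```python
REGISTRY = {}  # hypothetical stand-in for the shared registry


def produce(name, values, num_partitions):
    """Producer: build a partitioned data set and register its descriptor."""
    partitions = [[] for _ in range(num_partitions)]
    for value in values:
        partitions[value % num_partitions].append(value)
    # The descriptor records how the data set was partitioned.
    REGISTRY[name] = {"num_partitions": num_partitions, "scheme": "modulo"}
    return partitions


def consume(name, partitions, value):
    """Consumer: read the descriptor to decide which partition to search."""
    descriptor = REGISTRY[name]
    if descriptor["scheme"] != "modulo":
        raise ValueError("this sketch only understands modulo partitioning")
    target = value % descriptor["num_partitions"]
    return value in partitions[target]


parts = produce("orders", [1, 2, 3, 4, 5, 6], 3)
found = consume("orders", parts, 4)
```

The consumer never needs to be told the partition count directly; it learns how to process the data set from the registered descriptor, which is the point the abstract makes.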
-
Patent number: 9575916
Abstract: A system identifies a performance bottleneck status in a parallel data processing environment by examining data flow associated with the parallel data processing environment to identify at least one operator, where an operator type is associated with at least one operator, at least one buffer, and a relationship that the buffer has with the operator, where the relationship is associated with the operator type. The system monitors the buffer to determine a buffer status associated with the buffer. The system applies a set of rules to identify an operator bottleneck status associated with the operator. The set of rules is applied to the operator, based on the operator type, the buffer status, and relationship that the buffer has with the operator. The system then determines a performance bottleneck status associated with the parallel data processing environment, based on the operator bottleneck status.
Type: Grant
Filed: January 6, 2014
Date of Patent: February 21, 2017
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, DongJie Wei, Xin Ying Yang
-
Patent number: 9542246
Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
Type: Grant
Filed: May 20, 2015
Date of Patent: January 10, 2017
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
-
Patent number: 9424160
Abstract: Data flow disruptions over a series of data processing operators can be detected by a computer system that generates a profile for data flow at an operator. The profile can include data input, processing, and output wait times. Using the profile, the system can detect potential flow disruptions. If the potential disruption satisfies a rule, it is considered a data flow disruption and a recommendation associated with the satisfied rule is identified. The recommendation and the operator identity are displayed.
Type: Grant
Filed: March 27, 2015
Date of Patent: August 23, 2016
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Lawrence A. Greene, Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu, Dong J. Wei
-
Publication number: 20160179578
Abstract: Provided are techniques for multiple stage workload management. A staging queue and a run queue are provided. A workload is received. In response to determining that application resources are not available and that the workload has not been previously semi-started, the workload is added to the staging queue. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that run resources are available, the workload is started. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that the run resources are not available, the workload is added to the run queue.
Type: Application
Filed: March 2, 2016
Publication date: June 23, 2016
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Mi W. Shum, Chun H. Sun, DongJie Wei
-
Patent number: 9304816
Abstract: Provided are techniques for multiple stage workload management. A staging queue and a run queue are provided. A workload is received. In response to determining that application resources are not available and that the workload has not been previously semi-started, the workload is added to the staging queue. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that run resources are available, the workload is started. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that the run resources are not available, the workload is added to the run queue.
Type: Grant
Filed: August 5, 2013
Date of Patent: April 5, 2016
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Mi W. Shum, Chun H. Sun, DongJie Wei
-
Patent number: 9298596
Abstract: According to one embodiment of the present invention, a system tests jobs in a computing environment. The system creates a test case for one or more existing executable jobs without modifying the job design or recompiling the executable itself, wherein the test case includes one or more capture points in a job flow of the executable jobs and corresponding rules for capturing data, identification of data for testing the one or more executable jobs, and rules for comparing the captured data to expected results. The system captures the data at the one or more capture points in the job flow in accordance with the test case and generates a baseline of expected results.
Type: Grant
Filed: July 9, 2013
Date of Patent: March 29, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Brian K. Caufield, Ajay Sood, Julian J. Vizor
-
Publication number: 20160070607
Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
Type: Application
Filed: September 5, 2014
Publication date: March 10, 2016
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
-
Publication number: 20160070608
Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
Type: Application
Filed: May 20, 2015
Publication date: March 10, 2016
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
-
Patent number: 9262210
Abstract: A method, computer program product and system for workload management for an Extract, Transform, and Load (ETL) system. A priority of each workload in a set of workloads is determined using a priority rule. In response to determining that the priority of a workload to be checked has a highest priority, it is indicated that the workload has the highest priority. It is determined whether at least one logical resource representing an ETL metric is available for executing the workload. In response to determining that the workload has the highest priority and that the at least one logical resource is available, it is determined that the workload is runnable.
Type: Grant
Filed: June 29, 2012
Date of Patent: February 16, 2016
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Yong Li, Xiaoyan Pu
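The runnable check in this abstract combines two conditions, which the following sketch spells out. The function name `is_runnable`, the workload fields, the example priority rule, and the `job_slots` resource are hypothetical; the abstract does not define a concrete priority rule or list the logical resources.

```python
def is_runnable(workload, workloads, priority_rule, free_resources):
    """Return True if the workload has the highest priority and its
    logical resources are available.

    priority_rule: function mapping a workload to a comparable priority.
    free_resources: logical resource name -> units currently free.
    """
    highest = max(priority_rule(w) for w in workloads)
    has_highest_priority = priority_rule(workload) == highest
    resources_available = all(
        free_resources.get(name, 0) >= needed
        for name, needed in workload["needs"].items()
    )
    return has_highest_priority and resources_available


# Example with an assumed rule: higher "weight" means higher priority.
jobs = [
    {"name": "nightly_load", "weight": 5, "needs": {"job_slots": 1}},
    {"name": "ad_hoc_report", "weight": 2, "needs": {"job_slots": 1}},
]
rule = lambda w: w["weight"]
```

With one free job slot, `is_runnable(jobs[0], jobs, rule, {"job_slots": 1})` holds, while the lower-priority job is not runnable even though the slot is free.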
-
Publication number: 20150269006
Abstract: Data flow disruptions over a series of data processing operators can be detected by a computer system that generates a profile for data flow at an operator. The profile can include data input, processing, and output wait times. Using the profile, the system can detect potential flow disruptions. If the potential disruption satisfies a rule, it is considered a data flow disruption and a recommendation associated with the satisfied rule is identified. The recommendation and the operator identity are displayed.
Type: Application
Filed: March 27, 2015
Publication date: September 24, 2015
Inventors: Brian K. Caufield, Lawrence A. Greene, Eric A. Jacobson, Yong Li, Shyam R. Mudambi, Xiaoyan Pu, Dong J. Wei
-
Publication number: 20150261568
Abstract: Embodiments presented herein provide techniques for optimizing parallel data flows of a batch processing job using a profile of the processing job. An application retrieves a job profile for a processing job. The processing job has a plurality of processing stages specified in an execution profile. The job profile includes statistical data for at least one of the processing stages obtained during prior executions of the job. The application modifies properties of the execution profile based on the job profile to optimize the execution of the job. The application executes the processing job with the modified execution profile.
Type: Application
Filed: March 11, 2014
Publication date: September 17, 2015
Applicant: International Business Machines Corporation
Inventors: Brian K. CAUFIELD, Lawrence A. GREENE, Eric A. JACOBSON, Yong LI, Xiaoyan PU
-
Publication number: 20150261572
Abstract: Embodiments presented herein provide techniques for optimizing parallel data flows of a batch processing job using a profile of the processing job. An application retrieves a job profile for a processing job. The processing job has a plurality of processing stages specified in an execution profile. The job profile includes statistical data for at least one of the processing stages obtained during prior executions of the job. The application modifies properties of the execution profile based on the job profile to optimize the execution of the job. The application executes the processing job with the modified execution profile.
Type: Application
Filed: February 13, 2015
Publication date: September 17, 2015
Inventors: Brian K. CAUFIELD, Lawrence A. GREENE, Eric A. JACOBSON, Yong LI, Xiaoyan PU
-
Patent number: 9110575
Abstract: Methods and systems for graphically emphasizing a selected path through a diagram, where the diagram includes a number of nodes and a number of lines, the methods and systems including: applying a node highlight effect to a node of the number of nodes in the selected path, where applying the node highlight effect includes applying a node shadow to the node, applying a line highlight effect to a line of the number of lines in the selected path, where applying the line highlight effect includes applying a line shadow to the line, applying a node fade effect to a node of the number of nodes not in the selected path, and applying a line fade effect to a line of the number of lines not in the selected path.
Type: Grant
Filed: October 14, 2008
Date of Patent: August 18, 2015
Assignee: International Business Machines Corporation
Inventors: Aarti D Borkar, Arron J Harden, Brian K Caufield