Patents by Inventor Ron E. Liu
Ron E. Liu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9990412
Abstract: A data driven parallel sorting method includes distributing input data records to n partitions one by one in a circular manner. Each partition corresponds to a parallel sorting process with an allocated memory chunk sized to store m data records. The method also includes sorting, in parallel, current data records in respective memory chunks in respective partitions. The method also includes in response to distribution of data records of ⌊m/n⌋ rounds, circularly controlling one of the n partitions, and writing data records that have been sorted in the memory chunk of the partition into a mass storage as an ordered data chunk, and emptying the memory chunk. The method also includes in response to all data records being distributed, writing data chunks that have been sorted in respective memory chunks into the mass storage, and performing a merge sort on all ordered data chunks in the mass storage.
Type: Grant
Filed: April 28, 2014
Date of Patent: June 5, 2018
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Brian K. Caufield, Ron E. Liu, Dong J. Wei, Xin Ying Yang
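The flush schedule described in this abstract can be illustrated with a small sketch. It is a sequential approximation under stated assumptions, not the patented implementation: the flush interval is read as floor(m/n) rounds, the per-partition sorting processes are simulated by keeping each in-memory list ordered, and a plain Python list stands in for mass storage. The function name `data_driven_sort` is hypothetical.

```python
import heapq
from bisect import insort


def data_driven_sort(records, n, m):
    """Sequential sketch: round-robin distribution with staggered flushes."""
    if n < 1 or m < n:
        raise ValueError("this sketch assumes 1 <= n <= m")
    flush_every = m // n               # ASSUMPTION: interval is floor(m/n) rounds
    chunks = [[] for _ in range(n)]    # one in-memory chunk per partition
    storage = []                       # stand-in for mass storage (ordered chunks)
    next_to_flush = 0                  # partition to flush next, in circular order
    rounds = 0

    for i, rec in enumerate(records):
        p = i % n                      # distribute one by one in a circular manner
        insort(chunks[p], rec)         # keep the chunk sorted (simulates the sorter)
        if p == n - 1:                 # every partition has received one more record
            rounds += 1
            if rounds % flush_every == 0:
                if chunks[next_to_flush]:
                    storage.append(chunks[next_to_flush])  # write ordered chunk
                    chunks[next_to_flush] = []             # empty the memory chunk
                next_to_flush = (next_to_flush + 1) % n

    # All records distributed: write out whatever is still in memory.
    storage.extend(c for c in chunks if c)
    # Merge all ordered chunks into one sorted sequence.
    return list(heapq.merge(*storage))
```

With this reading, each partition is flushed once every n * floor(m/n) rounds, so its chunk never holds more than m records, and the flushes are spread out rather than hitting storage all at once.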
-
Patent number: 9762672
Abstract: Provided are techniques for improving data locality for parallel applications running in a big data distributed file system with a dynamic node group. In response to a consumer job starting to read one or more files in a big data distributed file system having multiple nodes, node group information for the one or more files to be read is retrieved, wherein the node group information identifies nodes from the multiple nodes on which a producer job wrote the one or more files, and the consumer job is assigned to the nodes identified by the node group information to allow for local reading of the one or more files by the consumer job.
Type: Grant
Filed: June 15, 2015
Date of Patent: September 12, 2017
Assignee: International Business Machines Corporation
Inventors: Krishna K. Bonagiri, Eric A. Jacobson, Yong Li, Ron E. Liu, Xiaoyan Pu
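As a rough illustration of the idea in this abstract, the sketch below records which nodes a producer wrote each file on and then assigns a consumer to those nodes. The class `NodeGroupRegistry`, the function `assign_consumer`, and the fallback to all nodes when no node group is known are assumptions made for the example; the abstract does not specify them.

```python
from collections import defaultdict


class NodeGroupRegistry:
    """Hypothetical store of node group information, keyed by file path."""

    def __init__(self):
        self._nodes_by_file = defaultdict(set)

    def record_write(self, path, node):
        # Called when a producer job writes (part of) a file on a node.
        self._nodes_by_file[path].add(node)

    def node_group(self, paths):
        # Union of the nodes on which the producer wrote the given files.
        group = set()
        for path in paths:
            group |= self._nodes_by_file.get(path, set())
        return group


def assign_consumer(registry, paths, all_nodes):
    """Pick the nodes a consumer job should run on to read the files locally."""
    group = registry.node_group(paths)
    # ASSUMPTION: fall back to every node when no node group is recorded.
    return sorted(group) if group else sorted(all_nodes)
```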
-
Patent number: 9733984
Abstract: Provided are techniques for multiple stage workload management. A staging queue and a run queue are provided. A workload is received. In response to determining that application resources are not available and that the workload has not been previously semi-started, the workload is added to the staging queue. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that run resources are available, the workload is started. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that the run resources are not available, adding the workload to the run queue.
Type: Grant
Filed: March 2, 2016
Date of Patent: August 15, 2017
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Mi W. Shum, Chun H. Sun, DongJie Wei
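The three conditions in this abstract can be written out as a small decision function. This is only a literal transcription of the abstract's wording; the names `Action` and `dispatch` are hypothetical, and the case where application resources are available is not described in the abstract, so the sketch returns `None` for it.

```python
from collections import deque
from enum import Enum


class Action(Enum):
    STAGED = "added to staging queue"
    STARTED = "started"
    QUEUED_TO_RUN = "added to run queue"


def dispatch(workload, app_available, semi_started, run_available,
             staging_queue, run_queue):
    """Route one received workload according to the abstract's three cases."""
    if app_available:
        return None                     # case not covered by the abstract
    if not semi_started:
        staging_queue.append(workload)  # not previously semi-started: stage it
        return Action.STAGED
    if run_available:
        return Action.STARTED           # semi-started and run resources are free
    run_queue.append(workload)          # semi-started but no run resources
    return Action.QUEUED_TO_RUN
```

For example, `dispatch("job-1", False, False, False, deque(), deque())` returns `Action.STAGED` and leaves `"job-1"` in the staging queue.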
-
Patent number: 9652308
Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
Type: Grant
Filed: September 5, 2014
Date of Patent: May 16, 2017
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
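A minimal sketch of the producer/consumer handshake in this abstract follows. The descriptor fields (`num_partitions`, `partition_key`), the in-process dictionary used as the registry, and the consumer's reuse-or-repartition decision are all assumptions chosen for illustration; the abstract does not say what the descriptor contains or how the consumer acts on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical description of a partitioned data set."""
    name: str
    num_partitions: int
    partition_key: str


registry = {}  # stand-in for the shared registry: data set name -> Descriptor


def produce(name, rows, key, num_partitions):
    """Producer side: partition the rows, then register a descriptor."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    registry[name] = Descriptor(name, num_partitions, key)
    return partitions


def plan_consume(name, wanted_key, wanted_partitions):
    """Consumer side: look up the descriptor and decide how to process."""
    desc = registry[name]
    if desc.partition_key == wanted_key and desc.num_partitions == wanted_partitions:
        return "read partitions as-is"
    return "repartition"
```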
-
Patent number: 9575916
Abstract: A system identifies a performance bottleneck status in a parallel data processing environment by examining data flow associated with the parallel data processing environment to identify at least one operator, where an operator type is associated with at least one operator, at least one buffer, and a relationship that the buffer has with the operator, where the relationship is associated with the operator type. The system monitors the buffer to determine a buffer status associated with the buffer. The system applies a set of rules to identify an operator bottleneck status associated with the operator. The set of rules is applied to the operator, based on the operator type, the buffer status, and relationship that the buffer has with the operator. The system then determines a performance bottleneck status associated with the parallel data processing environment, based on the operator bottleneck status.
Type: Grant
Filed: January 6, 2014
Date of Patent: February 21, 2017
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, DongJie Wei, Xin Ying Yang
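To make the rule-based approach in this abstract concrete, here is a small sketch. The abstract does not list the actual rules, so the ones below are assumptions for illustration only: an operator is flagged when its input buffer is full while its output buffer is empty, a source when its output buffer is empty, and a sink when its input buffer is full. The operator types, status strings, and function names are likewise hypothetical.

```python
def operator_is_bottleneck(op_type, input_status, output_status):
    """Apply illustrative rules based on operator type and buffer statuses."""
    if op_type == "source":
        # A source has only an output buffer; downstream is starved.
        return output_status == "empty"
    if op_type == "sink":
        # A sink has only an input buffer; it cannot keep up with upstream.
        return input_status == "full"
    # Intermediate operator: input backs up while nothing comes out.
    return input_status == "full" and output_status == "empty"


def find_bottlenecks(operators):
    """Return the names of operators flagged by the rules.

    `operators` is a list of (name, op_type, input_status, output_status).
    """
    return [name for name, op_type, inp, out in operators
            if operator_is_bottleneck(op_type, inp, out)]
```

In this sketch the environment as a whole would be reported as bottlenecked whenever `find_bottlenecks` returns a non-empty list.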
-
Patent number: 9542246
Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
Type: Grant
Filed: May 20, 2015
Date of Patent: January 10, 2017
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
-
Publication number: 20160366224
Abstract: Provided are techniques for improving data locality for parallel applications running in a big data distributed file system with a dynamic node group. In response to a consumer job starting to read one or more files in a big data distributed file system having multiple nodes, node group information for the one or more files to be read is retrieved, wherein the node group information identifies nodes from the multiple nodes on which a producer job wrote the one or more files, and the consumer job is assigned to the nodes identified by the node group information to allow for local reading of the one or more files by the consumer job.
Type: Application
Filed: June 15, 2015
Publication date: December 15, 2016
Inventors: Krishna K. Bonagiri, Eric A. Jacobson, Yong Li, Ron E. Liu, Xiaoyan Pu
-
Publication number: 20160179578
Abstract: Provided are techniques for multiple stage workload management. A staging queue and a run queue are provided. A workload is received. In response to determining that application resources are not available and that the workload has not been previously semi-started, the workload is added to the staging queue. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that run resources are available, the workload is started. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that the run resources are not available, adding the workload to the run queue.
Type: Application
Filed: March 2, 2016
Publication date: June 23, 2016
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Mi W. Shum, Chun H. Sun, DongJie Wei
-
Patent number: 9304816
Abstract: Provided are techniques for multiple stage workload management. A staging queue and a run queue are provided. A workload is received. In response to determining that application resources are not available and that the workload has not been previously semi-started, the workload is added to the staging queue. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that run resources are available, the workload is started. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that the run resources are not available, adding the workload to the run queue.
Type: Grant
Filed: August 5, 2013
Date of Patent: April 5, 2016
Assignee: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Mi W. Shum, Chun H. Sun, DongJie Wei
-
Publication number: 20160070608
Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
Type: Application
Filed: May 20, 2015
Publication date: March 10, 2016
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
-
Publication number: 20160070607
Abstract: Provided are techniques for sharing a partitioned data set across parallel applications. Under control of a producing application, a partitioned data set is generated; a descriptor that describes the partitioned data set is generated; and the descriptor is registered in a registry. Under control of a consuming application, the registry is accessed to obtain the descriptor of the partitioned data set; and the descriptor is used to determine how to process the partitioned data set.
Type: Application
Filed: September 5, 2014
Publication date: March 10, 2016
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Xiaoyan Pu
-
Publication number: 20150193368
Abstract: A system identifies a performance bottleneck status in a parallel data processing environment by examining data flow associated with the parallel data processing environment to identify at least one operator, where an operator type is associated with at least one operator, at least one buffer, and a relationship that the buffer has with the operator, where the relationship is associated with the operator type. The system monitors the buffer to determine a buffer status associated with the buffer. The system applies a set of rules to identify an operator bottleneck status associated with the operator. The set of rules is applied to the operator, based on the operator type, the buffer status, and relationship that the buffer has with the operator. The system then determines a performance bottleneck status associated with the parallel data processing environment, based on the operator bottleneck status.
Type: Application
Filed: January 6, 2014
Publication date: July 9, 2015
Applicant: International Business Machines Corporation
Inventors: Brian K. CAUFIELD, Ron E. LIU, DongJie WEI, Xin Y. YANG
-
Publication number: 20150040133
Abstract: Provided are techniques for multiple stage workload management. A staging queue and a run queue are provided. A workload is received. In response to determining that application resources are not available and that the workload has not been previously semi-started, the workload is added to the staging queue. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that run resources are available, the workload is started. In response to determining that the application resources are not available and that the workload has been semi-started, and, in response to determining that the run resources are not available, adding the workload to the run queue.
Type: Application
Filed: August 5, 2013
Publication date: February 5, 2015
Applicant: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Sriram K. Padmanabhan, Mi W. Shum, Chun H. Sun, DongJie Wei
-
Publication number: 20140324890
Abstract: A data driven parallel sorting method includes distributing input data records to n partitions one by one in a circular manner. Each partition corresponds to a parallel sorting process with an allocated memory chunk sized to store m data records. The method also includes sorting, in parallel, current data records in respective memory chunks in respective partitions. The method also includes in response to distribution of data records of ⌊m/n⌋ rounds, circularly controlling one of the n partitions, and writing data records that have been sorted in the memory chunk of the partition into a mass storage as an ordered data chunk, and emptying the memory chunk. The method also includes in response to all data records being distributed, writing data chunks that have been sorted in respective memory chunks into the mass storage, and performing a merge sort on all ordered data chunks in the mass storage.
Type: Application
Filed: April 28, 2014
Publication date: October 30, 2014
Applicant: International Business Machines Corporation
Inventors: Brian K. Caufield, Ron E. Liu, Dong J. Wei, Xin Y. Yang