Patents by Inventor Hui-Fang Wen
Hui-Fang Wen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8904398Abstract: Mapping tasks to physical processors in parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.Type: GrantFiled: March 6, 2012Date of Patent: December 2, 2014Assignee: International Business Machines CorporationInventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
-
Patent number: 8898648Abstract: A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code.Type: GrantFiled: November 30, 2012Date of Patent: November 25, 2014Assignee: International Business Machines CorporationInventors: I-Hsin Chung, Guojing Cong, Hiroki Murata, Yasushi Negishi, Hui-Fang Wen
-
Patent number: 8819346Abstract: A computer implemented method analyzes shared memory accesses during execution of an application program. The method includes instrumenting events of shared memory accesses in the application program, where the application program is to be executed on a target configuration having p nodes; executing the application program using p1 processing nodes, where p1 is less than p and satisfies a constraint. For accesses made by the executing application program, the method determines a target thread and maps determined target threads to either a remote node or a local node corresponding to a remote memory access and to a local memory access, respectively. Also disclosed is a computer-readable storage medium that stores a program of executable instructions that implements the method, and a data processing system. The invention can be implemented using a language such as Unified Parallel C (UPC) directed to a partitioned global address space (PGAS) paradigm.Type: GrantFiled: March 9, 2012Date of Patent: August 26, 2014Assignee: International Business Machines CorporationInventors: Guojing Cong, Ettore Tiotto, Hui-Fang Wen
-
Publication number: 20130238862Abstract: A computer implemented method analyzes shared memory accesses during execution of an application program. The method includes instrumenting events of shared memory accesses in the application program, where the application program is to be executed on a target configuration having p nodes; executing the application program using p1 processing nodes, where p1 is less than p and satisfies a constraint. For accesses made by the executing application program, the method determines a target thread and maps determined target threads to either a remote node or a local node corresponding to a remote memory access and to a local memory access, respectively. Also disclosed is a computer-readable storage medium that stores a program of executable instructions that implements the method, and a data processing system. The invention can be implemented using a language such as Unified Parallel C (UPC) directed to a partitioned global address space (PGAS) paradigm.Type: ApplicationFiled: March 9, 2012Publication date: September 12, 2013Applicant: International Business Machines CorporationInventors: Guojing Cong, Ettore Tiotto, Hui-Fang Wen
-
Patent number: 8527959Abstract: A method for application performance data collection includes steps or acts of: customizing a performance tool for collecting application performance data of an application; modifying the application by inserting the performance tool while the application does not need to be rebuilt from the source; executing the application; and collecting the application execution performance data such that only interesting data is collected. Customizing the performance tool proceeds by implementing at least one configurable tracing function that can be programmed by the user; compiling the function(s) into an object file; and inserting the object file into the performance tool using binary instrumentation.Type: GrantFiled: December 7, 2007Date of Patent: September 3, 2013Assignee: International Business Machines CorporationInventors: I-Hsin Chung, Kattamuri Ekanadham, David Joseph Klepacki, Simone Sbaraglia, Robert Edward Walkup, Hui-Fang Wen, Hao Yu
-
Patent number: 8490061Abstract: During runtime of a binary program file, streams of instructions are executed and memory references, generated by instrumentation applied to given ones of the instructions that refer to memory locations, are collected. A transformation is performed, based on the executed streams of instructions and the collected memory references, to obtain a table. The table lists memory events of interest for active data structures for each function in the program file. The transformation is performed to translate memory addresses for given ones of the instructions and given ones of the data structures into locations and variable names in a source file corresponding to the binary file. At least the memory events of interest are displayed, and the display is organized so as to correlate the memory events of interest with corresponding ones of the data structures.Type: GrantFiled: May 7, 2009Date of Patent: July 16, 2013Assignee: International Business Machines CorporationInventors: I-Hsin Chung, Guojing Cong, Kattamuri Ekanadham, David Klepacki, Simone Sbaraglia, Hui-Fang Wen
-
Publication number: 20130014115Abstract: Mapping tasks to physical processors in parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.Type: ApplicationFiled: September 14, 2012Publication date: January 10, 2013Applicant: International Business Machines CorporationInventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
-
Patent number: 8327325Abstract: A target application is automatically tuned. A list of solutions for identified performance bottlenecks in a target application is retrieved from a storage device. A plurality of modules is executed to compute specific parameters for solutions contained in the list of solutions. A list of modification commands associated with specific parameters computed by the plurality of modules is generated. The list of modification commands associated with the specific parameters is appended to a command sequence list. The list of modification commands is implemented in the target application. Specific source code regions corresponding to the identified performance bottlenecks in the target application are automatically tuned using the implemented list of modification commands. Then, the tuned target application is stored in the storage device.Type: GrantFiled: January 14, 2009Date of Patent: December 4, 2012Assignee: International Business Machines CorporationInventors: I-Hsin Chung, Guojing Cong, David J. Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
-
Publication number: 20120254879Abstract: Mapping tasks to physical processors in parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.Type: ApplicationFiled: March 6, 2012Publication date: October 4, 2012Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
-
Patent number: 8225291Abstract: Detecting performance bottlenecks in a target application is provided. In response to receiving hotspot selections from a user interface, bottleneck rules are extracted from a database. A hotspot is a region of source code that exceeds a time threshold to execute in the target application. Metrics needed to evaluate the bottleneck rules extracted from the database are identified. The identified metrics are computed. It is determined whether each bottleneck rule extracted from the database is evaluated to true using the computed metrics for hotspots in the target application. In response to determining that a bottleneck rule is evaluated to true using an appropriate computed metric corresponding to the bottleneck rule, a bottleneck description is created for the bottleneck rule. Then, the bottleneck description is sent to the user interface.Type: GrantFiled: January 4, 2008Date of Patent: July 17, 2012Assignee: International Business Machines CorporationInventors: I-Hsin Chung, Guojing Cong, David Joseph Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
-
Publication number: 20100287536Abstract: During runtime of a binary program file, streams of instructions are executed and memory references, generated by instrumentation applied to given ones of the instructions that refer to memory locations, are collected. A transformation is performed, based on the executed streams of instructions and the collected memory references, to obtain a table. The table lists memory events of interest for active data structures for each function in the program file. The transformation is performed to translate memory addresses for given ones of the instructions and given ones of the data structures into locations and variable names in a source file corresponding to the binary file. At least the memory events of interest are displayed, and the display is organized so as to correlate the memory events of interest with corresponding ones of the data structures.Type: ApplicationFiled: May 7, 2009Publication date: November 11, 2010Applicant: International Business Machiness CorporationInventors: I-Hsin Chung, Guojing Cong, Kattamuri Ekanadham, David Klepacki, Simone Sbaraglia, Hui-Fang Wen
-
Publication number: 20100180255Abstract: A target application is automatically tuned. A list of solutions for identified performance bottlenecks in a target application is retrieved from a storage device. A plurality of modules is executed to compute specific parameters for solutions contained in the list of solutions. A list of modification commands associated with specific parameters computed by the plurality of modules is generated. The list of modification commands associated with the specific parameters is appended to a command sequence list. The list of modification commands is implemented in the target application. Specific source code regions corresponding to the identified performance bottlenecks in the target application are automatically tuned using the implemented list of modification commands. Then, the tuned target application is stored in the storage device.Type: ApplicationFiled: January 14, 2009Publication date: July 15, 2010Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: I-Hsin Chung, Guojing Cong, David J. Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
-
Publication number: 20090177642Abstract: A system for detecting performance bottlenecks in a target application. In response to receiving hotspot selections from a user interface, bottleneck rules are extracted from a database. A hotspot is a region of source code that exceeds a time threshold to execute in the target application. Metrics needed to evaluate the bottleneck rules extracted from the database are identified. The identified metrics are computed. It is determined whether each bottleneck rule extracted from the database is evaluated to true using the computed metrics for hotspots in the target application. In response to determining that a bottleneck rule is evaluated to true using an appropriate computed metric corresponding to the bottleneck rule, a bottleneck description is created for the bottleneck rule. Then, the bottleneck description is sent to the user interface.Type: ApplicationFiled: January 4, 2008Publication date: July 9, 2009Inventors: I-Hsin Chung, Guojing Cong, David Joseph Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
-
Publication number: 20090150874Abstract: A method for application performance data collection includes steps or acts of: customizing a performance tool for collecting application performance data of an application; modifying the application by inserting the performance tool while the application does not need to be rebuilt from the source; executing the application; and collecting the application execution performance data such that only interesting data is collected. Customizing the performance tool proceeds by implementing at least one configurable tracing function that can be programmed by the user; compiling the function(s) into an object file; and inserting the object file into the performance tool using binary instrumentation.Type: ApplicationFiled: December 7, 2007Publication date: June 11, 2009Applicant: International Business Machines CorporationInventors: I-Hsin Chung, Kattamuri Ekanadham, David Joseph Klepacki, Simone Sbaraglia, Robert Edward Walkup, Hui-Fang Wen, Hao Yu