Patents by Inventor Hui-Fang Wen

Hui-Fang Wen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8904398
    Abstract: Mapping tasks to physical processors in parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.
    Type: Grant
    Filed: March 6, 2012
    Date of Patent: December 2, 2014
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
  • Patent number: 8898648
    Abstract: A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: November 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, Hiroki Murata, Yasushi Negishi, Hui-Fang Wen
  • Patent number: 8819346
    Abstract: A computer implemented method analyzes shared memory accesses during execution of an application program. The method includes instrumenting events of shared memory accesses in the application program, where the application program is to be executed on a target configuration having p nodes; executing the application program using p1 processing nodes, where p1 is less than p and satisfies a constraint. For accesses made by the executing application program, the method determines a target thread and maps determined target threads to either a remote node or a local node corresponding to a remote memory access and to a local memory access, respectively. Also disclosed is a computer-readable storage medium that stores a program of executable instructions that implements the method, and a data processing system. The invention can be implemented using a language such as Unified Parallel C (UPC) directed to a partitioned global address space (PGAS) paradigm.
    Type: Grant
    Filed: March 9, 2012
    Date of Patent: August 26, 2014
    Assignee: International Business Machines Corporation
    Inventors: Guojing Cong, Ettore Tiotto, Hui-Fang Wen
  • Publication number: 20130238862
    Abstract: A computer implemented method analyzes shared memory accesses during execution of an application program. The method includes instrumenting events of shared memory accesses in the application program, where the application program is to be executed on a target configuration having p nodes; executing the application program using p1 processing nodes, where p1 is less than p and satisfies a constraint. For accesses made by the executing application program, the method determines a target thread and maps determined target threads to either a remote node or a local node corresponding to a remote memory access and to a local memory access, respectively. Also disclosed is a computer-readable storage medium that stores a program of executable instructions that implements the method, and a data processing system. The invention can be implemented using a language such as Unified Parallel C (UPC) directed to a partitioned global address space (PGAS) paradigm.
    Type: Application
    Filed: March 9, 2012
    Publication date: September 12, 2013
    Applicant: International Business Machines Corporation
    Inventors: Guojing Cong, Ettore Tiotto, Hui-Fang Wen
  • Patent number: 8527959
    Abstract: A method for application performance data collection includes steps or acts of: customizing a performance tool for collecting application performance data of an application; modifying the application by inserting the performance tool while the application does not need to be rebuilt from the source; executing the application; and collecting the application execution performance data such that only interesting data is collected. Customizing the performance tool proceeds by implementing at least one configurable tracing function that can be programmed by the user; compiling the function(s) into an object file; and inserting the object file into the performance tool using binary instrumentation.
    Type: Grant
    Filed: December 7, 2007
    Date of Patent: September 3, 2013
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, Kattamuri Ekanadham, David Joseph Klepacki, Simone Sbaraglia, Robert Edward Walkup, Hui-Fang Wen, Hao Yu
  • Patent number: 8490061
    Abstract: During runtime of a binary program file, streams of instructions are executed and memory references, generated by instrumentation applied to given ones of the instructions that refer to memory locations, are collected. A transformation is performed, based on the executed streams of instructions and the collected memory references, to obtain a table. The table lists memory events of interest for active data structures for each function in the program file. The transformation is performed to translate memory addresses for given ones of the instructions and given ones of the data structures into locations and variable names in a source file corresponding to the binary file. At least the memory events of interest are displayed, and the display is organized so as to correlate the memory events of interest with corresponding ones of the data structures.
    Type: Grant
    Filed: May 7, 2009
    Date of Patent: July 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, Kattamuri Ekanadham, David Klepacki, Simone Sbaraglia, Hui-Fang Wen
  • Publication number: 20130014115
    Abstract: Mapping tasks to physical processors in parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.
    Type: Application
    Filed: September 14, 2012
    Publication date: January 10, 2013
    Applicant: International Business Machines Corporation
    Inventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
  • Patent number: 8327325
    Abstract: A target application is automatically tuned. A list of solutions for identified performance bottlenecks in a target application is retrieved from a storage device. A plurality of modules is executed to compute specific parameters for solutions contained in the list of solutions. A list of modification commands associated with specific parameters computed by the plurality of modules is generated. The list of modification commands associated with the specific parameters is appended to a command sequence list. The list of modification commands is implemented in the target application. Specific source code regions corresponding to the identified performance bottlenecks in the target application are automatically tuned using the implemented list of modification commands. Then, the tuned target application is stored in the storage device.
    Type: Grant
    Filed: January 14, 2009
    Date of Patent: December 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, David J. Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
  • Publication number: 20120254879
    Abstract: Mapping tasks to physical processors in parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.
    Type: Application
    Filed: March 6, 2012
    Publication date: October 4, 2012
    Applicant: International Business Machines Corporation
    Inventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
  • Patent number: 8225291
    Abstract: Detecting performance bottlenecks in a target application is provided. In response to receiving hotspot selections from a user interface, bottleneck rules are extracted from a database. A hotspot is a region of source code that exceeds a time threshold to execute in the target application. Metrics needed to evaluate the bottleneck rules extracted from the database are identified. The identified metrics are computed. It is determined whether each bottleneck rule extracted from the database is evaluated to true using the computed metrics for hotspots in the target application. In response to determining that a bottleneck rule is evaluated to true using an appropriate computed metric corresponding to the bottleneck rule, a bottleneck description is created for the bottleneck rule. Then, the bottleneck description is sent to the user interface.
    Type: Grant
    Filed: January 4, 2008
    Date of Patent: July 17, 2012
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, David Joseph Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
  • Publication number: 20100287536
    Abstract: During runtime of a binary program file, streams of instructions are executed and memory references, generated by instrumentation applied to given ones of the instructions that refer to memory locations, are collected. A transformation is performed, based on the executed streams of instructions and the collected memory references, to obtain a table. The table lists memory events of interest for active data structures for each function in the program file. The transformation is performed to translate memory addresses for given ones of the instructions and given ones of the data structures into locations and variable names in a source file corresponding to the binary file. At least the memory events of interest are displayed, and the display is organized so as to correlate the memory events of interest with corresponding ones of the data structures.
    Type: Application
    Filed: May 7, 2009
    Publication date: November 11, 2010
    Applicant: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, Kattamuri Ekanadham, David Klepacki, Simone Sbaraglia, Hui-Fang Wen
  • Publication number: 20100180255
    Abstract: A target application is automatically tuned. A list of solutions for identified performance bottlenecks in a target application is retrieved from a storage device. A plurality of modules is executed to compute specific parameters for solutions contained in the list of solutions. A list of modification commands associated with specific parameters computed by the plurality of modules is generated. The list of modification commands associated with the specific parameters is appended to a command sequence list. The list of modification commands is implemented in the target application. Specific source code regions corresponding to the identified performance bottlenecks in the target application are automatically tuned using the implemented list of modification commands. Then, the tuned target application is stored in the storage device.
    Type: Application
    Filed: January 14, 2009
    Publication date: July 15, 2010
    Applicant: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, David J. Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
  • Publication number: 20090177642
    Abstract: A system for detecting performance bottlenecks in a target application. In response to receiving hotspot selections from a user interface, bottleneck rules are extracted from a database. A hotspot is a region of source code that exceeds a time threshold to execute in the target application. Metrics needed to evaluate the bottleneck rules extracted from the database are identified. The identified metrics are computed. It is determined whether each bottleneck rule extracted from the database is evaluated to true using the computed metrics for hotspots in the target application. In response to determining that a bottleneck rule is evaluated to true using an appropriate computed metric corresponding to the bottleneck rule, a bottleneck description is created for the bottleneck rule. Then, the bottleneck description is sent to the user interface.
    Type: Application
    Filed: January 4, 2008
    Publication date: July 9, 2009
    Inventors: I-Hsin Chung, Guojing Cong, David Joseph Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
  • Publication number: 20090150874
    Abstract: A method for application performance data collection includes steps or acts of: customizing a performance tool for collecting application performance data of an application; modifying the application by inserting the performance tool while the application does not need to be rebuilt from the source; executing the application; and collecting the application execution performance data such that only interesting data is collected. Customizing the performance tool proceeds by implementing at least one configurable tracing function that can be programmed by the user; compiling the function(s) into an object file; and inserting the object file into the performance tool using binary instrumentation.
    Type: Application
    Filed: December 7, 2007
    Publication date: June 11, 2009
    Applicant: International Business Machines Corporation
    Inventors: I-Hsin Chung, Kattamuri Ekanadham, David Joseph Klepacki, Simone Sbaraglia, Robert Edward Walkup, Hui-Fang Wen, Hao Yu
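
The bottleneck-detection abstract (patent 8225291 and publication 20090177642) describes a pipeline: select hotspots, identify the metrics the rules need, compute them, evaluate each rule, and report a description for every rule that evaluates to true. The Python sketch below is only an illustration of that flow, not the patented implementation. The `Rule` class, the metric names (`l2_miss_rate`, `comm_fraction`), the thresholds, and the sample data are all hypothetical, and the in-memory rule list stands in for the database the abstract mentions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A hypothetical bottleneck rule: a predicate over named metrics."""
    name: str
    needs: List[str]  # metric names the predicate reads
    predicate: Callable[[Dict[str, float]], bool]
    description: str


def find_hotspots(region_times: Dict[str, float], threshold: float) -> List[str]:
    """Return the regions whose execution time exceeds the threshold."""
    return [region for region, t in region_times.items() if t > threshold]


def detect_bottlenecks(hotspots, rules, compute_metric) -> List[str]:
    """Evaluate every rule against every hotspot and collect descriptions."""
    # Identify the metrics needed by the rules, then compute only those.
    needed = sorted({m for rule in rules for m in rule.needs})
    reports = []
    for region in hotspots:
        metrics = {m: compute_metric(region, m) for m in needed}
        for rule in rules:
            if rule.predicate(metrics):
                reports.append(f"{region}: {rule.description}")
    return reports


# Hypothetical sample data (assumed for illustration only).
REGION_TIMES = {"solve": 12.0, "init": 0.3, "exchange": 5.5}
SAMPLE_METRICS = {
    ("solve", "l2_miss_rate"): 0.31,
    ("solve", "comm_fraction"): 0.05,
    ("exchange", "l2_miss_rate"): 0.02,
    ("exchange", "comm_fraction"): 0.70,
}
RULES = [
    Rule("cache", ["l2_miss_rate"],
         lambda m: m["l2_miss_rate"] > 0.2, "high L2 miss rate"),
    Rule("comm", ["comm_fraction"],
         lambda m: m["comm_fraction"] > 0.5, "communication dominated"),
]


def lookup_metric(region: str, metric: str) -> float:
    """Stand-in for a real measurement; reads from the sample table."""
    return SAMPLE_METRICS[(region, metric)]


if __name__ == "__main__":
    hot = find_hotspots(REGION_TIMES, threshold=1.0)
    for line in detect_bottlenecks(hot, RULES, lookup_metric):
        print(line)
```

In this sketch the hotspots are chosen by a fixed time threshold rather than by user selection, which is a simplification of the abstract's user-interface step.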