Patents by Inventor David J. Klepacki

David J. Klepacki has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8904398
    Abstract: Mapping tasks to physical processors in a parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.
    Type: Grant
    Filed: March 6, 2012
    Date of Patent: December 2, 2014
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
  • Patent number: 8869155
    Abstract: A method for increasing performance of an operation on a distributed memory machine is provided. Asynchronous parallel steps in the operation are transformed into synchronous parallel steps. The synchronous parallel steps of the operation are rearranged to generate an altered operation that schedules memory accesses for increasing locality of reference. The altered operation that schedules memory accesses for increasing locality of reference is mapped onto the distributed memory machine. Then, the altered operation is executed on the distributed memory machine to simulate local memory accesses with virtual threads to check cache performance within each node of the distributed memory machine.
    Type: Grant
    Filed: November 12, 2010
    Date of Patent: October 21, 2014
    Assignee: International Business Machines Corporation
    Inventors: George Almasi, Guojing Cong, David J. Klepacki, Vijay A. Saraswat
  • Publication number: 20130014115
    Abstract: Mapping tasks to physical processors in a parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.
    Type: Application
    Filed: September 14, 2012
    Publication date: January 10, 2013
    Applicant: International Business Machines Corporation
    Inventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
  • Patent number: 8327325
    Abstract: A target application is automatically tuned. A list of solutions for identified performance bottlenecks in a target application is retrieved from a storage device. A plurality of modules is executed to compute specific parameters for solutions contained in the list of solutions. A list of modification commands associated with specific parameters computed by the plurality of modules is generated. The list of modification commands associated with the specific parameters is appended to a command sequence list. The list of modification commands is implemented in the target application. Specific source code regions corresponding to the identified performance bottlenecks in the target application are automatically tuned using the implemented list of modification commands. Then, the tuned target application is stored in the storage device.
    Type: Grant
    Filed: January 14, 2009
    Date of Patent: December 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, David J. Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
  • Publication number: 20120254879
    Abstract: Mapping tasks to physical processors in a parallel computing system may include partitioning tasks in the parallel computing system into groups of tasks, the tasks being grouped according to their communication characteristics (e.g., pattern and frequency); mapping, by a processor, the groups of tasks to groups of physical processors, respectively; and fine tuning, by the processor, the mapping within each of the groups.
    Type: Application
    Filed: March 6, 2012
    Publication date: October 4, 2012
    Applicant: International Business Machines Corporation
    Inventors: I-Hsin Chung, David J. Klepacki, Che-Rung Lee, Hui-Fang Wen
  • Publication number: 20120124585
    Abstract: A method for increasing performance of an operation on a distributed memory machine is provided. Asynchronous parallel steps in the operation are transformed into synchronous parallel steps. The synchronous parallel steps of the operation are rearranged to generate an altered operation that schedules memory accesses for increasing locality of reference. The altered operation that schedules memory accesses for increasing locality of reference is mapped onto the distributed memory machine. Then, the altered operation is executed on the distributed memory machine to simulate local memory accesses with virtual threads to check cache performance within each node of the distributed memory machine.
    Type: Application
    Filed: November 12, 2010
    Publication date: May 17, 2012
    Applicant: International Business Machines Corporation
    Inventors: George Almasi, Guojing Cong, David J. Klepacki, Vijay A. Saraswat
  • Publication number: 20100180255
    Abstract: A target application is automatically tuned. A list of solutions for identified performance bottlenecks in a target application is retrieved from a storage device. A plurality of modules is executed to compute specific parameters for solutions contained in the list of solutions. A list of modification commands associated with specific parameters computed by the plurality of modules is generated. The list of modification commands associated with the specific parameters is appended to a command sequence list. The list of modification commands is implemented in the target application. Specific source code regions corresponding to the identified performance bottlenecks in the target application are automatically tuned using the implemented list of modification commands. Then, the tuned target application is stored in the storage device.
    Type: Application
    Filed: January 14, 2009
    Publication date: July 15, 2010
    Applicant: International Business Machines Corporation
    Inventors: I-Hsin Chung, Guojing Cong, David J. Klepacki, Simone Sbaraglia, Seetharami R. Seelam, Hui-Fang Wen
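
Illustrative sketch (not taken from any of the patents listed above): the task-mapping abstract (patent 8904398 and its related applications) names three steps: partition tasks into groups by communication characteristics, map the task groups to groups of physical processors, and fine tune the mapping within each group. The Python below is one possible reading of those steps under stated assumptions. The function names, the greedy grouping rule, the brute-force fine-tuning step, and the `comm` and `dist` matrices are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
from itertools import permutations


def partition_tasks(comm, group_size):
    """Greedily group tasks so that heavily communicating tasks share a group.

    comm[i][j] is an assumed traffic volume from task i to task j.
    """
    unassigned = set(range(len(comm)))
    groups = []
    while unassigned:
        seed = min(unassigned)
        unassigned.remove(seed)
        group = [seed]
        while len(group) < group_size and unassigned:
            # Pick the unassigned task that exchanges the most traffic
            # with the tasks already in this group.
            best = max(
                sorted(unassigned),
                key=lambda t: sum(comm[t][g] + comm[g][t] for g in group),
            )
            unassigned.remove(best)
            group.append(best)
        groups.append(group)
    return groups


def fine_tune(group, procs, comm, dist):
    """Try every placement of a task group on its processor group and keep
    the cheapest one. Brute force, so only sensible for small groups."""

    def cost(placement):
        return sum(
            comm[a][b] * dist[placement[a]][placement[b]]
            for a in group
            for b in group
        )

    candidates = (dict(zip(group, p)) for p in permutations(procs))
    return min(candidates, key=cost)


def map_tasks(comm, proc_groups, dist):
    """Partition tasks, pair task groups with processor groups in order,
    then fine tune inside each pair. Assumes equal-sized processor groups."""
    groups = partition_tasks(comm, len(proc_groups[0]))
    mapping = {}
    for group, procs in zip(groups, proc_groups):
        mapping.update(fine_tune(group, procs, comm, dist))
    return mapping


# Hypothetical input: tasks 0 and 2 talk heavily, as do tasks 1 and 3.
comm = [
    [0, 1, 9, 0],
    [1, 0, 0, 9],
    [9, 0, 0, 1],
    [0, 9, 1, 0],
]
# Processors 0-1 share one node and 2-3 share another (assumed topology).
proc_groups = [[0, 1], [2, 3]]
dist = [
    [0, 1, 4, 4],
    [1, 0, 4, 4],
    [4, 4, 0, 1],
    [4, 4, 1, 0],
]
mapping = map_tasks(comm, proc_groups, dist)
```

With this input, the heavily communicating pairs (0, 2) and (1, 3) each end up on processors within a single node. A real system would likely replace the exhaustive search in `fine_tune` with a heuristic, since the number of placements grows factorially with group size.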