Patents by Inventor Jun Shirako

Jun Shirako has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11630654
    Abstract: Aspects include modeling data cache utilization for each loop in a loop nest; estimating the total data cache lines fetched in one iteration of the loop; and determining the possibility of data cache reuse across loop iterations using data cache lines fetched and associativity constraints. Aspects also include estimating, for memory reference pairs, reuse by one reference of a data cache line fetched by another; estimating the total number of cache misses for all iterations of the loop; and estimating the total number of cache misses of a reference for iterations of a next outer loop as equal to the total cache misses for an entire inner loop. Aspects further include estimating the memory cost of a loop unroll-and-jam transformation, without performing the transformation; and extending a data cache model to estimate the best unroll-and-jam factors for the loop nest, capable of minimizing total cache misses incurred by the memory references in the loop body.
    Type: Grant
    Filed: August 19, 2021
    Date of Patent: April 18, 2023
    Assignee: International Business Machines Corporation
    Inventors: Wai Hung Tsang, Prithayan Barua, Ettore Tiotto, Bardia Mahjour, Jun Shirako
  • Publication number: 20230067853
    Abstract: Aspects include modeling data cache utilization for each loop in a loop nest; estimating the total data cache lines fetched in one iteration of the loop; and determining the possibility of data cache reuse across loop iterations using data cache lines fetched and associativity constraints. Aspects also include estimating, for memory reference pairs, reuse by one reference of a data cache line fetched by another; estimating the total number of cache misses for all iterations of the loop; and estimating the total number of cache misses of a reference for iterations of a next outer loop as equal to the total cache misses for an entire inner loop. Aspects further include estimating the memory cost of a loop unroll-and-jam transformation, without performing the transformation; and extending a data cache model to estimate the best unroll-and-jam factors for the loop nest, capable of minimizing total cache misses incurred by the memory references in the loop body.
    Type: Application
    Filed: August 19, 2021
    Publication date: March 2, 2023
    Inventors: Wai Hung Tsang, Prithayan Barua, Ettore Tiotto, Bardia Mahjour, Jun Shirako
  • Patent number: 8812880
    Abstract: Provided is a multiprocessor system and a compiler used in the system for automatically extracting tasks having parallelism from an input program to be processed, performing scheduling to efficiently operate processor units by arranging the tasks according to characteristics of the processor units, and generating codes for optimizing a system frequency and a power supply voltage by estimating a processing amount of the processor units.
    Type: Grant
    Filed: January 11, 2010
    Date of Patent: August 19, 2014
    Assignee: Waseda University
    Inventors: Hironori Kasahara, Keiji Kimura, Jun Shirako, Masaki Ito, Hiroaki Shikano
  • Patent number: 8250548
    Abstract: A heterogeneous multiprocessor system including a plurality of processor elements having mutually different instruction sets and structures prevents a specific processor element from running short of resources, thereby improving throughput. An executable task is extracted based on a preset dependency relationship between a plurality of tasks, and the plurality of first processors are allocated to a general-purpose processor group based on a dependency relationship among the extracted tasks. A second processor is allocated to an accelerator group, a task to be allocated is determined from the extracted tasks based on a priority value for each of the tasks, and an execution cost of executing the determined task by the first processor is compared with an execution cost of executing the task by the second processor. The task is allocated to whichever of the general-purpose processor group and the accelerator group is judged to have the lower execution cost as a result of the comparison.
    Type: Grant
    Filed: January 23, 2007
    Date of Patent: August 21, 2012
    Assignee: Waseda University
    Inventors: Hironori Kasahara, Keiji Kimura, Jun Shirako, Yasutaka Wada, Masaki Ito, Hiroaki Shikano
  • Patent number: 7895453
    Abstract: Provided is a multiprocessor system and a compiler used in the system for automatically extracting tasks having parallelism from an input program to be processed, performing scheduling to efficiently operate processor units by arranging the tasks according to characteristics of the processor units, and generating codes for optimizing a system frequency and a power supply voltage by estimating a processing amount of the processor units.
    Type: Grant
    Filed: April 12, 2006
    Date of Patent: February 22, 2011
    Assignee: Waseda University
    Inventors: Hironori Kasahara, Keiji Kimura, Jun Shirako, Masaki Ito, Hiroaki Shikano
  • Publication number: 20100146310
    Abstract: Provided is a multiprocessor system and a compiler used in the system for automatically extracting tasks having parallelism from an input program to be processed, performing scheduling to efficiently operate processor units by arranging the tasks according to characteristics of the processor units, and generating codes for optimizing a system frequency and a power supply voltage by estimating a processing amount of the processor units.
    Type: Application
    Filed: January 11, 2010
    Publication date: June 10, 2010
    Inventors: Hironori Kasahara, Keiji Kimura, Jun Shirako, Masaki Ito, Hiroaki Shikano
  • Publication number: 20070283358
    Abstract: A heterogeneous multiprocessor system including a plurality of processor elements having mutually different instruction sets and structures prevents a specific processor element from running short of resources, thereby improving throughput. An executable task is extracted based on a preset dependency relationship between a plurality of tasks, and the plurality of first processors are allocated to a general-purpose processor group based on a dependency relationship among the extracted tasks. A second processor is allocated to an accelerator group, a task to be allocated is determined from the extracted tasks based on a priority value for each of the tasks, and an execution cost of executing the determined task by the first processor is compared with an execution cost of executing the task by the second processor. The task is allocated to whichever of the general-purpose processor group and the accelerator group is judged to have the lower execution cost as a result of the comparison.
    Type: Application
    Filed: January 23, 2007
    Publication date: December 6, 2007
    Inventors: Hironori Kasahara, Keiji Kimura, Jun Shirako, Yasutaka Wada, Masaki Ito, Hiroaki Shikano
  • Publication number: 20070255929
    Abstract: Provided is a multiprocessor system and a compiler used in the system for automatically extracting tasks having parallelism from an input program to be processed, performing scheduling to efficiently operate processor units by arranging the tasks according to characteristics of the processor units, and generating codes for optimizing a system frequency and a power supply voltage by estimating a processing amount of the processor units.
    Type: Application
    Filed: April 12, 2006
    Publication date: November 1, 2007
    Inventors: Hironori Kasahara, Keiji Kimura, Jun Shirako, Masaki Ito, Hiroaki Shikano
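
The cost comparison described in the heterogeneous multiprocessor abstract (patent number 8250548 and publication number 20070283358) can be illustrated with a small sketch. This is not the patented method. It is a minimal illustration under stated assumptions: the names `Task` and `allocate` and the two group labels are hypothetical, a larger priority value is assumed to mean the task is picked first, and the cost of a group is assumed to be the time until that group is free plus the task's estimated execution time on it.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A ready task with hypothetical, pre-estimated execution costs."""
    name: str
    priority: int       # assumption: larger value is scheduled first
    cpu_cost: float     # estimated time on the general-purpose group
    accel_cost: float   # estimated time on the accelerator group


def allocate(ready_tasks):
    """Assign each task to the group with the lower estimated finish time.

    Assumption: each group is modeled as a single queue, so a group's cost
    for a task is the time the group becomes free plus the task's cost there.
    """
    busy_until = {"general_purpose": 0.0, "accelerator": 0.0}
    assignment = {}
    # Pick tasks in descending priority order.
    for task in sorted(ready_tasks, key=lambda t: t.priority, reverse=True):
        cpu_finish = busy_until["general_purpose"] + task.cpu_cost
        accel_finish = busy_until["accelerator"] + task.accel_cost
        # Ties go to the general-purpose group.
        group = "accelerator" if accel_finish < cpu_finish else "general_purpose"
        busy_until[group] = min(cpu_finish, accel_finish)
        assignment[task.name] = group
    return assignment
```

Because the accelerator's queue time counts toward its cost, a task that would run faster on the accelerator in isolation can still be sent to the general-purpose group when the accelerator is already loaded, which is one simple way to keep a single processor element from becoming the bottleneck.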