Patents by Inventor Kazuaki Ishizaki

Kazuaki Ishizaki has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240094999
    Abstract: A computer-implemented method, system and computer program product for improving the performance of a program that manipulates two vectors of data. It is determined whether the program contains one of the following patterns: a first pattern corresponding to v0.rearrange(s, v1); a second pattern corresponding to v0.blend(v1, m); and a third pattern corresponding to v0.rearrange(s).blend(v1.rearrange(s), m). Upon identifying code written as the first pattern in the program, the first pattern is rewritten and replaced with the second or third pattern if the execution time of the program with the second or third pattern is less than the execution time of the program with the first pattern. In a similar manner, upon identifying code written as the second or third pattern in the program, the second or third pattern is rewritten and replaced with the first pattern if the execution time of the program can be improved.
    Type: Application
    Filed: September 20, 2022
    Publication date: March 21, 2024
    Inventor: Kazuaki Ishizaki
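
The three patterns named in the abstract above can be modeled with plain Python lists. This is only a toy sketch, not the vector API the abstract refers to: the helper names (`rearrange`, `rearrange2`, `blend`, `as_rearrange_blend`) are hypothetical, and it assumes that a shuffle index in [0, N) selects a lane of v0 while an index in [N, 2N) selects a lane of v1. A real vector API may encode the second source differently.

```python
# Toy model of the three patterns using plain Python lists.
# Assumption: shuffle index i in [0, N) picks v0[i]; i in [N, 2N) picks v1[i - N].

def rearrange(v, s):
    """Single-source rearrange: result lane k is v[s[k] mod N]."""
    n = len(v)
    return [v[i % n] for i in s]

def blend(v0, v1, m):
    """Per-lane select: take v1's lane where the mask is set, else v0's lane."""
    return [b if flag else a for a, b, flag in zip(v0, v1, m)]

def rearrange2(v0, s, v1):
    """First pattern, v0.rearrange(s, v1), under the assumption above."""
    n = len(v0)
    return [v0[i] if i < n else v1[i - n] for i in s]

def as_rearrange_blend(v0, s, v1):
    """Third pattern, v0.rearrange(s).blend(v1.rearrange(s), m)."""
    n = len(v0)
    m = [i >= n for i in s]  # lanes that must come from v1
    return blend(rearrange(v0, s), rearrange(v1, s), m)

v0 = [10, 11, 12, 13]
v1 = [20, 21, 22, 23]
s = [0, 5, 2, 7]  # lanes 1 and 3 come from v1

result_first = rearrange2(v0, s, v1)
result_third = as_rearrange_blend(v0, s, v1)
```

In this example the shuffle leaves every lane in place, so the same result is also produced by the second pattern, `blend(v0, v1, [False, True, False, True])`. Which form runs faster on real hardware is exactly what the abstract says must be measured.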
  • Patent number: 11398004
    Abstract: A method is provided for buffer allocation on a graphics processing unit. The method includes analyzing, by the graphics processing unit, a program to be executed on the graphics processing unit to determine, for an object in the program, a set of elements in the object that are designated to be accessed during an execution of the program. The method further includes allocating, by the graphics processing unit, a placement of the object in a device buffer on the graphics processing unit based on the set of elements to minimize a number of memory accesses during the execution of the program.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: July 26, 2022
    Assignee: International Business Machines Corporation
    Inventor: Kazuaki Ishizaki
  • Patent number: 11210193
    Abstract: A method for improving performance of a system including a first processor and a second processor includes obtaining a code region specified to be executed on the second processor, the code region including a plurality of instructions, calculating a performance improvement of executing at least one of the plurality of instructions included in the code region on the second processor over executing the at least one instruction on the first processor, removing the at least one instruction from the code region in response to a condition including that the performance improvement does not exceed a first threshold, and repeating the calculating and the removing to produce a modified code region specified to be executed on the second processor.
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: December 28, 2021
    Assignee: International Business Machines Corporation
    Inventor: Kazuaki Ishizaki
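
The calculate-and-remove loop described in the abstract above might look roughly like the following. This is a minimal sketch under stated assumptions: the function name `prune_region`, the per-instruction cost tables, and the definition of "performance improvement" as a simple difference of estimated times are all hypothetical and are not taken from the patent.

```python
def prune_region(region, first_cost, second_cost, threshold):
    """Drop instructions whose estimated gain on the second processor is too small.

    region: instruction names currently marked for the second processor.
    first_cost / second_cost: hypothetical per-instruction time estimates
    for the first and second processor.
    """
    region = list(region)  # work on a copy
    changed = True
    while changed:  # repeat calculating and removing until nothing changes
        changed = False
        for instr in region:
            gain = first_cost[instr] - second_cost[instr]
            if gain <= threshold:  # improvement does not exceed the threshold
                region.remove(instr)
                changed = True
                break  # restart the scan over the modified region
    return region

first_cost = {"load": 5, "mul": 40, "branch": 3, "add": 30}
second_cost = {"load": 6, "mul": 4, "branch": 9, "add": 5}
modified_region = prune_region(["load", "mul", "branch", "add"],
                               first_cost, second_cost, threshold=10)
```

With these made-up costs only `mul` and `add` stay in the region specified for the second processor.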
  • Patent number: 11188454
    Abstract: Methods and systems for training a neural network include determining a graph representation of a set of neural network training operations based on definition-use chains. A memory allocation queue is determined based on a slack value for each neural network training operation in the graph representation. Memory for each neural network training operation in the memory allocation queue is allocated. Execution of neural network training operations with non-zero slack is delayed to minimize an amount of memory allocated at any one time. Neural network training is executed using the allocated memory for each neural network training operation.
    Type: Grant
    Filed: March 25, 2019
    Date of Patent: November 30, 2021
    Assignee: International Business Machines Corporation
    Inventor: Kazuaki Ishizaki
  • Patent number: 11093224
    Abstract: A method performed during execution of a compilation process for a program having nested loops is provided. The method replaces multiple conditional branch instructions for a processor which uses a conditional branch instruction limited to only comparing a value of a general register with a value of a special register that holds a loop counter value. The method generates, in replacement of the multiple conditional branch instructions, the conditional branch instruction limited to only comparing the value of the general register with the value of the special register that holds the loop counter value for the inner-most loop. The method adds (i) a register initialization outside the nested loops and (ii) a register value adjustment to the inner-most loop. The method defines the value for the general register for the register initialization and conditions for the generated conditional branch instruction, responsive to requirements of the multiple conditional branch instructions.
    Type: Grant
    Filed: April 24, 2019
    Date of Patent: August 17, 2021
    Assignee: International Business Machines Corporation
    Inventors: Eri Ogawa, Kazuaki Ishizaki, Hiroshi Inoue
  • Patent number: 11042530
    Abstract: A computer-implemented method for improving performance of data processing with nullable schema information by using a data processing framework is presented. The method includes reading, by the processor, data from one or more blocks forming a column, where the data is stored in a database including the one or more blocks and determining, by the processor, whether any row in each block of the one or more blocks includes null data. The computer-implemented method further includes executing, by the data processing framework, optimized code if the block does not include null data and executing, by the data processing framework, non-optimized code if the block includes null data.
    Type: Grant
    Filed: January 17, 2018
    Date of Patent: June 22, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kazuaki Ishizaki, Takanori Ueda
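
The per-block choice between optimized and non-optimized code in the abstract above can be illustrated as follows. This is a sketch only: the function names are hypothetical, a column is modeled as a list of Python lists, and null is modeled as `None`. A real framework would more likely read the null information from block metadata than scan each block.

```python
def sum_block_fast(block):
    # Optimized path: valid only when the block holds no None values.
    return sum(block)

def sum_block_safe(block):
    # Non-optimized path: checks every row for null before using it.
    return sum(x for x in block if x is not None)

def sum_column(blocks):
    """Sum a column stored as blocks, picking a code path per block."""
    total = 0
    paths = []
    for block in blocks:
        has_null = any(x is None for x in block)
        if has_null:
            total += sum_block_safe(block)
            paths.append("safe")
        else:
            total += sum_block_fast(block)
            paths.append("fast")
    return total, paths

column = [[1, 2, 3], [4, None, 6], [7, 8]]
total, paths = sum_column(column)
```

Here only the middle block takes the slower null-checking path.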
  • Patent number: 11029924
    Abstract: A method includes identifying a code portion that accesses a primitive value in a user-defined function included in a user program, converting the code portion and an argument in a manner to directly reference an internal data representation of the user program, and generating a code for calling the user-defined function converted by the conversion.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: June 8, 2021
    Assignee: International Business Machines Corporation
    Inventors: Hiroshi Inoue, Kazuaki Ishizaki, Jan M. Wroblewski, Moriyoshi Ohara
  • Patent number: 10929161
    Abstract: A method, computer program product, and system include a processor(s) obtaining, during runtime, from a compiler, two versions of a data parallel loop for an operation. The host computing system includes a CPU, and a GPU is accessible to the host. The processor(s) profiles the two versions online by asynchronously executing the first version, in a profile mode, with the GPU and executing the second version, in the profile mode, with the CPU. The processor(s) generates execution times for the first version and the second version. The processor(s) stores the execution times and performance data in a storage, where the performance data comprises a size of the data parallel loop for the operation. The processor(s) updates a regression model(s) to predict performance numbers for a process of an unknown loop size. The processor(s) executes the operation with the CPU or the GPU based on the performance data.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: February 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Gita Koblents, Alon Shalev Housfater, Kazuaki Ishizaki, Akihiro Hayashi
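
One way to picture the regression step in the abstract above is a straight-line fit of execution time against loop size for each device. The sketch below is an assumption-laden illustration: the abstract does not say which regression model is used, the function names are hypothetical, and the sample timings are invented.

```python
def fit_line(samples):
    """Least-squares fit of time = a * size + b from (size, time) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    a = cov / var_x
    b = mean_y - a * mean_x
    return a, b

def choose_device(size, cpu_model, gpu_model):
    """Pick the device whose model predicts the shorter execution time."""
    cpu_time = cpu_model[0] * size + cpu_model[1]
    gpu_time = gpu_model[0] * size + gpu_model[1]
    return "GPU" if gpu_time < cpu_time else "CPU"

# Invented profile data: (loop size, measured time).
cpu_samples = [(100, 1.0), (200, 2.0), (400, 4.0)]  # cost grows with size
gpu_samples = [(100, 2.1), (200, 2.2), (400, 2.4)]  # fixed overhead, flat slope

cpu_model = fit_line(cpu_samples)
gpu_model = fit_line(gpu_samples)
small_choice = choose_device(50, cpu_model, gpu_model)
large_choice = choose_device(1000, cpu_model, gpu_model)
```

With this data the models predict the CPU for a loop of size 50 and the GPU for a loop of size 1000, neither of which was profiled directly.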
  • Publication number: 20200341765
    Abstract: A method performed during execution of a compilation process for a program having nested loops is provided. The method replaces multiple conditional branch instructions for a processor which uses a conditional branch instruction limited to only comparing a value of a general register with a value of a special register that holds a loop counter value. The method generates, in replacement of the multiple conditional branch instructions, the conditional branch instruction limited to only comparing the value of the general register with the value of the special register that holds the loop counter value for the inner-most loop. The method adds (i) a register initialization outside the nested loops and (ii) a register value adjustment to the inner-most loop. The method defines the value for the general register for the register initialization and conditions for the generated conditional branch instruction, responsive to requirements of the multiple conditional branch instructions.
    Type: Application
    Filed: April 24, 2019
    Publication date: October 29, 2020
    Inventors: Eri Ogawa, Kazuaki Ishizaki, Hiroshi Inoue
  • Publication number: 20200310955
    Abstract: Methods and systems for training a neural network include determining a graph representation of a set of neural network training operations based on definition-use chains. A memory allocation queue is determined based on a slack value for each neural network training operation in the graph representation. Memory for each neural network training operation in the memory allocation queue is allocated. Execution of neural network training operations with non-zero slack is delayed to minimize an amount of memory allocated at any one time. Neural network training is executed using the allocated memory for each neural network training operation.
    Type: Application
    Filed: March 25, 2019
    Publication date: October 1, 2020
    Inventor: Kazuaki Ishizaki
  • Patent number: 10776090
    Abstract: A computer-implemented method and a computer program product are provided for converting a first object having a first data format to a second object having a second data format that is different from the first format in that the second data format requires an object header. The method includes adding the object header to the first object. The method further includes returning, as a pointer, an address of the added object header to a user defined function that uses the second object. The first object lacks pointers to other objects, and does not escape.
    Type: Grant
    Filed: January 16, 2018
    Date of Patent: September 15, 2020
    Assignee: International Business Machines Corporation
    Inventor: Kazuaki Ishizaki
  • Patent number: 10754754
    Abstract: A method for improving performance of a system including a first processor and a second processor includes obtaining a code region specified to be executed on the second processor, the code region including a plurality of instructions, calculating a performance improvement of executing at least one of the plurality of instructions included in the code region on the second processor over executing the at least one instruction on the first processor, removing the at least one instruction from the code region in response to a condition including that the performance improvement does not exceed a first threshold, and repeating the calculating and the removing to produce a modified code region specified to be executed on the second processor.
    Type: Grant
    Filed: November 3, 2017
    Date of Patent: August 25, 2020
    Assignee: International Business Machines Corporation
    Inventor: Kazuaki Ishizaki
  • Patent number: 10719494
    Abstract: Methods and a system are provided for accelerating an operation in a B+-tree. A method includes forming triplets, by a triplet manager. Each of the triplets includes a pointer to a leaf node, a lower bound of a key on the leaf node, and an upper bound of the key on the leaf node. The method further includes performing, by the triplet manager, a lookup operation on the triplets responsive to the operation to avoid traversals of intermediate nodes for the operation. The method also includes executing, by a processor, the operation in the B+-tree while avoiding the traversals of the intermediate nodes for the operation responsive to a result of the lookup operation. The operation is any one of an insertion operation, a deletion operation, and a search operation.
    Type: Grant
    Filed: August 6, 2015
    Date of Patent: July 21, 2020
    Assignee: International Business Machines Corporation
    Inventor: Kazuaki Ishizaki
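
The triplet lookup in the abstract above can be sketched with a sorted side table that maps key ranges straight to leaves. This is a simplified model, not the patented design: the class name `TripletIndex` is hypothetical, leaf nodes are modeled as non-empty Python dicts with disjoint key ranges, and only the search operation is shown.

```python
import bisect

class TripletIndex:
    """Side table of (lower bound, upper bound, leaf) triplets sorted by lower bound."""

    def __init__(self, leaves):
        # Each leaf is a non-empty dict; key ranges are assumed disjoint.
        self.triplets = sorted(
            ((min(leaf), max(leaf), leaf) for leaf in leaves),
            key=lambda t: t[0],
        )
        self.lowers = [t[0] for t in self.triplets]

    def find_leaf(self, key):
        """Return the leaf whose [lower, upper] range covers key, or None."""
        i = bisect.bisect_right(self.lowers, key) - 1
        if i >= 0:
            lower, upper, leaf = self.triplets[i]
            if key <= upper:  # lower <= key already holds after the bisect
                return leaf
        return None  # miss: a caller would fall back to a root-to-leaf traversal

    def search(self, key):
        leaf = self.find_leaf(key)
        return None if leaf is None else leaf.get(key)

index = TripletIndex([{10: "c", 12: "d"}, {1: "a", 3: "b"}, {20: "e"}])
hit = index.search(12)
miss = index.search(5)
```

A hit reaches the leaf without visiting any intermediate node. On a miss, such as key 5 here, the operation would presumably proceed through the tree in the usual way.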
  • Publication number: 20200089477
    Abstract: A method includes identifying a code portion that accesses a primitive value in a user-defined function included in a user program, converting the code portion and an argument in a manner to directly reference an internal data representation of the user program, and generating a code for calling the user-defined function converted by the conversion.
    Type: Application
    Filed: November 18, 2019
    Publication date: March 19, 2020
    Inventors: Hiroshi Inoue, Kazuaki Ishizaki, Jan M. Wroblewski, Moriyoshi Ohara
  • Patent number: 10585647
    Abstract: A method includes identifying a code portion that accesses a primitive value in a user-defined function included in a user program, converting the code portion and an argument in a manner to directly reference an internal data representation of the user program, and generating a code for calling the user-defined function converted by the conversion.
    Type: Grant
    Filed: May 2, 2017
    Date of Patent: March 10, 2020
    Assignee: International Business Machines Corporation
    Inventors: Hiroshi Inoue, Kazuaki Ishizaki, Jan M. Wroblewski, Moriyoshi Ohara
  • Publication number: 20200065209
    Abstract: A method for improving performance of a system including a first processor and a second processor includes obtaining a code region specified to be executed on the second processor, the code region including a plurality of instructions, calculating a performance improvement of executing at least one of the plurality of instructions included in the code region on the second processor over executing the at least one instruction on the first processor, removing the at least one instruction from the code region in response to a condition including that the performance improvement does not exceed a first threshold, and repeating the calculating and the removing to produce a modified code region specified to be executed on the second processor.
    Type: Application
    Filed: October 28, 2019
    Publication date: February 27, 2020
    Inventor: Kazuaki Ishizaki
  • Publication number: 20200058095
    Abstract: A method is provided for buffer allocation on a graphics processing unit. The method includes analyzing, by the graphics processing unit, a program to be executed on the graphics processing unit to determine, for an object in the program, a set of elements in the object that are designated to be accessed during an execution of the program. The method further includes allocating, by the graphics processing unit, a placement of the object in a device buffer on the graphics processing unit based on the set of elements to minimize a number of memory accesses during the execution of the program.
    Type: Application
    Filed: October 28, 2019
    Publication date: February 20, 2020
    Inventor: Kazuaki Ishizaki
  • Patent number: 10540194
    Abstract: A method, computer program product, and system include a processor(s) obtaining, during runtime, from a compiler, two versions of a data parallel loop for an operation. The host computing system includes a CPU, and a GPU is accessible to the host. The processor(s) profiles the two versions online by asynchronously executing the first version, in a profile mode, with the GPU and executing the second version, in the profile mode, with the CPU. The processor(s) generates execution times for the first version and the second version. The processor(s) stores the execution times and performance data in a storage, where the performance data comprises a size of the data parallel loop for the operation. The processor(s) updates a regression model(s) to predict performance numbers for a process of an unknown loop size. The processor(s) executes the operation with the CPU or the GPU based on the performance data.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: January 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Gita Koblents, Alon Shalev Housfater, Kazuaki Ishizaki, Akihiro Hayashi
  • Patent number: 10515430
    Abstract: A method is provided for buffer allocation on a graphics processing unit. The method includes analyzing, by the graphics processing unit, a program to be executed on the graphics processing unit to determine, for an object in the program, a set of elements in the object that are designated to be accessed during an execution of the program. The method further includes allocating, by the graphics processing unit, a placement of the object in a device buffer on the graphics processing unit based on the set of elements to minimize a number of memory accesses during the execution of the program.
    Type: Grant
    Filed: November 3, 2015
    Date of Patent: December 24, 2019
    Assignee: International Business Machines Corporation
    Inventor: Kazuaki Ishizaki
  • Publication number: 20190384623
    Abstract: A method, computer program product, and system include a processor(s) obtaining, during runtime, from a compiler, two versions of a data parallel loop for an operation. The host computing system includes a CPU, and a GPU is accessible to the host. The processor(s) profiles the two versions online by asynchronously executing the first version, in a profile mode, with the GPU and executing the second version, in the profile mode, with the CPU. The processor(s) generates execution times for the first version and the second version. The processor(s) stores the execution times and performance data in a storage, where the performance data comprises a size of the data parallel loop for the operation. The processor(s) updates a regression model(s) to predict performance numbers for a process of an unknown loop size. The processor(s) executes the operation with the CPU or the GPU based on the performance data.
    Type: Application
    Filed: August 27, 2019
    Publication date: December 19, 2019
    Inventors: Gita Koblents, Alon Shalev Housfater, Kazuaki Ishizaki, Akihiro Hayashi