Patents by Inventor Kazuaki Ishizaki
Kazuaki Ishizaki has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240094999
Abstract: A computer-implemented method, system and computer program product for improving the performance of a program that manipulates two vectors of data. It is determined whether the program contains one of the following patterns: a first pattern corresponding to v0.rearrange(s, v1); a second pattern corresponding to v0.blend(v1, m); and a third pattern corresponding to v0.rearrange(s).blend(v1.rearrange(s), m). Upon identifying code written as the first pattern in the program, the first pattern is rewritten and replaced with the second or third pattern if the execution time of the program with the second or third pattern is less than the execution time of the program with the first pattern. In a similar manner, upon identifying code written as the second or third pattern in the program, the second or third pattern is rewritten and replaced with the first pattern if the execution time of the program can be improved.
Type: Application
Filed: September 20, 2022
Publication date: March 21, 2024
Inventor: Kazuaki Ishizaki
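To make the three patterns concrete, the following is a minimal Python sketch that models vectors as plain lists. The lane semantics are a simplifying assumption for illustration only: a shuffle index in [0, n) selects from v0 and an index in [n, 2n) selects from v1. The helper names are hypothetical, and the execution-time comparison that decides whether to rewrite is not modelled.

```python
def rearrange(v, s):
    """One-vector rearrange: output lane i takes v[s[i]]."""
    return [v[i] for i in s]


def blend(v0, v1, m):
    """Output lane i takes v1[i] where the mask is set, otherwise v0[i]."""
    return [b if take else a for a, b, take in zip(v0, v1, m)]


def rearrange2(v0, s, v1):
    """Two-vector rearrange (first pattern) under the assumed index model:
    indexes below n read from v0, indexes n..2n-1 read from v1."""
    n = len(v0)
    return [v0[i] if i < n else v1[i - n] for i in s]


def as_rearrange_blend(v0, s, v1):
    """Third pattern: rearrange both inputs with the same per-vector
    shuffle, then blend with a mask derived from the original shuffle."""
    n = len(v0)
    local = [i % n for i in s]   # lane index inside either vector
    mask = [i >= n for i in s]   # True where the lane comes from v1
    return blend(rearrange(v0, local), rearrange(v1, local), mask)


def blend_as_rearrange2(v0, v1, m):
    """Second pattern rewritten as the first: keep each lane in place
    and pick the source vector through the shuffle index."""
    n = len(v0)
    s = [i + n if take else i for i, take in enumerate(m)]
    return rearrange2(v0, s, v1)
```

Under this model the rewrites preserve results, so a compiler would be free to pick whichever form runs faster on the target hardware.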
-
Patent number: 11398004
Abstract: A method is provided for buffer allocation on a graphics processing unit. The method includes analyzing, by the graphics processing unit, a program to be executed on the graphics processing unit to determine, for an object in the program, a set of elements in the object that are designated to be accessed during an execution of the program. The method further includes allocating, by the graphics processing unit, a placement of the object in a device buffer on the graphics processing unit based on the set of elements to minimize a number of memory accesses during the execution of the program.
Type: Grant
Filed: October 28, 2019
Date of Patent: July 26, 2022
Assignee: International Business Machines Corporation
Inventor: Kazuaki Ishizaki
-
Patent number: 11210193
Abstract: A method for improving performance of a system including a first processor and a second processor includes obtaining a code region specified to be executed on the second processor, the code region including a plurality of instructions, calculating a performance improvement of executing at least one of the plurality of instructions included in the code region on the second processor over executing the at least one instruction on the first processor, removing the at least one instruction from the code region in response to a condition including that the performance improvement does not exceed a first threshold, and repeating the calculating and the removing to produce a modified code region specified to be executed on the second processor.
Type: Grant
Filed: October 28, 2019
Date of Patent: December 28, 2021
Assignee: International Business Machines Corporation
Inventor: Kazuaki Ishizaki
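The calculate-and-remove loop in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the per-instruction costs are hypothetical fixed numbers, the improvement is modelled as a simple cost difference, and only instructions at the edges of the region are candidates for removal so that the region stays contiguous.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    first_cost: float   # hypothetical cost on the first processor
    second_cost: float  # hypothetical cost on the second processor


def improvement(instr):
    """Assumed metric: time saved by running on the second processor."""
    return instr.first_cost - instr.second_cost


def prune_region(region, threshold):
    """Repeatedly drop a boundary instruction whose improvement does not
    exceed the threshold, until both ends of the region pay off."""
    region = list(region)
    changed = True
    while changed and region:
        changed = False
        if improvement(region[0]) <= threshold:
            region.pop(0)
            changed = True
        elif improvement(region[-1]) <= threshold:
            region.pop()
            changed = True
    return region
```

With this edge-only policy, an interior instruction with little or no gain stays in the region when moving it out would split the offloaded code in two.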
-
Patent number: 11188454
Abstract: Methods and systems for training a neural network include determining a graph representation of a set of neural network training operations based on definition-use chains. A memory allocation queue is determined based on a slack value for each neural network training operation in the graph representation. Memory for each neural network training operation in the memory allocation queue is allocated. Execution of neural network training operations with non-zero slack is delayed to minimize an amount of memory allocated at any one time. Neural network training is executed using the allocated memory for each neural network training operation.
Type: Grant
Filed: March 25, 2019
Date of Patent: November 30, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Kazuaki Ishizaki
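One way to picture the slack computation is the sketch below. It assumes, for illustration only, that every operation takes one time step, that slack is the gap between an operation's earliest and latest possible start, and that the allocation queue is ordered by increasing slack. The function name and the input format are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def slack_schedule(deps):
    """deps maps each operation to the set of operations whose results
    it uses (edges taken from definition-use chains)."""
    order = list(TopologicalSorter(deps).static_order())

    # Earliest start: one step after the last producer finishes.
    earliest = {}
    for op in order:
        earliest[op] = max((earliest[d] + 1 for d in deps.get(op, ())), default=0)
    horizon = max(earliest.values())

    # Invert the edges to find the consumers of each operation.
    users = {op: set() for op in order}
    for op, producers in deps.items():
        for producer in producers:
            users[producer].add(op)

    # Latest start: one step before the earliest-needed consumer.
    latest = {}
    for op in reversed(order):
        latest[op] = min((latest[u] - 1 for u in users[op]), default=horizon)

    slack = {op: latest[op] - earliest[op] for op in order}
    # Allocation queue: critical (zero-slack) operations first.
    queue = sorted(order, key=lambda op: (slack[op], earliest[op], op))
    # Operations with non-zero slack are delayed to their latest start.
    start = dict(latest)
    return slack, queue, start
```

Delaying an operation with slack means its output buffer is allocated closer to the point where a consumer needs it, which is the intuition behind reducing the amount of memory held at any one time.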
-
Patent number: 11093224
Abstract: A method performed during execution of a compilation process for a program having nested loops is provided. The method replaces multiple conditional branch instructions for a processor which uses a conditional branch instruction limited to only comparing a value of a general register with a value of a special register that holds a loop counter value. The method generates, in replacement of the multiple conditional branch instructions, the conditional branch instruction limited to only comparing the value of the general register with the value of the special register that holds the loop counter value for the inner-most loop. The method adds (i) a register initialization outside the nested loops and (ii) a register value adjustment to the inner-most loop. The method defines the value for the general register for the register initialization and conditions for the generated conditional branch instruction, responsive to requirements of the multiple conditional branch instructions.
Type: Grant
Filed: April 24, 2019
Date of Patent: August 17, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Eri Ogawa, Kazuaki Ishizaki, Hiroshi Inoue
-
Patent number: 11042530
Abstract: A computer-implemented method for improving performance of data processing with nullable schema information by using a data processing framework is presented. The method includes reading, by the processor, data from one or more blocks forming a column, where the data is stored in a database including the one or more blocks and determining, by the processor, whether any row in each block of the one or more blocks includes null data. The computer-implemented method further includes executing, by the data processing framework, optimized code if the block does not include null data and executing, by the data processing framework, non-optimized code if the block includes null data.
Type: Grant
Filed: January 17, 2018
Date of Patent: June 22, 2021
Assignee: International Business Machines Corporation
Inventors: Kazuaki Ishizaki, Takanori Ueda
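The per-block dispatch described here can be sketched as follows. This is a simplified illustration, not the framework's actual code: a column is modelled as a list of Python lists, nulls are `None`, the operation is a sum, and the null check scans the block (a real system might instead read a flag stored with the block). All names are hypothetical.

```python
def block_has_null(block):
    """Assumed check: scan the block for a null value."""
    return any(value is None for value in block)


def sum_fast(block):
    """Optimized path: no per-row null test is needed."""
    return sum(block)


def sum_general(block):
    """Non-optimized path: test every row and skip nulls."""
    total = 0
    for value in block:
        if value is not None:
            total += value
    return total


def sum_column(blocks):
    """Pick a code path per block and report which path each block used."""
    total = 0
    paths = []
    for block in blocks:
        if block_has_null(block):
            total += sum_general(block)
            paths.append("general")
        else:
            total += sum_fast(block)
            paths.append("fast")
    return total, paths
```

The point of the design is that a column declared nullable in the schema can still run the fast path on every block that happens to contain no nulls.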
-
Program optimization by converting code portions to directly reference internal data representations
Patent number: 11029924
Abstract: A method includes identifying a code portion that accesses a primitive value in a user-defined function included in a user program, converting the code portion and an argument in a manner to directly reference an internal data representation of the user program, and generating a code for calling the user-defined function converted by the conversion.
Type: Grant
Filed: November 18, 2019
Date of Patent: June 8, 2021
Assignee: International Business Machines Corporation
Inventors: Hiroshi Inoue, Kazuaki Ishizaki, Jan M. Wroblewski, Moriyoshi Ohara
-
Patent number: 10929161
Abstract: A method, computer program product, and system includes a processor(s) obtaining, during runtime, from a compiler, two versions of a data parallel loop for an operation. The host computing system includes a CPU, and a GPU is accessible to the host. The processor(s) online profiles the two versions by asynchronously executing the first version, in a profile mode, with the GPU and executing the second version, in the profile mode, with the CPU. The processor(s) generates execution times for the first version and the second version. The processor(s) stores the execution times and performance data in a storage, where the performance data comprises a size of the data parallel loop for the operation. The processor(s) update a regression model(s) to predict performance numbers for a process of an unknown loop size. The processor(s) execute the operation with the CPU or the GPU based on the performance data.
Type: Grant
Filed: August 27, 2019
Date of Patent: February 23, 2021
Assignee: International Business Machines Corporation
Inventors: Gita Koblents, Alon Shalev Housfater, Kazuaki Ishizaki, Akihiro Hayashi
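A rough sketch of the profile-then-predict idea follows. It assumes a straight-line model of execution time against loop size, fitted by ordinary least squares for each device; the abstract does not say which regression model is used. The class, method names, and timings are hypothetical, and each device needs at least two samples with different loop sizes before a prediction is possible.

```python
def fit_line(samples):
    """Least-squares fit of seconds = intercept + slope * loop_size."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


class DeviceSelector:
    """Stores profiled (loop size, time) pairs and picks a device."""

    def __init__(self):
        self.samples = {"cpu": [], "gpu": []}

    def record(self, device, loop_size, seconds):
        """Store one profiled run of a loop version on a device."""
        self.samples[device].append((loop_size, seconds))

    def predict(self, device, loop_size):
        """Predicted time for a loop size that was never profiled."""
        slope, intercept = fit_line(self.samples[device])
        return intercept + slope * loop_size

    def choose(self, loop_size):
        """Pick the device with the lower predicted time."""
        return min(("cpu", "gpu"), key=lambda d: self.predict(d, loop_size))
```

In this toy model the GPU carries a fixed launch and transfer overhead, so small loops stay on the CPU and large loops move to the GPU once the fitted lines cross.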
-
Publication number: 20200341765
Abstract: A method performed during execution of a compilation process for a program having nested loops is provided. The method replaces multiple conditional branch instructions for a processor which uses a conditional branch instruction limited to only comparing a value of a general register with a value of a special register that holds a loop counter value. The method generates, in replacement of the multiple conditional branch instructions, the conditional branch instruction limited to only comparing the value of the general register with the value of the special register that holds the loop counter value for the inner-most loop. The method adds (i) a register initialization outside the nested loops and (ii) a register value adjustment to the inner-most loop. The method defines the value for the general register for the register initialization and conditions for the generated conditional branch instruction, responsive to requirements of the multiple conditional branch instructions.
Type: Application
Filed: April 24, 2019
Publication date: October 29, 2020
Inventors: Eri Ogawa, Kazuaki Ishizaki, Hiroshi Inoue
-
Publication number: 20200310955
Abstract: Methods and systems for training a neural network include determining a graph representation of a set of neural network training operations based on definition-use chains. A memory allocation queue is determined based on a slack value for each neural network training operation in the graph representation. Memory for each neural network training operation in the memory allocation queue is allocated. Execution of neural network training operations with non-zero slack is delayed to minimize an amount of memory allocated at any one time. Neural network training is executed using the allocated memory for each neural network training operation.
Type: Application
Filed: March 25, 2019
Publication date: October 1, 2020
Inventor: Kazuaki Ishizaki
-
Patent number: 10776090
Abstract: A computer-implemented method and a computer program product are provided for converting a first object having a first data format to a second object having a second data format that is different from the first format in that the second data format requires an object header. The method includes adding the object header to the first object. The method further includes returning, as a pointer, an address of the added object header to a user defined function that uses the second object. The first object lacks pointers to other objects, and does not escape.
Type: Grant
Filed: January 16, 2018
Date of Patent: September 15, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Kazuaki Ishizaki
-
Patent number: 10754754
Abstract: A method for improving performance of a system including a first processor and a second processor includes obtaining a code region specified to be executed on the second processor, the code region including a plurality of instructions, calculating a performance improvement of executing at least one of the plurality of instructions included in the code region on the second processor over executing the at least one instruction on the first processor, removing the at least one instruction from the code region in response to a condition including that the performance improvement does not exceed a first threshold, and repeating the calculating and the removing to produce a modified code region specified to be executed on the second processor.
Type: Grant
Filed: November 3, 2017
Date of Patent: August 25, 2020
Assignee: International Business Machines Corporation
Inventor: Kazuaki Ishizaki
-
Patent number: 10719494
Abstract: Methods and a system are provided for accelerating an operation in a B+-tree. A method includes forming triplets, by a triplet manager. Each of the triplets includes a pointer to a leaf node, a lower bound of a key on the leaf node, and an upper bound of the key on the leaf node. The method further includes performing, by the triplet manager, a lookup operation on the triplets responsive to the operation to avoid traversals of intermediate nodes for the operation. The method also includes executing, by a processor, the operation in the B+-tree while avoiding the traversals of the intermediate nodes for the operation responsive to a result of the lookup operation. The operation is any one of an insertion operation, a deletion operation, and a search operation.
Type: Grant
Filed: August 6, 2015
Date of Patent: July 21, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Kazuaki Ishizaki
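The triplet lookup can be illustrated with the sketch below, which covers only the search operation. It makes several assumptions that the abstract does not state: leaves are small dictionaries, the triplets are kept in a list sorted by lower bound, and the lookup is a binary search over that list. A miss means the caller would fall back to an ordinary root-to-leaf traversal, which is not shown. All names are hypothetical.

```python
from bisect import bisect_right
from collections import namedtuple
from dataclasses import dataclass, field

# (pointer to leaf node, lower bound of its keys, upper bound of its keys)
Triplet = namedtuple("Triplet", ["leaf", "lower", "upper"])


@dataclass
class Leaf:
    entries: dict = field(default_factory=dict)  # key -> value


class TripletManager:
    """Maps a key straight to its leaf without visiting inner nodes."""

    def __init__(self, leaves):
        # Every leaf is assumed to hold at least one key.
        self.triplets = sorted(
            (Triplet(leaf, min(leaf.entries), max(leaf.entries)) for leaf in leaves),
            key=lambda t: t.lower,
        )
        self.lowers = [t.lower for t in self.triplets]

    def find_leaf(self, key):
        """Return the leaf whose key range covers key, or None on a miss."""
        i = bisect_right(self.lowers, key) - 1
        if i >= 0 and self.triplets[i].lower <= key <= self.triplets[i].upper:
            return self.triplets[i].leaf
        return None


def search(manager, key):
    """Search served from the triplets; None if the key is not found."""
    leaf = manager.find_leaf(key)
    return None if leaf is None else leaf.entries.get(key)
```

Insertion and deletion would also need to update the affected triplet's bounds, which this sketch leaves out.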
-
Program optimization by converting code portions to directly reference internal data representations
Publication number: 20200089477
Abstract: A method includes identifying a code portion that accesses a primitive value in a user-defined function included in a user program, converting the code portion and an argument in a manner to directly reference an internal data representation of the user program, and generating a code for calling the user-defined function converted by the conversion.
Type: Application
Filed: November 18, 2019
Publication date: March 19, 2020
Inventors: Hiroshi Inoue, Kazuaki Ishizaki, Jan M. Wroblewski, Moriyoshi Ohara
-
Program optimization by converting code portions to directly reference internal data representations
Patent number: 10585647
Abstract: A method includes identifying a code portion that accesses a primitive value in a user-defined function included in a user program, converting the code portion and an argument in a manner to directly reference an internal data representation of the user program, and generating a code for calling the user-defined function converted by the conversion.
Type: Grant
Filed: May 2, 2017
Date of Patent: March 10, 2020
Assignee: International Business Machines Corporation
Inventors: Hiroshi Inoue, Kazuaki Ishizaki, Jan M. Wroblewski, Moriyoshi Ohara
-
Publication number: 20200065209
Abstract: A method for improving performance of a system including a first processor and a second processor includes obtaining a code region specified to be executed on the second processor, the code region including a plurality of instructions, calculating a performance improvement of executing at least one of the plurality of instructions included in the code region on the second processor over executing the at least one instruction on the first processor, removing the at least one instruction from the code region in response to a condition including that the performance improvement does not exceed a first threshold, and repeating the calculating and the removing to produce a modified code region specified to be executed on the second processor.
Type: Application
Filed: October 28, 2019
Publication date: February 27, 2020
Inventor: Kazuaki Ishizaki
-
Publication number: 20200058095
Abstract: A method is provided for buffer allocation on a graphics processing unit. The method includes analyzing, by the graphics processing unit, a program to be executed on the graphics processing unit to determine, for an object in the program, a set of elements in the object that are designated to be accessed during an execution of the program. The method further includes allocating, by the graphics processing unit, a placement of the object in a device buffer on the graphics processing unit based on the set of elements to minimize a number of memory accesses during the execution of the program.
Type: Application
Filed: October 28, 2019
Publication date: February 20, 2020
Inventor: Kazuaki Ishizaki
-
Patent number: 10540194
Abstract: A method, computer program product, and system includes a processor(s) obtaining, during runtime, from a compiler, two versions of a data parallel loop for an operation. The host computing system includes a CPU, and a GPU is accessible to the host. The processor(s) online profiles the two versions by asynchronously executing the first version, in a profile mode, with the GPU and executing the second version, in the profile mode, with the CPU. The processor(s) generates execution times for the first version and the second version. The processor(s) stores the execution times and performance data in a storage, where the performance data comprises a size of the data parallel loop for the operation. The processor(s) update a regression model(s) to predict performance numbers for a process of an unknown loop size. The processor(s) execute the operation with the CPU or the GPU based on the performance data.
Type: Grant
Filed: December 21, 2017
Date of Patent: January 21, 2020
Assignee: International Business Machines Corporation
Inventors: Gita Koblents, Alon Shalev Housfater, Kazuaki Ishizaki, Akihiro Hayashi
-
Patent number: 10515430
Abstract: A method is provided for buffer allocation on a graphics processing unit. The method includes analyzing, by the graphics processing unit, a program to be executed on the graphics processing unit to determine, for an object in the program, a set of elements in the object that are designated to be accessed during an execution of the program. The method further includes allocating, by the graphics processing unit, a placement of the object in a device buffer on the graphics processing unit based on the set of elements to minimize a number of memory accesses during the execution of the program.
Type: Grant
Filed: November 3, 2015
Date of Patent: December 24, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Kazuaki Ishizaki
-
Publication number: 20190384623
Abstract: A method, computer program product, and system includes a processor(s) obtaining, during runtime, from a compiler, two versions of a data parallel loop for an operation. The host computing system includes a CPU, and a GPU is accessible to the host. The processor(s) online profiles the two versions by asynchronously executing the first version, in a profile mode, with the GPU and executing the second version, in the profile mode, with the CPU. The processor(s) generates execution times for the first version and the second version. The processor(s) stores the execution times and performance data in a storage, where the performance data comprises a size of the data parallel loop for the operation. The processor(s) update a regression model(s) to predict performance numbers for a process of an unknown loop size. The processor(s) execute the operation with the CPU or the GPU based on the performance data.
Type: Application
Filed: August 27, 2019
Publication date: December 19, 2019
Inventors: Gita Koblents, Alon Shalev Housfater, Kazuaki Ishizaki, Akihiro Hayashi