Patents by Inventor Luiz DeRose
Luiz DeRose has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10761820
Abstract: A parallelization assistant tool system to assist in parallelization of a computer program is disclosed. The system directs the execution of instrumented code of the computer program to collect performance statistics information relating to execution of loops within the computer program. The system provides a user interface for presenting to a programmer the performance statistics information collected for a loop within the computer program so that the programmer can prioritize efforts to parallelize the computer program. The system generates inlined source code of a loop by aggressively inlining functions substantially without regard to compilation performance, execution performance, or both. The system analyzes the inlined source code to determine the data-sharing attributes of the variables of the loop. The system may generate compiler directives to specify the data-sharing attributes of the variables.
Type: Grant
Filed: December 22, 2015
Date of Patent: September 1, 2020
Assignee: Cray, Inc.
Inventors: Heidi Poxon, John Levesque, Luiz DeRose, Brian H. Johnson
-
Patent number: 10698813
Abstract: A system is provided for allocating memory for data of a program for execution by a computer system with a multi-tier memory that includes LBM and HBM. The system accesses a data structure map that maps data structures of the program to the memory addresses within an address space of the program to which the data structures are initially allocated. The system executes the program to collect statistics relating to memory requests and memory bandwidth utilization of the program. The system determines an extent to which each data structure is used by a high memory utilization portion of the program based on the data structure map and the collected statistics. The system generates a memory allocation plan that favors allocating data structures in HBM based on the extent to which the data structures are used by a high memory utilization portion of the program.
Type: Grant
Filed: July 12, 2018
Date of Patent: June 30, 2020
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Heidi Lynn Poxon, William Homer, David W. Oehmke, Luiz DeRose, Clayton D. Andreasen, Sanyam Mehta
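The abstract above does not say how the allocation plan is computed. As an illustration only, the following minimal sketch reads LBM and HBM as low- and high-bandwidth memory and assumes a greedy policy: rank data structures by how heavily they are accessed per byte inside high-utilization regions, then fill a fixed HBM capacity in that order. The names `DataStructure`, `hot_accesses`, and `plan_allocation` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class DataStructure:
    name: str
    size_bytes: int    # assumed to be positive
    hot_accesses: int  # assumed statistic: accesses seen in high-utilization regions


def plan_allocation(structures, hbm_capacity_bytes):
    """Greedy sketch: place the most intensively used structures in HBM first."""
    plan = {}
    remaining = hbm_capacity_bytes
    # Rank by hot accesses per byte so that small, heavily used structures win.
    ranked = sorted(
        structures,
        key=lambda s: s.hot_accesses / s.size_bytes,
        reverse=True,
    )
    for s in ranked:
        if s.hot_accesses > 0 and s.size_bytes <= remaining:
            plan[s.name] = "HBM"
            remaining -= s.size_bytes
        else:
            # Unused structures, and those that no longer fit, stay in LBM.
            plan[s.name] = "LBM"
    return plan
```

Under this assumed policy, a structure that is never touched in a high-utilization region is left in LBM even when HBM has room, which is one possible reading of a plan that "favors" HBM by extent of use.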
-
Publication number: 20190163637
Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.
Type: Application
Filed: March 6, 2018
Publication date: May 30, 2019
Inventors: Sanyam Mehta, James Robert Kohn, Daniel Jonathan Ernst, Heidi Lynn Poxon, Luiz DeRose
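The control flow in the abstract above can be sketched as a small software model. This is an illustration under stated assumptions, not the patented hardware method: blocks are identified by integer block numbers rather than byte addresses, prefetch responses are treated as arriving immediately, and the resume criterion (unspecified in the abstract) is assumed to be "the distance has dropped below the maximum". The class and method names are hypothetical.

```python
class OutstandingRequestBuffer:
    """Toy model of an ORB that prefetches `degree` blocks at a time."""

    def __init__(self, start_block, num_blocks, degree, max_distance):
        self.next_block = start_block        # next block to prefetch
        self.remaining = num_blocks          # blocks still to be retrieved
        self.degree = degree
        self.max_distance = max_distance
        self.last_prefetched = start_block - 1
        self.last_read = start_block - 1
        self.paused = False

    def issue(self):
        """Issue up to `degree` prefetch requests; return the blocks requested."""
        if self.remaining == 0:
            return []
        # Pause once prefetching has run too far ahead of the demand reads.
        if self.last_prefetched - self.last_read >= self.max_distance:
            self.paused = True
            return []
        self.paused = False
        count = min(self.degree, self.remaining)
        batch = list(range(self.next_block, self.next_block + count))
        # Assumption: all responses arrive at once, so adjust the state now.
        self.next_block += count
        self.remaining -= count
        self.last_prefetched = batch[-1]
        return batch

    def on_read(self, block):
        """Record a read of a prefetched block; resume prefetching if paused."""
        self.last_read = max(self.last_read, block)
        # Assumed resume criterion: distance is back under the maximum.
        if self.paused and self.last_prefetched - self.last_read < self.max_distance:
            return self.issue()
        return []
```

With `degree=2` and `max_distance=4`, two calls to `issue()` fetch blocks 0 to 3, a third call pauses, and a subsequent `on_read(0)` resumes prefetching with blocks 4 and 5.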
-
Patent number: 10303610
Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.
Type: Grant
Filed: March 6, 2018
Date of Patent: May 28, 2019
Assignee: Cray, Inc.
Inventors: Sanyam Mehta, James Robert Kohn, Daniel Jonathan Ernst, Heidi Lynn Poxon, Luiz DeRose
-
Publication number: 20190042435
Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.
Type: Application
Filed: March 6, 2018
Publication date: February 7, 2019
Inventors: Sanyam Mehta, James Robert Kohn, Daniel Jonathan Ernst, Heidi Lynn Poxon, Luiz DeRose
-
Patent number: 10185659
Abstract: A system is provided for allocating memory for data of a program for execution by a computer system with a multi-tier memory that includes LBM and HBM. The system accesses a data structure map that maps data structures of the program to the memory addresses within an address space of the program to which the data structures are initially allocated. The system executes the program to collect statistics relating to memory requests and memory bandwidth utilization of the program. The system determines an extent to which each data structure is used by a high memory utilization portion of the program based on the data structure map and the collected statistics. The system generates a memory allocation plan that favors allocating data structures in HBM based on the extent to which the data structures are used by a high memory utilization portion of the program.
Type: Grant
Filed: December 9, 2016
Date of Patent: January 22, 2019
Assignee: Cray, Inc.
Inventors: Heidi Lynn Poxon, William Homer, David W. Oehmke, Luiz DeRose, Clayton D. Andreasen, Sanyam Mehta
-
Publication number: 20180322064
Abstract: A system is provided for allocating memory for data of a program for execution by a computer system with a multi-tier memory that includes LBM and HBM. The system accesses a data structure map that maps data structures of the program to the memory addresses within an address space of the program to which the data structures are initially allocated. The system executes the program to collect statistics relating to memory requests and memory bandwidth utilization of the program. The system determines an extent to which each data structure is used by a high memory utilization portion of the program based on the data structure map and the collected statistics. The system generates a memory allocation plan that favors allocating data structures in HBM based on the extent to which the data structures are used by a high memory utilization portion of the program.
Type: Application
Filed: July 12, 2018
Publication date: November 8, 2018
Inventors: Heidi Lynn Poxon, William Homer, David W. Oehmke, Luiz DeRose, Clayton D. Andreasen, Sanyam Mehta
-
Publication number: 20180165209
Abstract: A system is provided for allocating memory for data of a program for execution by a computer system with a multi-tier memory that includes LBM and HBM. The system accesses a data structure map that maps data structures of the program to the memory addresses within an address space of the program to which the data structures are initially allocated. The system executes the program to collect statistics relating to memory requests and memory bandwidth utilization of the program. The system determines an extent to which each data structure is used by a high memory utilization portion of the program based on the data structure map and the collected statistics. The system generates a memory allocation plan that favors allocating data structures in HBM based on the extent to which the data structures are used by a high memory utilization portion of the program.
Type: Application
Filed: December 9, 2016
Publication date: June 14, 2018
Inventors: Heidi Lynn Poxon, William Homer, David W. Oehmke, Luiz DeRose, Clayton D. Andreasen, Sanyam Mehta
-
Patent number: 9946654
Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.
Type: Grant
Filed: October 26, 2016
Date of Patent: April 17, 2018
Assignee: Cray Inc.
Inventors: Sanyam Mehta, James Robert Kohn, Daniel Jonathan Ernst, Heidi Lynn Poxon, Luiz DeRose
-
Publication number: 20180074963
Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.
Type: Application
Filed: October 26, 2016
Publication date: March 15, 2018
Inventors: Sanyam Mehta, James Robert Kohn, Daniel Jonathan Ernst, Heidi Lynn Poxon, Luiz DeRose
-
Publication number: 20170206068
Abstract: An optimization system to apply directives to a computer program without having to perform repeated front-end compilations of source code of the computer program is provided. In some embodiments, the optimization system performs a first compilation of the source code of the program to generate first front-end code and first back-end code of the computer program. The compilation includes a first front-end compilation and a first back-end compilation. The optimization system identifies a compiler directive to apply to a location within the first front-end code. The optimization system then performs a second back-end compilation of the first front-end code factoring in the compiler directive to generate second back-end code affected by the compiler directive.
Type: Application
Filed: May 9, 2016
Publication date: July 20, 2017
Inventors: Brian H. Johnson, Heidi Poxon, Luiz DeRose, Gary W. Elsesser, Clayton D. Andreasen, John Levesque
-
Publication number: 20160110174
Abstract: A parallelization assistant tool system to assist in parallelization of a computer program is disclosed. The system directs the execution of instrumented code of the computer program to collect performance statistics information relating to execution of loops within the computer program. The system provides a user interface for presenting to a programmer the performance statistics information collected for a loop within the computer program so that the programmer can prioritize efforts to parallelize the computer program. The system generates inlined source code of a loop by aggressively inlining functions substantially without regard to compilation performance, execution performance, or both. The system analyzes the inlined source code to determine the data-sharing attributes of the variables of the loop. The system may generate compiler directives to specify the data-sharing attributes of the variables.
Type: Application
Filed: December 22, 2015
Publication date: April 21, 2016
Inventors: Heidi Poxon, John Levesque, Luiz DeRose, Brian H. Johnson
-
Patent number: 9250877
Abstract: A parallelization assistant tool system to assist in parallelization of a computer program is disclosed. The system directs the execution of instrumented code of the computer program to collect performance statistics information relating to execution of loops within the computer program. The system provides a user interface for presenting to a programmer the performance statistics information collected for a loop within the computer program so that the programmer can prioritize efforts to parallelize the computer program. The system generates inlined source code of a loop by aggressively inlining functions substantially without regard to compilation performance, execution performance, or both. The system analyzes the inlined source code to determine the data-sharing attributes of the variables of the loop. The system may generate compiler directives to specify the data-sharing attributes of the variables.
Type: Grant
Filed: September 20, 2013
Date of Patent: February 2, 2016
Assignee: Cray Inc.
Inventors: Heidi Poxon, John Levesque, Luiz DeRose, Brian H. Johnson
-
Publication number: 20150089468
Abstract: A parallelization assistant tool system to assist in parallelization of a computer program is disclosed. The system directs the execution of instrumented code of the computer program to collect performance statistics information relating to execution of loops within the computer program. The system provides a user interface for presenting to a programmer the performance statistics information collected for a loop within the computer program so that the programmer can prioritize efforts to parallelize the computer program. The system generates inlined source code of a loop by aggressively inlining functions substantially without regard to compilation performance, execution performance, or both. The system analyzes the inlined source code to determine the data-sharing attributes of the variables of the loop. The system may generate compiler directives to specify the data-sharing attributes of the variables.
Type: Application
Filed: September 20, 2013
Publication date: March 26, 2015
Applicant: Cray Inc.
Inventors: Heidi Poxon, John Levesque, Luiz DeRose, Brian H. Johnson
-
Publication number: 20120311537
Abstract: Systems and methods provide a display indicating performance characteristics of a computer application. The display may include a call graph having nodes that represent subunits of the application. A first set of statistics for the subunit may be represented in the size or dimensions of the node. A second set of statistics may be displayed in the interior of the node. A third set of statistics may be displayed in response to selecting the node.
Type: Application
Filed: August 15, 2012
Publication date: December 6, 2012
Applicant: Cray Inc.
Inventors: Luiz DeRose, Dean T. Johnson
-
Patent number: 8286135
Abstract: Systems and methods provide a display indicating performance characteristics of a computer application. The display may include a call graph having nodes that represent subunits of the application. A first set of statistics for the subunit may be represented in the size or dimensions of the node. A second set of statistics may be displayed in the interior of the node. A third set of statistics may be displayed in response to selecting the node.
Type: Grant
Filed: October 17, 2007
Date of Patent: October 9, 2012
Assignee: Cray Inc.
Inventors: Luiz DeRose, Dean T. Johnson
-
Publication number: 20080092121
Abstract: Systems and methods provide a display indicating performance characteristics of a computer application. The display may include a call graph having nodes that represent subunits of the application. A first set of statistics for the subunit may be represented in the size or dimensions of the node. A second set of statistics may be displayed in the interior of the node. A third set of statistics may be displayed in response to selecting the node.
Type: Application
Filed: October 17, 2007
Publication date: April 17, 2008
Inventors: Luiz DeRose, Dean Johnson
-
Patent number: 7308681
Abstract: A method and apparatus for creating a compressed trace for a program, wherein events are compressed separately to provide improved compression and tracing. A sequence of events for a program is selected, and a sequence of values is then determined for each of the selected events occurring during an execution of the program. Each sequence of values is then compressed to generate a compressed sequence of values for each event. These values are then ordered in accordance with information stored in selected events (such as for example, branch events), where the ordered values correspond to the trace.
Type: Grant
Filed: October 28, 2003
Date of Patent: December 11, 2007
Assignee: International Business Machines Corporation
Inventors: Kattamuri Ekanadham, Pratap Pattnaik, Simone Sbaraglia, Luiz A. DeRose
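The idea of compressing each event's value sequence separately can be illustrated with a short sketch. It makes two substitutions that are not in the abstract: `zlib` stands in for whatever compressor the method uses, and an explicit, separately compressed list of event identifiers stands in for the ordering information that the abstract says is recovered from selected events such as branch events. The function names are hypothetical.

```python
import json
import zlib
from collections import defaultdict


def compress_trace(trace):
    """Split a trace of (event, value) pairs into per-event compressed streams."""
    streams = defaultdict(list)
    order = []
    for event, value in trace:
        streams[event].append(value)
        order.append(event)  # stand-in for branch-derived ordering information
    packed = {
        event: zlib.compress(json.dumps(values).encode())
        for event, values in streams.items()
    }
    packed_order = zlib.compress(json.dumps(order).encode())
    return packed, packed_order


def decompress_trace(packed, packed_order):
    """Rebuild the original interleaved trace from the per-event streams."""
    streams = {
        event: iter(json.loads(zlib.decompress(blob)))
        for event, blob in packed.items()
    }
    order = json.loads(zlib.decompress(packed_order))
    return [(event, next(streams[event])) for event in order]
```

Grouping values by event tends to put similar values next to each other (for example, load addresses that advance by a fixed stride), which is the usual reason per-stream compression does better than compressing the interleaved trace.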
-
Publication number: 20050091643
Abstract: A method and apparatus for creating a compressed trace for a program, wherein events are compressed separately to provide improved compression and tracing. A sequence of events for a program is selected, and a sequence of values is then determined for each of the selected events occurring during an execution of the program. Each sequence of values is then compressed to generate a compressed sequence of values for each event. These values are then ordered in accordance with information stored in selected events (such as for example, branch events), where the ordered values correspond to the trace.
Type: Application
Filed: October 28, 2003
Publication date: April 28, 2005
Applicant: International Business Machines Corporation
Inventors: Kattamuri Ekanadham, Pratap Pattnaik, Simone Sbaraglia, Luiz DeRose