Patents by Inventor Benedict R. Gaster
Benedict R. Gaster has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11231962
Abstract: With the success of programming models such as OpenCL and CUDA, heterogeneous computing platforms are becoming mainstream. However, these heterogeneous systems are low-level, not composable, and their behavior is often implementation defined even for standardized programming models. In contrast, the method and system embodiments for the heterogeneous parallel primitives (HPP) programming model disclosed herein provide a flexible and composable programming platform that guarantees behavior even in the case of developing high-performance code.
Type: Grant
Filed: October 30, 2017
Date of Patent: January 25, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Benedict R. Gaster, Lee W. Howes
-
Patent number: 10467013
Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Type: Grant
Filed: November 29, 2018
Date of Patent: November 5, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
-
Patent number: 10360652
Abstract: A processor comprising hardware logic configured to execute a first wavefront in a hardware resource and stop execution of the first wavefront before the first wavefront completes. The processor schedules a second wavefront for execution in the hardware resource.
Type: Grant
Filed: June 13, 2014
Date of Patent: July 23, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Marc S. Orr, Bradford M. Beckmann, Benedict R. Gaster, Steven K. Reinhardt, David A. Wood
-
Publication number: 20190146799
Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Type: Application
Filed: November 29, 2018
Publication date: May 16, 2019
Applicant: ADVANCED MICRO DEVICES, INC.
Inventors: Lee W. HOWES, Benedict R. GASTER, Michael C. HOUSTON
-
Patent number: 10235220
Abstract: A system, method, and computer program product are provided for improving resource utilization of multithreaded applications. Rather than requiring threads to block while waiting for data from a channel or requiring context switching to minimize blocking, the techniques disclosed herein provide an event-driven approach to launch kernels only when needed to perform operations on channel data, and then terminate in order to free resources. These operations are handled efficiently in hardware, but are flexible enough to be implemented in all manner of programming models.
Type: Grant
Filed: September 7, 2012
Date of Patent: March 19, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Lee W. Howes, Benedict R. Gaster, Michael Clair Houston, Michael Mantor
-
Patent number: 10146549
Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Type: Grant
Filed: November 6, 2017
Date of Patent: December 4, 2018
Assignee: Advanced Micro Devices, Inc.
Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
-
Publication number: 20180129504
Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Type: Application
Filed: November 6, 2017
Publication date: May 10, 2018
Applicant: Advanced Micro Devices, Inc.
Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
-
Publication number: 20180060124
Abstract: With the success of programming models such as OpenCL and CUDA, heterogeneous computing platforms are becoming mainstream. However, these heterogeneous systems are low-level, not composable, and their behavior is often implementation defined even for standardized programming models. In contrast, the method and system embodiments for the heterogeneous parallel primitives (HPP) programming model disclosed herein provide a flexible and composable programming platform that guarantees behavior even in the case of developing high-performance code.
Type: Application
Filed: October 30, 2017
Publication date: March 1, 2018
Applicant: Advanced Micro Devices, Inc.
Inventors: Benedict R. Gaster, Lee W. Howes
-
Patent number: 9811343
Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Type: Grant
Filed: May 26, 2017
Date of Patent: November 7, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
-
Publication number: 20170262289
Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Type: Application
Filed: May 26, 2017
Publication date: September 14, 2017
Applicant: Advanced Micro Devices, Inc.
Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
-
Patent number: 9697003
Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Type: Grant
Filed: June 7, 2013
Date of Patent: July 4, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
-
Patent number: 9424009
Abstract: Some embodiments include a processing subsystem that compiles program code to generate compiled program code. In these embodiments, while compiling the program code, the processing subsystem first identifies a pointer in the program code that points to an unspecified address space. The processing subsystem then analyzes at least a portion of the program code to determine one or more address spaces to which the pointer may point. Next, the processor updates metadata for the pointer to indicate the one or more address spaces to which the pointer may point, the metadata enabling a determination of an address space to which the pointer points during subsequent execution of the compiled program code.
Type: Grant
Filed: May 28, 2015
Date of Patent: August 23, 2016
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: Benedict R. Gaster
-
Patent number: 9424099
Abstract: Disclosed method, system, and computer program product embodiments include synchronizing a group of workitems on a processor by storing a respective program counter associated with each of the workitems, selecting at least one first workitem from the group for execution, and executing the selected at least one first workitem on the processor. The selecting is based upon the respective stored program counter associated with the at least one first workitem.
Type: Grant
Filed: November 8, 2012
Date of Patent: August 23, 2016
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael C. Houston, Benedict R. Gaster, Lee W. Howes, Michael Mantor, Dominik Behr
-
Patent number: 9361118
Abstract: A method, computer program product, and system are described that determine the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model. For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.
Type: Grant
Filed: May 12, 2014
Date of Patent: June 7, 2016
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Derek R. Hower, Mark D. Hill, David Wood, Steven K. Reinhardt, Benedict R. Gaster, Blake A. Hechtman, Bradford M. Beckmann
-
Allocating memory and using the allocated memory in a workgroup in a dispatched data parallel kernel
Patent number: 9244828
Abstract: In a computing system, memory may be managed by using a distributed array, which is a global set of local memory regions. A segment in the distributed array is allocated and is bound to a physical memory region. The segment is used by a workgroup in a dispatched data parallel kernel, wherein a workgroup includes one or more work items. When the distributed array is declared, parameters of the distributed array may be defined. The parameters may include an indication whether the distributed array is persistent (data written to the distributed array during one parallel dispatch is accessible by work items in a subsequent dispatch) or an indication whether the distributed array is shared (nested kernels may access the distributed array). The segment may be deallocated after it has been used.
Type: Grant
Filed: February 15, 2012
Date of Patent: January 26, 2016
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Benedict R. Gaster, Lee W. Howes
-
Publication number: 20150363903
Abstract: A processor comprising hardware logic configured to execute a first wavefront in a hardware resource and stop execution of the first wavefront before the first wavefront completes. The processor schedules a second wavefront for execution in the hardware resource.
Type: Application
Filed: June 13, 2014
Publication date: December 17, 2015
Inventors: Marc S. Orr, Bradford M. Beckmann, Benedict R. Gaster, Steven K. Reinhardt, David A. Wood
-
Publication number: 20150261511
Abstract: Some embodiments include a processing subsystem that compiles program code to generate compiled program code. In these embodiments, while compiling the program code, the processing subsystem first identifies a pointer in the program code that points to an unspecified address space. The processing subsystem then analyzes at least a portion of the program code to determine one or more address spaces to which the pointer may point. Next, the processor updates metadata for the pointer to indicate the one or more address spaces to which the pointer may point, the metadata enabling a determination of an address space to which the pointer points during subsequent execution of the compiled program code.
Type: Application
Filed: May 28, 2015
Publication date: September 17, 2015
Inventor: Benedict R. Gaster
-
Patent number: 9058192
Abstract: Some embodiments include a processing subsystem that compiles program code to generate compiled program code. In these embodiments, while compiling the program code, the processing subsystem first identifies a pointer in the program code that points to an unspecified address space. The processing subsystem then analyzes at least a portion of the program code to determine one or more address spaces to which the pointer may point. Next, the processor updates metadata for the pointer to indicate the one or more address spaces to which the pointer may point, the metadata enabling a determination of an address space to which the pointer points during subsequent execution of the compiled program code.
Type: Grant
Filed: October 1, 2012
Date of Patent: June 16, 2015
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: Benedict R. Gaster
-
Patent number: 8966461
Abstract: A medium, method, and apparatus are disclosed for eliding superfluous function invocations in a vector-processing environment. A compiler receives program code comprising a width-contingent invocation of a function. The compiler creates a width-specific executable version of the program code by determining a vector width of a target computer system and omitting the function from the width-specific executable if the vector width meets one or more criteria. For example, the compiler may omit the function call if the vector width is greater than a minimum size.
Type: Grant
Filed: September 29, 2011
Date of Patent: February 24, 2015
Assignee: Advanced Micro Devices, Inc.
Inventors: Benedict R. Gaster, Lee W. Howes, Mark D. Hummel
-
Publication number: 20140365752
Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Type: Application
Filed: June 7, 2013
Publication date: December 11, 2014
Inventors: Lee W. HOWES, Benedict R. GASTER, Michael C. HOUSTON