Patents by Inventor Lee W. Howes

Lee W. Howes has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10467013
    Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: November 5, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
  • Publication number: 20190146799
    Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
    Type: Application
    Filed: November 29, 2018
    Publication date: May 16, 2019
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
  • Patent number: 10235220
    Abstract: A system, method, and computer program product are provided for improving resource utilization of multithreaded applications. Rather than requiring threads to block while waiting for data from a channel or requiring context switching to minimize blocking, the techniques disclosed herein provide an event-driven approach to launch kernels only when needed to perform operations on channel data, and then terminate in order to free resources. These operations are handled efficiently in hardware, but are flexible enough to be implemented in all manner of programming models.
    Type: Grant
    Filed: September 7, 2012
    Date of Patent: March 19, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael Clair Houston, Michael Mantor
  • Patent number: 10146549
    Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
    Type: Grant
    Filed: November 6, 2017
    Date of Patent: December 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
  • Publication number: 20180129504
    Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
    Type: Application
    Filed: November 6, 2017
    Publication date: May 10, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
  • Publication number: 20180060124
    Abstract: With the success of programming models such as OpenCL and CUDA, heterogeneous computing platforms are becoming mainstream. However, these heterogeneous systems are low-level, not composable, and their behavior is often implementation defined even for standardized programming models. In contrast, the method and system embodiments for the heterogeneous parallel primitives (HPP) programming model disclosed herein provide a flexible and composable programming platform that guarantees behavior even in the case of developing high-performance code.
    Type: Application
    Filed: October 30, 2017
    Publication date: March 1, 2018
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Benedict R. Gaster, Lee W. Howes
  • Patent number: 9811343
    Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
    Type: Grant
    Filed: May 26, 2017
    Date of Patent: November 7, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
  • Publication number: 20170262289
    Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
    Type: Application
    Filed: May 26, 2017
    Publication date: September 14, 2017
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
  • Patent number: 9697003
    Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
    Type: Grant
    Filed: June 7, 2013
    Date of Patent: July 4, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
  • Patent number: 9594595
    Abstract: A system and methods embodying some aspects of the present embodiments for efficient load balancing using predication flags are provided. The load balancing system includes a first processing unit, a second processing unit, and a shared queue. The first processing unit is in communication with a first queue. The second processing unit is in communication with a second queue. The first and second queues are each configured to hold a packet. The shared queue is configured to maintain a work assignment, wherein the work assignment is to be processed by either the first or second processing unit.
    Type: Grant
    Filed: May 17, 2013
    Date of Patent: March 14, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Vinod Tipparaju, Lee W. Howes, Thomas Scogland
  • Patent number: 9424099
    Abstract: Disclosed method, system, and computer program product embodiments include synchronizing a group of workitems on a processor by storing a respective program counter associated with each of the workitems, selecting at least one first workitem from the group for execution, and executing the selected at least one first workitem on the processor. The selecting is based upon the respective stored program counter associated with the at least one first workitem.
    Type: Grant
    Filed: November 8, 2012
    Date of Patent: August 23, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael C. Houston, Benedict R. Gaster, Lee W. Howes, Michael Mantor, Dominik Behr
  • Patent number: 9262139
    Abstract: A method, a system, and a non-transitory computer readable medium for parallelizing computer program code including a loop are presented. An intermediate language version of the computer program code is generated based on a parallel type of the loop, wherein the intermediate language version includes information about parallelism in the computer program code. The intermediate language version is optimized at runtime based on the device characteristics where the computer program code is to be executed. The parallel type may include a thread parallel type, wherein the loop is dispatched to multiple threads for execution, or a general parallel type, wherein the loop is dispatched to a single thread and may be vectorized for execution. The intermediate language version may be saved separate from the computer program code.
    Type: Grant
    Filed: January 7, 2013
    Date of Patent: February 16, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Dongping Zhang
  • Publication number: 20160034304
    Abstract: A system and methods embodying some aspects of the present embodiments for maintaining compact in-order queues are provided. The queue management method includes requesting a work pointer from a primary queue, wherein the work pointer points to a work assignment comprising an indirect queue and a dependency list; responsive to the dependency list not being cleared, invalidating the work pointer in the primary queue and adding a new pointer to the end of the primary queue, the new pointer configured to point to the work assignment; and responsive to the dependency list being clear, removing the work pointer from the primary queue and performing work in the indirect queue.
    Type: Application
    Filed: July 29, 2014
    Publication date: February 4, 2016
    Inventors: Vinod Tipparaju, Lee W. Howes, Thomas R.W. Scogland
  • Patent number: 9244828
    Abstract: In a computing system, memory may be managed by using a distributed array, which is a global set of local memory regions. A segment in the distributed array is allocated and is bound to a physical memory region. The segment is used by a workgroup in a dispatched data parallel kernel, wherein a workgroup includes one or more work items. When the distributed array is declared, parameters of the distributed array may be defined. The parameters may include an indication whether the distributed array is persistent (data written to the distributed array during one parallel dispatch is accessible by work items in a subsequent dispatch) or an indication whether the distributed array is shared (nested kernels may access the distributed array). The segment may be deallocated after it has been used.
    Type: Grant
    Filed: February 15, 2012
    Date of Patent: January 26, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Benedict R. Gaster, Lee W. Howes
  • Patent number: 8966461
    Abstract: A medium, method, and apparatus are disclosed for eliding superfluous function invocations in a vector-processing environment. A compiler receives program code comprising a width-contingent invocation of a function. The compiler creates a width-specific executable version of the program code by determining a vector width of a target computer system and omitting the function from the width-specific executable if the vector width meets one or more criteria. For example, the compiler may omit the function call if the vector width is greater than a minimum size.
    Type: Grant
    Filed: September 29, 2011
    Date of Patent: February 24, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Benedict R. Gaster, Lee W. Howes, Mark D. Hummel
  • Publication number: 20140365752
    Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
    Type: Application
    Filed: June 7, 2013
    Publication date: December 11, 2014
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
  • Publication number: 20140344830
    Abstract: A system and methods embodying some aspects of the present embodiments for efficient load balancing using predication flags are provided. The load balancing system includes a first processing unit, a second processing unit, and a shared queue. The first processing unit is in communication with a first queue. The second processing unit is in communication with a second queue. The first and second queues are each configured to hold a packet. The shared queue is configured to maintain a work assignment, wherein the work assignment is to be processed by either the first or second processing unit.
    Type: Application
    Filed: May 17, 2013
    Publication date: November 20, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Vinod Tipparaju, Lee W. Howes, Thomas Scogland
  • Publication number: 20140196016
    Abstract: A method, a system, and a non-transitory computer readable medium for parallelizing computer program code including a loop are presented. An intermediate language version of the computer program code is generated based on a parallel type of the loop, wherein the intermediate language version includes information about parallelism in the computer program code. The intermediate language version is optimized at runtime based on the device characteristics where the computer program code is to be executed. The parallel type may include a thread parallel type, wherein the loop is dispatched to multiple threads for execution, or a general parallel type, wherein the loop is dispatched to a single thread and may be vectorized for execution. The intermediate language version may be saved separate from the computer program code.
    Type: Application
    Filed: January 7, 2013
    Publication date: July 10, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Dongping Zhang
  • Publication number: 20140157287
    Abstract: Methods, systems, and computer readable storage media embodiments allow for low overhead context switching of threads. In embodiments, applications, such as, but not limited to, iterative data-parallel applications, substantially reduce the overhead of context switching by making the state to be saved upon preemption of an executing thread configurable by a user or higher-level program. These methods, systems, and computer readable storage media include aspects of running a group of threads on a processor, saving state information by respective threads in the group in response to a signal from a scheduler, and preempting running of the group after the saving of the state information.
    Type: Application
    Filed: November 30, 2012
    Publication date: June 5, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Lee W. Howes, Benedict R. Gaster, Michael Mantor
  • Publication number: 20130332937
    Abstract: With the success of programming models such as OpenCL and CUDA, heterogeneous computing platforms are becoming mainstream. However, these heterogeneous systems are low-level, not composable, and their behavior is often implementation defined even for standardized programming models. In contrast, the method and system embodiments for the heterogeneous parallel primitives (HPP) programming model disclosed herein provide a flexible and composable programming platform that guarantees behavior even in the case of developing high-performance code.
    Type: Application
    Filed: May 29, 2013
    Publication date: December 12, 2013
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Benedict R. Gaster, Lee W. Howes
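
Several abstracts in this listing (for example patent 10467013) describe workitems that yield the processor at a synchronization instruction, with a per-workitem program counter recording where each one resumes. The following Python simulation is only a rough illustration of that idea and is not taken from the patents. The instruction names, the `STREAM` list, and the `run_group` function are all hypothetical, and the round-robin ready queue is an assumed scheduling policy.

```python
from collections import deque

# Hypothetical instruction stream shared by all workitems; "barrier" stands in
# for the synchronization instruction mentioned in the abstract.
STREAM = ["load", "barrier", "add", "barrier", "store"]


def run_group(num_workitems, stream):
    """Run every workitem over the same stream on one simulated processor."""
    pcs = [0] * num_workitems            # one program counter per workitem
    trace = []                           # (workitem id, instruction) in execution order
    ready = deque(range(num_workitems))  # assumed round-robin ready queue
    while ready:
        wi = ready.popleft()
        while pcs[wi] < len(stream):
            instr = stream[pcs[wi]]
            pcs[wi] += 1                 # now points at the next instruction
            trace.append((wi, instr))
            if instr == "barrier":
                ready.append(wi)         # yield; resume later from pcs[wi]
                break
    return trace


if __name__ == "__main__":
    for workitem, instruction in run_group(2, STREAM):
        print(workitem, instruction)
```

In this sketch the first workitem stops at the first `barrier`, its counter already points at `add`, and the second workitem then runs on the same simulated processor before the first one resumes.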