Patents by Inventor Blake Hechtman

Blake Hechtman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9477599
    Abstract: A method, computer program product, and system is described that enforces a release consistency with special accesses sequentially consistent (RCsc) memory model and executes release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read/write combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events since a store event may not need to reach main memory to complete.
    Type: Grant
    Filed: August 7, 2013
    Date of Patent: October 25, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Blake A. Hechtman, Bradford M. Beckmann
  • Patent number: 9436395
    Abstract: Central processing units (CPUs) in computing systems manage graphics processing units (GPUs), network processors, security co-processors, and other data heavy devices as buffered peripherals using device drivers. Unfortunately, as a result of large and latency-sensitive data transfers between CPUs and these external devices, and memory partitioned into kernel-access and user-access spaces, these schemes to manage peripherals may introduce latency and memory use inefficiencies. Proposed are schemes to reduce latency and redundant memory copies using virtual to physical page remapping while maintaining user/kernel level access abstractions.
    Type: Grant
    Filed: March 14, 2014
    Date of Patent: September 6, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Blake A. Hechtman, Shuai Che
  • Patent number: 9411652
    Abstract: Sharing tasks among compute units in a processor can increase the efficiency of the processor. When a compute unit does not have a task in its task memory to perform, donating tasks from other compute units can prevent the compute unit from being idle while there is task in other parts of the processor. It is desirable to share tasks among compute units that are within defined scopes of the processor. Compute units may share tasks by allowing other compute units to access their private memory, or by donating tasks to a shared memory.
    Type: Grant
    Filed: August 22, 2014
    Date of Patent: August 9, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Blake A. Hechtman, Derek R. Hower
  • Patent number: 9396112
    Abstract: A method, computer program product, and system is described that enforces a release consistency with special accesses sequentially consistent (RCsc) memory model and executes release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read-only cache and write-only combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events and reduces overhead in maintaining write-only combining buffers.
    Type: Grant
    Filed: August 26, 2013
    Date of Patent: July 19, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Blake A. Hechtman, Bradford M. Beckmann
  • Patent number: 9361118
    Abstract: A method, computer program product, and system is described that determines the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model. For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.
    Type: Grant
    Filed: May 12, 2014
    Date of Patent: June 7, 2016
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Derek R. Hower, Mark D. Hill, David Wood, Steven K. Reinhardt, Benedict R. Gaster, Blake A. Hechtman, Bradford M. Beckmann
  • Publication number: 20160055033
    Abstract: Sharing tasks among compute units in a processor can increase the efficiency of the processor. When a compute unit does not have a task in its task memory to perform, donating tasks from other compute units can prevent the compute unit from being idle while there is task in other parts of the processor. It is desirable to share tasks among compute units that are within defined scopes of the processor. Compute units may share tasks by allowing other compute units to access their private memory, or by donating tasks to a shared memory.
    Type: Application
    Filed: August 22, 2014
    Publication date: February 25, 2016
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Blake A. Hechtman, Derek R. Hower
  • Publication number: 20150261457
    Abstract: Central processing units (CPUs) in computing systems manage graphics processing units (GPUs), network processors, security co-processors, and other data heavy devices as buffered peripherals using device drivers. Unfortunately, as a result of large and latency-sensitive data transfers between CPUs and these external devices, and memory partitioned into kernel-access and user-access spaces, these schemes to manage peripherals may introduce latency and memory use inefficiencies. Proposed are schemes to reduce latency and redundant memory copies using virtual to physical page remapping while maintaining user/kernel level access abstractions.
    Type: Application
    Filed: March 14, 2014
    Publication date: September 17, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Blake A. Hechtman, Shuai Che
  • Publication number: 20150106587
    Abstract: A processor remaps stored data and the corresponding memory addresses of the data for different processing units of a heterogeneous processor. The processor includes a data remap engine that changes the format of the data (that is, how the data is physically arranged in segments of memory) in response to a transfer of the data from system memory to a local memory hierarchy of an accelerated processing module (APM) of the processor. The APM's local memory hierarchy includes an address remap engine that remaps the memory addresses of the data at the local memory hierarchy so that the data can be accessed by routines at the APM that are unaware of the data remapping. By remapping the data, and the corresponding memory addresses, the APM can perform operations on the data more efficiently.
    Type: Application
    Filed: October 16, 2013
    Publication date: April 16, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shuai Che, Bradford Beckmann, Blake Hechtman
  • Publication number: 20150058567
    Abstract: A method, computer program product, and system is described that enforces a release consistency with special accesses sequentially consistent (RCsc) memory model and executes release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read-only cache and write-only combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events and reduces overhead in maintaining write-only combining buffers.
    Type: Application
    Filed: August 26, 2013
    Publication date: February 26, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Blake A. Hechtman, Bradford M. Beckmann
  • Publication number: 20150046652
    Abstract: A method, computer program product, and system is described that enforces a release consistency with special accesses sequentially consistent (RCsc) memory model and executes release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read/write combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events since a store event may not need to reach main memory to complete.
    Type: Application
    Filed: August 7, 2013
    Publication date: February 12, 2015
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Blake A. Hechtman, Bradford M. Beckmann
  • Publication number: 20140337587
    Abstract: A method, computer program product, and system is described that determines the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model. For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.
    Type: Application
    Filed: May 12, 2014
    Publication date: November 13, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Derek R. Hower, Mark D. Hill, David Wood, Steven K. Reinhardt, Benedict R. Gaster, Blake A. Hechtman, Bradford M. Beckmann