Patents by Inventor Blake A. Hechtman
Blake A. Hechtman has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 9477599Abstract: A method, computer program product, and system is described that enforces a release consistency with special accesses sequentially consistent (RCsc) memory model and executes release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read/write combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events since a store event may not need to reach main memory to complete.Type: GrantFiled: August 7, 2013Date of Patent: October 25, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Blake A. Hechtman, Bradford M. Beckmann
-
Patent number: 9436395Abstract: Central processing units (CPUs) in computing systems manage graphics processing units (GPUs), network processors, security co-processors, and other data heavy devices as buffered peripherals using device drivers. Unfortunately, as a result of large and latency-sensitive data transfers between CPUs and these external devices, and memory partitioned into kernel-access and user-access spaces, these schemes to manage peripherals may introduce latency and memory use inefficiencies. Proposed are schemes to reduce latency and redundant memory copies using virtual to physical page remapping while maintaining user/kernel level access abstractions.Type: GrantFiled: March 14, 2014Date of Patent: September 6, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Blake A. Hechtman, Shuai Che
-
Patent number: 9411652Abstract: Sharing tasks among compute units in a processor can increase the efficiency of the processor. When a compute unit does not have a task in its task memory to perform, donating tasks from other compute units can prevent the compute unit from being idle while there is task in other parts of the processor. It is desirable to share tasks among compute units that are within defined scopes of the processor. Compute units may share tasks by allowing other compute units to access their private memory, or by donating tasks to a shared memory.Type: GrantFiled: August 22, 2014Date of Patent: August 9, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Blake A. Hechtman, Derek R. Hower
-
Patent number: 9396112Abstract: A method, computer program product, and system is described that enforces a release consistency with special accesses sequentially consistent (RCsc) memory model and executes release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read-only cache and write-only combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events and reduces overhead in maintaining write-only combining buffers.Type: GrantFiled: August 26, 2013Date of Patent: July 19, 2016Assignee: Advanced Micro Devices, Inc.Inventors: Blake A. Hechtman, Bradford M. Beckmann
-
Patent number: 9361118Abstract: A method, computer program product, and system is described that determines the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model. For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.Type: GrantFiled: May 12, 2014Date of Patent: June 7, 2016Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Derek R. Hower, Mark D. Hill, David Wood, Steven K. Reinhardt, Benedict R. Gaster, Blake A. Hechtman, Bradford M. Beckmann
-
Publication number: 20160055033Abstract: Sharing tasks among compute units in a processor can increase the efficiency of the processor. When a compute unit does not have a task in its task memory to perform, donating tasks from other compute units can prevent the compute unit from being idle while there is task in other parts of the processor. It is desirable to share tasks among compute units that are within defined scopes of the processor. Compute units may share tasks by allowing other compute units to access their private memory, or by donating tasks to a shared memory.Type: ApplicationFiled: August 22, 2014Publication date: February 25, 2016Applicant: Advanced Micro Devices, Inc.Inventors: Blake A. HECHTMAN, Derek R. Hower
-
Publication number: 20150261457Abstract: Central processing units (CPUs) in computing systems manage graphics processing units (GPUs), network processors, security co-processors, and other data heavy devices as buffered peripherals using device drivers. Unfortunately, as a result of large and latency-sensitive data transfers between CPUs and these external devices, and memory partitioned into kernel-access and user-access spaces, these schemes to manage peripherals may introduce latency and memory use inefficiencies. Proposed are schemes to reduce latency and redundant memory copies using virtual to physical page remapping while maintaining user/kernel level access abstractions.Type: ApplicationFiled: March 14, 2014Publication date: September 17, 2015Applicant: Advanced Micro Devices, Inc.Inventors: Blake A. Hechtman, Shuai Che
-
Publication number: 20150106587Abstract: A processor remaps stored data and the corresponding memory addresses of the data for different processing units of a heterogeneous processor. The processor includes a data remap engine that changes the format of the data (that is, how the data is physically arranged in segments of memory) in response to a transfer of the data from system memory to a local memory hierarchy of an accelerated processing module (APM) of the processor. The APM's local memory hierarchy includes an address remap engine that remaps the memory addresses of the data at the local memory hierarchy so that the data can be accessed by routines at the APM that are unaware of the data remapping. By remapping the data, and the corresponding memory addresses, the APM can perform operations on the data more efficiently.Type: ApplicationFiled: October 16, 2013Publication date: April 16, 2015Applicant: Advanced Micro Devices, Inc.Inventors: Shuai Che, Bradford Beckmann, Blake Hechtman
-
Publication number: 20150058567Abstract: A method, computer program product, and system is described that enforces a release consistency with special accesses sequentially consistent (RCsc) memory model and executes release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read-only cache and write-only combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events and reduces overhead in maintaining write-only combining buffers.Type: ApplicationFiled: August 26, 2013Publication date: February 26, 2015Applicant: Advanced Micro Devices, Inc.Inventors: Blake A. Hechtman, Bradford M. Beckmann
-
Publication number: 20150046652Abstract: A method, computer program product, and system is described that enforces a release consistency with special accesses sequentially consistent (RCsc) memory model and executes release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read/write combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events since a store event may not need to reach main memory to complete.Type: ApplicationFiled: August 7, 2013Publication date: February 12, 2015Applicant: Advanced Micro Devices, Inc.Inventors: Blake A. HECHTMAN, Bradford M. Beckmann
-
Publication number: 20140337587Abstract: A method, computer program product, and system is described that determines the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model . For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.Type: ApplicationFiled: May 12, 2014Publication date: November 13, 2014Applicant: Advanced Micro Devices, Inc.Inventors: Derek R. HOWER, Mark D. Hill, David Wood, Steven K. Reinhardt, Benedict R. Gaster, Blake A. Hechtman, Bradford M. Beckmann