Patents by Inventor Bradford M. Beckmann

Bradford M. Beckmann has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Flexible framework to support memory synchronization operations

Patent number: 10198261

Abstract: A method of performing memory synchronization operations is provided that includes receiving, at a programmable cache controller in communication with one or more caches, an instruction in a first language to perform a memory synchronization operation of synchronizing a plurality of instruction sequences executing on a processor, mapping the received instruction in the first language to one or more selected cache operations in a second language executable by the cache controller and executing the one or more cache operations to perform the memory synchronization operation. The method further comprises receiving a second mapping that provides mapping instructions to map the received instruction to one or more other cache operations, mapping the received instruction to one or more other cache operations and executing the one or more other cache operations to perform the memory synchronization operation.

Type: Grant

Filed: April 11, 2016

Date of Patent: February 5, 2019

Assignee: Advanced Micro Devices, Inc.

Inventors: Shuai Che, Marc S. Orr, Bradford M. Beckmann
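
The abstract for this entry describes a programmable cache controller that maps a synchronization instruction in one language to cache operations in another, and that can later receive a second mapping. The following is a minimal Python sketch of that idea, not the patented implementation. The class name, the instruction names (`release`, `acquire`), and the cache operation names are all hypothetical and chosen only for illustration.

```python
class ProgrammableCacheController:
    """Toy model: lowers sync instructions to cache operations via a table."""

    def __init__(self, mapping):
        # mapping: instruction name -> list of cache operation names (assumed format)
        self.mapping = dict(mapping)
        self.log = []  # every cache operation "executed", in order

    def load_mapping(self, mapping):
        # Receiving a second mapping changes how the same instruction is lowered.
        self.mapping = dict(mapping)

    def execute(self, instruction):
        ops = list(self.mapping[instruction])
        self.log.extend(ops)  # stand-in for actually running the operations
        return ops


# Hypothetical mappings for the same instruction set.
first = {"release": ["flush_l1", "flush_l2"], "acquire": ["invalidate_l1"]}
second = {"release": ["writeback_dirty_l1"], "acquire": ["invalidate_l1"]}

ctrl = ProgrammableCacheController(first)
ops_before = ctrl.execute("release")
ctrl.load_mapping(second)
ops_after = ctrl.execute("release")
```

In this sketch the same `release` instruction produces different cache operations before and after the second mapping is loaded, which is the flexibility the abstract emphasizes.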

MONITOR SUPPORT ON ACCELERATED PROCESSING DEVICE

Publication number: 20190034151

Abstract: A technique for implementing synchronization monitors on an accelerated processing device (“APD”) is provided. Work on an APD includes workgroups that include one or more wavefronts. All wavefronts of a workgroup execute on a single compute unit. A monitor is a synchronization construct that allows workgroups to stall until a particular condition is met. Responsive to all wavefronts of a workgroup executing a wait instruction, the monitor coordinator records the workgroup in an “entry queue.” The workgroup begins saving its state to a general APD memory and, when such saving is complete, the monitor coordinator moves the workgroup to a “condition queue.” When the condition specified by the wait instruction is met, the monitor coordinator moves the workgroup to a “ready queue,” and, when sufficient resources are available on a compute unit, the APD schedules the ready workgroup for execution on a compute unit.

Type: Application

Filed: July 27, 2017

Publication date: January 31, 2019

Applicant: Advanced Micro Devices, Inc.

Inventors: Alexandru Dutu, Bradford M. Beckmann
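
The abstract for this entry walks a workgroup through an entry queue, a condition queue, and a ready queue. The Python sketch below models only that queue movement, under simplifying assumptions: conditions are plain callables over a dictionary standing in for memory, state saving is signaled by an explicit call, and compute unit capacity is a simple slot count. All names are hypothetical.

```python
from collections import deque


class MonitorCoordinator:
    def __init__(self):
        self.entry_queue = deque()      # waiting, state save in progress
        self.condition_queue = deque()  # state saved, condition not yet met
        self.ready_queue = deque()      # condition met, awaiting resources

    def wait(self, workgroup, condition):
        # Called once all wavefronts of the workgroup have executed the wait.
        self.entry_queue.append((workgroup, condition))

    def state_saved(self, workgroup):
        # Move the workgroup to the condition queue once its state is saved.
        for item in list(self.entry_queue):
            if item[0] == workgroup:
                self.entry_queue.remove(item)
                self.condition_queue.append(item)
                return True
        return False

    def check_conditions(self, memory):
        still_waiting = deque()
        for workgroup, condition in self.condition_queue:
            if condition(memory):
                self.ready_queue.append(workgroup)
            else:
                still_waiting.append((workgroup, condition))
        self.condition_queue = still_waiting

    def schedule(self, free_slots):
        # Schedule ready workgroups while compute unit slots remain.
        scheduled = []
        while free_slots > 0 and self.ready_queue:
            scheduled.append(self.ready_queue.popleft())
            free_slots -= 1
        return scheduled


memory = {"flag": 0}
mc = MonitorCoordinator()
mc.wait("wg0", lambda m: m["flag"] == 1)
mc.state_saved("wg0")
mc.check_conditions(memory)
early = mc.schedule(free_slots=1)   # condition not met yet
memory["flag"] = 1
mc.check_conditions(memory)
late = mc.schedule(free_slots=1)    # condition met, slot available
```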

PROCESSOR WITH HOST AND SLAVE OPERATING MODES STACKED WITH MEMORY

Publication number: 20190013051

Abstract: A system, method, and computer program product are provided for a memory device system. One or more memory dies and at least one logic die are disposed in a package and communicatively coupled. The logic die comprises a processing device configurable to manage virtual memory and operate in an operating mode. The operating mode is selected from a set of operating modes comprising a slave operating mode and a host operating mode.

Type: Application

Filed: September 12, 2018

Publication date: January 10, 2019

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Nuwan S. Jayasena, Gabriel H. Loh, Bradford M. Beckmann, James M. O'Connor, Lisa R. Hsu

Processor with host and slave operating modes stacked with memory

Patent number: 10079044

Abstract: A system, method, and computer program product are provided for a memory device system. One or more memory dies and at least one logic die are disposed in a package and communicatively coupled. The logic die comprises a processing device configurable to manage virtual memory and operate in an operating mode. The operating mode is selected from a set of operating modes comprising a slave operating mode and a host operating mode.

Type: Grant

Filed: December 20, 2012

Date of Patent: September 18, 2018

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Nuwan S. Jayasena, Gabriel H. Loh, Bradford M. Beckmann, James M. O'Connor, Lisa R. Hsu

Managing cache coherence using information in a page table

Patent number: 10019377

Abstract: The described embodiments include a computing device with two or more types of processors and a memory that is shared between the two or more types of processors. The computing device performs operations for handling cache coherency between the two or more types of processors. During operation, the computing device sets a cache coherency indicator in metadata in a page table entry in a page table, the page table entry including information about a page of data that is stored in the memory. The computing device then uses the cache coherency indicator to determine operations to be performed when accessing data in the page of data in the memory. For example, the computing device can use the coherency indicator to determine whether a coherency operation is to be performed when a processor of a given type accesses data in the page of data in the memory.

Type: Grant

Filed: May 23, 2016

Date of Patent: July 10, 2018

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Arkaprava Basu, Bradford M. Beckmann, Shuai Che, Sooraj Puthoor
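
The abstract for this entry describes a per-page coherency indicator stored in page table entry metadata and consulted on access. The Python sketch below illustrates one way such a check could look. The single boolean indicator, the processor type strings, and the rule that only GPU accesses to coherent pages trigger a probe are assumptions made for the example, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    frame: int
    coherent: bool  # hypothetical one-bit cache coherency indicator


def access(page_table, page, processor_type):
    """Return the list of operations performed for one access (toy model)."""
    entry = page_table[page]
    ops = []
    # Assumed policy: a GPU access to a page marked coherent probes CPU caches.
    if entry.coherent and processor_type == "gpu":
        ops.append("probe_cpu_caches")
    ops.append(f"read_frame_{entry.frame}")
    return ops


page_table = {
    0: PageTableEntry(frame=7, coherent=True),
    1: PageTableEntry(frame=9, coherent=False),
}

gpu_coherent = access(page_table, 0, "gpu")
gpu_noncoherent = access(page_table, 1, "gpu")
cpu_coherent = access(page_table, 0, "cpu")
```

Pages marked non-coherent skip the probe entirely, which is where a scheme of this kind would save coherence traffic.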

Dynamic wavefront creation for processing units using a hybrid compactor

Patent number: 9898287

Abstract: A method, a non-transitory computer readable medium, and a processor for repacking dynamic wavefronts during program code execution on a processing unit, each dynamic wavefront including multiple threads, are presented. If a branch instruction is detected, a determination is made whether all wavefronts following a same control path in the program code have reached a compaction point, which is the branch instruction. If no branch instruction is detected in executing the program code, a determination is made whether all wavefronts following the same control path have reached a reconvergence point, which is a beginning of a program code segment to be executed by both a taken branch and a not taken branch from a previous branch instruction. The dynamic wavefronts are repacked with all threads that follow the same control path, if all wavefronts following the same control path have reached the branch instruction or the reconvergence point.

Type: Grant

Filed: April 9, 2015

Date of Patent: February 20, 2018

Assignee: Advanced Micro Devices, Inc.

Inventors: Sooraj Puthoor, Bradford M. Beckmann, Dmitri Yudanov
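
The abstract for this entry describes repacking threads into new wavefronts once every wavefront on a control path has reached a compaction or reconvergence point. The Python sketch below shows only the grouping step, under stated assumptions: each thread is a `(thread_id, path)` pair, arrival at the compaction point is given as one boolean per wavefront, and the function name and data layout are hypothetical.

```python
from collections import defaultdict


def repack_if_ready(wavefronts, arrived, width):
    """Group threads by control path once all wavefronts have arrived.

    wavefronts: list of wavefronts, each a list of (thread_id, path) pairs
    arrived:    one bool per wavefront, True if it reached the compaction point
    width:      maximum number of threads per repacked wavefront
    """
    if not all(arrived):
        return None  # keep waiting; repacking now would split a control path

    by_path = defaultdict(list)
    for wavefront in wavefronts:
        for thread_id, path in wavefront:
            by_path[path].append(thread_id)

    repacked = []
    for path in sorted(by_path):
        threads = by_path[path]
        for start in range(0, len(threads), width):
            repacked.append((path, threads[start:start + width]))
    return repacked


wavefronts = [
    [(0, "taken"), (1, "not_taken"), (2, "taken"), (3, "not_taken")],
    [(4, "not_taken"), (5, "taken"), (6, "taken"), (7, "not_taken")],
]
pending = repack_if_ready(wavefronts, [True, False], width=4)
result = repack_if_ready(wavefronts, [True, True], width=4)
```

Here two half-divergent wavefronts become two fully populated ones, one per control path.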

Managing Cache Coherence Using Information in a Page Table

Publication number: 20170337136

Abstract: The described embodiments include a computing device with two or more types of processors and a memory that is shared between the two or more types of processors. The computing device performs operations for handling cache coherency between the two or more types of processors. During operation, the computing device sets a cache coherency indicator in metadata in a page table entry in a page table, the page table entry including information about a page of data that is stored in the memory. The computing device then uses the cache coherency indicator to determine operations to be performed when accessing data in the page of data in the memory. For example, the computing device can use the coherency indicator to determine whether a coherency operation is to be performed when a processor of a given type accesses data in the page of data in the memory.

Type: Application

Filed: May 23, 2016

Publication date: November 23, 2017

Inventors: Arkaprava Basu, Bradford M. Beckmann, Shuai Che, Sooraj Puthoor

Remote scoped synchronization for work stealing and sharing

Patent number: 9804883

Abstract: Described herein is an apparatus and method for remote scoped synchronization, which is a new semantic that allows a work-item to order memory accesses with a scope instance outside of its scope hierarchy. More precisely, remote synchronization expands visibility at a particular scope to all scope-instances encompassed by that scope. Remote scoped synchronization operation allows smaller scopes to be used more frequently and defers added cost to only when larger scoped synchronization is required. This enables programmers to optimize the scope that memory operations are performed at for important communication patterns like work stealing. Executing memory operations at the optimum scope reduces both execution time and energy. In particular, remote synchronization allows a work-item to communicate with a scope that it otherwise would not be able to access. Specifically, work-items can pull valid data from and push updates to scopes that do not (hierarchically) contain them.

Type: Grant

Filed: November 14, 2014

Date of Patent: October 31, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Marc S. Orr, Bradford M. Beckmann, Ayse Yilmazer, Shuai Che, David A. Wood, Mark D. Hill

FLEXIBLE FRAMEWORK TO SUPPORT MEMORY SYNCHRONIZATION OPERATIONS

Publication number: 20170293487

Abstract: A method of performing memory synchronization operations is provided that includes receiving, at a programmable cache controller in communication with one or more caches, an instruction in a first language to perform a memory synchronization operation of synchronizing a plurality of instruction sequences executing on a processor, mapping the received instruction in the first language to one or more selected cache operations in a second language executable by the cache controller and executing the one or more cache operations to perform the memory synchronization operation. The method further comprises receiving a second mapping that provides mapping instructions to map the received instruction to one or more other cache operations, mapping the received instruction to one or more other cache operations and executing the one or more other cache operations to perform the memory synchronization operation.

Type: Application

Filed: April 11, 2016

Publication date: October 12, 2017

Applicant: Advanced Micro Devices, Inc.

Inventors: Shuai Che, Marc S. Orr, Bradford M. Beckmann

Selecting a resource from a set of resources for performing an operation

Patent number: 9766936

Abstract: The described embodiments comprise a selection mechanism that selects a resource from a set of resources in a computing device for performing an operation. In some embodiments, the selection mechanism performs a lookup in a table selected from a set of tables to identify a resource from the set of resources. When the resource is not available for performing the operation and until another resource is selected for performing the operation, the selection mechanism identifies a next resource in the table and selects the next resource for performing the operation when the next resource is available for performing the operation.

Type: Grant

Filed: November 6, 2015

Date of Patent: September 19, 2017

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Bradford M. Beckmann, Mithuna S. Thottethodi, James M. O'Connor, Mauricio Breternitz, Lisa R. Hsu, Gabriel H. Loh, Yasuko Eckert
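
The abstract for this entry describes looking up a resource in a table chosen from a set of tables and stepping to the next table entry while the current one is unavailable. The Python sketch below is one possible reading. Indexing the table by `key % len(table)`, wrapping around at the end, and giving up after one full pass are assumptions added for the example, and all names are hypothetical.

```python
def select_resource(tables, table_index, key, is_available):
    """Pick the first available resource, starting from the looked-up entry."""
    table = tables[table_index]
    start = key % len(table)  # assumed lookup function
    for offset in range(len(table)):
        candidate = table[(start + offset) % len(table)]
        if is_available(candidate):
            return candidate
    return None  # every resource in the table is busy


# Two tables listing the same resources in different orders.
tables = [
    ["bank0", "bank1", "bank2"],
    ["bank2", "bank0", "bank1"],
]
busy = {"bank1"}


def free(resource):
    return resource not in busy


first_choice = select_resource(tables, 0, 4, free)   # bank1 is busy, so move on
second_choice = select_resource(tables, 1, 4, free)  # lookup hits a free entry
none_free = select_resource(tables, 0, 4, lambda r: False)
```

Using differently ordered tables means that requests falling back from the same busy resource do not all land on the same next choice.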

Stacked memory device with metadata management

Patent number: 9697147

Abstract: A processing system comprises one or more processor devices and other system components coupled to a stacked memory device having a set of stacked memory layers and a set of one or more logic layers. The set of logic layers implements a metadata manager that offloads metadata management from the other system components. The set of logic layers also includes a memory interface coupled to memory cell circuitry implemented in the set of stacked memory layers and coupleable to the devices external to the stacked memory device. The memory interface operates to perform memory accesses for the external devices and for the metadata manager. By virtue of the metadata manager's tight integration with the stacked memory layers, the metadata manager may perform certain memory-intensive metadata management operations more efficiently than could be performed by the external devices.

Type: Grant

Filed: August 6, 2012

Date of Patent: July 4, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Gabriel H. Loh, James M. O'Connor, Bradford M. Beckmann, Michael Ignatowski

Moving data between caches in a heterogeneous processor system

Patent number: 9652390

Abstract: Apparatus, computer readable medium, integrated circuit, and method of moving a plurality of data items to a first cache or a second cache are presented. The method includes receiving an indication that the first cache requested the plurality of data items. The method includes storing information indicating that the first cache requested the plurality of data items. The information may include an address for each of the plurality of data items. The method includes determining based at least on the stored information to move the plurality of data items to the second cache. The method includes moving the plurality of data items to the second cache. The method may include determining a time interval between receiving the indication that the first cache requested the plurality of data items and moving the plurality of data items to the second cache. A scratch pad memory is disclosed.

Type: Grant

Filed: August 5, 2014

Date of Patent: May 16, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: JunLi Gu, Bradford M. Beckmann, Yuan Xie

SYSTEM PERFORMANCE MANAGEMENT USING PRIORITIZED COMPUTE UNITS

Publication number: 20170004080

Abstract: Methods, devices, and systems for managing performance of a processor having multiple compute units. An effective number of the multiple compute units may be determined to designate as having priority. On a condition that the effective number is nonzero, the effective number of the multiple compute units may each be designated as a priority compute unit. Priority compute units may have access to a shared cache whereas non-priority compute units may not. Workgroups may be preferentially dispatched to priority compute units. Memory access requests from priority compute units may be served ahead of requests from non-priority compute units.

Type: Application

Filed: June 30, 2015

Publication date: January 5, 2017

Applicant: ADVANCED MICRO DEVICES, INC.

Inventors: Zhe Wang, Sooraj Puthoor, Bradford M. Beckmann

CONDITIONAL ATOMIC OPERATIONS AT A PROCESSOR

Publication number: 20160357551

Abstract: A conditional fetch-and-phi operation tests a memory location to determine if the memory location stores a specified value and, if so, modifies the value at the memory location. The conditional fetch-and-phi operation can be implemented so that it can be concurrently executed by a plurality of concurrently executing threads, such as the threads of a wavefront at a GPU. To execute the conditional fetch-and-phi operation, one of the concurrently executing threads is selected to execute a compare-and-swap (CAS) operation at the memory location, while the other threads await the results. The CAS operation tests the value at the memory location and, if the CAS operation is successful, the value is passed to each of the concurrently executing threads.

Type: Application

Filed: June 2, 2015

Publication date: December 8, 2016

Inventors: David A. Wood, Steven K. Reinhardt, Bradford M. Beckmann, Marc S. Orr
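
The abstract for this entry describes one thread of a wavefront performing a compare-and-swap on behalf of the others and passing the result to all of them. The Python sketch below simulates that sequentially. It is not concurrent code: the "wavefront" is just a thread count, memory is a dictionary, and returning `None` to every thread on failure is an assumption, as are all function names.

```python
def compare_and_swap(memory, addr, expected, new):
    """Sequential stand-in for an atomic CAS; returns (success, old_value)."""
    old = memory[addr]
    if old == expected:
        memory[addr] = new
        return True, old
    return False, old


def wavefront_conditional_fetch_and_phi(memory, addr, expected, phi, num_threads):
    # A single selected thread performs the CAS; the others await the result.
    success, old = compare_and_swap(memory, addr, expected, phi(expected))
    # Broadcast: on success every thread receives the fetched value.
    result = old if success else None
    return [result] * num_threads


memory = {0x10: 5}
first = wavefront_conditional_fetch_and_phi(memory, 0x10, 5, lambda v: v + 1, 4)
value_after_first = memory[0x10]
second = wavefront_conditional_fetch_and_phi(memory, 0x10, 5, lambda v: v + 1, 4)
value_after_second = memory[0x10]
```

Only one memory operation is issued per wavefront in this model, rather than one per thread.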

MESSAGE AGGREGATION, COMBINING AND COMPRESSION FOR EFFICIENT DATA COMMUNICATIONS IN GPU-BASED CLUSTERS

Publication number: 20160352598

Abstract: A system and method for efficient management of network traffic for highly data parallel computing. A processing node includes one or more processors capable of generating network messages. A network interface is used to receive and send network messages across a network. The processing node reduces at least one of a number or a storage size of the original network messages into one or more new network messages. The new network messages are sent to the network interface to send across the network.

Type: Application

Filed: May 26, 2016

Publication date: December 1, 2016

Inventors: Steven K. Reinhardt, Marc S. Orr, Bradford M. Beckmann, Shuai Che, David A. Wood
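
The abstract for this entry describes reducing the number or storage size of outgoing network messages before they reach the network interface. The Python sketch below covers only the "number" half, by aggregating messages that share a destination. The `(destination, payload)` message format and the function name are assumptions, and combining or compressing payloads is not shown.

```python
from collections import defaultdict


def aggregate(messages):
    """Merge messages per destination into one message carrying all payloads."""
    by_dest = defaultdict(list)
    for dest, payload in messages:
        by_dest[dest].append(payload)
    # One outgoing message per destination, payload order preserved.
    return [(dest, payloads) for dest, payloads in by_dest.items()]


original = [
    ("node1", b"a"),
    ("node2", b"b"),
    ("node1", b"c"),
    ("node1", b"d"),
]
combined = aggregate(original)
```

Four small messages become two, so per-message header and injection overhead is paid half as often in this example.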

Write combining cache microarchitecture for synchronization events

Patent number: 9477599

Abstract: A method, computer program product, and system are described that enforce a release consistency with special accesses sequentially consistent (RCsc) memory model and execute release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read/write combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events since a store event may not need to reach main memory to complete.

Type: Grant

Filed: August 7, 2013

Date of Patent: October 25, 2016

Assignee: Advanced Micro Devices, Inc.

Inventors: Blake A. Hechtman, Bradford M. Beckmann

DYNAMIC WAVEFRONT CREATION FOR PROCESSING UNITS USING A HYBRID COMPACTOR

Publication number: 20160239302

Abstract: A method, a non-transitory computer readable medium, and a processor for repacking dynamic wavefronts during program code execution on a processing unit, each dynamic wavefront including multiple threads, are presented. If a branch instruction is detected, a determination is made whether all wavefronts following a same control path in the program code have reached a compaction point, which is the branch instruction. If no branch instruction is detected in executing the program code, a determination is made whether all wavefronts following the same control path have reached a reconvergence point, which is a beginning of a program code segment to be executed by both a taken branch and a not taken branch from a previous branch instruction. The dynamic wavefronts are repacked with all threads that follow the same control path, if all wavefronts following the same control path have reached the branch instruction or the reconvergence point.

Type: Application

Filed: April 9, 2015

Publication date: August 18, 2016

Applicant: Advanced Micro Devices, Inc.

Inventors: Sooraj Puthoor, Bradford M. Beckmann, Dmitri Yudanov

Conditional notification mechanism

Patent number: 9411663

Abstract: The described embodiments comprise a first hardware context. The first hardware context receives, from a second hardware context, an indication of a memory location and a condition to be met by the memory location. The first hardware context then sends a signal to the second hardware context when the memory location meets the condition.

Type: Grant

Filed: March 1, 2013

Date of Patent: August 9, 2016

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Steven K. Reinhardt, Marc S. Orr, Bradford M. Beckmann
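
The abstract for this entry describes one hardware context registering a memory location and a condition with another, then being signaled when the condition holds. The Python sketch below models that handshake in software. Checking conditions at write time, removing a watch once it fires, and the class and method names are all assumptions made for illustration.

```python
class RequesterContext:
    """Stands in for the second hardware context, which receives signals."""

    def __init__(self):
        self.signals = []


class NotifierContext:
    """Stands in for the first hardware context, which monitors memory."""

    def __init__(self, memory):
        self.memory = memory
        self.watches = []  # (requester, address, condition) triples

    def register(self, requester, addr, condition):
        self.watches.append((requester, addr, condition))

    def write(self, addr, value):
        self.memory[addr] = value
        remaining = []
        for requester, watched, condition in self.watches:
            if watched == addr and condition(value):
                requester.signals.append(addr)  # send the signal once
            else:
                remaining.append((requester, watched, condition))
        self.watches = remaining


memory = {0x40: 0}
notifier = NotifierContext(memory)
waiter = RequesterContext()
notifier.register(waiter, 0x40, lambda v: v >= 3)

notifier.write(0x40, 1)
signals_early = list(waiter.signals)  # condition not met yet
notifier.write(0x40, 3)
signals_late = list(waiter.signals)   # condition met, signal delivered
```

The waiting context never polls the location itself in this model; it only reacts to the signal.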

Hierarchical write-combining cache coherence

Patent number: 9396112

Abstract: A method, computer program product, and system are described that enforce a release consistency with special accesses sequentially consistent (RCsc) memory model and execute release synchronization instructions such as a StRel event without tracking an outstanding store event through a memory hierarchy, while efficiently using bandwidth resources. What is also described is the decoupling of a store event from an ordering of the store event with respect to a RCsc memory model. The description also includes a set of hierarchical read-only cache and write-only combining buffers that coalesce stores from different parts of the system. In addition, a pool component maintains partial order of received store events and release synchronization events to avoid content addressable memory (CAM) structures, full cache flushes, as well as direct write-throughs to memory. The approach improves the performance of both global and local synchronization events and reduces overhead in maintaining write-only combining buffers.

Type: Grant

Filed: August 26, 2013

Date of Patent: July 19, 2016

Assignee: Advanced Micro Devices, Inc.

Inventors: Blake A. Hechtman, Bradford M. Beckmann

Method for memory consistency among heterogeneous computer components

Patent number: 9361118

Abstract: A method, computer program product, and system are described that determine the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model. For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.

Type: Grant

Filed: May 12, 2014

Date of Patent: June 7, 2016

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Derek R. Hower, Mark D. Hill, David Wood, Steven K. Reinhardt, Benedict R. Gaster, Blake A. Hechtman, Bradford M. Beckmann
