Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 11513973
Abstract: A processor in a system is responsive to a coherent memory request buffer having a plurality of entries to store coherent memory requests from a client module and a non-coherent memory request buffer having a plurality of entries to store non-coherent memory requests from the client module. The client module buffers coherent and non-coherent memory requests and releases the memory requests based on one or more conditions of the processor or one of its caches. The memory requests are released to a central data fabric and into the system based on a first watermark associated with the coherent memory buffer and a second watermark associated with the non-coherent memory buffer.
Type: Grant
Filed: December 20, 2019
Date of Patent: November 29, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Sonu Arora, Benjamin Tsien, Alexander J. Branover
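The watermark-based release described in the abstract above can be sketched roughly as follows. This is only an illustration, not the patented design: the class name `WatermarkBuffer`, the rule "drain everything once the entry count reaches the watermark", and the coherent-first ordering are all assumptions, and the processor or cache conditions mentioned in the abstract are left out.

```python
from collections import deque


class WatermarkBuffer:
    """Holds requests until the number buffered reaches a watermark (assumed rule)."""

    def __init__(self, watermark):
        self.watermark = watermark  # hypothetical threshold, counted in entries
        self.entries = deque()

    def push(self, request):
        self.entries.append(request)

    def drain_if_ready(self):
        """Return all buffered requests once the watermark is reached, else nothing."""
        if len(self.entries) < self.watermark:
            return []
        released = list(self.entries)
        self.entries.clear()
        return released


def release_to_fabric(coherent, non_coherent):
    """Collect whatever each buffer is ready to release (coherent first, by assumption)."""
    return coherent.drain_if_ready() + non_coherent.drain_if_ready()


coherent = WatermarkBuffer(watermark=2)
non_coherent = WatermarkBuffer(watermark=3)
coherent.push("c0")
non_coherent.push("n0")
first = release_to_fabric(coherent, non_coherent)   # neither watermark reached
coherent.push("c1")
second = release_to_fabric(coherent, non_coherent)  # only the coherent buffer drains
```

Keeping a separate watermark per buffer lets the two request classes be batched independently, which is the point the abstract makes about a first and a second watermark.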
-
Patent number: 11513689
Abstract: The present application describes embodiments of an interface for coupling flash memory and dynamic random access memory (DRAM) in a processing system. Some embodiments include a dedicated interface between a flash memory and DRAM. The dedicated interface is to provide access to the flash memory in response to instructions received over a DRAM interface between the DRAM and a processing device. Some embodiments of a method include accessing a flash memory via a dedicated interface between the flash memory and a dynamic random access memory (DRAM) in response to an instruction received over a DRAM interface between the DRAM and a processing device.
Type: Grant
Filed: May 24, 2021
Date of Patent: November 29, 2022
Assignee: Advanced Micro Devices, Inc.
Inventor: James Bauman
-
Patent number: 11514194
Abstract: Devices, methods, and systems for secure communications on a computing device. A host operating system (OS) runs on a host processor in communication with a host memory. A secure OS runs on a coprocessor in communication with a secure memory. The coprocessor receives information from an external device over a secure peer-to-peer (P2P) connection. The secure P2P connection is managed by the secure OS and is not accessible by the host OS.
Type: Grant
Filed: December 19, 2019
Date of Patent: November 29, 2022
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Guhan Krishnan, Carl K. Wakeland, Saikishore Reddipalli, Philip Ng
-
Patent number: 11507641
Abstract: Techniques for performing in-memory matrix multiplication, taking into account temperature variations in the memory, are disclosed. In one example, the matrix multiplication memory uses ohmic multiplication and current summing to perform the dot products involved in matrix multiplication. One downside to this analog form of multiplication is that temperature affects the accuracy of the results. Thus techniques are provided herein to compensate for the effects of temperature increases on the accuracy of in-memory matrix multiplications. According to the techniques, portions of input matrices are classified as effective or ineffective. Effective portions are mapped to low temperature regions of the in-memory matrix multiplier and ineffective portions are mapped to high temperature regions of the in-memory matrix multiplier. The matrix multiplication is then performed.
Type: Grant
Filed: May 31, 2019
Date of Patent: November 22, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Majed Valad Beigi, Amin Farmahini-Farahani, Sudhanva Gurumurthi
-
Patent number: 11507522
Abstract: Systems, apparatuses, and methods for implementing memory request priority assignment techniques for parallel processors are disclosed. A system includes at least a parallel processor coupled to a memory subsystem, where the parallel processor includes at least a plurality of compute units for executing wavefronts in lock-step. The parallel processor assigns priorities to memory requests of wavefronts on a per-work-item basis by indexing into a first priority vector, with the index generated based on lane-specific information. If a given event is detected, a second priority vector is generated by applying a given priority promotion vector to the first priority vector. Then, for subsequent wavefronts, memory requests are assigned priorities by indexing into the second priority vector with lane-specific information. The use of priority vectors to assign priorities to memory requests helps to reduce the memory divergence problem experienced by different work-items of a wavefront.
Type: Grant
Filed: December 6, 2019
Date of Patent: November 22, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Sooraj Puthoor, Kishore Punniyamurthy, Onur Kayiran, Xianwei Zhang, Yasuko Eckert, Johnathan Alsop, Bradford Michael Beckmann
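A small sketch may help picture the priority-vector lookup in the abstract above. The abstract does not say how the promotion vector is "applied" or how the index is derived from lane-specific information, so the choices here are assumptions made purely for illustration: promotion is element-wise addition capped at a hypothetical `max_priority`, and the index is the lane id modulo the vector length.

```python
def promote(priority_vector, promotion_vector, max_priority=3):
    """Build a second priority vector by adding per-entry promotions (assumed rule)."""
    return [min(p + d, max_priority) for p, d in zip(priority_vector, promotion_vector)]


def assign_priorities(priority_vector, lane_ids):
    """Index the vector with lane-specific information (here: lane id modulo length)."""
    return [priority_vector[lane % len(priority_vector)] for lane in lane_ids]


first_vector = [0, 1, 2, 3]
promotion_vector = [2, 1, 0, 0]   # hypothetical promotion triggered by some event
lanes = list(range(8))            # a toy 8-lane wavefront

before = assign_priorities(first_vector, lanes)
second_vector = promote(first_vector, promotion_vector)
after = assign_priorities(second_vector, lanes)   # used for subsequent wavefronts
```

With these toy values, the promoted vector narrows the spread of priorities across lanes, which is one way to read the abstract's claim about reducing memory divergence within a wavefront.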
-
Patent number: 11507380
Abstract: A processing system includes a processor with a branch predictor including one or more branch target buffer tables. The processor also includes a branch prediction pipeline including a throttle unit and an uncertainty accumulator. The processor assigns an uncertainty value for each of a plurality of branch predictions generated by the branch predictor and adds the uncertainty value for each of the plurality of branch predictions to an accumulated uncertainty counter associated with the uncertainty accumulator. The throttle unit of the branch prediction pipeline throttles operations of the branch prediction pipeline based on the accumulated uncertainty counter.
Type: Grant
Filed: August 29, 2018
Date of Patent: November 22, 2022
Assignee: Advanced Micro Devices, Inc.
Inventor: Thomas Clouqueur
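The accumulate-then-throttle idea in the abstract above can be modeled in a few lines. This is a toy software model under stated assumptions, not the hardware design: the class name `UncertaintyThrottle`, the fixed `threshold`, and the `reset` step are all hypothetical, since the abstract only says that throttling is "based on" the accumulated counter.

```python
class UncertaintyThrottle:
    """Toy model: sum per-prediction uncertainty and throttle above a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical throttle trigger level
        self.accumulated = 0

    def record_prediction(self, uncertainty):
        """Add one branch prediction's uncertainty value to the running counter."""
        self.accumulated += uncertainty

    def should_throttle(self):
        return self.accumulated >= self.threshold

    def reset(self):
        """Assumed behavior: clear the counter, e.g. once in-flight branches resolve."""
        self.accumulated = 0


throttle = UncertaintyThrottle(threshold=10)
for value in [1, 4, 2]:            # three fairly confident predictions
    throttle.record_prediction(value)
early = throttle.should_throttle()

throttle.record_prediction(5)      # one more uncertain prediction tips the counter
late = throttle.should_throttle()
```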
-
Patent number: 11508124
Abstract: A processing system includes hull shader circuitry that launches thread groups including one or more primitives. The hull shader circuitry also generates tessellation factors that indicate subdivisions of the primitives. The processing system also includes throttling circuitry that estimates a primitive launch time interval for the domain shader based on the tessellation factors and selectively throttles launching of the thread groups from the hull shader circuitry based on the primitive launch time interval of the domain shader and a hull shader latency. In some cases, the throttling circuitry includes a first counter that is incremented in response to launching a thread group from the buffer and a second counter that modifies the first counter based on a measured latency of the domain shader.
Type: Grant
Filed: December 15, 2020
Date of Patent: November 22, 2022
Assignee: Advanced Micro Devices, Inc.
Inventor: Nishank Pathak
-
Patent number: 11509333
Abstract: Systems, apparatuses, and methods for implementing masked fault detection for reliable low voltage cache operation are disclosed. A processor includes a cache that can operate at a relatively low voltage level to conserve power. However, at low voltage levels, the cache is more likely to suffer from bit errors. To mitigate the bit errors occurring in cache lines at low voltage levels, the cache employs a strategy to uncover masked faults during runtime accesses to data by actual software applications. For example, on the first read of a given cache line, the data of the given cache line is inverted and written back to the same data array entry. Also, the error correction bits are regenerated for the inverted data. On a second read of the given cache line, if the fault population of the given cache line changes, then the given cache line's error protection level is updated.
Type: Grant
Filed: December 17, 2020
Date of Patent: November 22, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Shrikanth Ganapathy, John Kalamatianos
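Why inverting the data uncovers masked faults can be seen with a simple stuck-at model. The sketch below is an illustrative simplification, not the patented mechanism: it assumes an 8-bit entry, models faults as bits stuck at 0 or 1, and finds faults by comparing the stored value with the written value, whereas the abstract relies on regenerated error correction bits.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def read_back(written, stuck_at):
    """Return what a faulty entry holds after `written` is stored.

    `stuck_at` maps bit position -> the value (0 or 1) that bit is stuck at.
    """
    value = written
    for bit, stuck in stuck_at.items():
        if stuck:
            value |= 1 << bit
        else:
            value &= ~(1 << bit)
    return value & MASK


def visible_faults(written, stuck_at):
    """Bit positions where the stored value differs from what was written."""
    diff = written ^ read_back(written, stuck_at)
    return {bit for bit in range(WIDTH) if (diff >> bit) & 1}


stuck_at = {0: 1, 5: 0}            # hypothetical faults: bit 0 stuck at 1, bit 5 stuck at 0
data = 0b00000001                  # happens to agree with both stuck bits

first_read = visible_faults(data, stuck_at)       # faults are masked by the data
inverted = ~data & MASK
second_read = visible_faults(inverted, stuck_at)  # inversion exposes both faults
```

In this model the fault population grows from zero to two bits between the two reads, which corresponds to the case in which the abstract says the line's error protection level is updated.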
-
Patent number: 11507158
Abstract: Electrical design current throttling, including: applying an electrical design current (EDC) threshold for each central processing unit component of a plurality of the central processing unit components responsive to the corresponding priority of each central processing unit component, the priority of a central processing unit component responsive to a central processing unit component's current usage data.
Type: Grant
Filed: May 12, 2020
Date of Patent: November 22, 2022
Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC
Inventors: Xiuting Kaleen Cheng Man, Erik Swanson, Larry D. Hewitt, Adam N. C. Clark
-
Patent number: 11507519
Abstract: A processing system selectively compresses cache lines at a cache or at a memory or encrypts cache lines at the memory based on evictions of entries mapping virtual-to-physical address translations from a translation lookaside buffer (TLB). Upon eviction of a TLB entry, the processing system identifies cache lines corresponding to the physical addresses of the evicted TLB entry and selectively compresses the cache lines to increase the effective storage capacity of the processing system or encrypts the cache lines to protect against vulnerabilities.
Type: Grant
Filed: December 28, 2020
Date of Patent: November 22, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Jagadish B. Kotra, Gabriel H. Loh, Matthew R. Poremba
-
Patent number: 11507517
Abstract: Disclosed is a cache directory including one or more cache directories configurable to interchange within each cache directory entry at least one bit between a first field and a second field to change the size of the region of memory represented and the number of cache lines tracked in the cache subsystem.
Type: Grant
Filed: September 25, 2020
Date of Patent: November 22, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Amit Apte, Ganesh Balakrishnan
-
Patent number: 11507527
Abstract: A chiplet system includes a central processing unit (CPU) communicably coupled to a first GPU chiplet of a GPU chiplet array. The GPU chiplet array includes the first GPU chiplet communicably coupled to the CPU via a bus and a second GPU chiplet communicably coupled to the first GPU chiplet via an active bridge chiplet. The active bridge chiplet is an active silicon die that bridges GPU chiplets and allows partitioning of systems-on-a-chip (SoC) functionality into smaller functional chiplet groupings.
Type: Grant
Filed: September 27, 2019
Date of Patent: November 22, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Skyler J. Saleh, Ruijin Wu
-
Patent number: 11503310
Abstract: A device includes an encoder, decoder, codec or combination thereof and inline hardware conversion units that are operative to convert stored image data into one of: an HDR/WCG format and an SDR/SCG format during the conversion process. Each of the inline hardware conversion units is operative to perform the conversion process independent of another read operation with the memory that stores the image data to be converted. In one example, an encoding unit is operative to perform a write operation with a memory to store the converted image data after completing the conversion process. In another example, a decoding unit is operative to perform a read operation with the memory to retrieve the image data from the memory before initiating the conversion process. In another example, an encoder/decoder unit is operative to perform at least one of: the read operation and the write operation.
Type: Grant
Filed: October 31, 2018
Date of Patent: November 15, 2022
Assignees: ATI TECHNOLOGIES ULC, ADVANCED MICRO DEVICES, INC.
Inventors: Lei Zhang, David Glen, Kim A. Meinerth
-
Patent number: 11503295
Abstract: An apparatus for encoding an image and an apparatus for decoding an image are presented. An image contains one or more regions. For encoding the image, the image is decomposed into one or more regions and a region is evaluated to determine whether the region meets a predetermined compression acceptability criteria. The region is then encoded in response to the transformed and quantized region meeting the predetermined compression acceptability criteria. For decoding the image, a region of the image is selected and the selected region is decoded using metadata associated with the selected region. The metadata includes transformation quantization settings and information describing an aspect ratio used to compress the region.
Type: Grant
Filed: May 15, 2020
Date of Patent: November 15, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Andrew S. Pomianowski, Konstantine Iourcha
-
Patent number: 11500778
Abstract: Embodiments include methods, systems and non-transitory computer-readable media including instructions for executing a prefetch kernel with reduced intermediate state storage resource requirements. These include executing a prefetch kernel on a graphics processing unit (GPU), such that the prefetch kernel begins executing before a processing kernel. The prefetch kernel performs memory operations that are based upon at least a subset of memory operations in the processing kernel.
Type: Grant
Filed: March 9, 2020
Date of Patent: November 15, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Nuwan S. Jayasena, James Michael O'Connor, Michael Mantor
-
Patent number: 11494592
Abstract: Systems, apparatuses, and methods for converting data to a tiling format when implementing convolutional neural networks are disclosed. A system includes at least a memory, a cache, a processor, and a plurality of compute units. The memory stores a first buffer and a second buffer in a linear format, where the first buffer stores convolutional filter data and the second buffer stores image data. The processor converts the first and second buffers from the linear format to third and fourth buffers, respectively, in a tiling format. The plurality of compute units load the tiling-formatted data from the third and fourth buffers in memory to the cache and then perform a convolutional filter operation on the tiling-formatted data. The system generates a classification of a first dataset based on a result of the convolutional filter operation.
Type: Grant
Filed: August 28, 2020
Date of Patent: November 8, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Song Zhang, Jiantan Liu, Hua Zhang, Min Yu
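As a rough picture of what converting a buffer from a linear format to a tiling format can mean, the sketch below reorders a row-major 2D buffer so that the elements of each tile become contiguous. The abstract does not define the tiling layout, so the rectangular tiles, the row-major order within and across tiles, and the function name `linear_to_tiled` are assumptions for illustration only.

```python
def linear_to_tiled(linear, height, width, tile_h, tile_w):
    """Reorder a row-major buffer so that each tile's elements are contiguous."""
    if height % tile_h or width % tile_w:
        raise ValueError("this sketch assumes dimensions divisible by the tile size")
    tiled = []
    for tile_row in range(0, height, tile_h):          # walk tiles top to bottom
        for tile_col in range(0, width, tile_w):       # then left to right
            for r in range(tile_row, tile_row + tile_h):
                for c in range(tile_col, tile_col + tile_w):
                    tiled.append(linear[r * width + c])
    return tiled


image = list(range(16))                       # a 4x4 buffer stored row by row
tiled = linear_to_tiled(image, 4, 4, 2, 2)    # four 2x2 tiles, each contiguous
```

Storing each tile contiguously means a compute unit working on one tile reads one dense run of memory rather than several strided rows, which is the usual motivation for such layouts.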
-
Patent number: 11494192
Abstract: A processing element is implemented in a stage of a pipeline and configured to execute an instruction. A first array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a first set of instructions. A second array of multiplexers is to provide information associated with the instruction to the first processing element in response to the instruction being in a second set of instructions. A control unit is to gate at least one of power or a clock signal provided to the first array of multiplexers in response to the instruction being in the second set.
Type: Grant
Filed: April 28, 2020
Date of Patent: November 8, 2022
Assignees: Advanced Micro Devices, Inc., ADVANCED MICRO DEVICES (SHANGHAI) CO., LTD.
Inventors: Jiasheng Chen, YunXiao Zou, Bin He, Angel E. Socarras, QingCheng Wang, Wei Yuan, Michael Mantor
-
Patent number: 11494316
Abstract: A memory controller includes a memory channel controller that uses multiple groups of command queue and arbiter pairs. Each arbiter is coupled to a respective command queue to select memory access commands from each command queue according to predetermined criteria. Each arbiter selects from among the memory access requests in each command queue independently based on the predetermined criteria and sends selected memory access requests to a selector that serves as a second level arbiter which sends the request to a memory subchannel.
Type: Grant
Filed: October 30, 2020
Date of Patent: November 8, 2022
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: James R. Magro, Kedarnath Balakrishnan, Brendan T. Mangan
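The two-level arbitration in the abstract above can be sketched as a per-queue pick followed by a selector. The abstract leaves the "predetermined criteria" unspecified, so both policies below are assumptions chosen for brevity: each first-level arbiter proposes the oldest request in its own queue, and the second-level selector rotates round-robin among the proposals.

```python
from collections import deque


def first_level_pick(queue):
    """First-level arbiter: propose the oldest request in its own queue (assumed)."""
    return queue[0] if queue else None


def issue_one(queues, last_winner):
    """Second-level selector: round-robin over proposals, starting after last_winner.

    Returns (winning queue index, request), or (last_winner, None) if all are empty.
    """
    candidates = [first_level_pick(q) for q in queues]
    n = len(queues)
    for offset in range(1, n + 1):
        idx = (last_winner + offset) % n
        if candidates[idx] is not None:
            queues[idx].popleft()          # the winning request leaves its queue
            return idx, candidates[idx]
    return last_winner, None


queues = [deque(["a0", "a1"]), deque(["b0"])]   # two command queue and arbiter pairs
issued = []
winner = -1
for _ in range(4):
    winner, request = issue_one(queues, winner)
    if request is not None:
        issued.append(request)
```

Because each first-level arbiter only looks at its own queue, the per-queue picks are independent of one another, matching the abstract's description; only the selector sees all proposals.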
-
Patent number: 11495478
Abstract: Systems, apparatuses, and methods for efficiently performing active thermal control during device testing are disclosed. A device testing system includes a device under test, a thermal structure on top of the device under test, and a controller configured to determine when to apply and remove thermal energy to the device under test through the thermal structure. The thermal structure includes a thermal transfer block that transfers thermal energy to and from the device under test below the thermal transfer block. The thermal structure also includes a coolant block above the thermal transfer block that removes thermal energy from the thermal transfer block. There is no heating element between the coolant block and the thermal transfer block. Rather, the thermal structure includes a heating element in a wall of the thermal transfer block. Therefore, an unobstructed thermal path exists from the device under test to the coolant block.
Type: Grant
Filed: December 16, 2019
Date of Patent: November 8, 2022
Assignee: Advanced Micro Devices, Inc.
Inventors: Travis Oneal Cagle, Sheldon S. Grooms
-
Patent number: 11494211
Abstract: An electronic device includes a processor that executes a guest operating system and a hypervisor, an input-output (IO) device, and an input-output memory management unit (IOMMU). The IOMMU handles communications between the IOMMU and the guest operating system by: replacing, in communications received from the guest operating system, guest domain identifiers (domainIDs) with corresponding host domainIDs and/or guest device identifiers (deviceIDs) with corresponding host deviceIDs before further processing the communications; replacing, in communications received from the IO device, host deviceIDs with guest deviceIDs before providing the communications to the guest operating system; and placing, into communications generated in the IOMMU and destined for the guest operating system, guest domainIDs and/or guest deviceIDs before providing the communications to the guest operating system. The IOMMU handles the communications without intervention by the hypervisor.
Type: Grant
Filed: April 22, 2019
Date of Patent: November 8, 2022
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Maggie Chan, Philip Ng, Paul Blinzer
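The identifier substitution in the abstract above amounts to table lookups in both directions. The sketch below models that in software, purely for illustration: the table names, the dictionary-shaped messages, and the specific ID values are hypothetical, and a real IOMMU would perform the equivalent translation in hardware without the hypervisor's involvement.

```python
# Hypothetical translation tables standing in for the IOMMU's ID mappings.
GUEST_TO_HOST_DOMAIN = {1: 101}
GUEST_TO_HOST_DEVICE = {7: 66}
HOST_TO_GUEST_DEVICE = {host: guest for guest, host in GUEST_TO_HOST_DEVICE.items()}


def translate_from_guest(message):
    """Replace guest domainIDs and deviceIDs with host IDs before processing."""
    out = dict(message)
    if "domain_id" in out:
        out["domain_id"] = GUEST_TO_HOST_DOMAIN[out["domain_id"]]
    if "device_id" in out:
        out["device_id"] = GUEST_TO_HOST_DEVICE[out["device_id"]]
    return out


def translate_from_io_device(message):
    """Replace a host deviceID with the guest deviceID before handing to the guest."""
    out = dict(message)
    out["device_id"] = HOST_TO_GUEST_DEVICE[out["device_id"]]
    return out


command = translate_from_guest({"op": "invalidate", "domain_id": 1, "device_id": 7})
event = translate_from_io_device({"op": "fault", "device_id": 66})
```

The guest only ever sees its own identifiers in either direction, which is what lets the abstract's IOMMU handle these communications without trapping to the hypervisor.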