Patents Assigned to Advanced Micro Devices, Inc.

Split frame rendering

Patent number: 10922868

Abstract: Improvements in the graphics processing pipeline that allow multiple pipelines to cooperate to render a single frame are disclosed. Two approaches are provided. In a first approach, world-space pipelines for the different graphics processing pipelines process all work for draw calls received from a central processing unit (CPU). In a second approach, the world-space pipelines divide up the work. Work that is divided is synchronized and redistributed at various points in the world-space pipeline. In either approach, the triangles output by the world-space pipelines are distributed to the screen-space pipelines based on the portions of the render surface overlapped by the triangles. Triangles are rendered by screen-space pipelines associated with the render surface portions overlapped by those triangles.

Type: Grant

Filed: June 26, 2019

Date of Patent: February 16, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Mangesh P. Nijasure, Todd Martin, Michael Mantor
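
The distribution step described in the abstract above, where triangles are routed to screen-space pipelines according to the render surface portions they overlap, can be sketched as follows. This is a minimal illustration, not the patented implementation: the square tiles, the conservative bounding-box overlap test, the checkerboard tile ownership, and the names `overlapped_tiles` and `distribute` are all assumptions made for the example.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def overlapped_tiles(tri: Triangle, tile_size: int,
                     tiles_x: int, tiles_y: int) -> List[Tuple[int, int]]:
    """Return the tiles touched by the triangle's bounding box.

    The bounding box is a conservative stand-in (assumption) for an exact
    triangle/tile overlap test; results are clamped to the render surface.
    """
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    x0 = max(math.floor(min(xs) / tile_size), 0)
    x1 = min(math.floor(max(xs) / tile_size), tiles_x - 1)
    y0 = max(math.floor(min(ys) / tile_size), 0)
    y1 = min(math.floor(max(ys) / tile_size), tiles_y - 1)
    return [(tx, ty) for ty in range(y0, y1 + 1) for tx in range(x0, x1 + 1)]


def distribute(triangles: List[Triangle], num_pipelines: int,
               tile_size: int = 64, tiles_x: int = 4,
               tiles_y: int = 4) -> Dict[int, List[int]]:
    """Map each screen-space pipeline to the indices of triangles it renders.

    Tile ownership is a hypothetical checkerboard: (tx + ty) % num_pipelines.
    """
    work: Dict[int, List[int]] = {p: [] for p in range(num_pipelines)}
    for index, tri in enumerate(triangles):
        tiles = overlapped_tiles(tri, tile_size, tiles_x, tiles_y)
        owners = {(tx + ty) % num_pipelines for tx, ty in tiles}
        for pipeline in sorted(owners):
            work[pipeline].append(index)
    return work
```

A triangle that straddles a tile boundary is sent to every pipeline that owns one of the tiles it touches, so each pipeline only rasterizes inside its own portions of the render surface.
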
High density cross link die with polymer routing layer

Patent number: 10923430

Abstract: Various multi-die arrangements and methods of manufacturing the same are disclosed. In one aspect, a semiconductor chip device is provided that includes a first molding layer and an interconnect chip at least partially encased in the first molding layer. The interconnect chip has a first side and a second side opposite the first side and a polymer layer on the first side. The polymer layer includes plural conductor traces. A redistribution layer (RDL) structure is positioned on the first molding layer and has plural conductor structures electrically connected to the plural conductor traces. The plural conductor traces provide lateral routing.

Type: Grant

Filed: June 30, 2019

Date of Patent: February 16, 2021

Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC

Inventors: Chun-Hung Lin, Rahul Agarwal, Milind Bhagavat, Fei Guo
Using loop exit prediction to accelerate or suppress loop mode of a processor

Patent number: 10915322

Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.

Type: Grant

Filed: September 18, 2018

Date of Patent: February 9, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Arunachalam Annamalai, Marius Evers, Aparna Thyagarajan, Anthony Jarvis
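
The two-threshold decision in the abstract above can be written as a small selection function. This is a sketch under stated assumptions: the names `Mode` and `select_mode` are hypothetical, the second threshold is assumed not to exceed the first, and keeping the current mode when the prediction falls between the two thresholds is an assumption, since the abstract does not say what happens in that range.

```python
from enum import Enum


class Mode(Enum):
    LOOP = "loop"          # execute from the loop buffer, front end in low power
    NON_LOOP = "non-loop"  # execute from the instruction fetch unit


def select_mode(predicted_iterations: int, first_threshold: int,
                second_threshold: int, current_mode: Mode) -> Mode:
    """Choose the execution mode from a predicted loop iteration count.

    Assumes second_threshold <= first_threshold.
    """
    if predicted_iterations > first_threshold:
        return Mode.LOOP
    if predicted_iterations <= second_threshold:
        return Mode.NON_LOOP
    # Between the thresholds: retain the current mode (assumption).
    return current_mode
```
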
Stripe based self-gating for retiming pipelines

Patent number: 10917094

Abstract: Systems, apparatuses, and methods for implementing stripe-based self-gating and change detect signal propagation for retiming pipelines are disclosed. A circuit includes one or more stripes, with each stripe including a plurality of stages of registers, with each stage only receiving input signals from the preceding stage. For a given stripe, the first stage of registers are self-gated to reduce power consumption by only clocking a group of registers when any of their input signals change. The self-gating signals of the first stage of registers are combined together to create a change detect signal which is passed through a register and provided to a second stage of registers as a clock-enable signal. Accordingly, the second stage registers are only clocked when the change detect signal indicates a change will be forwarded from the first stage. This reduces power consumption for the second stage without causing the area increase associated with self-gating circuitry.

Type: Grant

Filed: June 19, 2019

Date of Patent: February 9, 2021

Assignee: Advanced Micro Devices, Inc.

Inventor: Qing Meng
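
The clocking behaviour described in the abstract above can be modelled cycle by cycle. The sketch below is a simplified behavioural model, not the circuit from the patent: it assumes one register per stage, a single self-gating group in the first stage, and a hypothetical function name `simulate_stripe`.

```python
from typing import Iterable, Tuple


def simulate_stripe(inputs: Iterable[int]) -> Tuple[int, int, int, int]:
    """Simulate a two-stage stripe, one input value per clock cycle.

    Returns (stage1 value, stage2 value, stage1 clock count, stage2 clock count).
    """
    stage1 = 0
    stage2 = 0
    change_detect = False  # registered change detect signal
    stage1_clocks = 0
    stage2_clocks = 0
    for value in inputs:
        # Signals as seen at this clock edge.
        enable2 = change_detect     # last cycle's change detect, one cycle late
        enable1 = value != stage1   # self-gating: clock only when input differs
        if enable2:
            stage2 = stage1         # stage 2 samples stage 1's pre-edge output
            stage2_clocks += 1
        if enable1:
            stage1 = value
            stage1_clocks += 1
        change_detect = enable1     # pass the self-gating signal through a register
    return stage1, stage2, stage1_clocks, stage2_clocks
```

In this model the second stage needs no comparison logic of its own: it reuses the first stage's self-gating signal, delayed by one cycle, as its clock enable.
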
Pseudo-random logical to physical core assignment at boot for age averaging

Patent number: 10915330

Abstract: A computing device includes a processor having a plurality of cores, a core translation component, and a core assignment component. The core translation component provides a set of registers, one register for each core of the multiple processor cores. The core assignment component includes components to provide a core index to each of the registers of the core translation component according to a core assignment scheme during processor initialization. Process instructions from an operating system are transferred to a respective core based on the core indices.

Type: Grant

Filed: December 19, 2017

Date of Patent: February 9, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Amitabh Mehra, Krishna Sai Bernucho
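
The register-based translation in the abstract above can be illustrated with a shuffled index table. This is a minimal sketch under assumptions: Python's `random.Random` stands in for whatever pseudo-random scheme the hardware uses, the boot seed is a hypothetical input, and `assign_cores` and `dispatch` are illustrative names.

```python
import random
from typing import List


def assign_cores(num_cores: int, boot_seed: int) -> List[int]:
    """Fill one register per logical core with a physical core index.

    The permutation changes with the boot seed, so over many boots the work
    issued to any one logical core is spread across the physical cores.
    """
    rng = random.Random(boot_seed)
    registers = list(range(num_cores))
    rng.shuffle(registers)
    return registers


def dispatch(logical_core: int, registers: List[int]) -> int:
    """Translate the core index used by the operating system to a physical core."""
    return registers[logical_core]
```

Because the table is a permutation, every logical core still maps to exactly one physical core; only the pairing varies from boot to boot.
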
CONTROL OF DUAL-VOLTAGE MEMORY OPERATION

Publication number: 20210035614

Abstract: An integrated circuit includes a memory and a system management unit. The memory has a memory array operating according to a memory power supply voltage and access circuitry coupled to said memory array operating according to a logic power supply voltage. The system management unit activates a first control signal to control an operation of the memory selectively in response to a magnitude of a difference in voltage between the logic power supply voltage and the memory power supply voltage.

Type: Application

Filed: March 17, 2020

Publication date: February 4, 2021

Applicant: Advanced Micro Devices, Inc.

Inventor: Russell Schreiber
Bit error protection in cache memories

Patent number: 10908991

Abstract: A computing device having a cache memory that is configured in a write-back mode is described. A cache controller in the cache memory acquires, from a record of bit errors that are present in each of a plurality of portions of the cache memory, a number of bit errors in a portion of the cache memory. The cache controller detects a coherency state of data stored in the portion of the cache memory. Based on the coherency state and the number of bit errors, the cache controller selects an error protection from among a plurality of error protections. The cache controller uses the selected error protection to protect the data stored in the portion of the cache memory from errors.

Type: Grant

Filed: September 6, 2018

Date of Patent: February 2, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: John Kalamatianos, Shrikanth Ganapathy
Providing copies of input-output memory management unit registers to guest operating systems

Patent number: 10909053

Abstract: An electronic device includes a processor that executes a guest operating system, an input-output memory management unit (IOMMU), and a main memory that stores an IOMMU backing store. The IOMMU backing store includes a separate copy of a set of IOMMU memory-mapped input-output (MMIO) registers for each guest operating system in a set of supported guest operating systems. The IOMMU receives, from the guest operating system, a communication that accesses data in a given IOMMU MMIO register. The IOMMU then performs a corresponding access of the data in a copy of the given IOMMU MMIO register in the IOMMU backing store associated with the guest operating system.

Type: Grant

Filed: May 27, 2019

Date of Patent: February 2, 2021

Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC

Inventors: Maggie Chan, Philip Ng, Paul Blinzer
CHIPLET-INTEGRATED MACHINE LEARNING ACCELERATORS

Publication number: 20210026686

Abstract: Techniques for performing machine learning operations are provided. The techniques include configuring a first portion of a first chiplet as a cache; performing caching operations via the first portion; configuring at least a first sub-portion of the first portion of the chiplet as directly-accessible memory; and performing machine learning operations with the first sub-portion by a machine learning accelerator within the first chiplet.

Type: Application

Filed: July 20, 2020

Publication date: January 28, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Swapnil P. Sakharshete, Andrew S. Pomianowski, Maxim V. Kazakov, Vineet Goel, Milind N. Nemlekar, Skyler Jonathon Saleh
FORWARD RENDERING PIPELINE WITH LIGHT CULLING

Publication number: 20210027525

Abstract: A method for enhanced forward rendering is disclosed which includes a depth pre-pass, light culling and a final shading. The depth pre-pass minimizes the cost of final shading by avoiding high pixel overdraw. The light culling stage calculates a list of light indices overlapping a pixel. The light indices are calculated on a per-tile basis, where the screen has been split into units of tiles. The final shading evaluates materials using information stored for each light. The forward rendering method may be executed on a processor, such as a single graphics processing unit (GPU) for example.

Type: Application

Filed: October 12, 2020

Publication date: January 28, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Takahiro Harada, Jerry McKee, Jason Yang
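
The light culling stage in the abstract above builds, for each screen tile, a list of the light indices that overlap it. The sketch below shows that idea in two dimensions only. It is an illustration under assumptions: lights are modelled as screen-space circles `(x, y, radius)`, depth bounds from the pre-pass are ignored, and `cull_lights` is a hypothetical name.

```python
from typing import Dict, List, Tuple

Light = Tuple[float, float, float]  # screen-space centre x, centre y, radius


def cull_lights(lights: List[Light], width: int, height: int,
                tile_size: int) -> Dict[Tuple[int, int], List[int]]:
    """Return, per tile, the indices of lights whose circle overlaps the tile."""
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    result: Dict[Tuple[int, int], List[int]] = {}
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            x0, y0 = tx * tile_size, ty * tile_size
            x1 = min(x0 + tile_size, width)
            y1 = min(y0 + tile_size, height)
            indices: List[int] = []
            for i, (lx, ly, radius) in enumerate(lights):
                # Closest point of the tile rectangle to the light centre.
                cx = min(max(lx, x0), x1)
                cy = min(max(ly, y0), y1)
                if (lx - cx) ** 2 + (ly - cy) ** 2 <= radius * radius:
                    indices.append(i)
            result[(tx, ty)] = indices
    return result
```

During final shading, a pixel then evaluates only the lights listed for its tile rather than every light in the scene.
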
Device and method for accelerating matrix multiply operations as a sum of outer products

Patent number: 10902087

Abstract: A processing device is provided which includes memory and a processor comprising a plurality of processor cores in communication with each other via first and second hierarchical communication links. Each processor core in a group of the processor cores is in communication with each other via the first hierarchical communication links. Each processor core is configured to store, in the memory, one of a plurality of sub-portions of data of a first matrix, store, in the memory, one of a plurality of sub-portions of data of a second matrix, determine an outer product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core of the group of processor cores, another sub-portion of data of the second matrix and determine another outer product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.

Type: Grant

Filed: October 31, 2018

Date of Patent: January 26, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
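
The abstract above relies on the identity that a matrix product equals the sum of outer products of the columns of the first matrix with the matching rows of the second. The sketch below shows only that arithmetic on a single processor; it does not model the distribution of sub-portions across cores or the hierarchical links, and the names `outer` and `matmul_outer` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def outer(col: List[float], row: List[float]) -> Matrix:
    """Outer product of a column vector and a row vector."""
    return [[a * b for b in row] for a in col]


def matmul_outer(A: Matrix, B: Matrix) -> Matrix:
    """Compute A @ B as the sum over p of outer(A[:, p], B[p, :])."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    for p in range(k):
        col = [A[i][p] for i in range(n)]
        partial = outer(col, B[p])
        for i in range(n):
            for j in range(m):
                C[i][j] += partial[i][j]
    return C
```

Each partial outer product depends on only one column of the first matrix and one row of the second, which is what lets separate cores compute them independently and exchange sub-portions of the second matrix.
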
Multi-RDL structure packages and methods of fabricating the same

Patent number: 10903168

Abstract: Various arrangements of multi-RDL structure devices are disclosed. In one aspect, an apparatus is provided that includes a first redistribution layer structure and a second redistribution layer structure mounted on the first redistribution layer structure. A first semiconductor chip is mounted on the second redistribution layer structure and electrically connected to both the second redistribution layer structure and the first redistribution layer structure.

Type: Grant

Filed: May 29, 2020

Date of Patent: January 26, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Milind S. Bhagavat, Lei Fu, Farshad Ghahghahi
Secure coprocessor assisted hardware debugging

Patent number: 10895597

Abstract: Systems, apparatuses, and methods for implementing debug features on a secure coprocessor to handle communication and computation between a debug tool and a debug target are disclosed. A debug tool generates a graphical user interface (GUI) to display debug information to a user for help in debugging a debug target such as a system on chip (SoC). A secure coprocessor is embedded on the debug target, and the secure coprocessor receives debug requests generated by the debug tool. The secure coprocessor performs various computation tasks and/or other operations to prevent multiple round-trip messages being sent back and forth between the debug tool and the debug target. The secure coprocessor is able to access system memory and determine a status of a processor being tested even when the processor becomes unresponsive.

Type: Grant

Filed: November 21, 2018

Date of Patent: January 19, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Tan Peng, Dong Zhu
Method and apparatus for data scrambling

Patent number: 10895901

Abstract: A method and apparatus for scrambling and descrambling data in a computer system includes transmitting non-scrambled data from a first high speed inter chip (IP) link circuit located on a first chip to a first serializer/deserializer (SERDES) physical (PHY) circuit located on the first chip, the first high speed link IP indicating the data is not scrambled. The received non-scrambled data is scrambled by the first SERDES PHY circuit and transmitted to a second chip. The received scrambled data is descrambled by a second SERDES PHY circuit located on the second chip. The non-scrambled data is transmitted by the second SERDES PHY circuit to a second high speed link IP circuit located on the second chip to a third circuit for further processing or transmission.

Type: Grant

Filed: September 27, 2019

Date of Patent: January 19, 2021

Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC

Inventors: Yanfeng Wang, Michael J. Tresidder, Kevin M. Lepak, Larry David Hewitt, Noah Beck
Low latency synchronization for operation cache and instruction cache fetching and decoding instructions

Patent number: 10896044

Abstract: The techniques described herein provide an instruction fetch and decode unit having an operation cache with low latency in switching between fetching decoded operations from the operation cache and fetching and decoding instructions using a decode unit. This low latency is accomplished through a synchronization mechanism that allows work to flow through both the operation cache path and the instruction cache path until that work is stopped due to needing to wait on output from the opposite path. The existence of decoupling buffers in the operation cache path and the instruction cache path allows work to be held until that work is cleared to proceed. Other improvements, such as a specially configured operation cache tag array that allows for detection of multiple hits in a single cycle, also improve latency by, for example, improving the speed at which entries are consumed from a prediction queue that stores predicted address blocks.

Type: Grant

Filed: June 21, 2018

Date of Patent: January 19, 2021

Assignee: Advanced Micro Devices, Inc.

Inventors: Marius Evers, Dhanaraj Bapurao Tavare, Ashok Tirupathy Venkatachar, Arunachalam Annamalai, Donald A. Priore, Douglas R. Williams
MULTI-VERSION SHADERS

Publication number: 20210011697

Abstract: Described herein are techniques for generating a stitched shader program. The techniques include identifying a set of shader programs to include in the stitched shader program, wherein the set includes at least one multiversion shader program that includes a first version of instructions and a second version of instructions, wherein the first version of instructions uses a first number of resources that is different than a second number of resources used by the second version of instructions. The techniques also include combining the set of shader programs to form the stitched shader program. The techniques further include determining a number of resources for the stitched shader program. The techniques also include based on the determined number of resources, modifying the instructions corresponding to the multiversion shader program to, when executed, execute either the first version of instructions, or the second version of instructions.

Type: Application

Filed: July 11, 2019

Publication date: January 14, 2021

Applicant: Advanced Micro Devices, Inc.

Inventor: Sumesh Udayakumaran
ADAPTIVE FILTER REPLACEMENT IN CONVOLUTIONAL NEURAL NETWORKS

Publication number: 20210012203

Abstract: Systems, methods, and devices for increasing inference speed of a trained convolutional neural network (CNN). A first computation speed of first filters having a first filter size in a layer of the CNN is determined, and a second computation speed of second filters having a second filter size in the layer of the CNN is determined. The size of at least one of the first filters is changed to the second filter size if the second computation speed is faster than the first computation speed. In some implementations the CNN is retrained, after changing the size of at least one of the first filters to the second filter size, to generate a retrained CNN. The size of a fewer number of the first filters is changed to the second filter size if a key performance indicator loss of the retrained CNN exceeds a threshold.

Type: Application

Filed: July 10, 2019

Publication date: January 14, 2021

Applicant: Advanced Micro Devices, Inc.

Inventors: Abhinav Vishnu, Prakash Sathyanath Raghavendra, Tamer M. Elsharnouby, Rachida Kebichi, Walid Ali, Jonathan Charles Gallmeier
Replacing pointers with hashing in tree-based page table designs

Patent number: 10884948

Abstract: A device includes an address translation table to, in each node of a set of nodes in the address translation table, store a key value and a hash function identifier, a hash engine coupled with the address translation table to, for each node in the set of nodes, calculate a hash result for the key value by executing a hash function identified by the hash function identifier, and a processing unit coupled with the hash engine to, in response to a request to translate a virtual memory address to a physical memory address, identify a physical memory region corresponding to the virtual memory address based on the calculated hash result for each node in the set of nodes.

Type: Grant

Filed: May 16, 2019

Date of Patent: January 5, 2021

Assignee: Advanced Micro Devices, Inc.

Inventor: Alexander D. Breslow
Coordinating accesses of shared resources by clients in a computing device

Patent number: 10884477

Abstract: The described embodiments include a computing device with a plurality of clients and a shared resource for processing job items. During operation, a given client of the plurality of clients stores first job items in a queue for the given client. When the queue for the given client meets one or more conditions, the given client notifies one or more other clients that the given client is to process job items using the shared resource. The given client then processes the first job items from the queue using the shared resource. Based on being notified, at least one other client that has second job items to be processed using the shared resource, processes the second job items using the shared resource. The given client can transition the shared resource between power states to enable the processing of job items.

Type: Grant

Filed: October 20, 2016

Date of Patent: January 5, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Alexander J. Branover, Benjamin Tsien
Computational optics

Patent number: 10884319

Abstract: A system and method for controlling characteristics of collected image data are disclosed. The system and method include performing pre-processing of an image using GPUs, configuring an optic based on the pre-processing, the configuring being designed to account for features of the pre-processed image, acquiring an image using the configured optic, processing the acquired image using GPUs, and determining whether the processed acquired image accounts for the features of the pre-processed image. If the determination is affirmative, the image is output; if the determination is negative, the configuring of the optic and the acquiring of the image are repeated.

Type: Grant

Filed: October 23, 2017

Date of Patent: January 5, 2021

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Allen H. Rush, Hui Zhou
