Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 10922868
Abstract: Improvements in the graphics processing pipeline that allow multiple pipelines to cooperate to render a single frame are disclosed. Two approaches are provided. In a first approach, world-space pipelines for the different graphics processing pipelines process all work for draw calls received from a central processing unit (CPU). In a second approach, the world-space pipelines divide up the work. Work that is divided is synchronized and redistributed at various points in the world-space pipeline. In either approach, the triangles output by the world-space pipelines are distributed to the screen-space pipelines based on the portions of the render surface overlapped by the triangles. Triangles are rendered by screen-space pipelines associated with the render surface portions overlapped by those triangles.
Type: Grant
Filed: June 26, 2019
Date of Patent: February 16, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Mangesh P. Nijasure, Todd Martin, Michael Mantor
-
Patent number: 10923430
Abstract: Various multi-die arrangements and methods of manufacturing the same are disclosed. In one aspect, a semiconductor chip device is provided that includes a first molding layer and an interconnect chip at least partially encased in the first molding layer. The interconnect chip has a first side and a second side opposite the first side and a polymer layer on the first side. The polymer layer includes plural conductor traces. A redistribution layer (RDL) structure is positioned on the first molding layer and has plural conductor structures electrically connected to the plural conductor traces. The plural conductor traces provide lateral routing.
Type: Grant
Filed: June 30, 2019
Date of Patent: February 16, 2021
Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC
Inventors: Chun-Hung Lin, Rahul Agarwal, Milind Bhagavat, Fei Guo
-
Patent number: 10915322
Abstract: A processor predicts a number of loop iterations associated with a set of loop instructions. In response to the predicted number of loop iterations exceeding a first loop iteration threshold, the set of loop instructions are executed in a loop mode that includes placing at least one component of an instruction pipeline of the processor in a low-power mode or state and executing the set of loop instructions from a loop buffer. In response to the predicted number of loop iterations being less than or equal to a second loop iteration threshold, the set of instructions are executed in a non-loop mode that includes maintaining at least one component of the instruction pipeline in a powered up state and executing the set of loop instructions from an instruction fetch unit of the instruction pipeline.
Type: Grant
Filed: September 18, 2018
Date of Patent: February 9, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Arunachalam Annamalai, Marius Evers, Aparna Thyagarajan, Anthony Jarvis
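The two-threshold decision in the abstract above can be illustrated with a small Python sketch. This is not the patented implementation: the function name, the mode strings, and the choice to keep the current mode when the prediction falls between the two thresholds are assumptions made for illustration, since the abstract does not say what happens in that range.

```python
def select_mode(predicted_iterations, first_threshold, second_threshold, current_mode):
    """Pick an execution mode from a predicted loop iteration count.

    Assumption: predictions that fall between the two thresholds leave
    the current mode unchanged. The abstract does not specify this case.
    """
    if predicted_iterations > first_threshold:
        # Long loop: run from the loop buffer, part of the pipeline may power down.
        return "loop"
    if predicted_iterations <= second_threshold:
        # Short loop: keep the pipeline powered and fetch normally.
        return "non-loop"
    return current_mode


# Usage with hypothetical thresholds of 16 and 4 iterations.
mode = select_mode(100, 16, 4, "non-loop")
```

With a gap between the thresholds, a prediction that hovers near one boundary does not flip the mode on every loop, which is one plausible reason for having two thresholds rather than one.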
-
Patent number: 10917094
Abstract: Systems, apparatuses, and methods for implementing stripe-based self-gating and change detect signal propagation for retiming pipelines are disclosed. A circuit includes one or more stripes, with each stripe including a plurality of stages of registers, with each stage only receiving input signals from the preceding stage. For a given stripe, the first stage of registers are self-gated to reduce power consumption by only clocking a group of registers when any of their input signals change. The self-gating signals of the first stage of registers are combined together to create a change detect signal which is passed through a register and provided to a second stage of registers as a clock-enable signal. Accordingly, the second stage registers are only clocked when the change detect signal indicates a change will be forwarded from the first stage. This reduces power consumption for the second stage without causing the area increase associated with self-gating circuitry.
Type: Grant
Filed: June 19, 2019
Date of Patent: February 9, 2021
Assignee: Advanced Micro Devices, Inc.
Inventor: Qing Meng
-
Patent number: 10915330
Abstract: A computing device includes a processor having a plurality of cores, a core translation component, and a core assignment component. The core translation component provides a set of registers, one register for each core of the multiple processor cores. The core assignment component includes components to provide a core index to each of the registers of the core translation component according to a core assignment scheme during processor initialization. Process instructions from an operating system are transferred to a respective core based on the core indices.
Type: Grant
Filed: December 19, 2017
Date of Patent: February 9, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Amitabh Mehra, Krishna Sai Bernucho
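As a rough illustration of the register-based core translation described above, the following Python sketch models one register per core, filled at initialization by an assignment scheme and then used to route work. The class name, the function names, and the `rotate_by_one` scheme are hypothetical; the abstract does not name any particular scheme.

```python
class CoreTranslation:
    """One register per core; each register holds a core index."""

    def __init__(self, num_cores):
        self.registers = [None] * num_cores

    def translate(self, os_core_id):
        # Work that the operating system targets at os_core_id is sent
        # to the core whose index is stored in the matching register.
        return self.registers[os_core_id]


def assign_cores(translation, scheme):
    """Fill every register once, as would happen during initialization."""
    indices = scheme(len(translation.registers))
    for register, core_index in enumerate(indices):
        translation.registers[register] = core_index


def rotate_by_one(num_cores):
    # Hypothetical assignment scheme, chosen only for illustration.
    return [(i + 1) % num_cores for i in range(num_cores)]


table = CoreTranslation(4)
assign_cores(table, rotate_by_one)
target = table.translate(3)  # core that receives work aimed at OS core 3
```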
-
Publication number: 20210035614
Abstract: An integrated circuit includes a memory and a system management unit. The memory has a memory array operating according to a memory power supply voltage and access circuitry coupled to said memory array operating according to a logic power supply voltage. The system management unit activates a first control signal to control an operation of the memory selectively in response to a magnitude of a difference in voltage between the logic power supply voltage and the memory power supply voltage.
Type: Application
Filed: March 17, 2020
Publication date: February 4, 2021
Applicant: Advanced Micro Devices, Inc.
Inventor: Russell Schreiber
-
Patent number: 10908991
Abstract: A computing device having a cache memory that is configured in a write-back mode is described. A cache controller in the cache memory acquires, from a record of bit errors that are present in each of a plurality of portions of the cache memory, a number of bit errors in a portion of the cache memory. The cache controller detects a coherency state of data stored in the portion of the cache memory. Based on the coherency state and the number of bit errors, the cache controller selects an error protection from among a plurality of error protections. The cache controller uses the selected error protection to protect the data stored in the portion of the cache memory from errors.
Type: Grant
Filed: September 6, 2018
Date of Patent: February 2, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: John Kalamatianos, Shrikanth Ganapathy
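The selection step in the abstract above takes two inputs, a coherency state and a bit-error count, and returns one of several protections. The Python sketch below shows that shape only. The policy itself is an assumption: it uses MOESI-style state letters, treats modified or owned data as needing stronger protection because a write-back cache may hold the only up-to-date copy, and picks from three hypothetical protection names. None of these specifics come from the abstract.

```python
# Hypothetical protections, ordered from weakest to strongest.
PROTECTIONS = ["parity", "secded", "dected"]


def select_protection(coherency_state, bit_errors):
    """Choose an error protection for one portion of the cache.

    Assumed policy: more known bit errors call for stronger protection,
    and dirty data ("M" or "O") gets one extra step because it cannot
    simply be refetched from memory.
    """
    dirty = coherency_state in ("M", "O")
    level = bit_errors + (1 if dirty else 0)
    return PROTECTIONS[min(level, len(PROTECTIONS) - 1)]


choice = select_protection("M", 1)
```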
-
Patent number: 10909053
Abstract: An electronic device includes a processor that executes a guest operating system, an input-output memory management unit (IOMMU), and a main memory that stores an IOMMU backing store. The IOMMU backing store includes a separate copy of a set of IOMMU memory-mapped input-output (MMIO) registers for each guest operating system in a set of supported guest operating systems. The IOMMU receives, from the guest operating system, a communication that accesses data in a given IOMMU MMIO register. The IOMMU then performs a corresponding access of the data in a copy of the given IOMMU MMIO register in the IOMMU backing store associated with the guest operating system.
Type: Grant
Filed: May 27, 2019
Date of Patent: February 2, 2021
Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC
Inventors: Maggie Chan, Philip Ng, Paul Blinzer
-
Publication number: 20210026686
Abstract: Techniques for performing machine learning operations are provided. The techniques include configuring a first portion of a first chiplet as a cache; performing caching operations via the first portion; configuring at least a first sub-portion of the first portion of the chiplet as directly-accessible memory; and performing machine learning operations with the first sub-portion by a machine learning accelerator within the first chiplet.
Type: Application
Filed: July 20, 2020
Publication date: January 28, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Swapnil P. Sakharshete, Andrew S. Pomianowski, Maxim V. Kazakov, Vineet Goel, Milind N. Nemlekar, Skyler Jonathon Saleh
-
Publication number: 20210027525
Abstract: A method for enhanced forward rendering is disclosed which includes a depth pre-pass, light culling and a final shading. The depth pre-pass minimizes the cost of final shading by avoiding high pixel overdraw. The light culling stage calculates a list of light indices overlapping a pixel. The light indices are calculated on a per-tile basis, where the screen has been split into units of tiles. The final shading evaluates materials using information stored for each light. The forward rendering method may be executed on a processor, such as a single graphics processing unit (GPU) for example.
Type: Application
Filed: October 12, 2020
Publication date: January 28, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Takahiro Harada, Jerry McKee, Jason Yang
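The per-tile light culling stage described above can be sketched on the CPU in a few lines of Python. This is a simplified 2D illustration under stated assumptions: each light is modeled as a screen-space circle `(cx, cy, radius)`, depth is ignored, and the function name and tile size are hypothetical. A real implementation would run on the GPU and would also use the depth range from the pre-pass.

```python
def cull_lights(width, height, tile, lights):
    """Return {(tile_x, tile_y): [light indices]} for a screen split into tiles.

    Assumption: lights are screen-space circles given as (cx, cy, radius).
    """
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    per_tile = {}
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            x0, y0 = tx * tile, ty * tile
            x1, y1 = min(x0 + tile, width), min(y0 + tile, height)
            indices = []
            for i, (cx, cy, radius) in enumerate(lights):
                # Closest point of the tile rectangle to the light center.
                nx = min(max(cx, x0), x1)
                ny = min(max(cy, y0), y1)
                if (cx - nx) ** 2 + (cy - ny) ** 2 <= radius * radius:
                    indices.append(i)
            per_tile[(tx, ty)] = indices
    return per_tile


# Two lights on a 32x16 screen with 16-pixel tiles.
light_lists = cull_lights(32, 16, 16, [(8, 8, 4), (16, 8, 2)])
```

Final shading for a pixel would then loop only over the indices stored for that pixel's tile instead of over every light in the scene.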
-
Patent number: 10902087
Abstract: A processing device is provided which includes memory and a processor comprising a plurality of processor cores in communication with each other via first and second hierarchical communication links. Each processor core in a group of the processor cores is in communication with each other via the first hierarchical communication links. Each processor core is configured to store, in the memory, one of a plurality of sub-portions of data of a first matrix, store, in the memory, one of a plurality of sub-portions of data of a second matrix, determine an outer product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core of the group of processor cores, another sub-portion of data of the second matrix and determine another outer product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
Type: Grant
Filed: October 31, 2018
Date of Patent: January 26, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski
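One way to picture the exchange described above is a ring of cores, each keeping its own chunk of a column of the first matrix while chunks of a row of the second matrix are passed from core to core. The Python sketch below simulates that sequentially. The ring topology, the chunking into a single column and row, and all names are assumptions for illustration; the abstract only states that a core computes one outer product, receives another sub-portion from a peer, and computes another.

```python
def outer(u, v):
    """Outer product of two vectors as a list of rows."""
    return [[x * y for y in v] for x in u]


def ring_outer_products(a_chunks, b_chunks):
    """Simulate cores that each keep one a-chunk and pass b-chunks around a ring.

    Returns {(i, j): outer(a_chunks[i], b_chunks[j])} for every pair.
    Assumption: the number of a-chunks equals the number of b-chunks.
    """
    cores = len(a_chunks)
    held = list(range(cores))  # index of the b-chunk each core currently holds
    blocks = {}
    for _ in range(cores):
        for core in range(cores):
            j = held[core]
            blocks[(core, j)] = outer(a_chunks[core], b_chunks[j])
        # Each core receives the chunk held by its neighbor in the ring.
        held = held[1:] + held[:1]
    return blocks


blocks = ring_outer_products([[1, 2], [3]], [[4], [5, 6]])
```

After as many steps as there are cores, every core has seen every chunk of the row, so the blocks together tile the full outer product of the column and the row.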
-
Patent number: 10903168
Abstract: Various arrangements of multi-RDL structure devices are disclosed. In one aspect, an apparatus is provided that includes a first redistribution layer structure and a second redistribution layer structure mounted on the first redistribution layer structure. A first semiconductor chip is mounted on the second redistribution layer structure and electrically connected to both the second redistribution layer structure and the first redistribution layer structure.
Type: Grant
Filed: May 29, 2020
Date of Patent: January 26, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Milind S. Bhagavat, Lei Fu, Farshad Ghahghahi
-
Patent number: 10895597
Abstract: Systems, apparatuses, and methods for implementing debug features on a secure coprocessor to handle communication and computation between a debug tool and a debug target are disclosed. A debug tool generates a graphical user interface (GUI) to display debug information to a user for help in debugging a debug target such as a system on chip (SoC). A secure coprocessor is embedded on the debug target, and the secure coprocessor receives debug requests generated by the debug tool. The secure coprocessor performs various computation tasks and/or other operations to prevent multiple round-trip messages being sent back and forth between the debug tool and the debug target. The secure coprocessor is able to access system memory and determine a status of a processor being tested even when the processor becomes unresponsive.
Type: Grant
Filed: November 21, 2018
Date of Patent: January 19, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Tan Peng, Dong Zhu
-
Patent number: 10895901
Abstract: A method and apparatus for scrambling and descrambling data in a computer system includes transmitting non-scrambled data from a first high speed inter chip (IP) link circuit located on a first chip to a first serializer/deserializer (SERDES) physical (PHY) circuit located on the first chip, the first high speed link IP indicating the data is not scrambled. The received non-scrambled data is scrambled by the first SERDES PHY circuit and transmitted to a second chip. The received scrambled data is descrambled by a second SERDES PHY circuit located on the second chip. The non-scrambled data is transmitted by the second SERDES PHY circuit to a second high speed link IP circuit located on the second chip to a third circuit for further processing or transmission.
Type: Grant
Filed: September 27, 2019
Date of Patent: January 19, 2021
Assignees: ADVANCED MICRO DEVICES INC., ATI TECHNOLOGIES ULC
Inventors: Yanfeng Wang, Michael J. Tresidder, Kevin M. Lepak, Larry David Hewitt, Noah Beck
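The scramble-then-descramble path described above can be illustrated with a generic additive scrambler: the transmit side XORs data with a pseudo-random keystream and the receive side applies the same keystream to recover it. The Python sketch below is only that generic technique. The 16-bit LFSR, its tap positions, the seed, and the function names are assumptions; the abstract does not say which scrambling scheme the SERDES PHY circuits use.

```python
def keystream(seed, length):
    """Generate `length` bytes from a 16-bit Fibonacci LFSR.

    Assumption: taps at bits 15, 4, 3 and 2 are picked for illustration.
    """
    state = seed & 0xFFFF
    out = bytearray()
    for _ in range(length):
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | ((state >> 15) & 1)
            feedback = ((state >> 15) ^ (state >> 4) ^ (state >> 3) ^ (state >> 2)) & 1
            state = ((state << 1) | feedback) & 0xFFFF
        out.append(byte)
    return bytes(out)


def scramble(data, seed=0xFFFF):
    """XOR data with the keystream. Applying it twice restores the input."""
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))


# Transmit side scrambles, receive side applies the same operation.
wire = scramble(b"link payload")
received = scramble(wire)
```

Both ends must start from the same seed and stay in step, which is why the same function serves as scrambler and descrambler here.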
-
Patent number: 10896044
Abstract: The techniques described herein provide an instruction fetch and decode unit having an operation cache with low latency in switching between fetching decoded operations from the operation cache and fetching and decoding instructions using a decode unit. This low latency is accomplished through a synchronization mechanism that allows work to flow through both the operation cache path and the instruction cache path until that work is stopped due to needing to wait on output from the opposite path. The existence of decoupling buffers in the operation cache path and the instruction cache path allows work to be held until that work is cleared to proceed. Other improvements, such as a specially configured operation cache tag array that allows for detection of multiple hits in a single cycle, also improve latency by, for example, improving the speed at which entries are consumed from a prediction queue that stores predicted address blocks.
Type: Grant
Filed: June 21, 2018
Date of Patent: January 19, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Marius Evers, Dhanaraj Bapurao Tavare, Ashok Tirupathy Venkatachar, Arunachalam Annamalai, Donald A. Priore, Douglas R. Williams
-
Publication number: 20210011697
Abstract: Described herein are techniques for generating a stitched shader program. The techniques include identifying a set of shader programs to include in the stitched shader program, wherein the set includes at least one multiversion shader program that includes a first version of instructions and a second version of instructions, wherein the first version of instructions uses a first number of resources that is different than a second number of resources used by the second version of instructions. The techniques also include combining the set of shader programs to form the stitched shader program. The techniques further include determining a number of resources for the stitched shader program. The techniques also include based on the determined number of resources, modifying the instructions corresponding to the multiversion shader program to, when executed, execute either the first version of instructions, or the second version of instructions.
Type: Application
Filed: July 11, 2019
Publication date: January 14, 2021
Applicant: Advanced Micro Devices, Inc.
Inventor: Sumesh Udayakumaran
-
Publication number: 20210012203
Abstract: Systems, methods, and devices for increasing inference speed of a trained convolutional neural network (CNN). A first computation speed of first filters having a first filter size in a layer of the CNN is determined, and a second computation speed of second filters having a second filter size in the layer of the CNN is determined. The size of at least one of the first filters is changed to the second filter size if the second computation speed is faster than the first computation speed. In some implementations the CNN is retrained, after changing the size of at least one of the first filters to the second filter size, to generate a retrained CNN. The size of a fewer number of the first filters is changed to the second filter size if a key performance indicator loss of the retrained CNN exceeds a threshold.
Type: Application
Filed: July 10, 2019
Publication date: January 14, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Abhinav Vishnu, Prakash Sathyanath Raghavendra, Tamer M. Elsharnouby, Rachida Kebichi, Walid Ali, Jonathan Charles Gallmeier
-
Patent number: 10884948
Abstract: A device includes an address translation table to, in each node of a set of nodes in the address translation table, store a key value and a hash function identifier, a hash engine coupled with the address translation table to, for each node in the set of nodes, calculate a hash result for the key value by executing a hash function identified by the hash function identifier, and a processing unit coupled with the hash engine to, in response to a request to translate a virtual memory address to a physical memory address, identify a physical memory region corresponding to the virtual memory address based on the calculated hash result for each node in the set of nodes.
Type: Grant
Filed: May 16, 2019
Date of Patent: January 5, 2021
Assignee: Advanced Micro Devices, Inc.
Inventor: Alexander D. Breslow
-
Patent number: 10884477
Abstract: The described embodiments include a computing device with a plurality of clients and a shared resource for processing job items. During operation, a given client of the plurality of clients stores first job items in a queue for the given client. When the queue for the given client meets one or more conditions, the given client notifies one or more other clients that the given client is to process job items using the shared resource. The given client then processes the first job items from the queue using the shared resource. Based on being notified, at least one other client that has second job items to be processed using the shared resource, processes the second job items using the shared resource. The given client can transition the shared resource between power states to enable the processing of job items.
Type: Grant
Filed: October 20, 2016
Date of Patent: January 5, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Alexander J. Branover, Benjamin Tsien
-
Patent number: 10884319
Abstract: A system and method for controlling characteristics of collected image data are disclosed. The system and method include performing pre-processing of an image using GPUs, configuring an optic based on the pre-processing, the configuring being designed to account for features of the pre-processed image, acquiring an image using the configured optic, processing the acquired image using GPUs, and determining if the processed acquired image accounts for feature of the pre-processed image, and the determination is affirmative, outputting the image, wherein if the determination is negative repeating the configuring of the optic and re-acquiring the image.
Type: Grant
Filed: October 23, 2017
Date of Patent: January 5, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Allen H. Rush, Hui Zhou