Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 11625251
Abstract: A parallel processing (PP) level coherence directory, also referred to as a Processing In-Memory Probe Filter (PimPF), is added to a coherence directory controller. When the coherence directory controller receives a broadcast PIM command from a host, or a PIM command that is directed to multiple memory banks in parallel, the PimPF accelerates processing of the PIM command by maintaining a directory for cache coherence that is separate from existing system level directories in the coherence directory controller. The PimPF maintains a directory according to address signatures that define the memory addresses affected by a broadcast PIM command. Two implementations are described: a lightweight implementation that accelerates PIM loads into registers, and a heavyweight implementation that accelerates both PIM loads into registers and PIM stores into memory.
Type: Grant
Filed: December 23, 2021
Date of Patent: April 11, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Varun Agrawal, Yasuko Eckert
-
Patent number: 11626874
Abstract: Systems, apparatuses, and methods for conveying and receiving information as electrical signals in a computing system are disclosed. A computing system includes multiple transmitters sending single-ended data signals to multiple receivers. A termination voltage is generated and sent to the multiple receivers. The termination voltage is coupled to each of signal termination circuitry and signal sampling circuitry within each of the multiple receivers. Any change in the termination voltage affects the termination circuitry and affects comparisons performed by the sampling circuitry. Received signals are reconstructed at the receivers using the received signals, the signal termination circuitry and the signal sampling circuitry.
Type: Grant
Filed: October 15, 2021
Date of Patent: April 11, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Achal Kathuria, Pradeep Jayaraman
-
Patent number: 11625352
Abstract: A memory controller includes a command queue and an arbiter for selecting entries from the command queue for transmission to a DRAM. The arbiter transacts streaks of consecutive read commands and streaks of consecutive write commands. The arbiter has a current mode indicating the type of commands currently being transacted, and a cross mode indicating the other type. The arbiter is operable to monitor commands in the command queue for the current mode and the cross mode, and in response to designated conditions, send at least one cross-mode command to the memory interface queue while continuing to operate in the current mode. In response to an end streak condition, the arbiter swaps the current mode and the cross mode, and transacts the cross-mode command.
Type: Grant
Filed: June 12, 2020
Date of Patent: April 11, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Guanhao Shen, Ravindra Nath Bhargava, Raghava Sravan Adidamu
-
Patent number: 11625807
Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
Type: Grant
Filed: February 22, 2021
Date of Patent: April 11, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor
-
Publication number: 20230105709
Abstract: A cache includes an upstream port, a downstream port, a cache memory, and a control circuit. The control circuit temporarily stores memory access requests received from the upstream port, and checks for dependencies for a new memory access request with older memory access requests temporarily stored therein. If one of the older memory access requests creates a false dependency with the new memory access request, the control circuit drops an allocation of a cache line to the cache memory for the older memory access request while continuing to process the new memory access request.
Type: Application
Filed: December 28, 2021
Publication date: April 6, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Chintan S. Patel, Girish Balaiah Aswathaiya
-
Publication number: 20230109344
Abstract: Techniques for performing cache operations are provided. The techniques include tracking performance events for a plurality of test sets of a cache, detecting a replacement policy change trigger event associated with a test set of the plurality of test sets, and in response to the replacement policy change trigger event, operating non-test sets of the cache according to a replacement policy associated with the test set.
Type: Application
Filed: September 30, 2021
Publication date: April 6, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: John Kelley, Vanchinathan Venkataramani, Paul J. Moyer
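Illustrative sketch (not taken from the application): the test-set idea in this abstract can be modeled in a few lines of Python. Everything here is an assumption made for illustration: the class name `PolicySelector`, the use of miss counts as the tracked performance event, the policy labels, and a trigger that fires when a test-set policy beats the active policy by `threshold` misses.

```python
from collections import defaultdict


class PolicySelector:
    """Tracks misses in test sets and picks the policy used by non-test sets."""

    def __init__(self, test_set_policy, default_policy, threshold=4):
        # Maps a test-set index to the replacement policy that set always uses.
        self.test_set_policy = dict(test_set_policy)
        # Assumed to be one of the policies exercised by the test sets.
        self.active_policy = default_policy
        self.threshold = threshold
        self.misses = defaultdict(int)  # miss count per policy, test sets only

    def record_miss(self, set_index):
        policy = self.test_set_policy.get(set_index)
        if policy is None:
            return  # non-test sets contribute no performance events
        self.misses[policy] += 1
        self._maybe_switch()

    def _maybe_switch(self):
        best = min(self.test_set_policy.values(), key=lambda p: self.misses[p])
        gap = self.misses[self.active_policy] - self.misses[best]
        # Assumed trigger: the best test policy leads by at least `threshold`.
        if best != self.active_policy and gap >= self.threshold:
            self.active_policy = best

    def policy_for(self, set_index):
        # Test sets keep their fixed policy; all other sets follow the winner.
        return self.test_set_policy.get(set_index, self.active_policy)
```

For example, with `PolicySelector({0: "lru", 1: "rrip"}, "lru", threshold=2)`, two misses recorded against test set 0 move every non-test set to `"rrip"` while set 0 itself stays on `"lru"`.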
-
Publication number: 20230108964
Abstract: Methods, devices, and systems for retrieving information based on cache miss prediction. A prediction that a cache lookup for the information will miss a cache is made based on a history table. The cache lookup for the information is performed based on the request. A main memory fetch for the information is begun before the cache lookup completes, based on the prediction that the cache lookup for the information will miss the cache. In some implementations, the prediction includes comparing a first set of bits stored in the history table with a second set of bits stored in the history table. In some implementations, the prediction includes comparing at least a portion of an address of the request for the information with a set of bits in the history table.
Type: Application
Filed: September 30, 2021
Publication date: April 6, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Ciji Isen, Paul J. Moyer
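Illustrative sketch (not taken from the application): one way to picture the address-comparison variant is a small history table holding partial tags. The names `MissPredictor` and `load`, the table size, the tag width, and the rule "a tag mismatch predicts a miss" are all assumptions. The cache and memory are plain dictionaries, and the overlap between the early memory fetch and the cache lookup is modeled sequentially.

```python
class MissPredictor:
    """History table of partial tags; a tag mismatch predicts a cache miss."""

    def __init__(self, entries=256, tag_bits=8, line_bits=6):
        self.entries = entries
        self.tag_mask = (1 << tag_bits) - 1
        self.line_bits = line_bits
        self.table = [None] * entries

    def _split(self, address):
        line = address >> self.line_bits
        return line % self.entries, (line // self.entries) & self.tag_mask

    def predict_miss(self, address):
        index, tag = self._split(address)
        return self.table[index] != tag

    def record_fill(self, address):
        index, tag = self._split(address)
        self.table[index] = tag


def load(address, cache, memory, predictor):
    """Return (value, early_fetch_started) for one request."""
    early = predictor.predict_miss(address)
    # Hardware would start this fetch in parallel with the cache lookup.
    prefetched = memory[address] if early else None
    if address in cache:
        return cache[address], early  # hit; any early fetch was wasted work
    value = prefetched if early else memory[address]
    cache[address] = value
    predictor.record_fill(address)
    return value, early
```

A first `load` of an address starts the early fetch because the table entry is empty; a repeat `load` of the same address finds a matching partial tag and skips it.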
-
Patent number: 11620248
Abstract: A system and method for efficient data transfer in a computing system are described. A computing system includes multiple nodes that receive tasks to process. A bridge interconnect transfers data between two processing nodes without the aid of a system bus on the motherboard. One of the multiple bridge interconnects of the computing system is an optical bridge interconnect that transmits optical information across the optical bridge interconnect between two nodes. The receiving node uses photonic integrated circuits to translate the optical information into electrical information for processing by electrical integrated circuits. One or more nodes switch between using an optical bridge interconnect and a non-optical bridge interconnect based on one or more factors such as measured power consumption and measured data transmission error rates.
Type: Grant
Filed: March 31, 2021
Date of Patent: April 4, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Robert E. Radke, Christopher M. Jaggers
-
Patent number: 11620788
Abstract: Accesses to a mipmap by a shader in a graphics pipeline are monitored. The mipmap is stored in a memory or cache associated with the shader and the mipmap represents a texture at a hierarchy of levels of detail. A footprint in the mipmap of the texture is marked based on the monitored accesses. The footprint indicates, on a per-tile, per-level-of-detail (LOD) basis, tiles of the mipmap that are expected to be accessed in subsequent shader operations. In some cases, the footprint is defined by a plurality of footprint indicators that indicate whether the tiles of the mipmap are expected to be accessed in subsequent shader operations. In that case, the plurality of footprint indicators are set to a first value to indicate that the tile was not accessed during the first frame or a second value to indicate that the tile was accessed during the first frame.
Type: Grant
Filed: December 16, 2020
Date of Patent: April 4, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: Christopher J. Brennan
-
Patent number: 11620525
Abstract: A heterogeneous processing system includes at least one central processing unit (CPU) core and at least one graphics processing unit (GPU) core. The heterogeneous processing system is configured to compute an activation for each one of a plurality of neurons for a first network layer of a neural network. The heterogeneous processing system randomly drops a first subset of the plurality of neurons for the first network layer and keeps a second subset of the plurality of neurons for the first network layer. Activation for each one of the second subset of the plurality of neurons is forwarded to the CPU core and coalesced to generate a set of coalesced activation sub-matrices.
Type: Grant
Filed: September 25, 2018
Date of Patent: April 4, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: Abhinav Vishnu
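Illustrative sketch (not taken from the patent): the drop-then-coalesce step can be pictured as selecting a random subset of neuron columns and packing the survivors into a dense matrix. The function name `dropout_and_coalesce`, the row-per-sample layout, and the `keep_prob` parameter are assumptions, and the CPU/GPU split described in the abstract is not modeled.

```python
import random


def dropout_and_coalesce(activations, keep_prob, rng):
    """Drop neurons at random and pack the kept activations densely.

    activations: list of rows (one per sample), one column per neuron.
    Returns (kept, dense), where `kept` lists the surviving neuron indices
    and `dense` contains only those columns, in the same order.
    """
    neuron_count = len(activations[0])
    # Each neuron is kept independently with probability keep_prob.
    kept = [j for j in range(neuron_count) if rng.random() < keep_prob]
    # Coalesce: copy only the kept columns so no gaps remain.
    dense = [[row[j] for j in kept] for row in activations]
    return kept, dense
```

Calling `dropout_and_coalesce(acts, 0.5, random.Random(0))` returns the indices of the kept neurons together with a matrix that is narrower than `acts` whenever a neuron is dropped, which is the property that lets later matrix work skip the dropped columns.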
-
Patent number: 11619982
Abstract: An integrated circuit includes a plurality of tiles receiving a power supply voltage, each having a corresponding analog circuit and operating in response to a first voltage, and a hardware controller receiving a voltage identification code and providing the first voltage to each of the plurality of tiles in response thereto. The hardware controller comprises a test time controller determining coefficients of a waveform that describes an average correspondence between the power supply voltage and the first voltage for the plurality of tiles, and a boot time controller determining a respective error signal indicating an error between the waveform and a respective actual waveform for each of the plurality of tiles, and providing the respective error signal to the corresponding analog circuit of each of the plurality of tiles. The corresponding analog circuit of each of the plurality of tiles adjusts the first voltage according to the respective error signal.
Type: Grant
Filed: December 28, 2020
Date of Patent: April 4, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Miguel Rodriguez, Stephen Victor Kosonocky, Peter T. Hardman
-
Patent number: 11620224
Abstract: Techniques for controlling prefetching of instructions into an instruction cache are provided. The techniques include tracking either or both of branch target buffer misses and instruction cache misses, modifying a throttle toggle based on the tracking, and adjusting prefetch activity based on the throttle toggle.
Type: Grant
Filed: December 10, 2019
Date of Patent: April 4, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Aparna Thyagarajan, Ashok Tirupathy Venkatachar, Marius Evers, Angelo Wong, William E. Jones
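Illustrative sketch (not taken from the patent): a throttle toggle driven by miss tracking could look like the windowed counter below. The class name `PrefetchThrottle`, the fixed-size window, the miss limit, the two prefetch depths, and the choice to throttle when misses are frequent are all assumptions; the abstract does not say which direction the toggle moves.

```python
class PrefetchThrottle:
    """Sets a throttle toggle from misses counted over a fixed event window."""

    def __init__(self, window=8, miss_limit=4, full_depth=4, throttled_depth=1):
        self.window = window
        self.miss_limit = miss_limit
        self.full_depth = full_depth
        self.throttled_depth = throttled_depth
        self.throttled = False
        self._events = 0
        self._misses = 0

    def record(self, btb_miss=False, icache_miss=False):
        # Track either or both miss types as a single event.
        self._events += 1
        if btb_miss or icache_miss:
            self._misses += 1
        if self._events == self.window:
            # Assumed rule: frequent misses turn throttling on.
            self.throttled = self._misses >= self.miss_limit
            self._events = 0
            self._misses = 0

    def prefetch_depth(self):
        # Prefetch activity follows the toggle.
        return self.throttled_depth if self.throttled else self.full_depth
```

With `window=4` and `miss_limit=2`, four consecutive events flagged `btb_miss=True` set the toggle and drop the prefetch depth to 1; four subsequent miss-free events clear it again.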
-
Publication number: 20230097562
Abstract: Described herein is a technique for performing ray tracing operations. The technique includes encountering, at a non-leaf node, a pointer to a bottom-level acceleration structure having one or more delta instances; identifying an index associated with the pointer, wherein the index identifies an instance within the bottom-level acceleration structure; and obtaining data for the instance based on the pointer and the index.
Type: Application
Filed: September 28, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Konstantin I. Shkurko, Matthäus G. Chajdas, Michael Mantor
-
Publication number: 20230101038
Abstract: A method and processing device for accessing data is provided. The processing device comprises a cache and a processor. The cache comprises a first data section having a first cache hit latency and a second data section having a second cache hit latency that is different from the first cache hit latency of the first data section. The processor is configured to request access to data in memory, the data corresponding to a memory address which includes an identifier that identifies the first data section of the cache. The processor is also configured to load the requested data, determined to be located in the first data section of the cache, according to the first cache hit latency of the first data section of the cache.
Type: Application
Filed: September 29, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Patrick J. Shyvers
-
Publication number: 20230096138
Abstract: A processing device is provided which includes a processor and a data storage structure. The data storage structure comprises a data storage array comprising a plurality of lines. Each line comprises at least one A latch configured to store a data bit and a clock gater. The data storage structure also comprises a write data B latch configured to store, over different clock cycles, a different data bit, each to be written to the at least one A latch of one of the plurality of lines. The data storage structure also comprises a plurality of write index B latches shared by the clock gaters of the lines. The write index B latches are configured to store, over the different clock cycles, combinations of index bits having values which index one of the lines to which a corresponding data bit is to be stored.
Type: Application
Filed: September 24, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Patrick J. Shyvers
-
Publication number: 20230101748
Abstract: Techniques for invalidating cache lines are provided. The techniques include issuing, to a first level of a memory hierarchy, a weak exclusive read request for a speculatively executing store instruction; determining whether to invalidate one or more cache lines associated with the store instruction in one or more memories; and issuing the weak invalidation request to additional levels of the memory hierarchy.
Type: Application
Filed: September 30, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul J. Moyer
-
Publication number: 20230101640
Abstract: Devices and methods for linear addressing are provided. A device is provided which comprises a plurality of components having assigned registers used to store data to execute a program and a power management controller, in communication with the components. The power management controller is configured to send one of a request to remove power to the components and a request to reduce power to the components when it is determined that the components are idle, execute a first process of one of removing power and reducing power to the components and entering a reduced power state when an acknowledgement of the request is received and execute a second process of restoring power to the components when one or more of the components are indicated to be active.
Type: Application
Filed: September 23, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Mihir Shaileshbhai Doctor, Alexander J. Branover, Benjamin Tsien, Indrani Paul, Christopher T. Weaver, Thomas J. Gibney, Stephen V. Kosonocky, John P. Petry
-
Publication number: 20230100230
Abstract: Systems and methods are disclosed for maintaining insertion policies of a lower-level cache. Techniques are described for selecting, based on metadata of an evicted data block received from an upper-level cache, an insertion policy out of the insertion policies. Then, determining, based on the selected insertion policy, whether to insert the data block into the lower-level cache. If it is determined to insert, the data block is inserted into the lower-level cache according to the selected insertion policy. Techniques for dynamically updating the insertion policies of the lower-level cache are also disclosed.
Type: Application
Filed: September 28, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul J. Moyer
-
Publication number: 20230098336
Abstract: A voltage level-shifting circuit for an integrated circuit includes an input terminal receiving a voltage signal referenced to an input/output (I/O) voltage level. A transistor overvoltage protection circuit includes a first p-type metal oxide semiconductor (PMOS) transistor including a source coupled to the second voltage supply, a gate receiving an enable signal, and a drain connected to a central node. A first n-type metal oxide semiconductor (NMOS) transistor includes a drain connected to the central node, a gate connected to the input terminal, and a source connected to an output terminal. A second NMOS transistor includes a drain connected to the input terminal, a gate connected to the central node, and a source connected to the output terminal.
Type: Application
Filed: September 28, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Prateek Mishra, Thanapandi G, Jagadeesh Anathahalli Singrigowda, Dhruvin Devangbhai Shah, Girish Anathahalli Singrigowda, Animesh Jain
-
Publication number: 20230099806
Abstract: Described herein is a technique for performing operations for a bounding volume hierarchy. The techniques include: for a bounding box with quantized orientation, the bounding box being part of a bounding volume hierarchy, rotating a ray according to the quantized orientation to generate a rotated ray; performing an intersection test against the bounding box with the rotated ray; and according to the results of the intersection test, continuing traversal of the bounding volume hierarchy.
Type: Application
Filed: September 29, 2021
Publication date: March 30, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: David Ronald Oldcorn, Matthäus G. Chajdas, Michael A. Kern
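Illustrative sketch (not taken from the application): rotating the ray into the box's frame lets an oriented box be tested with an ordinary axis-aligned slab test. The sketch assumes the orientation is quantized to one of `steps` rotations about the z axis through the world origin; the function names and that encoding are hypothetical, and a real implementation would likely quantize full 3D orientations.

```python
import math


def world_to_box(step, steps=16):
    """Inverse rotation for a box turned by `step` quantized steps about z."""
    angle = 2.0 * math.pi * step / steps
    c, s = math.cos(angle), math.sin(angle)
    # Transpose of the rotation by +angle, which is its inverse.
    return ((c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0))


def rotate(matrix, vector):
    return tuple(sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3))


def slab_test(origin, direction, lo, hi):
    """Ray versus axis-aligned box, for ray parameter t >= 0."""
    t_near, t_far = 0.0, math.inf
    for o, d, l, h in zip(origin, direction, lo, hi):
        if abs(d) < 1e-12:
            if o < l or o > h:
                return False  # parallel to this slab and outside it
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def ray_hits_oriented_box(origin, direction, lo, hi, step, steps=16):
    # Rotate the ray instead of the box, then test against local extents.
    m = world_to_box(step, steps)
    return slab_test(rotate(m, origin), rotate(m, direction), lo, hi)
```

A box with local extents `(1, -0.5, -0.5)` to `(2, 0.5, 0.5)` and `step=4` of 16 (a 90 degree turn) sits along the world y axis, so a ray from the origin along `(0, 1, 0)` hits it, while the same ray misses the unrotated box at `step=0`. A traversal loop would descend into the node's children only when the test reports a hit.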