Patents Assigned to Advanced Micro Devices, Inc.
  • Publication number: 20230305923
    Abstract: A memory system uses error detection codes to detect when errors have occurred in a region of memory. A count of the number of errors is kept and a notification is output in response to the number of errors satisfying a threshold value. The notification is an indication to a host (e.g., a program accessing or managing a machine learning system) that the threshold number of errors has been detected in the region of memory. As long as the number of errors that have been detected in the region of memory remains under the threshold number, no notification need be output to the host.
    Type: Application
    Filed: March 25, 2022
    Publication date: September 28, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Sudhanva Gurumurthi, Ganesh Suryanarayan Dasika
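The counting-and-threshold behavior in the abstract above can be illustrated with a minimal sketch. It is not taken from the filing: the class name `RegionErrorMonitor`, the `notify` callback standing in for the host, the reading of "satisfying a threshold" as "greater than or equal to", and the choice to notify only once are all assumptions made for illustration.

```python
class RegionErrorMonitor:
    """Counts detected errors for one memory region (hypothetical helper)."""

    def __init__(self, threshold, notify):
        self.threshold = threshold
        self.notify = notify        # hypothetical callback standing in for the host
        self.error_count = 0
        self.notified = False

    def record_check(self, error_detected):
        """Record the outcome of one error-detection check on the region."""
        if not error_detected:
            return
        self.error_count += 1
        # Assumption: the threshold is "satisfied" once the count reaches it,
        # and the host is told only once.
        if self.error_count >= self.threshold and not self.notified:
            self.notified = True
            self.notify(self.error_count)


notifications = []
monitor = RegionErrorMonitor(threshold=3, notify=notifications.append)
for detected in [True, False, True, True, True]:
    monitor.record_check(detected)
```

In this run the host callback fires once, when the third error is recorded; the first two errors stay below the threshold and produce no notification.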
  • Publication number: 20230307405
    Abstract: An electronic device can include a first die, a second die, and an interconnect. The first die or the second die has a principal function as a power module or a memory. The first die includes a first bond pad, and the second die includes a second bond pad. The device sides of the first and second dies are along the same sides as the first and second bond pads. In an embodiment, the first die and the second die are in a chip first, die face-up configuration. The first and the second bond pads are electrically connected along a first solderless connection that includes the interconnect. In another embodiment, each material within the electrical connection between the first and the second bond pads has a flow point or melting point temperature of at least 300° C.
    Type: Application
    Filed: March 25, 2022
    Publication date: September 28, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Lei Fu, Raja Swaminathan, Brett P. Wilkerson
  • Publication number: 20230305849
    Abstract: Array of pointers prefetching is described. In accordance with described techniques, a pointer target instruction is detected by identifying that a destination location of a load instruction is used in an address compute for a memory operation and the load instruction is included in a sequence of load instructions having addresses separated by a step size. An instruction for fetching data of a future load instruction is injected in an instruction stream of a processor. The data of the future load instruction is stored in a temporary register. An additional instruction is injected in the instruction stream for prefetching a pointer target based on an address of the memory operation and the data of the future load instruction.
    Type: Application
    Filed: March 25, 2022
    Publication date: September 28, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Chetana N. Keltcher, Alok Garg, Paul S. Keltcher
  • Patent number: 11768779
    Abstract: Systems, apparatuses, and methods for cache management based on access type priority are disclosed. A system includes at least a processor and a cache. During a program execution phase, certain access types are more likely to cause demand hits in the cache than others. Demand hits are load and store hits to the cache. A run-time profiling mechanism is employed to find which access types are more likely to cause demand hits. Based on the profiling results, the cache lines that will likely be accessed in the future are retained based on their most recent access type. The goal is to increase demand hits and thereby improve system performance. An efficient cache replacement policy can potentially reduce redundant data movement, thereby improving system performance and reducing energy consumption.
    Type: Grant
    Filed: December 16, 2019
    Date of Patent: September 26, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jieming Yin, Yasuko Eckert, Subhash Sethumurugan
  • Patent number: 11768778
    Abstract: Techniques for performing cache operations are provided. The techniques include tracking re-references for cache lines of a cache, detecting that eviction is to occur, and selecting a cache line for eviction from the cache based on a re-reference indication.
    Type: Grant
    Filed: September 30, 2021
    Date of Patent: September 26, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Paul J. Moyer
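As a rough illustration of eviction driven by a re-reference indication, the sketch below keeps one flag per cache line. The abstract does not say how the indication is used, so the policy here (evict the oldest line that was never re-referenced, else the oldest line) and the name `ReReferenceCache` are assumptions, not the patented method.

```python
from collections import OrderedDict


class ReReferenceCache:
    """Toy fully associative cache that tracks whether each line was re-referenced."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # tag -> re-referenced flag, in insertion order

    def access(self, tag):
        """Return True on a hit, False on a miss (filling the line on a miss)."""
        if tag in self.lines:
            self.lines[tag] = True  # mark the line as re-referenced
            return True
        if len(self.lines) >= self.capacity:
            self._evict()
        self.lines[tag] = False     # newly filled, not yet re-referenced
        return False

    def _evict(self):
        # Assumed policy: prefer the oldest line with no re-reference.
        victim = next((t for t, reref in self.lines.items() if not reref), None)
        if victim is None:
            victim = next(iter(self.lines))  # all re-referenced: evict the oldest
        del self.lines[victim]
        return victim


cache = ReReferenceCache(capacity=2)
hits = [cache.access(tag) for tag in ["A", "B", "A", "C"]]
```

Here line "A" is re-referenced before the cache fills, so the miss on "C" evicts "B" even though "A" is older.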
  • Patent number: 11769041
    Abstract: Systems, apparatuses, and methods for implementing a low latency long short-term memory (LSTM) machine learning engine using sequence interleaving techniques are disclosed. A computing system includes at least a host processing unit, a machine learning engine, and a memory. The host processing unit detects a plurality of sequences which will be processed by the machine learning engine. The host processing unit interleaves the sequences into data blocks and stores the data blocks in the memory. When the machine learning engine receives a given data block, the machine learning engine performs, in parallel, a plurality of matrix multiplication operations on the plurality of sequences in the given data block and a plurality of coefficients. Then, the outputs of the matrix multiplication operations are coupled to one or more LSTM layers.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: September 26, 2023
    Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Sateesh Lagudu, Lei Zhang, Allen H. Rush
  • Patent number: 11768771
    Abstract: The techniques described herein improve cache traffic performance in the context of contended lock instructions. More specifically, each core maintains a lock address contention table that stores addresses corresponding to contended lock instructions. The lock address contention table also includes a state value that indicates progress through a series of states meant to track whether a load by the core in a spin-loop associated with semaphore acquisition has obtained the semaphore in an exclusive state. Upon detecting that a load in a spin-loop has obtained the semaphore in an exclusive state, the core responds to incoming requests for access to the semaphore with negative acknowledgments. This allows the core to maintain the semaphore cache line in an exclusive state, which allows it to acquire the semaphore faster and to avoid transmitting that cache line to other cores unnecessarily.
    Type: Grant
    Filed: December 9, 2021
    Date of Patent: September 26, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: John M. King, Gregory W. Smaus
  • Patent number: 11768664
    Abstract: A graphics processing unit (GPU) implements operations, with associated op codes, to perform mixed precision mathematical operations. The GPU includes an arithmetic logic unit (ALU) with different execution paths, wherein each execution path executes a different mixed precision operation. By implementing mixed precision operations at the ALU in response to designated op codes that delineate the operations, the GPU efficiently increases the precision of specified mathematical operations while reducing execution overhead.
    Type: Grant
    Filed: October 2, 2019
    Date of Patent: September 26, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Bin He, Michael Mantor, Jiasheng Chen
  • Publication number: 20230297381
    Abstract: Load dependent branch prediction is described. In accordance with described techniques, a load dependent branch instruction is detected by identifying that a destination location of a load instruction is used in an operation for determining whether a conditional branch is taken or not taken. The load instruction is included in a sequence of load instructions having addresses separated by a step size. An instruction is injected in an instruction stream of a processor for fetching data of a future load instruction using an address of the load instruction offset by a distance based on the step size. An additional instruction is injected in the instruction stream of the processor for precomputing an outcome of a load dependent branch using an address computed based on an address of the operation and the data of the future load instruction.
    Type: Application
    Filed: March 21, 2022
    Publication date: September 21, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Chetana N. Keltcher, Alok Garg, Paul S. Keltcher
  • Publication number: 20230298256
    Abstract: A technique for performing ray tracing operations is provided. The technique includes, in response to detecting that a threshold number of traversal stage work-items of a wavefront have terminated, increasing intersection test parallelization for non-terminated work-items.
    Type: Application
    Filed: June 20, 2022
    Publication date: September 21, 2023
    Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
    Inventors: Daniel James Skinner, Michael John Livesley, David William John Pankratz
  • Publication number: 20230298261
    Abstract: Techniques for performing rendering operations are disclosed herein. The techniques include performing two-level primitive batch binning in parallel across multiple rendering engines, wherein tiles for subdividing coarse-level work across the rendering engines have the same size as tiles for performing coarse binning.
    Type: Application
    Filed: June 21, 2022
    Publication date: September 21, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Michael John Livesley, Ruijin Wu, Mangesh P. Nijasure
  • Patent number: 11762017
    Abstract: A system for performing a scan test of a processor core includes a scan test module and a processor including a processor core and an input/output die, where the input/output die is coupled to the processor core. The scan test module transmits, in parallel to the input/output die, scan test input data. A serializer/deserializer module of the input/output die receives the input data, serializes the input data, and transmits the serialized input data to the processor core. A serializer/deserializer module of the processor core receives the serialized scan test input data, deserializes the input data, receives result data generated in dependence upon the input data, serializes the result data, and transmits the serialized result data to the input/output die. The input/output die serializer/deserializer module receives the result data, deserializes the result data, and provides the result data to the scan test module. Error detection can be carried out through redundancy.
    Type: Grant
    Filed: November 22, 2021
    Date of Patent: September 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Ahmet Tokuz, Saurabh Upadhyay
  • Patent number: 11764789
    Abstract: Systems and techniques for applying voltage biases to gates of driver circuitry of an integrated circuit (IC) based on a detected bus voltage, IC supply voltage, or both are used to mitigate Electrical Over-Stress (EOS) issues in components of the driver circuitry caused, for instance, by high bus voltages in serial communication systems relative to maximum operating voltages of those components. A driver bias generator selectively applies bias voltages at gates of transistors of a stacked driver structure of an IC to prevent the voltage drop across any given transistor of the stacked driver structure from exceeding a predetermined threshold associated with the maximum operating voltage range of the transistors.
    Type: Grant
    Filed: September 28, 2021
    Date of Patent: September 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Rajesh Mangalore Anand, Prasant Kumar Vallur, Piyush Gupta, Girish Anathahalli Singrigowda, Jagadeesh Anathahalli Singrigowda
  • Patent number: 11762777
    Abstract: Devices and methods for cache prefetching are provided. A device is provided which comprises memory and a processor. The memory comprises a DRAM cache, a cache dedicated to the processor, and one or more intermediate caches between the dedicated cache and the DRAM cache. The processor is configured to issue prefetch requests to prefetch data, issue data access requests to fetch the data, and, when one or more previously issued prefetch requests are determined to be inaccurate, issue a prefetch request to prefetch a tag corresponding to the memory address of requested data in the DRAM cache. A tag look-up is performed at the DRAM cache without performing tag look-ups at the dedicated cache or the intermediate caches. The tag is prefetched from the DRAM cache without prefetching the requested data.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: September 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jagadish B. Kotra, Marko Scrbak, Matthew Raymond Poremba
  • Patent number: 11762658
    Abstract: A processing unit such as a graphics processing unit (GPU) includes a plurality of vector signal processors (VSPs) that include multiply/accumulate elements. The processing unit also includes a plurality of registers associated with the plurality of VSPs. First portions of first and second matrices are fetched into the plurality of registers prior to a first round that includes a plurality of iterations. The multiply/accumulate elements perform matrix multiplication and accumulation on different combinations of subsets of the first portions of the first and second matrices in the plurality of iterations prior to fetching second portions of the first and second matrices into the plurality of registers for a second round. The accumulated results of multiplying the first portions of the first and second matrices are written into an output buffer in response to completing the plurality of iterations.
    Type: Grant
    Filed: September 24, 2019
    Date of Patent: September 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Bin He, Michael Mantor, Jiasheng Chen, Jian Huang
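The round-and-iteration structure in the abstract above resembles ordinary blocked matrix multiplication, which the following sketch shows in plain Python. It is only a loose software analogy: the function name `matmul_in_rounds`, the `tile` parameter, and the mapping of "rounds" to output tiles are assumptions, and nothing here models the vector signal processors or registers of the described hardware.

```python
def matmul_in_rounds(a, b, tile):
    """Multiply a (n x k) by b (k x m), given as lists of lists, one tile per round."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0] * m for _ in range(n)]  # stands in for the output buffer
    for row0 in range(0, n, tile):
        for col0 in range(0, m, tile):
            # "Fetch" one portion of each matrix before the round starts.
            a_part = a[row0:row0 + tile]                   # a block of rows of a
            b_part = [row[col0:col0 + tile] for row in b]  # a block of columns of b
            width = len(b_part[0])
            acc = [[0] * width for _ in a_part]
            # Iterations: each pairs one row of a_part with one column of b_part
            # and accumulates the products, reusing the already fetched portions.
            for i, a_row in enumerate(a_part):
                for j in range(width):
                    for p in range(k):
                        acc[i][j] += a_row[p] * b_part[p][j]
            # Write the accumulated results only after the round's iterations finish.
            for i, acc_row in enumerate(acc):
                out[row0 + i][col0:col0 + width] = acc_row
    return out


product = matmul_in_rounds([[1, 2], [3, 4], [5, 6]],
                           [[7, 8, 9], [10, 11, 12]],
                           tile=2)
```

The point of the structure is reuse: each fetched portion takes part in several multiply/accumulate iterations before the next portions are loaded.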
  • Patent number: 11762828
    Abstract: A method includes, for each key of a plurality of keys, identifying from a set of buckets a first bucket for the key based on a first hash function, and identifying from the set of buckets a second bucket for the key based on a second hash function. An entry for the key is stored in a bucket selected from one of the first bucket and the second bucket. The entry is inserted in a sequence of entries in a memory block. A position of the entry in the sequence of entries corresponds to the selected bucket. For each bucket in the set of buckets, an indication of a number of entries in the bucket is recorded.
    Type: Grant
    Filed: August 17, 2018
    Date of Patent: September 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Alexander D. Breslow, Nuwan S. Jayasena
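One way to picture the layout in the abstract above is a table whose entries sit in a single sequence ordered by bucket, with a per-bucket count used to locate each bucket's run. The sketch below is an illustration under assumptions: the class name `CompactTwoChoiceTable`, the salted SHA-256 hash functions, and the rule of picking the less-loaded candidate bucket are not stated in the source, and duplicate keys are not handled.

```python
import hashlib


def _hash(key, salt, num_buckets):
    """Hypothetical hash function: salted SHA-256 reduced to a bucket index."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


class CompactTwoChoiceTable:
    def __init__(self, num_buckets):
        self.num_buckets = num_buckets
        self.counts = [0] * num_buckets  # number of entries recorded per bucket
        self.entries = []                # one sequence, grouped in bucket order

    def _candidates(self, key):
        return (_hash(key, "h1", self.num_buckets),
                _hash(key, "h2", self.num_buckets))

    def _bucket_start(self, bucket):
        # A bucket's run starts after the entries of all lower-numbered buckets.
        return sum(self.counts[:bucket])

    def insert(self, key, value):
        b1, b2 = self._candidates(key)
        # Assumption: choose the less-loaded of the two candidate buckets.
        bucket = b1 if self.counts[b1] <= self.counts[b2] else b2
        position = self._bucket_start(bucket) + self.counts[bucket]
        self.entries.insert(position, (key, value))
        self.counts[bucket] += 1

    def lookup(self, key):
        for bucket in set(self._candidates(key)):
            start = self._bucket_start(bucket)
            for k, v in self.entries[start:start + self.counts[bucket]]:
                if k == key:
                    return v
        return None


table = CompactTwoChoiceTable(num_buckets=4)
for i in range(10):
    table.insert(f"key{i}", i)
```

Because an entry's position encodes its bucket, the table needs no per-bucket pointers or empty slots; the counts alone are enough to find where each bucket begins.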
  • Patent number: 11763155
    Abstract: A system comprising an electronic device that includes a processor is described. During operation, the processor acquires a full version of a neural network, the neural network including internal elements for processing instances of input image data having a set of color channels. The processor then generates, from the neural network, a set of sub-networks, each sub-network being a separate copy of the neural network with the internal elements for processing at least one of the color channels in instances of input image data removed, so that each sub-network is configured for processing a different set of one or more color channels in instances of input image data. The processor next provides the sub-networks for processing instances of input image data—and may itself use the sub-networks for processing instances of input image data.
    Type: Grant
    Filed: August 12, 2019
    Date of Patent: September 19, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sudhanva Gurumurthi, Abhinav Vishnu
  • Publication number: 20230290035
    Abstract: A system and method for performing graphics processing are provided. The system and method include processing an allocation command for a buffer object; reserving processor address space for a data store of the buffer object with uncommitted physical memory in response to the allocation command including a null parameter; and reserving processor address space for a data store of the buffer object with committed physical memory in response to the allocation command including a non-null parameter.
    Type: Application
    Filed: May 19, 2023
    Publication date: September 14, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Graham Sellers, Eric Zolnowski, Pierre Boudier, Juraj Obert
  • Publication number: 20230290400
    Abstract: A data transmission system includes a first integrated circuit. The first integrated circuit includes a first mixing terminal coupled to a first power supply voltage terminal at a point internal to the first integrated circuit, a first return terminal, a first resistor having a first terminal coupled to the first mixing terminal, and a second terminal for providing a first mixed voltage, and a second resistor having a first terminal coupled to the second terminal of the first resistor, and a second terminal coupled to the first return terminal.
    Type: Application
    Filed: June 30, 2022
    Publication date: September 14, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Aaron D. Willey, Karthik Gopalakrishnan, Ramon Mangaser
  • Publication number: 20230289290
    Abstract: A method includes monitoring one or more metrics for each of a plurality of cache users sharing a cache, and assigning each of the plurality of cache users to one of a plurality of groups based on the monitored one or more metrics.
    Type: Application
    Filed: May 17, 2023
    Publication date: September 14, 2023
    Applicant: Advanced Micro Devices, Inc.
    Inventor: John Kelley