Patents Assigned to Advanced Micro Devices
-
Patent number: 11556162
Abstract: A processor utilizes instruction based sampling to generate sampling data sampled on a per instruction basis during execution of an instruction. The sampling data indicates what processor hardware was used due to the execution of the instruction. Software receives the sampling data and generates an estimate of energy used by the instruction based on the sampling data. The sampling data may include microarchitectural events and the energy estimate utilizes a base energy amount corresponding to the instruction executed along with energy amounts corresponding to the microarchitectural events in the sampling data. The sampling data may include switching events associated with hardware blocks that switched due to execution of the instruction and the energy estimate for the instruction is based on the switching events and capacitance estimates associated with the hardware blocks.
Type: Grant
Filed: March 16, 2018
Date of Patent: January 17, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Shijia Wei, Joseph L. Greathouse, John Kalamatianos
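The two estimates described in this abstract can be illustrated with a minimal sketch. The opcode names, event names, and energy values are hypothetical placeholders, not figures from the patent, and the switching estimate assumes the standard CMOS approximation of ½·C·V² per transition, which the abstract does not spell out.

```python
# Hypothetical per-instruction base energies and per-event energies (picojoules).
BASE_ENERGY_PJ = {"add": 10.0, "load": 25.0}
EVENT_ENERGY_PJ = {"l1_miss": 40.0, "tlb_miss": 60.0}


def estimate_energy_pj(opcode, events):
    """Base energy for the instruction plus the energy of each sampled event."""
    return BASE_ENERGY_PJ[opcode] + sum(EVENT_ENERGY_PJ[e] for e in events)


def estimate_switching_energy_j(switch_counts, capacitance_f, voltage_v):
    """Sum 0.5 * C * V^2 over every switching event of every hardware block.

    Assumption: the usual dynamic-energy approximation for CMOS logic.
    """
    return sum(
        count * 0.5 * capacitance_f[block] * voltage_v * voltage_v
        for block, count in switch_counts.items()
    )
```

For example, a sampled `load` that incurred one `l1_miss` and one `tlb_miss` would be charged the base amount plus both event amounts.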
-
Patent number: 11556250
Abstract: A system including a stack of two or more layers of volatile memory, such as layers of a 3D stacked DRAM memory, places data in the stack based on a temperature or a refresh rate. When a threshold is exceeded, data are moved from a first region to a second region in the stack, the second region having one or both of a second temperature lower than a first temperature of the first region or a second refresh rate lower than a first refresh rate of the first region.
Type: Grant
Filed: July 27, 2020
Date of Patent: January 17, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Jagadish B. Kotra, Karthik Rao, Joseph L. Greathouse
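A minimal sketch of the temperature-based variant of this placement rule follows. The region names and the policy of moving to the coolest region are assumptions made for illustration; the abstract only requires that the destination be cooler (or have a lower refresh rate) than the source.

```python
def choose_region(current, temps_c, threshold_c):
    """Return the region that should hold the data.

    Data stays where it is until the current region exceeds the threshold,
    then moves to the coolest region, provided that region is actually cooler.
    """
    if temps_c[current] <= threshold_c:
        return current
    coolest = min(temps_c, key=temps_c.get)
    return coolest if temps_c[coolest] < temps_c[current] else current
```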
-
Publication number: 20230010801
Abstract: Methods, devices, and systems for prefetching data. First data is loaded from a first memory location. The first data is cached in a cache memory. Other data is prefetched to the cache memory based on a compression of the first data and a compression of the other data. In some implementations, the compression of the first data and the compression of the other data are determined based on metadata associated with the first data and metadata associated with the other data. In some implementations, the other data is prefetched to the cache memory based on a total of a compressed size of the first data and a compressed size of the other data being less than a threshold size. In some implementations, the other data is not prefetched to the cache memory based on the other data being uncompressed.
Type: Application
Filed: July 8, 2021
Publication date: January 12, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Pazhani Pillai, Harish Kumar Kovalam Rajendran
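The two conditions in this abstract (combined compressed size under a threshold, and no prefetch of uncompressed data) can be combined in a short sketch. The function name and parameters are hypothetical; in the described system the sizes and the compressed flag would come from per-line metadata.

```python
def should_prefetch(first_size, other_size, other_is_compressed, threshold):
    """Decide whether to prefetch the other data alongside the first data.

    Sizes are compressed sizes in bytes, as read from (assumed) metadata.
    """
    if not other_is_compressed:
        # Uncompressed data is never prefetched in this variant.
        return False
    return first_size + other_size < threshold
```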
-
Patent number: 11551990
Abstract: Exemplary embodiments provide thermal wear spreading among a plurality of thermal die regions in an integrated circuit or among dies by using die region wear-out data that represents a cumulative amount of time each of a number of thermal die regions in one or more dies has spent at a particular temperature level. In one example, die region wear-out data is stored in persistent memory and is accrued over a life of each respective thermal region so that a long term monitoring of temperature levels in the various die regions is used to spread thermal wear among the thermal die regions. In one example, spreading thermal wear is done by controlling task execution such as thread execution among one or more processing cores, dies and/or data access operations for a memory.
Type: Grant
Filed: August 11, 2017
Date of Patent: January 10, 2023
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: David A. Roberts, Greg Sadowski, Steven Raasch
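One way to picture wear spreading is to score each region by its accumulated time at each temperature level and schedule the next task on the least-worn region. The sketch below does that. The level names, the weights, and the weighted-sum score are assumptions made for illustration and are not taken from the patent.

```python
# Hypothetical weights: time spent hot counts more toward wear than time spent warm.
LEVEL_WEIGHTS = {"cool": 0.0, "warm": 1.0, "hot": 4.0}


def wear_score(seconds_at_level):
    """Weighted sum of the cumulative seconds a region spent at each level."""
    return sum(LEVEL_WEIGHTS[level] * s for level, s in seconds_at_level.items())


def pick_region(wear_data):
    """Return the region with the lowest accumulated wear score."""
    return min(wear_data, key=lambda region: wear_score(wear_data[region]))
```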
-
Patent number: 11550588
Abstract: A branch predictor of a processor includes one or more prediction structures, including a predicted branch address and predicted branch direction, that identify predicted branches. To reduce power consumption, the branch predictor selects one or more of the prediction structures that are not expected to provide useful branch prediction information and filters the selected structures such that the filtered structures are not used for branch prediction. The branch predictor thereby reduces the amount of power used for branch prediction without substantially reducing the accuracy of the predicted branches.
Type: Grant
Filed: August 22, 2018
Date of Patent: January 10, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: John Kalamatianos, Adithya Yalavarti, Varun Agrawal, Subhankar Pal, Vinesh Srinivasan
-
Patent number: 11550627
Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.
Type: Grant
Filed: March 29, 2021
Date of Patent: January 10, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Anthony Gutierrez, Sooraj Puthoor
-
Patent number: 11550722
Abstract: Methods, systems, and apparatuses provide support for multiple address spaces in order to facilitate data movement. One system includes a host processor; a memory; a data fabric coupled to the host processor and to the memory; a first input/output memory management unit (IOMMU) and a second IOMMU, each of the first and second IOMMUs coupled to the data fabric; a first root port and a second root port, each of the first and second root ports coupled to a corresponding IOMMU of the first and second IOMMUs; and a first peripheral component endpoint and a second peripheral component endpoint, each of the first and second peripheral component endpoints coupled to a corresponding root port of the first and second root ports, wherein each of the first and second root ports comprises hardware control logic operative to: synchronize the first and second root ports.
Type: Grant
Filed: March 2, 2021
Date of Patent: January 10, 2023
Assignees: ATI TECHNOLOGIES ULC, ADVANCED MICRO DEVICES, INC.
Inventors: Philip Ng, Nippon Raval, BuHeng Xu, Rostislav S. Dobrin, Shawn Han
-
Patent number: 11550728
Abstract: A processing system includes a processor, a memory, and an operating system that are used to allocate a page table caching memory object (PTCM) for a user of the processing system. An allocation of the PTCM is requested from a PTCM allocation system. In order to allocate the PTCM, a plurality of physical memory pages from a memory are allocated to store a PTCM page table that is associated with the PTCM. A lockable region of a cache is designated to hold a copy of the PTCM page table, after which the lockable region of the cache is subsequently locked. The PTCM page table is populated with page table entries associated with the PTCM and copied to the locked region of the cache.
Type: Grant
Filed: September 27, 2019
Date of Patent: January 10, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Derrick Allen Aguren, Eric H. Van Tassell, Gabriel H. Loh, Jay Fleischman
-
Patent number: 11551398
Abstract: Systems, apparatuses, and methods for implementing light volume rendering techniques are disclosed. A processor is coupled to a memory. A processor renders the geometry of a scene into a geometry buffer. For a given light source in the scene, the processor initiates two shader pipeline passes to determine which pixels in the geometry buffer to light. On the first pass, the processor renders a front-side of a light volume corresponding to the light source. Any pixels of the geometry buffer which are in front of the front-side of the light volume are marked as pixels to be discarded. Then, during the second pass, only those pixels which were not marked to be discarded are sent to the pixel shader. This approach helps to reduce the overhead involved in applying a lighting effect to the scene by reducing the amount of work performed by the pixel shader.
Type: Grant
Filed: August 31, 2020
Date of Patent: January 10, 2023
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Mika Tuomi, Miikka Petteri Kangasluoma, Jan Henrik Achrenius, Laurent Lefebvre
-
Patent number: 11553222
Abstract: Virtual Reality (VR) processing devices and methods are provided for transmitting user feedback information comprising at least one of user position information and user orientation information, receiving encoded audio-video (A/V) data, which is generated based on the transmitted user feedback information, separating the A/V data into video data and audio data corresponding to a portion of a next frame of a sequence of frames of the video data to be displayed, decoding the portion of a next frame of the video data and the corresponding audio data, providing the audio data for aural presentation and controlling the portion of the next frame of the video data to be displayed in synchronization with the corresponding audio data.
Type: Grant
Filed: December 23, 2020
Date of Patent: January 10, 2023
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Lei Zhang, Gabor Sines, Khaled Mammou, David Glen, Layla A. Mah, Rajabali M. Koduri, Bruce Montag
-
Publication number: 20230004385
Abstract: A processing device is provided which comprises a plurality of compute units configured to process data, a plurality of arithmetic logic units, instantiated separate from the plurality of compute units, and configured to store the data at the arithmetic logic units and perform calculations using the data and an interconnect network, connecting the arithmetic logic units and configured to provide the arithmetic logic units with shared access to the data for communication between the arithmetic logic units. The interconnect network is also configured to provide the compute units with shared access to the data for communication between the compute units.
Type: Application
Filed: June 30, 2021
Publication date: January 5, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Maxim V. Kazakov
-
Publication number: 20230004459
Abstract: A memory controller includes a memory channel controller adapted to receive memory access requests and dispatch associated commands addressable in a system memory address space to a non-volatile storage class memory (SCM) module. The non-volatile error reporting circuit identifies error conditions associated with the non-volatile SCM module and maps the error conditions from a first number of possible error conditions associated with the non-volatile SCM module to a second, smaller number of virtual error types for reporting to an error monitoring module of a host operating system, the mapping based at least on a classification that the error condition will or will not have a deleterious effect on an executable process running on the host operating system.
Type: Application
Filed: July 14, 2022
Publication date: January 5, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: James R. Magro, Kedarnath Balakrishnan, Vilas Sridharan
-
Publication number: 20230004871
Abstract: Methods, systems, and devices for pipeline fusion of a plurality of kernels. In some implementations, a first batch of a first kernel is executed on a first processing device to generate a first output of the first kernel based on an input. A first batch of a second kernel is executed on a second processing device to generate a first output of the second kernel based on the first output of the first kernel. A second batch of the first kernel is executed on the first processing device to generate a second output of the first kernel based on the input. The execution of the second batch of the first kernel overlaps at least partially in time with executing the first batch of the second kernel.
Type: Application
Filed: June 30, 2021
Publication date: January 5, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Swapnil P. Sakharshete, Maxim V. Kazakov, Milind N. Nemlekar, Samuel Lawrence Wasmundt
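The overlap described in this abstract is the classic two-stage software pipeline: while the second kernel consumes batch i, the first kernel already produces batch i+1. The sketch below only builds the schedule as a list of time steps. The kernel labels and the one-step-per-batch timing are simplifying assumptions, and no real devices are involved.

```python
def pipeline_schedule(num_batches):
    """Return, per time step, the (kernel, batch) pairs that run concurrently.

    kernel1 is assumed to run on the first device and kernel2 on the second.
    """
    steps = []
    for t in range(num_batches + 1):
        step = []
        if t < num_batches:
            step.append(("kernel1", t))      # produce batch t
        if t >= 1:
            step.append(("kernel2", t - 1))  # consume batch t - 1
        steps.append(step)
    return steps
```

With two batches, the middle step contains both `("kernel1", 1)` and `("kernel2", 0)`, which is the overlap the abstract refers to.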
-
Patent number: 11544065
Abstract: A processor includes a front-end with an instruction set that operates at a first bit width and a floating point unit coupled to receive the instruction set in the processor that operates at the first bit width. The floating point unit operates at a second bit width and, based upon a bit width assessment of the instruction set provided to the floating point unit, the floating point unit employs a shadow-latch configured floating point register file to perform bit width reconfiguration. The shadow-latch configured floating point register file includes a plurality of regular latches and a plurality of shadow latches for storing data that is to be either read from or written to the shadow latches. The bit width reconfiguration enables the floating point unit that operates at the second bit width to operate on the instruction set received at the first bit width.
Type: Grant
Filed: September 27, 2019
Date of Patent: January 3, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Arun A. Nair, Todd Baumgartner, Michael Estlick, Erik Swanson
-
Patent number: 11543877
Abstract: An apparatus includes a processor, a sleep state duration prediction module, and a system management unit. The sleep state duration prediction module is configured to predict a sleep state duration for a component of the processing device. The system management unit is to transition the component into a sleep state selected from a plurality of sleep states based on a comparison of the predicted sleep state duration to at least one duration threshold. Each sleep state of the plurality of sleep states is a lower power state than a previous sleep state of the plurality of sleep states.
Type: Grant
Filed: March 31, 2021
Date of Patent: January 3, 2023
Assignees: Advanced Micro Devices, Inc., ATI TECHNOLOGIES ULC
Inventors: Karthik Rao, Indrani Paul, Donny Yi, Oleksandr Khodorkovsky, Leonardo De Paula Rosa Piga, Wonje Choi, Dana G. Lewis, Sriram Sambamurthy
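Selecting a sleep state by comparing a predicted duration against thresholds amounts to a lookup in a sorted threshold list. The sketch below assumes three states and two thresholds. The state names and millisecond values are hypothetical, and the rule that longer predicted sleeps map to deeper states is an assumption consistent with, but not stated in, the abstract.

```python
from bisect import bisect_right

# Hypothetical states, ordered from shallowest to deepest (lowest power).
SLEEP_STATES = ["S1", "S2", "S3"]
# Hypothetical duration thresholds in milliseconds, one fewer than the states.
THRESHOLDS_MS = [5.0, 50.0]


def select_sleep_state(predicted_ms):
    """Pick a deeper state the longer the component is predicted to sleep."""
    return SLEEP_STATES[bisect_right(THRESHOLDS_MS, predicted_ms)]
```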
-
Patent number: 11544815
Abstract: A processing device is provided which includes memory and a processor. The processor is configured to receive an input image having a first resolution, generate linear down-sampled versions of the input image by down-sampling the input image via a linear upscaling network and generate non-linear down-sampled versions of the input image by down-sampling the input image via a non-linear upscaling network. The processor is also configured to convert the down-sampled versions of the input image into pixels of an output image having a second resolution higher than the first resolution and provide the output image for display.
Type: Grant
Filed: November 18, 2019
Date of Patent: January 3, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Alexander M. Potapov, Skyler Jonathon Saleh, Swapnil P. Sakharshete, Vineet Goel
-
Patent number: 11544196
Abstract: Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, light compressor, and heavy compressor. The decision on which compressor to use for compressing cache lines is made based on certain resource availability such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex algorithms for compression while limiting the adverse effects of high decompression latency on system performance. To address the above issue, the proposed design takes advantage of the heavy compressors for effectively reducing memory bandwidth in high bandwidth memory (HBM) interfaces as long as they do not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to achieve reduced off-chip memory traffic without sacrificing system performance.
Type: Grant
Filed: December 23, 2019
Date of Patent: January 3, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: SeyedMohammad SeyedzadehDelcheh, Shomit N. Das, Bradford Michael Beckmann
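A software analogy for the light/heavy decision is sketched below. Python's `zlib` at level 1 and level 9 stands in for the light and heavy compressors, and the bandwidth-utilization threshold is a hypothetical tuning knob. The abstract names cache capacity and memory bandwidth as inputs but does not give a specific rule, so the rule here is an assumption.

```python
import zlib


def compress_line(data, bandwidth_util, heavy_threshold=0.8):
    """Compress a cache line, using the heavy compressor only under pressure.

    bandwidth_util is the (assumed) fraction of memory bandwidth in use.
    Returns the compressor label and the compressed bytes.
    """
    heavy = bandwidth_util >= heavy_threshold
    level = 9 if heavy else 1  # zlib levels as stand-ins for heavy/light
    return ("heavy" if heavy else "light"), zlib.compress(data, level)
```

Keeping the light path as the default reflects the abstract's concern that heavy compression adds decompression latency.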
-
Patent number: 11544106
Abstract: Systems, apparatuses, and methods for implementing continuation analysis tasks (CATs) are disclosed. In one embodiment, a system implements hardware acceleration of CATs to manage the dependencies and scheduling of an application composed of multiple tasks. In one embodiment, a continuation packet is referenced directly by a first task. When the first task completes, the first task enqueues a continuation packet on a first queue. The first task can specify on which queue to place the continuation packet. The agent responsible for the first queue dequeues and executes the continuation packet which invokes an analysis phase which is performed prior to determining which dependent tasks to enqueue. If it is determined during the analysis phase that a second task is now ready to be launched, the second task is enqueued on one of the queues. Then, an agent responsible for this queue dequeues and executes the second task.
Type: Grant
Filed: April 13, 2020
Date of Patent: January 3, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Steven Tony Tye, Brian L. Sumner, Bradford Michael Beckmann, Sooraj Puthoor
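The enqueue/analyze/enqueue cycle in this abstract can be mimicked in software with a single queue that carries both tasks and continuation packets. This is a simplified sketch: the patent describes multiple queues and hardware agents, whereas the code uses one `deque`, and the "analysis phase" is reduced to decrementing dependency counters. All names are hypothetical.

```python
from collections import deque


def run(tasks, deps):
    """Run tasks in dependency order via continuation packets.

    tasks: name -> callable; deps: name -> set of prerequisite task names.
    Returns the order in which tasks executed.
    """
    remaining = {t: len(deps.get(t, ())) for t in tasks}
    dependents = {t: [] for t in tasks}
    for task, prerequisites in deps.items():
        for p in prerequisites:
            dependents[p].append(task)

    queue = deque(("task", t) for t in tasks if remaining[t] == 0)
    order = []
    while queue:
        kind, name = queue.popleft()
        if kind == "task":
            tasks[name]()
            order.append(name)
            # On completion, the task enqueues its continuation packet.
            queue.append(("continuation", name))
        else:
            # Analysis phase: find dependents that have become ready.
            for d in dependents[name]:
                remaining[d] -= 1
                if remaining[d] == 0:
                    queue.append(("task", d))
    return order
```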
-
Patent number: 11544121
Abstract: Systems, apparatuses, and methods for generating network messages on a parallel processor are disclosed. A system includes at least a parallel processor, a general purpose processor, and a network interface unit. The parallel processor includes at least a plurality of compute units, a command processor, and a cache. A thread within a kernel executing on a compute unit of the parallel processor generates a network message and stores the network message and a corresponding indication in the cache. In response to detecting the indication of the network message in the cache, the command processor processes and conveys the network message to the network interface unit without involving the general purpose processor.
Type: Grant
Filed: November 16, 2017
Date of Patent: January 3, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael Wayne LeBeane, Khaled Hamidouche, Walter B. Benton
-
Publication number: 20220413759
Abstract: A data processor includes a staging buffer, a command queue, a picker, and an arbiter. The staging buffer receives and stores first memory access requests. The command queue stores second memory access requests, each indicating one of a plurality of ranks of a memory system. The picker picks among the first memory access requests in the staging buffer and provides selected ones of the first memory access requests to the command queue. The arbiter selects among the second memory access requests from the command queue based on at least a preference for accesses to a current rank of the memory system. The picker picks accesses to the current rank among the first memory access requests of the staging buffer and provides the selected ones of the first memory access requests to the command queue.
Type: Application
Filed: June 24, 2021
Publication date: December 29, 2022
Applicant: Advanced Micro Devices, Inc.
Inventors: Guanhao Shen, Ravindra Nath Bhargava
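The picker's rank preference can be sketched as a stable partition of the staging buffer. Requests are modeled as hypothetical `(request_id, rank)` tuples, and the fallback to other ranks when slots remain is an assumption, since the abstract only states that the picker picks accesses to the current rank.

```python
def pick_requests(staging, current_rank, slots):
    """Move up to `slots` requests from the staging buffer to the command queue.

    Requests to the current rank are picked first, in arrival order.
    The staging list is modified in place; the picked requests are returned.
    """
    same_rank = [r for r in staging if r[1] == current_rank]
    others = [r for r in staging if r[1] != current_rank]
    picked = (same_rank + others)[:slots]
    for request in picked:
        staging.remove(request)
    return picked
```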