Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 11030117
Abstract: A host processor receives an address translation request from an accelerator, which may be trusted or un-trusted. The address translation request includes a virtual address in a virtual address space that is shared by the host processor and the accelerator. The host processor encrypts a physical address in a host memory indicated by the virtual address in response to the accelerator being permitted to access the physical address. The host processor then provides the encrypted physical address to the accelerator. The accelerator provides memory access requests including the encrypted physical address to the host processor, which decrypts the physical address and selectively accesses a location in the host memory indicated by the decrypted physical address depending upon whether the accelerator is permitted to access the location indicated by the decrypted physical address.
Type: Grant
Filed: July 14, 2017
Date of Patent: June 8, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Nuwan Jayasena, Brandon K. Potter, Andrew G. Kegel
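The translate-then-access flow described in this abstract can be sketched roughly as follows. This is a toy illustration only, not the patented implementation: the `Host` class, its method names, the 4 KiB page size, and the XOR "cipher" are all assumptions made for the example, and XOR with a random key is not real cryptography.

```python
import secrets

PAGE_SIZE = 4096  # assumed page size for the sketch


class Host:
    """Hypothetical host that hands out encrypted physical addresses."""

    def __init__(self, page_table, permitted_pages):
        self.page_table = page_table      # virtual page -> physical page
        self.permitted = permitted_pages  # physical pages the accelerator may access
        self.key = secrets.randbits(64)   # stand-in for a real cipher key
        self.memory = {}                  # physical address -> stored value

    def _encrypt(self, physical_address):
        # Toy XOR masking, used only to keep the example short.
        return physical_address ^ self.key

    def _decrypt(self, token):
        return token ^ self.key

    def translate(self, virtual_address):
        """Return an encrypted physical address, or None if access is not permitted."""
        physical_page = self.page_table.get(virtual_address // PAGE_SIZE)
        if physical_page is None or physical_page not in self.permitted:
            return None
        physical_address = physical_page * PAGE_SIZE + virtual_address % PAGE_SIZE
        return self._encrypt(physical_address)

    def access(self, token, value=None):
        """Decrypt the token, re-check permission, then read (or write) memory."""
        physical_address = self._decrypt(token)
        if physical_address // PAGE_SIZE not in self.permitted:
            raise PermissionError("accelerator may not access this location")
        if value is not None:
            self.memory[physical_address] = value
        return self.memory.get(physical_address)
```

In this sketch the accelerator only ever holds the opaque token; the permission check is repeated after decryption, mirroring the "selectively accesses" step in the abstract.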
-
Patent number: 11029852
Abstract: The present application describes embodiments of an interface for coupling flash memory and dynamic random access memory (DRAM) in a processing system. Some embodiments include a dedicated interface between a flash memory and DRAM. The dedicated interface is to provide access to the flash memory in response to instructions received over a DRAM interface between the DRAM and a processing device. Some embodiments of a method include accessing a flash memory via a dedicated interface between the flash memory and a dynamic random access memory (DRAM) in response to an instruction received over a DRAM interface between the DRAM and a processing device.
Type: Grant
Filed: December 14, 2016
Date of Patent: June 8, 2021
Assignee: Advanced Micro Devices, Inc.
Inventor: James Bauman
-
Patent number: 11023241
Abstract: Systems and methods selectively bypass address-generation hardware in processor instruction pipelines. In an embodiment, a processor includes an address-generation stage and an address-generation-bypass-determination unit (ABDU). The ABDU receives a load/store instruction. If an effective address for the load/store instruction is not known at the ABDU, the ABDU routes the load/store instruction via the address-generation stage of the processor. If, however, the effective address of the load/store instruction is known at the ABDU, the ABDU routes the load/store instruction to bypass the address-generation stage of the processor.
Type: Grant
Filed: August 21, 2018
Date of Patent: June 1, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Andrej Kocev, Jay Fleischman, Kai Troester, Johnny C. Chu, Tim J. Wilkens, Neil Marketkar, Michael W. Long
-
Patent number: 11025934
Abstract: A host processor, such as a central processing unit (CPU), is programmed to execute a software driver that causes the host processor to generate a motion compensation command for a plurality of cores of a massively parallel processor, such as a graphics processing unit (GPU), to provide motion compensation for encoded video. The motion compensation command for the plurality of cores of the massively parallel processor contains executable instructions for processing a plurality of motion vectors grouped by a plurality of prediction modes from a re-ordered motion vector buffer by the plurality of cores of the massively parallel processor.
Type: Grant
Filed: December 16, 2014
Date of Patent: June 1, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael L. Schmit, Ashish Farmer, Radhakrishna Giduthuri
-
Patent number: 11023410
Abstract: A system is described that performs memory access operations. The system includes a processor in a first node, a memory in a second node, a communication interconnect coupled to the processor and the memory, and an interconnect controller in the first node coupled between the processor and the communication interconnect. Upon executing a multi-line memory access instruction, the processor prepares a memory access operation for accessing, in the memory, a block of data including at least some of each of at least two lines of data. The processor then causes the interconnect controller to use a single remote direct memory access memory transfer to perform the memory access operation for the block of data via the communication interconnect.
Type: Grant
Filed: September 11, 2018
Date of Patent: June 1, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: David A. Roberts, Shenghsun Cho
-
Patent number: 11023242
Abstract: A method and apparatus of asynchronous scheduling in a graphics device includes sending one or more instructions from an instruction scheduler to one or more instruction first-in/first-out (FIFO) devices. An instruction in the one or more FIFO devices is selected for execution by a single-instruction/multiple-data (SIMD) pipeline unit. It is determined whether all operands for the selected instruction are available for execution of the instruction, and if all the operands are available, the selected instruction is executed on the SIMD pipeline unit. The self-timed arithmetic pipeline unit (SIMD pipeline unit) is effectively encapsulated in a synchronous (e.g., clocked by a global clock) scheduler and register file environment.
Type: Grant
Filed: January 27, 2017
Date of Patent: June 1, 2021
Assignees: ATI TECHNOLOGIES ULC, ADVANCED MICRO DEVICES, INC.
Inventors: John Kalamatianos, Greg Sadowski, Syed Zohaib M. Gilani
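The operand-availability check in this abstract might look like the sketch below. It is a simplified software model under stated assumptions: the function name `issue_ready`, the representation of an instruction as a `(name, operands)` tuple, and the rule of scanning FIFOs in order are all hypothetical, and the self-timed hardware behavior is not modeled.

```python
from collections import deque


def issue_ready(fifos, ready_registers):
    """Pop and return the first FIFO-head instruction whose operands are all ready.

    fifos: list of deques holding (name, operands) tuples (assumed layout).
    ready_registers: set of register names whose values are available.
    Returns None when no head instruction can issue yet.
    """
    for fifo in fifos:
        if not fifo:
            continue
        _, operands = fifo[0]
        if all(reg in ready_registers for reg in operands):
            return fifo.popleft()
    return None
```

Only the head of each FIFO is considered, so instructions within a FIFO stay in order while different FIFOs can issue independently.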
-
Publication number: 20210157485
Abstract: Systems, methods, and devices for performing pattern-based cache block compression and decompression. An uncompressed cache block is input to the compressor. Byte values are identified within the uncompressed cache block. A cache block pattern is searched for in a set of cache block patterns based on the byte values. A compressed cache block is output based on the byte values and the cache block pattern. A compressed cache block is input to the decompressor. A cache block pattern is identified based on metadata of the cache block. The cache block pattern is applied to a byte dictionary of the cache block. An uncompressed cache block is output based on the cache block pattern and the byte dictionary. A subset of cache block patterns is determined from a training cache trace based on a set of compressed sizes and a target number of patterns for each size.
Type: Application
Filed: September 23, 2020
Publication date: May 27, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Matthew Tomei, Shomit N. Das, David A. Wood
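One way to picture the compress and decompress steps is the sketch below. It assumes, for illustration only, that a "pattern" is the tuple of dictionary indices at each byte position and that the pattern set is a plain list; the encoding in the actual application may differ, and the training step that selects the pattern subset is not shown.

```python
def compress(block, patterns):
    """Return (kind, pattern_id, payload) for a block of bytes.

    patterns: list of tuples; each tuple gives, per byte position, an index
    into the block's byte dictionary (assumed representation).
    """
    dictionary = []  # distinct byte values, in order of first appearance
    shape = []       # dictionary index used at each position
    for value in block:
        if value not in dictionary:
            dictionary.append(value)
        shape.append(dictionary.index(value))
    shape = tuple(shape)
    if shape in patterns:
        # The pattern id acts as metadata; the payload is the byte dictionary.
        return ("compressed", patterns.index(shape), bytes(dictionary))
    return ("raw", None, bytes(block))  # no known pattern: store uncompressed


def decompress(entry, patterns):
    """Rebuild the original block by applying the pattern to the dictionary."""
    kind, pattern_id, payload = entry
    if kind == "raw":
        return payload
    return bytes(payload[index] for index in patterns[pattern_id])
```

For example, with a pattern `(0, 1, 0, 1, 0, 1, 0, 1)` in the set, an 8-byte block that alternates two values is stored as a pattern id plus a 2-byte dictionary.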
-
Publication number: 20210157598
Abstract: Techniques are provided for allocating registers for a processor. The techniques include identifying a first instruction of an instruction dispatch set that meets all register allocation suppression criteria of a first set of register allocation suppression criteria, suppressing register allocation for the first instruction, identifying a second instruction of the instruction dispatch set that does not meet all register allocation suppression criteria of a second set of register allocation suppression criteria, and allocating a register for the second instruction.
Type: Application
Filed: November 26, 2019
Publication date: May 27, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Neil N. Marketkar, Arun A. Nair
-
Publication number: 20210157590
Abstract: A technique for performing store-to-load forwarding is provided. The technique includes determining a virtual address for data to be loaded for the load instruction, identifying a matching store instruction from one or more store instruction memories by comparing a virtual-address-based comparison value for the load instruction to one or more virtual-address-based comparison values of one or more store instructions, determining a physical address for the load instruction, and validating the load instruction based on a comparison between the physical address of the load instruction and a physical address of the matching store instruction.
Type: Application
Filed: November 27, 2019
Publication date: May 27, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: John M. King, Matthew T. Sobel
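The two-phase match (virtual-address-based first, physical-address validation second) can be sketched as below. The names `Store`, `va_tag`, and `forward` are hypothetical, and using the low 12 bits of the virtual address as the comparison value is an assumption chosen for the example; the application does not specify it here.

```python
from dataclasses import dataclass


@dataclass
class Store:
    vaddr: int  # virtual address written by the store
    paddr: int  # physical address of the store
    data: int   # value being stored


def va_tag(vaddr, bits=12):
    """Assumed comparison value: the low `bits` bits of the virtual address."""
    return vaddr & ((1 << bits) - 1)


def forward(load_vaddr, store_queue, translate):
    """Return forwarded data, or None when there is no match or validation fails.

    store_queue is ordered oldest to newest; translate maps a virtual address
    to a physical address (a stand-in for the TLB lookup).
    """
    tag = va_tag(load_vaddr)
    # Phase 1: speculative match on the virtual-address-based value, youngest first.
    match = next((s for s in reversed(store_queue) if va_tag(s.vaddr) == tag), None)
    if match is None:
        return None
    # Phase 2: validate the speculative match against the physical addresses.
    if translate(load_vaddr) != match.paddr:
        return None
    return match.data
```

The partial tag can alias between unrelated addresses, which is why the physical-address comparison is needed before the forwarded value is trusted.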
-
Publication number: 20210157611
Abstract: Described herein are techniques for executing a heterogeneous code object executable. According to the techniques, a loader identifies a first memory appropriate for loading a first architecture-specific portion of the heterogeneous code object executable, wherein the first architecture specific portion includes instructions for a first architecture, identifies a second memory appropriate for loading a second architecture-specific portion of the heterogeneous code object executable, wherein the second architecture specific portion includes instructions for a second architecture that is different than the first architecture, loads the first architecture-specific portion into the first memory and the second architecture-specific portion into the second memory, and performs relocations on the first architecture-specific portion and on the second architecture-specific portion.
Type: Application
Filed: November 22, 2019
Publication date: May 27, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Steven Tony Tye, Brian Laird Sumner, Konstantin Zhuravlyov
-
Publication number: 20210158601
Abstract: Described herein is a technique for performing ray tracing. According to this technique, instead of executing intersection and/or any hit shaders during traversal of an acceleration structure to determine the closest hit for a ray, an acceleration structure is fully traversed in an invocation of a shader program, and the closest intersection with a triangle is recorded in a data structure associated with the material of the triangle. Later, a scheduler launches waves by grouping together multiple data items associated with the same material. The rays processed by that wave are processed with a continuation ray, rather than the full original ray. A continuation ray starts from the previous point of intersection and extends in the direction of the original ray. These steps help counter divergence that would occur if a single shader program that inlined the intersection and any hit shaders were executed.
Type: Application
Filed: February 3, 2021
Publication date: May 27, 2021
Applicant: Advanced Micro Devices, Inc.
Inventor: Skyler Jonathon Saleh
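The scheduling step, grouping recorded hits by material before launching waves, could be modeled along these lines. This is a CPU-side sketch with hypothetical names (`group_into_waves`, `continuation_ray`) and a hypothetical hit record of `(ray_id, material)`; real wave launch on a GPU is not shown.

```python
from collections import defaultdict


def group_into_waves(hits, wave_size):
    """Group (ray_id, material) hit records into per-material waves.

    Returns a list of (material, [ray_id, ...]) tuples, each holding at most
    wave_size rays that all share one material.
    """
    by_material = defaultdict(list)
    for ray_id, material in hits:
        by_material[material].append(ray_id)
    waves = []
    for material, ray_ids in by_material.items():
        for start in range(0, len(ray_ids), wave_size):
            waves.append((material, ray_ids[start:start + wave_size]))
    return waves


def continuation_ray(hit_point, original_direction):
    """A continuation ray starts at the previous hit and keeps the original direction."""
    return {"origin": hit_point, "direction": original_direction}
```

Because every ray in a wave shares a material, all lanes run the same shader code, which is the divergence reduction the abstract refers to.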
-
Publication number: 20210157559
Abstract: Described herein are techniques for performing compilation operations for heterogeneous code objects. According to the techniques, a compiler identifies architectures targeted by a compilation unit, compiles the compilation unit into a heterogeneous code object that includes a different code object portion for each identified architecture, performs name mangling on functions of the compilation unit, links the heterogeneous code object with a second code object to form an executable, and generates relocation records for the executable.
Type: Application
Filed: November 22, 2019
Publication date: May 27, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Steven Tony Tye, Brian Laird Sumner, Konstantin Zhuravlyov
-
Publication number: 20210158222
Abstract: Methods, devices, and systems for emulating a compute kernel with an ANN. The compute kernel is executed on a processor, and it is determined whether the compute kernel is a hotspot kernel. If the compute kernel is a hotspot kernel, the compute kernel is emulated with an ANN, and the ANN is substituted for the compute kernel.
Type: Application
Filed: November 25, 2019
Publication date: May 27, 2021
Applicant: Advanced Micro Devices, Inc.
Inventor: Nicholas Malaya
-
Patent number: 11018125
Abstract: Various semiconductor chip devices and methods of manufacturing the same are disclosed. In one aspect, a semiconductor chip device is provided that has a reconstituted semiconductor chip package that includes an interposer that has a first side and a second and opposite side and a metallization stack on the first side, a first semiconductor chip on the metallization stack and at least partially encased by a dielectric layer on the metallization stack, and plural semiconductor chips positioned over and at least partially laterally overlapping the first semiconductor chip.
Type: Grant
Filed: July 13, 2020
Date of Patent: May 25, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Milind S. Bhagavat, Rahul Agarwal, Gabriel H. Loh
-
Patent number: 11016555
Abstract: An apparatus and a method for controlling power consumption associated with a computing device having first and second processors configured to perform different types of operations includes providing a user interface that allows, during normal operation of the computing device, at least one of: (i) a user selection of desired performance levels of the first and second processors relative to one another, such that higher desired performance levels of one processor correspond to lower desired performance levels of the other processor, and (ii) a user selection of a desired performance level of the first processor and a user selection of a desired performance level of the second processor, the two user selections being made independently of one another. The apparatus and method control, during normal operation of the computing device, performance levels of the processors in response to the one or more user selections of the desired performance levels.
Type: Grant
Filed: August 1, 2018
Date of Patent: May 25, 2021
Assignee: Advanced Micro Devices, Inc.
Inventor: I-Ming Lin
-
Patent number: 11016763
Abstract: Systems, apparatuses, and methods for compacting multiple groups of micro-operations into individual cache lines of a micro-operation cache are disclosed. A processor includes at least a decode unit and a micro-operation cache. When a new group of micro-operations is decoded and ready to be written to the micro-operation cache, the micro-operation cache determines which set is targeted by the new group of micro-operations. If there is a way in this set that can store the new group without evicting any existing group already stored in the way, then the new group is stored into the way with the existing group(s) of micro-operations. Metadata is then updated to indicate that the new group of micro-operations has been written to the way. Additionally, the micro-operation cache manages eviction and replacement policy at the granularity of micro-operation groups rather than at the granularity of cache lines.
Type: Grant
Filed: March 8, 2019
Date of Patent: May 25, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Jagadish B. Kotra, John Kalamatianos
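The compaction rule in this abstract (reuse a way that still has room, otherwise evict at group granularity) can be illustrated with the sketch below. The `UopCacheSet` class, measuring capacity in micro-operations per line, and the fallback policy of evicting the oldest groups from way 0 are assumptions made for the example; the patent's actual replacement policy is not described in the abstract.

```python
class UopCacheSet:
    """Hypothetical model of one set of a micro-operation cache."""

    def __init__(self, num_ways, line_capacity):
        # Each way holds a list of (tag, uop_count) groups; the list is the metadata.
        self.ways = [[] for _ in range(num_ways)]
        self.capacity = line_capacity  # micro-operations that fit in one line

    def _used(self, way):
        return sum(count for _, count in way)

    def insert(self, tag, uop_count):
        """Store a group; return (way_index, list_of_evicted_tags)."""
        if uop_count > self.capacity:
            raise ValueError("group does not fit in a single cache line")
        # Prefer a way that can take the group without evicting anything.
        for index, way in enumerate(self.ways):
            if self._used(way) + uop_count <= self.capacity:
                way.append((tag, uop_count))
                return index, []
        # Otherwise evict whole groups, oldest first, from way 0 (placeholder policy).
        victim = self.ways[0]
        evicted = []
        while self._used(victim) + uop_count > self.capacity:
            evicted.append(victim.pop(0)[0])
        victim.append((tag, uop_count))
        return 0, evicted
```

Note that eviction removes only as many groups as needed, so other groups in the same line survive, which is the difference from line-granularity replacement.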
-
Publication number: 20210149276
Abstract: A system and method for controlling characteristics of collected image data are disclosed. The system and method include performing pre-processing of an image using GPUs, configuring an optic based on the pre-processing, the configuring being designed to account for features of the pre-processed image, acquiring an image using the configured optic, processing the acquired image using GPUs, and determining whether the processed acquired image accounts for the features of the pre-processed image. If the determination is affirmative, the image is output; if the determination is negative, the configuring of the optic and the acquiring of the image are repeated.
Type: Application
Filed: December 23, 2020
Publication date: May 20, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Allen H. Rush, Hui Zhou
-
Publication number: 20210150669
Abstract: A processing device is provided which includes memory and a processor. The processor is configured to receive an input image having a first resolution, generate linear down-sampled versions of the input image by down-sampling the input image via a linear upscaling network and generate non-linear down-sampled versions of the input image by down-sampling the input image via a non-linear upscaling network.
Type: Application
Filed: November 18, 2019
Publication date: May 20, 2021
Applicant: Advanced Micro Devices, Inc.
Inventors: Alexander M. Potapov, Skyler Jonathon Saleh, Swapnil P. Sakharshete, Vineet Goel
-
Patent number: 11011466
Abstract: Various semiconductor chip devices and methods of making the same are disclosed. In one aspect, an apparatus is provided that includes a first redistribution layer (RDL) structure having a first plurality of conductor traces, a first molding layer on the first RDL structure, plural conductive pillars in the first molding layer, each of the conductive pillars including a first end and a second end, a second RDL structure on the first molding layer, the second RDL structure having a second plurality of conductor traces, and wherein some of the conductive pillars are electrically connected between some of the first plurality of conductor traces and some of the second plurality of conductor traces to provide a first inductor coil.
Type: Grant
Filed: March 28, 2019
Date of Patent: May 18, 2021
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Milind S. Bhagavat, Rahul Agarwal, Chia-Hao Cheng
-
Patent number: 11011495
Abstract: A data processor is implemented as an integrated circuit. The data processor includes a processor die. The processor die is connected to an integrated voltage regulator die using die-to-die bonding. The integrated voltage regulator die provides a regulated voltage to the processor die, and the processor die operates in response to the regulated voltage.
Type: Grant
Filed: August 23, 2018
Date of Patent: May 18, 2021
Assignee: Advanced Micro Devices, Inc.
Inventors: Milind Bhagavat, David Hugh McIntyre, Rahul Agarwal