Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 10530325
Abstract: Systems, apparatuses, and methods for performing efficient data transfer in a computing system are disclosed. A computing system includes multiple transmitters sending single-ended data signals to multiple receivers. A receiver includes multiple series inductors moved from a signal path to sampling circuitry to a termination path used for impedance matching. Removing the direct current (DC) resistances of the inductors from the signal path reduces signal attenuation. The termination path has the alternating current (AC) reactances of the inductors, which provide a frequency-dependent termination impedance. This termination impedance provides a positive reflection coefficient for high operating frequencies, which boosts the input signal being received by the sampling circuitry.
Type: Grant
Filed: August 30, 2018
Date of Patent: January 7, 2020
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Dean E. Gonzales, Xuan Chen, Jeffrey Cooper, Milam Paraschou
-
Patent number: 10529677
Abstract: Various chip stack power delivery circuits are disclosed. In one aspect, an apparatus is provided that includes a stack of semiconductor chips that has an uppermost semiconductor chip and a lowermost semiconductor chip. A heat spreader is positioned on the uppermost semiconductor chip. A power transfer circuit is configured to transfer electric power from the heat spreader to the uppermost semiconductor chip.
Type: Grant
Filed: April 27, 2018
Date of Patent: January 7, 2020
Assignee: Advanced Micro Devices, Inc.
Inventor: Dmitri Yudanov
-
Patent number: 10528613
Abstract: A method and apparatus for performing a search in a processor-in-memory (PIM) system having a first processor and at least one memory module includes receiving one or more images by the first processor. The first processor sends, to at least one memory module, a query to search memory for an image matching the one or more images, and the memory module searches its memory in response to the received query. The at least one memory module sends the results of the search to the first processor, and the first processor performs a comparison of the received results from the at least one memory module to the received one or more images.
Type: Grant
Filed: November 23, 2015
Date of Patent: January 7, 2020
Assignee: Advanced Micro Devices, Inc.
Inventor: Dong Ping Zhang
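As a rough illustration of the query/response flow described in this abstract, the sketch below models a host processor fanning an image-match query out to memory modules that search their local stores and return candidates for the host to merge. The MemoryModule and HostProcessor classes, the similarity metric, and the threshold are illustrative assumptions, not details from the patent.

```python
import numpy as np

class MemoryModule:
    """Hypothetical PIM module that searches the images held in its local memory."""
    def __init__(self, images):
        self.images = images  # list of np.ndarray image tiles stored near memory

    def search(self, query, threshold=0.95):
        # Module-side search: return (index, similarity) for close matches.
        hits = []
        for idx, img in enumerate(self.images):
            if img.shape == query.shape:
                diff = np.abs(img.astype(int) - query.astype(int))
                sim = 1.0 - diff.mean() / 255.0
                if sim >= threshold:
                    hits.append((idx, sim))
        return hits

class HostProcessor:
    """Hypothetical first processor: sends the query out, then compares results."""
    def __init__(self, modules):
        self.modules = modules

    def find_matches(self, query):
        all_hits = []
        for mod_id, mod in enumerate(self.modules):
            for idx, sim in mod.search(query):       # search runs in each module
                all_hits.append((mod_id, idx, sim))  # host merges/compares results
        return sorted(all_hits, key=lambda h: -h[2])

# Two modules; the second one holds an exact copy of the query image.
query = np.random.randint(0, 256, (8, 8), dtype=np.uint8)
modules = [MemoryModule([np.zeros((8, 8), np.uint8)]), MemoryModule([query.copy()])]
print(HostProcessor(modules).find_matches(query))
```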
-
Patent number: 10528483
Abstract: A system includes one or more processor cores and a cache hierarchy. The cache hierarchy includes a first-level cache, a second-level cache, and a third-level cache. The cache hierarchy further includes cache hierarchy control logic configured to implement a caching policy in which each cacheline cached in the first-level cache has a copy of the cacheline cached in at least one of the second-level cache and the third-level cache. The caching policy further provides that an eviction of a cacheline from the second-level cache does not trigger an eviction of a copy of that cacheline from the first-level cache, and that an eviction of a cacheline from the third-level cache triggers the cache hierarchy control logic to evict a copy of that cacheline from the first-level cache when the cacheline is not present in the second-level cache.
Type: Grant
Filed: October 23, 2017
Date of Patent: January 7, 2020
Assignee: Advanced Micro Devices, Inc.
Inventor: Paul Moyer
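The caching policy in this abstract lends itself to a small behavioral model: every L1 line keeps a copy in L2 or L3, an L2 eviction does not touch L1, and an L3 eviction removes the L1 copy only when the line is no longer in L2. The sketch below is a toy model under an assumed victim-fill behavior on L2 eviction, not the patented control logic.

```python
class CacheHierarchy:
    """Toy model of the policy: every L1 line also has a copy in L2 or L3."""
    def __init__(self):
        self.l1, self.l2, self.l3 = set(), set(), set()

    def fill(self, addr):
        # On fill, the line lands in L1 and in a backing level (here L2).
        self.l1.add(addr)
        self.l2.add(addr)

    def evict_from_l2(self, addr):
        # An L2 eviction does NOT trigger an L1 eviction; assume the L2 victim
        # is installed into L3, so any L1 copy keeps a backing copy.
        if addr in self.l2:
            self.l2.discard(addr)
            self.l3.add(addr)

    def evict_from_l3(self, addr):
        self.l3.discard(addr)
        # An L3 eviction evicts the L1 copy only when the line is absent from L2.
        if addr not in self.l2:
            self.l1.discard(addr)

c = CacheHierarchy()
c.fill(0x40)
c.evict_from_l2(0x40)   # line moves to L3; the L1 copy is untouched
c.evict_from_l3(0x40)   # not in L2 anymore, so the L1 copy is evicted too
print(0x40 in c.l1)     # False
```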
-
Patent number: 10529118
Abstract: A technique for compressing an original image is disclosed. According to the technique, an original image is obtained and a delta-encoded image is generated based on the original image. Next, a segregated image is generated based on the delta-encoded image and then the segregated image is compressed to produce a compressed image. The segregated image is generated because the segregated image may be compressed more efficiently than the original image and the delta image.
Type: Grant
Filed: June 29, 2018
Date of Patent: January 7, 2020
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Ruijin Wu, Skyler Jonathon Saleh, Christopher J. Brennan, Kei Ming Kwong, Anthony Hung-Cheong Chan
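A rough sketch of the three-stage pipeline named in the abstract (delta-encode, segregate, compress). The horizontal delta, byte-plane segregation, and zlib back end below are stand-ins chosen for illustration; the patent's actual delta and segregation schemes are not reproduced here.

```python
import zlib
import numpy as np

def delta_encode(img):
    # Horizontal delta: each pixel becomes the difference from its left neighbor.
    d = img.astype(np.int16)
    d[:, 1:] -= img[:, :-1].astype(np.int16)
    return d

def segregate(delta):
    # Stand-in segregation: split 16-bit deltas into low and high byte planes,
    # grouping similar bytes together so the final compressor sees longer runs.
    u = delta.astype(np.uint16)
    low = (u & 0xFF).astype(np.uint8)
    high = (u >> 8).astype(np.uint8)
    return np.concatenate([low.ravel(), high.ravel()])

def compress_image(img):
    return zlib.compress(segregate(delta_encode(img)).tobytes(), 9)

img = np.tile(np.arange(256, dtype=np.uint8), (64, 1))   # smooth gradient image
print(len(zlib.compress(img.tobytes(), 9)), "bytes direct vs",
      len(compress_image(img)), "bytes delta+segregated")
```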
-
Patent number: 10529693
Abstract: Various semiconductor chip devices with stacked chips are disclosed. In one aspect, a semiconductor chip device is provided. The semiconductor chip device includes a first semiconductor chip that has a front side and a back side and plural through chip vias. The through chip vias have a first footprint. The back side is configured to have a second semiconductor chip stacked thereon. The second semiconductor chip includes plural interconnects that have a second footprint larger than the first footprint. The back side includes a backside interconnect structure configured to connect to the interconnects and provide fanned-in pathways to the through chip vias.
Type: Grant
Filed: November 29, 2017
Date of Patent: January 7, 2020
Assignee: Advanced Micro Devices, Inc.
Inventors: Rahul Agarwal, Milind S. Bhagavat
-
Publication number: 20200004585
Abstract: Techniques for executing shader programs with divergent control flow on a single instruction multiple data ("SIMD") processor are disclosed. These techniques include detecting entry into a divergent section of a shader program and, for the work-items that enter the divergent section, placing a task entry into a task queue associated with the target of each work-item. The target is the destination, in code, of any particular work-item, and is also referred to as a code segment herein. The task queues store task entries for code segments generated by different (or the same) wavefronts. A command processor examines task lists and schedules wavefronts for execution by grouping together tasks in the same task list into wavefronts and launching those wavefronts. By grouping tasks from different wavefronts together for execution in the same wavefront, serialization of execution is greatly reduced or eliminated.
Type: Application
Filed: June 29, 2018
Publication date: January 2, 2020
Applicant: Advanced Micro Devices, Inc.
Inventors: Skyler Jonathon Saleh, Maxim V. Kazakov
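The queuing scheme sketched below follows the abstract's outline: work-items that reach a divergent section enqueue a task entry on the queue for their target code segment, and a command-processor-like scheduler regroups entries from the same queue into new wavefronts. The wavefront width, queue layout, and names are illustrative assumptions.

```python
from collections import defaultdict

WAVEFRONT_WIDTH = 4  # illustrative; real hardware typically uses 32 or 64 lanes

def enqueue_divergent_work(wavefronts, task_queues):
    # Each work-item that reaches the divergent section posts a task entry
    # onto the queue associated with its target code segment.
    for wf_id, items in wavefronts.items():
        for item, target_segment in items:
            task_queues[target_segment].append((wf_id, item))

def regroup_into_wavefronts(task_queues):
    # The scheduler packs tasks that share a code segment into full wavefronts,
    # so each launched wavefront executes convergently.
    launched = []
    for segment, tasks in task_queues.items():
        for i in range(0, len(tasks), WAVEFRONT_WIDTH):
            launched.append((segment, tasks[i:i + WAVEFRONT_WIDTH]))
    return launched

# Two original wavefronts whose work-items branch to code segments "A" or "B".
wavefronts = {0: [(0, "A"), (1, "B"), (2, "A"), (3, "B")],
              1: [(0, "A"), (1, "A"), (2, "B"), (3, "B")]}
queues = defaultdict(list)
enqueue_divergent_work(wavefronts, queues)
for segment, group in regroup_into_wavefronts(queues):
    print(segment, group)
```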
-
Publication number: 20200005135
Abstract: Systems, methods, and devices for deploying an artificial neural network (ANN). Candidate ANNs are generated for performing an inference task based on specifications of a target inference device. Trained ANNs are generated by training the candidate ANNs to perform the inference task on an inference device conforming to the specifications. Characteristics describing each trained ANN's performance of the inference task on a device conforming to the specifications are determined. Profiles that reflect the characteristics of each trained ANN are stored. The stored profiles are queried based on requirements of an application to select an ANN from among the trained ANNs. The selected ANN is deployed on an inference device conforming to the target inference device specifications. Input data is communicated to the deployed ANN from the application. An output is generated using the deployed ANN, and the output is communicated to the application.
Type: Application
Filed: June 29, 2018
Publication date: January 2, 2020
Applicant: Advanced Micro Devices, Inc.
Inventor: Shuai Che
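One way to picture the profile-query step of this flow is the sketch below: stored profiles describing each trained ANN are filtered against an application's accuracy, latency, and memory requirements, and the best feasible network is selected for deployment. The profile fields and selection rule are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class AnnProfile:
    name: str
    accuracy: float      # measured on a device conforming to the target spec
    latency_ms: float
    memory_mb: float

def select_ann(profiles, min_accuracy, max_latency_ms, max_memory_mb):
    # Query stored profiles against the application's requirements and pick
    # the most accurate ANN that still fits the latency and memory budget.
    feasible = [p for p in profiles
                if p.accuracy >= min_accuracy
                and p.latency_ms <= max_latency_ms
                and p.memory_mb <= max_memory_mb]
    return max(feasible, key=lambda p: p.accuracy) if feasible else None

profiles = [AnnProfile("small", 0.91, 3.0, 40),
            AnnProfile("medium", 0.95, 8.0, 120),
            AnnProfile("large", 0.97, 20.0, 400)]
print(select_ann(profiles, min_accuracy=0.93, max_latency_ms=10, max_memory_mb=200))
```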
-
Publication number: 20200005514
Abstract: A technique for compressing an original image is disclosed. According to the technique, an original image is obtained and a delta-encoded image is generated based on the original image. Next, a segregated image is generated based on the delta-encoded image and then the segregated image is compressed to produce a compressed image. The segregated image is generated because the segregated image may be compressed more efficiently than the original image and the delta image.
Type: Application
Filed: June 29, 2018
Publication date: January 2, 2020
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Ruijin Wu, Skyler Jonathon Saleh, Christopher J. Brennan, Kei Ming Kwong, Anthony Hung-Cheong Chan
-
Patent number: 10522193
Abstract: A system, method, and computer program product are provided for a memory device system. One or more memory dies and at least one logic die are disposed in a package and communicatively coupled. The logic die comprises a processing device configurable to manage virtual memory and operate in an operating mode. The operating mode is selected from a set of operating modes comprising a slave operating mode and a host operating mode.
Type: Grant
Filed: September 12, 2018
Date of Patent: December 31, 2019
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Nuwan S. Jayasena, Gabriel H. Loh, Bradford M. Beckmann, James M. O'Connor, Lisa R. Hsu
-
Patent number: 10523428
Abstract: A method and apparatus provide cryptographic keys using, for example, a cryptographic co-processor (CCP) that uses spare processor cycles to work on cryptographic key generation in advance of the keys being needed by a requestor such as an application or other process in the device. In one example, the cryptographic co-processor detects an idle condition of the CCP, such as an idle condition of a cryptographic engine in the CCP. Control logic causes the CCP to generate at least one asymmetric key component corresponding to an asymmetric cryptographic key in response to detecting the idle condition. The method and apparatus stores the asymmetric key component(s) in persistent memory and generates the asymmetric cryptographic key using the stored asymmetric key component that was generated in response to detection of the idle condition of the CCP.
Type: Grant
Filed: November 22, 2017
Date of Patent: December 31, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Gongyuan Zhuang, Thomas R. Woller
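A software analogue of the idea follows, with the caveat that it pre-generates whole RSA keys rather than individual key components and relies on the third-party cryptography package: a background worker fills a key pool during "idle" time, so a requestor obtains a key without waiting for generation. The pool size, threading model, and idle detection are assumptions.

```python
import queue
import threading
from cryptography.hazmat.primitives.asymmetric import rsa

class KeyPregenerator:
    """Simplified analogue: pre-generate RSA keys in the background so a
    requestor gets a key immediately instead of waiting for generation."""
    def __init__(self, pool_size=2, key_size=2048):
        self.pool = queue.Queue(maxsize=pool_size)
        self.key_size = key_size
        threading.Thread(target=self._fill_when_idle, daemon=True).start()

    def _fill_when_idle(self):
        # Stand-in for "detect idle condition": refill whenever the pool has
        # room; a real CCP would gate this on spare engine cycles instead.
        while True:
            key = rsa.generate_private_key(public_exponent=65537,
                                           key_size=self.key_size)
            self.pool.put(key)          # blocks once the pool is full

    def get_key(self):
        # Fast path: hand out a pre-generated key; blocks only if the pool
        # is empty and the background thread has not produced one yet.
        return self.pool.get()

pregen = KeyPregenerator()
private_key = pregen.get_key()
print(private_key.public_key().public_numbers().e)
```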
-
Publication number: 20190394503
Abstract: Virtual Reality (VR) processing devices and methods are provided for transmitting user feedback information comprising at least one of user position information and user orientation information, receiving encoded audio-video (A/V) data, which is generated based on the transmitted user feedback information, separating the A/V data into video data and audio data corresponding to a portion of a next frame of a sequence of frames of the video data to be displayed, decoding the portion of a next frame of the video data and the corresponding audio data, providing the audio data for aural presentation and controlling the portion of the next frame of the video data to be displayed in synchronization with the corresponding audio data.
Type: Application
Filed: September 5, 2019
Publication date: December 26, 2019
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Lei Zhang, Gabor Sines, Khaled Mammou, David Glen, Layla A. Mah, Rajabali M. Koduri, Bruce Montag
-
Publication number: 20190391850
Abstract: Methods and systems for opportunistic load balancing in deep neural networks (DNNs) using metadata. Representative computational costs are captured, obtained or determined for a given architectural, functional or computational aspect of a DNN system. The representative computational costs are implemented as metadata for the given architectural, functional or computational aspect of the DNN system. In an implementation, the computed computational cost is implemented as the metadata. A scheduler detects whether there are neurons in subsequent layers that are ready to execute. The scheduler uses the metadata and neuron availability to schedule and load balance across compute resources and available resources.
Type: Application
Filed: June 26, 2018
Publication date: December 26, 2019
Applicant: Advanced Micro Devices, Inc.
Inventors: Nicholas Malaya, Yasuko Eckert
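A toy version of the scheduling step: each ready neuron (or tile of neurons) carries a cost estimate taken from its metadata, and a greedy longest-processing-time rule balances those tasks across compute units. The cost values, task names, and greedy rule are illustrative assumptions rather than the patented scheduler.

```python
import heapq

def schedule(ready_tasks, num_compute_units):
    """Greedy load balancing: assign each ready task (tagged with a
    metadata-derived cost estimate) to the least-loaded compute unit."""
    # Min-heap of (accumulated_cost, compute_unit_id).
    units = [(0.0, cu) for cu in range(num_compute_units)]
    heapq.heapify(units)
    assignment = {}
    # Place the most expensive ready tasks first (longest-processing-time rule).
    for task, cost in sorted(ready_tasks, key=lambda t: -t[1]):
        load, cu = heapq.heappop(units)
        assignment[task] = cu
        heapq.heappush(units, (load + cost, cu))
    return assignment

# Ready neurons with their metadata-derived cost estimates.
ready = [("layer2.n0", 5.0), ("layer2.n1", 1.0), ("layer3.n0", 4.0),
         ("layer3.n1", 2.0), ("layer3.n2", 3.0)]
print(schedule(ready, num_compute_units=2))
```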
-
Publication number: 20190391813
Abstract: The techniques described herein provide an instruction fetch and decode unit having an operation cache with low latency in switching between fetching decoded operations from the operation cache and fetching and decoding instructions using a decode unit. This low latency is accomplished through a synchronization mechanism that allows work to flow through both the operation cache path and the instruction cache path until that work is stopped due to needing to wait on output from the opposite path. The existence of decoupling buffers in the operation cache path and the instruction cache path allows work to be held until that work is cleared to proceed. Other improvements, such as a specially configured operation cache tag array that allows for detection of multiple hits in a single cycle, also improve latency by, for example, improving the speed at which entries are consumed from a prediction queue that stores predicted address blocks.
Type: Application
Filed: June 21, 2018
Publication date: December 26, 2019
Applicant: Advanced Micro Devices, Inc.
Inventors: Marius Evers, Dhanaraj Bapurao Tavare, Ashok Tirupathy Venkatachar, Arunachalam Annamalai, Donald A. Priore, Douglas R. Williams
-
Patent number: 10515182
Abstract: A non-transitory computer-readable medium includes instructions that, when provided to and executed by a processor, cause the processor to receive a first placement of domain instances of an integrated circuit layout provided as a tile having a group of multiple power domain modules. The first placement of domain instances is scanned to identify instances associated with a preselected power specification. A heuristic is applied to the first placement of domain instances to form an observation area. The heuristic demarcates select instances to form the observation area. Each instance associated with the preselected power specification is identified in the observation area. A contiguous region of instances is formed from the select instances in the observation area. The first placement of domain instances in the integrated circuit layout is modified to provide a revised placement for instances associated with the contiguous region of instances.
Type: Grant
Filed: June 30, 2017
Date of Patent: December 24, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Elsie Lo, Erhan Ergin, Dipanjan Sengupta, Rajit Seahra, Sowmya Thikkavarapu, Kameswara Goutham Vankayalapati
-
Patent number: 10515173
Abstract: An electronic device includes a first integrated circuit chip including a processing functional block, and a second integrated circuit chip including an input-output (IO) functional block. The IO functional block performs one or more IO processing operations on behalf of the processing functional block in the first integrated circuit chip. The first integrated circuit chip lacks at least some elements of the IO functional block, so that the processing functional block is unable to perform corresponding IO operations without the IO functional block.
Type: Grant
Filed: December 29, 2017
Date of Patent: December 24, 2019
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: David A. Roberts, Dean Gonzales
-
Patent number: 10515671
Abstract: Logic such as a memory controller writes primary data from an incoming write request as well as corresponding replicated primary data (which is a copy of the primary data) to one or more different memory banks of random access memory in response to determining a memory access contention condition for the address (including a range of addresses) corresponding to the incoming write request. When the memory bank containing the primary data is busy servicing a write request, such as to another row of memory in the bank, a read request for the primary data is serviced by reading the replicated primary data from the different memory bank of the random access memory to service the incoming read request.
Type: Grant
Filed: September 22, 2016
Date of Patent: December 24, 2019
Assignee: Advanced Micro Devices, Inc.
Inventor: David A. Roberts
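A behavioral sketch of the replication idea: writes to addresses flagged as contended are mirrored into a second bank, and a read falls back to the replica when the primary bank is busy. The two-bank layout, the busy flag, and the hot-address set are invented for illustration and are not the patented controller logic.

```python
class ReplicatingController:
    """Toy memory controller: writes to contended addresses are mirrored into a
    second bank, and reads use the replica when the primary bank is busy."""
    def __init__(self, num_banks=2):
        self.banks = [dict() for _ in range(num_banks)]
        self.busy = [False] * num_banks
        self.hot_addresses = set()       # addresses flagged as contended

    def write(self, addr, value):
        primary = addr % len(self.banks)
        self.banks[primary][addr] = value
        if addr in self.hot_addresses:
            replica = (primary + 1) % len(self.banks)
            self.banks[replica][addr] = value       # replicated primary data

    def read(self, addr):
        primary = addr % len(self.banks)
        if self.busy[primary] and addr in self.hot_addresses:
            replica = (primary + 1) % len(self.banks)
            return self.banks[replica][addr]        # serve from the replica
        return self.banks[primary][addr]

ctrl = ReplicatingController()
ctrl.hot_addresses.add(0x100)
ctrl.write(0x100, 42)
ctrl.busy[0x100 % 2] = True     # primary bank busy servicing another row
print(ctrl.read(0x100))         # 42, served from the replica bank
```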
-
Publication number: 20190384722
Abstract: A data processing system includes a memory, a group of input/output (I/O) devices, and an input/output memory management unit (IOMMU). The IOMMU is connected to the memory and adapted to allocate a hardware resource from among a group of hardware resources to receive an address translation request for a memory access from an I/O device. The IOMMU detects address translation requests from the group of I/O devices. The IOMMU reorders the address translation requests such that the order of dispatching an address translation request is based on a policy associated with the I/O device that is requesting the memory access. The IOMMU selectively allocates a hardware resource to the I/O device, based on the policy that is associated with the I/O device, in response to the reordering.
Type: Application
Filed: June 13, 2018
Publication date: December 19, 2019
Applicant: Advanced Micro Devices, Inc.
Inventors: Arkaprava Basu, Michael LeBeane, Eric Van Tassell
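The reordering and allocation steps could look roughly like the sketch below, where each pending translation request is sorted by a per-device policy before hardware translation slots are handed out. The device names, priority values, and request format are assumptions for illustration.

```python
# Per-device dispatch policy: lower number = dispatched earlier (illustrative).
DEVICE_POLICY = {"gpu": 0, "nic": 1, "disk": 2}

def reorder_requests(requests):
    """Reorder pending address-translation requests so dispatch order follows
    the policy of the requesting I/O device, preserving arrival order among
    requests from equally prioritized devices."""
    return sorted(enumerate(requests),
                  key=lambda ir: (DEVICE_POLICY.get(ir[1][0], 99), ir[0]))

def allocate_resources(requests, num_slots):
    # Hardware translation resources go to the highest-policy requests first.
    ordered = [req for _, req in reorder_requests(requests)]
    return ordered[:num_slots], ordered[num_slots:]

pending = [("disk", 0x1000), ("gpu", 0x2000), ("nic", 0x3000), ("gpu", 0x4000)]
granted, deferred = allocate_resources(pending, num_slots=2)
print("granted:", granted)
print("deferred:", deferred)
```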
-
Patent number: 10509736
Abstract: An input-output (IO) memory management unit (IOMMU) uses a reverse map table (RMT) to ensure that address translations acquired from a nested page table are correct and that IO devices are permitted to access pages in a memory when performing memory accesses in a computing device. A translation lookaside buffer (TLB) flushing mechanism is used to invalidate address translation information in TLBs that are affected by changes in the RMT. A modified Address Translation Caching (ATC) mechanism may be used, in which only partial address translation information is provided to IO devices so that the RMT is checked when performing memory accesses for the IO devices using the cached address translation information.
Type: Grant
Filed: April 10, 2018
Date of Patent: December 17, 2019
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Nippon Raval, David A. Kaplan, Philip Ng
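A minimal sketch of the reverse-map-table check, assuming an RMT keyed by system physical page that records the owning guest and guest physical page: a translation obtained from the nested page table is accepted for an I/O access only if the RMT entry agrees. All structures and field names here are illustrative, not the hardware format.

```python
class ReverseMapTable:
    """Toy RMT: one entry per system physical page recording which guest
    and guest physical page currently own it."""
    def __init__(self):
        self.entries = {}   # sys_phys_page -> (guest_id, guest_phys_page)

    def assign(self, sys_page, guest_id, guest_page):
        self.entries[sys_page] = (guest_id, guest_page)

    def check(self, sys_page, guest_id, guest_page):
        # The translation is valid only if the RMT confirms ownership.
        return self.entries.get(sys_page) == (guest_id, guest_page)

def translate_for_io(rmt, nested_page_table, guest_id, guest_page):
    sys_page = nested_page_table[guest_page]     # translation from the NPT
    if not rmt.check(sys_page, guest_id, guest_page):
        raise PermissionError("RMT check failed: stale or foreign mapping")
    return sys_page

rmt = ReverseMapTable()
npt = {0x10: 0x200}
rmt.assign(0x200, guest_id=1, guest_page=0x10)
print(hex(translate_for_io(rmt, npt, guest_id=1, guest_page=0x10)))
```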
-
Patent number: 10509752
Abstract: A data processing system includes a processing unit that forms a base die and has a group of through-silicon vias (TSVs), and is connected to a memory system. The memory system includes a die stack that includes a first die and a second die. The first die has a first surface that includes a group of micro-bump landing pads and a group of TSV landing pads. The group of micro-bump landing pads is connected to the group of TSVs of the processing unit using a corresponding group of micro-bumps. The first die has a group of memory die TSVs. The second die has a first surface that includes a group of micro-bump landing pads and a group of TSV landing pads connected to the group of TSVs of the first die. The first die communicates with the processing unit using first cycle timing, and with the second die using second cycle timing.
Type: Grant
Filed: April 27, 2018
Date of Patent: December 17, 2019
Assignee: Advanced Micro Devices, Inc.
Inventors: Russell Schreiber, John Wuu, Michael K. Ciraula, Patrick J. Shyvers