Patents Assigned to Advanced Micro Devices
-
Patent number: 8778734
Abstract: A system and method for efficiently addressing dies in a three-dimensional stacked integrated circuit. Multiple stacked dies may be included in a single package or module. At least two of the dies are vertically stacked. One or more of the dies may include die enumeration logic that generates a unique die address space identifier (ID) for a particular die. Unless a die is a base die, each die receives a unique die ID for itself from the die placed below it. The die then generates a unique die ID for one or more dies placed above it and sends these die IDs to the dies located on top of it. If a die is a base die, the logic may receive a root value for a first unique die ID within the vertical stack from the package substrate or silicon-based interposer beneath it.
Type: Grant
Filed: March 28, 2012
Date of Patent: July 15, 2014
Assignee: Advanced Micro Devices, Inc.
Inventor: Sophocles R. Metsis
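A minimal sketch of the enumeration idea described in this abstract, assuming a simple linear stack and a root value of 0 supplied from below the base die; the class and method names are illustrative, not taken from the patent.

```python
# Illustrative sketch (not from the patent): each die receives its ID from the
# die below it, then generates and forwards an ID for the die above it.

class Die:
    def __init__(self, name):
        self.name = name
        self.die_id = None
        self.upper_die = None  # die stacked directly above, if any

    def enumerate(self, assigned_id):
        """Accept an ID from below, then assign one to the die above."""
        self.die_id = assigned_id
        if self.upper_die is not None:
            self.upper_die.enumerate(assigned_id + 1)

# Build a three-die stack and enumerate from the base die, using a root
# value of 0 standing in for the value supplied by the substrate/interposer.
base, mid, top = Die("base"), Die("mid"), Die("top")
base.upper_die, mid.upper_die = mid, top
base.enumerate(0)
print([(d.name, d.die_id) for d in (base, mid, top)])
# [('base', 0), ('mid', 1), ('top', 2)]
```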
-
Patent number: 8782645
Abstract: A system and method for efficient automatic scheduling of the execution of work units between multiple heterogeneous processor cores. A processing node includes a first processor core with a general-purpose micro-architecture and a second processor core with a single instruction multiple data micro-architecture. A computer program comprises one or more compute kernels, or function calls. A compiler computes pre-runtime information of a given function call. A runtime scheduler produces one or more work units by matching each of the one or more kernels with an associated record of data. The scheduler assigns work units either to the first or to the second processor core based at least in part on the computed pre-runtime information. In addition, the scheduler is able to change an original assignment for a waiting work unit based on dynamic runtime behavior of other work units corresponding to the same kernel as the waiting work unit.
Type: Grant
Filed: May 11, 2011
Date of Patent: July 15, 2014
Assignee: Advanced Micro Devices, Inc.
Inventors: Mauricio Breternitz, Patryk Kaminski, Keith Lowery, Anton Chernoff
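A rough sketch of the two-stage decision this abstract describes; the hint and speedup fields, the threshold, and the reassessment policy are assumptions made for illustration, not details from the patent.

```python
# Illustrative sketch: assign work units to a general-purpose or SIMD core
# using a compile-time hint, then reassign a waiting work unit when runtime
# behavior of sibling work units from the same kernel suggests otherwise.

def initial_assignment(work_unit):
    # Pre-runtime hint computed by the compiler, e.g. estimated data parallelism.
    return "simd" if work_unit["parallelism_hint"] > 0.5 else "cpu"

def reassess(work_unit, runtime_stats):
    # runtime_stats: observed speedup of completed work units of the same kernel.
    stats = runtime_stats.get(work_unit["kernel"])
    if stats and stats["simd_speedup"] < 1.0:
        return "cpu"          # SIMD core is not paying off for this kernel
    return initial_assignment(work_unit)

work_units = [
    {"kernel": "saxpy", "parallelism_hint": 0.9},
    {"kernel": "parse", "parallelism_hint": 0.2},
]
runtime_stats = {"saxpy": {"simd_speedup": 3.2}}
print([(w["kernel"], reassess(w, runtime_stats)) for w in work_units])
# [('saxpy', 'simd'), ('parse', 'cpu')]
```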
-
Publication number: 20140192052
Abstract: An apparatus, computer readable medium, and method of rendering a 2D object using a 3D graphics processing unit (GPU). The method includes one or more shaders running on the 3D GPU forming a 3D object by accessing the 2D object. The method may include the one or more shaders forming the 3D object by forming a plurality of 3D vertex attributes of the 2D object. The 3D vertex attributes may include position, color, and texture. The method may include copying a plurality of the 2D objects from central processing unit (CPU) memory to a GPU memory. The one or more shaders may access the 2D object from the GPU memory.
Type: Application
Filed: January 10, 2013
Publication date: July 10, 2014
Applicant: Advanced Micro Devices, Inc.
Inventor: Brian K. Bennett
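A small sketch of what forming 3D vertex attributes from a 2D object could look like, assuming the 2D object is an axis-aligned rectangle; the attribute layout and function name are ours, not the publication's.

```python
# Illustrative sketch: expand a 2D rectangle into per-vertex 3D attributes
# (position, color, texture coordinates) of the kind a vertex shader consumes.

def rect_to_vertices(x, y, w, h, color=(1.0, 1.0, 1.0, 1.0), z=0.0):
    corners = [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]
    texcoords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    return [
        {"position": (cx, cy, z), "color": color, "texcoord": tc}
        for (cx, cy), tc in zip(corners, texcoords)
    ]

for v in rect_to_vertices(10, 20, 100, 50):
    print(v)
```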
-
Patent number: 8775999
Abstract: A method for validating standard cells stored in a standard cell library and for use in design of an integrated circuit device is described. Each standard cell of the standard cells is iteratively placed adjacent to each side and corner of itself and each other standard cell of the standard cells to produce an interim test layout comprising a first plurality of cell pair permutations. The cell pair permutations are reduced by identifying at least one of: illegal or redundant left-right and top-bottom boundaries, and removing any cell pair permutations using the identified boundaries to generate a final test layout comprising a second plurality of cell pair permutations.
Type: Grant
Filed: November 8, 2012
Date of Patent: July 8, 2014
Assignee: Advanced Micro Devices, Inc.
Inventors: Juang-Ying Chueh, Charles Tung
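A toy sketch of the enumerate-then-reduce flow in this abstract. The cell names, the treatment of mirrored placements as the redundant case, and the two boundary labels are simplifying assumptions, not the patent's reduction rules.

```python
# Illustrative sketch: enumerate cell-pair placements on left-right and
# top-bottom boundaries, then drop permutations flagged as redundant
# (here, a pair already covered by its mirror image).

from itertools import product

cells = ["INV", "NAND2", "DFF"]
boundaries = ["left-right", "top-bottom"]

pairs = [(a, b, edge) for (a, b), edge in product(product(cells, repeat=2), boundaries)]

seen, reduced = set(), []
for a, b, edge in pairs:
    key = (frozenset((a, b)), edge)   # treat mirrored placements as redundant
    if key not in seen:
        seen.add(key)
        reduced.append((a, b, edge))

print(len(pairs), "pair permutations reduced to", len(reduced))
```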
-
Patent number: 8772083
Abstract: Various substrates or circuit boards for receiving a semiconductor chip and methods of processing the same are disclosed. In one aspect, a method of manufacturing is provided that includes forming a first opening in a solder mask positioned on a side of a substrate. The first opening does not extend to the side. A second opening is formed in the solder mask that extends to the side. The first opening may serve as an underfill anchor site.
Type: Grant
Filed: September 10, 2011
Date of Patent: July 8, 2014
Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.
Inventors: Andrew K W Leung, Roden R. Topacio, Yu-Ling Hsieh, Yip Seng Low
-
Patent number: 8775762
Abstract: A memory controller includes a batch unit, a batch scheduler, and a memory command scheduler. The batch unit includes a plurality of source queues for receiving memory requests from a plurality of sources. Each source is associated with a selected one of the source queues. The batch unit is operable to generate batches of memory requests in the source queues. The batch scheduler is operable to select a batch from one of the source queues. The memory command scheduler is operable to receive the selected batch from the batch scheduler and issue the memory requests in the selected batch to a memory interfacing with the memory controller.
Type: Grant
Filed: May 7, 2012
Date of Patent: July 8, 2014
Assignee: Advanced Micro Devices, Inc.
Inventors: Gabriel H. Loh, Rachata Ausavarungnirun
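A minimal sketch of the batch-unit / batch-scheduler / command-scheduler split described above. The selection policy (largest queue first) and the batch size are assumptions for illustration, not the patented scheduling policy.

```python
# Illustrative sketch: per-source queues collect requests into batches, a
# batch scheduler picks one source's batch, and the command path issues it.

from collections import deque

class BatchingMemoryController:
    def __init__(self, sources, batch_size=4):
        self.queues = {s: deque() for s in sources}
        self.batch_size = batch_size

    def enqueue(self, source, request):
        self.queues[source].append(request)

    def select_batch(self):
        # Simple policy for illustration: pick the source with the most
        # pending requests and take up to batch_size of them.
        source = max(self.queues, key=lambda s: len(self.queues[s]))
        count = min(self.batch_size, len(self.queues[source]))
        return source, [self.queues[source].popleft() for _ in range(count)]

    def issue(self, batch):
        for request in batch:
            print("issuing", request)

mc = BatchingMemoryController(["cpu", "gpu"])
for addr in (0x100, 0x140, 0x180):
    mc.enqueue("gpu", ("read", addr))
mc.enqueue("cpu", ("write", 0x200))
src, batch = mc.select_batch()
mc.issue(batch)
```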
-
Publication number: 20140185611
Abstract: A cluster compute server includes nodes coupled in a network topology via a fabric that source routes packets based on location identifiers assigned to the nodes, the location identifiers representing the locations in the network topology. Host interfaces at the nodes may be associated with link layer addresses that do not reflect the location identifier associated with the nodes. The nodes therefore implement locally cached link layer address translations that map link layer addresses to corresponding location identifiers in the network topology. In response to originating a packet directed to one of these host interfaces, the node accesses the local translation cache to obtain a link layer address translation for a destination link layer address of the packet. When a node experiences a cache miss, the node queries a management node to obtain the specified link layer address translation from a master translation table maintained by the management node.
Type: Application
Filed: December 31, 2012
Publication date: July 3, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Sean Lie, Vikrama Ditya, Gary R. Lauterbach
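A small sketch of the cache-with-fallback lookup this abstract describes; the MAC addresses, the (x, y, z) location encoding, and the dict standing in for the management node's master table are all illustrative assumptions.

```python
# Illustrative sketch: a node-local cache mapping link layer (MAC) addresses
# to fabric location identifiers, falling back to the management node's
# master table on a miss.

MASTER_TABLE = {  # stand-in for the table held by the management node
    "02:00:00:00:00:01": (0, 1, 2),   # location ID as (x, y, z) in the topology
    "02:00:00:00:00:02": (3, 0, 1),
}

class TranslationCache:
    def __init__(self):
        self.cache = {}

    def lookup(self, mac):
        if mac in self.cache:
            return self.cache[mac]
        # Cache miss: query the management node (modeled as a dict lookup here).
        location = MASTER_TABLE[mac]
        self.cache[mac] = location
        return location

tc = TranslationCache()
print(tc.lookup("02:00:00:00:00:02"))  # miss, fetched from the master table
print(tc.lookup("02:00:00:00:00:02"))  # hit, served from the local cache
```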
-
Publication number: 20140189094
Abstract: A cluster compute server comprises a set of one or more compute nodes, each compute node instantiating at least one virtual network interface controller representing a network interface controller of a primary network node remote to the compute node and appearing as a local network interface controller to the compute node. A first network node comprising a network interface controller coupleable to an external network operates as the primary network node for the set. The first network node emulates link aggregation partners for virtual network interfaces of the set based on first link state information maintained at the first network node. A second network node comprising a network interface controller coupleable to the external network operates as a secondary network node for the set. The second network node maintains second link state information that mirrors the first link state information maintained at the first network node.
Type: Application
Filed: December 31, 2012
Publication date: July 3, 2014
Applicant: Advanced Micro Devices, Inc.
Inventor: Vikrama Ditya
-
Publication number: 20140188996
Abstract: A server system allows its nodes to access the system's fabric interconnect directly, rather than via an interface that virtualizes the fabric interconnect as a network or storage interface. The server system also employs controllers to provide an interface to the fabric interconnect via a standard protocol, such as a network protocol or a storage protocol. The server system thus facilitates efficient and flexible transfer of data between the server system's nodes.
Type: Application
Filed: December 31, 2012
Publication date: July 3, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Sean Lie, Gary Lauterbach
-
Publication number: 20140184243
Abstract: An integrated circuit (IC) measures uncertainties in a first signal. The IC comprises a programmable delay circuit to introduce a programmable delay to the first signal to generate a first delayed signal. The IC further comprises a digital delay line (DDL) comprising a first delay chain of delay elements having an input to receive the first delayed signal. The DDL further comprises a set of storage elements, each storage element having an input coupled to an output of a corresponding delay element of the first delay chain, and an output to provide a corresponding bit of a digital reading. The DDL additionally comprises a decoder to generate a digital signature from the digital reading and a controller to iteratively adjust the programmed delay of the programmable delay circuit to search for a failure in a resulting digital signature.
Type: Application
Filed: December 31, 2012
Publication date: July 3, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Arun S. Iyer, Prashanth Vallur, Shraddha Padiyar, Amit Govil
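A toy sketch of the controller's iterative delay search; the golden signature value, the sweep step, and the model of how the signature degrades with delay are invented for illustration, not taken from the publication.

```python
# Illustrative sketch: sweep a programmable delay and compare each resulting
# digital signature against a golden value to find the delay at which the
# signature first fails.

GOLDEN_SIGNATURE = 0b1111

def read_signature(delay_ps):
    # Stand-in for the delay line + decoder: bits drop out as delay grows.
    return 0b1111 >> max(0, (delay_ps - 200) // 50)

def find_failure_delay(max_delay_ps=500, step_ps=25):
    for delay in range(0, max_delay_ps + 1, step_ps):
        if read_signature(delay) != GOLDEN_SIGNATURE:
            return delay
    return None

print("first failing delay:", find_failure_delay(), "ps")
```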
-
Publication number: 20140185627
Abstract: A cluster compute server comprises a fabric interconnect, a first node coupled to the fabric interconnect and comprising a network interface controller coupleable to an external network, and a second node coupled to the fabric interconnect and comprising a fabric interface to provide a set of one or more virtual network interface controllers representing the network interface controller of the first node. The one or more virtual network interface controllers each appear as a local network interface controller to software executed at the second node. The first node is to emulate one or more link aggregation partners for the set of one or more virtual network interface controllers.
Type: Application
Filed: December 31, 2012
Publication date: July 3, 2014
Applicant: Advanced Micro Devices, Inc.
Inventor: Vikrama Ditya
-
Publication number: 20140189443
Abstract: A server system performs error detection on a hop-by-hop basis at multiple compute nodes, thereby facilitating the detection of a compute node experiencing failure. The server system communicates a packet from an originating node to a destination node by separating the packet into multiple flow control digits (flits) and routing the flits using a series of hops over a set of intermediate nodes. The packet's final flit includes error detection information, such as checksum data. As each intermediate node receives the final flit, it performs error detection using the error detection information. The pattern of nodes that detect an error indicates which intermediate node has experienced a failure.
Type: Application
Filed: December 31, 2012
Publication date: July 3, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Min Xu, Sean Lie, Gene Shen
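A simplified sketch of how the per-hop checks localize a failure; the one-byte checksum, the single-bit corruption model, and the node names are assumptions made to keep the example small.

```python
# Illustrative sketch: flits carry a checksum in the final flit, each
# intermediate node verifies it, and the first node to report an error
# points at the hop where the failure occurred.

def checksum(flits):
    return sum(flits) & 0xFF

def route(flits, nodes, faulty_node=None):
    expected = flits[-1]                      # final flit holds the checksum
    payload = list(flits[:-1])
    detections = []
    for node in nodes:
        if node == faulty_node:
            payload[0] ^= 0x01                # corruption introduced at this hop
        detections.append((node, checksum(payload) != expected))
    return detections

payload = [0x10, 0x22, 0x05]
flits = payload + [checksum(payload)]
for node, error in route(flits, ["A", "B", "C", "D"], faulty_node="B"):
    print(node, "error detected" if error else "ok")
# A passes; B, C, D detect errors, so the failure is localized at node B.
```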
-
Publication number: 20140189700
Abstract: A processor uses a token scheme to govern the maximum number of memory access requests each of a set of processor cores can have pending at a northbridge of the processor. To implement the scheme, the northbridge issues a minimum number of tokens to each of the processor cores and keeps a number of tokens in reserve. In response to determining that a given processor core is generating a high level of memory access activity, the northbridge issues some of the reserve tokens to the processor core. The processor core returns the reserve tokens to the northbridge in response to determining that it is not likely to continue to generate the high number of memory access requests, so that the reserve tokens are available to issue to another processor core.
Type: Application
Filed: December 27, 2012
Publication date: July 3, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Douglas R. Williams, Vydhyanathan Kalyanasundharam, Marius Evers, Michael K. Fertig
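A minimal sketch of the token accounting described above; the token counts and the grant/return triggers are illustrative assumptions rather than the publication's policy.

```python
# Illustrative sketch: each core gets a minimum token allotment, the
# northbridge grants reserve tokens to a core showing high memory activity,
# and the core returns them when its request rate drops.

class Northbridge:
    def __init__(self, cores, min_tokens=4, reserve=8):
        self.tokens = {c: min_tokens for c in cores}
        self.reserve = reserve

    def grant_reserve(self, core, count):
        granted = min(count, self.reserve)
        self.reserve -= granted
        self.tokens[core] += granted
        return granted

    def return_reserve(self, core, count):
        returned = min(count, self.tokens[core])
        self.tokens[core] -= returned
        self.reserve += returned

nb = Northbridge(["core0", "core1"])
nb.grant_reserve("core0", 6)      # core0 is generating heavy memory traffic
print(nb.tokens, "reserve:", nb.reserve)
nb.return_reserve("core0", 6)     # activity subsides, tokens go back
print(nb.tokens, "reserve:", nb.reserve)
```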
-
Patent number: 8769247
Abstract: Methods and apparatuses are provided for increased efficiency in a processor via early instruction completion. The apparatus comprises an execution unit for processing instructions and determining whether a later issued instruction or an earlier issued instruction is ready for completion, and a retire unit for retiring the later issued instruction when it is ready for completion, or retiring the earlier issued instruction when the later issued instruction is not ready for completion and the earlier issued instruction has a known good completion status. The method comprises completing an earlier issued instruction having a known good completion status ahead of a later issued instruction when the later issued instruction is not ready for completion.
Type: Grant
Filed: April 15, 2011
Date of Patent: July 1, 2014
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael D Estlick, Kevin Hurd, Jay Fleischman
-
Patent number: 8769539
Abstract: A method and apparatus are provided to control the order of execution of load and store operations. Also provided is a computer readable storage device encoded with data for adapting a manufacturing facility to create the apparatus. One embodiment of the method includes determining whether a first group, comprising at least one or more instructions, is to be selected from a scheduling queue of a processor for execution using either a first execution mode or a second execution mode. The method also includes, responsive to determining that the first group is to be selected for execution using the second execution mode, preventing selection of the first group until a second group, comprising at least one or more instructions, that entered the scheduling queue prior to the first group is selected for execution.
Type: Grant
Filed: November 16, 2010
Date of Patent: July 1, 2014
Assignee: Advanced Micro Devices, Inc.
Inventors: Daniel Hopper, Suzanne Plummer, Christopher D. Bryant
-
Publication number: 20140181402
Abstract: A method of managing cache memory includes assigning a caching priority designator to an address that addresses information stored in a memory system. The information is stored in a cacheline of a first level of cache memory in the memory system. The cacheline is evicted from the first level of cache memory. A second level in the memory system to which to write back the information is determined based at least in part on the caching priority designator. The information is written back to the second level.
Type: Application
Filed: December 21, 2012
Publication date: June 26, 2014
Applicant: Advanced Micro Devices, Inc.
Inventor: Sean T. White
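A small sketch of the priority-directed write-back decision; the level names and the priority-to-level mapping are assumptions for illustration, not the publication's scheme.

```python
# Illustrative sketch: on eviction from the first-level cache, a caching
# priority designator attached to the address chooses where the line is
# written back in the memory hierarchy.

MEMORY_LEVELS = ["L2", "L3", "DRAM"]

def writeback_level(priority):
    """Map a caching priority designator (0 = highest) to a memory level."""
    return MEMORY_LEVELS[min(priority, len(MEMORY_LEVELS) - 1)]

def evict(cacheline):
    level = writeback_level(cacheline["priority"])
    print(f"writing back 0x{cacheline['addr']:x} to {level}")

evict({"addr": 0x1000, "priority": 0})   # high priority stays close: L2
evict({"addr": 0x2000, "priority": 2})   # low priority goes straight to DRAM
```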
-
Publication number: 20140181417
Abstract: A die-stacked memory device implements an integrated coherency manager to offload cache coherency protocol operations for the devices of a processing system. The die-stacked memory device includes a set of one or more stacked memory dies and a set of one or more logic dies. The one or more logic dies implement hardware logic providing a memory interface and the coherency manager. The memory interface operates to perform memory accesses in response to memory access requests from the coherency manager and the one or more external devices. The coherency manager comprises logic to perform coherency operations for shared data stored at the stacked memory dies. Due to the integration of the logic dies and the memory dies, the coherency manager can access shared data stored in the memory dies and perform related coherency operations with higher bandwidth and lower latency and power consumption compared to the external devices.
Type: Application
Filed: December 23, 2012
Publication date: June 26, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Gabriel H. Loh, Bradford M. Beckmann, Lisa R. Hsu, Michael Ignatowski, Michael J. Schulte
-
Publication number: 20140181482
Abstract: An arithmetic unit performs store-to-load forwarding based on predicted dependencies between store instructions and load instructions. In some embodiments, the arithmetic unit maintains a table of store instructions that are awaiting movement to a load/store unit of the instruction pipeline. In response to receiving a load instruction that is predicted to be dependent on a store instruction stored at the table, the arithmetic unit causes the data associated with the store instruction to be placed into the physical register targeted by the load instruction. In some embodiments, the arithmetic unit performs the forwarding by mapping the physical register targeted by the load instruction to the physical register where the data associated with the store instruction is located.
Type: Application
Filed: December 20, 2012
Publication date: June 26, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Gregory W. Smaus, Francesco Spadini, Matthew A. Rafacz, Michael Achenbach, Christopher J. Burke, Emil Talpes, Matthew M. Crum
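A compact sketch of the register-mapping form of forwarding described in the last sentence; the table structure, tags, and register names are illustrative assumptions, not the publication's microarchitecture.

```python
# Illustrative sketch: pending stores waiting to move to the load/store unit
# are tracked in a table; a load predicted to depend on one of them gets its
# destination mapped to the physical register holding the store's data,
# forwarding the value without a memory round trip.

pending_stores = {}   # store tag -> physical register holding the store data
register_map = {}     # load destination register -> forwarded physical register

def track_store(store_tag, data_phys_reg):
    pending_stores[store_tag] = data_phys_reg

def forward_load(load_dest_reg, predicted_store_tag):
    if predicted_store_tag in pending_stores:
        register_map[load_dest_reg] = pending_stores[predicted_store_tag]
        return True    # forwarded: load reads the store's physical register
    return False       # no forwarding: load must go through the memory path

track_store("st42", data_phys_reg="p17")
print(forward_load("r3", "st42"), register_map)   # True {'r3': 'p17'}
```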
-
Publication number: 20140181428
Abstract: A die-stacked memory device implements an integrated QoS manager to provide centralized QoS functionality in furtherance of one or more specified QoS objectives for the sharing of the memory resources by other components of the processing system. The die-stacked memory device includes a set of one or more stacked memory dies and one or more logic dies. The logic dies implement hardware logic for a memory controller and the QoS manager. The memory controller is coupleable to one or more devices external to the set of one or more stacked memory dies and operates to service memory access requests from the one or more external devices. The QoS manager comprises logic to perform operations in furtherance of one or more QoS objectives, which may be specified by a user, by an operating system, hypervisor, job management software, or other application being executed, or specified via hardcoded logic or firmware.
Type: Application
Filed: December 23, 2012
Publication date: June 26, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Lisa R. Hsu, Gabriel H. Loh, Bradford M. Beckmann, Michael Ignatowski
-
Publication number: 20140181407
Abstract: For a memory access at a processor, only a subset (less than all) of the ways of a cache associated with a memory address is prepared for access. The subset of ways is selected based on stored information indicating, for each memory access, which corresponding way of the cache was accessed. The subset of ways is selected and preparation of the subset of ways is initiated prior to the final determination as to which individual cache way in the subset is to be accessed.
Type: Application
Filed: December 26, 2012
Publication date: June 26, 2014
Applicant: Advanced Micro Devices, Inc.
Inventors: Matthew M. Crum, Teik-Chung Tan
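A small sketch of selecting a subset of ways from stored access history; the history table shape, the fall-back to all ways, and the way count are assumptions for illustration, not the publication's mechanism.

```python
# Illustrative sketch: a history table records which way each recent access
# hit in a set, and only those ways of the indexed set are prepared
# (e.g. powered or precharged) for the next access.

NUM_WAYS = 8
way_history = {}   # cache set index -> set of recently used ways

def record_access(set_index, way):
    way_history.setdefault(set_index, set()).add(way)

def ways_to_prepare(set_index):
    predicted = way_history.get(set_index)
    # Prepare only the predicted ways; fall back to all ways with no history.
    return sorted(predicted) if predicted else list(range(NUM_WAYS))

record_access(set_index=12, way=3)
record_access(set_index=12, way=5)
print(ways_to_prepare(12))   # [3, 5] -- only two ways are prepared
print(ways_to_prepare(7))    # no history yet, so all ways are prepared
```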