Patents Assigned to Advanced Micro Devices
-
Patent number: 9727241
Abstract: A processor maintains a count of accesses to each memory page. When the accesses to a memory page exceed a threshold amount for that memory page, the processor sets an indicator for the page. Based on the indicators for the memory pages, the processor manages data at one or more levels of the processor's memory hierarchy.
Type: Grant
Filed: February 6, 2015
Date of Patent: August 8, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Gabriel H. Loh, David A. Roberts, Mitesh R. Meswani, Mark R. Nutter, John R. Slice, Prashant Nair, Michael Ignatowski
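The counting scheme in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `PageAccessTracker`, the single global threshold (the abstract speaks of a threshold per page), and the 4096-byte page size are all assumptions made for illustration.

```python
from collections import defaultdict


class PageAccessTracker:
    """Hypothetical software model of per-page access counting."""

    def __init__(self, threshold, page_size=4096):
        self.threshold = threshold      # assumed: one threshold shared by all pages
        self.page_size = page_size      # assumed page size in bytes
        self.counts = defaultdict(int)  # page number -> access count
        self.hot = set()                # pages whose indicator has been set

    def record_access(self, address):
        """Count one access and set the indicator once the count exceeds the threshold."""
        page = address // self.page_size
        self.counts[page] += 1
        if self.counts[page] > self.threshold:
            self.hot.add(page)
        return page

    def is_hot(self, page):
        """Report whether the indicator is set for the given page."""
        return page in self.hot
```

A memory manager could consult `is_hot` when deciding which pages to keep in a faster level of the hierarchy; that policy is outside the scope of this sketch.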
-
Patent number: 9727340
Abstract: The present invention provides a method and apparatus for scheduling based on tags of different types. Some embodiments of the method include broadcasting a first tag to entries in a queue of a scheduler. The first tag is broadcast in response to a first instruction associated with a first entry in the queue being picked for execution. The first tag includes information identifying the first entry and information indicating a type of the first tag. Some embodiments of the method also include marking at least one second entry in the queue as ready to be picked for execution in response to at least one second tag associated with the at least one second entry in the queue matching the first tag.
Type: Grant
Filed: July 17, 2013
Date of Patent: August 8, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Michael Achenbach, Teik Tan, Gregory W. Smaus, Ganesh Venkataramanan, Emil Talpes
-
Patent number: 9728002
Abstract: A graphics processing unit is configured to map pixels of a first frame of a video stream to texels, select a subset of the texels for shading based on previously cached texels that were shaded for a second frame, and shade the subset of the texels. The graphics processing unit is also configured to cache the shaded subset of the texels with the previously cached texels and determine values for the pixels of the first frame based on the cached texels.
Type: Grant
Filed: December 18, 2015
Date of Patent: August 8, 2017
Assignee: Advanced Micro Devices, Inc.
Inventor: Karl Hillesland
-
Patent number: 9727435
Abstract: A method for automatically scaling estimates of digital power consumed by a portion of an integrated circuit (IC) device by the operating frequency of the portion of the IC is described herein. The method may include obtaining an energy value which may correspond to an amount of energy used by the portion of the IC. A cumulative energy value may be generated by repeatedly, at a frequency proportional to the operating frequency of the portion of the IC, obtaining energy values and adding each obtained energy value to a sum of energy values for the portion of the IC. The cumulative energy value may be sampled at a time sample interval to generate an estimate of the portion of the IC's digital power consumption that is automatically scaled with the operating frequency of the portion of the IC.
Type: Grant
Filed: June 22, 2015
Date of Patent: August 8, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Samuel D. Naffziger, Suresh B. Periyacheri
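As a rough illustration of why accumulating energy at a rate proportional to the clock makes the estimate scale with frequency, consider the following Python sketch. The function name `estimate_power` and its parameters are hypothetical, and a constant energy value per accumulation tick is assumed purely to keep the example small.

```python
def estimate_power(energy_per_tick, ticks_per_second, sample_interval_s):
    """Accumulate energy at a tick rate tied to the clock, then sample once.

    energy_per_tick:   assumed constant energy (joules) added on each tick
    ticks_per_second:  accumulation rate, proportional to operating frequency
    sample_interval_s: fixed wall-clock sampling interval in seconds
    """
    ticks = int(ticks_per_second * sample_interval_s)
    cumulative = 0.0
    for _ in range(ticks):
        cumulative += energy_per_tick  # running sum of obtained energy values
    # Energy gathered during the interval divided by the interval gives power.
    return cumulative / sample_interval_s
```

Doubling `ticks_per_second` doubles the number of additions in the same interval, so the returned estimate doubles without any explicit frequency multiplication.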
-
Publication number: 20170223370
Abstract: A system and method for providing video compression that includes encoding, using an encoding engine, a YUV stream wherein Y, U and V color values are encoded in parallel and patching together the Y, U and V color streams to form a compressed YUV output stream. The encoding engine further includes encoding each color value of the YUV stream in parallel using parallel encoding engines and a control engine for controlling operation of all of the encoding engines in parallel. The YUV stream has an average bits per pixel value that varies from a first value to a second value that is double the first value. The encoding engine includes encoding the YUV stream in generally the same amount of time regardless of the average bits per pixel value.
Type: Application
Filed: April 19, 2017
Publication date: August 3, 2017
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Haibin Li, Zhen Chen, Lei Zhang, Ji Zhou, Zhong Cai
-
Publication number: 20170220346
Abstract: Briefly, methods and apparatus to migrate a software thread from one wavefront executing on one execution unit to another wavefront executing on another execution unit whereby both execution units are associated with a compute unit of a processing device such as, for example, a GPU. The methods and apparatus may execute compiled dynamic thread migration swizzle buffer instructions that when executed allow access to a dynamic thread migration swizzle buffer that allows for the migration of register context information when migrating software threads. The register context information may be located in one or more locations of a register file prior to storing the register context information into the dynamic thread migration swizzle buffer. The method and apparatus may also return the register context information from the dynamic thread migration swizzle buffer to one or more different register file locations of the register file.
Type: Application
Filed: January 29, 2016
Publication date: August 3, 2017
Applicant: Advanced Micro Devices, Inc.
Inventors: Bradford Beckmann, Sooraj Puthoor
-
Patent number: 9720708
Abstract: Techniques are disclosed relating to data transformation for distributing workloads between processors or cores within a processor. In various embodiments, a first processing element receives a set of bytecode. The set of bytecode specifies a set of tasks and a first data structure that specifies data to be operated on during performance of the set of tasks. The first data structure is stored non-contiguously in memory of the computer system. In response to determining to offload the set of tasks to a second processing element of the computer system, the first processing element generates a second data structure that specifies the data. The second data structure is stored contiguously in memory of the computer system. The first processing element provides the second data structure to the second processing element for performance of the set of tasks.
Type: Grant
Filed: August 19, 2011
Date of Patent: August 1, 2017
Assignee: Advanced Micro Devices, Inc.
Inventor: Eric R. Caspole
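The core transformation described here, turning a non-contiguous structure into a contiguous one before handing it to another processing element, can be sketched in Python. The `Node` linked list and the `flatten` helper are hypothetical stand-ins chosen for illustration; the abstract does not specify the data structures involved.

```python
from array import array


class Node:
    """Hypothetical non-contiguous structure: a singly linked list of floats."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def flatten(head):
    """Copy the linked values into one contiguous buffer of doubles."""
    buf = array("d")  # array.array stores its elements in a single memory block
    node = head
    while node is not None:
        buf.append(node.value)
        node = node.next
    return buf
```

A contiguous buffer of this kind is the sort of layout that can be copied to an accelerator in one transfer, which is the motivation the abstract gives for generating the second data structure.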
-
Patent number: 9720487
Abstract: Durations of power management states are predicted on a per-process basis. Some embodiments include storing, in one or more data structures associated with one or more processes, information indicating previous durations of a power management state associated with the process(es). Some embodiments also include predicting a subsequent duration of the power management state for the process(es) using information stored in the data structure(s).
Type: Grant
Filed: January 10, 2014
Date of Patent: August 1, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: William L. Bircher, Madhu Saravana Sibi Govindan, Manish Arora, Michael J. Schulte, Nuwan S. Jayasena
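A per-process predictor of this general shape might look like the following Python sketch. The abstract does not say how the stored durations are turned into a prediction; the moving average used here, the class name `IdleDurationPredictor`, and the default history length of four are assumptions for illustration only.

```python
from collections import defaultdict, deque


class IdleDurationPredictor:
    """Hypothetical per-process history of power-state durations."""

    def __init__(self, history=4):
        # One bounded history per process ID; old entries fall off automatically.
        self.history = defaultdict(lambda: deque(maxlen=history))

    def record(self, pid, duration):
        """Store an observed duration of the power management state for a process."""
        self.history[pid].append(duration)

    def predict(self, pid):
        """Predict the next duration as the mean of the stored ones (assumed policy)."""
        past = self.history.get(pid)
        if not past:
            return None  # no history for this process yet
        return sum(past) / len(past)
```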
-
Patent number: 9720486
Abstract: A device and method of operating a synchronous frequency processing environment served by a common power source and common clock source. The method includes operating the processing environment to have a first power consumption. The method further includes determining a first synchronous frequency processing domain within the processing environment where it is desired to implement a first clock frequency alteration in a clock signal for the first synchronous frequency processing domain. The first clock frequency alteration generates an associated first alteration in a power consumption from the first synchronous frequency processing domain. The method further includes determining a second clock frequency alteration to a clock signal for a second synchronous frequency processing domain of the processing environment. The second clock frequency alteration is determined so as to reduce a change in the first power consumption caused by the first alteration in power consumption.
Type: Grant
Filed: September 25, 2015
Date of Patent: August 1, 2017
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Angel E. Socarras, Fei Guo
-
Publication number: 20170212757
Abstract: A graphics processing unit is disclosed, the graphics processing unit having a processor having one or more SIMD processing units, and a local data share corresponding to one of the one or more SIMD processing units, the local data share comprising one or more low latency accessible memory regions for each group of threads assigned to one or more execution wavefronts, and a global data share comprising one or more low latency memory regions for each group of threads.
Type: Application
Filed: April 10, 2017
Publication date: July 27, 2017
Applicant: Advanced Micro Devices, Inc.
Inventors: Michael J. Mantor, Brian Emberling
-
Patent number: 9715389
Abstract: A method includes suppressing execution of at least one dependent instruction of a load instruction by a processor using stored dependency information responsive to an invalid status of the load instruction. A processor includes an execution unit to execute instructions and a scheduler. The scheduler is to select for execution in the execution unit a load instruction having at least one dependent instruction and suppress execution of the at least one dependent instruction using stored dependency information responsive to an invalid status of the load instruction.
Type: Grant
Filed: June 25, 2013
Date of Patent: July 25, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Francesco Spadini, Michael Achenbach
-
Publication number: 20170206625
Abstract: Described is a method and apparatus to accelerate rendering of 3D graphics images. When rendering, the transformation matrix (or equivalent) used for projecting primitives is modified so that a resulting image is smaller and/or warped compared to a regular unmodified rendering. The effect of such transformation is fewer pixels being rendered and thus a better performance. To compute the final image, the warped image is rectified by an inverse transformation. Depending on the warping transformation used, the resulting (rectified) image will be blurred in a controlled way, either simulating a directional motion blur, location-dependent sharpness/blurriness or other blurring effects. By intelligently selecting the warping transformation in correspondence with the rendered scene, overall performance is increased without losing the perceived fidelity of the final image.
Type: Application
Filed: January 17, 2017
Publication date: July 20, 2017
Applicant: Advanced Micro Devices, Inc.
Inventor: Evgene Fainstain
-
Publication number: 20170206630
Abstract: Methods are provided for creating objects in a way that permits an API client to explicitly participate in memory management for an object created using the API. Methods for managing data object memory include requesting memory requirements for an object using an API and expressly allocating a memory location for the object based on the memory requirements. Methods are also provided for cloning objects such that a state of the object remains unchanged from the original object to the cloned object or can be explicitly specified.
Type: Application
Filed: April 3, 2017
Publication date: July 20, 2017
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Guennadi Riguer, Brian K. Bennett
-
Patent number: 9710276
Abstract: In a normal, non-loop mode a uOp buffer receives and stores for dispatch the uOps generated by a decode stage based on a received instruction sequence. In response to detecting a loop in the instruction sequence, the uOp buffer is placed into a loop mode whereby, after the uOps associated with the loop have been stored at the uOp buffer, storage of further uOps at the buffer is suspended. To execute the loop, the uOp buffer repeatedly dispatches the uOps associated with the loop's instructions until the end condition of the loop is met and the uOp buffer exits the loop mode.
Type: Grant
Filed: November 9, 2012
Date of Patent: July 18, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: David N. Suggs, Luke Yen, Steven Beigelmacher
-
Patent number: 9710392
Abstract: Embodiments are described for methods and systems for mapping virtual memory pages to physical memory pages by analyzing a sequence of memory-bound accesses to the virtual memory pages, determining a degree of contiguity between the accessed virtual memory pages, and mapping sets of the accessed virtual memory pages to respective single physical memory pages. Embodiments are also described for a method for increasing locality of memory accesses to DRAM in virtual memory systems by analyzing a pattern of virtual memory accesses to identify contiguity of accessed virtual memory pages, predicting contiguity of the accessed virtual memory pages based on the pattern, and mapping the identified and predicted contiguous virtual memory pages to respective single physical memory pages.
Type: Grant
Filed: August 15, 2014
Date of Patent: July 18, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Syed Ali Jafri, Yasuko Eckert, Srilatha Manne, Mithuna S Thottethodi
-
Patent number: 9712353
Abstract: A method and system are provided for allowing signals across electrical domains. The method includes applying a clock signal (of at least 1 GHz) to a first electronic element in a location having first electrical properties. Data is output from the first electronic element and received at a second electronic element located in a location having second electrical properties. The first and second electrical properties differ in either voltage or clock frequency.
Type: Grant
Filed: October 1, 2012
Date of Patent: July 18, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Andy Sung, Leon Lai, Daniel Wang
-
Patent number: 9710589
Abstract: Systems, apparatuses, and methods for reducing the area of a semiconductor structure. A spacing violation may be detected for a gap width used to separate first and second regions of a layer of semiconductor material. In response to detecting the violation, the first and second regions are merged into a combined region, and then a cut mask layer is formed above the combined region. Next, an etch process is performed through the cut mask layer to remove an exposed third region within the combined region, wherein the exposed third region is interposed between first and second region portions of the combined region.
Type: Grant
Filed: June 24, 2015
Date of Patent: July 18, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Kalpeshkumar Girishchandra Dave, Naveen Chandra Srivastava, Pankaj Kumar, Janardhan Achanta, Shreekanth Karandoor Sampigethaya
-
Patent number: 9710034
Abstract: A method and apparatus using temperature margin to balance performance with power allocation. Nominal, middle and high power levels are determined for compute elements. A set of temperature thresholds is determined that drives the power allocation of the compute elements towards a balanced temperature profile. For a given workload, temperature differentials are determined for each of the compute elements relative to the other compute elements, where the temperature differentials correspond to workload utilization of the compute element. If temperature overhead is available, and a compute element is below a temperature threshold, then particular compute elements are allocated power to match or drive toward the balanced temperature profile.
Type: Grant
Filed: June 8, 2015
Date of Patent: July 18, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Samuel D. Naffziger, Michael Osborn, Sebastien Nussbaum
-
Patent number: 9704995
Abstract: A system and method for fabricating non-planar devices while managing short channel and heating effects are described. A semiconductor device fabrication process includes forming a non-planar device where the body of the device is insulated from the silicon substrate, but the source and drain regions are not insulated from the silicon substrate. The process builds a local silicon on insulator (SOI) while not insulating the area around the source and drain regions from the silicon substrate. A trench is etched a length at least that of a channel length of the device while being bounded by a site for a source region and a site for a drain region. The trench is filled with relatively thick layers to form the local SOI. When nanowires of a gate are residing on top of the layer-filled trench, a second trench is etched into the top layer for depositing gate metal in the second trench.
Type: Grant
Filed: September 20, 2016
Date of Patent: July 11, 2017
Assignee: Advanced Micro Devices, Inc.
Inventor: Richard T. Schultz
-
Publication number: 20170195683
Abstract: A texture compression method is described. The method comprises splitting an original texture having a plurality of pixels into original blocks of pixels. Then, for each of the original blocks of pixels, a partition is identified that has one or more disjoint subsets of pixels whose union is the original block of pixels. The original block of pixels is further subdivided into one or more subsets according to the identified partition. Finally, each subset is independently compressed to form a compressed texture block.
Type: Application
Filed: August 15, 2016
Publication date: July 6, 2017
Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.
Inventors: Konstantine Iourcha, Andrew S.C. Pomianowski
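The split, partition, and compress steps in this abstract can be sketched in Python under heavy simplification. Everything specific below is an assumption for illustration: the texture is grayscale, the partition is chosen by comparing each pixel with the block mean, and "compression" of a subset is reduced to storing its mean value. The function names are hypothetical and the abstract does not prescribe any of these choices.

```python
def split_into_blocks(texture, block=4):
    """Split a 2D grayscale texture (list of rows) into flattened block x block tiles."""
    h, w = len(texture), len(texture[0])
    blocks = []
    for by in range(0, h, block):
        for bx in range(0, w, block):
            blocks.append([texture[y][x]
                           for y in range(by, min(by + block, h))
                           for x in range(bx, min(bx + block, w))])
    return blocks


def partition_block(pixels):
    """Assign each pixel to subset 0 or 1 (assumed rule: compare with block mean)."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def compress_block(pixels):
    """Compress each subset independently; here each subset keeps only its mean."""
    mask = partition_block(pixels)
    reps = []
    for subset_id in (0, 1):
        members = [p for p, m in zip(pixels, mask) if m == subset_id]
        reps.append(sum(members) / len(members) if members else None)
    return mask, reps
```

The two subsets are disjoint and their union is the whole block, which matches the partition property stated in the abstract; a uniform block yields a single non-empty subset.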