Patents Assigned to Advanced Micro Devices
-
Publication number: 20170193697
Abstract: A method, a system, and a computer-readable storage medium directed to performing high-speed parallel tessellation of 3D surface patches are disclosed. The method includes generating a plurality of primitives in parallel. Each primitive in the plurality is generated by a sequence of functional blocks, in which each sequence acts independently of all the other sequences.
Type: Application
Filed: December 30, 2015
Publication date: July 6, 2017
Applicant: Advanced Micro Devices, Inc.
Inventors: Timour T. Paltashev, Boris Prokopenko, Vladimir V. Kibardin
-
Patent number: 9696790
Abstract: Processor power may be managed by executing state storage and power gating instructions after receiving an idle indication. The idle indication may be received while the processor is executing instructions in a first mode, and the processor may execute the state storage and power gating instructions in a second mode. The state storage and power gating instructions may be inaccessible to the processor when operating in the first mode.
Type: Grant
Filed: October 24, 2014
Date of Patent: July 4, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Noah Beck, John D. Wilkes, Jr., Francisco Leonel Duran
-
Patent number: 9697176
Abstract: A method of multiplication of a sparse matrix and a vector to obtain a new vector and a system for implementing the method are claimed. Embodiments of the method are intended to optimize the performance of sparse matrix-vector multiplication in highly parallel processors, such as GPUs. The sparse matrix is stored in compressed sparse row (CSR) format.
Type: Grant
Filed: November 14, 2014
Date of Patent: July 4, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Mayank Daga, Joseph L. Greathouse
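For readers unfamiliar with the CSR format mentioned in the abstract above, the following is a minimal serial sketch of a CSR matrix-vector product. It illustrates only the storage format; it is not the parallel method the patent claims, and the names `csr_matvec`, `values`, `col_indices`, and `row_ptr` are assumptions chosen for this example.

```python
def csr_matvec(values, col_indices, row_ptr, x):
    """Multiply a CSR-stored sparse matrix by a dense vector x.

    values      -- nonzero entries, stored row by row
    col_indices -- column index of each entry in `values`
    row_ptr     -- row_ptr[i]..row_ptr[i+1] delimits row i inside `values`
    (All three names are illustrative, not taken from the patent.)
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        # Walk only the stored nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_indices[k]]
        y[row] = acc
    return y


# Dense equivalent: [[10, 0, 2], [0, 3, 0], [0, 4, 5]]
values = [10.0, 2.0, 3.0, 4.0, 5.0]
col_indices = [0, 2, 1, 1, 2]
row_ptr = [0, 2, 3, 5]
result = csr_matvec(values, col_indices, row_ptr, [1.0, 2.0, 3.0])
print(result)  # [16.0, 6.0, 23.0]
```

Each row's dot product is independent of the others, which is what makes the operation a candidate for the highly parallel processors the abstract refers to.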
-
Patent number: 9697146
Abstract: A processor uses a token scheme to govern the maximum number of memory access requests each of a set of processor cores can have pending at a northbridge of the processor. To implement the scheme, the northbridge issues a minimum number of tokens to each of the processor cores and keeps a number of tokens in reserve. In response to determining that a given processor core is generating a high level of memory access activity the northbridge issues some of the reserve tokens to the processor core. The processor core returns the reserve tokens to the northbridge in response to determining that it is not likely to continue to generate the high number of memory access requests, so that the reserve tokens are available to issue to another processor core.
Type: Grant
Filed: December 27, 2012
Date of Patent: July 4, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Douglas R. Williams, Vydhyanathan Kalyanasundharam, Marius Evers, Michael K. Fertig
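The token bookkeeping described in the abstract above can be sketched roughly as follows. This is a software model under stated assumptions, not the patented hardware: the class name `TokenPool`, its methods, and the idea that the caller decides when a core is "busy" are all hypothetical.

```python
class TokenPool:
    """Toy model of a northbridge handing out request tokens (hypothetical)."""

    def __init__(self, num_cores, min_tokens, reserve_tokens):
        self.min_tokens = min_tokens
        # Every core starts with the guaranteed minimum.
        self.tokens = {core: min_tokens for core in range(num_cores)}
        # Tokens held back for cores with high memory activity.
        self.reserve = reserve_tokens

    def grant_reserve(self, core, count):
        """Give a busy core up to `count` reserve tokens; return how many."""
        granted = min(count, self.reserve)
        self.reserve -= granted
        self.tokens[core] += granted
        return granted

    def return_reserve(self, core):
        """Core hands back everything above its guaranteed minimum."""
        extra = self.tokens[core] - self.min_tokens
        self.tokens[core] = self.min_tokens
        self.reserve += extra
        return extra


pool = TokenPool(num_cores=4, min_tokens=2, reserve_tokens=3)
first = pool.grant_reserve(0, 2)   # core 0 becomes busy
second = pool.grant_reserve(1, 2)  # only one reserve token is left
returned = pool.return_reserve(0)  # core 0 calms down
```

In this model the total number of tokens never changes, so the cap on pending requests at the northbridge is preserved while busy cores temporarily receive a larger share.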
-
Patent number: 9696998
Abstract: The apparatuses, systems, and methods in accordance with the embodiments disclosed herein may facilitate modifying post silicon instruction behavior. Embodiments herein may provide registers in predetermined locations in an integrated circuit. These registers may be mapped to generic instructions, which can modify an operation of the integrated circuit. In some embodiments, these registers may be used to implement a patch routine to change the behavior of at least a portion of the integrated circuit. In this manner, the original design of the integrated circuit may be altered.
Type: Grant
Filed: August 29, 2013
Date of Patent: July 4, 2017
Assignee: Advanced Micro Devices, Inc.
Inventor: Frank C Galloway
-
Patent number: 9696784
Abstract: A system, method and a computer program product for processing media content on a media player having direct access to hardware are provided in exemplary embodiments. When the media player is initialized, an operating system is placed into a stand-by mode that decreases power consumption on an electronic device. Instead of the operating system, a hardware pipeline processes media content. A hardware pipeline is dedicated to process a media content based on the media content type. The media content is processed using the dedicated hardware pipeline to reduce the power consumption during processing.
Type: Grant
Filed: September 14, 2012
Date of Patent: July 4, 2017
Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Greg Sadowski, Gabor Sines
-
Patent number: 9698790
Abstract: A programmable device comprises one or more programming regions, each comprising a plurality of configurable logic blocks, where each of the plurality of configurable logic blocks is selectively connectable to any other configurable logic block via a programmable interconnect fabric. The programmable device further comprises configuration logic configured to, in response to an instruction in an instruction stream, reconfigure hardware in one or more of the configurable logic blocks in a programming region independently from any of the other programming regions.
Type: Grant
Filed: June 26, 2015
Date of Patent: July 4, 2017
Assignee: Advanced Micro Devices, Inc.
Inventor: David A. Roberts
-
Patent number: 9697125
Abstract: For each access request received at a shared cache of the data processing device, a memory access pattern (MAP) monitor predicts which of the memory banks, and corresponding row buffers, would be accessed by the access request if the requesting thread were the only thread executing at the data processing device. By recording predicted accesses over time for a number of access requests, the MAP monitor develops a pattern of predicted memory accesses by executing threads. The pattern can be employed to assign resources at the shared cache, thereby managing memory more efficiently.
Type: Grant
Filed: April 9, 2015
Date of Patent: July 4, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Jaewoong Chung, Shekhar Srikantaiah, Lisa Hsu
-
Patent number: 9697147
Abstract: A processing system comprises one or more processor devices and other system components coupled to a stacked memory device having a set of stacked memory layers and a set of one or more logic layers. The set of logic layers implements a metadata manager that offloads metadata management from the other system components. The set of logic layers also includes a memory interface coupled to memory cell circuitry implemented in the set of stacked memory layers and coupleable to the devices external to the stacked memory device. The memory interface operates to perform memory accesses for the external devices and for the metadata manager. By virtue of the metadata manager's tight integration with the stacked memory layers, the metadata manager may perform certain memory-intensive metadata management operations more efficiently than could be performed by the external devices.
Type: Grant
Filed: August 6, 2012
Date of Patent: July 4, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Gabriel H. Loh, James M. O'Connor, Bradford M. Beckmann, Michael Ignatowski
-
Patent number: 9697003
Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Type: Grant
Filed: June 7, 2013
Date of Patent: July 4, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
-
Publication number: 20170185451
Abstract: Methods, devices, and systems for data driven scheduling of a plurality of computing cores of a processor. A plurality of threads may be executed on the plurality of computing cores, according to a default schedule. The plurality of threads may be analyzed, based on the execution, to determine correlations among the plurality of threads. A data driven schedule may be generated based on the correlations. The plurality of threads may be executed on the plurality of computing cores according to the data driven schedule.
Type: Application
Filed: December 28, 2015
Publication date: June 29, 2017
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Jimshed Mirza, YunPeng Zhu
-
Publication number: 20170185409
Abstract: Methods, devices, and systems for capturing an accuracy of an instruction executing on a processor. An instruction may be executed on the processor, and the accuracy of the instruction may be captured using a hardware counter circuit. The accuracy of the instruction may be captured by analyzing bits of at least one value of the instruction to determine a minimum or maximum precision datatype for representing the field, and determining whether to adjust a value of the hardware counter circuit accordingly. The representation may be output to a debugger or logfile for use by a developer, or may be output to a runtime or virtual machine to automatically adjust instruction precision or gating of portions of the processor datapath.
Type: Application
Filed: December 28, 2015
Publication date: June 29, 2017
Applicant: Advanced Micro Devices, Inc.
Inventors: Leonardo de Paula Rosa Piga, Abhinandan Majumdar, Indrani Paul, Wei Huang, Manish Arora, Joseph L. Greathouse
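As a loose software analogy to the idea in the abstract above, the sketch below finds the narrowest signed integer width that can represent each observed value and tallies the results in counters. The abstract describes a hardware counter circuit; the function names `min_int_bits` and `precision_histogram`, the choice of signed 8/16/32/64-bit widths, and the use of a Python `Counter` are all assumptions made for illustration.

```python
from collections import Counter


def min_int_bits(value):
    """Return the narrowest signed width (8, 16, 32 or 64) that holds value."""
    for bits in (8, 16, 32, 64):
        low = -(1 << (bits - 1))
        high = (1 << (bits - 1)) - 1
        if low <= value <= high:
            return bits
    raise ValueError("value does not fit in 64 bits")


def precision_histogram(observed_values):
    """Count how many observed values need each width (one counter per width)."""
    counters = Counter()
    for value in observed_values:
        counters[min_int_bits(value)] += 1
    return counters


histogram = precision_histogram([100, 200, 70000, -5])
print(dict(histogram))  # {8: 2, 16: 1, 32: 1}
```

A developer or runtime reading such counters could see, for instance, that most operands of an instruction would fit in a narrower datatype than the one declared.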
-
Patent number: 9692426
Abstract: A phase locked loop (PLL) system includes a PLL and a calibration circuit. The PLL has a reference clock input, a voltage controlled oscillator (VCO) clock output, and a feedback clock output. The calibration circuit provides a reference clock signal to the reference clock input of the PLL, induces first and second phase disturbances between the reference clock signal and a feedback clock signal, measures respective first and second zero crossing times of a phase error between the reference clock signal and the feedback clock signal, and estimates a bandwidth of the PLL in response to an average of the first and second zero crossing times.
Type: Grant
Filed: May 6, 2013
Date of Patent: June 27, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Meei-Ling Chiang, Boon-Aik Ang, Dennis Fischette, Jr.
-
Publication number: 20170178275
Abstract: Described is a method and system for using a solid state device (SSD) as an eviction pad for graphics processing units (GPUs). The method for eviction processing includes a processor that determines when a dedicated memory associated with a GPU and a host memory associated with the processor are congested. The processor sends a content transfer command to the SSD. The SSD initiates a content transfer directly with the dedicated memory associated with the GPU. The GPU transfers the contents directly to the SSD. The processor sends a content transfer command to the SSD when the evicted contents are needed by the GPU. The SSD then initiates and transfers the evicted contents back to the dedicated memory.
Type: Application
Filed: December 22, 2015
Publication date: June 22, 2017
Applicant: Advanced Micro Devices, Inc.
Inventors: Tzachi Cohen, Yaki Tebeka, Assaf Pagi
-
Publication number: 20170177484
Abstract: A probe filter determines whether to issue a probe to at least one other processing node in response to a memory access request, and includes a region probe filter directory, a line probe filter directory, and a controller. The region probe filter directory identifies regions of memory for which at least one cache line may be cached in a data processing system and a state of each region, wherein a size of each region corresponds to a plurality of cache lines. The line probe filter directory identifies cache lines cached in the data processing system and a state of each cache line. The controller accesses at least one of the region probe filter directory and the line probe filter directory in response to a memory access request to determine whether to issue the probe, and does not issue any probe in response to a read-only request.
Type: Application
Filed: December 22, 2015
Publication date: June 22, 2017
Applicant: Advanced Micro Devices, Inc.
Inventor: Patrick N. Conway
-
Patent number: 9686536
Abstract: A video device having data lanes and a method of operating the video device includes generating performance monitoring and/or debug data in response to the operation of the video device. The generated data is sampled from components of the video device operating in various clocking domains. The data sampled from the components is combined into a unified stream which is independent of the various clocking domains. The unified stream is transmitted across one or more data lanes of a video link along with corresponding audio and/or video data in real time.
Type: Grant
Filed: May 20, 2014
Date of Patent: June 20, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Eric Rentschler, Sebastien Nussbaum
-
Patent number: 9685953
Abstract: In one form, a logic circuit includes an asynchronous logic circuit, a synchronous logic circuit, and an interface circuit coupled between the asynchronous logic circuit and the synchronous logic circuit. The asynchronous logic circuit has a plurality of asynchronous outputs for providing a corresponding plurality of asynchronous signals. The synchronous logic circuit has a plurality of synchronous inputs corresponding to the plurality of asynchronous outputs, a stretch input for receiving a stretch signal, and a clock output for providing a clock signal. The synchronous logic circuit provides the clock signal as a periodic signal but prolongs a predetermined state of the clock signal while the stretch signal is active. The asynchronous interface detects whether metastability could occur when latching any of the plurality of the asynchronous outputs of the asynchronous logic circuit using said clock signal, and activates the stretch signal while the metastability could occur.
Type: Grant
Filed: September 9, 2016
Date of Patent: June 20, 2017
Assignee: ADVANCED MICRO DEVICES, INC.
Inventor: Greg Sadowski
-
Publication number: 20170168546
Abstract: A method and apparatus for performing inter-lane power management includes de-energizing one or more execution lanes upon a determination that the one or more execution lanes are to be predicated. Energy from the predicated execution lanes is redistributed to one or more active execution lanes.
Type: Application
Filed: December 9, 2015
Publication date: June 15, 2017
Applicant: Advanced Micro Devices, Inc.
Inventors: Mitesh R. Meswani, David A. Roberts, Dmitri Yudanov, Arkaprava Basu, Sergey Blagodurov
-
Patent number: 9678806
Abstract: Briefly, methods and apparatus to rebalance workloads among processing cores utilizing a hybrid work donation and work stealing technique are disclosed that improve workload imbalances within processing devices such as, for example, GPUs. In one example, the methods and apparatus allow for workload distribution between a first processing core and a second processing core by providing queue elements from one or more workgroup queues associated with workgroups executing on the first processing core to a first donation queue that may also be associated with the workgroups executing on the first processing core. The method and apparatus also determine if a queue level of the first donation queue is beyond a threshold, and if so, steal one or more queue elements from a second donation queue associated with workgroups executing on the second processing core.
Type: Grant
Filed: June 26, 2015
Date of Patent: June 13, 2017
Assignee: Advanced Micro Devices, Inc.
Inventors: Shuai Che, Bradford Beckmann, Marc S. Orr, Ayse Yilmazer
-
Patent number: 9680450
Abstract: In one form, a flip-flop comprises a master latch, a slave latch, and a multiplexer. The master latch has an input for receiving a data input signal, and an output, and operates in transparent and latching modes during respective first and second phases of a clock signal. The slave latch has an input coupled to the output of the master latch, and an output, and operates in the transparent and latching modes during the second and first phases of the clock signal, respectively. The multiplexer has a first input coupled to the output of the slave latch, a second input coupled to the output of the master latch, and an output for providing a data output signal, and provides the first input to the output during the first phase of the clock signal, and the second input to the output during the second phase of the clock signal.
Type: Grant
Filed: February 19, 2015
Date of Patent: June 13, 2017
Assignee: Advanced Micro Devices, Inc.
Inventor: Daniel W. Bailey
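The latch and multiplexer arrangement described in the abstract above can be modeled behaviorally as below. This is a simplified sketch, not a circuit-accurate simulation: it assumes the first clock phase is `clk == 0` and the second is `clk == 1`, ignores propagation delay, and the class name `MuxedFlipFlop` and its `step` method are hypothetical.

```python
class MuxedFlipFlop:
    """Behavioral model of a master latch, slave latch and output mux."""

    def __init__(self):
        self.master = 0
        self.slave = 0

    def step(self, clk, d):
        """Apply one clock level and data value; return the mux output."""
        if clk == 0:
            # First phase (assumed clk low): master is transparent and
            # follows d, slave holds, mux selects the slave output.
            self.master = d
            return self.slave
        # Second phase (assumed clk high): master holds the value captured
        # at the end of the first phase, slave is transparent and follows
        # the master, mux selects the master output.
        self.slave = self.master
        return self.master


ff = MuxedFlipFlop()
inputs = [(0, 1), (1, 0), (0, 0), (1, 1)]
outputs = [ff.step(clk, d) for clk, d in inputs]
print(outputs)  # [0, 1, 1, 0]
```

In this model the output changes only at the transition from the first phase to the second, and data arriving during the second phase (such as the `d` values in the second and fourth steps) is ignored, which matches ordinary edge-triggered behavior.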