Abstract: Methods and apparatuses are provided for servicing an interrupt in a computer system. The method includes a device driver receiving an interrupt request. The device driver is responsive to the interrupt request to store interrupt data in a portion of the memory. The interrupt data includes identification of at least one processor of the plurality of processors capable of servicing the interrupt request; priority of the interrupt request; a thread context; and an address for instructions to service the interrupt request. The device driver then instructs the peripheral device to issue a memory write to the plurality of processors so that each may determine if it can use the thread context and the instructions to service the interrupt. A computer system is provided with the hardware needed to perform the method.
Abstract: A method and system for allocating memory to a memory operation executed by a processor in a computer arrangement having a first processor configured for unified operation with a second processor. The method includes receiving a memory operation from a processor and mapping the memory operation to one of a plurality of memory heaps. The mapping produces a mapping result. The method also includes providing the mapping result to the processor.
Type:
Application
Filed:
December 21, 2012
Publication date:
October 3, 2013
Applicants:
ATI Technologies ULC, Advanced Micro Devices, Inc.
Inventors:
Anthony Asaro, Kevin Normoyle, Mark Hummel
Abstract: A method, computer program product, and system are provided for scheduling a plurality of instructions in a computing system. For example, the method can generate a plurality of instruction lineages, in which the plurality of instruction lineages is assigned to one or more registers. Each of the plurality of instruction lineages has at least one node representative of an instruction from the plurality of instructions. The method can also determine a node order based on respective priority values associated with each of the nodes. Further, the method can include scheduling the plurality of instructions based on the node order and the one or more registers to which the plurality of instruction lineages is assigned.
Abstract: Provided is a method of permitting the reordering of a visibility order of operations in a computer arrangement configured to permit threads of a first processor and a second processor to access a shared memory. The method includes receiving, in a program order, a first and a second operation in a first thread and permitting the reordering of the visibility order for the operations in the shared memory based on the class of each operation. The visibility order determines the visibility in the shared memory, by a second thread, of stored results from the execution of the first and second operations.
Type:
Application
Filed:
August 17, 2012
Publication date:
October 3, 2013
Applicants:
Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors:
Anthony Asaro, Kevin Normoyle, Mark Hummel
Abstract: The subject invention provides systems and methods that monitor and/or control turbulence of an immersion medium. The systems and methods relate to computer controlled techniques that reduce effects of immersion medium flow due to a liquid temperature gradient. According to an aspect of the subject invention, a number of temperature measurements of the immersion medium are obtained, and the temperature measurements are utilized to generate a gradient map of the immersion medium. By way of illustration, the temperature measurements can be made via wireless temperature sensors. The gradient map can be utilized to understand the stability of the immersion medium. According to an aspect of the subject invention, instability identified with the gradient map can be mitigated.
Abstract: A video graphics system, graphics processor, and method of reducing memory bandwidth consumption include logic that groups binary data of a block of pixels into bit-planes. Each bit-plane corresponds to a different bit position in the binary data of the block and includes a bit value from each pixel in the block at that corresponding bit position. An encoding, associated with the block of pixels, represents which ones of the bit-planes are constant-value bit-planes having binary data comprised of a same bit value from every pixel in the block and which of the bit-planes are mixed-value bit-planes. Logic accesses memory storing the block of pixels to process the binary data of each mixed-value bit-plane and accesses memory storing the encoding to process the binary data of each constant-value bit-plane when a processing operation is performed on the block of pixels.
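The bit-plane grouping described in the abstract above can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the 8-bit pixel width, and the use of `None` to mark a mixed-value plane are all assumptions made for illustration.

```python
def to_bit_planes(pixels, bits_per_pixel=8):
    """Return one list of bits per bit position (plane 0 = least significant bit)."""
    return [[(p >> b) & 1 for p in pixels] for b in range(bits_per_pixel)]


def encode_block(pixels, bits_per_pixel=8):
    """Split a non-empty block into a per-plane encoding plus the mixed planes.

    encoding[b] is 0 or 1 when plane b holds the same bit for every pixel,
    and None when the plane is mixed (hypothetical representation).
    """
    planes = to_bit_planes(pixels, bits_per_pixel)
    encoding = []
    mixed = {}  # only mixed-value planes would need to be read from pixel memory
    for b, plane in enumerate(planes):
        if all(bit == plane[0] for bit in plane):
            encoding.append(plane[0])
        else:
            encoding.append(None)
            mixed[b] = plane
    return encoding, mixed


def decode_block(encoding, mixed, num_pixels):
    """Rebuild pixel values from the encoding and the stored mixed planes."""
    pixels = [0] * num_pixels
    for b, const in enumerate(encoding):
        plane = mixed[b] if const is None else [const] * num_pixels
        for i, bit in enumerate(plane):
            pixels[i] |= bit << b
    return pixels


block = [0b10100000, 0b10100001, 0b10100011, 0b10100010]
encoding, mixed = encode_block(block)
restored = decode_block(encoding, mixed, len(block))
```

In this example only the two lowest bit-planes differ between pixels, so six of the eight planes are recovered from the encoding alone, which is the source of the bandwidth saving the abstract refers to.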
Abstract: Provided herein is a method for implementing antialiasing including independently operating different portions of a graphics pipeline at different sampling rates in accordance with pixel color details.
Abstract: During a replacement gate approach, the inverse tapering of the opening obtained after removal of the polysilicon material may be reduced by depositing a spacer layer and forming corresponding spacer elements on inner sidewalls of the opening. Consequently, the metal-containing gate electrode material and the high-k dielectric material may be deposited with enhanced reliability.
Type:
Application
Filed:
May 20, 2013
Publication date:
September 26, 2013
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Kai Frohberg, Uwe Griebenow, Katrin Reiche, Heike Berthold
Abstract: A device may include an oscillator to generate a clock signal based on first and second control signals. The oscillator may include a first buffer stage and a second buffer stage. The first buffer stage may output a first signal that is based on an output of the second buffer stage and the first control signal. The second buffer stage may output the clock signal. The clock signal may be based on the first signal and the second control signal.
Type:
Grant
Filed:
December 21, 2011
Date of Patent:
September 24, 2013
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Bruce A. Doyle, Emerson S. Fang, Alvin L. Loke, Shawn Searles, Stephen F. Greenwood
Abstract: A method and apparatus are disclosed for implementing early release of speculatively read data in a hardware transactional memory system. A processing core comprises a hardware transactional memory system configured to receive an early release indication for a specified word of a group of words in a read set of an active transaction. The early release indication comprises a request to remove the specified word from the read set. In response to the early release request, the processing core removes the group of words from the read set only after determining that no word in the group other than the specified word has been speculatively read during the active transaction.
Type:
Grant
Filed:
December 13, 2012
Date of Patent:
September 24, 2013
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Jaewoong Chung, David S Christie, Michael Hohmuth, Stephan Diestelhorst, Martin Pohlack, Luke Yen
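The early-release rule in the hardware transactional memory abstract above can be modeled in software as a rough sketch. The `ReadSet` class, the four-word group size, and the use of word indices as addresses are hypothetical; the abstract does not say what happens to the specified word when the release is refused, so this sketch assumes the group is simply left unchanged.

```python
class ReadSet:
    """Toy model of a read set tracked at the granularity of word groups."""

    def __init__(self, words_per_group=4):
        self.words_per_group = words_per_group  # assumed group size
        self.groups = {}  # group index -> set of word indices speculatively read

    def _group(self, addr):
        return addr // self.words_per_group

    def speculative_read(self, addr):
        """Record that a word was speculatively read in the active transaction."""
        self.groups.setdefault(self._group(addr), set()).add(addr)

    def early_release(self, addr):
        """Try to drop the group containing addr; return True if it was removed."""
        g = self._group(addr)
        words = self.groups.get(g)
        if words is None:
            return False  # group is not in the read set
        if words - {addr}:
            return False  # another word in the group was read: keep the group
        del self.groups[g]
        return True


rs = ReadSet()
rs.speculative_read(8)
released_alone = rs.early_release(8)   # only word 8 of its group was read

rs.speculative_read(8)
rs.speculative_read(9)
released_shared = rs.early_release(8)  # word 9 shares the group with word 8
```

The second release is refused because removing the whole group would also stop conflict tracking for word 9, which the transaction still depends on.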
Abstract: Methods and systems are provided for graphics processing unit distributed work-item queuing. One or more work-items of a wavefront are queued into a first level queue of a compute unit. When one or more additional work-items exist, a queuing of the additional work-items into a second level queue of the compute unit is performed. The queuing of the work-items into the first and second level queue is performed based on an assignment technique.
Type:
Application
Filed:
March 16, 2012
Publication date:
September 19, 2013
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Michael L. Schmit, Radhakrishna Giduthuri
Abstract: A method of fabricating a semiconductor device having a transistor with a metal gate electrode and a gate dielectric layer includes forming a protective layer on the gate dielectric layer and forming a metal gate electrode over the protective layer. The protective layer has a graded composition between the gate dielectric layer and the metal gate electrode.
Abstract: An apparatus and method for dynamically adjusting power limits for processing nodes and other components, such as peripheral interfaces, is disclosed. The apparatus includes multiple processing nodes and other components, and further includes a power management unit configured to set a first frequency limit for at least one of the processing nodes responsive to receiving an indication of a first detected temperature greater than a first temperature threshold. Initial power limits are set below guard-band power limits for components that do not have reliable reporting of power consumption or for cost or power saving reasons. The amount of throttling of processing nodes is used to adjust the power limits for the processing nodes and these components.
Type:
Application
Filed:
May 8, 2013
Publication date:
September 19, 2013
Applicant:
ADVANCED MICRO DEVICES, INC.
Inventors:
Alexander J. Branover, Ashish Jain, Ann M. Ling, Maurice B. Steinman
Abstract: Methods and systems are provided for graphics processing unit optimization via wavefront reforming including queuing one or more work-items of a wavefront into a plurality of queues of a compute unit. Each queue is associated with a particular processor within the compute unit. A plurality of work passes are performed. A determination is made as to which of the plurality of queues are below a threshold amount of work-items. Remaining one or more work-items from the queues with remaining ones of the work-items are redistributed to the below threshold queues. A subsequent work pass is performed. The determining, redistributing, and performing of the subsequent work pass are repeated until all the queues are empty.
Type:
Application
Filed:
March 16, 2012
Publication date:
September 19, 2013
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Michael L. Schmit, Radhakrishna Giduthuri
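The wavefront reforming loop in the abstract above (work pass, find below-threshold queues, redistribute, repeat until empty) can be sketched as follows. This is an illustrative model only: the function names, the one-item-per-queue work pass, and the "take from the fullest queue" redistribution policy are assumptions, since the abstract does not specify them.

```python
from collections import deque


def rebalance(queues, threshold):
    """Move items from the fullest queue to queues holding fewer than threshold items."""
    while True:
        low = [q for q in queues if len(q) < threshold]
        if not low:
            break
        donor = max(queues, key=len)
        receiver = min(low, key=len)
        if len(donor) - len(receiver) < 2:
            break  # moving an item would no longer even out the queues
        receiver.append(donor.pop())


def run_with_reforming(queues, threshold=1):
    """Run work passes until every queue is empty; return (passes, processed items)."""
    passes = 0
    processed = []
    while any(queues):
        # One work pass: each processor handles at most one item from its own queue.
        for q in queues:
            if q:
                processed.append(q.popleft())
        passes += 1
        rebalance(queues, threshold)
    return passes, processed


queues = [deque(range(6)), deque(), deque(), deque()]
passes, processed = run_with_reforming(queues)
```

With all six work-items initially in one queue, this toy run finishes in three passes; without redistribution the same queue would need six passes while the other three processors sat idle.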
Abstract: An integrated circuit includes a clock-tree with a plurality of clock buffers, a plurality of clocked storage elements, and a plurality of logic circuits. Each clocked storage element has a clock input terminal connected to one of the plurality of clock buffers and a weight. Each of the logic circuits is associated with two of the plurality of clocked storage elements and is characterized as having a logic depth. The weight of each clocked storage element is equal to a sum of an inverse of a logic depth of each of the plurality of logic circuits associated therewith. A first clocked storage element has a highest weight and is adjacent to and interacts with a second clocked storage element via one of the plurality of logic circuits. A first clock buffer provides a common clock signal to the first and second clocked storage elements.
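The weight definition in the clock-tree abstract above (a sum of inverse logic depths over the logic circuits attached to each storage element) can be worked through with a short Python sketch. The tuple layout, element names, and helper functions are hypothetical, and physical adjacency is not modeled; the sketch only computes weights and lists the elements that interact with the highest-weight one.

```python
from collections import defaultdict
from fractions import Fraction


def element_weights(logic_circuits):
    """logic_circuits: iterable of (element_a, element_b, logic_depth) tuples."""
    weights = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for a, b, depth in logic_circuits:
        # Each logic circuit contributes 1/depth to both elements it connects.
        weights[a] += Fraction(1, depth)
        weights[b] += Fraction(1, depth)
    return dict(weights)


def highest_weight_element(logic_circuits):
    """Return the highest-weight element and the elements it interacts with."""
    weights = element_weights(logic_circuits)
    first = max(weights, key=weights.get)
    partners = sorted(
        b if a == first else a
        for a, b, _ in logic_circuits
        if first in (a, b)
    )
    return first, partners


circuits = [("ff1", "ff2", 2), ("ff2", "ff3", 4), ("ff2", "ff4", 4)]
weights = element_weights(circuits)
first, partners = highest_weight_element(circuits)
```

Here `ff2` has weight 1/2 + 1/4 + 1/4 = 1, the highest in the example, so it would be the candidate to share a clock buffer with one of its partners.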
Abstract: According to one embodiment, an optical isolation module includes first and second linear polarizers, a Faraday rotator situated between the first and second linear polarizers and a transmissive element including a half-wave plate also situated between the first and second linear polarizers. In one embodiment, a method for performing optical isolation includes rotating an axis of polarization of a linearly polarized light beam by a first rotation in a first direction, and selectively rotating a portion of the linearly polarized light beam by a second rotation in the first direction to produce first and second linearly polarized light beam portions. As a result, the first linearly polarized light beam portion undergoes the first rotation, and the second linearly polarized light beam portion undergoes the first and second rotations. The method further includes filtering one of the first and second linearly polarized light beam portions to produce a light annulus.
Abstract: A system and method for increasing processor throughput by decreasing a loop critical path. In one embodiment, a table comprises multiple stack entries, each comprising an x87 floating-point (FP) stack specifier. The combinatorial logic for operand translation of N FP instructions per clock cycle may require N instantiated copies of a combinatorial logic block. Each instantiated copy may determine a new ordering of the stack entries. Control logic may receive necessary information from the corresponding N FP instructions and determine a corresponding combined computational effect, or stack reordering, on entries within the table based on two or more instructions. Resulting control signals are conveyed to the N instantiated copies. A resulting accumulative delay from an input of the first copy to the output of the Nth copy may be less than or equal to (N-1)*time_delay versus a longer N*time_delay.
Type:
Grant
Filed:
June 11, 2009
Date of Patent:
September 17, 2013
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Ranganathan Sudhakar, Daryl Lieu, Debjit Das Sarma
Abstract: A method and apparatus are disclosed for determining the presence of adjacent channel interference. Received digital signals are processed to detect the existence of strong channels adjacent to the channel of interest and control signals may be generated based on the detection of strong adjacent channels. The control signals are then used to adjust the signal power of the received signals.
Type:
Grant
Filed:
March 20, 2009
Date of Patent:
September 17, 2013
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Xiaoqiang Ma, Azzedine Touzni, Jason Adams, David Lewis, Louis Giannini, Feng Huang
Abstract: Apparatus for memory elements and related methods for performing an allocate operation are provided. An exemplary memory element includes a plurality of way memory elements and a replacement module coupled to the plurality of way memory elements. Each way memory element is configured to selectively output data bits maintained at an input address. The replacement module is configured to enable output of the data bits maintained at the input address of a way memory element of the plurality of way memory elements for replacement in response to an allocate instruction including the input address.
Type:
Grant
Filed:
November 19, 2010
Date of Patent:
September 10, 2013
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Michael Ciraula, Carson Henrion, Ryan Freese
Abstract: A method and apparatus for performing template-based classification of a circuit design are disclosed. A template file is read that defines a plurality of channel-connected-region (CCR) templates. A graph is formatted for each of the CCR templates. A plurality of CCRs are identified based on a partitioned netlist file that defines a given circuit design. A graph is generated for each of the identified CCRs. A matching CCR template graph is identified for each generated CCR graph. The template file may further define super-CCR templates, and a graph may be formatted for each of the super-CCR templates. All possible combinations of CCRs and previously-matched super-CCRs that are candidates to match the formatted super-CCR template graph may be determined in an iterative manner, for each formatted super-CCR template graph. A determination may be made as to which of the candidate combinations actually match the formatted super-CCR template graph.
Type:
Grant
Filed:
December 17, 2010
Date of Patent:
September 10, 2013
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Weiqing Guo, Thomas D. Burd, Arun Chandra