Patents Assigned to Advanced Micro Devices, Inc.
-
Patent number: 11854652
Abstract: A sense amplifier is biased to reduce leakage current and equalize matched transistor bias during an idle state. A first read select transistor couples a true bit line and a sense amplifier true (SAT) signal line and a second read select transistor couples a complement bit line and a sense amplifier complement (SAC) signal line. The SAT and SAC signal lines are precharged during a precharge state. An equalization circuit shorts the SAT and SAC signal lines during the precharge state. A differential sense amplifier circuit for latching the memory cell value is coupled to the SAT signal line and the SAC signal line. The precharge circuit and the differential sense amplifier circuit are turned off during a sleep state to cause the SAT and SAC signal lines to float. A sleep circuit shorts the SAT and SAC signal lines during the sleep state.
Type: Grant
Filed: November 10, 2022
Date of Patent: December 26, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Russell J. Schreiber, Ryan T. Freese, Eric W. Busta
-
Patent number: 11855061
Abstract: A three-dimensional integrated circuit includes a first die structure having a first geometry. The first die structure includes a first region that operates with a first power density and a second region that operates with a second power density. The first power density is less than the second power density. The three-dimensional integrated circuit includes a second die structure having a second geometry. A stacked portion of the second die structure is aligned with the first region. The three-dimensional integrated circuit includes an additional die structure stacked with the first die structure and the second die structure. The additional die structure has the first geometry or the second geometry.
Type: Grant
Filed: August 19, 2022
Date of Patent: December 26, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Brett P. Wilkerson, Milind S. Bhagavat, Rahul Agarwal, Dmitri Yudanov
-
Patent number: 11854602
Abstract: A memory controller monitors memory commands selected for dispatch to the memory and sends commands controlling a read clock state. A memory includes a read clock circuit and a mode register. The read clock circuit has an output for providing a hybrid read clock signal in response to a clock signal and a read clock mode signal. The mode register provides the read clock mode signal in response to a read clock mode, wherein the read clock circuit provides the hybrid read clock signal as a free-running clock signal that toggles continuously when the read clock mode is a first mode, and as a strobe signal that is active only in response to the memory receiving a read command when the read clock mode is a second mode.
Type: Grant
Filed: June 27, 2022
Date of Patent: December 26, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Aaron John Nygren, Karthik Gopalakrishnan, Tsun Ho Liu
-
Patent number: 11853193
Abstract: An approach is provided for a program profiler to implement inverse performance driven program analysis, which enables a user to specify a desired optimization end state and receive instructions on how to implement the optimization end state. The program profiler accesses profile data from an execution of a plurality of tasks executed on a plurality of computing resources. The program profiler constructs a dependency graph based on the profile data. The program profiler causes a user interface to be presented that represents the profile data. The program profiler receives an input for a modification of one or more execution attributes of one or more target tasks. The program profiler determines that the modification is projected to improve a performance metric while maintaining a validity of the dependency graph. The program profiler presents, via the user interface, one or more steps to implement the modification.
Type: Grant
Filed: October 29, 2021
Date of Patent: December 26, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Budirijanto Purnomo, Chen Shen
-
Patent number: 11853111
Abstract: Methods and apparatuses control electrical current supplied to a plurality of processing units in a multi-processor system. A plurality of current usage information corresponding to the processing units are received by a controller to determine a threshold current for each of the processing units. The controller determines a frequency reduction action and an instructions-per-cycle (IPC) reduction action for each of the processing units based on the threshold current and regulates operations of the processing units based on the determined frequency and IPC reduction actions.
Type: Grant
Filed: September 8, 2022
Date of Patent: December 26, 2023
Assignee: ADVANCED MICRO DEVICES, INC.
Inventors: Amitabh Mehra, Richard Martin Born, Sriram Srinivasan, Sneha Komatireddy, Michael L Golden, Xiuting Kaleen C. Man, Gokul Subramani Ramalingam Lakshmi Devi, Xiaojie He
-
Publication number: 20230409337
Abstract: Devices and methods for partial sorting for coherence recovery are provided. The partial sorting is efficiently executed by utilizing existing hardware along the memory path (e.g., memory local to the compute unit). The devices include an accelerated processing device which comprises memory and a processor. The processor is, for example, a compute unit of a GPU which comprises a plurality of SIMD units and is configured to determine, for data entries each comprising a plurality of bits, a number of occurrences of different types of the data entries by storing the number of occurrences in one or more portions of the memory local to the processor, sort the data entries based on the determined number of occurrences stored in the one or more portions of the memory local to the processor and execute the sorted data entries.
Type: Application
Filed: June 21, 2022
Publication date: December 21, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Matthäus G. Chajdas, Christopher J. Brennan
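Illustrative sketch (not part of the filing): the abstract's count-then-sort idea resembles a counting sort, in which a histogram of entry types is built first and then used to place entries of the same type next to each other. The Python below is a minimal sketch under assumptions that do not come from the abstract: the "type" of an entry is taken to be its low `key_bits` bits, and a plain list stands in for the counters held in memory local to the compute unit. The function name is hypothetical.

```python
def partial_sort_by_type(entries, key_bits=4):
    """Group integer entries by their low `key_bits` bits (assumed 'type')."""
    num_bins = 1 << key_bits
    mask = num_bins - 1

    # Pass 1: count how many entries of each type occur.
    # (This list stands in for counters kept in local memory.)
    counts = [0] * num_bins
    for entry in entries:
        counts[entry & mask] += 1

    # An exclusive prefix sum turns the counts into the first output slot per type.
    offsets = [0] * num_bins
    running = 0
    for key in range(num_bins):
        offsets[key] = running
        running += counts[key]

    # Pass 2: scatter each entry to the next free slot for its type.
    out = [None] * len(entries)
    for entry in entries:
        key = entry & mask
        out[offsets[key]] = entry
        offsets[key] += 1
    return out


grouped = partial_sort_by_type([0x13, 0x21, 0x03, 0x11, 0x22])
```

Entries of the same type end up adjacent while keeping their original relative order, which is the sense in which the sort is only partial.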
-
Publication number: 20230409336
Abstract: In accordance with described techniques for VLIW Dynamic Communication, an instruction that causes dynamic communication of data to at least one processing element of a very long instruction word (VLIW) machine is dispatched to a plurality of processing elements of the VLIW machine. A first count of data communications issued by the plurality of processing elements and a second count of data communications served by the plurality of processing elements are maintained. At least one additional instruction is determined for dispatch to the plurality of processing elements of the VLIW machine based on the first count and the second count. For example, an instruction that is independent of the instruction is determined for dispatch while the first count and the second count are unequal, and an instruction that is dependent on the instruction is determined for dispatch based on the first count and the second count being equal.
Type: Application
Filed: June 17, 2022
Publication date: December 21, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Sriseshan Srikanth, Karthik Ramu Sangaiah, Anthony Thomas Gutierrez, Vedula Venkata Srikant Bharadwaj, John Kalamatianos
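Illustrative sketch (not part of the filing): the two-counter rule in the abstract can be modeled as a small gate that compares issued and served communications before choosing the next instruction. The class, method names, and the `(name, is_dependent)` tuple format below are hypothetical and chosen only for illustration; the real mechanism is hardware, not Python.

```python
class DispatchGate:
    """Tracks issued vs. served data communications (hypothetical model)."""

    def __init__(self):
        self.issued = 0   # first count: communications issued
        self.served = 0   # second count: communications served

    def on_issue(self, n=1):
        self.issued += n

    def on_serve(self, n=1):
        self.served += n

    def may_dispatch(self, is_dependent):
        # Independent instructions can always go; dependent ones wait
        # until every issued communication has been served.
        return (not is_dependent) or self.issued == self.served


def pick_next(gate, ready):
    """Return the name of the first dispatchable instruction, or None.

    `ready` is a list of (name, is_dependent) pairs.
    """
    for name, is_dependent in ready:
        if gate.may_dispatch(is_dependent):
            return name
    return None


gate = DispatchGate()
gate.on_issue(2)
gate.on_serve(1)
first = pick_next(gate, [("use_result", True), ("unrelated_add", False)])
gate.on_serve(1)
second = pick_next(gate, [("use_result", True), ("unrelated_add", False)])
```

While the counts differ, only the independent instruction is selected; once they match, the dependent instruction becomes eligible.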
-
Publication number: 20230409232
Abstract: A method and apparatus for training data in a computer system includes reading data stored in a first memory address in a memory and writing it to a buffer. Training data is generated for transmission to the first memory address. The data is transmitted to the first memory address. Information relating to the training data is read from the first memory address and the stored data is read from the buffer and written to the memory area where the training data was transmitted.
Type: Application
Filed: June 21, 2022
Publication date: December 21, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Anwar Kashem, Craig Daniel Eaton, Pouya Najafi Ashtiani, Tsun Ho Liu
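Illustrative sketch (not part of the filing): the save, train, read back, restore sequence in the abstract can be shown with a list standing in for memory. The function name, the use of a simple equality check as the "information relating to the training data", and the sample pattern value are all assumptions made for illustration.

```python
def train_and_restore(memory, addr, pattern):
    """Write a training pattern to memory[addr] without losing its contents."""
    buffer = memory[addr]        # 1. save the stored data to a buffer
    memory[addr] = pattern       # 2. transmit the training data to that address
    readback = memory[addr]      # 3. read back information about the training data
    matched = readback == pattern
    memory[addr] = buffer        # 4. restore the original data from the buffer
    return matched


mem = [7, 8, 9]
ok = train_and_restore(mem, 1, 0xAA)
```

After the call the memory holds its original values, so the training pass is invisible to whatever was using that address.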
-
Publication number: 20230409868
Abstract: Activation scaled clipping layers for neural networks are described. An activation scaled clipping layer processes an output of a neuron in a neural network using a scaling parameter and a clipping parameter. The scaling parameter defines how numerical values are amplified relative to zero. The clipping parameter specifies a numerical threshold that causes the neuron output to be expressed as a value defined by the numerical threshold if the neuron output satisfies the numerical threshold. In some implementations, the scaling parameter is linear and treats numbers within a numerical range as being equivalent, such that any number in the range is scaled by a defined magnitude, regardless of value. Alternatively, the scaling parameter is nonlinear, which causes the activation scaled clipping layer to amplify numbers within a range by different magnitudes. Each scaling and clipping parameter is learnable during training of a machine learning model implementing the neural network.
Type: Application
Filed: June 20, 2022
Publication date: December 21, 2023
Applicant: Advanced Micro Devices, Inc.
Inventors: Hai Xiao, Adam H Li, Harris Eleftherios Gasparakis
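Illustrative sketch (not part of the filing): a linear version of a scale-then-clip activation can be written in a few lines. The abstract does not say whether clipping is symmetric, whether scaling is applied before or after clipping, or how the parameters are trained, so the symmetric threshold, the scale-first ordering, and the fixed (non-learned) parameter values below are assumptions.

```python
def scaled_clip(x, scale=1.0, clip=6.0):
    """Scale a neuron output, then clamp it to [-clip, +clip] (assumed form)."""
    y = scale * x                      # linear scaling relative to zero
    return max(-clip, min(clip, y))    # outputs beyond the threshold take its value


inside = scaled_clip(2.0, scale=2.0, clip=6.0)
above = scaled_clip(5.0, scale=2.0, clip=6.0)
below = scaled_clip(-5.0, scale=2.0, clip=6.0)
```

In a trained model `scale` and `clip` would be learnable parameters rather than constants; a nonlinear variant would replace the multiplication with a value-dependent gain.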
-
Publication number: 20230409982
Abstract: Methods, devices, and systems for emulating a compute kernel with an ANN. The compute kernel is executed on a processor, and it is determined whether the compute kernel is a hotspot kernel. If the compute kernel is a hotspot kernel, the compute kernel is emulated with an ANN, and the ANN is substituted for the compute kernel.
Type: Application
Filed: August 25, 2023
Publication date: December 21, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Nicholas Malaya
-
Patent number: 11847462
Abstract: A software-based instruction scoreboard indicates dependencies between closely-issued instructions issued to an arithmetic logic unit (ALU) pipeline. The software-based instruction scoreboard inserts one or more control words into the command stream between the dependent instructions, which is then executed by the ALU pipeline. The control words identify the instruction(s) upon which the dependent instructions depend (parent instructions) so that the GPU hardware can ensure that the ALU pipeline does not stall while the dependent instruction waits for results from the parent instruction.
Type: Grant
Filed: December 15, 2020
Date of Patent: December 19, 2023
Assignee: Advanced Micro Devices, Inc.
Inventor: Brian Emberling
-
Patent number: 11847463
Abstract: A processor includes a load/store unit and an execution pipeline to execute an instruction that represents a single-instruction-multiple-data (SIMD) operation, and which references a memory block storing operand data for one or more lanes of a plurality of lanes and a mask vector indicating which lanes of a plurality of lanes are enabled and which are disabled for the operation. The execution pipeline executes an instruction in a first execution mode unless a memory fault is generated during execution of the instruction in the first execution mode. In response to the memory fault, the execution pipeline re-executes the instruction in a second execution mode. In the first execution mode, a single load operation is attempted to access the memory block via the load/store unit. In the second execution mode, a separate load operation is performed by the load/store unit for each enabled lane of the plurality of lanes prior to executing the SIMD operation.
Type: Grant
Filed: September 27, 2019
Date of Patent: December 19, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Kai Troester, Scott Thomas Bingham, John M. King, Michael Estlick, Erik Swanson, Robert Weidner
-
Patent number: 11847048
Abstract: A processing device and methods of controlling remote persistent writes are provided. Methods include receiving an instruction of a program to issue a persistent write to remote memory. The methods also include logging an entry in a local domain when the persistent write instruction is received and providing a first indication that the persistent write will be persisted to the remote memory. The methods also include executing the persistent write to the remote memory and providing a second indication that the persistent write to the remote memory is completed. The methods also include providing the first and second indications when it is determined not to execute the persistent write according to global ordering and providing the second indication without providing the first indication when it is determined to execute the persistent write to remote memory according to global ordering.
Type: Grant
Filed: September 24, 2020
Date of Patent: December 19, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Nuwan Jayasena, Shaizeen Aga
-
Patent number: 11847061
Abstract: A technical solution to the technical problem of how to support memory-centric operations on cached data uses a novel memory-centric memory operation that invokes write back functionality on cache controllers and memory controllers. The write back functionality enforces selective flushing of dirty, i.e., modified, cached data that is needed for memory-centric memory operations from caches to the completion level of the memory-centric memory operations, and updates the coherence state appropriately at each cache level. The technical solution ensures that commands to implement the selective cache flushing are ordered before the memory-centric memory operation at the completion level of the memory-centric memory operation.
Type: Grant
Filed: July 26, 2021
Date of Patent: December 19, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Shaizeen Aga, Nuwan Jayasena, John Kalamatianos
-
Patent number: 11847062
Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.
Type: Grant
Filed: December 16, 2021
Date of Patent: December 19, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Tarun Nakra, Jay Fleischman, Gautam Tarasingh Hazari, Akhil Arunkumar, William L. Walker, Gabriel H. Loh, John Kalamatianos, Marko Scrbak
-
Patent number: 11848269
Abstract: A system and method for creating layout for standard cells are described. In various implementations, a floating metal net in the metal zero layer of a standard cell is selected for conversion to a power rail. The metal zero layer is a lowest metal layer above the gate region of a transistor. A semiconductor process (or process) forms a power rail in a metal zero track reserved for power rails. The process forms another power rail in a metal zero track reserved for floating metal nets, and electrically shorts the two power rails using a local interconnect layer between the two power rails. The charging and discharging times of a source region physically connected to the two power rails decreases.
Type: Grant
Filed: October 4, 2021
Date of Patent: December 19, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Partha Pratim Ghosh, Pratap Kumar Das, Prasanth M
-
Patent number: 11847055
Abstract: A technical solution to the technical problem of how to reduce the undesirable side effects of offloading computations to memory uses read hints to preload results of memory-side processing into a processor-side cache. A cache controller, in response to identifying a read hint in a memory-side processing instruction, causes results of the memory-side processing to be preloaded into a processor-side cache. Implementations include, without limitation, enabling or disabling the preloading based upon cache thrashing levels, preloading results, or portions of results, of memory-side processing to particular destination caches, preloading results based upon priority and/or degree of confidence, and/or during periods of low data bus and/or command bus utilization, last stores considerations, and enforcing an ordering constraint to ensure that preloading occurs after memory-side processing results are complete.
Type: Grant
Filed: June 30, 2021
Date of Patent: December 19, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Shaizeen Aga, Nuwan Jayasena
-
Publication number: 20230401159
Abstract: A method and system for providing memory in a computer system. The method includes receiving a memory access request for a shared memory address from a processor, mapping the received memory access request to at least one virtual memory pool to produce a mapping result, and providing the mapping result to the processor.
Type: Application
Filed: August 24, 2023
Publication date: December 14, 2023
Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC
Inventors: Anthony Asaro, Kevin Normoyle, Mark Hummel
-
Publication number: 20230400905
Abstract: A technique for operating a device is disclosed. The technique includes attempting to detect presence of a user based on emitted and reflected audio signals; and controlling power state of the device based on the attempting.
Type: Application
Filed: June 14, 2022
Publication date: December 14, 2023
Applicant: Advanced Micro Devices, Inc.
Inventor: Eswar Chandra Saranu
-
Patent number: 11842199
Abstract: An asynchronous pipeline includes a first stage and one or more second stages. A controller provides control signals to the first stage to indicate a modification to an operating speed of the first stage. The modification is determined based on a comparison of a completion status of the first stage to one or more completion statuses of the one or more second stages. In some cases, the controller provides control signals indicating modifications to an operating voltage applied to the first stage and a drive strength of a buffer in the first stage. Modules can be used to determine the completion statuses of the first stage and the one or more second stages based on the monitored output signals generated by the stages, output signals from replica critical paths associated with the stages, or a lookup table that indicates estimated completion times.
Type: Grant
Filed: June 26, 2020
Date of Patent: December 12, 2023
Assignee: Advanced Micro Devices, Inc.
Inventors: Greg Sadowski, John Kalamatianos, Shomit N. Das