Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
Type:
Application
Filed:
July 3, 2023
Publication date:
November 2, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Vedula Venkata Srikant Bharadwaj, Shomit Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
Type:
Application
Filed:
September 25, 2020
Publication date:
March 31, 2022
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
Abstract: Systems, methods, devices, and computer-implemented instructions for processor power management implemented in a compiler. In some implementations, a characteristic of code is determined. An instruction based on the determined characteristic is inserted into the code. The code and inserted instruction are compiled to generate compiled code. The compiled code is output.
Type:
Grant
Filed:
September 25, 2020
Date of Patent:
August 15, 2023
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Vedula Venkata Srikant Bharadwaj, Shomit N. Das, Anthony T. Gutierrez, Vignesh Adhinarayanan
Abstract: An electronic device includes a quantum processor having a plurality of qubits and a processor. The processor runs a plurality of instances of a quantum program substantially in parallel on the quantum processor using a separate set of qubits from among the plurality of qubits for each instance of the quantum program. The processor then acquires an output for each instance of the quantum program from the quantum processor. The processor next uses the outputs for generating an output of the quantum program.
Type:
Application
Filed:
September 30, 2021
Publication date:
March 30, 2023
Inventors:
Salonik Resch, Anthony Gutierrez, Yasuko Eckert, Vedula Venkata Srikant Bharadwaj, Mark H. Oskin
Abstract: A processor sets memory timing parameters based on a profile of a workload to be executed at the processor and based on a thermal budget associated with the processor. For a given workload and amount of available thermal headroom, as indicated by a detected temperature, the processor adjusts one or more of the memory timing parameters according to the workload profile. The processor is thereby able to tailor the memory timing parameters according to the memory access behavior of the workload, improving overall processing efficiency.
Type:
Application
Filed:
December 21, 2020
Publication date:
June 23, 2022
Inventors:
Max RUTTENBERG, Vedula Venkata Srikant BHARADWAJ, Yasuko ECKERT, Mark H. OSKIN, Anthony GUTIERREZ
Abstract: Systems and methods are disclosed that map quantum circuits to physical qubits of a quantum computer. Techniques are disclosed to generate a graph that characterizes the physical qubits of the quantum computer and to compute the resource requirements of each circuit of the quantum circuits. For each circuit, the graph is searched for a subgraph that matches the resource requirements of the circuit, based on a density matrix. Physical qubits, defined by the matching subgraph, are then allocated to the logical qubits of the circuit.
Type:
Application
Filed:
September 30, 2021
Publication date:
March 30, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Anthony T. Gutierrez, Salonik Resch, Yasuko Eckert, Gabriel H. Loh, Mark Henry Oskin, Vedula Venkata Srikant Bharadwaj
Abstract: Systems and methods are disclosed that map quantum circuits to physical qubits of a quantum computer. Techniques are disclosed to generate a graph that characterizes the physical qubits of the quantum computer and to compute the resource requirements of each circuit of the quantum circuits. For each circuit, the graph is searched for a subgraph that matches the resource requirements of the circuit, based on a density matrix. Physical qubits, defined by the matching subgraph, are then allocated to the logical qubits of the circuit.
Type:
Grant
Filed:
September 30, 2021
Date of Patent:
March 5, 2024
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Anthony T. Gutierrez, Salonik Resch, Yasuko Eckert, Gabriel H. Loh, Mark Henry Oskin, Vedula Venkata Srikant Bharadwaj
Abstract: Flow control credit management is provided when converting traffic from a first parallel link width on a first link to a second parallel link width on a second link A current value is calculated for a variable flow control credit exchange rate (R) associated with the first and second links. A first flow control credit indicator is received on the second link, and a credit amount calculated based on the first flow control credit indicator and R. A second flow control credit indicator for the credit amount is then transmitted on the first link.
Abstract: An integrated circuit includes, a network on chip (NOC) that includes a plurality of processing elements and a plurality of NOC nodes, interconnected to the plurality of processing elements. The integrated circuit includes logic that is configured to: increment by one, a virtual channel identifier to produce an incremented destination VC identifier, the virtual channel (VC) identifier associated with at least portion of a packet stored in at least one virtual channel buffer; determine that a destination virtual channel buffer corresponding to the incremented destination VC identifier in a destination NOC node in the NOC is available to store the portion of the packet; and in response to the determination, send the portion of the packet and the incremented destination VC identifier to the destination NOC node.
Abstract: A dynamically configurable overprovisioned microprocessor optimally supports a variety of different compute application workloads and with the capability to tradeoff among compute performance, energy consumption, and clock frequency on a per-compute application basis, using general-purpose microprocessor designs. In some embodiments, the overprovisioned microprocessor comprises a physical compute resource and a dynamic configuration logic configured to: detect an activation-warranting operating condition; undarken the physical compute resource responsive to detecting the activation-warranting operating condition; detect a configuration-warranting operating condition; and configure the overprovisioned microprocessor to use the undarkened physical compute resource responsive to detecting the configuration-warranting operating condition.
Type:
Application
Filed:
September 30, 2020
Publication date:
March 31, 2022
Inventors:
Anthony Gutierrez, Vedula Venkata Srikant Bharadwaj, Yasuko Eckert, Mark H. Oskin
Abstract: VLIW directed Power Management is described. In accordance with described techniques, a program is compiled to generate instructions for execution by a very long instruction word machine. During the compiling, power configurations for the very long instruction word machine to execute the instructions are determined, and fields of the instructions are populated with the power configurations. In one or more implementations, an instruction that includes a power configuration for the very long instruction word machine and operations for execution by the very long instruction word machine is obtained. A power setting of the very long instruction word machine is adjusted based on the power configuration of the instruction, and the operations of the instruction are executed by the very long instruction word machine.
Type:
Application
Filed:
December 14, 2021
Publication date:
June 15, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Anthony Thomas Gutierrez, Karthik Ramu Sangaiah, Vedula Venkata Srikant Bharadwaj
Abstract: VLIW directed Power Management is described. In accordance with described techniques, a program is compiled to generate instructions for execution by a very long instruction word machine. During the compiling, power configurations for the very long instruction word machine to execute the instructions are determined, and fields of the instructions are populated with the power configurations. In one or more implementations, an instruction that includes a power configuration for the very long instruction word machine and operations for execution by the very long instruction word machine is obtained. A power setting of the very long instruction word machine is adjusted based on the power configuration of the instruction, and the operations of the instruction are executed by the very long instruction word machine.
Type:
Grant
Filed:
December 14, 2021
Date of Patent:
November 14, 2023
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Anthony Thomas Gutierrez, Karthik Ramu Sangaiah, Vedula Venkata Srikant Bharadwaj
Abstract: A processor distributes memory timing parameters and data among different memory modules based upon memory access patterns. The memory access patterns indicate different types, or classes, of data for an executing workload, with each class associated with different memory access characteristics, such as different row buffer hit rate levels, different frequencies of access, different criticalities, and the like. The processor assigns each memory module to a data class and sets the memory timing parameters for each memory module according to the module’s assigned data class, thereby tailoring the memory timing parameters for efficient access of the corresponding data.
Type:
Application
Filed:
January 30, 2023
Publication date:
June 15, 2023
Inventors:
Max RUTTENBERG, Vendula Venkata Srikant BHARADWAJ, Yasuko ECKERT, Anthony GUTIERREZ, Mark H. OSKIN
Abstract: A processor distributes memory timing parameters and data among different memory modules based upon memory access patterns. The memory access patterns indicate different types, or classes, of data for an executing workload, with each class associated with different memory access characteristics, such as different row buffer hit rate levels, different frequencies of access, different criticalities, and the like. The processor assigns each memory module to a data class and sets the memory timing parameters for each memory module according to the module's assigned data class, thereby tailoring the memory timing parameters for efficient access of the corresponding data.
Type:
Application
Filed:
December 22, 2020
Publication date:
June 23, 2022
Inventors:
Max RUTTENBERG, Vendula Venkata Srikant BHARADWAJ, Yasuko ECKERT, Anthony GUTIERREZ, Mark H. OSKIN
Abstract: A dynamically configurable overprovisioned microprocessor optimally supports a variety of different compute application workloads and with the capability to tradeoff among compute performance, energy consumption, and clock frequency on a per-compute application basis, using general-purpose microprocessor designs. In some embodiments, the overprovisioned microprocessor comprises a physical compute resource and a dynamic configuration logic configured to: detect an activation-warranting operating condition; undarken the physical compute resource responsive to detecting the activation-warranting operating condition; detect a configuration-warranting operating condition; and configure the overprovisioned microprocessor to use the undarkened physical compute resource responsive to detecting the configuration-warranting operating condition.
Type:
Grant
Filed:
September 30, 2020
Date of Patent:
May 21, 2024
Assignee:
ADVANCED MICRO DEVICES, INC.
Inventors:
Anthony Gutierrez, Vedula Venkata Srikant Bharadwaj, Yasuko Eckert, Mark H. Oskin
Abstract: A system is described that includes an integrated circuit chip having a network-on-chip. The network-on-chip includes multiple routers arranged in a topology and a separate communication link coupled between each router and each of one or more neighboring routers of that router among the multiple routers in the topology. The integrated circuit chip also includes multiple nodes, each node coupled to a router of the multiple routers. When operating, a given router of the multiple routers keeps a record of operating states of some or all of the multiple routers and corresponding communication links. The given router then routes flits to destination nodes via one or more other routers of the multiple routers based at least in part on the operating states of the some or all of the multiple routers and the corresponding communication links.
Abstract: An integrated circuit includes a network on chip (NOC) that includes a plurality of processing elements and a plurality of NOC nodes, interconnected to the plurality of processing elements. The integrated circuit includes logic that is configured to: increment by one, a virtual channel identifier to produce an incremented destination VC identifier, the virtual channel (VC) identifier associated with at least portion of a packet stored in at least one virtual channel buffer; determine that a destination virtual channel buffer corresponding to the incremented destination VC identifier in a destination NOC node in the NOC is available to store the portion of the packet; and in response to the determination, send the portion of the packet and the incremented destination VC identifier to the destination NOC node.
Abstract: In accordance with described techniques for VLIW Dynamic Communication, an instruction that causes dynamic communication of data to at least one processing element of a very long instruction word (VLIW) machine is dispatched to a plurality of processing elements of the VLIW machine. A first count of data communications issued by the plurality of processing elements and a second count of data communications served by the plurality of processing elements are maintained. At least one additional instruction is determined for dispatch to the plurality of processing elements of the VLIW machine based on the first count and the second count. For example, an instruction that is independent of the instruction is determined for dispatch while the first count and the second count are unequal, and an instruction that is dependent on the instruction is determined for dispatch based on the first count and the second count being equal.
Type:
Application
Filed:
June 17, 2022
Publication date:
December 21, 2023
Applicant:
Advanced Micro Devices, Inc.
Inventors:
Sriseshan Srikanth, Karthik Ramu Sangaiah, Anthony Thomas Gutierrez, Vedula Venkata Srikant Bharadwaj, John Kalamatianos
Abstract: A processor distributes memory timing parameters and data among different memory modules based upon memory access patterns. The memory access patterns indicate different types, or classes, of data for an executing workload, with each class associated with different memory access characteristics, such as different row buffer hit rate levels, different frequencies of access, different criticalities, and the like. The processor assigns each memory module to a data class and sets the memory timing parameters for each memory module according to the module's assigned data class, thereby tailoring the memory timing parameters for efficient access of the corresponding data.
Type:
Grant
Filed:
December 22, 2020
Date of Patent:
February 21, 2023
Assignee:
Advanced Micro Devices, Inc.
Inventors:
Max Ruttenberg, Vendula Venkata Srikant Bharadwaj, Yasuko Eckert, Anthony Gutierrez, Mark H. Oskin
Abstract: A system is described that includes an integrated circuit chip having a network-on-chip. The network-on-chip includes multiple routers arranged in a topology and a separate communication link coupled between each router and each of one or more neighboring routers of that router among the multiple routers in the topology. The integrated circuit chip also includes multiple nodes, each node coupled to a router of the multiple routers. When operating, a given router of the multiple routers keeps a record of operating states of some or all of the multiple routers and corresponding communication links. The given router then routes flits to destination nodes via one or more other routers of the multiple routers based at least in part on the operating states of the some or all of the multiple routers and the corresponding communication links.