Patents by Inventor Leonardo Piga

Leonardo Piga has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240085964
    Abstract: A system and method for updating power supply voltages due to variations from aging are described. A functional unit includes a power supply monitor capable of measuring power supply variations in a region of the functional unit. An age counter measures an age of the functional unit. A control unit notifies the power supply monitor to measure an operating voltage reference. When the control unit receives a measured operating voltage reference, the control unit determines an updated age of the region different from the current age based on the measured operating voltage reference. The control unit updates the age counter with the corresponding age, which is younger than the previous age in some cases due to the region not experiencing predicted stress and aging. The control unit is capable of determining a voltage adjustment for the operating voltage reference based on an age indicated by the age counter.
    Type: Application
    Filed: November 20, 2023
    Publication date: March 14, 2024
    Inventors: Sriram Sambamurthy, Sriram Sundaram, Indrani Paul, Larry David Hewitt, Anil Harwani, Aaron Joseph Grenat, Dana Glenn Lewis, Leonardo Piga, Wonje Choi, Karthik Rao
  • Patent number: 11880610
    Abstract: A cluster compute server stores different types of data at different storage volumes in order to reduce data duplication at the storage volumes. The storage volumes are categorized into two classes: common storage volumes and dedicated storage volumes, wherein the common storage volumes store data to be accessed and used by multiple compute nodes (or multiple virtual servers) of the cluster compute server. The dedicated storage volumes, in contrast, store data to be accessed only by a corresponding compute node (or virtual server).
    Type: Grant
    Filed: November 16, 2020
    Date of Patent: January 23, 2024
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mauricio Breternitz, Jr., Leonardo Piga
  • Patent number: 11829222
    Abstract: A system and method for updating power supply voltages due to variations from aging are described. A functional unit includes a power supply monitor capable of measuring power supply variations in a region of the functional unit. An age counter measures an age of the functional unit. A control unit notifies the power supply monitor to measure an operating voltage reference. When the control unit receives a measured operating voltage reference, the control unit determines an updated age of the region different from the current age based on the measured operating voltage reference. The control unit updates the age counter with the corresponding age, which is younger than the previous age in some cases due to the region not experiencing predicted stress and aging. The control unit is capable of determining a voltage adjustment for the operating voltage reference based on an age indicated by the age counter.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: November 28, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Sriram Sambamurthy, Sriram Sundaram, Indrani Paul, Larry David Hewitt, Anil Harwani, Aaron Joseph Grenat, Dana Glenn Lewis, Leonardo Piga, Wonje Choi, Karthik Rao
  • Publication number: 20220100249
    Abstract: A system and method for updating power supply voltages due to variations from aging are described. A functional unit includes a power supply monitor capable of measuring power supply variations in a region of the functional unit. An age counter measures an age of the functional unit. A control unit notifies the power supply monitor to measure an operating voltage reference. When the control unit receives a measured operating voltage reference, the control unit determines an updated age of the region different from the current age based on the measured operating voltage reference. The control unit updates the age counter with the corresponding age, which is younger than the previous age in some cases due to the region not experiencing predicted stress and aging. The control unit is capable of determining a voltage adjustment for the operating voltage reference based on an age indicated by the age counter.
    Type: Application
    Filed: December 18, 2020
    Publication date: March 31, 2022
    Inventors: Sriram Sambamurthy, Sriram Sundaram, Indrani Paul, Larry David Hewitt, Anil Harwani, Aaron Joseph Grenat, Dana Glenn Lewis, Leonardo Piga, Wonje Choi, Karthik Rao
  • Publication number: 20210173591
    Abstract: A cluster compute server stores different types of data at different storage volumes in order to reduce data duplication at the storage volumes. The storage volumes are categorized into two classes: common storage volumes and dedicated storage volumes, wherein the common storage volumes store data to be accessed and used by multiple compute nodes (or multiple virtual servers) of the cluster compute server. The dedicated storage volumes, in contrast, store data to be accessed only by a corresponding compute node (or virtual server).
    Type: Application
    Filed: November 16, 2020
    Publication date: June 10, 2021
    Inventors: Mauricio BRETERNITZ, JR., Leonardo PIGA
  • Patent number: 10866768
    Abstract: A cluster compute server stores different types of data at different storage volumes in order to reduce data duplication at the storage volumes. The storage volumes are categorized into two classes: common storage volumes and dedicated storage volumes, wherein the common storage volumes store data to be accessed and used by multiple compute nodes (or multiple virtual servers) of the cluster compute server. The dedicated storage volumes, in contrast, store data to be accessed only by a corresponding compute node (or virtual server).
    Type: Grant
    Filed: December 12, 2014
    Date of Patent: December 15, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Mauricio Breternitz, Jr., Leonardo Piga
  • Patent number: 10613957
    Abstract: Systems, apparatuses, and methods for achieving balanced execution in a multi-node cluster through runtime detection of performance variation are described. During a training phase, performance counters and an amount of time spent waiting for synchronization is monitored for a plurality of tasks for each node of the multi-node cluster. These values are utilized to generate a model which correlates the values of the performance counters to the amount of time spent waiting for synchronization. Once the model is built, the values of the performance counters are monitored for a period of time at the start of each task, and these values are input into the model. The model generates a prediction of whether a given node is on the critical path. If the given node is predicted to be on the critical path, the power allocation of the given node is increased.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: April 7, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Brian J. Kocoloski, Leonardo Piga, Wei Huang, Indrani Paul
  • Patent number: 10452437
    Abstract: Systems, apparatuses, and methods for performing temperature-aware task scheduling and proactive power management. A SoC includes a plurality of processing units and a task queue storing pending tasks. The SoC calculates a thermal metric for each pending task to predict an amount of heat the pending task will generate. The SoC also determines a thermal gradient for each processing unit to predict a rate at which the processing unit's temperature will change when executing a task. The SoC also monitors a thermal margin of how far each processing unit is from reaching its thermal limit. The SoC minimizes non-uniform heat generation on the SoC by scheduling pending tasks from the task queue to the processing units based on the thermal metrics for the pending tasks, the thermal gradients of each processing unit, and the thermal margin available on each processing unit.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: October 22, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Abhinandan Majumdar, Brian J. Kocoloski, Leonardo Piga, Wei Huang, Yasuko Eckert
  • Patent number: 10355966
    Abstract: Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.
    Type: Grant
    Filed: March 25, 2016
    Date of Patent: July 16, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Samuel Lawrence Wasmundt, Leonardo Piga, Indrani Paul, Wei Huang, Manish Arora
  • Patent number: 10237335
    Abstract: Systems, apparatuses, and methods for managing cluster-level performance variability without a centralized controller are described. Each node of a multi-node cluster tracks a maximum and minimum progress across the plurality of nodes for a workload executed by the cluster. Each node also tracks its local progress on its current task. Each node also utilizes a comparison of the local progress to reported maximum and minimum progress across the cluster to identify a critical, or slow, node and whether to increase or reduce an amount of power allocated to the node. The nodes append information about the maximum and minimum progress to messages sent to other nodes to report their knowledge of maximum and minimum progress with other nodes. A node updates its local information if the node receives a message from another node with more up-to-date information about the state of progress across the cluster.
    Type: Grant
    Filed: June 15, 2016
    Date of Patent: March 19, 2019
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Leonardo Piga
  • Patent number: 10067709
    Abstract: Systems, apparatuses, and methods for accelerating page migration using a two-level bloom filter are disclosed. In one embodiment, a system includes a GPU and a CPU and a multi-level memory hierarchy. When a memory request misses in a first memory, the GPU is configured to check a first level of a two-level bloom filter to determine if a page targeted by the memory request is located in a second memory. If the first level of the two-level bloom filter indicates that the page is not in the second memory, then the GPU generates a page fault and sends the memory request to a third memory. If the first level of the two-level bloom filter indicates that the page is in the second memory, then the GPU sends the memory request to the CPU.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: September 4, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Leonardo Piga, Mauricio Breternitz
  • Patent number: 10048741
    Abstract: Systems, apparatuses, and methods for implementing performance estimation mechanisms are disclosed. In one embodiment, a computing system includes at least one processor and a memory subsystem. During a characterization phase, the system utilizes a memory intensive workload to detect when the memory subsystem reaches its saturation point. Then, the system collects performance counter values during a sampling phase of a target application to determine the memory bandwidth. If the memory bandwidth is greater than the saturation point, then the system generates a prediction of the memory time which is based on a ratio of the memory bandwidth over the saturation point. Otherwise, if the memory bandwidth is less than the saturation point, the system assumes memory time is constant versus processor frequency. Then, the system uses the memory time and an estimate of the compute time to estimate a phase time for the target application at different processor frequencies.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: August 14, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Md Abdullah Shahneous Bari, Leonardo Piga, Indrani Paul
  • Publication number: 20180210531
    Abstract: Systems, apparatuses, and methods for implementing performance estimation mechanisms are disclosed. In one embodiment, a computing system includes at least one processor and a memory subsystem. During a characterization phase, the system utilizes a memory intensive workload to detect when the memory subsystem reaches its saturation point. Then, the system collects performance counter values during a sampling phase of a target application to determine the memory bandwidth. If the memory bandwidth is greater than the saturation point, then the system generates a prediction of the memory time which is based on a ratio of the memory bandwidth over the saturation point. Otherwise, if the memory bandwidth is less than the saturation point, the system assumes memory time is constant versus processor frequency. Then, the system uses the memory time and an estimate of the compute time to estimate a phase time for the target application at different processor frequencies.
    Type: Application
    Filed: January 26, 2017
    Publication date: July 26, 2018
    Inventors: Md Abdullah Shahneous Bari, Leonardo Piga, Indrani Paul
  • Patent number: 9983652
    Abstract: Systems, apparatuses, and methods for balancing computation and communication power in power constrained environments. A data processing cluster with a plurality of compute nodes may perform parallel processing of a workload in a power constrained environment. Nodes that finish tasks early may be power-gated based on one or more conditions. In some scenarios, a node may predict a wait duration and go into a reduced power consumption state if the wait duration is predicted to be greater than a threshold. The power saved by power-gating one or more nodes may be reassigned for use by other nodes. A cluster agent may be configured to reassign the unused power to the active nodes to expedite workload processing.
    Type: Grant
    Filed: December 4, 2015
    Date of Patent: May 29, 2018
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Leonardo Piga, Indrani Paul, Wei Huang
  • Publication number: 20180081585
    Abstract: Systems, apparatuses, and methods for accelerating page migration using a two-level bloom filter are disclosed. In one embodiment, a system includes a GPU and a CPU and a multi-level memory hierarchy. When a memory request misses in a first memory, the GPU is configured to check a first level of a two-level bloom filter to determine if a page targeted by the memory request is located in a second memory. If the first level of the two-level bloom filter indicates that the page is not in the second memory, then the GPU generates a page fault and sends the memory request to a third memory. If the first level of the two-level bloom filter indicates that the page is in the second memory, then the GPU sends the memory request to the CPU.
    Type: Application
    Filed: September 19, 2016
    Publication date: March 22, 2018
    Inventors: Leonardo Piga, Mauricio Breternitz
  • Publication number: 20170371719
    Abstract: Systems, apparatuses, and methods for performing temperature-aware task scheduling and proactive power management. A SoC includes a plurality of processing units and a task queue storing pending tasks. The SoC calculates a thermal metric for each pending task to predict an amount of heat the pending task will generate. The SoC also determines a thermal gradient for each processing unit to predict a rate at which the processing unit's temperature will change when executing a task. The SoC also monitors a thermal margin of how far each processing unit is from reaching its thermal limit. The SoC minimizes non-uniform heat generation on the SoC by scheduling pending tasks from the task queue to the processing units based on the thermal metrics for the pending tasks, the thermal gradients of each processing unit, and the thermal margin available on each processing unit.
    Type: Application
    Filed: June 24, 2016
    Publication date: December 28, 2017
    Inventors: Abhinandan Majumdar, Brian J. Kocoloski, Leonardo Piga, Wei Huang, Yasuko Eckert
  • Publication number: 20170373955
    Abstract: Systems, apparatuses, and methods for achieving balanced execution in a multi-node cluster through runtime detection of performance variation are described. During a training phase, performance counters and an amount of time spent waiting for synchronization is monitored for a plurality of tasks for each node of the multi-node cluster. These values are utilized to generate a model which correlates the values of the performance counters to the amount of time spent waiting for synchronization. Once the model is built, the values of the performance counters are monitored for a period of time at the start of each task, and these values are input into the model. The model generates a prediction of whether a given node is on the critical path. If the given node is predicted to be on the critical path, the power allocation of the given node is increased.
    Type: Application
    Filed: June 24, 2016
    Publication date: December 28, 2017
    Inventors: Brian J. Kocoloski, Leonardo Piga, Wei Huang, Indrani Paul
  • Publication number: 20170371761
    Abstract: Systems, apparatuses, and methods for performing real-time tracking of performance targets using dynamic compilation. A performance target is specified in a service level agreement. A dynamic compiler analyzes a software application executing in real-time and determine which high-level application metrics to track. The dynamic compiler then inserts instructions into the code to increment counters associated with the metrics. A power optimization unit then utilizes the counters to determine if the system is currently meeting the performance target. If the system is exceeding the performance target, then the power optimization unit reduces the power consumption of the system while still meeting the performance target.
    Type: Application
    Filed: June 24, 2016
    Publication date: December 28, 2017
    Inventors: Leonardo Piga, Brian J. Kocoloski, Wei Huang, Abhinandan Majumdar, Indrani Paul
  • Publication number: 20170366412
    Abstract: Systems, apparatuses, and methods for managing cluster-level performance variability without a centralized controller are described. Each node of a multi-node cluster tracks a maximum and minimum progress across the plurality of nodes for a workload executed by the cluster. Each node also tracks its local progress on its current task. Each node also utilizes a comparison of the local progress to reported maximum and minimum progress across the cluster to identify a critical, or slow, node and whether to increase or reduce an amount of power allocated to the node. The nodes append information about the maximum and minimum progress to messages sent to other nodes to report their knowledge of maximum and minimum progress with other nodes. A node updates its local information if the node receives a message from another node with more up-to-date information about the state of progress across the cluster.
    Type: Application
    Filed: June 15, 2016
    Publication date: December 21, 2017
    Inventor: Leonardo Piga
  • Publication number: 20170279703
    Abstract: Systems, apparatuses, and methods for managing variations among nodes in parallel system frameworks. Sensor and performance data associated with the nodes of a multi-node cluster may be monitored to detect variations among the nodes. A variability metric may be calculated for each node of the cluster based on the sensor and performance data associated with the node. The variability metrics may then be used by a mapper to efficiently map tasks of a parallel application to the nodes of the cluster. In one embodiment, the mapper may assign the critical tasks of the parallel application to the nodes with the lowest variability metrics. In another embodiment, the hardware of the nodes may be reconfigured so as to reduce the node-to-node variability.
    Type: Application
    Filed: March 25, 2016
    Publication date: September 28, 2017
    Inventors: Samuel Lawrence Wasmundt, Leonardo Piga, Indrani Paul, Wei Huang, Manish Arora