Patents by Inventor Manuel Gautho

Manuel Gautho has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11809319
    Abstract: The technology disclosed herein involves tracking contention and using the tracked contention to manage processor cache. The technology can be implemented in a processor's cache controlling logic and can enable the processor to track which locations in main memory are contentious. The technology can use the contentiousness of locations to determine where to store the data in cache and how to allocate and evict cache lines in the cache. In one example, the technology can store the data in a shared cache when the location is contentious and can bypass the shared cache and store the data in the private cache when the location is uncontentious. This may be advantageous because storing the data in shared cache can reduce or avoid having multiple copies in different private caches and can reduce the cache coherency overhead involved to keep copies in the private caches in sync.
    Type: Grant
    Filed: January 20, 2022
    Date of Patent: November 7, 2023
    Assignee: Nvidia Corporation
    Inventors: Anurag Chaudhary, Christopher Richard Feilbach, Jasjit Singh, Manuel Gautho, Aprajith Thirumalai, Shailender Chaudhry
  • Patent number: 11789869
    Abstract: The technology disclosed herein involves tracking contention and using the tracked contention to reduce latency of exclusive memory operations. The technology enables a processor to track which locations in main memory are contentious and to modify the order exclusive memory operations are processed based on the contentiousness. A thread can include multiple exclusive operations for the same memory location (e.g., exclusive load and a complementary exclusive store). The multiple exclusive memory operations can be added to a queue and include one or more intervening operations between them in the queue. The processor may process the operations in the queue based on the order they were added and may use the tracked contention to perform out-of-order processing for some of the exclusive operations. For example, the processor can execute the exclusive load operation and because the corresponding location is contentious can process the complementary exclusive store operation before the intervening operations.
    Type: Grant
    Filed: January 20, 2022
    Date of Patent: October 17, 2023
    Assignee: Nvidia Corporation
    Inventors: Anurag Chaudhary, Christopher Richard Feilbach, Jasjit Singh, Manuel Gautho, Aprajith Thirumalai, Shailender Chaudhry
  • Publication number: 20230244603
    Abstract: The technology disclosed herein involves tracking contention and using the tracked contention to manage processor cache. The technology can be implemented in a processor’s cache controlling logic and can enable the processor to track which locations in main memory are contentious. The technology can use the contentiousness of locations to determine where to store the data in cache and how to allocate and evict cache lines in the cache. In one example, the technology can store the data in a shared cache when the location is contentious and can bypass the shared cache and store the data in the private cache when the location is uncontentious. This may be advantageous because storing the data in shared cache can reduce or avoid having multiple copies in different private caches and can reduce the cache coherency overhead involved to keep copies in the private caches in sync.
    Type: Application
    Filed: January 20, 2022
    Publication date: August 3, 2023
    Inventors: Anurag Chaudhary, Christopher Richard Feilbach, Jasjit Singh, Manuel Gautho, Aprajith Thirumalai, Shailender Chaudhry
  • Publication number: 20230244604
    Abstract: The technology disclosed herein involves tracking contention and using the tracked contention to reduce latency of exclusive memory operations. The technology enables a processor to track which locations in main memory are contentious and to modify the order exclusive memory operations are processed based on the contentiousness. A thread can include multiple exclusive operations for the same memory location (e.g., exclusive load and a complementary exclusive store). The multiple exclusive memory operations can be added to a queue and include one or more intervening operations between them in the queue. The processor may process the operations in the queue based on the order they were added and may use the tracked contention to perform out-of-order processing for some of the exclusive operations. For example, the processor can execute the exclusive load operation and because the corresponding location is contentious can process the complementary exclusive store operation before the intervening operations.
    Type: Application
    Filed: January 20, 2022
    Publication date: August 3, 2023
    Inventors: Anurag Chaudhary, Christopher Richard Feilbach, Jasjit Singh, Manuel Gautho, Aprajith Thirumalai, Shailender Chaudhry
  • Patent number: 11625225
    Abstract: Various embodiments include a modulo operation generator associated with a cache memory in a computer-based system. The modulo operation generator generates a first sum by performing an addition and/or a subtraction function on an input address. A first portion of the first sum is applied to a lookup table that generates a correction value. The correction value is then added to a second portion of the first sum to generate a second sum. The second sum is adjusted, as needed, to be less than the divisor. The adjusted second sum forms a residue value that identifies a cache memory slice in which the input data value corresponding to the input address is stored. By generating the residue value in this manner, the cache memory efficiently distributes input data values among the slices in a cache memory even when the number of slices is not a power of two.
    Type: Grant
    Filed: August 3, 2022
    Date of Patent: April 11, 2023
    Assignee: NVIDIA CORPORATION
    Inventors: Xiaofei Chang, Manuel Gautho
  • Publication number: 20220374207
    Abstract: Various embodiments include a modulo operation generator associated with a cache memory in a computer-based system. The modulo operation generator generates a first sum by performing an addition and/or a subtraction function on an input address. A first portion of the first sum is applied to a lookup table that generates a correction value. The correction value is then added to a second portion of the first sum to generate a second sum. The second sum is adjusted, as needed, to be less than the divisor. The adjusted second sum forms a residue value that identifies a cache memory slice in which the input data value corresponding to the input address is stored. By generating the residue value in this manner, the cache memory efficiently distributes input data values among the slices in a cache memory even when the number of slices is not a power of two.
    Type: Application
    Filed: August 3, 2022
    Publication date: November 24, 2022
    Inventors: Xiaofei CHANG, Manuel GAUTHO
  • Patent number: 11455145
    Abstract: Various embodiments include a modulo operation generator associated with a cache memory in a computer-based system. The modulo operation generator generates a first sum by performing an addition and/or a subtraction function on an input address. A first portion of the first sum is applied to a lookup table that generates a correction value. The correction value is then added to a second portion of the first sum to generate a second sum. The second sum is adjusted, as needed, to be less than the divisor. The adjusted second sum forms a residue value that identifies a cache memory slice in which the input data value corresponding to the input address is stored. By generating the residue value in this manner, the cache memory distributes input data values among the slices in a cache memory even when the number of slices is not a power of two.
    Type: Grant
    Filed: July 28, 2020
    Date of Patent: September 27, 2022
    Assignee: NVIDIA CORPORATION
    Inventors: Xiaofei Chang, Manuel Gautho
  • Publication number: 20220035599
    Abstract: Various embodiments include a modulo operation generator associated with a cache memory in a computer-based system. The modulo operation generator generates a first sum by performing an addition and/or a subtraction function on an input address. A first portion of the first sum is applied to a lookup table that generates a correction value. The correction value is then added to a second portion of the first sum to generate a second sum. The second sum is adjusted, as needed, to be less than the divisor. The adjusted second sum forms a residue value that identifies a cache memory slice in which the input data value corresponding to the input address is stored. By generating the residue value in this manner, the cache memory efficiently distributes input data values among the slices in a cache memory even when the number of slices is not a power of two.
    Type: Application
    Filed: July 28, 2020
    Publication date: February 3, 2022
    Inventors: XIAOFEI CHANG, MANUEL GAUTHO
  • Patent number: 7337339
    Abstract: Power management for a multi-processor chip includes a centralized global power manager that monitors global power for the whole chip, and local power managers. Local power managers manage power for local blocks such as processor cores, caches, and memory controllers. When a local block executes an instruction or accesses memory, an event is generated and looked up in a local power estimate table. A local power estimate for that event is sent to the global power manager, which sums all local power estimates received from all local blocks. An exponential moving average (EMA) is generated and compared to a global power threshold. When global power is over the threshold, local targets are sent to power managers that generate and monitor local power averages that must remain under the local target. The local block is throttled by the local power manager to reduce power when the local target is exceeded.
    Type: Grant
    Filed: September 15, 2005
    Date of Patent: February 26, 2008
    Assignee: Azul Systems, Inc.
    Inventors: Jack H. Choquette, Kevin B. Normoyle, Elias Atmeh, Scott D. Sellers, Murali Sundaresan, Manuel Gautho