Patents by Inventor Anurag Chaudhary

Anurag Chaudhary has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250113287
    Abstract: A transceiver for sending and receiving data packets on a communication channel. The transceiver receives a first request packet including a plurality of information fields having a first type of operation to be performed on a memory device and a first address. The transceiver stores the first type of operation and the first address in a memory associated with the transceiver, and sends, to a target device, the first request packet with the first address. The transceiver then receives a second request packet including a second address in the memory device, and determines, based on the first type of operation, the first address, and the second address, that the second request packet is part of a sequence of request packets to the target device. The transceiver then eliminates, in a header of the second request packet, a portion of the second address to form a third request packet and sends the third request packet to the target device.
    Type: Application
    Filed: September 29, 2023
    Publication date: April 3, 2025
    Inventors: Mark Rosenbluth, Anurag Chaudhary, Harsh Kumar, Guan Wang
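    The following is a minimal sketch, in Python, of the header-compression idea in the abstract above. The 20-bit upper/lower address split, the packet tuples, and the class shape are illustrative assumptions; the abstract does not specify field widths or formats.

        UPPER = 0xFFFF_F000  # assumed split; the real field widths are unspecified

        class EndPoint:
            def __init__(self):
                self.op = None      # stored first type of operation
                self.addr = None    # stored first address

            def compress(self, op, addr):
                # New operation type or new region: send a full header, store it.
                if self.addr is None or op != self.op or (addr ^ self.addr) & UPPER:
                    self.op, self.addr = op, addr
                    return ("full", op, addr)
                # Same sequence: eliminate the upper address bits from the header.
                self.addr = addr
                return ("short", addr & ~UPPER)

            def expand(self, pkt):
                if pkt[0] == "full":
                    _, self.op, self.addr = pkt
                else:
                    self.addr = (self.addr & UPPER) | pkt[1]
                return (self.op, self.addr)

        tx, rx = EndPoint(), EndPoint()
        for a in (0x4000_0000, 0x4000_0040, 0x4000_0080):
            assert rx.expand(tx.compress("write", a)) == ("write", a)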
  • Patent number: 12235765
    Abstract: Various embodiments include techniques for storing data in a repurposed cache memory in a computing system. The disclosed techniques include a system level cache controller that processes a memory operation for a processing unit. The controller and the processing unit communicate over a network-on-chip. To process the memory operation, the controller selects a repurposed cache memory from a pool of active cache memories associated with processing units that are inoperable and/or are in a low-power state. To select the repurposed cache memory, the controller generates a candidate vector that identifies the position of the requesting processing unit relative to the controller. The candidate vector enables the controller to select a repurposed cache memory that is, for example, on the shortest path between the processing unit and the controller. These techniques result in lower latency and improved memory performance relative to conventional techniques.
    Type: Grant
    Filed: May 23, 2023
    Date of Patent: February 25, 2025
    Assignee: NVIDIA Corporation
    Inventors: Ariel Szapiro, Anurag Chaudhary, Mark Rosenbluth, Mayank Baunthiyal
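    The selection step is small enough to sketch. Below, the candidate vector is reduced to a bounding-box test on a 2-D mesh with minimal (Manhattan) routing; those details, and the coordinate tuples, are assumptions for illustration rather than the patented encoding.

        def on_minimal_path(p, src, dst):
            # On a 2-D mesh with minimal routing, a node lies on some shortest
            # src -> dst route iff it falls inside their bounding box.
            return (min(src[0], dst[0]) <= p[0] <= max(src[0], dst[0]) and
                    min(src[1], dst[1]) <= p[1] <= max(src[1], dst[1]))

        def pick_repurposed_cache(requester, controller, pool):
            # Prefer a repurposed cache that adds no hops to the request's path;
            # otherwise fall back to the one nearest the requester.
            zero_cost = [c for c in pool if on_minimal_path(c, requester, controller)]
            manhattan = lambda c: abs(c[0] - requester[0]) + abs(c[1] - requester[1])
            return min(zero_cost or pool, key=manhattan)

        # e.g. caches of powered-down cores at (1, 1) and (3, 0)
        print(pick_repurposed_cache((0, 0), (2, 2), [(1, 1), (3, 0)]))  # -> (1, 1)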
  • Publication number: 20250061078
    Abstract: In various examples, when a bridge of a chip has received an eviction request from a client of the chip, the bridge may transmit a read request that corresponds to the same cache line to another chip without waiting for an inter-chip completion response for the eviction request. When the read request is received, the bridge may determine whether the eviction request has already been sent to the other chip and, based at least on the eviction request having been sent, transmit the read request using an ordered communication network to ensure the communications are received and/or processed by the other chip in an order that maintains memory coherency. Additionally, the chips may process read unique requests without using an inter-chip completion acknowledgement and may process copy back requests by transmitting corresponding copy back write data with the copy back requests.
    Type: Application
    Filed: August 15, 2023
    Publication date: February 20, 2025
    Inventors: Anurag Chaudhary, Guan Wang, Harsh Kumar
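    A toy model of the ordering idea, with invented channel and message shapes: because an ordered channel delivers messages to the peer chip in enqueue order, a read to a line whose eviction is still in flight can simply trail the eviction instead of waiting for its inter-chip completion response.

        from collections import deque

        class Bridge:
            def __init__(self):
                self.ordered = deque()      # delivered strictly in enqueue order
                self.unordered = deque()    # free to take any route
                self.evictions_in_flight = set()

            def evict(self, line):
                # Sent immediately; no wait for a completion response.
                self.evictions_in_flight.add(line)
                self.ordered.append(("evict", line))

            def read(self, line):
                if line in self.evictions_in_flight:
                    # Same ordered channel as the eviction preserves coherency.
                    self.ordered.append(("read", line))
                else:
                    self.unordered.append(("read", line))

            def eviction_completed(self, line):
                self.evictions_in_flight.discard(line)

        b = Bridge()
        b.evict(0x80)
        b.read(0x80)            # safe to send before the eviction completes
        print(list(b.ordered))  # [('evict', 128), ('read', 128)]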
  • Publication number: 20240394186
    Abstract: Various embodiments include techniques for storing data in a repurposed cache memory in a computing system. The disclosed techniques include a system level cache controller that processes a memory operation for a processing unit. The controller and the processing unit communicate over a network-on-chip. To process the memory operation, the controller selects a repurposed cache memory from a pool of active cache memories associated with processing units that are inoperable and/or are in a low-power state. To select the repurposed cache memory, the controller generates a candidate vector that identifies the position of the requesting processing unit relative to the controller. The candidate vector enables the controller to select a repurposed cache memory that is, for example, on the shortest path between the processing unit and the controller. These techniques result in lower latency and improved memory performance relative to conventional techniques.
    Type: Application
    Filed: May 23, 2023
    Publication date: November 28, 2024
    Inventors: Ariel Szapiro, Anurag Chaudhary, Mark Rosenbluth, Mayank Baunthiyal
  • Publication number: 20240296130
    Abstract: Various embodiments include a network for transmitting data words from a source node to a destination node. The source node optionally inverts the logic levels of each data word so that the number of logic ‘1’ bits in each data word is less than or equal to half of the data bits. The destination node recovers the original data words by passing the data words not inverted by the source node and inverting the data words that were inverted by the source node. As the packet is transmitted through the network, each node encodes and/or decodes the data words by generating an output transition for each logic ‘1’ bit of the input data word. Because no more than half the bits of the input data word are logic ‘1’ bits, the node generates output transitions for no more than one half of the data bits.
    Type: Application
    Filed: March 2, 2023
    Publication date: September 5, 2024
    Inventors: Anurag Chaudhary, Scott Matthew Pitkethly, Peter Lindsay Gentle
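    The core decision is a bus-inversion rule and fits in a few lines of Python. The 8-bit word width and the explicit invert flag are illustrative assumptions:

        WIDTH = 8
        MASK = (1 << WIDTH) - 1

        def encode(word):
            # Invert when more than half the bits are 1, so the encoded word
            # never carries more than WIDTH // 2 ones; the flag records inversion.
            if bin(word & MASK).count("1") > WIDTH // 2:
                return word ^ MASK, 1
            return word, 0

        def decode(word, inverted):
            # The destination passes words through or re-inverts them.
            return word ^ MASK if inverted else word

        # Transition encoding then toggles one wire per '1' bit, so no more
        # than half of the data wires switch for any encoded word.
        w, inv = encode(0b1110_1101)          # six ones -> inverted
        assert decode(w, inv) == 0b1110_1101 and bin(w).count("1") <= WIDTH // 2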
  • Patent number: 12072815
    Abstract: Various embodiments include a network for transmitting data words from a source node to a destination node. The source node optionally inverts the logic levels of each data word so that the number of logic ‘1’ bits in each data word is less than or equal to half of the data bits. The destination node recovers the original data words by passing the data words not inverted by the source node and inverting the data words that were inverted by the source node. As the packet is transmitted through the network, each node encodes and/or decodes the data words by generating an output transition for each logic ‘1’ bit of the input data word. Because no more than half the bits of the input data word are logic ‘1’ bits, the node generates output transitions for no more than one half of the data bits.
    Type: Grant
    Filed: March 2, 2023
    Date of Patent: August 27, 2024
    Assignee: NVIDIA Corporation
    Inventors: Anurag Chaudhary, Scott Matthew Pitkethly, Peter Lindsay Gentle
  • Patent number: 11809319
    Abstract: The technology disclosed herein involves tracking contention and using the tracked contention to manage processor cache. The technology can be implemented in a processor's cache controlling logic and can enable the processor to track which locations in main memory are contentious. The technology can use the contentiousness of locations to determine where to store the data in cache and how to allocate and evict cache lines in the cache. In one example, the technology can store the data in a shared cache when the location is contentious and can bypass the shared cache and store the data in the private cache when the location is uncontentious. This may be advantageous because storing the data in the shared cache can reduce or avoid having multiple copies in different private caches and can reduce the cache coherency overhead involved in keeping copies in the private caches in sync.
    Type: Grant
    Filed: January 20, 2022
    Date of Patent: November 7, 2023
    Assignee: NVIDIA Corporation
    Inventors: Anurag Chaudhary, Christopher Richard Feilbach, Jasjit Singh, Manuel Gautho, Aprajith Thirumalai, Shailender Chaudhry
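    A sketch of the placement policy. Tracking contention with a per-line event counter and a fixed threshold is an assumption for illustration; the abstract only says the cache controlling logic tracks which locations are contentious.

        class ContentionTracker:
            def __init__(self, threshold=4):
                self.events = {}        # line address -> conflict count
                self.threshold = threshold

            def record_conflict(self, line):
                # e.g. another core touched a line this core was using
                self.events[line] = self.events.get(line, 0) + 1

            def contentious(self, line):
                return self.events.get(line, 0) >= self.threshold

        def place_fill(line, data, tracker, shared_cache, private_cache):
            if tracker.contentious(line):
                shared_cache[line] = data    # one shared copy, less coherency traffic
            else:
                private_cache[line] = data   # uncontended: bypass the shared cache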
  • Patent number: 11789869
    Abstract: The technology disclosed herein involves tracking contention and using the tracked contention to reduce the latency of exclusive memory operations. The technology enables a processor to track which locations in main memory are contentious and to modify the order in which exclusive memory operations are processed based on that contentiousness. A thread can include multiple exclusive operations for the same memory location (e.g., an exclusive load and a complementary exclusive store). The multiple exclusive memory operations can be added to a queue and include one or more intervening operations between them in the queue. The processor may process the operations in the queue in the order they were added and may use the tracked contention to perform out-of-order processing for some of the exclusive operations. For example, the processor can execute the exclusive load operation and, because the corresponding location is contentious, can process the complementary exclusive store operation before the intervening operations.
    Type: Grant
    Filed: January 20, 2022
    Date of Patent: October 17, 2023
    Assignee: NVIDIA Corporation
    Inventors: Anurag Chaudhary, Christopher Richard Feilbach, Jasjit Singh, Manuel Gautho, Aprajith Thirumalai, Shailender Chaudhry
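    A toy scheduler showing the reordering described above; the opcode names ("ldx", "stx") and tuple shapes are invented for the example.

        def reorder(queue, contentious):
            # After an exclusive load ("ldx") to a contentious line, hoist the
            # complementary exclusive store ("stx") ahead of the intervening
            # operations so the line is held exclusively as briefly as possible.
            out, pending = [], list(queue)
            while pending:
                op = pending.pop(0)
                out.append(op)
                if op[0] == "ldx" and contentious(op[1]):
                    for j, later in enumerate(pending):
                        if later == ("stx", op[1]):
                            out.append(pending.pop(j))  # runs before intervening ops
                            break
            return out

        ops = [("ldx", 0x100), ("add", None), ("mul", None), ("stx", 0x100)]
        print(reorder(ops, lambda line: line == 0x100))
        # [('ldx', 256), ('stx', 256), ('add', None), ('mul', None)]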
  • Publication number: 20230244604
    Abstract: The technology disclosed herein involves tracking contention and using the tracked contention to reduce the latency of exclusive memory operations. The technology enables a processor to track which locations in main memory are contentious and to modify the order in which exclusive memory operations are processed based on that contentiousness. A thread can include multiple exclusive operations for the same memory location (e.g., an exclusive load and a complementary exclusive store). The multiple exclusive memory operations can be added to a queue and include one or more intervening operations between them in the queue. The processor may process the operations in the queue in the order they were added and may use the tracked contention to perform out-of-order processing for some of the exclusive operations. For example, the processor can execute the exclusive load operation and, because the corresponding location is contentious, can process the complementary exclusive store operation before the intervening operations.
    Type: Application
    Filed: January 20, 2022
    Publication date: August 3, 2023
    Inventors: Anurag Chaudhary, Christopher Richard Feilbach, Jasjit Singh, Manuel Gautho, Aprajith Thirumalai, Shailender Chaudhry
  • Publication number: 20230244603
    Abstract: The technology disclosed herein involves tracking contention and using the tracked contention to manage processor cache. The technology can be implemented in a processor's cache controlling logic and can enable the processor to track which locations in main memory are contentious. The technology can use the contentiousness of locations to determine where to store the data in cache and how to allocate and evict cache lines in the cache. In one example, the technology can store the data in a shared cache when the location is contentious and can bypass the shared cache and store the data in the private cache when the location is uncontentious. This may be advantageous because storing the data in the shared cache can reduce or avoid having multiple copies in different private caches and can reduce the cache coherency overhead involved in keeping copies in the private caches in sync.
    Type: Application
    Filed: January 20, 2022
    Publication date: August 3, 2023
    Inventors: Anurag Chaudhary, Christopher Richard Feilbach, Jasjit Singh, Manuel Gautho, Aprajith Thirumalai, Shailender Chaudhry
  • Patent number: 9824009
    Abstract: Systems and methods for coherency maintenance are presented. The systems and methods include utilization of multiple information state tracking approaches or protocols at different memory or storage levels. In one embodiment, a first coherency maintenance approach (e.g., similar to a MESI protocol, etc.) can be implemented at one storage level while a second coherency maintenance approach (e.g., similar to a MOESI protocol, etc.) can be implemented at another storage level. Information at a particular storage level or tier can be tracked by a set of local state indications and a set of essence state indications. The essence state indication can be tracked “externally” from a storage layer or tier directory (e.g., in a directory of another cache level, in a hub between cache levels, etc.). One storage level can control operations based upon the local state indications and another storage level can control operations based at least in part upon an essence state indication.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: November 21, 2017
    Assignee: NVIDIA Corporation
    Inventors: Anurag Chaudhary, Guillermo Juan Rozas
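    A sketch of the split-tracking idea, under heavy assumptions: each level keeps its own MESI-like local state (not modeled here), while a hub between levels tracks a coarser per-line "essence" state. The two-value essence encoding below is an invented stand-in for the real indications.

        class HubDirectory:
            def __init__(self):
                self.essence = {}   # line -> "clean" | "dirty-somewhere"

            def note_modified(self, line):
                # Called when any level's local state becomes M (Modified).
                self.essence[line] = "dirty-somewhere"

            def needs_writeback(self, line):
                # A level that only tracks MESI locally can still make a
                # MOESI-style owned/dirty decision from the external state.
                return self.essence.get(line, "clean") == "dirty-somewhere"

        hub = HubDirectory()
        hub.note_modified(0x40)
        print(hub.needs_writeback(0x40), hub.needs_writeback(0x80))  # True False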
  • Patent number: 9639471
    Abstract: Attributes of access requests can be used to distinguish one set of access requests from another set of access requests. The prefetcher can determine a pattern for each set of access requests and then prefetch cache lines accordingly. In an embodiment in which there are multiple caches, a prefetcher can determine a destination for prefetched cache lines associated with a respective set of access requests. For example, the prefetcher can prefetch one set of cache lines into one cache, and another set of cache lines into another cache. Also, the prefetcher can determine a prefetch distance for each set of access requests. For example, the prefetch distances for the sets of access requests can be different.
    Type: Grant
    Filed: November 27, 2012
    Date of Patent: May 2, 2017
    Assignee: NVIDIA Corporation
    Inventor: Anurag Chaudhary
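    A sketch of per-stream training: requests are split by an attribute (a requester id here; the abstract leaves the attribute open), and each stream keeps its own stride, prefetch distance, and destination cache. All default values are invented.

        class StreamPrefetcher:
            def __init__(self):
                self.streams = {}   # attribute -> per-stream training state

            def access(self, attr, addr):
                s = self.streams.setdefault(
                    attr, {"last": None, "stride": 0, "distance": 2, "dest": "L2"})
                if s["last"] is not None:
                    s["stride"] = addr - s["last"]   # naive single-delta pattern
                s["last"] = addr
                if not s["stride"]:
                    return []
                # Each stream prefetches to its own destination, at its own distance.
                return [(addr + k * s["stride"], s["dest"])
                        for k in range(1, s["distance"] + 1)]

        p = StreamPrefetcher()
        p.access("cpu0", 0x1000)
        print(p.access("cpu0", 0x1040))  # [(4224, 'L2'), (4288, 'L2')]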
  • Patent number: 9563562
    Abstract: Prefetching is permitted to cross from one physical memory page to another. More specifically, if a stream of access requests contains virtual addresses that map to more than one physical memory page, then prefetching can continue from a first physical memory page to a second physical memory page. The prefetching advantageously continues to the second physical memory page based on the confidence level and prefetch distance established while the first physical memory page was the target of the access requests.
    Type: Grant
    Filed: November 27, 2012
    Date of Patent: February 7, 2017
    Assignee: NVIDIA Corporation
    Inventors: Joseph Rowlands, Anurag Chaudhary
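    A sketch of the page-crossing behavior: prefetch targets are generated past the physical page boundary, reusing the confidence and distance trained on the first page instead of retraining. The translate callback, page size, and confidence threshold are illustrative assumptions.

        PAGE = 4096

        def prefetch_across_pages(vaddr, stride, distance, confidence,
                                  translate, threshold=2):
            if confidence < threshold:
                return []
            targets = []
            for k in range(1, distance + 1):
                va = vaddr + k * stride
                # May land on a new physical page; nothing is reset here.
                targets.append(translate(va // PAGE) * PAGE + va % PAGE)
            return targets

        page_table = lambda vpn: vpn + 7    # toy mapping: vpn -> pfn
        print(prefetch_across_pages(0xFC0, 0x40, 3, confidence=3,
                                    translate=page_table))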
  • Patent number: 9367467
    Abstract: A system and method for managing cache replacements and a memory subsystem incorporating the system or the method. In one embodiment, the system includes: (1) a cache controller operable to control a cache and, in order: (1a) issue a pre-fetch command when the cache has a cache miss, (1b) perform at least one housekeeping task to ensure that the cache can store a replacement line and (1c) issue a fetch command and (2) a memory controller associated with a memory of a lower level than the cache and operable to respond to the pre-fetch command by performing at least one housekeeping task to ensure that the memory can provide the replacement line and respond to the fetch command by providing the replacement line.
    Type: Grant
    Filed: August 22, 2014
    Date of Patent: June 14, 2016
    Assignee: NVIDIA Corporation
    Inventors: Anurag Chaudhary, Guillermo Rozas
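    The command ordering in the abstract is easy to mirror in code. The DRAM-row activation shown is one plausible example of the memory-side housekeeping; method names and the 64-byte line are assumptions.

        class MemoryController:
            def pre_fetch(self, line):
                # Memory-side housekeeping, started early: e.g. activate the
                # DRAM row holding `line` so the later fetch hits an open row.
                print(f"mem: preparing {line:#x}")

            def fetch(self, line):
                print(f"mem: delivering {line:#x}")
                return bytes(64)

        class CacheController:
            def __init__(self, mem):
                self.mem = mem

            def on_miss(self, line):
                self.mem.pre_fetch(line)     # (1a) issue pre-fetch on the miss
                self.make_room_for(line)     # (1b) ensure a replacement slot
                return self.mem.fetch(line)  # (1c) issue the fetch; (1a) and
                                             # (1b) have overlapped in time

            def make_room_for(self, line):
                print(f"cache: writing back a victim for {line:#x}")

        CacheController(MemoryController()).on_miss(0x2A40)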
  • Publication number: 20160055087
    Abstract: A system and method for managing cache replacements and a memory subsystem incorporating the system or the method. In one embodiment, the system includes: (1) a cache controller operable to control a cache and, in order: (1a) issue a pre-fetch command when the cache has a cache miss, (1b) perform at least one housekeeping task to ensure that the cache can store a replacement line and (1c) issue a fetch command and (2) a memory controller associated with a memory of a lower level than the cache and operable to respond to the pre-fetch command by performing at least one housekeeping task to ensure that the memory can provide the replacement line and respond to the fetch command by providing the replacement line.
    Type: Application
    Filed: August 22, 2014
    Publication date: February 25, 2016
    Inventors: Anurag Chaudhary, Guillermo Rozas
  • Patent number: 9262328
    Abstract: Cache hit information is used to manage (e.g., cap) the prefetch distance for a cache. In an embodiment in which there is a first cache and a second cache, where the second cache (e.g., a level two cache) has greater latency than the first cache (e.g., a level one cache), a prefetcher prefetches cache lines to the second cache and is configured to receive feedback from that cache. The feedback indicates whether an access request issued in response to a cache miss in the first cache results in a cache hit in the second cache. The prefetch distance for the second cache is determined according to the feedback.
    Type: Grant
    Filed: November 27, 2012
    Date of Patent: February 16, 2016
    Assignee: NVIDIA Corporation
    Inventor: Anurag Chaudhary
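    A sketch of distance management from hit feedback: a demand request that missed the first cache but hit the second means prefetches are arriving in time, so the distance is capped or trimmed; a second-level miss means the prefetcher is lagging, so the distance grows. Step sizes and bounds are invented.

        class DistanceGovernor:
            def __init__(self, lo=2, hi=16):
                self.distance, self.lo, self.hi = lo, lo, hi

            def l2_feedback(self, hit):
                # `hit`: did the L1-miss demand request hit in L2?
                if hit:
                    self.distance = max(self.lo, self.distance - 1)  # cap/trim
                else:
                    self.distance = min(self.hi, self.distance + 1)  # stretch
                return self.distance

        g = DistanceGovernor()
        print([g.l2_feedback(h) for h in (False, False, True, True)])  # [3, 4, 3, 2]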
  • Publication number: 20140181404
    Abstract: Systems and methods for coherency maintenance are presented. The systems and methods include utilization of multiple information state tracking approaches or protocols at different memory or storage levels. In one embodiment, a first coherency maintenance approach (e.g., similar to a MESI protocol, etc.) can be implemented at one storage level while a second coherency maintenance approach (e.g., similar to a MOESI protocol, etc.) can be implemented at another storage level. Information at a particular storage level or tier can be tracked by a set of local state indications and a set of essence state indications. The essence state indication can be tracked “externally” from a storage layer or tier directory (e.g., in a directory of another cache level, in a hub between cache levels, etc.). One storage level can control operations based upon the local state indications and another storage level can control operations based at least in part upon an essence state indication.
    Type: Application
    Filed: December 21, 2012
    Publication date: June 26, 2014
    Applicant: NVIDIA Corporation
    Inventors: Anurag Chaudhary, Guillermo Juan Rozas
  • Publication number: 20140149668
    Abstract: Attributes of access requests can be used to distinguish one set of access requests from another set of access requests. The prefetcher can determine a pattern for each set of access requests and then prefetch cache lines accordingly. In an embodiment in which there are multiple caches, a prefetcher can determine a destination for prefetched cache lines associated with a respective set of access requests. For example, the prefetcher can prefetch one set of cache lines into one cache, and another set of cache lines into another cache. Also, the prefetcher can determine a prefetch distance for each set of access requests. For example, the prefetch distances for the sets of access requests can be different.
    Type: Application
    Filed: November 27, 2012
    Publication date: May 29, 2014
    Applicant: NVIDIA Corporation
    Inventor: Anurag Chaudhary
  • Publication number: 20140149678
    Abstract: Cache hit information is used to manage (e.g., cap) the prefetch distance for a cache. In an embodiment in which there is a first cache and a second cache, where the second cache (e.g., a level two cache) has greater latency than the first cache (e.g., a level one cache), a prefetcher prefetches cache lines to the second cache and is configured to receive feedback from that cache. The feedback indicates whether an access request issued in response to a cache miss in the first cache results in a cache hit in the second cache. The prefetch distance for the second cache is determined according to the feedback.
    Type: Application
    Filed: November 27, 2012
    Publication date: May 29, 2014
    Applicant: NVIDIA Corporation
    Inventor: Anurag Chaudhary
  • Publication number: 20140149679
    Abstract: Prefetching is permitted to cross from one physical memory page to another. More specifically, if a stream of access requests contains virtual addresses that map to more than one physical memory page, then prefetching can continue from a first physical memory page to a second physical memory page. The prefetching advantageously continues to the second physical memory page based on the confidence level and prefetch distance established while the first physical memory page was the target of the access requests.
    Type: Application
    Filed: November 27, 2012
    Publication date: May 29, 2014
    Applicant: NVIDIA Corporation
    Inventors: Joseph Rowlands, Anurag Chaudhary