Patents by Inventor Gurushankar Rajamani

Gurushankar Rajamani has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents that have been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11934826
    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for performing vector reductions using a shared scratchpad memory of a hardware circuit having processor cores that communicate with the shared memory. For each of the processor cores, a respective vector of values is generated based on computations performed at the processor core. The shared memory receives the respective vectors of values from respective resources of the processor cores using a direct memory access (DMA) data path of the shared memory. The shared memory performs an accumulation operation on the respective vectors of values using an operator unit coupled to the shared memory. The operator unit is configured to accumulate values based on arithmetic operations encoded at the operator unit. A result vector is generated based on performing the accumulation operation using the respective vectors of values.
    Type: Grant
    Filed: November 19, 2021
    Date of Patent: March 19, 2024
    Assignee: Google LLC
    Inventors: Thomas Norrie, Gurushankar Rajamani, Andrew Everett Phelps, Matthew Leever Hedlund, Norman Paul Jouppi
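
US 11934826 above describes a hardware reduction: each processor core produces a vector, the vectors reach a shared scratchpad memory over its DMA data path, and an operator unit folds them into a single result vector using an encoded arithmetic operation. The Python sketch below models only that data flow in software; the class names (SharedScratchpad, OperatorUnit) and the choice of addition as the encoded operation are illustrative assumptions, not details of the patented circuit.

```python
import operator
from typing import Callable, List

class OperatorUnit:
    """Accumulates vectors element-wise using an encoded arithmetic operation."""
    def __init__(self, op: Callable[[float, float], float] = operator.add):
        self.op = op  # e.g. add for a sum reduction, max for a max reduction

    def accumulate(self, acc: List[float], vec: List[float]) -> List[float]:
        return [self.op(a, v) for a, v in zip(acc, vec)]

class SharedScratchpad:
    """Software stand-in for the shared scratchpad memory with a DMA data path."""
    def __init__(self, width: int, operator_unit: OperatorUnit):
        self.result = [0.0] * width
        self.operator_unit = operator_unit

    def dma_write(self, vector: List[float]) -> None:
        # Each core's vector arrives over the DMA path and is folded into the result.
        self.result = self.operator_unit.accumulate(self.result, vector)

# Example: four cores each contribute a vector; the shared memory produces the
# element-wise sum as the result vector.
scratchpad = SharedScratchpad(width=4, operator_unit=OperatorUnit(operator.add))
for core_vector in [[1, 2, 3, 4], [10, 20, 30, 40], [5, 5, 5, 5], [0, 1, 0, 1]]:
    scratchpad.dma_write([float(x) for x in core_vector])
print(scratchpad.result)  # [16.0, 28.0, 38.0, 50.0]
```
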
  • Patent number: 11928580
    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for interleaving memory requests to accelerate memory accesses at a hardware circuit configured to implement a neural network model. A system generates multiple requests that are processed against a memory of the system. Each request is used to retrieve data from the memory. For each request, the system generates multiple sub-requests based on a respective size of the data to be retrieved using the request. The system generates a sequence of interleaved sub-requests that includes respective sub-requests of a first request interleaved among respective sub-requests of a second request. Based on the sequence of interleaved sub-requests, a module of the system receives respective portions of data accessed from different address locations of the memory. The system processes each of the respective portions of data to generate a neural network inference using the neural network model implemented at the hardware circuit.
    Type: Grant
    Filed: April 4, 2022
    Date of Patent: March 12, 2024
    Assignee: Google LLC
    Inventors: Gurushankar Rajamani, Alice Kuo
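
US 11928580 splits each memory request into sub-requests sized to the data being retrieved and interleaves the sub-requests of different requests, so data returns from different address locations in alternation. A minimal sketch of that interleaving order follows; the chunk size and function names are assumptions made for illustration, not details from the patent.

```python
from itertools import chain, zip_longest
from typing import List, Tuple

# A sub-request is (request_id, offset, size).
SubRequest = Tuple[int, int, int]

def split_into_subrequests(request_id: int, total_size: int, chunk: int) -> List[SubRequest]:
    """Split one request into fixed-size sub-requests covering its data."""
    return [(request_id, offset, min(chunk, total_size - offset))
            for offset in range(0, total_size, chunk)]

def interleave(a: List[SubRequest], b: List[SubRequest]) -> List[SubRequest]:
    """Alternate sub-requests of two requests, keeping any leftover tail."""
    merged = chain.from_iterable(zip_longest(a, b))
    return [s for s in merged if s is not None]

# Request 0 wants 256 bytes, request 1 wants 128 bytes, in 64-byte chunks.
seq = interleave(split_into_subrequests(0, 256, 64),
                 split_into_subrequests(1, 128, 64))
for req_id, offset, size in seq:
    print(f"issue sub-request: request={req_id} offset={offset} size={size}")
```
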
  • Patent number: 11748028
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises: obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of a plurality of memory devices; initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device; determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication; and, in response, storing, in memory accessible to the memory controller, data corresponding to the requests that have not yet been fully processed.
    Type: Grant
    Filed: November 23, 2022
    Date of Patent: September 5, 2023
    Assignee: Google LLC
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
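
US 11748028 and its related filings below describe a memory controller that interleaves two requests against one memory device and, on an indication to switch to a second device, stores the still-unfinished requests in memory accessible to the controller so they can be resumed later. The sketch below is a software analogy of that bookkeeping under one reading of the abstract; the class and method names are illustrative, not from the patent.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Request:
    request_id: int
    device: int
    remaining_chunks: int  # chunks of data still to be transferred

@dataclass
class MemoryController:
    active: deque = field(default_factory=deque)   # requests being interleaved
    parked: list = field(default_factory=list)     # unfinished requests saved on a switch

    def submit(self, request: Request) -> None:
        self.active.append(request)

    def step(self) -> None:
        """Process one chunk of the request at the head of the queue, round-robin."""
        if self.active:
            req = self.active.popleft()
            req.remaining_chunks -= 1
            if req.remaining_chunks > 0:
                self.active.append(req)  # re-queue: this is the interleaving

    def switch_device(self, new_device: int) -> None:
        """On an indication to switch devices, store unfinished requests for later."""
        self.parked.extend(r for r in self.active if r.remaining_chunks > 0)
        self.active.clear()
        # Requests targeting new_device would be served next; this sketch only
        # shows the parking step.

controller = MemoryController()
controller.submit(Request(request_id=1, device=0, remaining_chunks=4))
controller.submit(Request(request_id=2, device=0, remaining_chunks=4))
controller.step(); controller.step()          # partially process both, interleaved
controller.switch_device(new_device=1)        # indication arrives mid-flight
print([r.request_id for r in controller.parked])  # [1, 2] still unfinished, saved
```
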
  • Publication number: 20230062889
    Abstract: An application-specific integrated circuit (ASIC) is provided for reliable transport of packets. A network interface card may include a reliable transport accelerator (RTA). The RTA may include a cache lookup database. The RTA may be configured to determine, from a received data packet, a connection identifier and query the cache lookup database for a cache entry corresponding to a connection context having the connection identifier. In response to the query, the RTA may receive a cache hit or a cache miss.
    Type: Application
    Filed: December 16, 2021
    Publication date: March 2, 2023
    Inventors: Weihuang Wang, Srinivas Vaduvatha, Xiaoming Wang, Gurushankar Rajamani, Abhishek Agarwal, Jiazhen Zheng, Prashant Chandra
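
Publication 20230062889 describes a reliable transport accelerator that extracts a connection identifier from each received packet and queries a cache lookup database for the matching connection context, receiving either a cache hit or a cache miss. The sketch below models only that lookup path; the ConnectionContext fields and class names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class ConnectionContext:
    connection_id: int
    next_expected_seq: int  # illustrative per-connection reliable-transport state

class CacheLookupDatabase:
    """Maps connection identifiers to cached connection contexts."""
    def __init__(self) -> None:
        self._entries: Dict[int, ConnectionContext] = {}

    def insert(self, ctx: ConnectionContext) -> None:
        self._entries[ctx.connection_id] = ctx

    def query(self, connection_id: int) -> Optional[ConnectionContext]:
        return self._entries.get(connection_id)  # None models a cache miss

def handle_packet(db: CacheLookupDatabase, packet: Dict[str, int]) -> str:
    connection_id = packet["connection_id"]  # determined from the received packet
    ctx = db.query(connection_id)
    if ctx is None:
        return f"cache miss for connection {connection_id}"
    return f"cache hit: connection {connection_id}, next seq {ctx.next_expected_seq}"

db = CacheLookupDatabase()
db.insert(ConnectionContext(connection_id=7, next_expected_seq=42))
print(handle_packet(db, {"connection_id": 7}))   # hit
print(handle_packet(db, {"connection_id": 9}))   # miss
```
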
  • Patent number: 11513724
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises: obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of a plurality of memory devices; initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device; determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication; and, in response, storing, in memory accessible to the memory controller, data corresponding to the requests that have not yet been fully processed.
    Type: Grant
    Filed: June 15, 2021
    Date of Patent: November 29, 2022
    Assignee: Google LLC
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
  • Publication number: 20220374692
    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for interleaving memory requests to accelerate memory accesses at a hardware circuit configured to implement a neural network model. A system generates multiple requests that are processed against a memory of the system. Each request is used to retrieve data from the memory. For each request, the system generates multiple sub-requests based on a respective size of the data to be retrieved using the request. The system generates a sequence of interleaved sub-requests that includes respective sub-requests of a first request interleaved among respective sub-requests of a second request. Based on the sequence of interleaved sub-requests, a module of the system receives respective portions of data accessed from different address locations of the memory. The system processes each of the respective portions of data to generate a neural network inference using the neural network model implemented at the hardware circuit.
    Type: Application
    Filed: April 4, 2022
    Publication date: November 24, 2022
    Inventors: Gurushankar Rajamani, Alice Kuo
  • Publication number: 20220156071
    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for performing vector reductions using a shared scratchpad memory of a hardware circuit having processor cores that communicate with the shared memory. For each of the processor cores, a respective vector of values is generated based on computations performed at the processor core. The shared memory receives the respective vectors of values from respective resources of the processor cores using a direct memory access (DMA) data path of the shared memory. The shared memory performs an accumulation operation on the respective vectors of values using an operator unit coupled to the shared memory. The operator unit is configured to accumulate values based on arithmetic operations encoded at the operator unit. A result vector is generated based on performing the accumulation operation using the respective vectors of values.
    Type: Application
    Filed: November 19, 2021
    Publication date: May 19, 2022
    Inventors: Thomas Norrie, Gurushankar Rajamani, Andrew Everett Phelps, Matthew Leever Hedlund, Norman Paul Jouppi
  • Patent number: 11295206
    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for interleaving memory requests to accelerate memory accesses at a hardware circuit configured to implement a neural network model. A system generates multiple requests that are processed against a memory of the system. Each request is used to retrieve data from the memory. For each request, the system generates multiple sub-requests based on a respective size of the data to be retrieved using the request. The system generates a sequence of interleaved sub-requests that includes respective sub-requests of a first request interleaved among respective sub-requests of a second request. Based on the sequence of interleaved sub-requests, a module of the system receives respective portions of data accessed from different address locations of the memory. The system processes each of the respective portions of data to generate a neural network inference using the neural network model implemented at the hardware circuit.
    Type: Grant
    Filed: May 15, 2020
    Date of Patent: April 5, 2022
    Assignee: Google LLC
    Inventors: Gurushankar Rajamani, Alice Kuo
  • Patent number: 11182159
    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for performing vector reductions using a shared scratchpad memory of a hardware circuit having processor cores that communicate with the shared memory. For each of the processor cores, a respective vector of values is generated based on computations performed at the processor core. The shared memory receives the respective vectors of values from respective resources of the processor cores using a direct memory access (DMA) data path of the shared memory. The shared memory performs an accumulation operation on the respective vectors of values using an operator unit coupled to the shared memory. The operator unit is configured to accumulate values based on arithmetic operations encoded at the operator unit. A result vector is generated based on performing the accumulation operation using the respective vectors of values.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: November 23, 2021
    Assignee: Google LLC
    Inventors: Thomas Norrie, Gurushankar Rajamani, Andrew Everett Phelps, Matthew Leever Hedlund, Norman Paul Jouppi
  • Publication number: 20210311658
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises: obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of a plurality of memory devices; initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device; determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication; and, in response, storing, in memory accessible to the memory controller, data corresponding to the requests that have not yet been fully processed.
    Type: Application
    Filed: June 15, 2021
    Publication date: October 7, 2021
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
  • Patent number: 11137936
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises: obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of a plurality of memory devices; initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device; determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication; and, in response, storing, in memory accessible to the memory controller, data corresponding to the requests that have not yet been fully processed.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: October 5, 2021
    Assignee: Google LLC
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
  • Publication number: 20210263739
    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for performing vector reductions using a shared scratchpad memory of a hardware circuit having processor cores that communicate with the shared memory. For each of the processor cores, a respective vector of values is generated based on computations performed at the processor core. The shared memory receives the respective vectors of values from respective resources of the processor cores using a direct memory access (DMA) data path of the shared memory. The shared memory performs an accumulation operation on the respective vectors of values using an operator unit coupled to the shared memory. The operator unit is configured to accumulate values based on arithmetic operations encoded at the operator unit. A result vector is generated based on performing the accumulation operation using the respective vectors of values.
    Type: Application
    Filed: August 31, 2020
    Publication date: August 26, 2021
    Inventors: Thomas Norrie, Gurushankar Rajamani, Andrew Everett Phelps, Matthew Leever Hedlund, Norman Paul Jouppi
  • Publication number: 20210248451
    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for interleaving memory requests to accelerate memory accesses at a hardware circuit configured to implement a neural network model. A system generates multiple requests that are processed against a memory of the system. Each request is used to retrieve data from the memory. For each request, the system generates multiple sub-requests based on a respective size of the data to be retrieved using the request. The system generates a sequence of interleaved sub-requests that includes respective sub-requests of a first request interleaved among respective sub-requests of a second request. Based on the sequence of interleaved sub-requests, a module of the system receives respective portions of data accessed from different address locations of the memory. The system processes each of the respective portions of data to generate a neural network inference using the neural network model implemented at the hardware circuit.
    Type: Application
    Filed: May 15, 2020
    Publication date: August 12, 2021
    Inventors: Gurushankar Rajamani, Alice Kuo
  • Publication number: 20210223985
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing data on a memory controller. One of the methods comprises: obtaining a first request and a second request to access respective data corresponding to the first and second requests at a first memory device of a plurality of memory devices; initiating interleaved processing of the respective data; receiving an indication to stop processing requests to access data at the first memory device and to initiate processing requests to access data at a second memory device; determining that the respective data corresponding to the first and second requests have not yet been fully processed at the time of receiving the indication; and, in response, storing, in memory accessible to the memory controller, data corresponding to the requests that have not yet been fully processed.
    Type: Application
    Filed: July 15, 2020
    Publication date: July 22, 2021
    Inventors: Amin Farmahini, Benjamin Steel Gelb, Gurushankar Rajamani, Sukalpa Biswas
  • Patent number: 9594713
    Abstract: Bridging strongly ordered write transactions to devices in weakly ordered domains, and related apparatuses, methods, and computer-readable media are disclosed. In one aspect, a host bridge device is configured to receive strongly ordered write transactions from one or more strongly ordered producer devices. The host bridge device issues the strongly ordered write transactions to one or more consumer devices within a weakly ordered domain. The host bridge device detects a first write transaction that is not accepted by a first consumer device of the one or more consumer devices. For each of one or more write transactions issued subsequent to the first write transaction and accepted by a respective consumer device, the host bridge device sends a cancellation message to the respective consumer device. The host bridge device replays the first write transaction and the one or more write transactions that were issued subsequent to the first write transaction.
    Type: Grant
    Filed: September 12, 2014
    Date of Patent: March 14, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Randall John Pascarella, Jaya Prakash Subramaniam Ganasan, Thuong Quang Truong, Gurushankar Rajamani, Joseph Gerald McDonald, Thomas Philip Speier
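
US 9594713 recovers strong ordering as follows: writes are issued in order to consumers in a weakly ordered domain; if a write is not accepted, the host bridge sends cancellation messages for every later write that a consumer already accepted, then replays the failed write and everything issued after it. The sketch below models that cancel-and-replay bookkeeping in software; the Consumer behavior (rejecting a write once, then accepting the replay) is an assumption used only to make the example self-contained.

```python
from typing import List, Set

class Consumer:
    """A weakly ordered consumer device that may initially refuse a write."""
    def __init__(self, name: str, reject_once: Set[int]):
        self.name = name
        self.reject_once = set(reject_once)
        self.accepted: List[int] = []

    def offer(self, write_id: int) -> bool:
        if write_id in self.reject_once:
            self.reject_once.discard(write_id)  # assume the replay succeeds
            return False
        self.accepted.append(write_id)
        return True

    def cancel(self, write_id: int) -> None:
        self.accepted.remove(write_id)  # cancellation message from the bridge

def bridge_issue(writes: List[tuple]) -> None:
    """writes: ordered list of (write_id, consumer). Pipelined issue, then recovery."""
    results = [(wid, consumer, consumer.offer(wid)) for wid, consumer in writes]
    for idx, (wid, consumer, ok) in enumerate(results):
        if not ok:
            # Cancel every later write that was already accepted,
            # then replay the failed write and everything issued after it, in order.
            for later_wid, later_consumer, later_ok in results[idx + 1:]:
                if later_ok:
                    later_consumer.cancel(later_wid)
            for later_wid, later_consumer, _ in results[idx:]:
                later_consumer.offer(later_wid)
            break

c1, c2 = Consumer("c1", reject_once={2}), Consumer("c2", reject_once=set())
bridge_issue([(1, c1), (2, c1), (3, c2)])
print(c1.accepted, c2.accepted)  # [1, 2] [3]: ordering preserved after cancel + replay
```
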
  • Publication number: 20160077991
    Abstract: Bridging strongly ordered write transactions to devices in weakly ordered domains, and related apparatuses, methods, and computer-readable media are disclosed. In one aspect, a host bridge device is configured to receive strongly ordered write transactions from one or more strongly ordered producer devices. The host bridge device issues the strongly ordered write transactions to one or more consumer devices within a weakly ordered domain. The host bridge device detects a first write transaction that is not accepted by a first consumer device of the one or more consumer devices. For each of one or more write transactions issued subsequent to the first write transaction and accepted by a respective consumer device, the host bridge device sends a cancellation message to the respective consumer device. The host bridge device replays the first write transaction and the one or more write transactions that were issued subsequent to the first write transaction.
    Type: Application
    Filed: September 12, 2014
    Publication date: March 17, 2016
    Inventors: Randall John Pascarella, Jaya Prakash Subramaniam Ganasan, Thuong Quang Truong, Gurushankar Rajamani, Joseph Gerald McDonald, Thomas Philip Speier
  • Patent number: 9164943
    Abstract: Embodiments of the invention describe an apparatus, system and method for executing self-correction logic for serial-to-parallel data converters. Embodiments of the invention receive one of a plurality of serial data streams from a peripheral device, each of the serial data streams having one or more bits. In response to detecting that a shift register chain includes a register select value, embodiments of the invention may store the received serial data stream in one of a plurality of data registers, wherein the one data register is selected based, at least in part, on a position of the register select value in the shift register chain. In response to detecting that the shift register chain does not contain the register select value, embodiments of the invention may insert the register select value at a register of the shift register chain.
    Type: Grant
    Filed: January 18, 2012
    Date of Patent: October 20, 2015
    Assignee: Intel Corporation
    Inventors: Anil Sharma, Kanaka Lakshimi Siva Prasad Gadey Naga Venkata, Gurushankar Rajamani
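
US 9164943 walks a one-hot register select value through a shift register chain: the value's position chooses which data register captures the incoming serial stream, and the self-correction logic re-inserts the select value if the chain no longer contains it. The Python model below illustrates that behavior; the register count and rotation direction are illustrative choices, not taken from the patent.

```python
from typing import List

class SerialToParallel:
    """Shift-register-chain model: a one-hot select value picks the destination register."""
    def __init__(self, num_registers: int):
        self.chain: List[int] = [1] + [0] * (num_registers - 1)  # one-hot select value
        self.data_registers: List[int] = [0] * num_registers

    def clock_in(self, serial_word: int) -> None:
        if 1 not in self.chain:
            # Self-correction: the select value was lost (e.g. an upset), re-insert it.
            self.chain[0] = 1
        position = self.chain.index(1)
        self.data_registers[position] = serial_word  # store at the selected register
        # Advance the select value so the next word fills the next register.
        self.chain = [self.chain[-1]] + self.chain[:-1]

s2p = SerialToParallel(num_registers=4)
for word in [0xA, 0xB, 0xC, 0xD]:
    s2p.clock_in(word)
print([hex(w) for w in s2p.data_registers])  # ['0xa', '0xb', '0xc', '0xd']

# Simulate losing the select value; the converter re-inserts it and keeps working.
s2p.chain = [0, 0, 0, 0]
s2p.clock_in(0xE)
print([hex(w) for w in s2p.data_registers])  # 0xE lands back at register 0
```
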
  • Publication number: 20140223045
    Abstract: Embodiments of the invention describe an apparatus, system and method for executing self-correction logic for serial-to-parallel data converters. Embodiments of the invention receive one of a plurality of serial data streams from a peripheral device, each of the serial data streams having one or more bits. In response to detecting that a shift register chain includes a register select value, embodiments of the invention may store the received serial data stream in one of a plurality of data registers, wherein the one data register is selected based, at least in part, on a position of the register select value in the shift register chain. In response to detecting that the shift register chain does not contain the register select value, embodiments of the invention may insert the register select value at a register of the shift register chain.
    Type: Application
    Filed: January 18, 2012
    Publication date: August 7, 2014
    Inventors: Anil Sharma, Kanaka Lakshimi Siva Prasad Gadey Naga Venkata, Gurushankar Rajamani
  • Patent number: 8782318
    Abstract: Methods and apparatus relating to increasing the number of Input Output Hubs in constrained link-based multi-processor systems are described. In one embodiment, a first input output hub (IOH) and a second IOH are coupled via a link interconnect, and a plurality of processors coupled to the first and second IOHs include pre-allocated resources for a single IOH. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: March 2, 2011
    Date of Patent: July 15, 2014
    Assignee: Intel Corporation
    Inventors: Debendra Das Sharma, Chandra P. Joshi, Gurushankar Rajamani
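
US 8782318 concerns adding a second I/O hub when the processors pre-allocate resources (for example, request-tracking entries) for only a single IOH. One way to picture the constraint is two IOHs drawing from the one pre-allocated pool, as in the sketch below; this is an illustrative interpretation of the abstract, not the mechanism claimed in the patent, and all names are assumptions.

```python
class ProcessorResourcePool:
    """Per-processor resources (e.g. request tracker entries) pre-allocated for one IOH."""
    def __init__(self, entries_for_single_ioh: int):
        self.free = entries_for_single_ioh

    def allocate(self) -> bool:
        if self.free > 0:
            self.free -= 1
            return True
        return False  # no entry available: the requesting IOH must retry later

    def release(self) -> None:
        self.free += 1

class InputOutputHub:
    """An IOH that draws from the processor's single-IOH pool over the link interconnect."""
    def __init__(self, name: str, pool: ProcessorResourcePool):
        self.name, self.pool = name, pool

    def issue_request(self) -> bool:
        granted = self.pool.allocate()
        print(f"{self.name}: request {'granted' if granted else 'deferred'}")
        return granted

# Both IOHs share the one pool the processor pre-allocated for a single IOH.
pool = ProcessorResourcePool(entries_for_single_ioh=2)
ioh0, ioh1 = InputOutputHub("IOH0", pool), InputOutputHub("IOH1", pool)
ioh0.issue_request()   # granted
ioh1.issue_request()   # granted (shares the same pre-allocated entries)
ioh1.issue_request()   # deferred until an entry is released
pool.release()
ioh1.issue_request()   # granted
```
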
  • Publication number: 20120226848
    Abstract: Methods and apparatus relating to increasing the number of Input Output Hubs in constrained link-based multi-processor systems are described. In one embodiment, a first input output hub (IOH) and a second IOH are coupled via a link interconnect, and a plurality of processors coupled to the first and second IOHs include pre-allocated resources for a single IOH. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: March 2, 2011
    Publication date: September 6, 2012
    Inventors: Debendra Das Sharma, Chandra P. Joshi, Gurushankar Rajamani