Patents by Inventor Raghu Prabhakar

Raghu Prabhakar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220027308
    Abstract: A processing system comprises a control bus and a plurality of logic units. The control bus is configurable by configuration data to form signal routes in a control barrier network coupled to processing units in an array of processing units. The plurality of logic units has inputs and outputs connected to the control bus and to the array of processing units. A logic unit in the plurality of logic units is operatively coupled to a processing unit in the array of processing units and is configurable by the configuration data to consume source tokens and a status signal from the processing unit on the inputs and to produce barrier tokens and an enable signal on the outputs based on the source tokens and the status signal on the inputs.
    Type: Application
    Filed: October 1, 2021
    Publication date: January 27, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu PRABHAKAR, Manish K. SHAH, Ram SIVARAMAKRISHNAN, Pramod NATARAJA, David Brian JACKSON, Gregory Frederick GROHOSKI
  • Patent number: 11232360
    Abstract: Disclosed is a data processing system that includes compile time logic configured to process a processing graph to generate a modified processing graph, which includes a plurality of forward processing nodes of a forward pass and a plurality of backward processing nodes of a backward pass. The data processing system also includes runtime logic configured with the compile time logic to execute the modified processing graph to generate, at a backward processing node of the plurality of backward processing nodes, a plurality of partial weight gradients, based on processing a corresponding plurality of gradient tiles of a gradient tensor, and generate, based on the plurality of partial weight gradients, a final weight gradient corresponding to the gradient tensor.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: January 25, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
  • Patent number: 11227207
    Abstract: Disclosed is a data processing system that includes compile time logic to section a graph into a sequence of sections, configure a first section to generate a first set of output tiles in a first target tiling configuration in response to processing a first set of input tiles in a first input tiling configuration, and configure a second section to generate a second set of output tiles in a second target tiling configuration in response to processing the first set of output tiles in a second input tiling configuration. Runtime logic is configured to pad a first input into a first padded input, read the first set of input tiles from the first padded input in the first input tiling configuration, and process the first set of input tiles through the first section to generate the first set of output tiles in the first target tiling configuration.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: January 18, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
  • Patent number: 11204889
    Abstract: A method of processing partitions of a tensor in a target order includes receiving, by a reorder unit and from two or more producer units, a plurality of partitions of a tensor in a first order that is different from the target order, storing the plurality of partitions in the reorder unit, and providing, from the reorder unit, the plurality of partitions in the target order to one or more consumer units. In an example, the one or more consumer units process the plurality of partitions in the target order.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: December 21, 2021
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Nathan Francis Sheeley, Matheen Musaddiq, Scott Layson Burson, Sitanshu Gupta, Sumti Jairath, Pramod Nataraja, Ajit Punj
  • Patent number: 11195080
    Abstract: Disclosed is a data processing system that includes compile time logic configured to section a graph into a sequence of sections, and configure each section of the sequence of sections such that an input layer of a section processes an input, one or more intermediate layers of the corresponding section processes corresponding one or more intermediate outputs, and a final layer of the corresponding section generates a final output. The final output has a non-overlapping final tiling configuration, the one or more intermediate outputs have corresponding one or more overlapping intermediate tiling configurations, and the input has an overlapping input tiling configuration. The compile time logic is further to determine the various tiling configurations by starting from the final layer and reverse traversing through the one or more intermediate layers, and ending with the input layer.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: December 7, 2021
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
  • Publication number: 20210373867
    Abstract: A compiler configured to configure memory nodes with a ready-to-read credit counter and a write credit counter. The ready-to-read credit counter of a particular upstream memory node initialized with as many read credits as a buffer depth of a corresponding downstream memory node. The ready-to-read credit counter configured to decrement when a buffer data unit is written by the particular upstream memory node into the corresponding downstream memory node, and to increment when the particular upstream memory node receives from the corresponding downstream memory node a read ready token. The write credit counter of the particular upstream memory node initialized with one or more write credits and configured to decrement when the particular upstream memory node begins writing the buffer data unit into the corresponding downstream memory node, and to increment when the particular upstream memory node receives from the corresponding downstream memory node a write done token.
    Type: Application
    Filed: June 2, 2020
    Publication date: December 2, 2021
    Applicant: SambaNova Systems, Inc.
    Inventors: Weiwei CHEN, Raghu PRABHAKAR, David Alan KOEPLINGER, Sitanshu GUPTA, Ruddhi Arun CHAPHEKAR, Ajit PUNJ, Sumti JAIRATH
  • Patent number: 11188497
    Abstract: A reconfigurable data processor comprises a bus system, and an array of configurable units connected to the bus system, configurable units in the array including configuration data stores to store unit files comprising a plurality of sub-files of configuration data particular to the corresponding configurable units. Configurable units in the plurality of configurable units each include logic to execute a unit configuration load process, including receiving via the bus system, sub-files of a unit file particular to the configurable unit, and loading the received sub-files into the configuration store of the configurable unit. A configuration load controller connected to the bus system, including logic to execute an array configuration load process, including distributing a configuration file comprising unit files for a plurality of the configurable units in the array.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: November 30, 2021
    Assignee: SambaNova Systems, Inc.
    Inventors: Manish K. Shah, Ram Sivaramakrishnan, Mark Luttrell, David Brian Jackson, Raghu Prabhakar, Sumti Jairath, Gregory Frederick Grohoski, Pramod Nataraja
  • Patent number: 11182221
    Abstract: The technology disclosed relates to buffer-based inter-node streaming of configuration data over a network fabric. In particular, the technology disclosed relates to a runtime processor configured to load and execute a first subset of configuration files in a set of configuration files on a first reconfigurable processor operatively coupled to a first processing node, load and execute a second subset of configuration files in the set of configuration files on a second reconfigurable processor operatively coupled to a second processing node, and use a first plurality of buffers operatively coupled to the first processing node, and a second plurality of buffers operatively coupled to the second processing node to stream data between the first reconfigurable processor and the second reconfigurable processor to load and execute the first subset of configuration files and the second subset of configuration files.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: November 23, 2021
    Assignee: SambaNova Systems, Inc.
    Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
  • Patent number: 11182264
    Abstract: A data processing system comprises a plurality of reconfigurable processors including a first reconfigurable processor and additional reconfigurable processors, a plurality of buffers in a shared memory accessible to the first reconfigurable processor and the additional reconfigurable processors, and runtime logic configured to execute one or more configuration files for applications using the first reconfigurable processor and the additional reconfigurable processors. Execution of the configuration files includes receiving data from the first reconfigurable processor and providing the data to at least one of the additional reconfigurable processors, and receiving data from the at least one of the additional reconfigurable processors and providing the data to the first reconfigurable processor.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: November 23, 2021
    Assignee: SambaNova Systems, Inc.
    Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
  • Patent number: 11126574
    Abstract: A data processing system comprises compile time logic, runtime logic, a control bus, and instrumentation units operatively coupled to processing units of an array. The compile time logic is configured to generate configuration files for a dataflow graph. The runtime logic is configured to execute the configuration files on the array, and to trigger start and stop events, as defined by the configuration files, in response to implementation of compute and memory operations of the dataflow graph on the array. A control bus is configured to form event routes in the array. The instrumentation units have inputs and outputs connected to the control bus and to the processing units. The instrumentation units are configured to consume the start events on the inputs and start counting clock cycles, consume the stop events on the inputs and stop counting the clock cycles, and report the counted clock cycles on the outputs.
    Type: Grant
    Filed: February 12, 2021
    Date of Patent: September 21, 2021
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Matthew Thomas Grimm, Sumti Jairath, Kin Hing Leung, Sitanshu Gupta, Yuan Lin, Luca Boasso
  • Publication number: 20210271519
    Abstract: A reconfigurable data processor comprises an array of configurable units configurable to allocate a plurality of sets of configurable units in the array to implement respective execution fragments of the data processing operation. Quiesce logic is coupled to configurable units in the array, configurable to respond to a quiesce control signal to quiesce the sets of configurable units in the array on quiesce boundaries of the respective execution fragments, and to forward quiesce ready signals for the respective execution fragments when the corresponding sets of processing units are ready. An array quiesce controller distributes the quiesce control signal to configurable units in the array, and receives quiesce ready signals for the respective execution fragments from the quiesce logic.
    Type: Application
    Filed: May 17, 2021
    Publication date: September 2, 2021
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Manish K. Shah, Pramod Nataraja, David Brian Jackson, Kin Hing Leung, Ram Sivaramakrishnan, Sumti Jairath, Gregory Frederick Grohoski
  • Publication number: 20210271630
    Abstract: The technology disclosed partitions a dataflow graph of a high-level program into memory allocations and execution fragments. The memory allocations represent creation of logical memory spaces in on-processor and/or off-processor memories for data required to implement the dataflow graph. The execution fragments represent operations on the data. The technology disclosed designates the memory allocations to virtual memory units and the execution fragments to virtual compute units. The technology disclosed partitions the execution fragments into memory fragments and compute fragments, and assigns the memory fragments to the virtual memory units and the compute fragments to the virtual compute units. The technology disclosed then allocates the virtual memory units to physical memory units and the virtual compute units to physical compute units.
    Type: Application
    Filed: May 20, 2021
    Publication date: September 2, 2021
    Applicant: SambaNova Systems, Inc.
    Inventors: David Alan KOEPLINGER, Raghu PRABHAKAR, Sumti JAIRATH
  • Patent number: 11080227
    Abstract: The technology disclosed partitions a dataflow graph of a high-level program into memory allocations and execution fragments. The memory allocations represent creation of logical memory spaces in on-processor and/or off-processor memories for data required to implement the dataflow graph. The execution fragments represent operations on the data. The technology disclosed designates the memory allocations to virtual memory units and the execution fragments to virtual compute units. The technology disclosed partitions the execution fragments into memory fragments and compute fragments, and assigns the memory fragments to the virtual memory units and the compute fragments to the virtual compute units. The technology disclosed then allocates the virtual memory units to physical memory units and the virtual compute units to physical compute units.
    Type: Grant
    Filed: August 8, 2019
    Date of Patent: August 3, 2021
    Assignee: SambaNova Systems, Inc.
    Inventors: David Alan Koeplinger, Raghu Prabhakar, Sumti Jairath
  • Patent number: 11055141
    Abstract: A reconfigurable data processor comprises an array of configurable units configurable to allocate a plurality of sets of configurable units in the array to implement respective execution fragments of the data processing operation. Quiesce logic is coupled to configurable units in the array, configurable to respond to a quiesce control signal to quiesce the sets of configurable units in the array on quiesce boundaries of the respective execution fragments, and to forward quiesce ready signals for the respective execution fragments when the corresponding sets of processing units are ready. An array quiesce controller distributes the quiesce control signal to configurable units in the array, and receives quiesce ready signals for the respective execution fragments from the quiesce logic.
    Type: Grant
    Filed: July 8, 2019
    Date of Patent: July 6, 2021
    Assignee: SAMBANOVA SYSTEMS, INC.
    Inventors: Raghu Prabhakar, Manish K. Shah, Pramod Nataraja, David Brian Jackson, Kin Hing Leung, Ram Sivaramakrishnan, Sumti Jairath, Gregory Frederick Grohoski
  • Publication number: 20210055940
    Abstract: A reconfigurable data processor comprises a bus system, and an array of configurable units connected to the bus system, configurable units in the array including configuration data stores to store unit files comprising a plurality of sub-files of configuration data particular to the corresponding configurable units. Configurable units in the plurality of configurable units each include logic to execute a unit configuration load process, including receiving via the bus system, sub-files of a unit file particular to the configurable unit, and loading the received sub-files into the configuration store of the configurable unit. A configuration load controller connected to the bus system, including logic to execute an array configuration load process, including distributing a configuration file comprising unit files for a plurality of the configurable units in the array.
    Type: Application
    Filed: November 9, 2020
    Publication date: February 25, 2021
    Applicant: SambaNova Systems, Inc.
    Inventors: Manish K. Shah, Ram Sivaramakrishnan, Mark Luttrell, David B. Jackson, Raghu Prabhakar, Sumti Jairath, Gregory Frederick Grohoski, Pramod Nataraja
  • Publication number: 20210042259
    Abstract: The technology disclosed partitions a dataflow graph of a high-level program into memory allocations and execution fragments. The memory allocations represent creation of logical memory spaces in on-processor and/or off-processor memories for data required to implement the dataflow graph. The execution fragments represent operations on the data. The technology disclosed designates the memory allocations to virtual memory units and the execution fragments to virtual compute units. The technology disclosed partitions the execution fragments into memory fragments and compute fragments, and assigns the memory fragments to the virtual memory units and the compute fragments to the virtual compute units. The technology disclosed then allocates the virtual memory units to physical memory units and the virtual compute units to physical compute units.
    Type: Application
    Filed: August 8, 2019
    Publication date: February 11, 2021
    Applicant: SambaNova Systems, Inc.
    Inventors: David Alan KOEPLINGER, Raghu PRABHAKAR, Sumti JAIRATH
  • Publication number: 20210011770
    Abstract: A reconfigurable data processor comprises an array of configurable units configurable to allocate a plurality of sets of configurable units in the array to implement respective execution fragments of the data processing operation. Quiesce logic is coupled to configurable units in the array, configurable to respond to a quiesce control signal to quiesce the sets of configurable units in the array on quiesce boundaries of the respective execution fragments, and to forward quiesce ready signals for the respective execution fragments when the corresponding sets of processing units are ready. An array quiesce controller distributes the quiesce control signal to configurable units in the array, and receives quiesce ready signals for the respective execution fragments from the quiesce logic.
    Type: Application
    Filed: July 8, 2019
    Publication date: January 14, 2021
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Manish K. Shah, Pramod Nataraja, David Brian Jackson, Kin Hing Leung, Ram Sivaramakrishnan, Sumti Jairath, Gregory Frederick Grohoski
  • Publication number: 20200356523
    Abstract: A reconfigurable data processor comprises an array of processing units arranged to perform execution fragments of a data processing operation. A control barrier network is coupled to processing units in the array. The control barrier network comprises a control bus configurable to form signal routes in the control barrier network, and a plurality of control barrier logic units having inputs and outputs connected to the control bus and to the array of processing units. The logic units in the plurality of logic units are configurable to consume source tokens and status signals on the inputs and produce barrier tokens on the outputs based on the source tokens and status signals on the inputs. Also, the logic units can produce enable signals for the array of processing units based on the source tokens and status signals on the inputs.
    Type: Application
    Filed: May 9, 2019
    Publication date: November 12, 2020
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Manish K. Shah, Ram Sivaramakrishnan, Pramod Nataraja, David Brian Jackson, Gregory Frederick Grohoski
  • Patent number: 10831507
    Abstract: A reconfigurable data processor comprises a bus system, and an array of configurable units connected to the bus system, configurable units in the array including configuration data stores to store unit files comprising a plurality of sub-files of configuration data particular to the corresponding configurable units. Configurable units in the plurality of configurable units each include logic to execute a unit configuration load process, including receiving via the bus system, sub-files of a unit file particular to the configurable unit, and loading the received sub-files into the configuration store of the configurable unit. A configuration load controller connected to the bus system, including logic to execute an array configuration load process, including distributing a configuration file comprising unit files for a plurality of the configurable units in the array.
    Type: Grant
    Filed: November 21, 2018
    Date of Patent: November 10, 2020
    Assignee: SambaNova Systems, Inc.
    Inventors: Manish K. Shah, Ram Sivaramakrishnan, Mark Luttrell, David Brian Jackson, Raghu Prabhakar, Sumti Jairath, Gregory Frederick Grohoski, Pramod Nataraja
  • Patent number: 10768899
    Abstract: A configurable circuit configurable according to the data width of elements of a matrix is described that includes a memory array, logic to write a matrix to the memory array having elements with a data width which can be specified using configuration data, logic for a transpose read of the matrix as-written and logic for normal read of the matrix as-written. The memory array includes first and second read ports operable in parallel. Transpose read logic and normal read logic can be coupled to the first and second read ports, respectively, allowing transpose and normal read of a matrix simultaneously.
    Type: Grant
    Filed: January 29, 2019
    Date of Patent: September 8, 2020
    Assignee: SambaNova Systems, Inc.
    Inventors: David Alan Koeplinger, Raghu Prabhakar, Ram Sivaramakrishnan, David Brian Jackson, Mark Luttrell