Patents by Inventor Raghu Prabhakar

Raghu Prabhakar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220309323
    Abstract: A data processing system includes memory and reconfigurable processors, operatively coupled to the memory, configured to execute a sequence of subgraphs of a graph. The sequence of subgraphs includes a preceding subgraph and a succeeding subgraph. The data processing system also includes data flow logic, operatively coupled to the reconfigurable processors and the memory, configured to store a tiled output of the preceding subgraph as a composed input in the memory and make available parts of the composed input for processing by the succeeding subgraph.
    Type: Application
    Filed: March 21, 2022
    Publication date: September 29, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu NAMA, Ruddhi CHAPHEKAR, Ram SIVARAMAKRISHNAN, Raghu PRABHAKAR, Sumti JAIRATH, Junjue WANG, Kaizhao LIANG, Adi FUCHS, Matheen MUSADDIQ, Arvind Krishna SUJEETH
  • Publication number: 20220309319
    Abstract: Disclosed is a data processing system that includes compile time logic to section a graph into a sequence of sections, configure a first section to generate a first set of output tiles in a first target tiling configuration in response to processing a first set of input tiles in a first input tiling configuration, and configure a second section to generate a second set of output tiles in a second target tiling configuration in response to processing the first set of output tiles in a second input tiling configuration. Runtime logic is configured to pad a first input into a first padded input, read the first set of input tiles from the first padded input in the first input tiling configuration, and process the first set of input tiles through the first section to generate the first set of output tiles in the first target tiling configuration.
    Type: Application
    Filed: September 16, 2021
    Publication date: September 29, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu NAMA, Ruddhi CHAPHEKAR, Ram SIVARAMAKRISHNAN, Raghu PRABHAKAR, Sumti JAIRATH, Junjue WANG, Kaizhao LIANG, Adi FUCHS, Matheen MUSADDIQ, Arvind Krishna SUJEETH
  • Patent number: 11443014
    Abstract: The technology disclosed relates to matrix multiplication where the multiplier can be a sparse matrix. In particular, a multiplication device includes first circuitry configured to obtain the multiplicand matrix and an index of columns of the multiplier matrix and to generate an intermediate matrix that has one row per entry in the index copied from a respective row of the multiplicand matrix based on a value of a corresponding entry in the index. The device also includes second circuitry configured to receive the intermediate matrix from the first circuitry, obtain non-zero values of the multiplier matrix and a list of a number of non-zero entries per row of the multiplier matrix, and generate a product matrix as a result of multiplies of the non-zero values of the multiplier matrix and the intermediate matrix.
    Type: Grant
    Filed: November 5, 2021
    Date of Patent: September 13, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Mingran Wang, Raghu Prabhakar, Darshan Dhimantkumar Gandhi, Maulik Subhash Desai, Nathan Francis Sheeley, Scott Layson Burson, Sitanshu Gupta
  • Publication number: 20220261365
    Abstract: A reconfigurable processor comprises an array of processing units and an instrumentation network. The array of processing units is configured to execute runtime events to execute an application. The instrumentation network is operatively coupled to the array of processing units. The instrumentation network comprises a control bus configured to form control signal routes in the instrumentation network. The instrumentation network further comprises a plurality of instrumentation counters having inputs and outputs connected to the control bus and to the processing units. Instrumentation counters in the plurality instrumentation units are configurable to consume control signals on the inputs and produce counts of the runtime events on the outputs.
    Type: Application
    Filed: September 20, 2021
    Publication date: August 18, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu PRABHAKAR, Matthew Thomas GRIMM, Sumti JAIRATH, Kin Hing LEUNG, Sitanshu GUPTA, Yuan LIN, Luca BOASSO
  • Publication number: 20220261364
    Abstract: A data processing system comprises memory, compile time logic, runtime logic, and instrumentation profiling logic. The memory stores a dataflow graph for an application. The dataflow graph has a plurality of compute nodes that are configured to be producers to produce data for execution of the application, and to be consumers to consume the data for execution of the application. The compile time logic partitions execution of the dataflow graph into stages. Each of the stages has one or more compute nodes, one or more producers, and one or more consumers. The runtime logic determines a processing latency for each of the stages by calculating time elapsed between producers of a particular stage receiving input data and consumers of the particular stage receiving output data. The instrumentation profiling logic generates performance statistics for the dataflow graph based on the processing latency determined for each of the stages.
    Type: Application
    Filed: September 20, 2021
    Publication date: August 18, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu PRABHAKAR, Matthew Thomas GRIMM, Sumti JAIRATH, Kin Hing LEUNG, Sitanshu GUPTA, Yuan LIN, Luca BOASSO
  • Patent number: 11386038
    Abstract: A reconfigurable data processor comprises an array of processing units arranged to perform execution fragments of a data processing operation. A control barrier network is coupled to processing units in the array. The control barrier network comprises a control bus configurable to form signal routes in the control barrier network, and a plurality of control barrier logic units having inputs and outputs connected to the control bus and to the array of processing units. The logic units in the plurality of logic units are configurable to consume source tokens and status signals on the inputs and produce barrier tokens on the outputs based on the source tokens and status signals on the inputs. Also, the logic units can produce enable signals for the array of processing units based on the source tokens and status signals on the inputs.
    Type: Grant
    Filed: May 9, 2019
    Date of Patent: July 12, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Manish K. Shah, Ram Sivaramakrishnan, Pramod Nataraja, David Brian Jackson, Gregory Frederick Grohoski
  • Publication number: 20220197710
    Abstract: The technology disclosed relates to inter-processor execution of configuration files on reconfigurable processors using smart network interface controller (SmartNIC) buffers. In particular, the technology disclosed relates to a runtime logic that is configured to execute configuration files that define applications and process application data for applications using a first reconfigurable processor and a second reconfigurable processor. The execution includes streaming configuration data in the configuration files and the application data between the first reconfigurable processor and the second reconfigurable processor using one or more SmartNIC buffers.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram SIVARAMAKRISHNAN, Sumti JAIRATH, Emre Ali BURHAN, Manish K. SHAH, Raghu PRABHAKAR, Ravinder KUMAR, Arnav GOEL, Ranen CHATTERJEE, Gregory Frederick GROHOSKI, Kin Hing LEUNG, Dawei HUANG, Manoj UNNIKRISHNAN, Martin Russell RAUMANN, Bandish B. SHAH
  • Publication number: 20220197712
    Abstract: The technology disclosed relates to inter-node execution of configuration files on reconfigurable processors using smart network interface controller (SmartNIC) buffers. In particular, the technology disclosed relates to a runtime logic that is configured to execute configuration files that define applications and process application data for applications using a first reconfigurable processor on a first node, and a second host processor on a second node. The execution includes streaming configuration data in the configuration files and the application data between the first reconfigurable processor and the second host processor using one or more SmartNIC buffers.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram SIVARAMAKRISHNAN, Sumti JAIRATH, Emre Ali BURHAN, Manish K. SHAH, Raghu PRABHAKAR, Ravinder KUMAR, Arnav GOEL, Ranen CHATTERJEE, Gregory Frederick GROHOSKI, Kin Hing LEUNG, Dawei HUANG, Manoj UNNIKRISHNAN, Martin Russell RAUMANN, Bandish B. SHAH
  • Publication number: 20220197711
    Abstract: The technology disclosed relates to runtime execution of functions across reconfigurable processor. In particular, the technology disclosed relates to a runtime logic that is configured to execute a first set of functions in a plurality of functions and/or data therefor on a first reconfigurable processor, and a second set of functions in the plurality of functions and/or data therefor on additional reconfigurable processors. Functions in the second set of functions and/or the data therefor are transmitted to the additional reconfigurable processors using one or more of a first reconfigurable processor-to-additional reconfigurable processors buffers, and results of executing the functions and/or the data therefor on the additional reconfigurable processors are transmitted to the first reconfigurable processor using one or more of additional reconfigurable processors-to-first reconfigurable processor buffers.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
  • Publication number: 20220197713
    Abstract: The technology disclosed relates to inter-node execution of configuration files on reconfigurable processors using network interface controller (NIC) buffers. In particular, the technology disclosed relates to a runtime logic that is configured to execute configuration files that define applications and application data for applications using a first reconfigurable processor connected to a first host, and a second reconfigurable processor connected to a second host. The first reconfigurable processor is configured to push input data for the applications in a first plurality of buffers. The first host is configured to cause a first network interface controller (NIC) to stream the input data to a second plurality of buffers from the first plurality of buffers. The second host is configured to cause a second NIC to stream the input data to the second reconfigurable processor from the second plurality of buffers.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram SIVARAMAKRISHNAN, Sumti JAIRATH, Emre Ali BURHAN, Manish K. SHAH, Raghu PRABHAKAR, Ravinder KUMAR, Arnav GOEL, Ranen CHATTERJEE, Gregory Frederick GROHOSKI, Kin Hing LEUNG, Dawei HUANG, Manoj UNNIKRISHNAN, Martin Russell RAUMANN, Bandish B. SHAH
  • Publication number: 20220197709
    Abstract: The technology disclosed relates to runtime execution of configuration files on reconfigurable processors with varying configuration granularity. In particular, the technology disclosed relates to a runtime logic that is configured to receive a set of configuration files for an application, and load and execute a first subset of configuration files in the set of configuration files and associated application data on a first reconfigurable processor. The first reconfigurable processor has a first level of configurable granularity. The runtime logic is further configured to load and execute a second subset of configuration files in the set of configuration files and associated application data on a second reconfigurable processor. The second reconfigurable processor has a second level of configurable granularity that is different from the first level of configurable granularity.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram SIVARAMAKRISHNAN, Sumti JAIRATH, Emre Ali BURHAN, Manish K. SHAH, Raghu PRABHAKAR, Ravinder KUMAR, Arnav GOEL, Ranen CHATTERJEE, Gregory Frederick GROHOSKI, Kin Hing LEUNG, Dawei HUANG, Manoj UNNIKRISHNAN, Martin Russell RAUMANN, Bandish B. SHAH
  • Patent number: 11366783
    Abstract: An integrated circuit includes a plurality of configurable units, each configurable unit having two or more corresponding sections. The plurality of configurable units is arranged in a serial arrangement to form a chain of sections of the configurable units. A data bus is connected to the plurality of configurable units which communicates data at a clock rate. The chain of sections is to receive and write a series of tensors at the clock rate at a first end section of the chain of sections, and sequentially propagate the series of tensors through individual sections within the chain of sections at the clock rate. The chain of sections is to output the series of tensors at a second end section of the chain of sections. The chain of sections is to also output the series of tensors at an intermediate section of the chain of sections.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: June 21, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Nathan Francis Sheeley, Amitabh Menon, Sitanshu Gupta, Sumti Jairath, Matheen Musaddiq
  • Publication number: 20220156213
    Abstract: A reconfigurable data processor includes a plurality of configurable units, and a configuration controller. The configuration controller is configured to start execution of a first application graph in a first set of configurable units. Then, concurrently with the execution of the first application graph in the first set of configurable units, the configuration controllers receive a command to load a configuration file into a second set of configurable units and obtain the configuration file. The configuration file contains information to configure the second set of configurable units to execute a second application graph. The configuration file is then loaded into the second set of configurable units and execution of the second application graph is started in the second set of configurable units.
    Type: Application
    Filed: January 31, 2022
    Publication date: May 19, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Gregory Frederick Grohoski, Sumti Jairath, Mark Luttrell, Raghu Prabhakar, Ram Sivaramakrishnan, Manish K. Shah
  • Patent number: 11328209
    Abstract: A method for selectively dropping out feature elements from a tensor is disclosed. The method includes generating a mask that has a plurality of mask elements. Each mask element includes a corresponding plurality of bits representing either a first value or a second value, to indicate whether a corresponding feature element of the tensor output by a neural network layer is to be dropped out or retained. Each mask element of the plurality of mask elements of the mask is compressed to generate a corresponding compressed mask element of a plurality of compressed mask elements of a compressed mask, thereby generating the compressed mask from the mask. Each compressed mask element of the plurality of compressed mask elements includes a corresponding single bit. Feature elements are selectively dropped from the tensor, based on the compressed mask.
    Type: Grant
    Filed: June 2, 2021
    Date of Patent: May 10, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Sathish Terakanambi Sheshadri, Ram Sivaramakrishnan, Raghu Prabhakar
  • Publication number: 20220083499
    Abstract: A reconfigurable data processor comprises a bus system, and an array of configurable units connected to the bus system, configurable units in the array including configuration data stores to store unit files comprising a plurality of sub-files of configuration data particular to the corresponding configurable units. A configuration unload controller connected to the bus system, including logic to execute an array configuration unload process, including distributing a command to a plurality of the configurable units in the array to unload the unit files particular to the corresponding configurable units, the unit files each comprising a plurality of ordered sub-files, receiving sub-files via the bus system from the array of configurable units, and assembling an unload configuration file by arranging the received sub-files in memory according to the configurable unit of the unit file of which the sub-file is a part, and the order of the sub-file in the unit file.
    Type: Application
    Filed: November 22, 2021
    Publication date: March 17, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Manish K. Shah, Ram Sivaramakrishnan, Mark Luttrell, David B. Jackson, Raghu Prabhakar, Sumti Jairath, Gregory Frederick Grohoski, Pramod Nataraja
  • Patent number: 11263170
    Abstract: Disclosed is a data processing system to receive a processing graph of an application. A compile time logic is configured to modify the processing graph and generate a modified processing graph. The modified processing graph is configured to apply a post-padding tiling after applying a cumulative input padding that confines padding to an input. The cumulative input padding pads the input into a padded input. The post-padding tiling tiles the padded input into a set of pre-padded input tiles with a same tile size, tiles intermediate representation of the input into a set of intermediate tiles with a same tile size, and tiles output representation of the input into a set of non-overlapping output tiles with a same tile size. Runtime logic is configured with the compile time logic to execute the modified processing graph to execute the application.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: March 1, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
  • Publication number: 20220058034
    Abstract: A data processing system comprises a pool of reconfigurable data flow resources and a runtime processor. The pool of reconfigurable data flow resources includes arrays of physical configurable units and memory. The runtime processor includes logic to receive a plurality of configuration files for user applications. The configuration files include configurations of virtual data flow resources required to execute the user applications. The runtime processor also includes logic to allocate physical configurable units and memory in the pool of reconfigurable data flow resources to the virtual data flow resources and load the configuration files to the allocated physical configurable units. The runtime processor further includes logic to execute the user applications using the allocated physical configurable units and memory.
    Type: Application
    Filed: August 18, 2020
    Publication date: February 24, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Gregory Frederick GROHOSKI, Manish K. SHAH, Raghu PRABHAKAR, Mark LUTTRELL, Ravinder KUMAR, Kin Hing LEUNG, Ranen CHATTERJEE, Sumti JAIRATH, David Alan KOEPLINGER, Ram SIVARAMAKRISHNAN, Matthew Thomas GRIMM
  • Patent number: 11256987
    Abstract: A method for selectively dropping out feature elements from a tensor is disclosed. The method includes generating a mask that has a plurality of mask elements arranged in a first order. A compressed mask is generated, which includes a plurality of compressed mask elements arranged in a second order that is different from the first order. For example, each mask element of the plurality of mask elements of the mask is compressed to generate a corresponding compressed mask element of the plurality of compressed mask elements of the compressed mask. Individual compressed mask element of the plurality of compressed mask elements is indicative of whether a corresponding feature element of the tensor output by a neural network layer is to be dropped out or retained. Feature elements are selectively dropped from the tensor, based on the compressed mask.
    Type: Grant
    Filed: June 2, 2021
    Date of Patent: February 22, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Sathish Terakanambi Sheshadri, Ram Sivaramakrishnan, Raghu Prabhakar
  • Patent number: 11250061
    Abstract: Disclosed is a data processing system which includes compile time logic configured to section a graph into a sequence of subgraphs, the sequence of subgraphs including at least a first subgraph. The compile time logic configures the first subgraph to generate a plurality of output tiles of an output tensor. A runtime logic configured with the compile time logic is to execute the sequence of subgraphs to generate, at the output of the first subgraph, the plurality of output tiles of the output tensor, and write the plurality of output tiles in a memory in an overlapping configuration. In an example, an overlapping region between any two neighboring output tiles of the plurality of output tiles comprises a summation of a corresponding region of a first neighboring output tile and a corresponding region of a second neighboring output tile.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: February 15, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Arun Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Sujeeth
  • Patent number: 11237996
    Abstract: A reconfigurable data processor comprises an array of configurable units and a bus system configurable to define virtual machines. The system can partition the array of configurable units into a plurality of sets of configurable units, and block communications via the bus system between configurable units within a particular set and configurable units outside the particular set. A memory access controller can be connected to the bus system, configurable to confine access to memory outside the array of configurable units originating from within the particular set to memory space allocated to the particular.
    Type: Grant
    Filed: April 29, 2020
    Date of Patent: February 1, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Gregory Frederick Grohoski, Sumti Jairath, Mark Luttrell, Raghu Prabhakar, Ram Sivaramakrishnan, Manish K. Shah