Patents by Inventor Raghu Prabhakar

Raghu Prabhakar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240232127
    Abstract: A statically reconfigurable dataflow architecture processor performs an N-dimensional affine transform specified by a matrix on an input image to produce an output image. N counters iterate over the output image by N respective stride values (N output tile dimension lengths) to generate base pixel coordinates of N-dimensional output tiles into which the output image is subdividable. Statically reconfigurable pattern compute units, for each output tile of the output tiles: use the base pixel coordinates of the output tile and the N output tile dimension lengths to calculate the coordinates of corner pixels of the output tile and apply the affine transform matrix to the corner pixel coordinates and use the results to determine base pixel coordinates of a corresponding N-dimensional input tile into which the input image is subdividable. Statically reconfigurable pattern memory units load each corresponding input tile based on the determined input tile base pixel coordinates.
    Type: Application
    Filed: January 10, 2023
    Publication date: July 11, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Matthew Vilim, Raghu Prabhakar, Matt Feldman, Yaqi Zhang
  • Publication number: 20240233069
    Abstract: A statically reconfigurable dataflow architecture processor (SRDAP) that performs an N-dimensional affine transform specified by a matrix on an input image to produce an output image includes L address pattern memory units (PMUs) comprising a memory arranged as a vector of L banks, and L corresponding data PMUs. Each data PMU receives a copy of the input image. In parallel: each address PMU writes an L-vector of addresses of input pixels to the vector of L banks and reads a single address of the written L-vector of addresses from a predetermined bank corresponding to a PMU number of the address PMU among the L address PMUs, and each data PMU receives the single address from the corresponding address PMU and uses it to read a single input pixel from the data PMU memory. A tree of pattern compute units coalesces the L single input pixels into an L-vector of input pixels.
    Type: Application
    Filed: January 10, 2023
    Publication date: July 11, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Matthew Vilim, Raghu Prabhakar, Matt Feldman, Yaqi Zhang
  • Publication number: 20240233068
    Abstract: A statically reconfigurable dataflow architecture processor (SRDAP) that performs an N-dimensional affine transform specified by a matrix on an input image to produce an output image includes pattern compute units (PCUs) and pattern memory units (PMUs) interconnected by switches. PCUs have vector pipelines of functional units that perform operations on operands received from previous pipeline stages, another PCU, and/or PMUs. PMUs have memories loadable with the input image. The PCUs and PMUs are statically reconfigurable to, for all the output pixels: apply the matrix to vectors of output pixel coordinates to calculate corresponding vectors of input pixel coordinates, flatten the vectors of input pixel coordinates into vectors of PMU addresses of the input pixels, read values of the input pixels from the PMUs at the calculated input pixel addresses, and write vectors of the input pixel values to PMUs to form the output image.
    Type: Application
    Filed: January 10, 2023
    Publication date: July 11, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Matthew Vilim, Raghu Prabhakar, Matt Feldman, Yaqi Zhang
  • Publication number: 20240220325
    Abstract: A computer system includes an array of reconfigurable processor blocks which execute fragments of a larger data processing operation. An array controller distributes a control signal to the reconfigurable processors in the array and receives control signals for the respective execution fragments. The control signal may include quiesce logic or other control methods to execute the effective execution fragments of the larger data processing operation when individual processors become available.
    Type: Application
    Filed: March 12, 2024
    Publication date: July 4, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Manish K. Shah, Pramod Nataraja, David Brian Jackson, Kin Hing Leung, Ram Sivaramakrishnan, Sumti Jairath, Gregory Frederick Grohoski
  • Publication number: 20240192935
    Abstract: A compiler generates a configuration file to configure a fracturable data path in a coarse-grained reconfigurable processor. The configuration file, when loaded into the reconfigurable processor, enables a fracturable data path in a configurable unit of the reconfigurable processor to produce multiple independent address sequences by analyzing two address calculations to determine the number of pipeline stages for each calculation. The configuration file includes first and second configuration data for distinct sets of computational stages within the pipelined computation stages, allowing the processor to generate a first address sequence using N pipeline stages and a second address sequence using M pipeline stages, where N and M are positive integers.
    Type: Application
    Filed: February 21, 2024
    Publication date: June 13, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu PRABHAKAR, David Brian JACKSON, Scott BURSON
  • Patent number: 12001936
    Abstract: A processing graph of an application with a sequence of processing nodes is obtained which processes an input and generates an intermediate representation, a further intermediate representation, and an output representation of the input at stages in the sequence of processing nodes. Graph metadata is generated that specifies a non-overlapping target tiling configuration for the output representation, an overlapping tiling configuration for the input, an overlapping tiling configuration for the intermediate representation, and a third tiling configuration for the further intermediate representation. The processing graph is modified based on the graph metadata to conform to the parameters specified by the graph metadata. A set of computer instructions is then created to execute the modified processing graph on a target processing system.
    Type: Grant
    Filed: March 21, 2022
    Date of Patent: June 4, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
  • Patent number: 11995529
    Abstract: Disclosed is a data processing system that includes compile time logic to section a graph into a sequence of sections including a first section and a second section. The compile time logic is to configure the first section with a first topology of tiling configurations in which to tile inputs, intermediate outputs, and final outputs of the first section, and configure the second section with a second topology of tiling configurations in which to tile inputs, intermediate outputs, and final outputs of the second section. The data processing system further includes runtime logic configured with the compile time logic to execute the first section to generate the inputs, intermediate outputs, and final outputs of the first section in the first topology of tiling configurations, and execute the second section to generate the inputs, intermediate outputs, and final outputs of the second section in the second topology of tiling configurations.
    Type: Grant
    Filed: June 30, 2021
    Date of Patent: May 28, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
  • Publication number: 20240168913
    Abstract: Disclosed is a method that includes sectioning a graph into a sequence of sections, the sequence of sections including at least a first section followed by a second section. The first section is configured to generate a first output in a first target tiling configuration in response to processing a first input in a first input tiling configuration. The graph is configured to reconfigure the first output in the first target tiling configuration to a second input in a second input tiling configuration. The second section is configured to generate a second output in a second target tiling configuration in response to processing the second input in the second input tiling configuration.
    Type: Application
    Filed: November 24, 2023
    Publication date: May 23, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu NAMA, Ruddhi CHAPHEKAR, Ram SIVARAMAKRISHNAN, Raghu PRABHAKAR, Sumti JAIRATH, Junjue WANG, Kaizhao LIANG, Adi FUCHS, Matheen MUSADDIQ, Arvind Krishna SUJEETH
  • Patent number: 11983140
    Abstract: A reconfigurable data processor comprises a bus system, and an array of configurable units connected to the bus system, configurable units in the array including configuration data stores to store unit files comprising a plurality of sub-files of configuration data particular to the corresponding configurable units. A configuration unload controller connected to the bus system, including logic to execute an array configuration unload process, including distributing a command to a plurality of the configurable units in the array to unload the unit files particular to the corresponding configurable units, the unit files each comprising a plurality of ordered sub-files, receiving sub-files via the bus system from the array of configurable units, and assembling an unload configuration file by arranging the received sub-files in memory according to the configurable unit of the unit file of which the sub-file is a part, and the order of the sub-file in the unit file.
    Type: Grant
    Filed: November 22, 2021
    Date of Patent: May 14, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Manish K. Shah, Ram Sivaramakrishnan, Mark Luttrell, David B. Jackson, Raghu Prabhakar, Sumti Jairath, Gregory Frederick Grohoski, Pramod Nataraja
  • Patent number: 11971846
    Abstract: A logic unit in an array of processing units is configurable to consume source tokens and a status signal and to produce barrier tokens and an enable signal based on the source tokens and the status signal.
    Type: Grant
    Filed: February 14, 2023
    Date of Patent: April 30, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Manish K. Shah, Ram Sivaramakrishnan, Pramod Nataraja, David Brian Jackson, Gregory Frederick Grohoski
  • Publication number: 20240094794
    Abstract: An integrated circuit (IC) includes an array of statically reconfigurable compute units for separation into mutually exclusive groups. Each group includes a statically reconfigurable number of compute units. Each compute unit includes a register statically reconfigurable with a group identifier that identifies which group the compute unit belongs to, a counter statically reconfigurable to synchronously increment with the counters of all the other compute units such that all the counters have the same value each clock cycle, and control circuitry that prevents the compute unit from starting to process data until the counter value matches the identifier. According to operation of the register, the counter, and the control circuitry, no more than the statically reconfigurable number of the compute units are allowed to start processing data concurrently to mitigate supply voltage droop caused by a time rate of change of current drawn by the IC through inductive loads of the IC.
    Type: Application
    Filed: April 8, 2023
    Publication date: March 21, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Darshan GANDHI, Manish K. SHAH, Raghu PRABHAKAR, Gregory Frederick GROHOSKI, Youngmoon CHOI, Jinuk SHIN
  • Patent number: 11934343
    Abstract: Disclosed is a data processing system to receive a processing graph of an application. A compile time logic is configured to modify the processing graph and generate a modified processing graph. The modified processing graph is configured to apply a post-padding tiling after applying a cumulative input padding that confines padding to an input. The cumulative input padding pads the input into a padded input. The post-padding tiling tiles the padded input into a set of pre-padded input tiles with a same tile size, tiles intermediate representation of the input into a set of intermediate tiles with a same tile size, and tiles output representation of the input into a set of non-overlapping output tiles with a same tile size. Runtime logic is configured with the compile time logic to execute the modified processing graph to execute the application.
    Type: Grant
    Filed: July 23, 2021
    Date of Patent: March 19, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
  • Publication number: 20240085965
    Abstract: An integrated circuit (IC) includes an array of compute units. Each compute unit is configured such that, when transitioning from not processing data to processing data, the compute unit makes an individual contribution to an aggregate time rate of change of current drawn by the IC. Control circuitry is configurable to, for each compute unit of the array of compute units, control when the compute unit is eligible to transition from not processing data to processing data relative to when the other compute units start processing data to mitigate supply voltage droop caused by the aggregate time rate of change of current drawn by the IC through inductive loads of the IC.
    Type: Application
    Filed: April 8, 2023
    Publication date: March 14, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Darshan GANDHI, Manish K. SHAH, Raghu PRABHAKAR, Gregory Frederick GROHOSKI, Youngmoon CHOI, Jinuk SHIN
  • Publication number: 20240085967
    Abstract: An integrated circuit (IC) includes an array of compute units. Each compute unit is configured such that, when transitioning from not processing data to processing data, the compute unit makes an individual contribution to an aggregate time rate of change of current drawn by the IC. Control circuitry is configurable to, for each compute unit of the array of compute units, control when the compute unit is eligible to transition from not processing data to processing data relative to when the other compute units start processing data to mitigate supply voltage overshoot caused by the aggregate time rate of change of current drawn by the IC through inductive loads of the IC.
    Type: Application
    Filed: April 8, 2023
    Publication date: March 14, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Darshan GANDHI, Manish K. SHAH, Raghu PRABHAKAR, Gregory Frederick GROHOSKI, Youngmoon CHOI, Jinuk SHIN
  • Publication number: 20240085966
    Abstract: A method includes analyzing a dataflow graph to generate configuration information loadable into an integrated circuit. The dataflow graph specifies operations to be performed and data dependencies between the operations. The configuration information is usable by the integrated circuit to configure compute units of the integrated circuit to perform respective one or more of the operations of the dataflow graph, control data flow between the compute units to accomplish the data dependencies between the respective operations performed by the compute units, and control when each compute unit starts to perform the respective operations on the data to mitigate supply voltage droop caused by a time rate of change of current drawn by the integrated circuit through inductive loads of the integrated circuit.
    Type: Application
    Filed: April 8, 2023
    Publication date: March 14, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Darshan GANDHI, Manish K. SHAH, Raghu PRABHAKAR, Gregory Frederick GROHOSKI, Youngmoon CHOI, Jinuk SHIN
  • Patent number: 11928512
    Abstract: A reconfigurable data processor comprises an array of configurable units configurable to allocate a plurality of sets of configurable units in the array to implement respective execution fragments of the data processing operation. Quiesce logic is coupled to configurable units in the array, configurable to respond to a quiesce control signal to quiesce the sets of configurable units in the array on quiesce boundaries of the respective execution fragments, and to forward quiesce ready signals for the respective execution fragments when the corresponding sets of processing units are ready. An array quiesce controller distributes the quiesce control signal to configurable units in the array, and receives quiesce ready signals for the respective execution fragments from the quiesce logic.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: March 12, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Manish K. Shah, Pramod Nataraja, David Brian Jackson, Kin Hing Leung, Ram Sivaramakrishnan, Sumti Jairath, Gregory Frederick Grohoski
  • Patent number: 11928445
    Abstract: A compiler produces a configuration file to configure a fracturable data path of a configurable unit in a coarse-grained reconfigurable processor to concurrently generate different address sequences using different address calculations associated with different operations. The fracturable data path includes multiple computation stages respectively including a pipeline register. The compiler analyzes a first address calculation and a second address calculation and assigns a first set of stages to the first operation to generate the first address sequence and a second set of stages to the second operation to generate the second address sequence using the second set of stages, based on the analysis. A configuration file for the configurable unit is generated by the compiler that assigns the first set of stages to the first operation and the second set of stages to the second operation and includes two or more immediate values for each computation stage.
    Type: Grant
    Filed: January 19, 2023
    Date of Patent: March 12, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, David Brian Jackson, Scott Burson
  • Publication number: 20240069959
    Abstract: A system includes a coarse-grained reconfigurable (CGR) processor and a compiler configured to generate one or more configuration files for an application for execution on the CGR processor including an array of pattern compute units (PCUs) and pattern memory units (PMUs). A PCU is configured to perform an operation. A PMU includes operation-specific data related to the operation. The PMU is coupled to the PCU via a multi-segment datapath pipeline. The CGR processor is coupled to configure a segment of the datapath pipeline using a set of configuration bits corresponding to the operation-specific data to activate the segment, to communicate the operation-specific data to the PCU via the activated segment. A finite state machine (FSM) is configured to progress through a plurality of states corresponding to the plurality of PMU contexts and allow the PMU to switch among multiple PMU contexts sequentially or concurrently.
    Type: Application
    Filed: August 23, 2023
    Publication date: February 29, 2024
    Applicant: SambaNova Systems, Inc.
    Inventor: Raghu PRABHAKAR
  • Publication number: 20240070111
    Abstract: A reconfigurable processing unit is disclosed, comprising a first internal network and a second internal network with different protocols, an interface to an external network with a different protocol, a first configurable unit connected to the first internal network, a second configurable unit connected to both the first internal network and the second internal network, and a third configurable unit connected to both the second internal network and the interface to the external network. The third configurable unit is configured to receive a payload from the external network and send the transaction type identifier and the source application ID to the second configurable unit over the second internal network. The second configurable unit sends information to the first configurable unit based on the transaction type identifier and the source application ID matching the local application ID retrieved from the register.
    Type: Application
    Filed: October 25, 2023
    Publication date: February 29, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Manish K. SHAH, Ram SIVARAMAKRISHNAN, Gregory Frederick GROHOSKI, Raghu PRABHAKAR
  • Publication number: 20240069770
    Abstract: A system includes a coarse-grained reconfigurable (CGR) processor and a compiler configured to generate one or more configuration files for an application for execution on the CGR processor including an array of pattern compute units (PCUs) and pattern memory units (PMUs). A PCU is configured to perform an operation. A PMU comprises a plurality of data structures including a plurality of portions of operation-specific data related to the operation. The PMU is coupled to the PCU via a multi-segment datapath pipeline. The CGR processor is coupled to configure a segment of the datapath pipeline using a set of configuration bits corresponding to a portion of the operation-specific data related to the operation to activate the segment, to further communicate the operation-specific data to the PCU via the activated segment. The CGR processor is coupled to switch among multiple PMU contexts in various segments sequentially or concurrently.
    Type: Application
    Filed: August 22, 2023
    Publication date: February 29, 2024
    Applicant: SambaNova Systems, Inc.
    Inventor: Raghu PRABHAKAR
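
Illustrative sketch (not part of any patent record): the tile-based affine transform summarized in publication 20240232127 above can be modeled in software roughly as follows. This is a plain-Python approximation under stated assumptions, not the claimed hardware method. All function names are hypothetical. The matrix is assumed to have N rows of N+1 values (linear part plus translation) and to map output coordinates to input coordinates. Taking the floor of the per-dimension minimum of the transformed corners as the input tile base is also an assumption; the abstract says only that the transformed corners are used to determine it.

```python
import itertools
import math


def output_tile_bases(shape, tile):
    """Yield base coordinates of output tiles.

    Models the N counters: one per dimension, each stepping by that
    dimension's tile length (the stride).
    """
    ranges = [range(0, extent, step) for extent, step in zip(shape, tile)]
    return itertools.product(*ranges)


def tile_corners(base, tile):
    """Yield the 2**N corner pixel coordinates of one tile."""
    axes = [(b, b + t - 1) for b, t in zip(base, tile)]
    return itertools.product(*axes)


def apply_affine(matrix, point):
    """Apply an N x (N+1) affine matrix (assumed layout) to a point."""
    return tuple(
        sum(m * p for m, p in zip(row[:-1], point)) + row[-1]
        for row in matrix
    )


def input_tile_base(matrix, base, tile):
    """Assumed rule: floor of the per-dimension minimum over mapped corners."""
    mapped = [apply_affine(matrix, c) for c in tile_corners(base, tile)]
    return tuple(
        math.floor(min(p[d] for p in mapped)) for d in range(len(base))
    )


if __name__ == "__main__":
    shift = [[1, 0, 1], [0, 1, 2]]  # hypothetical 2-D translation by (1, 2)
    for base in output_tile_bases((4, 6), (2, 3)):
        print(base, "->", input_tile_base(shift, base, (2, 3)))
```

In the abstract, pattern memory units then load each input tile from the computed base coordinates; that step is omitted here because it depends on memory layout details the abstract does not give.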