Patents by Inventor Ram Sivaramakrishnan

Ram Sivaramakrishnan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Lossless tiling in convolution networks—graph metadata generation

Patent number: 12001936

Abstract: A processing graph of an application with a sequence of processing nodes is obtained which processes an input and generates an intermediate representation, a further intermediate representation, and an output representation of the input at stages in the sequence of processing nodes. Graph metadata is generated that specifies a non-overlapping target tiling configuration for the output representation, an overlapping tiling configuration for the input, an overlapping tiling configuration for the intermediate representation, and a third tiling configuration for the further intermediate representation. The processing graph is modified based on the graph metadata to conform to the parameters specified by the graph metadata. A set of computer instructions is then created to execute the modified processing graph on a target processing system.

Type: Grant

Filed: March 21, 2022

Date of Patent: June 4, 2024

Assignee: SambaNova Systems, Inc.

Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
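
Illustration (not part of the patent record): the abstract above pairs a non-overlapping output tiling with an overlapping input tiling. A minimal sketch of why that happens, assuming a one-dimensional convolution with no padding: each output tile reads an input range that extends past the tile boundary by the kernel size minus the stride. The function name `input_tiles` and its parameters are hypothetical and are not taken from the patent.

```python
def input_tiles(output_len, tile, kernel, stride=1):
    """Map non-overlapping output tiles to the input ranges they read.

    Assumes a 1D convolution without padding. Ranges are half-open
    (start, end) index pairs.
    """
    tiles = []
    for start in range(0, output_len, tile):
        end = min(start + tile, output_len)
        # The first output element of the tile reads from start * stride;
        # the last one reads `kernel` inputs beginning at (end - 1) * stride.
        in_start = start * stride
        in_end = (end - 1) * stride + kernel
        tiles.append(((start, end), (in_start, in_end)))
    return tiles


# 8 outputs, tiles of 4, kernel of 3: the two input ranges share 2 elements.
tiling = input_tiles(output_len=8, tile=4, kernel=3)
for out_range, in_range in tiling:
    print("output", out_range, "reads input", in_range)
```

In this example the output tiles (0, 4) and (4, 8) do not overlap, while their input ranges (0, 6) and (4, 10) share two elements.
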
Lossless tiling in convolution networks—tiling configuration for a sequence of sections of a graph

Patent number: 11995529

Abstract: Disclosed is a data processing system that includes compile time logic to section a graph into a sequence of sections including a first section and a second section. The compile time logic is to configure the first section with a first topology of tiling configurations in which to tile inputs, intermediate outputs, and final outputs of the first section, and configure the second section with a second topology of tiling configurations in which to tile inputs, intermediate outputs, and final outputs of the second section. The data processing system further includes runtime logic configured with the compile time logic to execute the first section to generate the inputs, intermediate outputs, and final outputs of the first section in the first topology of tiling configurations, and execute the second section to generate the inputs, intermediate outputs, and final outputs of the second section in the second topology of tiling configurations.

Type: Grant

Filed: June 30, 2021

Date of Patent: May 28, 2024

Assignee: SambaNova Systems, Inc.

Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
LOSSLESS TILING IN CONVOLUTION NETWORKS - TILING CONFIGURATION BETWEEN TWO SECTIONS

Publication number: 20240168913

Abstract: Disclosed is a method that includes sectioning a graph into a sequence of sections, the sequence of sections including at least a first section followed by a second section. The first section is configured to generate a first output in a first target tiling configuration in response to processing a first input in a first input tiling configuration. The graph is configured to reconfigure the first output in the first target tiling configuration to a second input in a second input tiling configuration. The second section is configured to generate a second output in a second target tiling configuration in response to processing the second input in the second input tiling configuration.

Type: Application

Filed: November 24, 2023

Publication date: May 23, 2024

Applicant: SambaNova Systems, Inc.

Inventors: Tejas Nagendra Babu NAMA, Ruddhi CHAPHEKAR, Ram SIVARAMAKRISHNAN, Raghu PRABHAKAR, Sumti JAIRATH, Junjue WANG, Kaizhao LIANG, Adi FUCHS, Matheen MUSADDIQ, Arvind Krishna SUJEETH
Efficient deconfiguration of a reconfigurable data processor

Patent number: 11983140

Abstract: A reconfigurable data processor comprises a bus system, and an array of configurable units connected to the bus system, configurable units in the array including configuration data stores to store unit files comprising a plurality of sub-files of configuration data particular to the corresponding configurable units. A configuration unload controller connected to the bus system, including logic to execute an array configuration unload process, including distributing a command to a plurality of the configurable units in the array to unload the unit files particular to the corresponding configurable units, the unit files each comprising a plurality of ordered sub-files, receiving sub-files via the bus system from the array of configurable units, and assembling an unload configuration file by arranging the received sub-files in memory according to the configurable unit of the unit file of which the sub-file is a part, and the order of the sub-file in the unit file.

Type: Grant

Filed: November 22, 2021

Date of Patent: May 14, 2024

Assignee: SambaNova Systems, Inc.

Inventors: Manish K. Shah, Ram Sivaramakrishnan, Mark Luttrell, David B. Jackson, Raghu Prabhakar, Sumti Jairath, Gregory Frederick Grohoski, Pramod Nataraja
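
Illustration (not part of the patent record): the abstract above describes assembling an unload configuration file by placing each received sub-file according to its configurable unit and its order within that unit's file. A minimal sketch of that placement follows. It assumes fixed-size sub-files, the same number of sub-files per unit, and a unit-major layout; these choices, along with the name `assemble_unload_file`, are hypothetical and are not stated in the abstract.

```python
def assemble_unload_file(received, num_units, subfiles_per_unit, subfile_size):
    """Place sub-files, arriving in any order, at fixed offsets in a buffer.

    `received` is an iterable of (unit_index, order_in_unit_file, payload).
    The unit-major layout used here is an assumption for illustration.
    """
    buf = bytearray(num_units * subfiles_per_unit * subfile_size)
    for unit, order, payload in received:
        if len(payload) != subfile_size:
            raise ValueError("sub-file has unexpected size")
        # Offset depends only on the unit and the sub-file's order,
        # so arrival order on the bus does not matter.
        offset = (unit * subfiles_per_unit + order) * subfile_size
        buf[offset:offset + subfile_size] = payload
    return bytes(buf)


# Sub-files arrive out of order from two units.
arrivals = [(1, 0, b"C0"), (0, 1, b"A1"), (1, 1, b"C1"), (0, 0, b"A0")]
unload_file = assemble_unload_file(arrivals, num_units=2,
                                   subfiles_per_unit=2, subfile_size=2)
print(unload_file)
```
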
Logic unit for a reconfigurable processor

Patent number: 11971846

Abstract: A logic unit in an array of processing units is configurable to consume source tokens and a status signal and to produce barrier tokens and an enable signal based on the source tokens and the status signal.

Type: Grant

Filed: February 14, 2023

Date of Patent: April 30, 2024

Assignee: SambaNova Systems, Inc.

Inventors: Raghu Prabhakar, Manish K. Shah, Ram Sivaramakrishnan, Pramod Nataraja, David Brian Jackson, Gregory Frederick Grohoski
Lossless tiling in convolution networks-backward pass

Patent number: 11934343

Abstract: Disclosed is a data processing system to receive a processing graph of an application. A compile time logic is configured to modify the processing graph and generate a modified processing graph. The modified processing graph is configured to apply a post-padding tiling after applying a cumulative input padding that confines padding to an input. The cumulative input padding pads the input into a padded input. The post-padding tiling tiles the padded input into a set of pre-padded input tiles with a same tile size, tiles intermediate representation of the input into a set of intermediate tiles with a same tile size, and tiles output representation of the input into a set of non-overlapping output tiles with a same tile size. Runtime logic is configured with the compile time logic to execute the modified processing graph to execute the application.

Type: Grant

Filed: July 23, 2021

Date of Patent: March 19, 2024

Assignee: SambaNova Systems, Inc.

Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
Quiesce reconfigurable data processor

Patent number: 11928512

Abstract: A reconfigurable data processor comprises an array of configurable units configurable to allocate a plurality of sets of configurable units in the array to implement respective execution fragments of the data processing operation. Quiesce logic is coupled to configurable units in the array, configurable to respond to a quiesce control signal to quiesce the sets of configurable units in the array on quiesce boundaries of the respective execution fragments, and to forward quiesce ready signals for the respective execution fragments when the corresponding sets of processing units are ready. An array quiesce controller distributes the quiesce control signal to configurable units in the array, and receives quiesce ready signals for the respective execution fragments from the quiesce logic.

Type: Grant

Filed: May 17, 2021

Date of Patent: March 12, 2024

Assignee: SambaNova Systems, Inc.

Inventors: Raghu Prabhakar, Manish K. Shah, Pramod Nataraja, David Brian Jackson, Kin Hing Leung, Ram Sivaramakrishnan, Sumti Jairath, Gregory Frederick Grohoski
RECONFIGURABLE DATAFLOW UNIT WITH REMOTE READ/WRITE FUNCTIONALITY

Publication number: 20240073136

Abstract: A reconfigurable processing unit is disclosed, comprising a first internal network and a second internal network with different protocols, an interface to an external network with a different protocol, a first configurable unit sending a request to access an external memory over the first internal network, a second configurable unit receiving the request on the first internal network, obtaining a memory address, determining an identifier for the target reconfigurable processing unit, and sending the request, identifier, and memory address over the second internal network, and a third configurable unit receiving the request, identifier, and memory address on the second internal network, determining a routable address on the external network based on the identifier, synthesizing a payload with the request, address, and identifier, and sending the payload to the routable address on the external network.

Type: Application

Filed: October 25, 2023

Publication date: February 29, 2024

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, Ram SIVARAMAKRISHNAN, Gregory Frederick GROHOSKI, Raghu PRABHAKAR
RECONFIGURABLE DATAFLOW UNIT HAVING REMOTE FIFO MANAGEMENT FUNCTIONALITY

Publication number: 20240070106

Abstract: A reconfigurable processing unit includes a first and second internal network, an interface to an external network, a first configurable unit coupled to the first internal network, a second configurable unit coupled to both internal networks, and a third configurable unit coupled to both the second internal network and the interface to the external network. The third configurable unit is configured to receive a payload containing a transaction type identifier and an identifier of the second configurable unit through the interface to the external network, and send a first packet including the transaction type identifier to the second configurable unit over the second internal network. The second configurable unit is configured to increment a counter in response to a particular transaction type identifier, and send a token to the first configurable unit over the first internal network while the counter is non-zero and the first configurable unit is executing.

Type: Application

Filed: October 25, 2023

Publication date: February 29, 2024

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, Ram SIVARAMAKRISHNAN, Gregory Frederick GROHOSKI, Raghu PRABHAKAR
RECONFIGURABLE DATAFLOW UNIT WITH STREAMING WRITE FUNCTIONALITY

Publication number: 20240070111

Abstract: A reconfigurable processing unit is disclosed, comprising a first internal network and a second internal network with different protocols, an interface to an external network with a different protocol, a first configurable unit connected to the first internal network, a second configurable unit connected to both the first internal network and the second internal network, and a third configurable unit connected to both the second internal network and the interface to the external network. The third configurable unit is configured to receive a payload from the external network and send the transaction type identifier and the source application ID to the second configurable unit over the second internal network. The second configurable unit sends information to the first configurable unit based on the transaction type identifier and the source application ID matching the local application ID retrieved from the register.

Type: Application

Filed: October 25, 2023

Publication date: February 29, 2024

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, Ram SIVARAMAKRISHNAN, Gregory Frederick GROHOSKI, Raghu PRABHAKAR
PEER-TO-PEER COMMUNICATION BETWEEN RECONFIGURABLE DATAFLOW UNITS

Publication number: 20240073129

Abstract: A computing system is disclosed, comprising a plurality of interconnected reconfigurable dataflow units (RDUs). Each RDU includes configurable units, internal networks, and external interfaces. The first configurable unit of the first RDU sends a request to access an external memory attached to the second RDU over its first internal network. The second configurable unit of the first RDU obtains a memory address for the request, determines an identifier for the second RDU, and sends the request, identifier, and memory address to the third configurable unit of the first RDU over its second internal network. The third configurable unit of the first RDU generates a routable address on the external network, synthesizes a payload, and sends it through an external network interface. The third configurable unit of the second RDU receives the payload, and the fourth configurable unit of the second RDU uses the address to access the external memory.

Type: Application

Filed: October 25, 2023

Publication date: February 29, 2024

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, Ram SIVARAMAKRISHNAN, Gregory Frederick GROHOSKI, Raghu PRABHAKAR
Inter-node execution of configuration files on reconfigurable processors using network interface controller (NIC) buffers

Patent number: 11886931

Abstract: The technology disclosed relates to inter-node execution of configuration files on reconfigurable processors using network interface controller (NIC) buffers. In particular, the technology disclosed relates to a runtime logic that is configured to execute configuration files that define applications and application data for applications using a first reconfigurable processor connected to a first host, and a second reconfigurable processor connected to a second host. The first reconfigurable processor is configured to push input data for the applications in a first plurality of buffers. The first host is configured to cause a first network interface controller (NIC) to stream the input data to a second plurality of buffers from the first plurality of buffers. The second host is configured to cause a second NIC to stream the input data to the second reconfigurable processor from the second plurality of buffers.

Type: Grant

Filed: November 9, 2021

Date of Patent: January 30, 2024

Assignee: SambaNova Systems, Inc.

Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
Runtime execution of functions across reconfigurable processor

Patent number: 11886930

Abstract: The technology disclosed relates to runtime execution of functions across reconfigurable processor. In particular, the technology disclosed relates to a runtime logic that is configured to execute a first set of functions in a plurality of functions and/or data therefor on a first reconfigurable processor, and a second set of functions in the plurality of functions and/or data therefor on additional reconfigurable processors. Functions in the second set of functions and/or the data therefor are transmitted to the additional reconfigurable processors using one or more of a first reconfigurable processor-to-additional reconfigurable processors buffers, and results of executing the functions and/or the data therefor on the additional reconfigurable processors are transmitted to the first reconfigurable processor using one or more of additional reconfigurable processors-to-first reconfigurable processor buffers.

Type: Grant

Filed: November 9, 2021

Date of Patent: January 30, 2024

Assignee: SambaNova Systems, Inc.

Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
Skip Buffer Splitting

Publication number: 20230385043

Abstract: A compiler transforms a high-level program into configuration data for a coarse-grained reconfigurable (CGR) data processor with an array of CGR units. The compiler includes a method that identifies a skip buffer in a dataflow graph, determines limitations associated with the array, and searches for a lowest cost implementation topology and stage depth. At least three topologies are considered, including a cascaded buffer topology, a hybrid buffer topology, and a striped buffer topology. The lowest cost implementation topology and stage depth are based on the size of the buffered data (usually, the size of a tensor), the depth of the skip buffer, and the array's limitations. The hybrid buffer topology includes multiple sections of parallel memory units. The data travels between memory units in one section to adjacent memory units in a next section without intervening reorder buffers.

Type: Application

Filed: September 14, 2022

Publication date: November 30, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Nathan SHEELEY, Weihang FAN, Matheen MUSADDIQ, Ram SIVARAMAKRISHNAN
USING INTEGRATED MATRICES IN BACK PROPAGATION COMPUTATIONS

Publication number: 20230367845

Abstract: A method comprises executing (K+P) number of transposition cycles to generate a transpose-extended matrix having N rows and (K+P) columns, in which columns 1 to K comprise a transposition of a first matrix having K rows and N columns, and columns (K+1) to (K+P) comprise constants or elements of an N×1 matrix. The method includes computing a sum-product of a row of a second matrix, having M rows and N columns, multiplied by a column among columns 1 to K of the transpose-extended matrix; and, computing a second sum-product of the row of the second matrix multiplied by a column among columns (K+1) to (K+P) of the transpose-extended matrix. The sum-products can comprise gradients of input matrices. A transpose processing unit can execute the transposition cycles to read K rows of the first matrix and insert P number of constant or N×1 columns to generate the transpose-extended matrix.

Type: Application

Filed: July 24, 2023

Publication date: November 16, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Pramod NATARAJA, Raghu PRABHAKAR, David Brian JACKSON, Ram SIVARAMAKRISHNAN
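
Illustration (not part of the patent record): a minimal sketch of the transpose-extended matrix described in the abstract above, using plain Python lists. The first matrix has K rows and N columns; the extended matrix has N rows and (K+P) columns, with the transposition in columns 1 to K and constants in the remaining P columns. The function names and the choice of 1 as the constant are assumptions made for illustration.

```python
def transpose_extend(first, p, constant=1):
    """Return an N x (K+P) matrix: transpose of `first` plus P constant columns."""
    k, n = len(first), len(first[0])
    return [[first[r][c] for r in range(k)] + [constant] * p for c in range(n)]


def sum_products(second, extended):
    """Multiply each row of `second` (M x N) by each column of `extended`."""
    n, cols = len(extended), len(extended[0])
    return [[sum(row[i] * extended[i][j] for i in range(n)) for j in range(cols)]
            for row in second]


first = [[1, 2, 3],
         [4, 5, 6]]          # K = 2 rows, N = 3 columns
second = [[1, 0, 2]]         # M = 1 row, N = 3 columns

extended = transpose_extend(first, p=1)
products = sum_products(second, extended)
print(extended)
print(products)
```

Here the first K entries of each result row are the products with the transposed matrix, and the last entry, produced by the constant column, is the sum of the corresponding row of the second matrix.
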
MATRIX SUMMATION USING INTEGRATED MATRICES WITH SCALAR INJECTION

Publication number: 20230367844

Abstract: A computing method comprises generating an integrated matrix having (K+P) number of columns, columns 1 through K of the integrated matrix comprising columns 1 through K of a multiplicand matrix and columns (K+1) through P of the integrated matrix comprising addend columns. The method computes K number of products of elements of a row of the integrated matrix multiplied by elements of a column of a second multiplicand matrix; computes a (K+1) product comprising an element of an addend column multiplied by a constant; and, computes a sum of the K number of products added to the (K+1) product. The sum is equivalent to a sum of products of a column of the M×K matrix multiplied by a row of the K×N matrix added to an element of an addend column of the integrated matrix. A computing system and a computer program product can implement the method.

Type: Application

Filed: July 24, 2023

Publication date: November 16, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Pramod NATARAJA, Raghu PRABHAKAR, David Brian JACKSON, Ram SIVARAMAKRISHNAN
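
Illustration (not part of the patent record): a minimal sketch of the sum described in the abstract above for the case of a single addend column (P = 1). Each result element is the sum of K products from the multiplicand matrices plus one extra product of the addend element and a constant. The function names and the default constant of 1 are assumptions for illustration only.

```python
def integrate(multiplicand, addend):
    """Append one addend column to an M x K multiplicand matrix (P = 1)."""
    return [row + [extra] for row, extra in zip(multiplicand, addend)]


def multiply_with_injection(integrated, second, k, constant=1):
    """Compute K products per element, then add the (K+1)-th addend product."""
    n = len(second[0])
    result = []
    for row in integrated:
        out_row = []
        for j in range(n):
            total = sum(row[i] * second[i][j] for i in range(k))
            total += row[k] * constant  # addend element times the constant
            out_row.append(total)
        result.append(out_row)
    return result


a = [[1, 2], [3, 4]]      # M x K multiplicand
b = [[5, 6], [7, 8]]      # K x N multiplicand
addend = [10, 20]         # one addend value per row of `a`

summed = multiply_with_injection(integrate(a, addend), b, k=2)
print(summed)
```

With a constant of 1, the result equals the ordinary matrix product of `a` and `b` with the addend added to every element of the matching row.
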
Runtime patching of configuration files

Patent number: 11782729

Abstract: A data processing system comprises a pool of reconfigurable data flow resources and a runtime processor. The pool of reconfigurable data flow resources includes arrays of physical configurable units and memory. The runtime processor includes logic to receive a plurality of configuration files for user applications. The configuration files include configurations of virtual data flow resources required to execute the user applications. The runtime processor also includes logic to allocate physical configurable units and memory in the pool of reconfigurable data flow resources to the virtual data flow resources and load the configuration files to the allocated physical configurable units. The runtime processor further includes logic to execute the user applications using the allocated physical configurable units and memory.

Type: Grant

Filed: August 18, 2020

Date of Patent: October 10, 2023

Assignee: SambaNova Systems, Inc.

Inventors: Gregory Frederick Grohoski, Manish K. Shah, Raghu Prabhakar, Mark Luttrell, Ravinder Kumar, Kin Hing Leung, Ranen Chatterjee, Sumti Jairath, David Alan Koeplinger, Ram Sivaramakrishnan, Matthew Thomas Grimm
TOP LEVEL NETWORK AND ARRAY LEVEL NETWORK FOR RECONFIGURABLE DATA PROCESSORS

Publication number: 20230289310

Abstract: A reconfigurable data processor comprises an array of configurable units and a bus system. The bus system is connected to the array of configurable units. The bus system includes a top level network and an array level network. The top level network is connected to an external data interface for communication with memory outside of the array of configurable units. The array level network is connected to configurable units in the array of configurable units.

Type: Application

Filed: May 18, 2023

Publication date: September 14, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Gregory Frederick GROHOSKI, Sumti JAIRATH, Mark LUTTRELL, Raghu PRABHAKAR, Ram SIVARAMAKRISHNAN, Manish K. SHAH
Matrix Multiplication on Coarse-grained Computing Grids

Publication number: 20230244748

Abstract: A method for multiplying matrices in a coarse-grained computing grid includes assigning each compute unit c of C compute units to a unique submatrix Rc of a result matrix R, wherein the C compute units are arranged in a 2D computing grid, configuring one or more source memory units to provide relevant matrix A data and matrix B data to the C compute units via a plurality of packets, configuring each compute unit c to produce the unique submatrix Rc and send the unique submatrix Rc to one or more desired memory units. The method also includes initiating data flow in the computing grid to produce the result matrix R within the desired memory units. To reduce packet traffic, Matrix B data corresponding to a column of compute units may be narrow-casted to each column of compute units. A corresponding system and computer-readable medium are also disclosed herein.

Type: Application

Filed: May 25, 2022

Publication date: August 3, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Pramod Nataraja, Sitanshu Gupta, Ram Sivaramakrishnan, Ajit Punj
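
Illustration (not part of the patent record): the abstract above assigns each compute unit in a 2D grid a unique submatrix of the result matrix R. A minimal sequential sketch of that partitioning follows; each pass of the two outer loops stands in for one compute unit. It assumes the matrix dimensions divide evenly by the grid dimensions and does not model packets, memory units, or narrow-casting. The function names are hypothetical.

```python
def block_ranges(size, parts):
    """Split range(size) into `parts` equal half-open ranges (assumes divisibility)."""
    step = size // parts
    return [(i * step, (i + 1) * step) for i in range(parts)]


def grid_matmul(a, b, grid_rows, grid_cols):
    """Compute R = A x B block by block, one block per simulated compute unit."""
    m, k, n = len(a), len(b), len(b[0])
    r = [[0] * n for _ in range(m)]
    for r0, r1 in block_ranges(m, grid_rows):
        for c0, c1 in block_ranges(n, grid_cols):
            # This unit needs only rows r0:r1 of A and columns c0:c1 of B,
            # so units in the same grid column share the same slice of B.
            for i in range(r0, r1):
                for j in range(c0, c1):
                    r[i][j] = sum(a[i][x] * b[x][j] for x in range(k))
    return r


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
result = grid_matmul(a, b, grid_rows=2, grid_cols=2)
print(result)
```
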
LOGIC UNIT FOR A RECONFIGURABLE PROCESSOR

Publication number: 20230195686

Abstract: A logic unit in an array of processing units is configurable to consume source tokens and a status signal and to produce barrier tokens and an enable signal based on the source tokens and the status signal.

Type: Application

Filed: February 14, 2023

Publication date: June 22, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Raghu PRABHAKAR, Manish K. SHAH, Ram SIVARAMAKRISHNAN, Pramod NATARAJA, David Brian JACKSON, Gregory Frederick GROHOSKI
