Patents by Inventor Sumti Jairath

Sumti Jairath has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11934343
    Abstract: Disclosed is a data processing system to receive a processing graph of an application. A compile time logic is configured to modify the processing graph and generate a modified processing graph. The modified processing graph is configured to apply a post-padding tiling after applying a cumulative input padding that confines padding to an input. The cumulative input padding pads the input into a padded input. The post-padding tiling tiles the padded input into a set of pre-padded input tiles with a same tile size, tiles intermediate representation of the input into a set of intermediate tiles with a same tile size, and tiles output representation of the input into a set of non-overlapping output tiles with a same tile size. Runtime logic is configured with the compile time logic to execute the modified processing graph to execute the application.
    Type: Grant
    Filed: July 23, 2021
    Date of Patent: March 19, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
  • Patent number: 11928512
    Abstract: A reconfigurable data processor comprises an array of configurable units configurable to allocate a plurality of sets of configurable units in the array to implement respective execution fragments of the data processing operation. Quiesce logic is coupled to configurable units in the array, configurable to respond to a quiesce control signal to quiesce the sets of configurable units in the array on quiesce boundaries of the respective execution fragments, and to forward quiesce ready signals for the respective execution fragments when the corresponding sets of processing units are ready. An array quiesce controller distributes the quiesce control signal to configurable units in the array, and receives quiesce ready signals for the respective execution fragments from the quiesce logic.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: March 12, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Manish K. Shah, Pramod Nataraja, David Brian Jackson, Kin Hing Leung, Ram Sivaramakrishnan, Sumti Jairath, Gregory Frederick Grohoski
  • Publication number: 20240078098
    Abstract: In a method, in response to an interface, a computer-implemented analysis assistant initiates a presentation of inefficiency results, determined by an efficiency analyzer based on a mapping of a dataflow program to execute on hardware of a computing system. The assistant receives an inefficiency included among the inefficiency results and composes formatted inefficiency results comprising a presentation format of the inefficiency to assist a developer of the dataflow program to interpret the inefficiency. The analysis assistant outputs the formatted inefficiency results to an interface, which can comprise an interface to output the formatted inefficiency results for use by the developer to improve the dataflow program in association with the inefficiency. In implementations, the presentation can comprise an interactive presentation with a developer of the dataflow program. A computer program product and a computing system can implement the method.
    Type: Application
    Filed: November 8, 2023
    Publication date: March 7, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Blaine RISTER, Qingjian LI, Bowen YANG, Junjue WANG, Chen LIU, Zhuo CHEN, Arvind SUJEETH, Sumti JAIRATH
  • Publication number: 20240069880
    Abstract: In a method, a computer-implemented efficiency analyzer selects operators from an intermediate representation of a dataflow program. The operators are included in a mapping of the operators to hardware of a computing system to execute the dataflow program. Based on the mapping and a description of the hardware, the efficiency analyzer computes an execution metric associated with executing the operators on the hardware. Based on the execution metric and hardware description, the efficiency analyzer determines an inefficiency metric, and based on the inefficiency metric, the efficiency analyzer determines an inefficiency associated with the dataflow program. The computing system to execute the dataflow program can comprise a coarse grain computing system and the hardware can include a reconfigurable processor of the computing system. A computer program product and a computing system to execute the dataflow program can implement the method.
    Type: Application
    Filed: November 8, 2023
    Publication date: February 29, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Blaine RISTER, Qingjian LI, Bowen YANG, Junjue WANG, Chen LIU, Zhuo CHEN, Arvind SUJEETH, Sumti JAIRATH
  • Patent number: 11893424
    Abstract: A system for training parameters of a neural network includes a processing node with a processor reconfigurable at a first level of configuration granularity and a controller reconfigurable at a finer level of configuration granularity. The processor is configured to execute a first dataflow segment of the neural network with training data to generate a predicted output value using a set of neural network parameters, calculate a first intermediate result for a parameter based on the predicted output value, a target output value, and a parameter gradient, and provide the first intermediate result to the controller. The controller is configured to receive a second intermediate result over a network, and execute a second dataflow segment, dependent upon the first intermediate result and the second intermediate result, to generate a third intermediate result indicative of an update of the parameter.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: February 6, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Martin Russell Raumann, Qi Zheng, Bandish B. Shah, Ravinder Kumar, Kin Hing Leung, Sumti Jairath, Gregory Frederick Grohoski
  • Publication number: 20240037061
    Abstract: A sorting tool for determining an ordered sequence of nodes in an operation unit graph for placing and routing the operation unit graph onto a reconfigurable processor is presented as well as a method of operating a sorting tool for determining an ordered sequence of nodes in an operation unit graph for placing and routing the operation unit graph onto a reconfigurable processor. The sorting tool is configured to receive the operation unit graph including a set of unsorted nodes and edges that interconnect nodes in the set of unsorted nodes, determine an ordered sequence of the nodes in the operation unit graph, and provide the ordered sequence of nodes for the placing and routing of the operation unit graph onto the reconfigurable processor.
    Type: Application
    Filed: July 25, 2023
    Publication date: February 1, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Hong SUH, Sumti JAIRATH
  • Publication number: 20240036871
    Abstract: A placer and router for an iterative placement and routing of a sorted operation unit graph on a reconfigurable processor is presented as well as a method of operating a placer and router for an iterative placement and routing of a sorted operation unit graph on a reconfigurable processor. The placer and router is configured to receive an architectural specification of the reconfigurable processor and the sorted operation unit graph having an ordered sequence of nodes and edges that interconnect nodes in the ordered sequence of nodes. The placer and router is further configured to provide an assignment of nodes of the sorted operation unit graph to locations on the reconfigurable processor and an assignment of edges of the sorted operation unit graph to physical links and switches of the reconfigurable processor.
    Type: Application
    Filed: July 25, 2023
    Publication date: February 1, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Hong SUH, Sumti JAIRATH
  • Publication number: 20240037063
    Abstract: A placer and router for an iterative placement and routing of a sorted operation unit graph on a reconfigurable processor is presented as well as a method of operating a placer and router for an iterative placement and routing of a sorted operation unit graph on a reconfigurable processor. The placer and router is configured to receive an architectural specification of the reconfigurable processor and the sorted operation unit graph having an ordered sequence of nodes and edges that interconnect nodes in the ordered sequence of nodes. The placer and router is further configured to iteratively assign nodes of the sorted operation unit graph to locations on the reconfigurable processor followed by an assignment of edges that connect nodes that were assigned in the current iteration and nodes that were assigned in previous iterations to interconnection resources of the reconfigurable processor.
    Type: Application
    Filed: July 25, 2023
    Publication date: February 1, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Hong SUH, Sumti JAIRATH
  • Patent number: 11886931
    Abstract: The technology disclosed relates to inter-node execution of configuration files on reconfigurable processors using network interface controller (NIC) buffers. In particular, the technology disclosed relates to a runtime logic that is configured to execute configuration files that define applications and application data for applications using a first reconfigurable processor connected to a first host, and a second reconfigurable processor connected to a second host. The first reconfigurable processor is configured to push input data for the applications in a first plurality of buffers. The first host is configured to cause a first network interface controller (NIC) to stream the input data to a second plurality of buffers from the first plurality of buffers. The second host is configured to cause a second NIC to stream the input data to the second reconfigurable processor from the second plurality of buffers.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: January 30, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
  • Patent number: 11886930
    Abstract: The technology disclosed relates to runtime execution of functions across reconfigurable processors. In particular, the technology disclosed relates to a runtime logic that is configured to execute a first set of functions in a plurality of functions and/or data therefor on a first reconfigurable processor, and a second set of functions in the plurality of functions and/or data therefor on additional reconfigurable processors. Functions in the second set of functions and/or the data therefor are transmitted to the additional reconfigurable processors using one or more of a first reconfigurable processor-to-additional reconfigurable processors buffers, and results of executing the functions and/or the data therefor on the additional reconfigurable processors are transmitted to the first reconfigurable processor using one or more of additional reconfigurable processors-to-first reconfigurable processor buffers.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: January 30, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
  • Publication number: 20240020264
    Abstract: A cost estimation tool in a system for implementing an operation unit graph on a reconfigurable processor is presented as well as a method of operating a cost estimation tool for determining scaled logical edge bandwidths in an operation unit graph in preparation of placing and routing the operation unit graph onto a reconfigurable processor. The cost estimation tool may be configured to receive the operation unit graph, divide the operation unit graph into first and second subgraphs, determine maximum latencies of the first and second subgraphs, and determine a scaled logical edge bandwidth of a logical edge that couples a first logical unit of M logical units in the first subgraph with a second logical unit of N logical units in the first subgraph based on M, N, and scaled bandwidth limits of the M and N logical units.
    Type: Application
    Filed: July 13, 2023
    Publication date: January 18, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Yue FU, Kin Hing LEUNG, Joshua BROT, Arvind Krishna SUJEETH, Sumti JAIRATH, Andrew DENG, Chris RÉ, Raghu PRABHAKAR
  • Publication number: 20240020265
    Abstract: A system with a cost estimation tool for estimating a realized bandwidth consumption of a logical edge between a logical producer unit and a logical consumer unit of an operation unit graph during placement and routing of the logical producer unit, the logical consumer unit, and the logical edge onto a reconfigurable processor is presented as well as a method of operating such a cost estimation tool and a non-transitory computer-readable storage medium including instructions that, when executed by a processing unit, cause the processing unit to operate such a cost estimation tool. The cost estimation tool may be configured to determine the realized bandwidth consumption of the tentative assignment based on an upper bandwidth limit of the logical edge, an end-to-end bandwidth, a scaling factor of a realized bandwidth, and a congestion estimation of the physical link.
    Type: Application
    Filed: July 13, 2023
    Publication date: January 18, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Yue FU, Kin Hing LEUNG, Likun HAO, Arvind Krishna SUJEETH, Sumti JAIRATH, Andrew DENG, Chris RÉ, Raghu PRABHAKAR
  • Publication number: 20240020170
    Abstract: A cost estimation tool in a system for implementing an operation unit graph on a reconfigurable processor is presented as well as a method of operating a cost estimation tool for estimating a cost of implementing an operation unit graph. The operation unit graph may include first and second logical units that perform first and second data operations and have first and second ports, respectively, coupled by a logical edge, on a reconfigurable processor. The method includes receiving the operation unit graph, determining first and second upper bandwidth limits of the first and second ports, respectively, determining a logical edge bandwidth of the logical edge based on the first and second upper bandwidth limits, determining a timing group for the logical edge, and providing the logical edge bandwidth and the timing group as a cost estimation of implementing the operation unit graph on the reconfigurable processor.
    Type: Application
    Filed: July 13, 2023
    Publication date: January 18, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Yue FU, Kin Hing LEUNG, Arvind Krishna SUJEETH, Sumti JAIRATH, Andrew DENG, Chris RÉ, Raghu PRABHAKAR
  • Patent number: 11847395
    Abstract: A system for executing a graph partitioned across a plurality of reconfigurable computing units includes a processing node that has a first computing unit reconfigurable at a first level of configuration granularity and a second computing unit reconfigurable at a second, finer, level of configuration granularity. The first computing unit is configured by a host system to execute a first dataflow segment of the graph using one or more dataflow pipelines to generate a first intermediate result and to provide the first intermediate result to the second computing unit without passing through the host system. The second computing unit is configured by the host system to execute a second dataflow segment of the graph, dependent upon the first intermediate result, to generate a second intermediate result and to send the second intermediate result to a third computing unit, without passing through the host system, to continue execution of the graph.
    Type: Grant
    Filed: January 27, 2022
    Date of Patent: December 19, 2023
    Assignee: SambaNova Systems, Inc.
    Inventors: Martin Russell Raumann, Qi Zheng, Bandish B. Shah, Ravinder Kumar, Kin Hing Leung, Sumti Jairath, Gregory Frederick Grohoski
  • Patent number: 11841811
    Abstract: A reconfigurable processor comprises an array of processing units and an instrumentation network. The array of processing units is configured to execute runtime events to execute an application. The instrumentation network is operatively coupled to the array of processing units. The instrumentation network comprises a control bus configured to form control signal routes in the instrumentation network. The instrumentation network further comprises a plurality of instrumentation counters having inputs and outputs connected to the control bus and to the processing units. Instrumentation counters in the plurality of instrumentation counters are configurable to consume control signals on the inputs and produce counts of the runtime events on the outputs.
    Type: Grant
    Filed: September 20, 2021
    Date of Patent: December 12, 2023
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Matthew Thomas Grimm, Sumti Jairath, Kin Hing Leung, Sitanshu Gupta, Yuan Lin, Luca Boasso
  • Patent number: 11816560
    Abstract: The technology disclosed relates to allocating available physical compute units (PCUs) and/or physical memory units (PMUs) of a reconfigurable data processor to operation units of an operation unit graph for execution thereof. In particular, it relates to selecting, for evaluation, an intermediate stage compute processing time between lower and upper search bounds of a generic stage compute processing time, determining a pipeline number of the PCUs and/or the PMUs required to process the operation unit graph, and iteratively, initializing new lower and upper search bounds of the generic stage compute processing time and selecting, for evaluation in a next iteration, a new intermediate stage compute processing time taking into account whether the pipeline number of the PCUs and/or the PMUs produced for a prior intermediate stage compute processing time in a previous iteration is lower or higher than the available PCUs and/or PMUs.
    Type: Grant
    Filed: August 8, 2022
    Date of Patent: November 14, 2023
    Assignee: SambaNova Systems, Inc.
    Inventors: Zhuo Chen, Sumti Jairath
  • Publication number: 20230325163
    Abstract: The technology disclosed relates to storing a dataflow graph with a plurality of compute nodes that transmit data along data connections, and controlling data transmission between compute nodes in the plurality of compute nodes along the data connections by using control connections to control writing of data.
    Type: Application
    Filed: June 7, 2023
    Publication date: October 12, 2023
    Applicant: SambaNova Systems, Inc.
    Inventors: Weiwei CHEN, Raghu PRABHAKAR, David Alan KOEPLINGER, Sitanshu GUPTA, Ruddhi CHAPHEKAR, Ajit PUNJ, Sumti JAIRATH
  • Patent number: 11782856
    Abstract: A data processing system comprises memory, compile time logic, runtime logic, and instrumentation profiling logic. The memory stores a dataflow graph for an application. The dataflow graph has a plurality of compute nodes that are configured to be producers to produce data for execution of the application, and to be consumers to consume the data for execution of the application. The compile time logic partitions execution of the dataflow graph into stages. Each of the stages has one or more compute nodes, one or more producers, and one or more consumers. The runtime logic determines a processing latency for each of the stages by calculating time elapsed between producers of a particular stage receiving input data and consumers of the particular stage receiving output data. The instrumentation profiling logic generates performance statistics for the dataflow graph based on the processing latency determined for each of the stages.
    Type: Grant
    Filed: September 20, 2021
    Date of Patent: October 10, 2023
    Assignee: SambaNova Systems, Inc.
    Inventors: Raghu Prabhakar, Matthew Thomas Grimm, Sumti Jairath, Kin Hing Leung, Sitanshu Gupta, Yuan Lin, Luca Boasso
  • Patent number: 11782729
    Abstract: A data processing system comprises a pool of reconfigurable data flow resources and a runtime processor. The pool of reconfigurable data flow resources includes arrays of physical configurable units and memory. The runtime processor includes logic to receive a plurality of configuration files for user applications. The configuration files include configurations of virtual data flow resources required to execute the user applications. The runtime processor also includes logic to allocate physical configurable units and memory in the pool of reconfigurable data flow resources to the virtual data flow resources and load the configuration files to the allocated physical configurable units. The runtime processor further includes logic to execute the user applications using the allocated physical configurable units and memory.
    Type: Grant
    Filed: August 18, 2020
    Date of Patent: October 10, 2023
    Assignee: SambaNova Systems, Inc.
    Inventors: Gregory Frederick Grohoski, Manish K. Shah, Raghu Prabhakar, Mark Luttrell, Ravinder Kumar, Kin Hing Leung, Ranen Chatterjee, Sumti Jairath, David Alan Koeplinger, Ram Sivaramakrishnan, Matthew Thomas Grimm
  • Publication number: 20230315407
    Abstract: According to a computing method a compiler determines a recompute node included in a dataflow application and a checkpoint tensor produced by the recompute node. The compiler determines a recompute cost to recompute the checkpoint tensor, and a memory cost to checkpoint the checkpoint tensor in a memory. Based on the recompute cost and/or the memory cost, the compiler determines a solution cost and compares the solution cost to a solution threshold. Based on comparing the solution cost to the solution threshold, the compiler determines a checkpoint solution to execute the dataflow application. The checkpoint solution can comprise recomputing or checkpointing the checkpoint tensor. In some implementations, the compiler can determine a recompute ratio of the recompute cost to the memory cost and can compare the recompute ratio to the solution threshold. A computer program product and a computing system can implement aspects of the method.
    Type: Application
    Filed: March 31, 2023
    Publication date: October 5, 2023
    Applicant: SambaNova Systems, Inc.
    Inventors: Bowen YANG, Zhuo CHEN, Fei WANG, Venkat Krishna SRINIVASAN, Chen LIU, Junjue WANG, Arvind Krishna SUJEETH, Sumti JAIRATH
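
The recompute-versus-checkpoint comparison described in publication 20230315407 above can be illustrated with a small sketch. This is not the method claimed in the application: the class `CheckpointCandidate`, the function `choose_solution`, the cost units, and the decision direction (recompute when the ratio of recompute cost to memory cost falls below the threshold) are all assumptions made for illustration, since the abstract does not state how the comparison resolves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointCandidate:
    """Hypothetical record for one checkpoint tensor (illustrative only)."""
    name: str
    recompute_cost: float  # assumed cost of recomputing the tensor
    memory_cost: float     # assumed cost of keeping the tensor in memory


def choose_solution(candidate: CheckpointCandidate, threshold: float) -> str:
    """Return "recompute" or "checkpoint" for one tensor.

    Assumption: the recompute ratio is recompute_cost / memory_cost, and a
    ratio below the threshold favors recomputing. The abstract only says the
    ratio is compared to a threshold, not which side wins.
    """
    if candidate.memory_cost <= 0:
        raise ValueError("memory_cost must be positive to form a ratio")
    ratio = candidate.recompute_cost / candidate.memory_cost
    return "recompute" if ratio < threshold else "checkpoint"


if __name__ == "__main__":
    cheap_to_recompute = CheckpointCandidate("activation_a", 2.0, 8.0)
    costly_to_recompute = CheckpointCandidate("activation_b", 8.0, 2.0)
    print(choose_solution(cheap_to_recompute, threshold=1.0))
    print(choose_solution(costly_to_recompute, threshold=1.0))
```

With a threshold of 1.0, a tensor that costs less to recompute than to store is recomputed, and one that costs more is checkpointed. A real compiler would derive both costs from the dataflow graph and the hardware description rather than take them as inputs.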