Patents Assigned to SambaNova Systems, Inc.
  • Patent number: 12158855
    Abstract: A method comprises a Dynamic Equality of Service (DEoS) arbiter of a switch computing port DEoS metrics based on dynamic input activity of source nodes into input ports of the switch. Based on the port DEoS metrics, the arbiter selects an input port of the switch to make a through-connection to an output port of the switch. The port DEoS metrics can be based on node DEoS metrics including DEoS counters, and/or quantization ranges of DEoS counters, associated with the source nodes. A switching apparatus comprises a switch, a plurality of nodes coupled to the switch, and a DEoS arbiter. The switching apparatus can further comprise a first and second DEoS counter. The DEoS arbiter can perform operations of the method to arbitrate among input ports of the switch to make a through-connection.
    Type: Grant
    Filed: January 16, 2023
    Date of Patent: December 3, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Mark Luttrell, Manish K. Shah
  • Publication number: 20240394218
    Abstract: System and method for optimizing data-transfer among multiple compute units in a data-parallel computing system. A topological communications configurator (TCC) determines a connections-optimized configuration of processors associated with compute nodes of the computing system. The processors can execute dataflow workers of an application and form intranodal segments of an internodal interconnection topology coupling the intranodal segments. The TCC determines the connections-optimized configuration based on internodal communications costs corresponding to communications routes among the internodal segments via the internodal interconnection fabric.
    Type: Application
    Filed: August 5, 2024
    Publication date: November 28, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Greg DYKEMA, Aarti LALWANI
  • Publication number: 20240388519
    Abstract: A processor includes an internal network with separate packet-switching networks for request, response, data, and credit transmission. Each of the four networks includes switches interconnected by links. Interface circuits connect the internal network to communication links or electronic memory and communicate over the internal network. A network recovery circuit is also coupled to the internal network. Each switch has ports, buffers for input packets, routing circuitry to send packets to output ports based on destinations, and a watchdog timer to detect packet timeouts and notify the network recovery circuit of delays. The network recovery circuit responds to timeout messages by setting a network failure condition, ensuring efficient and reliable network operation.
    Type: Application
    Filed: May 7, 2024
    Publication date: November 21, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Paul JORDAN, Manish K. SHAH
  • Publication number: 20240385920
    Abstract: A coarse-grained reconfigurable architecture processor is disclosed, featuring an array of configurable units capable of executing an application with defined progress milestones. The processor includes a control bus connecting the configurable units and a hang detection circuit with a timer that resets upon receiving a control signal via the control bus. Upon reaching a progress milestone, a configurable unit sends a control signal to the hang detection circuit via the control bus. The hang detection circuit monitors the application's execution for hang conditions by detecting timer expiration, ensuring efficient and reliable processing of applications on the reconfigurable processor.
    Type: Application
    Filed: May 7, 2024
    Publication date: November 21, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Paul JORDAN, Manish K. SHAH
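    Illustrative sketch (not from the patent): the abstract above describes a timer that is reset whenever a configurable unit reports a progress milestone, with timer expiration treated as a hang. A minimal tick-based simulation of that idea follows. The class name `HangDetector`, the tick model, and the `run` helper are all assumptions made for illustration, not the disclosed circuit.

```python
class HangDetector:
    """Toy model of a watchdog that a progress milestone resets (hypothetical)."""

    def __init__(self, timeout_ticks):
        self.timeout = timeout_ticks
        self.remaining = timeout_ticks
        self.hung = False

    def milestone(self):
        # Models the control signal sent when a progress milestone is reached.
        self.remaining = self.timeout

    def tick(self):
        # Advance one time step; once expired, the hang condition is latched.
        if not self.hung:
            self.remaining -= 1
            if self.remaining <= 0:
                self.hung = True
        return self.hung


def run(milestone_ticks, total_ticks, timeout_ticks):
    """Return the tick at which a hang is detected, or None if none occurs."""
    detector = HangDetector(timeout_ticks)
    for t in range(total_ticks):
        if t in milestone_ticks:
            detector.milestone()
        if detector.tick():
            return t
    return None
```

    With a timeout of 4 ticks, milestones at ticks 0, 3, 6, and 9 keep the timer from expiring over 12 ticks, while milestones at only ticks 0 and 3 let it expire.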
  • Publication number: 20240385929
    Abstract: Disclosed is a method for resetting configurable units in a reconfigurable processor with an array of configurable units and a force-quit controller on an integrated circuit substrate. The array includes multiple sub-arrays of configurable units. The method involves receiving a force-quit command at the force-quit controller and generating force-quit control signals to reset configurable units in a specific sub-array of the plurality of sub-arrays. The specific sub-array contains the force-quit controller. This method enhances the efficiency and reliability of reconfigurable processors by enabling targeted and controlled resets of configurable units within the reconfigurable processor architecture.
    Type: Application
    Filed: July 30, 2024
    Publication date: November 21, 2024
    Applicant: SambaNova Systems, Inc.
    Inventor: Manish K. SHAH
  • Publication number: 20240388493
    Abstract: A Coarse-grained Reconfigurable Processor (CGRP) includes an internal network with request, response, and data networks operating concurrently as separate packet-switched networks. The CGRP includes external interface circuits coupled to an interface configurable unit in an array of configurable units through the internal network. The CGRP also includes a network health monitor circuit that is configured to detect network failure conditions by writing and then reading health monitor registers across the internal network.
    Type: Application
    Filed: May 7, 2024
    Publication date: November 21, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Paul JORDAN, Manish K. SHAH
  • Publication number: 20240385921
    Abstract: A computing system is disclosed, comprising a host computer and multiple CGRPs (coarse-grained reconfigurable architecture processors) connected to the host computer through external communication links. Each CGRP includes an internal network, external interface circuits, memory interface circuits, arrays of configurable units, hang detection circuits, force-quit controllers, and a network recovery circuit with control registers. The host computer is programmed to configure and execute applications across the arrays of configurable units in both CGRPs. In case of a hang detection in one CGRP, the network recovery circuit initiates a force quit process and notifies the host computer. Additionally, the network recovery circuit compares application IDs and halts execution in the other CGRP if necessary. This system provides efficient failure tolerance and recovery mechanisms for parallel processing applications.
    Type: Application
    Filed: May 7, 2024
    Publication date: November 21, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Paul JORDAN, Manish K. SHAH
  • Patent number: 12147339
    Abstract: A processor has multiple memory interfaces and a memory interleaver controlling access to the memory interfaces. The memory interfaces may each couple with one or more memory devices. The number of memory devices coupled to the different memory interfaces may be unequal. The memory interleaver determines a memory region from a logical address, and a region relative address. It determines the interleave factor IF corresponding to the memory region. It performs an integer division to obtain a device line address, and a modulo operation to obtain an uncorrected channel address. The memory interleaver may add a region start address associated with the memory region to the device line address to obtain a physical line address. It may correct the uncorrected channel address, based on the memory region, to obtain a physical channel address. Some implementations use configuration memories to allow flexibility, other implementations are hardwired for a particular memory architecture.
    Type: Grant
    Filed: June 7, 2023
    Date of Patent: November 19, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Paul Jordan, Manish K. Shah
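    Illustrative sketch (not from the patent): the address arithmetic in the abstract above (region lookup, integer division by the interleave factor, modulo for the channel, then region start offset and channel correction) can be modeled in a few lines. The `Region` fields, the per-region `channel_map` used as the correction step, and the example layout are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    logical_start: int      # first logical line address covered by the region
    logical_size: int       # number of logical lines in the region
    interleave_factor: int  # IF: how many channels the region is spread over
    region_start: int       # device line address where the region begins
    channel_map: tuple      # assumed correction: uncorrected -> physical channel


def translate(logical_addr, regions):
    """Map a logical line address to (physical line address, physical channel)."""
    for region in regions:
        relative = logical_addr - region.logical_start  # region relative address
        if 0 <= relative < region.logical_size:
            device_line = relative // region.interleave_factor
            uncorrected_channel = relative % region.interleave_factor
            physical_line = region.region_start + device_line
            physical_channel = region.channel_map[uncorrected_channel]
            return physical_line, physical_channel
    raise ValueError(f"address {logical_addr} is outside every region")


# Hypothetical layout: channels 0 and 1 hold 4 lines each, channel 2 holds 8.
# Region A interleaves across all three channels; region B uses the extra
# capacity that only channel 2 has.
REGIONS = (
    Region(logical_start=0, logical_size=12, interleave_factor=3,
           region_start=0, channel_map=(0, 1, 2)),
    Region(logical_start=12, logical_size=4, interleave_factor=1,
           region_start=4, channel_map=(2,)),
)
```

    In this layout, logical line 7 falls in region A (line 2 on channel 1), and logical line 13 falls in region B (line 5 on channel 2).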
  • Patent number: 12147381
    Abstract: A method for placing, routing and using compute units and memory units in a reconfigurable computing grid includes receiving a placement graph for a computing task that defines a set of unplaced memory units, a set of unplaced compute units and data connections between the unplaced memory units and the unplaced compute units, the data connections comprising primary connections corresponding to the primary ports of the unplaced compute units and secondary connections corresponding to the secondary ports of the unplaced compute units. The method also includes forming a subgraph for each unplaced memory unit having a primary connection, each subgraph comprising the unplaced memory unit and each unplaced compute unit connected to the unplaced memory unit via a primary connection. The method also includes placing each formed subgraph as a cluster on the reconfigurable computing grid. A corresponding computer program product and system are also disclosed herein.
    Type: Grant
    Filed: December 16, 2022
    Date of Patent: November 19, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Kin Hing Leung, Feng Sheng, Ajit Punj
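    Illustrative sketch (not from the patent): the subgraph-forming step in the abstract above groups each memory unit that has a primary connection with the compute units attached to it through primary connections. The tuple-based graph representation and the function name `form_subgraphs` are assumptions; placement and routing of the resulting clusters are not modeled.

```python
def form_subgraphs(connections):
    """Group compute units under the memory unit they reach via a primary port.

    `connections` is an iterable of (memory_unit, compute_unit, kind) tuples,
    where kind is "primary" or "secondary" (a hypothetical encoding).
    Returns a dict mapping each memory unit that has at least one primary
    connection to the list of compute units in its subgraph.
    """
    subgraphs = {}
    for memory_unit, compute_unit, kind in connections:
        if kind == "primary":
            subgraphs.setdefault(memory_unit, []).append(compute_unit)
    return subgraphs


# Hypothetical placement graph: m2 has only a secondary connection, so it
# forms no subgraph of its own.
example = [
    ("m0", "c0", "primary"),
    ("m0", "c1", "primary"),
    ("m1", "c2", "primary"),
    ("m2", "c0", "secondary"),
]
clusters = form_subgraphs(example)
```

    Each entry of `clusters` would then be placed on the grid as one cluster.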
  • Publication number: 20240378147
    Abstract: A convolution calculation engine includes a kernel element counter for a convolution operation between a kernel and an input tensor. The kernel element counter wraps back to an initial kernel count value after reaching a maximum kernel count value. The convolution calculation engine also includes an offset look-up table (LUT) that provides a relative input offset into the input tensor based on an output of the kernel element counter and input location calculation logic that provides an input location within an input tensor for the convolution operation based on the relative input offset provided by the offset LUT.
    Type: Application
    Filed: May 8, 2023
    Publication date: November 14, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Mark William Gottscho, Ram SIVARAMAKRISHNAN, David Brian JACKSON, Ruddhi CHAPHEKAR, Tuowen Zhao, Lei Xia
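    Illustrative sketch (not from the patent): the abstract above combines a wrapping kernel element counter, an offset look-up table indexed by that counter, and logic that adds the relative offset to a base location. The software model below follows that structure for a 2D, stride-1, unpadded case without kernel flipping. All names, the dilation parameter, and the row-major LUT ordering are assumptions.

```python
def build_offset_lut(kernel_h, kernel_w, dilation=1):
    """Relative (row, col) input offset for each kernel element, row-major."""
    return [(ky * dilation, kx * dilation)
            for ky in range(kernel_h) for kx in range(kernel_w)]


class KernelElementCounter:
    """Counts from an initial value to a maximum value, then wraps back."""

    def __init__(self, max_count, initial=0):
        self.initial = initial
        self.max_count = max_count
        self.value = initial

    def step(self):
        current = self.value
        self.value = self.initial if current == self.max_count else current + 1
        return current


def conv2d(inp, kernel):
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    weights = [w for row in kernel for w in row]   # same order as the LUT
    lut = build_offset_lut(kernel_h, kernel_w)
    counter = KernelElementCounter(max_count=kernel_h * kernel_w - 1)
    out_h = len(inp) - kernel_h + 1
    out_w = len(inp[0]) - kernel_w + 1
    out = [[0] * out_w for _ in range(out_h)]
    for oy in range(out_h):
        for ox in range(out_w):
            for _ in range(kernel_h * kernel_w):
                k = counter.step()
                dy, dx = lut[k]                    # relative input offset
                # Input location = base location + relative offset.
                out[oy][ox] += weights[k] * inp[oy + dy][ox + dx]
    return out
```

    Because the counter wraps after the last kernel element, it is already back at its initial value when the next output location begins.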
  • Publication number: 20240378259
    Abstract: A convolution calculation engine to perform a convolution operation includes a convolution address compute unit. The convolution address compute unit includes an outer output base location register to provide an outer output base location for the convolution operation and an outer input base location register to provide an outer input base location for the convolution operation. It also includes a kernel element counter that starts to count from an initial kernel count value to a maximum kernel count value in response to a change in the outer output base location and a kernel offset generator to generate a kernel offset based on an output of the kernel element counter. In addition, the convolution address compute unit includes inner location logic to calculate an output location based on the outer output base location and an input location based on the outer input base location and output of the kernel element counter.
    Type: Application
    Filed: May 8, 2023
    Publication date: November 14, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Mark William Gottscho, Ram SIVARAMAKRISHNAN, David Brian JACKSON, Ruddhi CHAPHEKAR, Tuowen Zhao, Lei Xia
  • Patent number: 12143298
    Abstract: A computing system is disclosed, comprising a plurality of interconnected reconfigurable dataflow units (RDUs). Each RDU includes configurable units, internal networks, and external interfaces. The first configurable unit of the first RDU sends a request to access an external memory attached to the second RDU over its first internal network. The second configurable unit of the first RDU obtains a memory address for the request, determines an identifier for the second RDU, and sends the request, identifier, and memory address to the third configurable unit of the first RDU over its second internal network. The third configurable unit of the first RDU generates a routable address on the external network, synthesizes a payload, and sends it through an external network interface. The third configurable unit of the second RDU receives the payload, and the fourth configurable unit of the second RDU uses the address to access the external memory.
    Type: Grant
    Filed: October 25, 2023
    Date of Patent: November 12, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Manish K. Shah, Ram Sivaramakrishnan, Gregory Frederick Grohoski, Raghu Prabhakar
  • Publication number: 20240370240
    Abstract: A system and method for transforming a high-level program into configuration data for a coarse-grained reconfigurable (CGR) data processor with an array of CGR units. The high-level program is transformed into a dataflow graph that includes multiple interdependent asynchronously performing meta-pipelines. A first buffer is identified that stores data that is passed from a producer in a first meta-pipeline stage to a consumer in a second meta-pipeline stage. The system determines limitations associated with the array, and selects for implementation the lowest-cost buffer topology, chosen from a cascaded buffer topology, a hybrid buffer topology, and a striped buffer topology, where cost is based on the number of memory units and on the number of times data is written into a memory unit while traveling through the first buffer. Optimal configuration data for the array is generated and stored.
    Type: Application
    Filed: July 17, 2024
    Publication date: November 7, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Nathan Francis SHEELEY, Weihang FAN, Matheen MUSADDIQ, Ram SIVARAMAKRISHNAN
  • Publication number: 20240370402
    Abstract: A reconfigurable processor is disclosed, featuring an array of configurable units interconnected by a bus system. Each configurable unit includes a configuration data store structured as a shift register that includes individually addressable argument registers. Program load logic is responsible for receiving sub-files of configuration data through the bus system and sequentially shifting them into the configuration data store, including the argument registers. Argument load logic is designed to receive argument data via the bus system and directly load it into the argument registers without the need for shifting through the shift register.
    Type: Application
    Filed: July 17, 2024
    Publication date: November 7, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Manish K. SHAH, Gregory Frederick GROHOSKI
  • Patent number: 12135971
    Abstract: A computing system includes an array of configurable units made up of sub-arrays of configurable units. Each sub-array has a first number of configurable compute units and a second number of configurable memory units with a first spatial arrangement. Each configurable unit includes a configuration data store. The system also includes a statically configurable bus system coupled to the configurable units and a tag indicating a sub-array of configurable units having a defect. A defect-aware configuration controller sends configuration data to the configuration data stores to implement a data processing operation using the array of configurable units by generating static route control signals for the statically configurable bus system, based on the tag and without support of a host processor, to send a portion of the configuration data targeted to the sub-array having the defect to a configuration data store of an alternative sub-array of configurable units in the array.
    Type: Grant
    Filed: August 22, 2023
    Date of Patent: November 5, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Gregory Frederick Grohoski, Manish K. Shah, Kin Hing Leung
  • Publication number: 20240345936
    Abstract: A system comprising a tool for providing actionable insight for bring-up and performance debugging of performant dataflow graphs on a CGRA. A system comprising a tool for providing hierarchical, traceable graph transformations of a dataflow graph, annotated with runtime information mapped from hardware metrics back onto higher levels of the stack after compilation and execution. A system comprising a tool for system performance monitoring and tuning by composing compile-time and runtime information of a workload dataflow graph on a CGRA.
    Type: Application
    Filed: April 10, 2024
    Publication date: October 17, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Muthiah ANNAMALAI, Anders RAVNBORG
  • Publication number: 20240338297
    Abstract: A data processing system includes an array of reconfigurable units and a compiler configured to generate one or more configuration files for an application for execution on one or more reconfigurable processors. The data processing system further includes execution flow logic configured to cause execution of the configuration files on the reconfigurable processors to be dependent upon one or more breakpoint conditions. The data processing system further includes runtime logic configured to execute the configuration files depending upon the breakpoint conditions. A corresponding method is also disclosed herein.
    Type: Application
    Filed: September 11, 2023
    Publication date: October 10, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Arnav GOEL, Qi ZHENG, Guoyao FENG, Chen YANG, Jianding LUO
  • Publication number: 20240338340
    Abstract: A data processing system including an array of reconfigurable units and a compiler configured to generate to execute a dataflow graph of a user application is disclosed. The dataflow graph includes a sequence of temporal partitions, each temporal partition including a sequence of graph control operations. Also disclosed is an intelligent graph orchestration and execution engine (IGOEE) configured to receive an optimization objective from the compiler. The optimization objective can be for minimizing execution time of the reconfigurable processor or maximizing computing resource utilization of the reconfigurable processor. The IGOEE can reorganize the sequence of temporal partitions and the sequence of graph control operations within each temporal partition to satisfy the optimization objective, and execute the reorganized dataflow graph on the reconfigurable processor. A corresponding method is also disclosed herein.
    Type: Application
    Filed: September 8, 2023
    Publication date: October 10, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Arnav GOEL, Ravinder KUMAR, Arjun SABNIS, Qi ZHENG, Neal SANGHVI
  • Patent number: 12112250
    Abstract: A data processing system includes compile time logic to section a graph into a sequence of sections, including a first section followed by a second section. The compile time logic configures the first section to generate a first output in a first non-overlapping target configuration in response to processing an input in a first overlapping input configuration, and configures the second section to generate a second output in a second non-overlapping target configuration in response to processing the first output in a second overlapping input configuration. The compile time logic also creates a set of computer instructions to execute the first section and the second section on a target processing system.
    Type: Grant
    Filed: April 4, 2022
    Date of Patent: October 8, 2024
    Assignee: SambaNova Systems, Inc.
    Inventors: Tejas Nagendra Babu Nama, Ruddhi Chaphekar, Ram Sivaramakrishnan, Raghu Prabhakar, Sumti Jairath, Junjue Wang, Kaizhao Liang, Adi Fuchs, Matheen Musaddiq, Arvind Krishna Sujeeth
  • Publication number: 20240330236
    Abstract: A computing system includes a first network, a second network, multiple first agents connected to the first network, multiple second agents connected to the second network, and an interface circuit interconnecting the two networks. The interface circuit includes multiple request queues, a first arbiter for selecting requests from the second agents for transactions on the first network and entering them into the request queues, and credit counters associated with the first agents. A second arbiter selects requests from the oldest entry of each request queue based on the credit counters, sends transactions over the first network, and removes the selected requests from their respective queues. This system efficiently manages communication between the first and second networks, enhancing overall system performance.
    Type: Application
    Filed: June 10, 2024
    Publication date: October 3, 2024
    Applicant: SambaNova Systems, Inc.
    Inventors: Manish K. SHAH, John Philipp BAXLEY
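    Illustrative sketch (not from the patent): the second arbiter in the abstract above picks among the oldest entries of several request queues, gated by per-agent credit counters, and removes what it sends. The toy model below uses round-robin order among eligible queue heads, which is an assumption since the abstract does not name a selection policy. The class name `Bridge` and its methods are likewise hypothetical.

```python
from collections import deque


class Bridge:
    """Toy model of credit-gated arbitration over request queues."""

    def __init__(self, num_queues, credits):
        self.queues = [deque() for _ in range(num_queues)]
        self.credits = dict(credits)  # credits per target agent
        self.next_queue = 0           # round-robin pointer (assumed policy)

    def enqueue(self, queue_id, target, payload):
        # Stands in for the first arbiter entering a request into a queue.
        self.queues[queue_id].append((target, payload))

    def return_credit(self, target):
        # The target agent frees a slot and hands a credit back.
        self.credits[target] += 1

    def arbitrate(self):
        """Send the oldest entry of one queue whose target still has credit."""
        n = len(self.queues)
        for i in range(n):
            q = (self.next_queue + i) % n
            if not self.queues[q]:
                continue
            target, payload = self.queues[q][0]  # only the oldest entry is eligible
            if self.credits[target] > 0:
                self.queues[q].popleft()
                self.credits[target] -= 1
                self.next_queue = (q + 1) % n
                return q, target, payload
        return None  # nothing eligible this cycle
```

    Because only the oldest entry of each queue is considered, a request whose target has credit can still wait behind an older request whose target has none.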