Patents Assigned to SambaNova Systems, Inc.

Bandwidth-Aware Computational Graph Mapping

Publication number: 20230297349

Abstract: A computer-implemented method of transforming a high-level program for mapping onto a coarse-grained reconfigurable (CGR) processor with an array of CGR units, including sectioning a dataflow graph into a plurality of sections; extracting performance information for each of the plurality of sections; on a CGR unit: assigning to a section at least two computations dependent on a first data element; scheduling an additional load of the first data element in response to available memory bandwidth for that section; eliminating a buffer between the additional load of the first data element and one of the two computations, for that section; generating configuration data for the and communication channels, wherein the configuration data, when loaded onto an instance of the array of CGR units, causes the array of CGR units to implement the dataflow graph; and storing the configuration data in a non-transitory computer-readable storage medium.

Type: Application

Filed: March 15, 2023

Publication date: September 21, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Gao DENG, Weihang FAN, Fei WANG, Yun DU
Defect avoidance in a multidimensional array of functional configurable units

Patent number: 11762665

Abstract: A system includes a multidimensional array of homogenous Functional Configurable Units (FCUs), coupled using a multidimensional array of switches, and a parameter store on the device which stores parameters that tag a subarray of FCUs as unusable. Technologies are described which change the pattern of placement of configuration data, in dependence on the tagged subarray, by changing the routing through the array of switches. As a result, a multidimensional array of FCUs having unusable elements can still be used.

Type: Grant

Filed: May 5, 2022

Date of Patent: September 19, 2023

Assignee: SambaNova Systems, Inc.

Inventors: Gregory F. Grohoski, Manish K. Shah, Kin Hing Leung
TOP LEVEL NETWORK AND ARRAY LEVEL NETWORK FOR RECONFIGURABLE DATA PROCESSORS

Publication number: 20230289310

Abstract: A reconfigurable data processor comprises an array of configurable units and a bus system. The bus system is connected to the array of configurable units. The bus system includes a top level network and an array level network. The top level network is connected to an external data interface for communication with memory outside of the array of configurable units. The array level network is connected to configurable units in the array of configurable units.

Type: Application

Filed: May 18, 2023

Publication date: September 14, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Gregory Frederick GROHOSKI, Sumti JAIRATH, Mark LUTTRELL, Raghu PRABHAKAR, Ram SIVARAMAKRISHNAN, Manish K. SHAH
PARTITIONING DATAFLOW OPERATIONS FOR A RECONFIGURABLE COMPUTING SYSTEM

Publication number: 20230281156

Abstract: A method for partitioning executable operations for a reconfigurable computing system includes receiving a set of expressions comprising a plurality of operations and dependencies for those operations, partitioning the plurality of operations into selected executable partitions wherein each selected executable partition conforms to resource constraints for a reconfigurable unit of the reconfigurable computing system. Partitioning the plurality of operations into selected executable partitions may include seeding a candidate partition with an operation, recursively generating an additional candidate partition for each operation adjacent to the candidate partition whose dependent operations are already within the candidate partition or a previously selected partition, and selecting a best candidate partition based on resource cost. A corresponding system and computer-readable medium are also disclosed herein.

Type: Application

Filed: August 23, 2022

Publication date: September 7, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Yaqi Zhang, Mark Wagner, Matthew Feldman, Weiwei Chen
Critical Stage Optimization for Reconfigurable Architectures

Publication number: 20230273879

Abstract: A method for reducing latency and increasing throughput in a reconfigurable computing system includes receiving a user program for execution on a reconfigurable dataflow computing system, comprising a grid of compute units and grid of memory units interconnected with a switching array. The user program includes multiple tensor-based algebraic expressions that are converted to an intermediate representation comprising multiple stages. Each stage includes one or more logical operations executable via dataflow through compute units, and each stage is preceded by and followed by a buffer, each buffer corresponding to one or more memory units. The method includes detecting a memory mapping operation within a critical stage and moving the memory mapping operation to an adjacent stage, wherein the memory mapping operation is executable by memory units within the adjacent stage and dataflow through the buffer is controlled by one or more memory units within the grid of memory units.

Type: Application

Filed: February 28, 2023

Publication date: August 31, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Adam BORDELON, David Alan KOEPLINGER
Switch for routing data in an array of functional configurable units

Patent number: 11740911

Abstract: A system includes a multidimensional array of homogenous Functional Configurable Units (FCUs), coupled using a multidimensional array of switches, and a parameter store on the device which stores parameters that tag a subarray of FCUs as unusable. Technologies are described which change the pattern of placement of configuration data, in dependence on the tagged subarray, by changing the routing through the array of switches. As a result, a multidimensional array of FCUs having unusable elements can still be used.

Type: Grant

Filed: May 6, 2022

Date of Patent: August 29, 2023

Assignee: SambaNova Systems, Inc.

Inventors: Gregory F. Grohoski, Manish K. Shah, Kin Hing Leung
Overlapping Gradient Synchronization In Machine Learning

Publication number: 20230259823

Abstract: In a method, an orchestrator of a computing system determines that results of Machine Learning model computations are available and dispatches a worker to perform model computations that include computing gradients of the results. The orchestrator determines that a set of gradients of the results is available and dispatches a gradient worker to compute a sum of the gradients. The orchestrator determines that a second set of gradients of the results is available and dispatches a second gradient worker to compute a sum of the second set of gradients. The orchestrator determines that the sums of the first and second gradients are available and dispatches a third gradient worker to compute synchronized gradients. The gradient workers compute the sums and synchronized gradients concurrently with training workers computing additional model computation results and/or gradients. A computer program product can include the method and a computing system can include the orchestrator.

Type: Application

Filed: February 13, 2023

Publication date: August 17, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Greg DYKEMA, Fansheng CHENG, Kuan ZHOU, Arnav GOEL, Subhra MAZUMDAR, Milad SHARIF, Po-Yu WU, Bowen YANG, Qi ZHENG
Dynamically-Sized Data Structures on Data Flow Architectures

Publication number: 20230259477

Abstract: A data processing system for implementing operations that generate a dynamically-sized output is presented. The data processing system includes a reconfigurable processor that is configured to implement a first operation, a second operation, a recording unit, and a control unit. The first operation generates an output, wherein a size of the output is unknown during a configuration phase. The second operation receives the output of the first operation as an input. The recording unit generates control data that is indicative of the size of the output. The control unit provides the control data to the second operation, wherein the second operation processes the input based on the control data.

Type: Application

Filed: February 14, 2023

Publication date: August 17, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Abhishek SRIVASTAVA, Matthew VILIM, Raghu PRABHAKAR, Sankar RACHURU, Zhekun ZHANG, Matheen MUSADDIQ, Apurv VIVEK, Sitanshu GUPTA, Ayesha Siddiqua
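The abstract of "Dynamically-Sized Data Structures on Data Flow Architectures" describes a first operation whose output size is unknown at configuration time, and a second operation that is told the size through control data. A minimal Python sketch of that idea follows. The fixed `CAPACITY`, the threshold filter, and the function names are illustrative assumptions, not details taken from the application.

```python
CAPACITY = 8  # assumed fixed buffer size chosen at "configuration" time


def first_operation(values, threshold):
    """Filter values into a fixed-capacity buffer; the fill count is only known at run time."""
    if len(values) > CAPACITY:
        raise ValueError("input exceeds the assumed buffer capacity")
    buffer = [0] * CAPACITY
    count = 0
    for value in values:
        if value > threshold:
            buffer[count] = value
            count += 1
    # The count plays the role of the recorded control data (the output size).
    return buffer, count


def second_operation(buffer, size):
    """Consume only the first `size` entries, as indicated by the control data."""
    return sum(buffer[:size])


buffer, size = first_operation([5, 1, 9, 3, 7], threshold=4)
total = second_operation(buffer, size)
```

Here the buffer is always allocated at full capacity, and only the control data tells the consumer how much of it is valid.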
EXPLOITING SHARED DIMENSIONS IN MATRIX COMPUTATIONS

Publication number: 20230252106

Abstract: A method generates pairs of split matrices based on a left and a right matrix sharing dimension K. A first column-split matrix comprises columns 1 to Q of the left matrix and a second column-split matrix comprises columns Q+1 to Q+P of the left matrix. A first row-split matrix comprises rows 1 to Q of the right matrix and a second row-split matrix comprises rows Q+1 to Q+P of the right matrix. The method multiplies the first column-matrix and first row matrix to compute a first dot product, and multiplies the second column-matrix and second row matrix to compute a second dot product. The method adds the dot products to compute a third dot product. The method can compute the first and second dot products concurrently. A computing system can comprise a matrix splitter to generate the matrices and can comprise matrix processing units to compute the dot products.

Type: Application

Filed: February 3, 2023

Publication date: August 10, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Pramod NATARAJA, Raghu PRABHAKAR
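The split along the shared dimension K described in "Exploiting Shared Dimensions in Matrix Computations" can be illustrated with a short Python sketch. It uses plain lists and a single split point `q`; the function names are hypothetical and the two partial products are computed sequentially here, although they are independent and could run concurrently.

```python
def matmul(a, b):
    """Plain triple-loop product of two list-of-lists matrices."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def split_k_matmul(left, right, q):
    """Split the shared dimension K at position q and add the two partial products."""
    k = len(right)
    if not 0 < q < k:
        raise ValueError("q must satisfy 0 < q < K")
    # Column-split the left matrix: columns 1..Q and Q+1..K.
    left_1 = [row[:q] for row in left]
    left_2 = [row[q:] for row in left]
    # Row-split the right matrix: rows 1..Q and Q+1..K.
    right_1, right_2 = right[:q], right[q:]
    # Each partial product has the full output shape.
    part_1 = matmul(left_1, right_1)
    part_2 = matmul(left_2, right_2)
    # Element-wise sum of the partial products gives the full product.
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(part_1, part_2)]


left = [[1, 2, 3], [4, 5, 6]]
right = [[7, 8], [9, 10], [11, 12]]
product = split_k_matmul(left, right, q=2)
```

The result matches the unsplit product because each output element is a sum over K, and a sum can be evaluated in two pieces.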
Timing Margin Sensor

Publication number: 20230251683

Abstract: A timing margin sensor circuit includes one or more time-to-digital converters (TDCs), a predictor, and a translation circuit. The TDC(s) measure(s) progress of a clock signal through one or more chains of delay stages. The progress depends on sense conditions acting upon the delay chain, such as the supply voltage and the temperature. The predictor receives the measured progress. If the delay chain becomes slower, the predictor extrapolates a predicted progress value. If the delay chain becomes faster, the predictor outputs the actual progress value. The translator translates the predictor output value to sense information that can be used in a clock stretcher circuit. The timing margin sensor may further have an averager/selector to average or select from the results of multiple TDCs. The timing margin sensor may further have a calibrator to compensate for nominal sense conditions, and one or more tunable delay circuits.

Type: Application

Filed: January 31, 2023

Publication date: August 10, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Mahmood KHAYATZADEH, Satyajit SARKAR, Jinuk SHIN
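The asymmetric predictor behavior described in the "Timing Margin Sensor" abstract (extrapolate when the delay chain slows down, pass the measurement through when it speeds up) might be sketched as below. The abstract does not say how the extrapolation is done; the linear one-step extrapolation and the function name are assumptions made purely for illustration.

```python
def predict_progress(previous, current):
    """Return the value the predictor would forward to the translation stage.

    `previous` and `current` are successive progress readings (for example,
    the number of delay stages the clock edge passed). Lower progress means
    a slower delay chain.
    """
    if current < previous:
        # Chain is slowing down: assume the trend continues for one more
        # sample (linear extrapolation, an assumption of this sketch).
        return current + (current - previous)
    # Chain is steady or speeding up: report the actual measurement.
    return current


slowing = predict_progress(previous=40, current=36)
speeding = predict_progress(previous=36, current=40)
```

The asymmetry is conservative: the predictor anticipates a further loss of margin but never anticipates a gain.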
Two-Level Arbitration in a Reconfigurable Processor

Publication number: 20230251993

Abstract: A coarse-grained reconfigurable (CGR) processor includes agents coupled to a first network, an array of CGR units connected by a second network, and a tile agent coupled between the first and second networks. The tile agent includes links to receive requests for transactions on the first network, request queues respectively associated with the links, credit counters associated with respective agents, a first arbiter, and a second arbiter. The first arbiter selects a request from the received requests for transactions and enters the selected request into a request queue associated with a link that received the selected request. The second arbiter chooses a request from an oldest entry of each request queue based on the credit counters, sends a transaction based on the chosen request over the first network, and removes the chosen request from its respective request queue.

Type: Application

Filed: February 9, 2023

Publication date: August 10, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, John Philipp BAXLEY
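The two arbitration stages in the "Two-Level Arbitration in a Reconfigurable Processor" abstract can be modeled in software as a rough sketch. The round-robin selection policy, the single pending slot per link, and all class and method names are assumptions; the abstract only states that the second arbiter considers the oldest entry of each queue and the credit counters.

```python
from collections import deque


class TileAgent:
    def __init__(self, num_links, credits):
        self.pending = [None] * num_links            # one received request per link
        self.queues = [deque() for _ in range(num_links)]
        self.credits = dict(credits)                 # credit counter per destination agent
        self._first_next = 0                         # round-robin pointers (assumed policy)
        self._second_next = 0

    def receive(self, link, dest_agent, payload):
        self.pending[link] = (dest_agent, payload)

    def first_arbiter(self):
        """Select one received request and enqueue it on the queue of its own link."""
        n = len(self.pending)
        for offset in range(n):
            link = (self._first_next + offset) % n
            if self.pending[link] is not None:
                self.queues[link].append(self.pending[link])
                self.pending[link] = None
                self._first_next = (link + 1) % n
                return link
        return None

    def second_arbiter(self):
        """Choose among queue heads whose destination has credit, then dequeue and send."""
        n = len(self.queues)
        for offset in range(n):
            link = (self._second_next + offset) % n
            queue = self.queues[link]
            if queue and self.credits[queue[0][0]] > 0:
                dest, payload = queue.popleft()
                self.credits[dest] -= 1              # sending consumes one credit
                self._second_next = (link + 1) % n
                return link, dest, payload
        return None


agent = TileAgent(num_links=2, credits={"mem0": 1})
agent.receive(0, "mem0", "a")
agent.receive(1, "mem0", "b")
enqueued = [agent.first_arbiter(), agent.first_arbiter()]
sent = agent.second_arbiter()
blocked = agent.second_arbiter()
```

In this run the second request stays queued because the only credit for `mem0` was spent on the first one.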
Direct Access to External Storage from a Reconfigurable Processor

Publication number: 20230251989

Abstract: A data processing system is presented that includes multiple local buses, a host processor, a network interface controller (NIC) for connecting to external storage via a network, one or more reconfigurable processors, and a bus switch. The bus switch couples the multiple local busses, thereby operatively coupling the one or more reconfigurable processors, the host processor, and the NIC. The one or more reconfigurable processors are configured to implement a virtual function that uses a virtual address for a memory access operation. The host processor is configured to implement an application programming interface (API) that translates the virtual address into a physical address, and the NIC uses the physical address to initiate a direct data access operation at the external storage that moves data directly between the one or more reconfigurable processors and the external storage, wherein the data bypasses the host processor.

Type: Application

Filed: February 9, 2023

Publication date: August 10, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Subhra MAZUMDAR, Guoyao FENG, Neal SANGHVI
Fast Argument Load in a Reconfigurable Data Processor

Publication number: 20230251994

Abstract: A reconfigurable processor includes an array of configurable units connected by a bus system. Each configurable unit has a configuration data store, organized as a shift register, to store configuration data. The configuration data store also includes individually addressable argument registers respectively made up of word-sized portions of the shift register to provide arguments to the configurable unit. The configurable unit also includes program load logic to shift data into the configuration data store, and argument load logic to directly load data into the argument registers without shifting the received argument data through the shift register. A program load controller is associated with the array to respond to a program load command by executing a program load process, and a fast argument load (FAL) controller is associated with the array to respond to an FAL command by executing an FAL process.

Type: Application

Filed: February 2, 2023

Publication date: August 10, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, Gregory Frederick GROHOSKI
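The difference between a program load and a fast argument load, as described in the "Fast Argument Load in a Reconfigurable Data Processor" abstract, can be shown with a small software model. The class, its word-level granularity, and the `shifts` counter are hypothetical and exist only to make the contrast visible.

```python
class ConfigStore:
    """Toy model: a shift register of words, some of which double as argument registers."""

    def __init__(self, num_words, arg_slots):
        self.words = [0] * num_words
        self.arg_slots = list(arg_slots)  # word positions addressable as arguments
        self.shifts = 0                   # number of shift steps performed so far

    def program_load(self, data):
        """Shift words in one at a time; every stored word moves on each step."""
        for word in data:
            self.words = [word] + self.words[:-1]
            self.shifts += 1

    def argument_load(self, arg_index, value):
        """Write one argument register directly, without shifting anything."""
        self.words[self.arg_slots[arg_index]] = value


store = ConfigStore(num_words=4, arg_slots=[1, 3])
store.program_load([10, 20, 30, 40])
shifts_after_program_load = store.shifts
store.argument_load(0, 99)
```

Updating an argument through `argument_load` leaves the rest of the configuration untouched and costs no shift steps, which is the motivation the abstract gives for the separate load path.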
Head Of Line Blocking Mitigation In A Reconfigurable Data Processor

Publication number: 20230251839

Abstract: A coarse-grained reconfigurable (CGR) processor comprises a first network and a second network; a plurality of agents coupled to the first network; an array of CGR units coupled together by the second network; and a tile agent coupled between the first network and the second network. The tile agent comprises a plurality of links, a plurality of credit counters associated with respective agents of the plurality of agents, a plurality of credit-hog counters associated with respective links of the plurality of links, and an arbiter to manage access to the first network from the plurality of links based on their associated credit-hog counters. Furthermore, a credit-hog counter of the plurality of credit-hog counters changes in response to processing a request for a transaction from its associated link.

Type: Application

Filed: February 9, 2023

Publication date: August 10, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, John Philipp BAXLEY
Handling Interrupts from a Virtual Function in a System with a Multi-Die Reconfigurable Processor

Publication number: 20230244515

Abstract: A system is presented that includes a communication link, a runtime processor, and a reconfigurable processor. The reconfigurable processor is adapted for generating an interrupt to the runtime processor in response to a predetermined event and includes first and second dies arranged in a package, having respective first and second arrays of coarse-grained reconfigurable (CGR) units, and respective first and second communication link interfaces coupled to the communication link. The runtime processor is adapted for configuring the first and second communication link interfaces to provide access to the first and second arrays of coarse-grained reconfigurable units from first and second physical function drivers and from at least one virtual function driver, and the reconfigurable processor is adapted for sending the interrupt to the first or to the second physical function driver and for sending the interrupt to a virtual function driver of the at least one virtual function driver.

Type: Application

Filed: March 7, 2023

Publication date: August 3, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, Paul JORDAN, Maran WILSON, Ravinder KUMAR
Matrix Multiplication on Coarse-grained Computing Grids

Publication number: 20230244748

Abstract: A method for multiplying matrices in a coarse-grained computing grid includes assigning each compute unit c of C compute units to a unique submatrix Rc of a result matrix R, wherein the C compute units are arranged in a 2D computing grid, configuring one or more source memory units to provide relevant matrix A data and matrix B data to the C compute units via a plurality of packets, configuring each compute unit c to produce the unique submatrix Rc and send the unique submatrix Rc to one or more desired memory units. The method also includes initiating data flow in the computing grid to produce the result matrix R within the desired memory units. To reduce packet traffic, Matrix B data corresponding to a column of compute units may be narrow-casted to each column of compute units. A corresponding system and computer-readable medium are also disclosed herein.

Type: Application

Filed: May 25, 2022

Publication date: August 3, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Pramod Natarja, Sitanshu Gupta, Ram Sivaramakrishnan, Ajit Punj
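The tiling in the "Matrix Multiplication on Coarse-grained Computing Grids" abstract, where each compute unit owns one submatrix of the result and a block of matrix B is shared by a whole column of compute units, can be sketched sequentially in Python. The function name, the even-divisibility requirement, and the loop order are assumptions of this sketch; packet routing and memory units are not modeled.

```python
def grid_matmul(a, b, grid_rows, grid_cols):
    """Compute a x b by giving each (grid row, grid column) unit one tile of the result."""
    m, k, n = len(a), len(b), len(b[0])
    if m % grid_rows or n % grid_cols:
        raise ValueError("result dimensions must divide evenly across the grid")
    tile_m, tile_n = m // grid_rows, n // grid_cols
    result = [[0] * n for _ in range(m)]
    for gc in range(grid_cols):
        col0 = gc * tile_n
        # This block of B columns is extracted once and reused by every unit
        # in grid column gc, mirroring the shared delivery to a column.
        b_block = [row[col0:col0 + tile_n] for row in b]
        for gr in range(grid_rows):
            row0 = gr * tile_m
            a_block = a[row0:row0 + tile_m]
            # Unit (gr, gc) produces its own tile of the result.
            for i, a_row in enumerate(a_block):
                for j in range(tile_n):
                    result[row0 + i][col0 + j] = sum(
                        a_row[x] * b_block[x][j] for x in range(k))
    return result


tiled = grid_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], grid_rows=2, grid_cols=2)
```

Because every tile of the result depends only on one row block of A and one column block of B, the units never need to exchange partial results.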
Handling Interrupts from a Virtual Function in a System with a Reconfigurable Processor

Publication number: 20230244462

Abstract: A system is presented that includes a communication link, a runtime processor coupled to the communication link, and a reconfigurable processor. The reconfigurable processor is adapted for generating an interrupt to the runtime processor in response to a predetermined event and includes multiple arrays of coarse-grained reconfigurable (CGR) units and an interface to the communication link that couples the reconfigurable processor to the runtime processor via the communication link. The runtime processor is adapted for configuring the interface to the communication link to provide access to the multiple arrays of coarse-grained reconfigurable units from a physical function driver and from at least one virtual function driver, and the reconfigurable processor is adapted for sending the interrupt to the physical function driver and to a virtual function driver of the at least one virtual function driver within the runtime processor.

Type: Application

Filed: March 7, 2023

Publication date: August 3, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, Paul JORDAN, Maran WILSON, Ravinder KUMAR
Configurable Access to a Reconfigurable Processor by a Virtual Function

Publication number: 20230244461

Abstract: A data processing system is presented that includes a communication link, a runtime processor coupled to the communication link, and one or more reconfigurable processors. A reconfigurable processor of the one or more reconfigurable processors is adapted for generating an interrupt to the runtime processor in response to a predetermined event and includes arrays of coarse-grained reconfigurable (CGR) units and an interface to the communication link that couples the reconfigurable processor to the runtime processor via the communication link. The runtime processor is adapted for configuring the interface to the communication link to provide access to the arrays of CGR units through the communication link from a physical function driver and from a virtual function driver.

Type: Application

Filed: February 1, 2023

Publication date: August 3, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Manish K. SHAH, Paul JORDAN, Maran WILSON, Ravinder KUMAR
Compiler flow logic for reconfigurable architectures

Patent number: 11714780

Abstract: The technology disclosed partitions a dataflow graph of a high-level program into memory allocations and execution fragments. The memory allocations represent creation of logical memory spaces in on-processor and/or off-processor memories for data required to implement the dataflow graph. The execution fragments represent operations on the data. The technology disclosed designates the memory allocations to virtual memory units and the execution fragments to virtual compute units. The technology disclosed partitions the execution fragments into memory fragments and compute fragments, and assigns the memory fragments to the virtual memory units and the compute fragments to the virtual compute units. The technology disclosed then allocates the virtual memory units to physical memory units and the virtual compute units to physical compute units.

Type: Grant

Filed: May 20, 2021

Date of Patent: August 1, 2023

Assignee: SambaNova Systems, Inc.

Inventors: David Alan Koeplinger, Raghu Prabhakar, Sumti Jairath
System of Heterogeneous Reconfigurable Processors for the Data-Parallel Execution of Applications

Publication number: 20230237013

Abstract: A system for a data-parallel execution of at least two implementations of an application on reconfigurable processors with different layouts is presented. The system comprises a pool of reconfigurable data flow resources with data transfer resources that interconnect first and second reconfigurable processors having first and second layouts that impose respective first and second constraints for the data-parallel execution of the application. The system further comprises an archive of configuration files and a host system that is operatively coupled to the first and second reconfigurable processors. The host system comprises first and second compilers that generate for the application, based on the respective first and second constraints, first and second configuration files that are stored in the archive of configuration files and adapted to be executed data-parallel compatible on respective first and second reconfigurable processors.

Type: Application

Filed: September 9, 2022

Publication date: July 27, 2023

Applicant: SambaNova Systems, Inc.

Inventors: Greg Dykema, Maran Wilson, Guoyao Feng, Kuan Zhou, Tianyu Sun, Taylor Lee, Kin Hing LEUNG, Arnav Goel, Conrad Turlik, Milad Sharif
