Patents by Inventor Kin Hing Leung

Kin Hing Leung has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11625283
    Abstract: The technology disclosed relates to inter-processor execution of configuration files on reconfigurable processors using smart network interface controller (SmartNIC) buffers. In particular, the technology disclosed relates to a runtime logic that is configured to execute configuration files that define applications and process application data for applications using a first reconfigurable processor and a second reconfigurable processor. The execution includes streaming configuration data in the configuration files and the application data between the first reconfigurable processor and the second reconfigurable processor using one or more SmartNIC buffers.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: April 11, 2023
    Assignee: SambaNova Systems, Inc.
    Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
  • Patent number: 11609798
    Abstract: The technology disclosed relates to runtime execution of configuration files on reconfigurable processors with varying configuration granularity. In particular, the technology disclosed relates to a runtime logic that is configured to receive a set of configuration files for an application, and load and execute a first subset of configuration files in the set of configuration files and associated application data on a first reconfigurable processor. The first reconfigurable processor has a first level of configurable granularity. The runtime logic is further configured to load and execute a second subset of configuration files in the set of configuration files and associated application data on a second reconfigurable processor. The second reconfigurable processor has a second level of configurable granularity that is different from the first level of configurable granularity.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: March 21, 2023
    Assignee: SambaNova Systems, Inc.
    Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
  • Publication number: 20230014929
    Abstract: A system includes a multidimensional array of homogenous Functional Configurable Units (FCUs), coupled using a multidimensional array of switches, and a parameter store on the device which stores parameters that tag a subarray of FCUs as unusable. Technologies are described which change the pattern of placement of configuration data, in dependence on the tagged subarray, by changing the routing through the array of switches. As a result, a multidimensional array of FCUs having unusable elements can still be used.
    Type: Application
    Filed: May 5, 2022
    Publication date: January 19, 2023
    Applicant: SambaNova Systems, Inc.
    Inventors: Gregory F. GROHOSKI, Manish K. SHAH, Kin Hing LEUNG
  • Publication number: 20230016892
    Abstract: A system includes a multidimensional array of homogenous Functional Configurable Units (FCUs), coupled using a multidimensional array of switches, and a parameter store on the device which stores parameters that tag a subarray of FCUs as unusable. Technologies are described which change the pattern of placement of configuration data, in dependence on the tagged subarray, by changing the routing through the array of switches. As a result, a multidimensional array of FCUs having unusable elements can still be used.
    Type: Application
    Filed: May 6, 2022
    Publication date: January 19, 2023
    Applicant: SambaNova Systems, Inc.
    Inventors: Gregory F. GROHOSKI, Manish K. SHAH, Kin Hing LEUNG
  • Patent number: 11556494
    Abstract: A device architecture includes a spatially reconfigurable array of processors, such as configurable units of a CGRA, having spare homogenous subarrays, and a parameter store on the device which stores parameters that tag one or more elements as unusable. Configuration data is distributed using a statically reconfigurable bus system, to implement the pattern of placement of configuration data, in dependence on the tagged elements. As a result, a spatially reconfigurable array having unusable elements can be repaired.
    Type: Grant
    Filed: July 16, 2021
    Date of Patent: January 17, 2023
    Assignee: SambaNova Systems, Inc.
    Inventors: Gregory F. Grohoski, Manish K. Shah, Kin Hing Leung
  • Publication number: 20220261364
    Abstract: A data processing system comprises memory, compile time logic, runtime logic, and instrumentation profiling logic. The memory stores a dataflow graph for an application. The dataflow graph has a plurality of compute nodes that are configured to be producers to produce data for execution of the application, and to be consumers to consume the data for execution of the application. The compile time logic partitions execution of the dataflow graph into stages. Each of the stages has one or more compute nodes, one or more producers, and one or more consumers. The runtime logic determines a processing latency for each of the stages by calculating time elapsed between producers of a particular stage receiving input data and consumers of the particular stage receiving output data. The instrumentation profiling logic generates performance statistics for the dataflow graph based on the processing latency determined for each of the stages.
    Type: Application
    Filed: September 20, 2021
    Publication date: August 18, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu PRABHAKAR, Matthew Thomas GRIMM, Sumti JAIRATH, Kin Hing LEUNG, Sitanshu GUPTA, Yuan LIN, Luca BOASSO
  • Publication number: 20220261365
    Abstract: A reconfigurable processor comprises an array of processing units and an instrumentation network. The array of processing units is configured to execute runtime events to execute an application. The instrumentation network is operatively coupled to the array of processing units. The instrumentation network comprises a control bus configured to form control signal routes in the instrumentation network. The instrumentation network further comprises a plurality of instrumentation counters having inputs and outputs connected to the control bus and to the processing units. Instrumentation counters in the plurality of instrumentation units are configurable to consume control signals on the inputs and produce counts of the runtime events on the outputs.
    Type: Application
    Filed: September 20, 2021
    Publication date: August 18, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Raghu PRABHAKAR, Matthew Thomas GRIMM, Sumti JAIRATH, Kin Hing LEUNG, Sitanshu GUPTA, Yuan LIN, Luca BOASSO
  • Patent number: 11409540
    Abstract: A device architecture includes a spatially reconfigurable array of processors, such as configurable units of a CGRA, having spare elements, and a parameter store on the device which stores parameters that tag one or more elements as unusable. Technologies are described which change the pattern of placement of configuration data, in dependence on the tagged elements. As a result, a spatially reconfigurable array having unusable elements can be repaired.
    Type: Grant
    Filed: July 16, 2021
    Date of Patent: August 9, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Gregory F. Grohoski, Manish K. Shah, Kin Hing Leung
  • Patent number: 11392740
    Abstract: Roughly described, the invention involves a system including a plurality of functional units that execute different segments of a dataflow, and share intermediate results via a peer-to-peer messaging protocol. The functional units are reconfigurable, with different units being reconfigurable at different levels of granularity. The peer-to-peer messaging protocol includes control tokens or other mechanisms by which the consumer of the intermediate results learns that data has been transferred, and in response thereto triggers its next dataflow segment. A host or configuration controller configures the data units with their respective dataflow segments, but once execution of the configured dataflow begins, no host need be involved in orchestrating data synchronization, the transfer of intermediate results, or the triggering of processing after the data are received. Control overhead is therefore minimized.
    Type: Grant
    Filed: July 19, 2021
    Date of Patent: July 19, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Martin Russell Raumann, Qi Zheng, Bandish B. Shah, Ravinder Kumar, Kin Hing Leung, Sumti Jairath, Gregory Frederick Grohoski
  • Publication number: 20220197709
    Abstract: The technology disclosed relates to runtime execution of configuration files on reconfigurable processors with varying configuration granularity. In particular, the technology disclosed relates to a runtime logic that is configured to receive a set of configuration files for an application, and load and execute a first subset of configuration files in the set of configuration files and associated application data on a first reconfigurable processor. The first reconfigurable processor has a first level of configurable granularity. The runtime logic is further configured to load and execute a second subset of configuration files in the set of configuration files and associated application data on a second reconfigurable processor. The second reconfigurable processor has a second level of configurable granularity that is different from the first level of configurable granularity.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram SIVARAMAKRISHNAN, Sumti JAIRATH, Emre Ali BURHAN, Manish K. SHAH, Raghu PRABHAKAR, Ravinder KUMAR, Arnav GOEL, Ranen CHATTERJEE, Gregory Frederick GROHOSKI, Kin Hing LEUNG, Dawei HUANG, Manoj UNNIKRISHNAN, Martin Russell RAUMANN, Bandish B. SHAH
  • Publication number: 20220197714
    Abstract: A system for training parameters of a neural network includes a processing node with a processor reconfigurable at a first level of configuration granularity and a controller reconfigurable at a finer level of configuration granularity. The processor is configured to execute a first dataflow segment of the neural network with training data to generate a predicted output value using a set of neural network parameters, calculate a first intermediate result for a parameter based on the predicted output value, a target output value, and a parameter gradient, and provide the first intermediate result to the controller. The controller is configured to receive a second intermediate result over a network, and execute a second dataflow segment, dependent upon the first intermediate result and the second intermediate result, to generate a third intermediate result indicative of an update of the parameter.
    Type: Application
    Filed: January 24, 2022
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Martin Russell RAUMANN, Qi ZHENG, Bandish B. SHAH, Ravinder KUMAR, Kin Hing LEUNG, Sumti JAIRATH, Gregory Frederick GROHOSKI
  • Publication number: 20220197713
    Abstract: The technology disclosed relates to inter-node execution of configuration files on reconfigurable processors using network interface controller (NIC) buffers. In particular, the technology disclosed relates to a runtime logic that is configured to execute configuration files that define applications and application data for applications using a first reconfigurable processor connected to a first host, and a second reconfigurable processor connected to a second host. The first reconfigurable processor is configured to push input data for the applications in a first plurality of buffers. The first host is configured to cause a first network interface controller (NIC) to stream the input data to a second plurality of buffers from the first plurality of buffers. The second host is configured to cause a second NIC to stream the input data to the second reconfigurable processor from the second plurality of buffers.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram SIVARAMAKRISHNAN, Sumti JAIRATH, Emre Ali BURHAN, Manish K. SHAH, Raghu PRABHAKAR, Ravinder KUMAR, Arnav GOEL, Ranen CHATTERJEE, Gregory Frederick GROHOSKI, Kin Hing LEUNG, Dawei HUANG, Manoj UNNIKRISHNAN, Martin Russell RAUMANN, Bandish B. SHAH
  • Publication number: 20220197710
    Abstract: The technology disclosed relates to inter-processor execution of configuration files on reconfigurable processors using smart network interface controller (SmartNIC) buffers. In particular, the technology disclosed relates to a runtime logic that is configured to execute configuration files that define applications and process application data for applications using a first reconfigurable processor and a second reconfigurable processor. The execution includes streaming configuration data in the configuration files and the application data between the first reconfigurable processor and the second reconfigurable processor using one or more SmartNIC buffers.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram SIVARAMAKRISHNAN, Sumti JAIRATH, Emre Ali BURHAN, Manish K. SHAH, Raghu PRABHAKAR, Ravinder KUMAR, Arnav GOEL, Ranen CHATTERJEE, Gregory Frederick GROHOSKI, Kin Hing LEUNG, Dawei HUANG, Manoj UNNIKRISHNAN, Martin Russell RAUMANN, Bandish B. SHAH
  • Publication number: 20220197711
    Abstract: The technology disclosed relates to runtime execution of functions across reconfigurable processors. In particular, the technology disclosed relates to a runtime logic that is configured to execute a first set of functions in a plurality of functions and/or data therefor on a first reconfigurable processor, and a second set of functions in the plurality of functions and/or data therefor on additional reconfigurable processors. Functions in the second set of functions and/or the data therefor are transmitted to the additional reconfigurable processors using one or more of a first reconfigurable processor-to-additional reconfigurable processors buffers, and results of executing the functions and/or the data therefor on the additional reconfigurable processors are transmitted to the first reconfigurable processor using one or more of additional reconfigurable processors-to-first reconfigurable processor buffers.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram Sivaramakrishnan, Sumti Jairath, Emre Ali Burhan, Manish K. Shah, Raghu Prabhakar, Ravinder Kumar, Arnav Goel, Ranen Chatterjee, Gregory Frederick Grohoski, Kin Hing Leung, Dawei Huang, Manoj Unnikrishnan, Martin Russell Raumann, Bandish B. Shah
  • Publication number: 20220197712
    Abstract: The technology disclosed relates to inter-node execution of configuration files on reconfigurable processors using smart network interface controller (SmartNIC) buffers. In particular, the technology disclosed relates to a runtime logic that is configured to execute configuration files that define applications and process application data for applications using a first reconfigurable processor on a first node, and a second host processor on a second node. The execution includes streaming configuration data in the configuration files and the application data between the first reconfigurable processor and the second host processor using one or more SmartNIC buffers.
    Type: Application
    Filed: November 9, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Ram SIVARAMAKRISHNAN, Sumti JAIRATH, Emre Ali BURHAN, Manish K. SHAH, Raghu PRABHAKAR, Ravinder KUMAR, Arnav GOEL, Ranen CHATTERJEE, Gregory Frederick GROHOSKI, Kin Hing LEUNG, Dawei HUANG, Manoj UNNIKRISHNAN, Martin Russell RAUMANN, Bandish B. SHAH
  • Publication number: 20220198114
    Abstract: Roughly described, the invention involves a system including a plurality of functional units that execute different segments of a dataflow, and share intermediate results via a peer-to-peer messaging protocol. The functional units are reconfigurable, with different units being reconfigurable at different levels of granularity. The peer-to-peer messaging protocol includes control tokens or other mechanisms by which the consumer of the intermediate results learns that data has been transferred, and in response thereto triggers its next dataflow segment. A host or configuration controller configures the data units with their respective dataflow segments, but once execution of the configured dataflow begins, no host need be involved in orchestrating data synchronization, the transfer of intermediate results, or the triggering of processing after the data are received.
    Type: Application
    Filed: July 19, 2021
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Martin Russell RAUMANN, Qi ZHENG, Bandish B. SHAH, Ravinder KUMAR, Kin Hing LEUNG, Sumti JAIRATH, Gregory Frederick GROHOSKI
  • Publication number: 20220198117
    Abstract: A system for executing a graph partitioned across a plurality of reconfigurable computing units includes a processing node that has a first computing unit reconfigurable at a first level of configuration granularity and a second computing unit reconfigurable at a second, finer, level of configuration granularity. The first computing unit is configured by a host system to execute a first dataflow segment of the graph using one or more dataflow pipelines to generate a first intermediate result and to provide the first intermediate result to the second computing unit without passing through the host system. The second computing unit is configured by the host system to execute a second dataflow segment of the graph, dependent upon the first intermediate result, to generate a second intermediate result and to send the second intermediate result to a third computing unit, without passing through the host system, to continue execution of the graph.
    Type: Application
    Filed: January 27, 2022
    Publication date: June 23, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Martin Russell RAUMANN, Qi ZHENG, Bandish B. SHAH, Ravinder KUMAR, Kin Hing LEUNG, Sumti JAIRATH, Gregory Frederick GROHOSKI
  • Patent number: 11327771
    Abstract: A device architecture includes a spatially reconfigurable array of processors, such as configurable units of a CGRA, having spare elements, and a parameter store on the device which stores parameters that tag one or more elements as unusable. Technologies are described which change the pattern of placement of configuration data, in dependence on the tagged elements. As a result, a spatially reconfigurable array having unusable elements can be repaired.
    Type: Grant
    Filed: July 16, 2021
    Date of Patent: May 10, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Gregory F. Grohoski, Manish K. Shah, Kin Hing Leung
  • Publication number: 20220058034
    Abstract: A data processing system comprises a pool of reconfigurable data flow resources and a runtime processor. The pool of reconfigurable data flow resources includes arrays of physical configurable units and memory. The runtime processor includes logic to receive a plurality of configuration files for user applications. The configuration files include configurations of virtual data flow resources required to execute the user applications. The runtime processor also includes logic to allocate physical configurable units and memory in the pool of reconfigurable data flow resources to the virtual data flow resources and load the configuration files to the allocated physical configurable units. The runtime processor further includes logic to execute the user applications using the allocated physical configurable units and memory.
    Type: Application
    Filed: August 18, 2020
    Publication date: February 24, 2022
    Applicant: SambaNova Systems, Inc.
    Inventors: Gregory Frederick GROHOSKI, Manish K. SHAH, Raghu PRABHAKAR, Mark LUTTRELL, Ravinder KUMAR, Kin Hing LEUNG, Ranen CHATTERJEE, Sumti JAIRATH, David Alan KOEPLINGER, Ram SIVARAMAKRISHNAN, Matthew Thomas GRIMM
  • Patent number: 11237880
    Abstract: Roughly described, a system for data parallel training of a neural network on multiple reconfigurable units configured by a host with dataflow pipelines to perform different steps in the training. CGRA units are configured to evaluate first and second sequential sections of neural network layers based on a respective subset of training data, and to back-propagate the error through the sections to calculate parameter gradients for the respective subset. Gradient synchronization and reduction are performed by one or more units having finer grain reconfigurability, such as an FPGA. The FPGA performs synchronization and reduction of the gradients for the second section while the CGRA units perform back-propagation through the first sequential section. Intermediate results are transmitted using a P2P message passing protocol layer. Execution of dataflow segments in the different units is triggered by receipt of data, rather than by a command from any host system.
    Type: Grant
    Filed: July 19, 2021
    Date of Patent: February 1, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Martin Russell Raumann, Qi Zheng, Bandish B. Shah, Ravinder Kumar, Kin Hing Leung, Sumti Jairath, Gregory Frederick Grohoski
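
The abstract of publication 20220261364 above describes a per-stage processing latency, measured as the time elapsed between a stage's producers receiving input data and its consumers receiving output data, with performance statistics then derived from those latencies. The following Python fragment is a minimal sketch of that calculation and is not taken from the application itself. The names StageTrace, stage_latency, and performance_stats are hypothetical, and two choices are assumptions made only for illustration: a stage is timed from its earliest producer input to its latest consumer output, and the statistics reported are the total and the slowest stage.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StageTrace:
    """Timestamps recorded for one stage of a partitioned dataflow graph (hypothetical type)."""
    name: str
    producer_input_times: List[float]   # when each producer received input data
    consumer_output_times: List[float]  # when each consumer received output data


def stage_latency(trace: StageTrace) -> float:
    # Assumption: the stage starts at the earliest producer input
    # and ends at the latest consumer output.
    return max(trace.consumer_output_times) - min(trace.producer_input_times)


def performance_stats(traces: List[StageTrace]) -> Dict[str, object]:
    # Per-stage latency, plus two illustrative summary statistics.
    latencies = {t.name: stage_latency(t) for t in traces}
    return {
        "latency": latencies,
        "total": sum(latencies.values()),
        "slowest_stage": max(latencies, key=latencies.get),
    }


if __name__ == "__main__":
    traces = [
        StageTrace("s0", [0.0, 1.0], [5.0, 4.0]),
        StageTrace("s1", [5.0], [7.0]),
    ]
    print(performance_stats(traces))
```

In this sketch the timestamps are plain numbers supplied by the caller; the abstract does not say how they are captured.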