Patents by Inventor Srivathsa Dhruvanarayan

Srivathsa Dhruvanarayan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11886981
    Abstract: A compiler generates a computer program implementing a machine learning network on a machine learning accelerator (MLA) including interconnected processing elements. The computer program includes data transfer instructions for non-colliding data transfers between the processing elements. To generate the data transfer instructions, the compiler determines non-conflicting data transfer paths for data transfers based on a topology of the interconnections between processing elements, on dependencies of the instructions and on a duration for execution of the instructions. Each data transfer path specifies a routing and a time slot for the data transfer. The compiler generates data transfer instructions that specify routing of the data transfers and generates a static schedule that schedules execution of the data transfer instructions during the time slots for the data transfers. (A minimal code sketch of this route-and-time-slot scheduling appears after this listing.)
    Type: Grant
    Filed: May 1, 2020
    Date of Patent: January 30, 2024
    Assignee: SiMa Technologies, Inc.
    Inventors: Nishit Shah, Srivathsa Dhruvanarayan, Reed Kotler
  • Patent number: 11631001
    Abstract: A system-on-chip (SoC) integrated circuit product includes a machine learning accelerator (MLA), other processor cores such as general-purpose processors and application-specific processors, and a network-on-chip for communication between the different modules. The SoC implements a heterogeneous compute environment because the processor cores are customized for different purposes and typically use different instruction sets. Applications may use some or all of the functionalities offered by the processor cores, and the processor cores may be programmed into different pipelines to perform different tasks. (A minimal code sketch of such a heterogeneous pipeline appears after this listing.)
    Type: Grant
    Filed: April 10, 2020
    Date of Patent: April 18, 2023
    Assignee: SiMa Technologies, Inc.
    Inventors: Srivathsa Dhruvanarayan, Nishit Shah, Bradley Taylor, Moenes Zaher Iskarous
  • Publication number: 20230023303
    Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, the statically scheduled instructions have no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution. (A minimal code sketch of static Tile scheduling appears after this listing.)
    Type: Application
    Filed: October 3, 2022
    Publication date: January 26, 2023
    Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S.J Attia, Spenser Don Gilliland, Bradley Taylor
  • Patent number: 11488066
    Abstract: Convolutions of an input sample with multiple kernels are decomposed into matrix multiplications of a V×C matrix of input values times a C×K matrix of kernel values, producing a V×K product. For the second matrix, C is a channel dimension (i.e., each row of the second matrix is a different channel of the input sample and kernel) and K is the kernel dimension (i.e., each column of the second matrix is a different kernel), but all the values correspond to the same pixel position in the kernel. In the matrix product, V is the output dimension and K is the kernel dimension. Thus, each value in the output matrix is a partial product for a certain output pixel and kernel, and the matrix multiplication parallelizes the convolutions by calculating partial products for multiple output pixels and multiple kernels. (A minimal code sketch of this decomposition appears after this listing.)
    Type: Grant
    Filed: April 21, 2020
    Date of Patent: November 1, 2022
    Assignee: SiMa Technologies, Inc.
    Inventors: Nishit Shah, Srivathsa Dhruvanarayan
  • Publication number: 20220345423
    Abstract: Some embodiments provide a method for a parser of a processing pipeline. The method receives a packet for processing by a set of match-action stages of the processing pipeline. The method stores packet header field (PHF) values from a first set of PHFs of the packet in a set of data containers. The first set of PHFs is for use by the match-action stages. For a second set of PHFs not used by the match-action stages, the method generates descriptive data that identifies locations of the PHFs of the second set within the packet. The method sends (i) the set of data containers to the match-action stages and (ii) the packet data and the generated descriptive data outside of the match-action stages to a deparser that uses the packet data, generated descriptive data, and the set of data containers as modified by the match-action stages to reconstruct a modified packet. (A minimal code sketch of this parser/deparser split appears after this listing.)
    Type: Application
    Filed: July 8, 2022
    Publication date: October 27, 2022
    Applicant: Barefoot Networks, Inc.
    Inventors: Gregory C. Watson, Srivathsa Dhruvanarayan, Glen Raymond Gibb, Constantine Calamvokis, Aled Justin Edwards
  • Patent number: 11474557
    Abstract: In one embodiment, the present disclosure includes multichip timing synchronization circuits and methods. In one embodiment, hardware counters in different systems are synchronized. Programs on the systems may include synchronization instructions. A second system executes a synchronization instruction and, in response thereto, synchronizes a local software counter to a local hardware counter. The software counter on the second system may be delayed a fixed period of time corresponding to a program delay on the first system. The software counter on the second system may further be delayed by an offset to bring the software counters on the two systems into sync. (A minimal code sketch of this counter synchronization appears after this listing.)
    Type: Grant
    Filed: September 15, 2020
    Date of Patent: October 18, 2022
    Assignee: Groq, Inc.
    Inventors: Gregory Michael Thorson, Srivathsa Dhruvanarayan
  • Patent number: 11425058
    Abstract: Some embodiments provide a method for a parser of a processing pipeline. The method receives a packet for processing by a set of match-action stages of the processing pipeline. The method stores packet header field (PHF) values from a first set of PHFs of the packet in a set of data containers. The first set of PHFs is for use by the match-action stages. For a second set of PHFs not used by the match-action stages, the method generates descriptive data that identifies locations of the PHFs of the second set within the packet. The method sends (i) the set of data containers to the match-action stages and (ii) the packet data and the generated descriptive data outside of the match-action stages to a deparser that uses the packet data, generated descriptive data, and the set of data containers as modified by the match-action stages to reconstruct a modified packet.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: August 23, 2022
    Assignee: Barefoot Networks, Inc.
    Inventors: Gregory C. Watson, Srivathsa Dhruvanarayan, Glen Raymond Gibb, Constantine Calamvokis, Aled Justin Edwards
  • Patent number: 11403519
    Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, the statically scheduled instructions have no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
    Type: Grant
    Filed: April 6, 2020
    Date of Patent: August 2, 2022
    Assignee: SiMa Technologies, Inc.
    Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S. J Attia, Spenser Don Gilliland
  • Patent number: 11354570
    Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, the statically scheduled instructions have no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
    Type: Grant
    Filed: April 6, 2020
    Date of Patent: June 7, 2022
    Assignee: SiMa Technologies, Inc.
    Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S. J Attia, Spenser Don Gilliland
  • Patent number: 11321607
    Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, the statically scheduled instructions have no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
    Type: Grant
    Filed: April 3, 2020
    Date of Patent: May 3, 2022
    Assignee: SiMa Technologies, Inc.
    Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S. J Attia, Spenser Don Gilliland
  • Publication number: 20210342673
    Abstract: A compiler generates a computer program implementing a machine learning network on a machine learning accelerator (MLA) including interconnected processing elements. The computer program includes data transfer instructions for non-colliding data transfers between the processing elements. To generate the data transfer instructions, the compiler determines non-conflicting data transfer paths for data transfers based on a topology of the interconnections between processing elements, on dependencies of the instructions and on a duration for execution of the instructions. Each data transfer path specifies a routing and a time slot for the data transfer. The compiler generates data transfer instructions that specify routing of the data transfers and generates a static schedule that schedules execution of the data transfer instructions during the time slots for the data transfers.
    Type: Application
    Filed: May 1, 2020
    Publication date: November 4, 2021
    Inventors: Nishit Shah, Srivathsa Dhruvanarayan, Reed Kotler
  • Publication number: 20210326750
    Abstract: Convolutions of an input sample with multiple kernels are decomposed into matrix multiplications of a V×C matrix of input values times a C×K matrix of kernel values, producing a V×K product. For the second matrix, C is a channel dimension (i.e., each row of the second matrix is a different channel of the input sample and kernel) and K is the kernel dimension (i.e., each column of the second matrix is a different kernel), but all the values correspond to the same pixel position in the kernel. In the matrix product, V is the output dimension and K is the kernel dimension. Thus, each value in the output matrix is a partial product for a certain output pixel and kernel, and the matrix multiplication parallelizes the convolutions by calculating partial products for multiple output pixels and multiple kernels.
    Type: Application
    Filed: April 21, 2020
    Publication date: October 21, 2021
    Inventors: Nishit Shah, Srivathsa Dhruvanarayan
  • Publication number: 20210326189
    Abstract: A method, system, and apparatus are disclosed herein for bridging a deterministic phase of instructions with a non-deterministic phase of instructions when those instructions are executed by a machine learning accelerator implementing a machine learning network. In the non-deterministic phase, data and instructions are transferred from off-chip memory to on-chip memory. When the transfer is complete, processing elements are synchronized and, upon synchronization, a deterministic phase of instructions is executed by the processing elements. (A minimal code sketch of this phase bridging appears after this listing.)
    Type: Application
    Filed: April 17, 2020
    Publication date: October 21, 2021
    Inventors: Nishit Shah, Srivathsa Dhruvanarayan, Reed Kotler
  • Publication number: 20210319307
    Abstract: A system-on-chip (SoC) integrated circuit product includes a machine learning accelerator (MLA), other processor cores such as general-purpose processors and application-specific processors, and a network-on-chip for communication between the different modules. The SoC implements a heterogeneous compute environment because the processor cores are customized for different purposes and typically use different instruction sets. Applications may use some or all of the functionalities offered by the processor cores, and the processor cores may be programmed into different pipelines to perform different tasks.
    Type: Application
    Filed: April 10, 2020
    Publication date: October 14, 2021
    Inventors: Srivathsa Dhruvanarayan, Nishit Shah, Bradley Taylor, Moenes Zaher Iskarous
  • Publication number: 20210312322
    Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, the statically scheduled instructions have no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
    Type: Application
    Filed: April 6, 2020
    Publication date: October 7, 2021
    Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S.J Attia, Spenser Don Gilliland
  • Publication number: 20210312267
    Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, the statically scheduled instructions have no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
    Type: Application
    Filed: April 6, 2020
    Publication date: October 7, 2021
    Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S.J Attia, Spenser Don Gilliland
  • Publication number: 20210312320
    Abstract: A compiler receives a description of a machine learning network and generates a computer program that implements the machine learning network. The computer program includes statically scheduled instructions that are executed by a mesh of processing elements (Tiles). The instructions executed by the Tiles are statically scheduled because the compiler can determine which instructions are executed by which Tiles at what times. For example, the statically scheduled instructions have no conditions, branching, or data dependencies that can be resolved only at run time and that would affect the timing and order of instruction execution.
    Type: Application
    Filed: April 3, 2020
    Publication date: October 7, 2021
    Inventors: Nishit Shah, Reed Kotler, Srivathsa Dhruvanarayan, Moenes Zaher Iskarous, Kavitha Prasad, Yogesh Laxmikant Chobe, Sedny S.J Attia, Spenser Don Gilliland
  • Patent number: 11115147
    Abstract: Embodiments of the present disclosure pertain to improved circuit and system architectures for identifying and managing operating statuses and faults in a system having multiple processing circuit chips. Each of the processing circuit chips includes multiple signal rings: one provides internal communications among circuitry within the chip, and another, together with inter-chip communications circuitry, provides communications with neighboring chips. One of the chips further includes external communications circuitry to provide communications with an external host. (A minimal code sketch of this two-level status reporting appears after this listing.)
    Type: Grant
    Filed: January 9, 2019
    Date of Patent: September 7, 2021
    Assignee: Groq, Inc.
    Inventors: Matthew Pond Baker, Srivathsa Dhruvanarayan, Boone Jared Severson
  • Publication number: 20210105220
    Abstract: Some embodiments provide a method for a hardware forwarding element that includes multiple queues. The method receives a packet at a multi-stage processing pipeline of the hardware forwarding element. The method determines, at one of the stages of the processing pipeline, to modify a setting of a particular one of the queues. The method stores an identifier for the particular queue and instructions to modify the queue setting with data passed through the processing pipeline for the packet. The stored information is subsequently used by the hardware forwarding element to modify the queue setting. (A minimal code sketch of this deferred queue update appears after this listing.)
    Type: Application
    Filed: October 16, 2020
    Publication date: April 8, 2021
    Inventors: Jeongkeun Lee, Yi Li, Michael Feng, Srivathsa Dhruvanarayan, Anurag Agrawal
  • Patent number: 10949199
    Abstract: Some embodiments provide a method for a network forwarding integrated circuit (IC). The method receives packet data with an instruction to copy a portion of the packet data to a temporary storage of the network forwarding IC. The portion is larger than a maximum entry size of the temporary storage. The method generates a header for each of multiple packet data sections for storage in entries of the temporary storage, with each packet data section including a sub-portion of the packet data portion. The method sends the packet data sections with the generated headers to the temporary storage for storage in multiple separate temporary storage entries. (A minimal code sketch of this sectioning scheme appears after this listing.)
    Type: Grant
    Filed: December 8, 2017
    Date of Patent: March 16, 2021
    Assignee: Barefoot Networks, Inc.
    Inventors: Xiaozhou Li, Jeongkeun Lee, Srivathsa Dhruvanarayan, Anurag Agrawal, Changhoon Kim, Alain Loge
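
The sketches below loosely illustrate some of the techniques summarized in the abstracts above. Each is a minimal, self-contained Python illustration written under stated assumptions; none is code from the patents themselves.

The first sketch relates to patent 11886981 and publication 20210342673: a compiler-style scheduler assigns each inter-processing-element data transfer a route and a time slot so that no two transfers collide on a mesh link. The 2D mesh model, XY (dimension-order) routing, single-slot transfers, and greedy earliest-slot search are assumptions made for illustration, not details taken from the patent.

```python
"""Illustrative sketch only: a greedy scheduler that gives each data transfer a
route and a time slot such that no two transfers use the same mesh link in the
same slot.  The mesh model and single-slot transfers are simplifications."""

from collections import defaultdict

def xy_route(src, dst):
    """Dimension-order route on a 2D mesh, returned as a list of directed links."""
    (x, y), (dx, dy) = src, dst
    hops, cur = [], (x, y)
    while cur[0] != dx:                      # move along x first
        nxt = (cur[0] + (1 if dx > cur[0] else -1), cur[1])
        hops.append((cur, nxt))
        cur = nxt
    while cur[1] != dy:                      # then along y
        nxt = (cur[0], cur[1] + (1 if dy > cur[1] else -1))
        hops.append((cur, nxt))
        cur = nxt
    return hops

def schedule_transfers(transfers):
    """Assign each transfer a route and the earliest time slot in which every
    link on that route is still free.  Returns {transfer index: (route, slot)}."""
    busy = defaultdict(set)                  # link -> slots already occupied
    schedule = {}
    for i, t in enumerate(transfers):
        route = xy_route(t["src"], t["dst"])
        slot = t["ready"]
        while any(slot in busy[link] for link in route):
            slot += 1                        # try the next slot until no link collides
        for link in route:
            busy[link].add(slot)
        schedule[i] = (route, slot)
    return schedule

if __name__ == "__main__":
    transfers = [
        {"src": (0, 0), "dst": (2, 0), "ready": 0},
        {"src": (1, 0), "dst": (2, 0), "ready": 0},   # shares a link with the first
        {"src": (0, 1), "dst": (0, 2), "ready": 0},
    ]
    for i, (route, slot) in schedule_transfers(transfers).items():
        print(f"transfer {i}: slot {slot}, route {route}")
```

A real compiler would also fold in the instruction dependencies and execution durations that the abstract mentions; the sketch keeps only the "earliest non-conflicting slot" idea.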
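
The next sketch relates to patent 11631001 and publication 20210319307: an application pipeline whose stages are mapped onto different cores of a heterogeneous SoC. The core names (DSP, MLA, CPU), the stage functions, and the idea of modeling a stage as a tagged callable are hypothetical.

```python
"""Illustrative sketch only: an application pipeline whose stages are assigned
to different processor cores of a heterogeneous SoC.  The stages and core
names below are invented for illustration."""

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Stage:
    name: str
    core: str                          # which processor core runs this stage
    run: Callable[[Any], Any]

def run_pipeline(stages, data):
    """Run the stages in order, reporting which core each one is mapped to."""
    for stage in stages:
        print(f"{stage.name:>11} on {stage.core}")
        data = stage.run(data)
    return data

if __name__ == "__main__":
    pipeline = [
        Stage("preprocess", "DSP", lambda frame: [v / 255.0 for v in frame]),
        Stage("inference", "MLA", lambda frame: sum(frame)),     # stand-in for the accelerator
        Stage("postprocess", "CPU", lambda score: "class A" if score > 1.0 else "class B"),
    ]
    print(run_pipeline(pipeline, [10, 200, 150]))
```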
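
The next sketch relates to the statically scheduled Tile abstracts above (patents 11403519, 11354570, and 11321607, and publications 20230023303, 20210312322, 20210312267, and 20210312320): because the compiler fixes which instruction each Tile issues on each cycle, run time reduces to stepping a cycle counter. The two-Tile schedule and the tiny instruction set are invented for illustration.

```python
"""Illustrative sketch only: executing a statically scheduled instruction
stream on a mesh of Tiles.  The schedule and mini-ISA are hypothetical."""

# Static schedule "produced by the compiler": tile id -> {cycle: instruction}.
# There is no run-time branching or dependency resolution, so executing the
# program is just a matter of stepping a cycle counter.
SCHEDULE = {
    0: {0: ("LOAD", "a"), 1: ("SEND", 1), 3: ("NOP",)},
    1: {2: ("RECV", 0), 3: ("ADD", "a", 1)},
}

def run(schedule, n_cycles):
    for cycle in range(n_cycles):
        for tile in sorted(schedule):
            instr = schedule[tile].get(cycle, ("NOP",))
            print(f"cycle {cycle}  tile {tile}: {' '.join(str(x) for x in instr)}")

if __name__ == "__main__":
    run(SCHEDULE, n_cycles=4)
```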
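
The next sketch relates to patent 11488066 and publication 20210326750: a multi-kernel convolution computed as a sum of V×C by C×K matrix multiplications, one per kernel pixel position, so that each product holds partial results for many output pixels and many kernels at once. The use of NumPy and the specific tensor shapes are assumptions; the decomposition itself follows the abstract.

```python
"""Illustrative sketch only: convolution with multiple kernels as a sum of
V x C by C x K matrix multiplications, one per kernel pixel position."""

import numpy as np

def conv_by_matmul(x, w):
    """x: input of shape (C, H, W); w: kernels of shape (K, C, R, S).
    Returns a (K, H-R+1, W-S+1) output equal to a 'valid' convolution."""
    C, H, W = x.shape
    K, _, R, S = w.shape
    OH, OW = H - R + 1, W - S + 1
    V = OH * OW                                    # number of output pixels
    out = np.zeros((V, K))
    for r in range(R):
        for s in range(S):
            # V x C matrix: the input values that align with kernel position (r, s)
            patch = x[:, r:r + OH, s:s + OW].reshape(C, V).T
            # C x K matrix: one column per kernel, all taken at pixel position (r, s)
            weights = w[:, :, r, s].T
            out += patch @ weights                 # V x K matrix of partial products
    return out.T.reshape(K, OH, OW)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((3, 6, 6))             # 3 channels, 6x6 input
    w = rng.standard_normal((4, 3, 3, 3))          # 4 kernels, 3 channels, 3x3
    out = conv_by_matmul(x, w)

    # Direct nested-loop convolution as a reference to confirm the decomposition.
    ref = np.zeros_like(out)
    for k in range(4):
        for i in range(4):
            for j in range(4):
                ref[k, i, j] = np.sum(x[:, i:i + 3, j:j + 3] * w[k])
    print("matches direct convolution:", np.allclose(out, ref))
```

The final comparison against a direct nested-loop convolution confirms that the two formulations agree.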
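
The next sketch relates to patent 11425058 and publication 20220345423: a parser copies only the header fields the match-action stages need into containers and records location descriptors for the rest, so a deparser can rebuild the packet from the modified containers, the descriptors, and the original packet bytes. The fixed header layout and the single stand-in "match-action" step are hypothetical; this is not P4 or any real switch pipeline.

```python
"""Illustrative sketch only: a toy parser/deparser pair that carries used
header fields in containers and unused fields as location descriptors."""

# Hypothetical header layout: field name -> (byte offset, byte length or None
# for "rest of packet", used by the match-action stages?).
LAYOUT = {
    "dst_mac": (0, 6, True),
    "src_mac": (6, 6, False),
    "ethertype": (12, 2, True),
    "payload": (14, None, False),
}

def parse(packet: bytes):
    """Copy used fields into containers; describe unused fields by location only."""
    containers, descriptors = {}, {}
    for name, (off, length, used) in LAYOUT.items():
        end = None if length is None else off + length
        if used:
            containers[name] = bytearray(packet[off:end])
        else:
            descriptors[name] = (off, end)
    return containers, descriptors

def match_action(containers):
    """Stand-in for the match-action stages: rewrite the destination MAC."""
    containers["dst_mac"] = bytearray(6)            # 00:00:00:00:00:00
    return containers

def deparse(packet: bytes, containers, descriptors):
    """Rebuild the packet from the modified containers, the location
    descriptors, and the original packet bytes."""
    out = bytearray()
    for name, (_, _, used) in LAYOUT.items():
        if used:
            out += containers[name]
        else:
            start, end = descriptors[name]
            out += packet[start:end]
    return bytes(out)

if __name__ == "__main__":
    pkt = bytes(range(20))
    containers, descriptors = parse(pkt)
    print(deparse(pkt, match_action(containers), descriptors).hex())
```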
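
The next sketch relates to patent 11474557: a second chip executes a synchronization step that loads a software counter from its local hardware counter and holds it back by a fixed program delay plus an offset, so that two chips count in step. The counter values, delay, and offset are invented numbers, and the class below is a toy model rather than the patented circuit.

```python
"""Illustrative sketch only: a toy model of multichip timing synchronization
using a hardware counter and a software counter on each chip."""

class Chip:
    """A free-running hardware counter plus a software counter that is loaded
    by a SYNC-style instruction."""

    def __init__(self, hw_start):
        self.hw = hw_start
        self.sw = None

    def tick(self, n=1):
        self.hw += n
        if self.sw is not None:
            self.sw += n

    def sync(self, program_delay=0, offset=0):
        # Load the software counter from the local hardware counter, held back
        # by a fixed program delay plus an inter-chip offset.
        self.sw = self.hw - program_delay - offset

if __name__ == "__main__":
    a = Chip(hw_start=100)                 # hardware counters started at different times
    b = Chip(hw_start=137)
    a.sync()
    b.sync(program_delay=5, offset=32)     # chosen so chip B lines up with chip A
    for _ in range(3):
        a.tick()
        b.tick()
    print(a.sw, b.sw)                      # 103 103: the software counters agree
```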
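
The next sketch relates to publication 20210326189: a non-deterministic phase (data transfers whose duration is not known in advance) is bridged to a deterministic phase by synchronizing the processing elements before the statically scheduled work begins. Modeling processing elements as Python threads and using threading.Barrier are illustrative choices only.

```python
"""Illustrative sketch only: bridging a non-deterministic transfer phase and a
deterministic compute phase with a barrier across processing elements."""

import random
import threading
import time

N_PE = 4
barrier = threading.Barrier(N_PE)
on_chip = [None] * N_PE

def processing_element(pe):
    # Non-deterministic phase: an off-chip-to-on-chip transfer whose duration
    # is not known in advance (modeled here by a random sleep).
    time.sleep(random.uniform(0.01, 0.05))
    on_chip[pe] = list(range(pe, pe + 4))
    barrier.wait()                         # all PEs synchronize here
    # Deterministic phase: every PE starts its fixed schedule from a known point.
    print(f"PE {pe}: sum = {sum(on_chip[pe])}")

if __name__ == "__main__":
    threads = [threading.Thread(target=processing_element, args=(pe,)) for pe in range(N_PE)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```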
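
The next sketch relates to patent 11115147: fault status is gathered within each chip and then carried to the one chip that communicates with an external host. The two functions below are a deliberately crude software analogy for the intra-chip and inter-chip signal rings the patent describes in hardware; the block names and fault flags are hypothetical.

```python
"""Illustrative sketch only: two-level status/fault aggregation, standing in
for intra-chip and inter-chip signal rings."""

def intra_chip_ring(block_status):
    """Collect the names of faulted blocks inside one chip."""
    return [name for name, ok in block_status.items() if not ok]

def inter_chip_ring(chips):
    """Carry each chip's fault summary around to the host-facing chip."""
    return {chip_id: intra_chip_ring(blocks) for chip_id, blocks in chips.items()}

if __name__ == "__main__":
    chips = {
        0: {"mem": True, "alu": True},     # chip 0 also holds the external host link
        1: {"mem": True, "alu": False},    # a fault in chip 1's ALU block
        2: {"mem": False, "alu": True},    # a fault in chip 2's memory block
    }
    print("host report:", inter_chip_ring(chips))
```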
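
The next sketch relates to publication 20210105220: a pipeline stage decides to change a queue setting but only records the queue identifier and the new setting in metadata carried with the packet, and the forwarding element applies the change afterwards. The stage logic, the DSCP threshold, and the "rate_limit" setting are hypothetical.

```python
"""Illustrative sketch only: a pipeline stage records a queue modification
with the packet; the change is applied outside the pipeline."""

# Hypothetical queue table of the forwarding element.
queues = {0: {"rate_limit": 100}, 1: {"rate_limit": 100}}

def stage_classify(pkt, meta):
    """Pick an egress queue based on the packet's DSCP value."""
    meta["queue"] = 1 if pkt["dscp"] >= 46 else 0
    return pkt, meta

def stage_police(pkt, meta):
    """Decide to change a queue setting, but only record the queue id and the
    new setting with the packet; nothing is applied inside the pipeline."""
    if pkt["dscp"] >= 46:
        meta["queue_update"] = (meta["queue"], {"rate_limit": 500})
    return pkt, meta

def apply_queue_update(meta):
    """Later, the forwarding element uses the stored data to modify the queue."""
    if "queue_update" in meta:
        qid, settings = meta["queue_update"]
        queues[qid].update(settings)

if __name__ == "__main__":
    pkt, meta = {"dscp": 46}, {}
    for stage in (stage_classify, stage_police):
        pkt, meta = stage(pkt, meta)
    apply_queue_update(meta)
    print(queues)                  # queue 1's rate_limit has been raised to 500
```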
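
The last sketch relates to patent 10949199: a packet-data portion larger than a temporary-storage entry is split into sections, each prefixed with a small generated header, so the pieces can occupy separate entries and be reassembled later. The 16-byte entry size and the two-byte (sequence, count) header are assumptions.

```python
"""Illustrative sketch only: splitting an oversized packet-data portion into
headered sections that fit in fixed-size temporary-storage entries."""

MAX_ENTRY = 16      # hypothetical maximum entry size of the temporary storage
HEADER = 2          # bytes of generated header per entry: (sequence, total count)

def to_entries(portion: bytes):
    """Split an oversized portion into sections, each with a generated header."""
    chunk = MAX_ENTRY - HEADER
    sections = [portion[i:i + chunk] for i in range(0, len(portion), chunk)]
    return [bytes([seq, len(sections)]) + s for seq, s in enumerate(sections)]

def from_entries(entries):
    """Reassemble the original portion from the stored entries."""
    ordered = sorted(entries, key=lambda e: e[0])    # header byte 0 is the sequence number
    return b"".join(e[HEADER:] for e in ordered)

if __name__ == "__main__":
    portion = bytes(range(40))                       # larger than one 16-byte entry
    entries = to_entries(portion)
    print(len(entries), "entries; round-trips:", from_entries(entries) == portion)
```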