Patents Assigned to GROQ, INC.
  • Patent number: 12287695
    Abstract: One or more embodiments of a regulator circuit for providing power to a load device having a first power demand profile over time. The regulator circuit comprises a regulator and an energy storage device coupled to the regulator and the load device. The regulator circuit is configured to scavenge provided energy that is available beyond the first power demand profile. Further, the regulator circuit is configured to store that energy in the energy storage device, and the energy storage device is configured to augment deliverable peak power to the load device when the load device requires more power than is provided by the regulator circuit.
    Type: Grant
    Filed: April 15, 2024
    Date of Patent: April 29, 2025
    Assignee: Groq, Inc.
    Inventors: James David Sproch, Dinesh Maheshwari
  • Patent number: 12277444
    Abstract: A system contains a network of processors arranged in a plurality of nodes. Each node comprises a respective plurality of processors connected via local links, and different nodes are connected via global links. The processors of the network communicate with each other to establish a global counter for the network, enabling deterministic communication between the processors of the network. A compiler is configured to explicitly schedule communication traffic across the global and local links of the network of processors based upon the deterministic links between the processors, which enable software-scheduled networking with explicit send or receive instructions executed by functional units of the processors at specific times, to establish a specific ordering of operations performed by the network of processors. In some embodiments, the processors of the network of processors are tensor streaming processors (TSPs).
    Type: Grant
    Filed: November 23, 2022
    Date of Patent: April 15, 2025
    Assignee: Groq, Inc.
    Inventors: Dennis Charles Abts, Jonathan Ross, Garrin Kimmell, Michael Bye, Matthew Boyd, Andrew Ling
  • Patent number: 12271339
    Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
    Type: Grant
    Filed: October 9, 2023
    Date of Patent: April 8, 2025
    Assignee: Groq, Inc.
    Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
  • Patent number: 12260118
    Abstract: A deterministic apparatus comprising a deterministic near-compute memory communicatively coupled with and proximate to a deterministic processor. The deterministic near-compute memory comprises a plurality of data banks having a global memory address space, a control bus, a data input bus and a data output bus for each data bank. The deterministic processor is configured to initiate, via the control bus, retrieval of a set of data from the plurality of data banks. The retrieved set of data comprises at least one row of a selected one of the data banks passed via the data output bus onto a plurality of stream registers of the deterministic processor.
    Type: Grant
    Filed: December 12, 2022
    Date of Patent: March 25, 2025
    Assignee: Groq, Inc.
    Inventor: Dinesh Maheshwari
  • Patent number: 12248357
    Abstract: Embodiments pertain to reducing power consumption in a computing system comprising one or more deterministic processors. A controller generates a plurality of control signals for a voltage regulator to regulate a supply voltage of a respective one of the one or more deterministic processors. A power management module determines an initial profile for power consumption and performance of an algorithm executed on the respective deterministic processor having an initial value for the supply voltage and an initial value for a clock frequency. The power management module further determines a target profile for power consumption and performance of the algorithm executed on the respective deterministic processor. The controller modifies the plurality of control signals based on the initial profile and the target profile. The respective deterministic processor executes the algorithm while the supply voltage is dynamically modified by the voltage regulator based on the modified plurality of control signals.
    Type: Grant
    Filed: September 20, 2021
    Date of Patent: March 11, 2025
    Assignee: GROQ, Inc.
    Inventors: Omar Ahmad, Geert Rosseel
  • Patent number: 12223436
    Abstract: A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
    Type: Grant
    Filed: January 5, 2024
    Date of Patent: February 11, 2025
    Assignee: GROQ, INC.
    Inventors: Jonathan Alexander Ross, Gregory M. Thorson
  • Patent number: 12222894
    Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
    Type: Grant
    Filed: July 13, 2023
    Date of Patent: February 11, 2025
    Assignee: GROQ, INC.
    Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
  • Patent number: 12210642
    Abstract: Embodiments are directed to a computing system with permission control via data redundancy. The computing system includes a memory and a permission control circuit coupled to the memory. The permission control circuit encodes a first data vector by using a bit position register with a first permission control code for a first user, writes the encoded first data vector into the memory, and updates content of the bit position register from the first permission control code to a second permission control code for a second user. The encoded first data vector written into the memory is inaccessible for the second user based on the updated content of the bit position register.
    Type: Grant
    Filed: September 27, 2022
    Date of Patent: January 28, 2025
    Assignee: GROQ, INC.
    Inventors: Zefu Dai, John Thompson
  • Patent number: 12175287
    Abstract: A processor comprises a computational array of computational elements and an instruction dispatch circuit. The computational elements receive data operands via data lanes extending along a first dimension, and processes the operands based upon instructions received from the instruction dispatch circuit via instruction lanes extending along a second dimension. The instruction dispatch circuit receives raw instructions, and comprises an instruction dispatch unit (IDU) processor that processes a set of raw instructions to generate processed instructions for dispatch to the computational elements, where the number of processed instructions is not equal to the number of instructions of the set of raw instructions.
    Type: Grant
    Filed: December 20, 2023
    Date of Patent: December 24, 2024
    Assignee: GROQ, INC.
    Inventors: Brian Lee Kurtz, Dinesh Maheshwari, James David Sproch
  • Patent number: 12001383
    Abstract: Embodiments are directed to a deterministic streaming system with one or more deterministic streaming processors each having an array of processing elements and a first deterministic memory coupled to the processing elements. The deterministic streaming system further includes a second deterministic memory with multiple data banks having a global memory address space, and a controller. The controller initiates retrieval of first data from the data banks of the second deterministic memory as a first plurality of streams, each stream of the first plurality of streams streaming toward a respective group of processing elements of the array of processing elements. The controller further initiates writing of second data to the data banks of the second deterministic memory as a second plurality of streams, each stream of the second plurality of streams streaming from the respective group of processing elements toward a respective data bank of the second deterministic memory.
    Type: Grant
    Filed: July 6, 2022
    Date of Patent: June 4, 2024
    Assignee: Groq, Inc.
    Inventor: Dennis Charles Abts
  • Patent number: 11960346
    Abstract: One or more embodiments of a regulator circuit for providing power to a load device having a first power demand profile over time. The regulator circuit comprises a regulator and an energy storage device coupled to the regulator and the load device. The regulator circuit is configured to scavenge provided energy that is available beyond the first power demand profile. Further, the regulator circuit is configured to store that energy in the energy storage device, and the energy storage device is configured to augment deliverable peak power to the load device when the load device requires more power than is provided by the regulator circuit.
    Type: Grant
    Filed: October 8, 2020
    Date of Patent: April 16, 2024
    Assignee: GROQ, INC.
    Inventors: James David Sproch, Dinesh Maheshwari
  • Patent number: 11921559
    Abstract: Embodiments are directed to a power grid distribution for a deterministic processor. The deterministic processor includes a plurality of functional slices, a plurality of data transport lanes for transporting data across the functional slices along a first spatial dimension, and a plurality of instruction control units (ICUs). An instruction in each subset of the ICUs includes a functional slice specific operation code and is transported to a corresponding functional slice along a second spatial dimension orthogonal to the first spatial dimension. A power supply grid of metal traces is spread across the first and second spatial dimensions for supplying power to the functional slices and the ICUs. At least a portion of the metal traces are routed as discontinuous stubs along the first spatial dimension or the second spatial dimension.
    Type: Grant
    Filed: April 28, 2022
    Date of Patent: March 5, 2024
    Assignee: Groq, Inc.
    Inventor: Jeffrey Werner
  • Patent number: 11892896
    Abstract: In one embodiment, the present disclosure includes a method of reducing power in an artificial intelligence processor. For each cycle, over a plurality of cycles, an AI model is translated into operations executable on an artificial intelligence processor. The translating is based on power parameters that correspond to power consumption and performance of the artificial intelligence processor. The AI processor is configured with the executable operations, and input activation data sets are processed. Accordingly, result sets, power consumption data, and performance data are generated and stored over the plurality of cycles. The method further includes training an AI algorithm using the stored parameters, the power consumption data, and the performance data. A trained AI algorithm outputs a plurality of optimized parameters to reduce power consumption of the AI processor. The AI model is then translated into optimized executable operations based on the plurality of optimized parameters.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: February 6, 2024
    Assignee: Groq, Inc.
    Inventor: Sushma Honnavara-Prasad
  • Patent number: 11875874
    Abstract: A memory structure having 2m read ports allowing for concurrent access to n data entries can be constructed using three memory structures each having 2m-1 read ports. The three memory structures include two structures providing access to half of the n data entries, and a difference structure providing access to difference data between the halves of the n data entries. Each pair of the 2m ports is connected to a respective port of each of the 2m-1-port data structures, such that each port of the part can access data entries of a first half of the n data entries either by accessing the structure storing that half directly, or by accessing both the difference structure and the structure containing the second half to reconstruct the data entries of the first half, thus allowing for a pair of ports to concurrently access any of the stored data entries in parallel.
    Type: Grant
    Filed: August 9, 2021
    Date of Patent: January 16, 2024
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Gregory M. Thorson
  • Patent number: 11868908
    Abstract: A system receives a predictive model and receives one or more runtime constraints. The system generates a directed acyclic graph (DAG) of the predictive model indicating dependencies. The system compiles the predictive model into first instructions for a first processor based on the one or more runtime constraints and the DAG. The system packages first instructions, the one or more runtime constraints, and the DAG of the predictive model in a first binary. The system recompiles the predictive model into second instructions for a second processor based on the runtime constraints and the DAG stored in the first processor. The system packages the second instructions, the DAG, and the runtime constraints in a second binary.
    Type: Grant
    Filed: December 16, 2022
    Date of Patent: January 9, 2024
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Gregory M. Thorson
  • Patent number: 11868804
    Abstract: A processor comprises a computational array of computational elements and an instruction dispatch circuit. The computational elements receive data operands via data lanes extending along a first dimension, and processes the operands based upon instructions received from the instruction dispatch circuit via instruction lanes extending along a second dimension. The instruction dispatch circuit receives raw instructions, and comprises an instruction dispatch unit (IDU) processor that processes a set of raw instructions to generate processed instructions for dispatch to the computational elements, where the number of processed instructions is not equal to the number of instructions of the set of raw instructions.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: January 9, 2024
    Assignee: Groq, Inc.
    Inventors: Brian Lee Kurtz, Dinesh Maheshwari, James David Sprach
  • Patent number: 11868250
    Abstract: A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows are configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: January 9, 2024
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Dennis Charles Abts, John Thompson, Gregory M. Thorson
  • Patent number: 11856226
    Abstract: Introduced here is a technique to create small compressed image files while preserving data quality upon decompression. Upon receiving an uncompressed data, such as an image, a video, an audio, and/or a structured data, a machine learning model identifies an object in the uncompressed data such as a house, a dog, a text, a distinct audio signal, a unique data pattern, etc. The identified object is compressed using a compression treatment optimized for the identified object. The identified object, either before or after the compression, is removed from the uncompressed data. The uncompressed data with the identified object removed is compressed using a standard compression treatment.
    Type: Grant
    Filed: March 8, 2021
    Date of Patent: December 26, 2023
    Assignee: Groq, Inc.
    Inventor: Jonathan Alexander Ross
  • Patent number: 11822510
    Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
    Type: Grant
    Filed: March 1, 2022
    Date of Patent: November 21, 2023
    Assignee: Groq, Inc.
    Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
  • Patent number: 11809514
    Abstract: A method comprises receiving a kernel used to convolve with an input tensor. For a first dimension of the kernel, a square block of values for each single dimensional vector of the kernel that includes all rotations of that single dimensional vector is generated. For each additional dimension of the kernel, group blocks of an immediately preceding dimension into sets of blocks, each set of blocks including blocks of the immediately preceding dimension that are aligned along a vector that is parallel to the axis of the dimension; and generate, for the additional dimension, one or more blocks of values, each block including all rotations of blocks within each of the sets of blocks of the immediately preceding dimension. The block of values corresponding to the last dimension in the additional dimensions of the kernel is output as the expanded kernel.
    Type: Grant
    Filed: November 4, 2021
    Date of Patent: November 7, 2023
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Thomas Hawkins, Gregory Michael Thorson, Matt Boyd