Patents by Inventor Dennis Charles Abts

Dennis Charles Abts has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240037064
    Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
    Type: Application
    Filed: October 9, 2023
    Publication date: February 1, 2024
    Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
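
To make the slice-and-queue structure in the abstract above concrete, here is a minimal Python sketch of a processor with per-slice instruction queues, slice-specific opcodes, and directional data transport lanes. All class names, opcodes, and the two-slice setup are illustrative assumptions, not details taken from the patent.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Instruction:
    opcode: str       # slice-specific operation code
    direction: int    # +1 or -1: which way the transport lane moves data

class FunctionalSlice:
    """A column of tiles sharing one instruction queue (hypothetical model)."""
    def __init__(self, module_type, num_tiles):
        self.module_type = module_type          # e.g. "MEM" or "ALU"
        self.tiles = [0] * num_tiles            # one operand register per tile
        self.queue = deque()                    # instruction queue for this slice

    def issue(self, instr):
        self.queue.append(instr)

    def step(self, lanes):
        """Dequeue one instruction and apply it to every tile via its lane."""
        if not self.queue:
            return
        instr = self.queue.popleft()
        for lane_id, value in enumerate(self.tiles):
            if instr.opcode == "READ":          # MEM slice pushes data onto lane
                lanes[lane_id].append((value, instr.direction))
            elif instr.opcode == "ADD1":        # ALU slice consumes from lane
                if lanes[lane_id]:
                    v, _ = lanes[lane_id].pop(0)
                    self.tiles[lane_id] = v + 1

# One transport lane per tile row; every slice sees the same lanes.
lanes = [[] for _ in range(4)]
mem = FunctionalSlice("MEM", 4)
alu = FunctionalSlice("ALU", 4)
mem.tiles = [10, 20, 30, 40]

mem.issue(Instruction("READ", direction=+1))
alu.issue(Instruction("ADD1", direction=+1))
mem.step(lanes)
alu.step(lanes)
print(alu.tiles)   # [11, 21, 31, 41]
```
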
  • Patent number: 11868250
    Abstract: A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows is configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: January 9, 2024
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Dennis Charles Abts, John Thompson, Gregory M. Thorson
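
The key idea in this abstract is that deterministic timing replaces metadata: a tile infers which operation to perform from when a value arrives. Below is a toy cycle-level sketch of that scheme in Python; the one-tile-per-cycle hop latency, the opcodes, and the schedule format are all invented for illustration.

```python
# Toy model: operands flow left-to-right one tile per cycle, while each tile
# is handed its instruction on the exact cycle the matching operand arrives.
# Because the schedule is deterministic, the data itself carries no metadata:
# a tile knows what to do purely from *when* a value shows up.

NUM_TILES = 4
ops = {"ADD1": lambda x: x + 1, "DOUBLE": lambda x: 2 * x}

# Compile-time schedule: (cycle, tile) -> opcode.  Data injected at tile 0 on
# cycle 0 reaches tile k on cycle k, so instructions are staggered to match.
schedule = {(k, k): "ADD1" if k % 2 == 0 else "DOUBLE" for k in range(NUM_TILES)}

def run(value):
    for cycle in range(NUM_TILES):
        tile = cycle                      # the operand's position this cycle
        opcode = schedule.get((cycle, tile))
        if opcode is not None:
            value = ops[opcode](value)    # op chosen by timing, not by tags
    return value

print(run(3))   # ((3+1)*2 + 1)*2 = 18
```
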
  • Patent number: 11822510
    Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
    Type: Grant
    Filed: March 1, 2022
    Date of Patent: November 21, 2023
    Assignee: Groq, Inc.
    Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
  • Publication number: 20230359584
    Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
    Type: Application
    Filed: July 13, 2023
    Publication date: November 9, 2023
    Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
  • Patent number: 11645226
    Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
    Type: Grant
    Filed: March 17, 2022
    Date of Patent: May 9, 2023
    Assignee: Groq, Inc.
    Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
  • Publication number: 20230024670
    Abstract: Embodiments are directed to a deterministic streaming system with one or more deterministic streaming processors each having an array of processing elements and a first deterministic memory coupled to the processing elements. The deterministic streaming system further includes a second deterministic memory with multiple data banks having a global memory address space, and a controller. The controller initiates retrieval of first data from the data banks of the second deterministic memory as a first plurality of streams, each stream of the first plurality of streams streaming toward a respective group of processing elements of the array of processing elements. The controller further initiates writing of second data to the data banks of the second deterministic memory as a second plurality of streams, each stream of the second plurality of streams streaming from the respective group of processing elements toward a respective data bank of the second deterministic memory.
    Type: Application
    Filed: July 6, 2022
    Publication date: January 26, 2023
    Inventor: Dennis Charles Abts
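
A minimal Python sketch of the controller behavior described above: a banked second memory streams data toward per-group processing elements, and results stream back bank by bank. Bank counts, the bank-to-group assignment, and the stand-in process() step are assumptions, not details from the filing.

```python
# Banked global memory feeding groups of processing elements as parallel
# streams, with results streamed back to their respective banks.

NUM_BANKS = 4
BANK_DEPTH = 8

banks = [[b * 100 + i for i in range(BANK_DEPTH)] for b in range(NUM_BANKS)]

def process(group_id, value):
    return value + group_id        # stand-in for the PE group's real work

def controller_step():
    # First plurality of streams: bank b -> PE group b.
    inbound = [list(banks[b]) for b in range(NUM_BANKS)]
    # Each group consumes its stream element by element.
    outbound = [[process(g, v) for v in stream] for g, stream in enumerate(inbound)]
    # Second plurality of streams: PE group g -> bank g (write-back).
    for b in range(NUM_BANKS):
        banks[b] = outbound[b]

controller_step()
print(banks[1][:3])   # [100, 101, 102] -> [101, 102, 103]
```
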
  • Patent number: 11392535
    Abstract: A computational array is implemented in which all operands and results are loaded or output from a single side of the array. The computational array comprises a plurality of cells arranged in n rows and m columns, each configured to produce a processed value based upon a weight value and an activation value. The cells receive weight and activation values via colinear weight and activation transmission channels that each extend across a first side edge of the computational array to provide weight values and activation values to the cells of the array. In addition, result values produced at a top cell of each of the m columns of the array are routed through the array to be output from the same first side edge of the array at a same relative timing at which the result values were produced.
    Type: Grant
    Filed: November 25, 2020
    Date of Patent: July 19, 2022
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Tom Hawkins, Dennis Charles Abts
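
As a rough functional model of the single-sided array above, the Python sketch below computes the m column results (each column accumulates weight-times-activation products) and a plausible exit schedule back out the same west edge. The hop-per-cycle timing model and all names are assumptions for illustration.

```python
import numpy as np

# Functional sketch of a single-sided computational array: n x m cells each
# multiply a stored weight by an incoming activation, products accumulate up
# each column, and the m column results are routed back out of the same west
# edge they entered, preserving their relative production timing.

n, m = 3, 4
rng = np.random.default_rng(0)
W = rng.integers(0, 5, size=(n, m))      # weight per cell, loaded from the west
x = rng.integers(0, 5, size=n)           # activations, also entering west

# Column c's result pops out of its top cell after the activations have crossed
# c columns, then travels c hops back west, so results keep the same relative
# timing at which they were produced (assumed one hop per cycle).
results = W.T @ x                        # what the column accumulations compute
exit_cycle = [n + 2 * c for c in range(m)]   # produced at cycle n+c, +c hops home

for c in range(m):
    print(f"column {c}: result {results[c]} exits west edge at cycle {exit_cycle[c]}")
```
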
  • Patent number: 11360934
    Abstract: Embodiments are directed to a processor having a functional slice architecture. The processor is divided into tiles (or functional units) organized into a plurality of functional slices. The functional slices are configured to perform specific operations within the processor, which includes memory slices for storing operand data and arithmetic logic slices for performing operations on received operand data (e.g., vector processing, matrix manipulation). The processor includes a plurality of functional slices of a module type, each functional slice having a plurality of tiles. The processor further includes a plurality of data transport lanes for transporting data in a direction indicated in a corresponding instruction. The processor also includes a plurality of instruction queues, each instruction queue associated with a corresponding functional slice of the plurality of functional slices, wherein the instructions in the instruction queues comprise a functional slice specific operation code.
    Type: Grant
    Filed: November 27, 2020
    Date of Patent: June 14, 2022
    Assignee: Groq, Inc.
    Inventors: Dennis Charles Abts, Jonathan Alexander Ross, John Thompson, Gregory Michael Thorson
  • Patent number: 11263129
    Abstract: A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows is configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
    Type: Grant
    Filed: July 30, 2019
    Date of Patent: March 1, 2022
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Dennis Charles Abts, John Thompson, Gregory M. Thorson
  • Patent number: 11243880
    Abstract: A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows is configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: February 8, 2022
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Dennis Charles Abts, John Thompson, Gregory M. Thorson
  • Publication number: 20210157767
    Abstract: A computational array is implemented in which all operands and results are loaded or output from a single side of the array. The computational array comprises a plurality of cells arranged in n rows and m columns, each configured to produce a processed value based upon a weight value and an activation value. The cells receive weight and activation values via colinear weight and activation transmission channels that each extend across a first side edge of the computational array to provide weight values and activation values to the cells of the array. In addition, result values produced at a top cell of each of the m columns of the array are routed through the array to be output from the same first side edge of the array at a same relative timing at which the result values were produced.
    Type: Application
    Filed: November 25, 2020
    Publication date: May 27, 2021
    Inventors: Jonathan Alexander Ross, Tom Hawkins, Dennis Charles Abts
  • Patent number: 10938412
    Abstract: A predictive model utilizes a set of coefficients for processing received input data. To reduce the memory used to store the coefficients, a compression circuit compresses the set of coefficients prior to storage by generating a cumulative count distribution of the coefficient values, and identifying a distribution function approximating the cumulative count distribution. Function parameters for the determined function are stored in a memory and used by a decompression circuit to apply the function to the compressed coefficients to determine the decompressed coefficient values. Storing the function parameters may consume less memory than storing a look-up table for decompression, and may reduce the number of memory look-ups required during decompression.
    Type: Grant
    Filed: May 1, 2020
    Date of Patent: March 2, 2021
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Dennis Charles Abts
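
The compression scheme above can be sketched end to end: build the cumulative count distribution of the coefficients, approximate it with a parametric function, store only the function parameters, and invert the function to decompress. The choice of a Gaussian CDF fitted by moment matching, the 8-bit code width, and all names below are illustrative assumptions; the patent does not prescribe a particular distribution function.

```python
import numpy as np
from scipy.special import erf, erfinv

# Instead of storing a lookup table, fit a parametric distribution function to
# the cumulative count distribution of the coefficients and keep only its
# parameters (here: a Gaussian CDF, fitted by moment matching -- an assumption).

BITS = 8
LEVELS = (1 << BITS) - 1

def gaussian_cdf(x, mu, sigma):
    return 0.5 * (1.0 + erf((x - mu) / (sigma * np.sqrt(2.0))))

def gaussian_icdf(p, mu, sigma):
    return mu + sigma * np.sqrt(2.0) * erfinv(2.0 * p - 1.0)

coeffs = np.random.default_rng(1).normal(0.0, 0.02, size=10_000)

# "Compression circuit": fit the CDF, store only (mu, sigma) plus 8-bit codes.
mu, sigma = coeffs.mean(), coeffs.std()
codes = np.round(gaussian_cdf(coeffs, mu, sigma) * LEVELS).astype(np.uint8)

# "Decompression circuit": apply the inverse function to the stored codes.
p = np.clip(codes / LEVELS, 1e-6, 1 - 1e-6)   # avoid erfinv(+-1) = inf
recovered = gaussian_icdf(p, mu, sigma)

print(f"max reconstruction error: {np.abs(recovered - coeffs).max():.5f}")
```
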
  • Publication number: 20200259504
    Abstract: A predictive model utilizes a set of coefficients for processing received input data. To reduce the memory used to store the coefficients, a compression circuit compresses the set of coefficients prior to storage by generating a cumulative count distribution of the coefficient values, and identifying a distribution function approximating the cumulative count distribution. Function parameters for the determined function are stored in a memory and used by a decompression circuit to apply the function to the compressed coefficients to determine the decompressed coefficient values. Storing the function parameters may consume less memory than storing a look-up table for decompression, and may reduce the number of memory look-ups required during decompression.
    Type: Application
    Filed: May 1, 2020
    Publication date: August 13, 2020
    Inventors: Jonathan Alexander Ross, Dennis Charles Abts
  • Patent number: 10680644
    Abstract: A predictive model utilizes a set of coefficients for processing received input data. To reduce the memory used to store the coefficients, a compression circuit compresses the set of coefficients prior to storage by generating a cumulative count distribution of the coefficient values, and identifying a distribution function approximating the cumulative count distribution. Function parameters for the determined function are stored in a memory and used by a decompression circuit to apply the function to the compressed coefficients to determine the decompressed coefficient values. Storing the function parameters may consume less memory than storing a look-up table for decompression, and may reduce the number of memory look-ups required during decompression.
    Type: Grant
    Filed: September 14, 2018
    Date of Patent: June 9, 2020
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Dennis Charles Abts
  • Publication number: 20190207626
    Abstract: A predictive model utilizes a set of coefficients for processing received input data. To reduce the memory used to store the coefficients, a compression circuit compresses the set of coefficients prior to storage by generating a cumulative count distribution of the coefficient values, and identifying a distribution function approximating the cumulative count distribution. Function parameters for the determined function are stored in a memory and used by a decompression circuit to apply the function to the compressed coefficients to determine the decompressed coefficient values. Storing the function parameters may consume less memory than storing a look-up table for decompression, and may reduce the number of memory look-ups required during decompression.
    Type: Application
    Filed: September 14, 2018
    Publication date: July 4, 2019
    Inventors: Jonathan Alexander Ross, Dennis Charles Abts
  • Patent number: 10084718
    Abstract: The exemplary embodiments provide an indirect hypercube topology for a datacenter network. The indirect hypercube is formed by providing each host with a multi-port network interface controller (NIC). One port of the NIC is connected to a fat-tree network while another port is connected to a peer host, forming a single dimension of an indirect binary n-cube. Hence, the composite topology becomes a hierarchical tree of cubes. The hierarchical tree-of-cubes topology uses (a) the fat-tree topology to scale to a large host count and (b) the indirect binary n-cube topology at the leaves of the fat tree for a tightly coupled, high-bandwidth interconnect among a subset of hosts.
    Type: Grant
    Filed: December 31, 2013
    Date of Patent: September 25, 2018
    Assignee: Google LLC
    Inventor: Dennis Charles Abts
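
The wiring rule in this abstract is easy to sketch: each host's NIC has one port uplinked to a fat-tree edge switch and one port cabled to a cube peer. In the hypothetical Python model below, the group size and the XOR-based pairing rule are assumptions, chosen so that every cube link connects hosts under different edge switches and thus bypasses the over-subscribed tree.

```python
# "Hierarchical tree of cubes" wiring: every host has a two-port NIC, with
# port 0 uplinked to a fat-tree edge switch and port 1 cabled to a peer host,
# forming one dimension of an indirect binary n-cube.

HOSTS = 16
HOSTS_PER_EDGE_SWITCH = 4

def edge_switch(host):
    return host // HOSTS_PER_EDGE_SWITCH          # fat-tree leaf this host uses

def cube_peer(host):
    # Flip one address bit to find the host on the other end of the NIC's
    # second port: hosts h and h ^ HOSTS_PER_EDGE_SWITCH sit under different
    # edge switches, so the cube link bypasses the over-subscribed tree.
    return host ^ HOSTS_PER_EDGE_SWITCH

for h in range(HOSTS):
    print(f"host {h:2d}: switch s{edge_switch(h)}, cube peer {cube_peer(h)}")
```
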
  • Patent number: 9929960
    Abstract: Aspects and implementations of the present disclosure are directed to an indirect generalized hypercube network in a computer network facility. Servers in the computer network facility participate in both an over-subscribed fat tree network hierarchy culminating in a gateway connection to external networks and in an indirect hypercube network interconnecting a plurality of servers in the fat tree. The participant servers have multiple network interface ports, including at least one port for a link to an edge layer network device of the fat tree and at least one port for a link to a peer server in the indirect hypercube network. Servers are grouped by edge layer network device to form virtual switches in the indirect hypercube network and data packets are routed between servers using routes through the virtual switches. Routes leverage properties of the hypercube topology. Participant servers function as destination points and as virtual interfaces for the virtual switches.
    Type: Grant
    Filed: May 31, 2017
    Date of Patent: March 27, 2018
    Assignee: Google LLC
    Inventors: Dennis Charles Abts, Abdul Kabbani, Robert Felderman
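
Routing through the virtual switches described above can be sketched as dimension-order correction: each hop first moves, via the edge switch, to the group member that owns the needed cube dimension, then crosses that cube link. The Python below is a hypothetical model; the group size, the dimension-ownership rule, and the lowest-bit-first order are assumptions.

```python
# Servers under one edge switch form a "virtual switch"; a packet corrects one
# differing address bit per hop by first moving (through the edge switch) to
# the member server owning that dimension's cube link, then crossing the link.

GROUP = 4                       # servers per edge switch = cube dimensions
def group_of(s):  return s // GROUP
def member(g, d): return g * GROUP + d        # server owning dimension d in group g

def route(src, dst):
    hops, cur = [src], src
    while group_of(cur) != group_of(dst):
        diff = group_of(cur) ^ group_of(dst)
        d = (diff & -diff).bit_length() - 1   # lowest differing dimension
        owner = member(group_of(cur), d)      # reached via the virtual switch
        if owner != cur:
            hops.append(owner)
        cur = member(group_of(cur) ^ (1 << d), d)   # cross the cube link
        hops.append(cur)
    if cur != dst:
        hops.append(dst)                      # final hop inside the group
    return hops

print(route(src=1, dst=14))    # [1, 0, 4, 5, 13, 14]
```
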
  • Patent number: 9705798
    Abstract: Aspects and implementations of the present disclosure are directed to an indirect generalized hypercube network in a data center. Servers in the data center participate in both an over-subscribed fat tree network hierarchy culminating in a gateway connection to external networks and in an indirect hypercube network interconnecting a plurality of servers in the fat tree. The participant servers have multiple network interface ports, including at least one port for a link to an edge layer network device of the fat tree and at least one port for a link to a peer server in the indirect hypercube network. Servers are grouped by edge layer network device to form virtual switches in the indirect hypercube network and data packets are routed between servers using routes through the virtual switches. Routes leverage properties of the hypercube topology. Participant servers function as destination points and as virtual interfaces for the virtual switches.
    Type: Grant
    Filed: January 7, 2014
    Date of Patent: July 11, 2017
    Assignee: Google Inc.
    Inventors: Dennis Charles Abts, Abdul Kabbani, Robert Felderman
  • Patent number: 8806244
    Abstract: Energy-proportional solutions are provided for computer networks such as datacenters. Congestion-sensing heuristics are used to adaptively route traffic across links. Traffic intensity is sensed and links are dynamically activated as they are needed. As the offered load decreases, the lower channel utilization is sensed and the link speed is reduced to save power. Flattened butterfly topologies can be used in a further power-saving approach. Switch mechanisms exploit the topology's capabilities by reconfiguring link speeds on the fly to match bandwidth and power with the traffic demand. For instance, the system may estimate the future bandwidth needs of each link and reconfigure its data rate to meet those requirements while consuming less power. In one configuration, a mechanism is provided whereby the switch tracks the utilization of each of its links over an epoch, and then makes an adjustment at the end of the epoch.
    Type: Grant
    Filed: November 19, 2013
    Date of Patent: August 12, 2014
    Assignee: Google Inc.
    Inventors: Dennis Charles Abts, Peter Michael Klausler, Hong Liu, Michael Marty, Philip Michael Wells
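
The epoch mechanism in this abstract reduces to a small control loop: measure a link's traffic over an epoch, then select the lowest data rate that covers the measured demand. A minimal Python sketch follows, assuming a hypothetical rate set and headroom factor; neither is specified by the abstract.

```python
# Epoch-based link tuning: the switch tracks each link's utilization over an
# epoch and, at the epoch boundary, picks the lowest data rate that still
# covers the observed demand with some headroom.

RATES_GBPS = [5, 10, 20, 40]       # hypothetical selectable link speeds
HEADROOM = 1.25                    # keep 25% slack above measured demand

def retune(bytes_this_epoch, epoch_seconds):
    demand_gbps = bytes_this_epoch * 8 / epoch_seconds / 1e9
    for rate in RATES_GBPS:        # lowest rate that still fits the demand
        if rate >= demand_gbps * HEADROOM:
            return rate
    return RATES_GBPS[-1]          # saturated: run the link flat out

# A link that moved 1.2 GB in a 1-second epoch (~9.6 Gb/s of demand):
print(retune(bytes_this_epoch=1_200_000_000, epoch_seconds=1.0))
# -> 20: it needs 9.6 * 1.25 = 12 Gb/s, and 20 Gb/s is the cheapest fit
```
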
  • Patent number: 8730965
    Abstract: Adaptive packet routing is employed in a multiprocessor network configuration such as an InfiniBand switch architecture. Packets are routed from host to host through one or more switches. Upon receipt of a packet at a switch, the packet header is inspected to determine the destination host. A destination field in the header is used to index into a lookup table or other memory, which produces a route type and an output port grouping. Depending on the route type, one or more primary and secondary output port candidates are identified. An output port arbitration module chooses an output port from which to send a given packet, using congestion sensing inputs for the specified ports. A heuristic may include the congestion information that is provided to the arbitration module. Switching may be performed among minimal or non-minimal routes along each hop in the path, depending upon link and packet injection information.
    Type: Grant
    Filed: January 5, 2011
    Date of Patent: May 20, 2014
    Assignee: Google Inc.
    Inventors: Dennis Charles Abts, Peter Michael Klausler, Michael Marty, Philip Wells
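
The dispatch path in this abstract (a destination field indexes a lookup table yielding a route type and an output port group, then congestion-sensing arbitration picks a port) can be sketched as follows. The table contents, port counts, queue-depth congestion proxy, and escape threshold are all invented for this illustration.

```python
import random

# Destination-indexed lookup produces a route type and candidate output ports;
# the arbiter picks the least-congested candidate, preferring minimal routes
# and escaping to non-minimal ones only under heavy congestion.

ROUTE_TABLE = {                       # dest -> (route_type, output port group)
    0: ("minimal",  [1, 2]),
    1: ("adaptive", [1, 2, 3, 4]),    # includes secondary (non-minimal) ports
}
congestion = {p: random.randint(0, 9) for p in range(1, 5)}  # queue depths

def arbitrate(dest):
    route_type, ports = ROUTE_TABLE[dest]
    if route_type == "minimal":
        candidates = ports                  # primary candidates only
    else:
        primary, secondary = ports[:2], ports[2:]
        # Congestion-sensing heuristic: take a non-minimal port only when
        # every minimal candidate looks badly backed up.
        candidates = primary if min(congestion[p] for p in primary) < 6 else secondary
    return min(candidates, key=lambda p: congestion[p])

print("chosen output port:", arbitrate(dest=1))
```
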