Patents Assigned to Cerebras Systems Inc.
  • Patent number: 11934945
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency, such as accuracy of learning, accuracy of prediction, speed of learning, performance of learning, and energy efficiency of learning. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element has processing resources and memory resources. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Stochastic gradient descent, mini-batch gradient descent, and continuous propagation gradient descent are techniques usable to train weights of a neural network modeled by the processing elements. Reverse checkpoint is usable to reduce memory usage during the training.
    Type: Grant
    Filed: February 23, 2018
    Date of Patent: March 19, 2024
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Michael Edwin James, Gary R. Lauterbach, Srikanth Arekapudi
  • Patent number: 11853867
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a compute element and a routing element. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Routing is controlled by virtual channel specifiers in each wavelet and routing configuration information in each router. Execution of an activate instruction or completion of a fabric vector operation activates one of the virtual channels. A virtual channel is selected from a pool comprising previously activated virtual channels and virtual channels associated with previously received wavelets. A task corresponding to the selected virtual channel is activated by executing instructions corresponding to the selected virtual channel.
    Type: Grant
    Filed: October 19, 2021
    Date of Patent: December 26, 2023
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Srikanth Arekapudi, Michael Edwin James, Gary R. Lauterbach
  • Patent number: 11727257
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Instructions executed by the compute element include operand specifiers, some specifying a data structure register storing a data structure descriptor describing an operand as a fabric vector or a memory vector. The data structure descriptor further describes the memory vector as one of a one-dimensional vector, a four-dimensional vector, or a circular buffer vector. Optionally, the data structure descriptor specifies an extended data structure register storing an extended data structure descriptor. The extended data structure descriptor specifies parameters relating to a four-dimensional vector or a circular buffer vector.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: August 15, 2023
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Srikanth Arekapudi, Gary R. Lauterbach, Michael Edwin James
  • Patent number: 11727254
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow based computations on wavelets of data. Each processing element has a compute element and a routing element. Each compute element has memory. Each router enables communication via wavelets with nearest neighbors in a 2D mesh. A compute element receives a wavelet. If a control specifier of the wavelet is a first value, then instructions are read from the memory of the compute element in accordance with an index specifier of the wavelet. If the control specifier is a second value, then instructions are read from the memory of the compute element in accordance with a virtual channel specifier of the wavelet. Then the compute element initiates execution of the instructions.
    Type: Grant
    Filed: August 27, 2020
    Date of Patent: August 15, 2023
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Gary R. Lauterbach, Michael Edwin James, Michael Morrison, Srikanth Arekapudi
  • Patent number: 11631600
    Abstract: Systems and methods of securing an integrated circuit assembly includes: arranging a plurality of securing elements within a plurality of orifices fabricated within one or more layer components of a plurality of layer components of an integrated circuit assembly; applying a mechanical compression load against the integrated circuit assembly that uniformly compresses together the plurality of layer components of the integrated circuit assembly; after applying the mechanical compression load to the integrated circuit assembly, fastening the plurality of securing elements while the integrated circuit assembly is in a compressed state based on the mechanical compression load; and terminating the application of the mechanical compression load against the integrated circuit assembly based on the fastening of the plurality of securing elements.
    Type: Grant
    Filed: August 14, 2020
    Date of Patent: April 18, 2023
    Assignee: CEREBRAS SYSTEMS INC.
    Inventors: Jean-Philippe Fricker, Frank Jun, Paul Kennedy
  • Patent number: 11626342
    Abstract: Systems and methods include an integrated circuit assembly that includes a semiconductor substrate; a heat transfer element; and an ambulatory thermal interface arranged between the semiconductor substrate and the heat transfer element, the ambulatory thermal interface comprising: a thermally conductive material, and a friction reduction material, wherein: the thermally conductive material is arranged along a surface of the heat transfer element, the friction reduction material is arranged along a surface of the semiconductor substrate, opposing surfaces of the thermally conductive material and the friction reduction material define a slidable interface when placed in contact.
    Type: Grant
    Filed: January 8, 2021
    Date of Patent: April 11, 2023
    Assignee: CEREBRAS SYSTEMS INC.
    Inventor: Jean-Philippe Fricker
  • Patent number: 11580394
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency, such as accuracy of learning, accuracy of prediction, speed of learning, performance of learning, and energy efficiency of learning. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element has processing resources and memory resources. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Stochastic gradient descent, mini-batch gradient descent, and continuous propagation gradient descent are techniques usable to train weights of a neural network modeled by the processing elements. Reverse checkpoint is usable to reduce memory usage during the training.
    Type: Grant
    Filed: June 24, 2020
    Date of Patent: February 14, 2023
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Michael Edwin James, Gary R. Lauterbach, Srikanth Arekapudi
  • Patent number: 11488004
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element has memory. At least a first single neuron is implemented using resources of a plurality of the array of processing elements. At least a portion of a second neuron is implemented using resources of one or more of the plurality of processing elements. In some usage scenarios, the foregoing neuron implementation enables greater performance by enabling a single neuron to use the computational resources of multiple processing elements and/or computational load balancing across the processing elements while maintaining locality of incoming activations for the processing elements.
    Type: Grant
    Filed: April 15, 2018
    Date of Patent: November 1, 2022
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Srikanth Arekapudi, Michael Edwin James, Gary R. Lauterbach
  • Patent number: 11475282
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of compute elements and routers performs flow-based computations on wavelets of data. Some instructions are performed in iterations, such as one iteration per element of a fabric vector or FIFO. When sources for an iteration of an instruction are unavailable, and/or there is insufficient space to store results of the iteration, indicators associated with operands of the instruction are checked to determine whether other work can be performed. In some scenarios, other work cannot be performed and processing stalls. Alternatively, information about the instruction is saved, the other work is performed, and sometime after the sources become available and/or sufficient space to store the results becomes available, the iteration is performed using the saved information.
    Type: Grant
    Filed: April 17, 2018
    Date of Patent: October 18, 2022
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Michael Edwin James, Gary R. Lauterbach, Srikanth Arekapudi
  • Patent number: 11449574
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements comprising a portion of a neural network accelerator performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element has a respective floating-point unit enabled to perform stochastic rounding, thus in some circumstances enabling reducing systematic bias in long dependency chains of floating-point computations. The long dependency chains of floating-point computations are performed, e.g., to train a neural network or to perform inference with respect to a trained neural network.
    Type: Grant
    Filed: April 13, 2018
    Date of Patent: September 20, 2022
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Edwin James, Michael Morrison, Gary R. Lauterbach, Srikanth Arekapudi
  • Patent number: 11367701
    Abstract: An integrated circuit assembly that includes a semiconductor wafer having a first coefficient of thermal expansion; an electronic circuit substrate having a second coefficient of thermal expansion that is different than the first coefficient of thermal expansion; and an elastomeric connector arranged between the semiconductor wafer and the electronic circuit substrate and that forms an operable signal communication path between the semiconductor wafer and the electronic circuit substrate.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: June 21, 2022
    Assignee: Cerebras Systems Inc.
    Inventor: Jean-Philippe Fricker
  • Patent number: 11367686
    Abstract: A semiconductor and a method of fabricating the semiconductor having multiple, interconnected die including: providing a semiconductor substrate having a plurality of disparate die formed within the semiconductor substrate, and a plurality of scribe lines formed between pairs of adjacent die of the plurality of disparate die; and fabricating, by a lithography system, a plurality of inter-die connections that extend between adjacent pair of die of the plurality of die.
    Type: Grant
    Filed: August 14, 2020
    Date of Patent: June 21, 2022
    Assignee: Cerebras Systems Inc.
    Inventors: Jean-Philippe Fricker, Philip Ferolito
  • Patent number: 11328208
    Abstract: Techniques in advanced deep learning provide improvements in one or more of cost, accuracy, performance, and energy efficiency. The deep learning accelerator is implemented at least in part via wafer-scale integration. The wafer comprises a plurality of processor elements, each augmented with redundancy-enabling couplings. The redundancy-enabling couplings enable using redundant ones of the processor elements to replace defective ones of the processor elements. Defect information gathered at wafer test and/or in-situ, such as in a datacenter, is used to determine configuration information for the redundancy-enabling couplings.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: May 10, 2022
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Edwin James, Michael Morrison, Srikanth Arekapudi, Gary R. Lauterbach
  • Patent number: 11328207
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, energy efficiency, and cost. In a first embodiment, a scaled array of processing elements is implementable with varying dimensions of the processing elements to enable varying price/performance systems. In a second embodiment, an array of clusters communicates via high-speed serial channels. The array and the channels are implemented on a Printed Circuit Board (PCB). Each cluster comprises respective processing and memory elements. Each cluster is implemented via a plurality of 3D-stacked dice, 2.5D-stacked dice, or both in a Ball Grid Array (BGA). A processing portion of the cluster is implemented via one or more Processing Element (PE) dice of the stacked dice. A memory portion of the cluster is implemented via one or more High Bandwidth Memory (HBM) dice of the stacked dice.
    Type: Grant
    Filed: August 11, 2019
    Date of Patent: May 10, 2022
    Assignee: Cerebras Systems Inc.
    Inventors: Gary R. Lauterbach, Sean Lie, Michael Morrison, Michael Edwin James, Srikanth Arekapudi
  • Patent number: 11321087
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements comprising a portion of a neural network accelerator performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element is enabled to execute instructions in accordance with an ISA. The ISA is enhanced in accordance with improvements with respect to deep learning acceleration.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: May 3, 2022
    Assignee: Cerebras Systems Inc.
    Inventors: Michael Morrison, Michael Edwin James, Sean Lie, Srikanth Arekapudi, Gary R. Lauterbach
  • Patent number: 11232348
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Instructions executed by the compute element include operand specifiers, some specifying a data structure register storing a data structure descriptor describing an operand as a fabric vector or a memory vector. The data structure descriptor further describes the memory vector as one of a one-dimensional vector, a four-dimensional vector, or a circular buffer vector. Optionally, the data structure descriptor specifies an extended data structure register storing an extended data structure descriptor. The extended data structure descriptor specifies parameters relating to a four-dimensional vector or a circular buffer vector.
    Type: Grant
    Filed: July 15, 2020
    Date of Patent: January 25, 2022
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Srikanth Arekapudi, Gary R. Lauterbach, Michael Edwin James
  • Patent number: 11232347
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Instructions executed by the compute element include operand specifiers, some specifying a data structure register storing a data structure descriptor describing an operand as a fabric vector or a memory vector. The data structure descriptor further describes various attributes of the fabric vector: length, microthreading eligibility, number of data elements to receive, transmit, and/or process in parallel, virtual channel and task identification information, whether to terminate upon receiving a control wavelet, and whether to mark an outgoing wavelet a control wavelet.
    Type: Grant
    Filed: April 17, 2018
    Date of Patent: January 25, 2022
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Michael Edwin James, Srikanth Arekapudi, Gary R. Lauterbach
  • Patent number: 11201137
    Abstract: The power on wafer assembly can include: a compliant connector, an integrated circuit, a printed circuit board (PCB), a power component, and a set of compliant connectors. The power on wafer assembly can optionally include: a compression element, a cooling system, a set of mechanical clamping components, and a power source. However, the power on wafer assembly can additionally or alternately include any other suitable components.
    Type: Grant
    Filed: September 4, 2020
    Date of Patent: December 14, 2021
    Assignee: Cerebras Systems Inc.
    Inventor: Jean-Philippe Fricker
  • Patent number: 11157806
    Abstract: Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a compute element and a routing element. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Routing is controlled by virtual channel specifiers in each wavelet and routing configuration information in each router. Execution of an activate instruction or completion of a fabric vector operation activates one of the virtual channels. A virtual channel is selected from a pool comprising previously activated virtual channels and virtual channels associated with previously received wavelets. A task corresponding to the selected virtual channel is activated by executing instructions corresponding to the selected virtual channel.
    Type: Grant
    Filed: April 17, 2018
    Date of Patent: October 26, 2021
    Assignee: Cerebras Systems Inc.
    Inventors: Sean Lie, Michael Morrison, Srikanth Arekapudi, Michael Edwin James, Gary R. Lauterbach
  • Patent number: 11145530
    Abstract: The integrated circuit assembly can include: a semiconductor and a substrate (e.g., PCB). The integrated circuit assembly can optionally include: a compliant connector, a thermal management, and a securing element. The semiconductor 210 can include a first alignment feature. (e.g., orifice). The substrate can include a second alignment feature (e.g., alignment target) and conductive pads. The substrate can optionally include a cavity.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: October 12, 2021
    Assignee: Cerebras Systems Inc.
    Inventors: Jean-Philippe Fricker, Tim Botsford, Philip Ferolito, Paul Kennedy