Patents Assigned to Graphcore Limited
  • Patent number: 11477050
    Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host. The gateway enables the transfer of batches of data to the subsystem at precompiled data exchange synchronisation points. The gateway acts to route data between accelerators which are connected in a scaled system of multiple gateways and accelerators using a global address space set up at compile time of an application to run on the computer system.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: October 18, 2022
    Assignee: Graphcore Limited
    Inventors: Ola Tørudbakken, Daniel John Pelham Wilkinson, Brian Manula
  • Patent number: 11467833
    Abstract: A processor having an instruction set including a load-store instruction having operands specifying, from amongst the registers in at least one register file, a respective destination of each of two load operations, a respective source of a store operation, and a pair of address registers arranged to hold three memory addresses, the three memory addresses being a respective load address for each of the two load operations and a respective store address for the store operation. The load-store instruction further includes three stride operands each specifying a respective stride value for each of the two load addresses and one store address, wherein at least some possible values of each stride operand specify the respective stride value by specifying one of a plurality of fields within a stride register in one of the one or more register files, each field holding a different stride value.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: October 11, 2022
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore
  • Patent number: 11462293
    Abstract: A memory controller is provided for reading and writing to and from a memory module. The memory controller implements an error correction algorithm, which calculates error correction code for message data to be written to the memory module and checks the error correction code against the message data when the data is read out of the memory module. The memory controller spreads each codeword over at least four different beats sent over the interface with the memory module, with each beat comprising a symbol of error correction code. Bits of a particular symbol of message data occupy the same positions in different beats. Since the bits of the symbols occupy the same positions in different beat, the number of bits affected by a hardware error is minimised. With four symbols of error correction code available for use in the codeword.
    Type: Grant
    Filed: July 20, 2021
    Date of Patent: October 4, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Graham Bernard Cunningham, Stephen Felix
  • Patent number: 11461175
    Abstract: Signature generation circuitry is configured to update a signature in response to each of a plurality of writes to memory. The signature is updated by performing bitwise operations between current bit values of the signature and at least some of the bits written to memory in response a write. The bitwise operation are order-independent such that the resulting signature is the same irrespective of the order in which the writes are used to update the signature. The signatures are formed in an order-independent manner such that, if no errors have occurred in generating the data to be written to be memory, the signatures will match. In this way, a compact signature is developed that is suitable export from the data processing device for checking against a corresponding data processing device of a machine running a duplicate application.
    Type: Grant
    Filed: September 17, 2021
    Date of Patent: October 4, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Daniel Wilkinson, Graham Bernard Cunningham
  • Patent number: 11455155
    Abstract: A computer system comprises a work accelerator, a gateway the transfer of data to the accelerator from external storage, the accelerator executes a first compiled code sequence to perform computations on data transferred to the accelerator from the gateway. The first compiled code sequence comprises a synchronisation instruction indicating a barrier between a compute phase in which the compute instructions are executed and an exchange phase, wherein execution of the synchronisation instruction causes an indication of a pre-compiled data exchange synchronisation point to be transferred to the gateway. The gateway comprises a streaming engine storing a second compiled code sequence in the form of a set of data transfer instructions executable by the streaming engine to perform data transfer operations to stream data through the gateway in the exchange phase, wherein the first and second compiled code sequences are generated as a related set at compile time.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: September 27, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Ola Torudbakken, Daniel John Pelham Wilkinson, Brian Manula, Harald Hoeg
  • Patent number: 11449117
    Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly addressing this droop so as to reduce the probability of circuit timing failures. This problem is addressed by provided an apparatus that is configured to detect the droop and react to mitigate the droop. The apparatus includes a frequency divider that is configured to receive an output of a clock signal generator (e.g. a phase locked loop) and produce an output signal in which a predefined fraction of the clock pulses in the output of the clock signal generator are removed from the output signal. By reducing the frequency of the clock signal in this way (as may be understood by examining equation 3) VDD is increased, hence mitigating the voltage droop. This technique provides a fast throttling mechanism that prevents excessive VDD droop across the processor.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: September 20, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Daniel Wilkinson
  • Patent number: 11449254
    Abstract: A system and method for providing a set of data transfer instructions for converting one or more tensors between two different layouts. A first layout is used for storage of the data in host memory. A second layout is used for storage of the data in external memory accessible to a subsystem. The subsystem acts as a work accelerator to the host, and reads the external memory and processes the data read from the external memory. The first layout may be a logical representation of the tensor. The second layout is optimised for transfer to and processing by the subsystem. The data transfer instructions for converting between the two layouts are generated in dependence upon an analysis of the instructions to be executed by the subsystem.
    Type: Grant
    Filed: September 17, 2020
    Date of Patent: September 20, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Richard Osborne, Chad Jarvis, Fabian Tschopp, Tim Hutt, Emmanuel Menage
  • Patent number: 11449338
    Abstract: A multi-tile processing system has a plurality of tiles each having an execution unit, and an interconnect operable to conduct communications between a group of the tiles according to a bulk synchronous parallel scheme. The execution unit is operable to execute instructions of an instruction set which has a synchronisation instruction for execution by each tile upon completion of its compute phase. The execution of the synchronisation instruction depends on the state of an exception enable flag. In one state, the synchronisation instruction causes the execution unit to send the synchronisation request to hardware logic in the interconnect. In another state of the exception enable flag the synchronisation instruction does not send the synchronisation request, but sets an exception events status to permit interrogation access to the tile. A corresponding method of controlling the debug states of the processing system is provided.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: September 20, 2022
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Matthew David Fyles
  • Patent number: 11449309
    Abstract: A hardware module comprising circuitry configured to: store a sequence of n bits in a register of the hardware module; generate a signed integer comprising a magnitude component and a sign bit by: if the most significant bit of the sequence of n bits is equal to one: set each of the n?1 of the most significant bits of the magnitude component to be equal to the corresponding bit of the n?1 least significant bits of the sequence of n bits; and set the sign bit to be zero; if the most significant bit of the sequence of n bits is equal to zero: set each of the n?1 of the most significant bits of the magnitude component to be equal to the inverse of the corresponding bit of the n?1 least significant bits of the sequence of n bits; and set the sign bit to be one.
    Type: Grant
    Filed: June 21, 2019
    Date of Patent: September 20, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Mrudula Gore
  • Patent number: 11442082
    Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly and accurately detecting this droop so as to reduce the probability of circuit timing failures. The droop detector described herein uses a tap sampled delay line in which a clock signal is split along two separate paths. Each of the taps in the paths are separated by two inverter delays such that the set of samples produced represent sample values of the clock signal that are each separated by a single inverter delay without inversion of the first clock signal between the samples.
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: September 13, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Daniel John Pelham Wilkinson
  • Patent number: 11416440
    Abstract: A computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising: a switch control instruction which when executed causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined received time, the switch control instruction comprising a delay control field which holds a value defining a delay between issuance of the instruction in the sequence of instructions and its execution by the execution unit.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: August 16, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix
  • Patent number: 11416258
    Abstract: A method for debugging a processor which is executing vertices of a software application is described. Each vertex is assigned to a programming thread of the processor. The processor has debug hardware for raising exceptions in certain break conditions. The method comprises inspecting a vertex identifier, comparing the vertex identifier and raising an instruction exception event for the programming thread if the vertex identifier assigned to the thread matches the vertex break identifier in the debug hardware. Exceptions are raised based on identified vertices, rather than just individual instructions or instruction addresses.
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: August 16, 2022
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Richard Luke Southwell Osborne, Matthew David Fyles
  • Patent number: 11403108
    Abstract: A processor includes: a memory; an execution pipeline having a plurality of pipeline stages for executing an operation on data provided to the execution pipeline and storing a result of the operation into the memory; a receive pipeline having a plurality of pipeline stages for handling incoming data to the processor and storing the incoming data into memory; context status storage for holding an exception indicator of an exception encountered by the receive pipeline whilst it is handling incoming data; the receive pipeline being configured to determine that an exception has been encountered in one of its pipeline stages and to delay committing the exception indicator to the context status storage until a final one of its pipeline stages and to continue to receive and store incoming data into the memory until the exception indicator has been committed.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: August 2, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: James Pallister, Jamie Hanlon
  • Patent number: 11372791
    Abstract: A computer comprising a plurality of interconnected processing nodes arranged in a configuration with multiple layers, arranged along an axis, comprising first and second endmost layers and at least one intermediate layer between the first and second endmost layers is provided. Each layer comprises a plurality of processing nodes connected in a ring by an intralayer respective set of links between each pair of neighbouring processing nodes, the links adapted to operate simultaneously. Nodes in each layer are connected to respective corresponding nodes in each adjacent layer by an interlayer link. Each processing node in the first endmost layer is connected to a corresponding node in the second endmost layer. Data is transmitted around a plurality of embedded one-dimensional logical rings with an asymmetric bandwidth utilisation, each logical ring using all processing nodes of the computer in such a manner that the plurality of embedded one-dimensional logical rings operate simultaneously.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: June 28, 2022
    Assignee: GRAPHCORE LIMITED
    Inventor: Simon Knowles
  • Patent number: 11366649
    Abstract: A method for generating a program to run on multiple tiles. The method comprises: receiving an input graph comprising data nodes, compute vertices and edges; receiving an initial tile-mapping specifying which data nodes and vertices are allocated to which tile; and determining a subgraph of the input graph that meets one or more heuristic rules. The rules comprises: the subgraph comprises at least one data node, the subgraph spans no more than a threshold number of tiles in the initial tile-mapping, and the subgraph comprises at least a minimum number of edges outputting to one or more vertices on one or more other tiles. The method further comprises adapting the initial mapping to migrate the data nodes and any vertices of the determined subgraph to said one or more other tiles.
    Type: Grant
    Filed: May 4, 2020
    Date of Patent: June 21, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Mark Lloyd Pupilli, David Lacey
  • Patent number: 11340868
    Abstract: An execution unit for a processor, the execution unit comprising: a look up table having a plurality of entries, each of the plurality of entries comprising an initial estimate for a result of an operation; a preparatory circuit configured to search the look up table using an index value dependent upon the operand to locate an entry comprising a first initial estimate for a result of the operation; a plurality of processing circuits comprising at least one multiplier circuit; and control circuitry configured to provide the first initial estimate to the at least one multiplier circuit of the plurality of processing circuits so as perform processing, by the plurality of processing units, of the first initial estimate to generate the function result, said processing comprising applying one or more Newton Raphson iterations to the first initial estimate.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: May 24, 2022
    Assignee: Graphcore Limited
    Inventors: Jonathan Mangnall, Stephen Felix
  • Patent number: 11334400
    Abstract: A gateway implementing multiple independent sync networks. The independent sync networks can be used to allow for synchronisation between different synchronisation groups of accelerators. The independent sync networks allow synchronisations to be carried out asynchronously and simultaneously. The gateway has sync propagation circuitry that receives a first synchronisation request for an upcoming exchange phase and propagates this sync request through a first sync network. The first synchronisation request is a request for synchronisation between subsystems of a first synchronisation group. The sync propagation circuitry of the gateway also receives a second synchronisation request for a different exchange phase and propagates this sync request through the second sync network. The second synchronisation request is a request for synchronisation between subsystems of a second synchronisation group. The two exchange phases overlap in time. Therefore, the syncs are simultaneous and asynchronous.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: May 17, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Brian Manula, Daniel John Pelham Wilkinson
  • Patent number: 11334320
    Abstract: An execution unit configured to execute a computer program instruction to generate random numbers based on a predetermined probability distribution. The execution unit comprises a hardware pseudorandom number generator configured to generate at least randomised bit string on execution of the instruction and adding circuitry which is configured to receive a number of bit sequences of a predetermined bit length selected from the randomised bit string and to sum them to produce a result.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: May 17, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Godfrey Da Costa
  • Patent number: 11328015
    Abstract: A function approximation system is disclosed for determining output floating point values of functions calculated using floating point numbers. Complex functions have different shapes in different subsets of their input domain, making them difficult to predict for different values of the input variable. The function approximation system comprises an execution unit configured to determine corresponding values of a given function given a floating point input to the function; a plurality of look up tables for each function type; a correction table of values which determines if corrections to the output value are required; and a table selector for finding an appropriate table for a given function.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: May 10, 2022
    Assignee: Graphcore Limited
    Inventors: Jonathan Mangnall, Stephen Felix
  • Patent number: 11327813
    Abstract: Implicit sync group selection is performed by having dual interfaces to a gateway. A subsystem coupled to the gateway selects a sync group to be used for an upcoming exchange by selecting the interface to which a sync request is written to. The gateway propagates the sync requests and/or acknowledgments in dependence upon configuration settings for the sync group that is associated with the interface to which the sync request was written to.
    Type: Grant
    Filed: December 17, 2020
    Date of Patent: May 10, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Brian Manula, Daniel John Pelham Wilkinson