Patents Assigned to Graphcore Limited
  • Patent number: 11550591
    Abstract: A processor comprising: an execution unit for executing a respective thread in each of a repeating sequence of time slots; and a plurality of context register sets, each comprising a respective set of registers for representing a state of a respective thread. The context register sets comprise a respective worker context register set for each of the number of time slots the execution unit is operable to interleave, and at least one extra context register set. The worker context register sets represent the respective states of worker threads and the extra context register set being represents the state of a supervisor thread. The processor is configured to begin running the supervisor thread in each of the time slots, and to enable the supervisor thread to then individually relinquish each of the time slots in which it is running to a respective one of the worker threads.
    Type: Grant
    Filed: February 10, 2021
    Date of Patent: January 10, 2023
    Assignee: GRAPHCORE LIMITED
    Inventor: Simon Christian Knowles
  • Patent number: 11531637
    Abstract: A computer comprising a plurality of interconnected processing nodes arranged in a toroid configuration in which multiple layers of interconnected nodes are arranged along an axis; each layer comprising a plurality of processing nodes connected in a ring in a non-axial plane by at least an intralayer respective set of links between each pair of neighbouring processing nodes, the links in each set adapted to operate simultaneously; wherein each of the processing nodes in each layer is connected to a respective corresponding node in each adjacent layer by an interlayer link to form respective rings along the axis; the computer programmed to provide a plurality of embedded one-dimensional logical paths and to transmit data around each of the embedded one-dimensional paths in such a manner that the plurality of embedded one-dimensional logical paths operate simultaneously, each logical path using all processing nodes of the computer in a sequence.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: December 20, 2022
    Assignee: GRAPHCORE LIMITED
    Inventor: Simon Knowles
  • Patent number: 11520941
    Abstract: Access permissions are set for different requesting circuits on a control bus. The access permissions can be set by the level 1 manager and the level 2 manager, allowing two layers of security to be added. The level 1 manager has priority, allowing it to add access permissions that cannot be removed by the level 2 manager.
    Type: Grant
    Filed: May 24, 2021
    Date of Patent: December 6, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Daniel John Pelham Wilkinson, Graham Bernard Cunningham
  • Patent number: 11520726
    Abstract: A processor comprises a plurality of processing units on an integrated circuit interconnected by an exchange. The exchange has a group of exchange paths extending between first and second portions of the integrated circuit. Each group has a first exchange block in the first portion and a second exchange block in the second portion. The processor has a first external interface in the first portion a second external interface in the second portion and a routing bus which routes packets between the external interfaces and the exchange blocks. The first external interface exchanges packets between the integrated circuit and a host. The second interface exchanges packets between the integrated circuit and another integrated circuit. Errors may be trapped when packets are wrongly addressed. A network of such processors is also provided.
    Type: Grant
    Filed: July 14, 2021
    Date of Patent: December 6, 2022
    Assignee: GRAPHCORE LIMITED
    Inventor: Hachem Yassine
  • Patent number: 11507386
    Abstract: A processing system comprises a first subsystem comprising at least one host processor and one or more storage units, and a second subsystem comprising at least one second processor. Each second processor comprises a plurality of tiles. Each tile comprises a processing unit and memory. At least one storage unit stores bootloader code for each of first and second subsets of the plurality of tiles of at least one second processor. The first subsystem writes bootloader code to each of the first subset of tiles of the at least one second processor. At least one of the first subset of tiles requests at least one of the storage units to return the bootloader code to the second subset of the plurality of tiles. Each tile to which the bootloader code is written retrieves boot code from the storage unit and then runs said boot code.
    Type: Grant
    Filed: July 31, 2019
    Date of Patent: November 22, 2022
    Assignee: GRAPHCORE LIMITED
    Inventor: Daniel John Pelham Wilkinson
  • Patent number: 11507416
    Abstract: A computer system comprising: (i) a computer subsystem configured to act as a work accelerator, and (ii) a gateway connected to the computer subsystem, the gateway enabling the transfer of data to the computer subsystem from external storage at pre-compiled data exchange synchronization points attained by the computer subsystem, which act as a barrier between a compute phase and an exchange phase of the computer subsystem, wherein the computer subsystem is configured to pull data from a gateway transfer memory of the gateway in response to the pre-compiled data exchange synchronization point attained by the subsystem, wherein the gateway comprises at least one processor configured to perform at least one operation to pre-load at least some of the data from a first memory of the gateway to the gateway transfer memory in advance of the pre-compiled data exchange synchronization point attained by the subsystem.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: November 22, 2022
    Assignee: GRAPHCORE LIMITED
    Inventor: Brian Manula
  • Patent number: 11477050
    Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host. The gateway enables the transfer of batches of data to the subsystem at precompiled data exchange synchronisation points. The gateway acts to route data between accelerators which are connected in a scaled system of multiple gateways and accelerators using a global address space set up at compile time of an application to run on the computer system.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: October 18, 2022
    Assignee: Graphcore Limited
    Inventors: Ola Tørudbakken, Daniel John Pelham Wilkinson, Brian Manula
  • Patent number: 11467833
    Abstract: A processor having an instruction set including a load-store instruction having operands specifying, from amongst the registers in at least one register file, a respective destination of each of two load operations, a respective source of a store operation, and a pair of address registers arranged to hold three memory addresses, the three memory addresses being a respective load address for each of the two load operations and a respective store address for the store operation. The load-store instruction further includes three stride operands each specifying a respective stride value for each of the two load addresses and one store address, wherein at least some possible values of each stride operand specify the respective stride value by specifying one of a plurality of fields within a stride register in one of the one or more register files, each field holding a different stride value.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: October 11, 2022
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore
  • Patent number: 11462293
    Abstract: A memory controller is provided for reading and writing to and from a memory module. The memory controller implements an error correction algorithm, which calculates error correction code for message data to be written to the memory module and checks the error correction code against the message data when the data is read out of the memory module. The memory controller spreads each codeword over at least four different beats sent over the interface with the memory module, with each beat comprising a symbol of error correction code. Bits of a particular symbol of message data occupy the same positions in different beats. Since the bits of the symbols occupy the same positions in different beat, the number of bits affected by a hardware error is minimised. With four symbols of error correction code available for use in the codeword.
    Type: Grant
    Filed: July 20, 2021
    Date of Patent: October 4, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Graham Bernard Cunningham, Stephen Felix
  • Patent number: 11461175
    Abstract: Signature generation circuitry is configured to update a signature in response to each of a plurality of writes to memory. The signature is updated by performing bitwise operations between current bit values of the signature and at least some of the bits written to memory in response a write. The bitwise operation are order-independent such that the resulting signature is the same irrespective of the order in which the writes are used to update the signature. The signatures are formed in an order-independent manner such that, if no errors have occurred in generating the data to be written to be memory, the signatures will match. In this way, a compact signature is developed that is suitable export from the data processing device for checking against a corresponding data processing device of a machine running a duplicate application.
    Type: Grant
    Filed: September 17, 2021
    Date of Patent: October 4, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Daniel Wilkinson, Graham Bernard Cunningham
  • Patent number: 11455155
    Abstract: A computer system comprises a work accelerator, a gateway the transfer of data to the accelerator from external storage, the accelerator executes a first compiled code sequence to perform computations on data transferred to the accelerator from the gateway. The first compiled code sequence comprises a synchronisation instruction indicating a barrier between a compute phase in which the compute instructions are executed and an exchange phase, wherein execution of the synchronisation instruction causes an indication of a pre-compiled data exchange synchronisation point to be transferred to the gateway. The gateway comprises a streaming engine storing a second compiled code sequence in the form of a set of data transfer instructions executable by the streaming engine to perform data transfer operations to stream data through the gateway in the exchange phase, wherein the first and second compiled code sequences are generated as a related set at compile time.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: September 27, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Ola Torudbakken, Daniel John Pelham Wilkinson, Brian Manula, Harald Hoeg
  • Patent number: 11449309
    Abstract: A hardware module comprising circuitry configured to: store a sequence of n bits in a register of the hardware module; generate a signed integer comprising a magnitude component and a sign bit by: if the most significant bit of the sequence of n bits is equal to one: set each of the n?1 of the most significant bits of the magnitude component to be equal to the corresponding bit of the n?1 least significant bits of the sequence of n bits; and set the sign bit to be zero; if the most significant bit of the sequence of n bits is equal to zero: set each of the n?1 of the most significant bits of the magnitude component to be equal to the inverse of the corresponding bit of the n?1 least significant bits of the sequence of n bits; and set the sign bit to be one.
    Type: Grant
    Filed: June 21, 2019
    Date of Patent: September 20, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Mrudula Gore
  • Patent number: 11449338
    Abstract: A multi-tile processing system has a plurality of tiles each having an execution unit, and an interconnect operable to conduct communications between a group of the tiles according to a bulk synchronous parallel scheme. The execution unit is operable to execute instructions of an instruction set which has a synchronisation instruction for execution by each tile upon completion of its compute phase. The execution of the synchronisation instruction depends on the state of an exception enable flag. In one state, the synchronisation instruction causes the execution unit to send the synchronisation request to hardware logic in the interconnect. In another state of the exception enable flag the synchronisation instruction does not send the synchronisation request, but sets an exception events status to permit interrogation access to the tile. A corresponding method of controlling the debug states of the processing system is provided.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: September 20, 2022
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Matthew David Fyles
  • Patent number: 11449117
    Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly addressing this droop so as to reduce the probability of circuit timing failures. This problem is addressed by provided an apparatus that is configured to detect the droop and react to mitigate the droop. The apparatus includes a frequency divider that is configured to receive an output of a clock signal generator (e.g. a phase locked loop) and produce an output signal in which a predefined fraction of the clock pulses in the output of the clock signal generator are removed from the output signal. By reducing the frequency of the clock signal in this way (as may be understood by examining equation 3) VDD is increased, hence mitigating the voltage droop. This technique provides a fast throttling mechanism that prevents excessive VDD droop across the processor.
    Type: Grant
    Filed: April 8, 2020
    Date of Patent: September 20, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Daniel Wilkinson
  • Patent number: 11449254
    Abstract: A system and method for providing a set of data transfer instructions for converting one or more tensors between two different layouts. A first layout is used for storage of the data in host memory. A second layout is used for storage of the data in external memory accessible to a subsystem. The subsystem acts as a work accelerator to the host, and reads the external memory and processes the data read from the external memory. The first layout may be a logical representation of the tensor. The second layout is optimised for transfer to and processing by the subsystem. The data transfer instructions for converting between the two layouts are generated in dependence upon an analysis of the instructions to be executed by the subsystem.
    Type: Grant
    Filed: September 17, 2020
    Date of Patent: September 20, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Richard Osborne, Chad Jarvis, Fabian Tschopp, Tim Hutt, Emmanuel Menage
  • Patent number: 11442082
    Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly and accurately detecting this droop so as to reduce the probability of circuit timing failures. The droop detector described herein uses a tap sampled delay line in which a clock signal is split along two separate paths. Each of the taps in the paths are separated by two inverter delays such that the set of samples produced represent sample values of the clock signal that are each separated by a single inverter delay without inversion of the first clock signal between the samples.
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: September 13, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Daniel John Pelham Wilkinson
  • Patent number: 11416440
    Abstract: A computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising: a switch control instruction which when executed causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined received time, the switch control instruction comprising a delay control field which holds a value defining a delay between issuance of the instruction in the sequence of instructions and its execution by the execution unit.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: August 16, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix
  • Patent number: 11416258
    Abstract: A method for debugging a processor which is executing vertices of a software application is described. Each vertex is assigned to a programming thread of the processor. The processor has debug hardware for raising exceptions in certain break conditions. The method comprises inspecting a vertex identifier, comparing the vertex identifier and raising an instruction exception event for the programming thread if the vertex identifier assigned to the thread matches the vertex break identifier in the debug hardware. Exceptions are raised based on identified vertices, rather than just individual instructions or instruction addresses.
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: August 16, 2022
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Richard Luke Southwell Osborne, Matthew David Fyles
  • Patent number: 11403108
    Abstract: A processor includes: a memory; an execution pipeline having a plurality of pipeline stages for executing an operation on data provided to the execution pipeline and storing a result of the operation into the memory; a receive pipeline having a plurality of pipeline stages for handling incoming data to the processor and storing the incoming data into memory; context status storage for holding an exception indicator of an exception encountered by the receive pipeline whilst it is handling incoming data; the receive pipeline being configured to determine that an exception has been encountered in one of its pipeline stages and to delay committing the exception indicator to the context status storage until a final one of its pipeline stages and to continue to receive and store incoming data into the memory until the exception indicator has been committed.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: August 2, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: James Pallister, Jamie Hanlon
  • Patent number: 11372791
    Abstract: A computer comprising a plurality of interconnected processing nodes arranged in a configuration with multiple layers, arranged along an axis, comprising first and second endmost layers and at least one intermediate layer between the first and second endmost layers is provided. Each layer comprises a plurality of processing nodes connected in a ring by an intralayer respective set of links between each pair of neighbouring processing nodes, the links adapted to operate simultaneously. Nodes in each layer are connected to respective corresponding nodes in each adjacent layer by an interlayer link. Each processing node in the first endmost layer is connected to a corresponding node in the second endmost layer. Data is transmitted around a plurality of embedded one-dimensional logical rings with an asymmetric bandwidth utilisation, each logical ring using all processing nodes of the computer in such a manner that the plurality of embedded one-dimensional logical rings operate simultaneously.
    Type: Grant
    Filed: March 26, 2020
    Date of Patent: June 28, 2022
    Assignee: GRAPHCORE LIMITED
    Inventor: Simon Knowles