Patents Assigned to Graphcore Limited
  • Patent number: 11119559
    Abstract: There is disclosed a method of controlling the frequency of a clock signal in a processor. The method selects a first clock generator to provide a processor clock signal for executing an application. If a threshold event is detected, a second clock generator is selected. The method reduces the frequency of a clock signal generated by the first clock generator while a processor clock signal is being provided for execution of an application from the second clock generator. The second clock generator generates a clock at a lower speed than the first clock generator. After a predetermined time the first clock generator is reselected to provide the processor clock signal. The threshold detection is repeated until an optimum clock frequency is discovered.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: September 14, 2021
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Mrudula Gore
  • Patent number: 11119873
    Abstract: A processor comprises a plurality of processing units, wherein there is a fixed transmission time for transmitting a message from a sending processing unit to a receiving processing unit, based on the physical positions of the sending and receiving processing units in the processor. The processing units are arranged in a column, and the fixed transmission time depends on the position of a processing circuit in the column. An exchange fabric is provided for exchanging messages between sending and receiving processing units, the columns being arranged with respect to the exchange fabric such that the fixed transmission time depends on the distances of the processing circuits with respect to the exchange fabric. The processor comprises at least one delay stage for each processing circuit and switching circuitry for selectively switching the delay stage into or out of a communication path involved in message exchange.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: September 14, 2021
    Assignee: GRAPHCORE LIMITED
    Inventor: Stephen Felix
  • Patent number: 11119952
    Abstract: A gateway for use in a computing system to interface a host with the subsystem for acting as a work accelerator to the host, the gateway having an streaming engine for controlling the streaming of batches of data into and out of the gateway in response to pre-compiled data exchange synchronisation points attained by the subsystem, wherein the streaming of batches of data is selectively via at least one of an accelerator interface, a data connection interface, a gateway interface and an memory interface, wherein the streaming engine is configured to perform data preparation processing of the batches of data streamed into the gateway prior to said batches of data being streamed out of the gateway, wherein the data preparation processing comprises at least one of: data augmentation; decompression; and decryption.
    Type: Grant
    Filed: July 2, 2019
    Date of Patent: September 14, 2021
    Assignee: GRAPHCORE LIMITED
    Inventors: Ola Torudbakken, Brian Manula
  • Patent number: 11113060
    Abstract: A processing apparatus comprising one or more processing modules, each comprising an execution unit. The one or more processing modules are operable to run a plurality of parallel or concurrent threads, and the processing apparatus further comprises a storage location for storing an aggregated exit state of the plurality of threads. An instruction set of the processing apparatus comprises an exit instruction for inclusion in each of the plurality of threads, the exit state instruction taking an individual exit state of the respective thread as an operand. The exit instruction terminates the respective thread and also causes the individual exit state specified in the operand to contribute to the aggregated exit state.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: September 7, 2021
    Assignee: GRAPHCORE LIMITED
    Inventor: Simon Christian Knowles
  • Patent number: 11106510
    Abstract: A processing system comprising: a subsystem for acting as a work accelerator to a host processor, the subsystem comprising an arrangement of tiles; and an interconnect for communicating between the tiles and connecting the subsystem to the host. The interconnect comprises synchronization logic to coordinate barrier synchronizations between a group of the tiles. The synchronization logic comprises a host sync proxy module, comprising a counter written with a number of credits by the host processor, and being configured to automatically decrement the number of credits each time one of the barrier synchronizations requiring host involvement is performed. When the number of credits in the counter is exhausted, the barrier is not released until a further write from the host to the host sync proxy module, but when the number is credits in the counter is not exhausted the barrier is released without a separate write from the host.
    Type: Grant
    Filed: July 17, 2019
    Date of Patent: August 31, 2021
    Assignee: GRAPHCORE LIMITED
    Inventors: Daniel John Pelham Wilkinson, Stephen Felix, Matthew David Fyles, Richard Luke Southwell Osborne
  • Patent number: 11106432
    Abstract: An execution unit is described which is particularly configured to generate an exponential of an operand floating point format. The operand is multiplied by a fixed multiplicand, logged to the base 2 (e) to generate a multiplication result. An integer part and a fractional part are extracted from the multiplication result. An exponent register stores the integer part to form the exponent of the exponential result. A lookup table has a plurality of entries each providing a value of 2f for a fractional part f used to access a lookup table. The fractional part is derived from a mantissa of the operand. That is, first and second bit sequences are extracted from the mantissa. One of the bit sequences is used to generate an estimated fractional component, and the other is used to access a value from the lookup table.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: August 31, 2021
    Assignee: Graphcore Limited
    Inventors: Jonathan Mangnall, Stephen Felix
  • Patent number: 11061679
    Abstract: A processor comprising an execution unit, memory and one or more register files. The execution unit is configured to execute instances of machine code instructions from an instruction set. The types of instruction defined in the instruction set include a double-load instruction for loading from the memory to at least one of the one or more register files. The execution unit is configured so as, when the load instruction is executed, to perform a first load operation strided by a fixed stride, and a second load operation strided by a variable stride, the variable stride being specified in a variable stride register in one of the one or more register files.
    Type: Grant
    Filed: April 19, 2019
    Date of Patent: July 13, 2021
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore
  • Patent number: 11048563
    Abstract: A processing system comprising: a subsystem for acting as a work accelerator to a host processor, the subsystem comprising an arrangement of tiles; and an interconnect for communicating between the tiles and connecting the subsystem to the host. The interconnect comprises synchronization logic to coordinate barrier synchronizations between a group of the tiles. The synchronization logic comprises a host sync proxy module, comprising a counter written with a number of credits by the host processor, and being configured to automatically decrement the number of credits each time one of the barrier synchronizations requiring host involvement is performed. When the number of credits in the counter is exhausted, the barrier is not released until a further write from the host to the host sync proxy module, but when the number is credits in the counter is not exhausted the barrier is released without a separate write from the host.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: June 29, 2021
    Assignee: Graphcore Limited
    Inventors: Daniel John Pelham Wilkinson, Stephen Felix, Matthew David Fyles, Richard Luke Southwell Osborne
  • Patent number: 11048844
    Abstract: A method and system for improved design verification for a data processing device when performing a logic simulation. The system identifies certain corresponding coverpoints at different points in results of a logic simulation for a design. Using coverage results obtained for the design, a merging of the results is performed for the certain corresponding coverpoints in the design. In the merged results, a coverpoint is considered as covered if at least one corresponding coverpoint is covered during the logic simulation.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: June 29, 2021
    Assignee: GRAPHCORE LIMITED
    Inventors: Paul Miseldine, Michael Davie
  • Patent number: 11036673
    Abstract: A method of recording tile identifiers in each of a plurality of tiles of a multitile processor is described. Tiles are arranged in columns, each column having a plurality of processing circuits, each processing circuit comprising one or more tiles, wherein a base processing circuit in each column is connected to a set of processing circuit identifier wires. A base value is generated on each of the set of processing circuit identifier wires for the base processing circuit in each column. At the base processing circuit, the base value on the set of processing circuit identifier wires is read and incremented by one. The incremented value is propagated to a next processing circuit in the column, and at the next processing circuit a unique identifier is recorded by concatenating an identifier of the column and the incremented value.
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: June 15, 2021
    Assignee: Graphcore Limited
    Inventors: Stephen Felix, Jonathan Mangnall
  • Patent number: 11029962
    Abstract: An execution unit comprising a processing pipeline configured to perform calculations to evaluate a plurality of mathematical functions. The processing pipeline comprises a plurality of stages through which each calculation for evaluating a mathematical function progresses to an end result. Each of a plurality of processing circuits in the pipeline is configured to perform an operation on input values during at least one stage of the plurality of stages. The plurality of processing circuits include multiplier circuits. A first multiplier circuit and a second multiplier circuit are configured to operate in parallel, such that at the same stage in the processing pipeline, the first multiplier circuit and the second multiplier circuit perform their processing. A third multiplier circuit is arranged in series with the first multiplier circuit and the second multiplier circuit and processes outputs from the first multiplier circuit and the second multiplier circuit.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: June 8, 2021
    Assignee: GRAPHCORE LIMITED
    Inventor: Jonathan Mangnall
  • Patent number: 11023239
    Abstract: A processor comprising an execution unit, memory and one or more register files. The execution unit is configured to execute instances of machine code instructions from an instruction set. The types of instruction defined in the instruction set include a double-load instruction for loading from the memory to at least one of the one or more register files. The execution unit is configured so as, when the load instruction is executed, to perform a first load operation strided by a fixed stride, and a second load operation strided by a variable stride, the variable stride being specified in a variable stride register in one of the one or more register files.
    Type: Grant
    Filed: April 19, 2019
    Date of Patent: June 1, 2021
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore
  • Patent number: 11023290
    Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction cause a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgement being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: June 1, 2021
    Assignee: Graphcore Limited
    Inventors: Daniel John Pelham Wilkinson, Simon Christian Knowles, Matthew David Fyles, Alan Graham Alexander, Stephen Felix
  • Patent number: 11023413
    Abstract: A method of operating a system comprising multiple processor tiles divided into a plurality of domains wherein within each domain the tiles are connected to one another via a respective instance of a time-deterministic interconnect and between domains the tiles are connected to one another via a non-time-deterministic interconnect. The method comprises: performing a compute stage, then performing a respective internal barrier synchronization within each domain, then performing an internal exchange phase within each domain, then performing an external barrier synchronization to synchronize between different domains, then performing an external exchange phase between the domains.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: June 1, 2021
    Assignee: GRAPHCORE LIMITED
    Inventors: Daniel John Pelham Wilkinson, Stephen Felix, Richard Luke Southwell Osborne, Simon Christian Knowles, Alan Graham Alexander, Ian James Quinn
  • Patent number: 11010208
    Abstract: A work accelerator is connected to a gateway. The gateway enables the transfer of data to the work accelerator from an external storage at pre-compiled data synchronisation points attained by the work accelerator. The work accelerator is configured to send to a register of the gateway an indication of a sync group comprising the gateway. The work accelerator then sends to the gateway, a synchronisation request for a synchronisation to be performed at an upcoming pre-compiled data exchange synchronisation point. The sync propagation circuits are each configured to receive at least one synchronisation request and propagate or acknowledge the synchronisation request in dependence upon the indication of the sync group received from the work accelerator.
    Type: Grant
    Filed: October 25, 2019
    Date of Patent: May 18, 2021
    Assignee: GRAPHCORE LIMITED
    Inventor: Brian Manula
  • Patent number: 10970131
    Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host, the gateway enabling the transfer of batches of data to and from the subsystem at pre-compiled data exchange synchronisation points attained by the subsystem. The gateway is configured to: receive from a storage system data determined by the host to be processed by the subsystem; store a number of credits indicating the availability of data for transfer to the subsystem at each pre-compiled data exchange synchronisation point; receive a synchronisation request from the subsystem when it attains a data exchange synchronisation point; and in response to determining that the number of credits comprises a non-zero number of credits: transmit a synchronisation acknowledgment to the subsystem; and cause the received data to be transferred to the subsystem.
    Type: Grant
    Filed: December 28, 2018
    Date of Patent: April 6, 2021
    Assignee: Graphcore Limited
    Inventors: Ola Tørudbakken, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Stephen Felix, Matthew David Fyles, Brian Manula, Harald Høeg
  • Patent number: 10963315
    Abstract: A system comprising: a first subsystem comprising one or more first processors, and a second subsystem comprising one or more second processors. The second subsystem is configured to process code over a series of steps delineated by barrier synchronizations, and in a current step, to send a descriptor to the first subsystem specifying a value of each of one or more parameters of each of one or more interactions that the second subsystem is programmed to perform with the first subsystem via an inter-processor interconnect in a subsequent step. The first subsystem is configured to execute a portion of code to perform one or more preparatory operations, based on the specified values of at least one of the one or more parameters of each interaction as specified by the descriptor, to prepare for said one or more interactions prior to the barrier synchronization leading into the subsequent phase.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: March 30, 2021
    Assignee: Graphcore Limited
    Inventors: David Lacey, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Matthew David Fyles
  • Patent number: 10963003
    Abstract: The invention relates to a computer comprising: a plurality of processing units each having instruction storage holding a local program, an execution unit executing the local program, data storage for holding data; an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by each processing unit; a synchronisation module operable to generate a synchronisation signal to control the computer to switch between a compute phase and an exchange phase, wherein the processing units are configured to execute their local programs according to a common clock, the local programs being such that in the exchange phase at least one processing unit executes a send instruction from its local program to transmit at a transmit time a data packet onto its output set of connection
    Type: Grant
    Filed: October 19, 2018
    Date of Patent: March 30, 2021
    Assignee: GRAPHCORE LIMITED
    Inventors: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey
  • Patent number: 10956234
    Abstract: A system comprising a gateway for interfacing external data sources with one or more accelerators. The gateway comprises a plurality of virtual gateways, each of which is configured to stream data from the external data sources to one or more associated accelerators. The plurality of virtual gateways are each configured to stream data from external data sources so that the data is received at an associated accelerator in response to a synchronisation point being obtained by a synchronisation zone. Each of the virtual gateways is assigned a virtual ID so that when data is received at the gateway, data can be delivered to the appropriate gateway.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: March 23, 2021
    Assignee: GRAPHCORE LIMITED
    Inventors: Brian Manula, Harald Hoeg, Ola Torudbakken
  • Patent number: 10956165
    Abstract: A processor comprising: an execution unit for executing a respective thread in each of a repeating sequence of time slots; and a plurality of context register sets, each comprising a respective set of registers for representing a state of a respective thread. The context register sets comprise a respective worker context register set for each of the number of time slots the execution unit is operable to interleave, and at least one extra context register set. The worker context register sets represent the respective states of worker threads and the extra context register set being represents the state of a supervisor thread. The processor is configured to begin running the supervisor thread in each of the time slots, and to enable the supervisor thread to then individually relinquish each of the time slots in which it is running to a respective one of the worker threads.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: March 23, 2021
    Assignee: Graphcore Limited
    Inventor: Simon Christian Knowles