Patents by Inventor Daniel John Pelham WILKINSON

Daniel John Pelham WILKINSON has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Sync Group Selection

Publication number: 20210200601

Abstract: Implicit sync group selection is performed by having dual interfaces to a gateway. A subsystem coupled to the gateway selects a sync group to be used for an upcoming exchange by selecting the interface to which a sync request is written. The gateway propagates the sync requests and/or acknowledgments in dependence upon configuration settings for the sync group that is associated with the interface to which the sync request was written.
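
The interface-based selection described in this abstract can be pictured with a minimal Python sketch. Everything named here (`SyncGroupConfig`, `Gateway`, the `propagate_upstream` setting, the interface numbers) is a hypothetical stand-in chosen for illustration and is not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncGroupConfig:
    """Hypothetical per-group settings held by the gateway."""
    name: str
    propagate_upstream: bool  # assumed setting: forward the request or not


class Gateway:
    """Toy gateway with one sync group bound to each interface."""

    def __init__(self, configs):
        # Maps an interface id to the sync group configured for it.
        self._configs = dict(configs)

    def write_sync_request(self, interface_id):
        # The choice of interface implicitly selects the sync group.
        config = self._configs[interface_id]
        action = "forward" if config.propagate_upstream else "acknowledge locally"
        return f"{config.name}: {action}"


gateway = Gateway({
    0: SyncGroupConfig("group_a", propagate_upstream=False),
    1: SyncGroupConfig("group_b", propagate_upstream=True),
})
print(gateway.write_sync_request(0))
print(gateway.write_sync_request(1))
```

The subsystem never names a group explicitly in this sketch; writing to interface 0 or interface 1 is the whole selection.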

Type: Application

Filed: December 17, 2020

Publication date: July 1, 2021

Inventors: Brian MANULA, Daniel John Pelham WILKINSON

Synchronization with a host processor

Patent number: 11048563

Abstract: A processing system comprising: a subsystem for acting as a work accelerator to a host processor, the subsystem comprising an arrangement of tiles; and an interconnect for communicating between the tiles and connecting the subsystem to the host. The interconnect comprises synchronization logic to coordinate barrier synchronizations between a group of the tiles. The synchronization logic comprises a host sync proxy module, comprising a counter written with a number of credits by the host processor, and being configured to automatically decrement the number of credits each time one of the barrier synchronizations requiring host involvement is performed. When the number of credits in the counter is exhausted, the barrier is not released until a further write from the host to the host sync proxy module, but when the number of credits in the counter is not exhausted the barrier is released without a separate write from the host.
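
As a rough illustration of the credit counter in this abstract, the following Python sketch models a host sync proxy. The class and method names are hypothetical, and two behaviours are assumptions that the abstract does not state: a host write overwrites the counter, and a stalled barrier consumes one credit when it is finally released.

```python
class HostSyncProxy:
    """Toy model of a credit counter gating barriers that need the host."""

    def __init__(self):
        self.credits = 0
        self.pending = False  # True while a barrier is stalled on the host

    def host_write(self, credits):
        # Assumption: the host write sets the counter to the given value.
        self.credits = credits
        if self.pending and self.credits > 0:
            # Assumption: releasing the stalled barrier uses one credit.
            self.credits -= 1
            self.pending = False
            return True
        return False

    def barrier_requiring_host(self):
        """Return True if the barrier is released without a new host write."""
        if self.credits > 0:
            self.credits -= 1
            return True
        self.pending = True
        return False


proxy = HostSyncProxy()
proxy.host_write(2)
results = [proxy.barrier_requiring_host() for _ in range(3)]
released_late = proxy.host_write(1)
print(results, released_late, proxy.credits)
```

With two credits, the first two barriers pass without host involvement and the third stalls until the host writes again.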

Type: Grant

Filed: February 1, 2018

Date of Patent: June 29, 2021

Assignee: Graphcore Limited

Inventors: Daniel John Pelham Wilkinson, Stephen Felix, Matthew David Fyles, Richard Luke Southwell Osborne

Droop Detection

Publication number: 20210190835

Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly and accurately detecting this droop so as to reduce the probability of circuit timing failures. The droop detector described herein uses a tap sampled delay line in which a clock signal is split along two separate paths. Each of the taps in the paths is separated by two inverter delays such that the set of samples produced represents sample values of the clock signal that are each separated by a single inverter delay without inversion of the first clock signal between the samples.

Type: Application

Filed: October 28, 2020

Publication date: June 24, 2021

Inventors: Stephen FELIX, Daniel John Pelham WILKINSON

Synchronization amongst processor tiles

Patent number: 11023290

Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction causes a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgement being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.
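
The request and acknowledgment flow in this abstract can be sketched in a few lines of Python. The `SyncLogic` class, the mode names, and the tile numbers are all hypothetical; the sketch only shows that the acknowledgment is returned once every tile in the group selected by the operand has sent its request.

```python
class SyncLogic:
    """Toy synchronization logic: one barrier per mode (group membership)."""

    def __init__(self, modes):
        # Each mode names a different membership of the sync group.
        self.modes = {mode: frozenset(tiles) for mode, tiles in modes.items()}
        self.waiting = {mode: set() for mode in modes}

    def sync(self, tile, mode):
        """Record a sync request from `tile` with operand `mode`.

        Returns the sorted list of tiles acknowledged, or an empty list
        while the barrier is still waiting for other group members.
        """
        group = self.modes[mode]
        if tile not in group:
            raise ValueError(f"tile {tile} is not a member of mode {mode!r}")
        self.waiting[mode].add(tile)
        if self.waiting[mode] == group:
            self.waiting[mode] = set()  # reset for the next barrier
            return sorted(group)
        return []


logic = SyncLogic({"pair": [0, 1], "all": [0, 1, 2, 3]})
first = logic.sync(0, "pair")   # tile 0 is now suspended
second = logic.sync(1, "pair")  # last member arrives, both are acknowledged
print(first, second)
```

In hardware the requesting tile would suspend instruction issue; here the empty return value stands in for "still suspended".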

Type: Grant

Filed: February 1, 2018

Date of Patent: June 1, 2021

Assignee: Graphcore Limited

Inventors: Daniel John Pelham Wilkinson, Simon Christian Knowles, Matthew David Fyles, Alan Graham Alexander, Stephen Felix

Synchronization in a multi-tile, multi-chip processing arrangement

Patent number: 11023413

Abstract: A method of operating a system comprising multiple processor tiles divided into a plurality of domains wherein within each domain the tiles are connected to one another via a respective instance of a time-deterministic interconnect and between domains the tiles are connected to one another via a non-time-deterministic interconnect. The method comprises: performing a compute stage, then performing a respective internal barrier synchronization within each domain, then performing an internal exchange phase within each domain, then performing an external barrier synchronization to synchronize between different domains, then performing an external exchange phase between the domains.
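
The ordering of stages in this method can be written out as a short Python sketch. The function name, the domain names, and the log strings are hypothetical; the sketch only records the sequence of stages and does not model either interconnect.

```python
def run_step(domains):
    """Return the ordered stages of one step across the given domains."""
    log = []
    # Compute stage on the tiles of every domain.
    for domain in domains:
        log.append(f"compute:{domain}")
    # Internal barrier, then internal exchange, within each domain.
    for domain in domains:
        log.append(f"internal_barrier:{domain}")
    for domain in domains:
        log.append(f"internal_exchange:{domain}")
    # A single external barrier synchronizes between the domains,
    # followed by the external exchange phase.
    log.append("external_barrier")
    log.append("external_exchange")
    return log


stages = run_step(["chip0", "chip1"])
for stage in stages:
    print(stage)
```

The point of the ordering is that exchange inside a domain finishes before any traffic crosses between domains.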

Type: Grant

Filed: December 23, 2019

Date of Patent: June 1, 2021

Assignee: GRAPHCORE LIMITED

Inventors: Daniel John Pelham Wilkinson, Stephen Felix, Richard Luke Southwell Osborne, Simon Christian Knowles, Alan Graham Alexander, Ian James Quinn

CODE COMPILATION FOR SCALING ACCELERATORS

Publication number: 20210149651

Abstract: A computer system comprises a work accelerator and a gateway for the transfer of data to the accelerator from external storage. The accelerator executes a first compiled code sequence to perform computations on data transferred to the accelerator from the gateway. The first compiled code sequence comprises a synchronisation instruction indicating a barrier between a compute phase in which the compute instructions are executed and an exchange phase, wherein execution of the synchronisation instruction causes an indication of a pre-compiled data exchange synchronisation point to be transferred to the gateway. The gateway comprises a streaming engine storing a second compiled code sequence in the form of a set of data transfer instructions executable by the streaming engine to perform data transfer operations to stream data through the gateway in the exchange phase, wherein the first and second compiled code sequences are generated as a related set at compile time.

Type: Application

Filed: January 27, 2021

Publication date: May 20, 2021

Inventors: Ola TORUDBAKKEN, Daniel John Pelham WILKINSON, Brian MANULA, Harald Hoeg

Host proxy on gateway

Patent number: 10970131

Abstract: A gateway for interfacing a host with a subsystem for acting as a work accelerator to the host, the gateway enabling the transfer of batches of data to and from the subsystem at pre-compiled data exchange synchronisation points attained by the subsystem. The gateway is configured to: receive from a storage system data determined by the host to be processed by the subsystem; store a number of credits indicating the availability of data for transfer to the subsystem at each pre-compiled data exchange synchronisation point; receive a synchronisation request from the subsystem when it attains a data exchange synchronisation point; and in response to determining that the number of credits comprises a non-zero number of credits: transmit a synchronisation acknowledgment to the subsystem; and cause the received data to be transferred to the subsystem.
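
A minimal Python sketch of the credit check in this abstract follows. The `CreditGateway` class is hypothetical, and two details are assumptions rather than statements from the patent: one credit is added per batch received from storage, and a credit is consumed when a batch is transferred.

```python
from collections import deque


class CreditGateway:
    """Toy gateway that acknowledges sync requests only when data is ready."""

    def __init__(self):
        self.credits = 0
        self.batches = deque()

    def receive_from_storage(self, batch):
        # Assumption: each batch made available adds one credit.
        self.batches.append(batch)
        self.credits += 1

    def sync_request(self):
        """Handle a sync request from the subsystem.

        Returns the batch transferred along with the acknowledgment,
        or None when there are no credits and no acknowledgment is sent.
        """
        if self.credits == 0:
            return None
        self.credits -= 1  # assumption: a transfer consumes one credit
        return self.batches.popleft()


gw = CreditGateway()
gw.receive_from_storage("batch-0")
first = gw.sync_request()
second = gw.sync_request()
print(first, second)
```

The second request gets no acknowledgment because no further data has arrived from storage.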

Type: Grant

Filed: December 28, 2018

Date of Patent: April 6, 2021

Assignee: Graphcore Limited

Inventors: Ola Tørudbakken, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Stephen Felix, Matthew David Fyles, Brian Manula, Harald Høeg

Synchronization and exchange of data between processors

Patent number: 10963315

Abstract: A system comprising: a first subsystem comprising one or more first processors, and a second subsystem comprising one or more second processors. The second subsystem is configured to process code over a series of steps delineated by barrier synchronizations, and in a current step, to send a descriptor to the first subsystem specifying a value of each of one or more parameters of each of one or more interactions that the second subsystem is programmed to perform with the first subsystem via an inter-processor interconnect in a subsequent step. The first subsystem is configured to execute a portion of code to perform one or more preparatory operations, based on the specified values of at least one of the one or more parameters of each interaction as specified by the descriptor, to prepare for said one or more interactions prior to the barrier synchronization leading into the subsequent phase.

Type: Grant

Filed: February 15, 2019

Date of Patent: March 30, 2021

Assignee: Graphcore Limited

Inventors: David Lacey, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Matthew David Fyles

Synchronization in a multi-tile processing array

Patent number: 10963003

Abstract: The invention relates to a computer comprising: a plurality of processing units each having instruction storage holding a local program, an execution unit executing the local program, data storage for holding data; an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by each processing unit; a synchronisation module operable to generate a synchronisation signal to control the computer to switch between a compute phase and an exchange phase, wherein the processing units are configured to execute their local programs according to a common clock, the local programs being such that in the exchange phase at least one processing unit executes a send instruction from its local program to transmit at a transmit time a data packet onto its output set of connection

Type: Grant

Filed: October 19, 2018

Date of Patent: March 30, 2021

Assignee: GRAPHCORE LIMITED

Inventors: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey

Synchronization and exchange of data between processors

Patent number: 10949266

Abstract: A system comprising: a first subsystem comprising one or more first processors, and a second subsystem comprising one or more second processors. The second subsystem is configured to process code over a series of steps delineated by barrier synchronizations, and in a current step, to send a descriptor to the first subsystem specifying a value of each of one or more parameters of each of one or more interactions that the second subsystem is programmed to perform with the first subsystem via an inter-processor interconnect in a subsequent step. The first subsystem is configured to execute a portion of code to perform one or more preparatory operations, based on the specified values of at least one of the one or more parameters of each interaction as specified by the descriptor, to prepare for said one or more interactions prior to the barrier synchronization leading into the subsequent phase.

Type: Grant

Filed: August 13, 2019

Date of Patent: March 16, 2021

Assignee: GRAPHCORE LIMITED

Inventors: David Lacey, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Matthew David Fyles

Synchronization in a multi-tile processing array

Patent number: 10936008

Abstract: The invention relates to a computer comprising: a plurality of processing units each having instruction storage holding a local program, an execution unit executing the local program, data storage for holding data; an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by each processing unit; a synchronisation module operable to generate a synchronisation signal to control the computer to switch between a compute phase and an exchange phase, wherein the processing units are configured to execute their local programs according to a common clock, the local programs being such that in the exchange phase at least one processing unit executes a send instruction from its local program to transmit at a transmit time a data packet onto its output set of connection

Type: Grant

Filed: February 1, 2018

Date of Patent: March 2, 2021

Assignee: Graphcore Limited

Inventors: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey

Code compilation for scaling accelerators

Patent number: 10922063

Abstract: A computer system comprises a work accelerator and a gateway for the transfer of data to the accelerator from external storage. The accelerator executes a first compiled code sequence to perform computations on data transferred to the accelerator from the gateway. The first compiled code sequence comprises a synchronisation instruction indicating a barrier between a compute phase in which the compute instructions are executed and an exchange phase, wherein execution of the synchronisation instruction causes an indication of a pre-compiled data exchange synchronisation point to be transferred to the gateway. The gateway comprises a streaming engine storing a second compiled code sequence in the form of a set of data transfer instructions executable by the streaming engine to perform data transfer operations to stream data through the gateway in the exchange phase, wherein the first and second compiled code sequences are generated as a related set at compile time.

Type: Grant

Filed: December 28, 2018

Date of Patent: February 16, 2021

Assignee: Graphcore Limited

Inventors: Ola Tørudbakken, Daniel John Pelham Wilkinson, Brian Manula, Harald Høeg

Sending data from an arrangement of processor modules

Patent number: 10817444

Abstract: A system comprising an arrangement of multiple processor modules, and an external interconnect for communicating data in the form of packets to outside the arrangement. The interconnect comprises an exchange block configured to provide flow control. One of the processor modules is arranged to send an exchange request message to the exchange block on behalf of others with data to send outside the arrangement. The exchange block sends an exchange-on message to a first of these processor modules, to cause the first module to start sending packets via the interconnect. Then, once this processor module has sent its last data packet, the exchange block sends an exchange-off message to this processor module to cause it to stop sending packets, and sends another exchange-on message to the next processor module with data to send, and so forth.
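
The exchange-on and exchange-off hand-over described in this abstract can be sketched as a simple Python loop. The `arbitrate` function, the module names, and the tuple-based event log are hypothetical illustration choices, not the patent's mechanism.

```python
def arbitrate(pending):
    """Grant the interconnect to one module at a time.

    `pending` maps each module with data to send to its list of packets,
    in the order the modules were named in the exchange request.
    Returns the sequence of control messages and packets on the interconnect.
    """
    events = []
    for module, packets in pending.items():
        # Exchange-on lets this module start sending.
        events.append(("exchange-on", module))
        for packet in packets:
            events.append(("packet", module, packet))
        # After its last packet the module is told to stop, and the
        # loop moves on to grant the next module.
        events.append(("exchange-off", module))
    return events


events = arbitrate({"m0": ["a", "b"], "m1": ["c"]})
for event in events:
    print(event)
```

Only one module sends at any moment in this sketch, which is the flow-control property the exchange block provides.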

Type: Grant

Filed: July 30, 2019

Date of Patent: October 27, 2020

Assignee: GRAPHCORE LIMITED

Inventors: Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Stephen Felix, Graham Bernard Cunningham, Alan Graham Alexander

Compiler method

Patent number: 10802536

Abstract: The invention relates to a computer implemented method of generating multiple programs to deliver a computerised function, each program to be executed in a processing unit of a computer comprising a plurality of processing units each having instruction storage for holding a local program, an execution unit for executing the local program and data storage for holding data, a switching fabric connected to an output interface of each processing unit and connectable to an input interface of each processing unit by switching circuitry controllable by each processing unit, and a synchronisation module operable to generate a synchronisation signal, the method comprising: generating a local program for each processing unit comprising a sequence of executable instructions; determining for each processing unit a relative time of execution of instructions of each local program whereby a local program allocated to one processing unit is scheduled to execute with a predetermined delay relative to a synchronisation signal

Type: Grant

Filed: February 1, 2018

Date of Patent: October 13, 2020

Assignee: Graphcore Limited

Inventors: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey

Booting Tiles of Processing Units

Publication number: 20200319893

Abstract: A processing system comprises a first subsystem comprising at least one host processor and one or more storage units, and a second subsystem comprising at least one second processor. Each second processor comprises a plurality of tiles. Each tile comprises a processing unit and memory. At least one storage unit stores bootloader code for each of first and second subsets of the plurality of tiles of at least one second processor. The first subsystem writes bootloader code to each of the first subset of tiles of the at least one second processor. At least one of the first subset of tiles requests at least one of the storage units to return the bootloader code to the second subset of the plurality of tiles. Each tile to which the bootloader code is written retrieves boot code from the storage unit and then runs said boot code.

Type: Application

Filed: July 31, 2019

Publication date: October 8, 2020

Inventor: Daniel John Pelham Wilkinson

CHECKPOINTING

Publication number: 20200319974

Abstract: A system comprising: a first subsystem comprising at least one first processor, and a second subsystem comprising one or more second processors. A first program is arranged to run on the at least one first processor, the first program being configured to send data from the first subsystem to the second subsystem. A second program is arranged to run on the one or more second processors, the second program being configured to operate on the data content from the first subsystem. The first program is configured to set a checkpoint at successive points in time. At each checkpoint it records in memory of the first subsystem i) a program state of the second program, comprising a state of one or more registers on each of the second processors at the time of the checkpoint, and ii) a copy of the data content sent to the second subsystem since the respective checkpoint.
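
To make the two recorded items concrete, here is a small Python sketch of a host-side program that saves register state and keeps a copy of data sent since the last checkpoint. The class names, the single `acc` register, and in particular the `recover` step that replays the saved data are assumptions added for illustration; the abstract itself only describes what is recorded.

```python
import copy


class Accelerator:
    """Toy second subsystem with a single register."""

    def __init__(self):
        self.registers = {"acc": 0}

    def consume(self, value):
        self.registers["acc"] += value


class HostProgram:
    """Toy first program that checkpoints the accelerator."""

    def __init__(self, accel):
        self.accel = accel
        self.saved_registers = copy.deepcopy(accel.registers)
        self.sent_since_checkpoint = []

    def send(self, value):
        # Keep a copy of everything sent since the last checkpoint.
        self.sent_since_checkpoint.append(value)
        self.accel.consume(value)

    def checkpoint(self):
        # Record the register state and start a fresh copy of sent data.
        self.saved_registers = copy.deepcopy(self.accel.registers)
        self.sent_since_checkpoint = []

    def recover(self):
        # Assumed recovery: restore registers, then replay the saved data.
        self.accel.registers = copy.deepcopy(self.saved_registers)
        for value in self.sent_since_checkpoint:
            self.accel.consume(value)


accel = Accelerator()
host = HostProgram(accel)
host.send(1)
host.send(2)
host.checkpoint()
host.send(3)
accel.registers["acc"] = -99  # simulate a fault on the accelerator
host.recover()
print(accel.registers["acc"])
```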

Type: Application

Filed: May 22, 2019

Publication date: October 8, 2020

Applicant: Graphcore Limited

Inventors: David Lacey, Daniel John Pelham Wilkinson

Exchange of data between processor modules

Patent number: 10705998

Abstract: A processing system comprising: multiple processor modules, each comprising a respective execution unit memory; and an interconnect for exchanging data between different sets of the processor modules. A group of the processor modules operates in a series of BSP supersteps. For the exchange phase of each superstep, each receiving processor module that is to receive data from outside its own set is pre-programmed with a value representing the number of units of data to receive. Starting from the pre-programmed value, it then counts out the number of data units remaining to be received each time a data unit is received. Each receiving processor module is further arranged to perform an exchange synchronization whereby, before advancing from the exchange phase to the compute phase of the current superstep, the receiving processor module waits until no units of data remain to be received according to the count.
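
The count-down described in this abstract is sketched below in Python. `ReceivingModule` and its method names are hypothetical; the sketch shows a pre-programmed count that is decremented per data unit and gates the move from the exchange phase to the compute phase.

```python
class ReceivingModule:
    """Toy receiver that counts down the data units it still expects."""

    def __init__(self, expected_units):
        # Pre-programmed with the number of data units to receive.
        self.remaining = expected_units

    def on_data_unit(self):
        if self.remaining == 0:
            raise RuntimeError("received more data units than expected")
        self.remaining -= 1

    def may_advance(self):
        """True once no data units remain, so the compute phase may begin."""
        return self.remaining == 0


module = ReceivingModule(expected_units=3)
ready = []
for _ in range(3):
    ready.append(module.may_advance())
    module.on_data_unit()
ready.append(module.may_advance())
print(ready)
```

The receiver needs no message from the sender to know the exchange is complete; reaching zero is the synchronization.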

Type: Grant

Filed: February 15, 2019

Date of Patent: July 7, 2020

Assignee: Graphcore Limited

Inventors: Daniel John Pelham Wilkinson, Alan Graham Alexander

Exchange of data between processor modules

Patent number: 10705999

Abstract: A processing system comprising: multiple processor modules, each comprising a respective execution unit memory; and an interconnect for exchanging data between different sets of the processor modules. A group of the processor modules operates in a series of steps. For an exchange phase of each step by each receiving processor module that is to receive data from outside its own set, the receiving module is pre-programmed with a value representing the number of units of data to receive. Starting from the pre-programmed value, it then counts out the number of data units remaining to be received each time a data unit is received. Each receiving processor module is further arranged to perform an exchange synchronization whereby, before advancing from the exchange phase to the compute phase of the current step, the receiving processor module waits until no units of data remain to be received according to the count.

Type: Grant

Filed: August 22, 2019

Date of Patent: July 7, 2020

Assignee: GRAPHCORE LIMITED

Inventors: Daniel John Pelham Wilkinson, Alan Graham Alexander

EXCHANGE OF DATA BETWEEN PROCESSOR MODULES

Publication number: 20200210364

Abstract: A processing system comprising: multiple processor modules, each comprising a respective execution unit memory; and an interconnect for exchanging data between different sets of the processor modules. A group of the processor modules operates in a series of BSP supersteps. For the exchange phase of each superstep, each receiving processor module that is to receive data from outside its own set is pre-programmed with a value representing the number of units of data to receive. Starting from the pre-programmed value, it then counts out the number of data units remaining to be received each time a data unit is received. Each receiving processor module is further arranged to perform an exchange synchronization whereby, before advancing from the exchange phase to the compute phase of the current superstep, the receiving processor module waits until no units of data remain to be received according to the count.

Type: Application

Filed: February 15, 2019

Publication date: July 2, 2020

Applicant: Graphcore Limited

Inventors: Daniel John Pelham Wilkinson, Alan Graham Alexander

EXCHANGE OF DATA BETWEEN PROCESSOR MODULES

Publication number: 20200210365

Abstract: A processing system comprising: multiple processor modules, each comprising a respective execution unit memory; and an interconnect for exchanging data between different sets of the processor modules. A group of the processor modules operates in a series of steps. For an exchange phase of each step by each receiving processor module that is to receive data from outside its own set, the receiving module is pre-programmed with a value representing the number of units of data to receive. Starting from the pre-programmed value, it then counts out the number of data units remaining to be received each time a data unit is received. Each receiving processor module is further arranged to perform an exchange synchronization whereby, before advancing from the exchange phase to the compute phase of the current step, the receiving processor module waits until no units of data remain to be received according to the count.

Type: Application

Filed: August 22, 2019

Publication date: July 2, 2020

Inventors: Daniel John Pelham Wilkinson, Alan Graham Alexander
