Patents Assigned to Graphcore Limited
-
Patent number: 11644884Abstract: There is disclosed a method of controlling the frequency of a clock signal in a processor. The method selects a first clock generator to provide a processor clock signal for executing an application. If a threshold event is detected, a second clock generator is selected. The method reduces the frequency of a clock signal generated by the first clock generator while a processor clock signal is being provided for execution of an application from the second clock generator. The second clock generator generates a clock at a lower speed than the first clock generator. After a predetermined time, the first clock generator is reselected to provide the processor clock signal. The threshold detection is repeated until an optimum clock frequency is discovered.Type: GrantFiled: August 17, 2021Date of Patent: May 9, 2023Assignee: GRAPHCORE LIMITEDInventors: Stephen Felix, Mrudula Gore
-
Patent number: 11645225Abstract: A computer, including a plurality of processing nodes arranged in two-dimensional arrays in respective front and rear layers. Each processing node has a set of activatable links. When activated, transmission of data items between the nodes connected via the activated link is enabled. When not activated, transmission of data items between the nodes is prevented. The set of activatable links including a respective link which connects the processing node to each adjacent node in the array, and to a facing processing node in the other layer. An allocation engine is configured to receive an allocation instruction and connected to the processing nodes to selectively activate the links in a configuration.Type: GrantFiled: August 10, 2022Date of Patent: May 9, 2023Assignee: GRAPHCORE LIMITEDInventor: Simon Knowles
-
Patent number: 11637682Abstract: An apparatus is provided for converting the form in which a synchronisation request for a barrier synchronisation is provided. The synchronisation request is provided from a first synchronisation circuitry to a second synchronisation circuitry by asserting one of a set of separate signals that may each correspond to a bit in a register or a signal on a wire. The second synchronisation circuitry provides for the packetisation of the sync request by sending a packet comprising the sync request over a network to be received at a further subsystem.Type: GrantFiled: July 14, 2021Date of Patent: April 25, 2023Assignee: GRAPHCORE LIMITEDInventors: Martin Vickers, Daniel John Pelham Wilkinson
-
Patent number: 11635966Abstract: Aspects of the present disclosure provide a processor having: an execution unit configured to execute machine code instructions, at least one of the machine code instructions requiring multiple cycles for its execution; instruction memory holding instructions for execution, wherein the execution unit is configured to access the memory to fetch instructions for execution; an instruction injection mechanism configured to inject an instruction into the execution pipeline during execution of the at least one machine code instruction fetched from the memory; the execution unit configured to pause execution of the at least one machine code instruction, to execute the injected instruction to termination, to detect termination of the injected instruction and to automatically recommence execution of the at least one machine code instruction on detection of termination of the injected instruction.Type: GrantFiled: June 1, 2021Date of Patent: April 25, 2023Assignee: GRAPHCORE LIMITEDInventors: James Pallister, Jamie Hanlon
-
Patent number: 11630986Abstract: A method for generating an executable program to run on a system of one or more processor chips each comprising a plurality of tiles. The method comprises: receiving a graph comprising a plurality of data nodes, compute vertices and directional edges, wherein the graph is received in a first graph format that does not specify which data nodes and vertices are allocated to which of the tiles; and generating an application programming interface, API, for converting the graph, to determine a tile-mapping allocating the data nodes and vertices amongst the tiles. The generating of the API comprises searching the graph to identify compute vertices which match any of a predetermined set of one or more compute vertex types. The API is then called to convert the graph to a second graph format that includes the tile-mapping, including the allocation by the assigned memory allocation functions.Type: GrantFiled: April 2, 2020Date of Patent: April 18, 2023Assignee: GRAPHCORE LIMITEDInventor: David Norman
-
Patent number: 11630983Abstract: A method for generating an executable program to run on a system of one or more processor chips each comprising a plurality of tiles. The method comprises: receiving a graph comprising a plurality of data nodes, compute vertices and directional edges, wherein the graph is received in a first graph format that does not specify which data nodes and vertices are allocated to which of the tiles; and generating an application programming interface, API, for converting the graph, to determine a tile-mapping allocating the data nodes and vertices amongst the tiles. The generating of the API comprises searching the graph to identify compute vertices which match any of a predetermined set of one or more compute vertex types. The API is then called to convert the graph to a second graph format that includes the tile-mapping, including the allocation by the assigned memory allocation functions.Type: GrantFiled: July 31, 2019Date of Patent: April 18, 2023Assignee: GRAPHCORE LIMITEDInventor: David Norman
-
Patent number: 11625356Abstract: A computer comprising a plurality of interconnected processing nodes arranged in a configuration in which multiple layers of interconnected nodes are arranged along an axis, each layer comprising at least four processing nodes connected in a non-axial ring by at least respective intralayer link between each pair of neighbouring processing nodes, wherein each of the at least four processing nodes in each layer is connected to a respective corresponding node in one or more adjacent layer by a respective interlayer link, the computer being programmed to provide in the configuration two embedded one dimensional paths and to transmit data around each of the two embedded one dimensional paths, each embedded one dimensional path using all processing nodes of the computer in such a manner that the two embedded one dimensional paths operate simultaneously without sharing links.Type: GrantFiled: March 24, 2021Date of Patent: April 11, 2023Assignee: GRAPHCORE LIMITEDInventor: Simon Knowles
-
Patent number: 11625061Abstract: Two clocks, a fast clock and a slow clock are provided for clocking a processing unit. A plurality of frequency settings, referred to as gears, are defined for the two clock. Each of these gears indicates a maximum frequency for the fast clock and a minimum frequency for the slow clock, such that the gap between the two frequencies may be kept to a manageable level so as to reduce transients upon switching between the two clocks. The system switches between the gears as required. In response to a determination to increase the frequency of the clock signal, a higher gear is selected at which the maximum and minimum frequencies defined for that gear are higher than the previous selected gear.Type: GrantFiled: June 16, 2021Date of Patent: April 11, 2023Assignee: GRAPHCORE LIMITEDInventors: Simon Douglas Chambers, Stephen Felix, Ian Malcolm King
-
Patent number: 11625357Abstract: A data processing system comprising a plurality of processors, wherein each of the processors is configured to perform data transfer operations to transfer outgoing data to one or more others of the processors during a first of the exchange stages; receive incoming data from the one or more others of the processors during the first of the exchange stages; determine further outgoing data in dependence upon at least part of the incoming data; count an amount of at least part the incoming data received during the first of the exchange stages from the one or more others of the processors; and in response to determining that the amount of the at least part of the incoming data received has reached a predefined amount, perform data transfer operations to transfer the further outgoing data to the one or more others of the processors during a second of the exchange stages.Type: GrantFiled: April 9, 2020Date of Patent: April 11, 2023Assignee: GRAPHCORE LIMITEDInventor: Lars Paul Huse
-
Patent number: 11614789Abstract: A system and method for docking a processing unit provided. According to the method, the system dithers between the two signals provided by the two clock generators so as to clock the processing unit at an average clock frequency having a value between the frequencies of the two signals. The average clock frequency is adjusted by modifying the proportion of time spent on one clock signal vs the other clock signal.Type: GrantFiled: June 17, 2021Date of Patent: March 28, 2023Assignee: GRAPHCORE LIMITEDInventor: Ian Malcolm King
-
Patent number: 11615053Abstract: A processor in a network has a plurality of processing units arranged on a chip. An on-chip interconnect enables data to be exchanged between the processing units. A plurality of external interfaces are configured to communicate data off chip in the form of packets, each packet having a destination address identifying a destination of the packet. The external interfaces are connected to respective additional connected processors. A routing bus routes packets between the processing units and the external interfaces. A routing register defines a routing domain for the processor, the routing domain comprising one or more of the additional processor, and at least a subset of further additional processors of the network, wherein the additional processors of the subset are directly or indirectly connected to the processor. The routing domain can be modified by changing the contents of the routing register as a sliding window domain.Type: GrantFiled: July 13, 2021Date of Patent: March 28, 2023Assignee: GRAPHCORE LIMITEDInventors: Daniel John Pelham Wilkinson, Lars Paul Huse, Richard Luke Southwell Osborne, Graham Bernard Cunningham, Hachem Yassine
-
Patent number: 11615038Abstract: A gateway for use in a computing system to interface a host with the subsystem for acting as a work accelerator to the host, the gateway having: an accelerator interface for connection to the subsystem to enable transfer of batches of data between the subsystem and the gateway; a data connection interface for connection to external storage for exchanging data between the gateway and storage; a gateway interface for connection to at least one second gateway; a memory interface connected to a local memory associated with the gateway; and a streaming engine for controlling the streaming of batches of data into and out of the gateway in response to pre-compiled data exchange synchronisation points attained by the subsystem, wherein the streaming of batches of data are selectively via at least one of the accelerator interface, data connection interface, gateway interface and memory interface.Type: GrantFiled: December 28, 2018Date of Patent: March 28, 2023Assignee: Graphcore LimitedInventors: Ola Tørudbakken, Brian Manula, Harald Høeg
-
Patent number: 11614946Abstract: A computer comprising a plurality of processing nodes is provided. Each processing node has at least one processor configured to process input data to generate an array of data items. The processing nodes are arranged in cliques in which each processing node of a clique is connected to each other processing node in the clique by first and second clique links. The cliques are inter-connected in rings such that each processing node is a member of a single clique and a single ring. The processing nodes of all cliques are configured to exchange in each exchange step of a machine learning collective via the respective first and second clique links at least two data items with the other processing node(s) in its clique, and all processing nodes are configured to reduce each received data item with the data item in the corresponding position in the array on that processing node.Type: GrantFiled: March 26, 2020Date of Patent: March 28, 2023Assignee: GRAPHCORE LIMITEDInventor: Simon Knowles
-
Patent number: 11599363Abstract: A computer comprising a plurality of processors, each of which are configured to perform operations on data during a compute phase for the computer and, following a pre-compiled synchronisation barrier, exchange data with at least one other of the processors during an exchange phase for the computer, wherein of the processors in the computer is indexed and the data exchange operations carried out by each processor in the exchange phase depend upon its index value.Type: GrantFiled: April 6, 2020Date of Patent: March 7, 2023Assignee: GRAPHCORE LIMITEDInventors: Richard Osborne, Matthew Fyles
-
Patent number: 11593185Abstract: A processing system comprising multiple tiles and an interconnect between the tiles. The interconnect is used to communicate between a group of some or all of the tiles according to a bulk synchronous parallel scheme, whereby each tile in the group performs an on-tile compute phase followed by an inter-tile exchange phase with the exchange phase being held back until all tiles in the group have completed the compute phase. Each tile in the group has a local exit state upon completion of the compute phase. The instruction set comprises a synchronization instruction for execution by each tile upon completion of its compute phase to signal a sync request to logic in the interconnect. In response to receiving the sync request from all the tiles in the group, the logic releases the next exchange phase and also makes available an aggregated a state of all the tiles in the group.Type: GrantFiled: November 19, 2019Date of Patent: February 28, 2023Assignee: GRAPHCORE LIMITEDInventors: Simon Christian Knowles, Alan Graham Alexander
-
Patent number: 11586483Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction cause a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgement being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.Type: GrantFiled: May 14, 2021Date of Patent: February 21, 2023Assignee: GRAPHCORE LIMITEDInventors: Daniel John Pelham Wilkinson, Simon Christian Knowles, Matthew David Fyles, Alan Graham Alexander, Stephen Felix
-
Patent number: 11567768Abstract: A processor is disclosed including: a barrel-threaded execution unit for executing concurrent threads, and a repeat cache shared between the concurrent threads. The processor's instruction set includes a repeat instruction which takes a repeat count operand. When the repeat cache is not claimed and the repeat instruction is executed in a first thread, a portion of code is cached from the first thread into the repeat cache, the state of the repeat cache is changed to record it as claimed, and the cached code is executed a number of times. When the repeat instruction is then executed in a further thread, then the already-cached portion of code is again executed a respective number of times, each time from the repeat cache. For each of the first and further instructions, the repeat count operand in the respective instruction specifies the number of times to execute the cached code.Type: GrantFiled: February 15, 2019Date of Patent: January 31, 2023Assignee: Graphcore LimitedInventors: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore, Jonathan Louis Ferguson
-
Patent number: 11561799Abstract: An execution unit comprising a processing pipeline configured to perform calculations to evaluate a plurality of mathematical functions. The processing pipeline comprises a plurality of stages through which each calculation for evaluating a mathematical function progresses to an end result. Each of a plurality of processing circuits in the pipeline is configured to perform an operation on input values during at least one stage of the plurality of stages. The plurality of processing circuits include multiplier circuits. A first multiplier circuit and a second multiplier circuit are configured to operate in parallel, such that at the same stage in the processing pipeline, the first multiplier circuit and the second multiplier circuit perform their processing. A third multiplier circuit is arranged in series with the first multiplier circuit and the second multiplier circuit and processes outputs from the first multiplier circuit and the second multiplier circuit.Type: GrantFiled: June 3, 2021Date of Patent: January 24, 2023Assignee: GRAPHCORE LIMITEDInventor: Jonathan Mangnall
-
Patent number: 11561926Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires: a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.Type: GrantFiled: January 20, 2022Date of Patent: January 24, 2023Assignee: GRAPHCORE LIMITEDInventors: Stephen Felix, Simon Christian Knowles
-
Patent number: 11550639Abstract: A work accelerator is connected to a gateway. The gateway enables the transfer of data to the work accelerator from an external storage at pre-compiled data synchronisation points attained by the work accelerator. The work accelerator is configured to send to a register of the gateway an indication of a sync group comprising the gateway. The work accelerator then sends to the gateway, a synchronisation request for a synchronisation to be performed at an upcoming pre-compiled data exchange synchronisation point. The sync propagation circuits are each configured to receive at least one synchronisation request and propagate or acknowledge the synchronisation request in dependence upon the indication of the sync group received from the work accelerator.Type: GrantFiled: May 6, 2021Date of Patent: January 10, 2023Assignee: GRAPHCORE LIMITEDInventor: Brian Manula