Patents by Inventor Simon Christian Knowles

Simon Christian Knowles has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11900109
    Abstract: The present invention relates to an execution unit for executing a computer program comprising a sequence of instructions, which include a masking instruction. The execution unit is configured to execute the masking instruction which, when executed by the execution unit, masks randomly selected values from a source operand of n values and retains other original values from the source operand to generate a result which includes original values from the source operand and symbols in place of the selected values.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: February 13, 2024
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Simon Christian Knowles, Godfrey Da Costa
  • Publication number: 20230283547
    Abstract: A memory attachment and routing chip includes a single die having a set of external ports; at least one memory attachment interface comprising a memory controller to attach to external memory, and a fabric core in which routing logic is implemented. The routing logic can (i) receive a first packet of a first type from a first port of the set of ports, the first type of packet being a memory access packet with a memory address which lies in a range of memory addresses associated with the memory attachment and routing chip, detect the memory address and route the packet of the first type to the memory attachment interface. The routing logic can (ii) receive a second packet of a second type, the second type of packet being an inter-processor packet comprising a destination identifier identifying a processing chip external to the memory attachment.
    Type: Application
    Filed: January 25, 2023
    Publication date: September 7, 2023
    Inventors: Simon Christian KNOWLES, Stephen FELIX, Daniel John Pelham WILKINSON
  • Publication number: 20230280907
    Abstract: A computer includes first and second computer devices of a first class. Each computer device of the first class includes first and second external ports, at least one memory controller to attach to external memory, and routing logic to route data from the first external port to one of the memory controller and the second external port. The computer further includes first and second computer devices of a second class. The first computer device of the second class is connected to the first external ports via respective first and second links. The second computer device of the second class is connected to the second external ports via respective third and fourth links. The first and second computer devices of the second class include processing circuitry to execute a computer program and are connected to the first and second links, or third and fourth links, respectively to transmit and receive messages.
    Type: Application
    Filed: January 24, 2023
    Publication date: September 7, 2023
    Inventors: Simon Christian KNOWLES, Stephen FELIX, Daniel John Pelham WILKINSON
  • Publication number: 20230280975
    Abstract: Logic circuitry for multiplying floating point numbers is disclosed, comprising multiplication and addition logic. The multiplication logic includes first and second mantissa multiplying circuitry. The logic circuitry is configured to: in a first mode, determine a product of two values having a first number format, using sub-units of the first mantissa multiplying circuitry to calculate partial products of the mantissas, and using the addition logic to combine the partial products; in a second mode, determine a respective product of each of four pairs of values having a second number format, using the sub-units of the first mantissa multiplying circuitry to multiply the mantissas of the pairs; and in a third mode, determine products of each of a plurality of pairs of values having a third number format, using the second mantissa multiplying circuitry to generate a product for each pair.
    Type: Application
    Filed: December 5, 2022
    Publication date: September 7, 2023
    Inventors: Mrudula GORE, Stephen FELIX, Simon Christian KNOWLES
  • Publication number: 20230084132
    Abstract: A computer comprising a plurality of processor devices connected in a ring, wherein each of the processor devices is connected to each of two neighbouring ones of the processor devices by a respective physical inter-processor link. Each of a set of external memory devices stores a local portion of an externally stored dataset. Each processor device executes instructions to: determine that a synchronisation point has been reached by the plurality of processor devices; responsive to the determination, access from its connected external memory device its local portion of the externally stored dataset; record a copy of its local portion of the externally stored dataset in its local memory; transmit its local portion of the externally stored dataset to at least one of its connected neighbouring processing devices; and receive an incoming portion of the externally stored dataset from at least one of its connected neighbouring processing devices.
    Type: Application
    Filed: September 12, 2022
    Publication date: March 16, 2023
    Inventor: Simon Christian KNOWLES
  • Publication number: 20230082673
    Abstract: A computer comprising a plurality of processor devices connected in a ring, wherein each of the processor devices is connected to each of two neighbouring ones of the processor devices by a respective physical inter-processor link. Each of a set of external memory devices stores a local portion of an externally stored dataset. Each processor device executes instructions to: determine that a synchronisation point has been reached by the plurality of processor devices; responsive to the determination, access from its connected external memory device its local portion of the externally stored dataset; record a copy of its local portion of the externally stored dataset in its local memory; transmit its local portion of the externally stored dataset to at least one of its connected neighbouring processing devices; and receive an incoming portion of the externally stored dataset from at least one of its connected neighbouring processing devices.
    Type: Application
    Filed: September 9, 2022
    Publication date: March 16, 2023
    Inventor: Simon Christian KNOWLES
  • Patent number: 11593185
    Abstract: A processing system comprising multiple tiles and an interconnect between the tiles. The interconnect is used to communicate between a group of some or all of the tiles according to a bulk synchronous parallel scheme, whereby each tile in the group performs an on-tile compute phase followed by an inter-tile exchange phase, with the exchange phase being held back until all tiles in the group have completed the compute phase. Each tile in the group has a local exit state upon completion of the compute phase. The instruction set comprises a synchronization instruction for execution by each tile upon completion of its compute phase to signal a sync request to logic in the interconnect. In response to receiving the sync request from all the tiles in the group, the logic releases the next exchange phase and also makes available an aggregated state of all the tiles in the group.
    Type: Grant
    Filed: November 19, 2019
    Date of Patent: February 28, 2023
    Assignee: GRAPHCORE LIMITED
    Inventors: Simon Christian Knowles, Alan Graham Alexander
  • Patent number: 11586483
    Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction causes a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgment being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.
    Type: Grant
    Filed: May 14, 2021
    Date of Patent: February 21, 2023
    Assignee: GRAPHCORE LIMITED
    Inventors: Daniel John Pelham Wilkinson, Simon Christian Knowles, Matthew David Fyles, Alan Graham Alexander, Stephen Felix
  • Patent number: 11567768
    Abstract: A processor is disclosed including: a barrel-threaded execution unit for executing concurrent threads, and a repeat cache shared between the concurrent threads. The processor's instruction set includes a repeat instruction which takes a repeat count operand. When the repeat cache is not claimed and the repeat instruction is executed in a first thread, a portion of code is cached from the first thread into the repeat cache, the state of the repeat cache is changed to record it as claimed, and the cached code is executed a number of times. When the repeat instruction is then executed in a further thread, then the already-cached portion of code is again executed a respective number of times, each time from the repeat cache. For each of the first and further instructions, the repeat count operand in the respective instruction specifies the number of times to execute the cached code.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: January 31, 2023
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore, Jonathan Louis Ferguson
  • Patent number: 11561926
    Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.
    Type: Grant
    Filed: January 20, 2022
    Date of Patent: January 24, 2023
    Assignee: GRAPHCORE LIMITED
    Inventors: Stephen Felix, Simon Christian Knowles
  • Patent number: 11550591
    Abstract: A processor comprising: an execution unit for executing a respective thread in each of a repeating sequence of time slots; and a plurality of context register sets, each comprising a respective set of registers for representing a state of a respective thread. The context register sets comprise a respective worker context register set for each of the number of time slots the execution unit is operable to interleave, and at least one extra context register set. The worker context register sets represent the respective states of worker threads and the extra context register set represents the state of a supervisor thread. The processor is configured to begin running the supervisor thread in each of the time slots, and to enable the supervisor thread to then individually relinquish each of the time slots in which it is running to a respective one of the worker threads.
    Type: Grant
    Filed: February 10, 2021
    Date of Patent: January 10, 2023
    Assignee: GRAPHCORE LIMITED
    Inventor: Simon Christian Knowles
  • Patent number: 11467833
    Abstract: A processor having an instruction set including a load-store instruction having operands specifying, from amongst the registers in at least one register file, a respective destination of each of two load operations, a respective source of a store operation, and a pair of address registers arranged to hold three memory addresses, the three memory addresses being a respective load address for each of the two load operations and a respective store address for the store operation. The load-store instruction further includes three stride operands each specifying a respective stride value for each of the two load addresses and one store address, wherein at least some possible values of each stride operand specify the respective stride value by specifying one of a plurality of fields within a stride register in one of the one or more register files, each field holding a different stride value.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: October 11, 2022
    Assignee: Graphcore Limited
    Inventors: Alan Graham Alexander, Simon Christian Knowles, Mrudula Chidambar Gore
  • Publication number: 20220253399
    Abstract: The invention relates to a computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising one or more computer executable instructions which, when executed, implement: a send function which causes a data packet destined for a recipient processing unit to be transmitted on a set of connection wires connected to the processing unit, the data packet having no destination identifier but being transmitted at a predetermined transmit time; and a switch control function which causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined receive time.
    Type: Application
    Filed: April 6, 2022
    Publication date: August 11, 2022
    Inventors: Simon Christian KNOWLES, Daniel John Pelham WILKINSON, Richard Luke Southwell OSBORNE, Alan Graham ALEXANDER, Stephen FELIX, Jonathan MANGNALL, David LACEY
  • Publication number: 20220197857
    Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.
    Type: Application
    Filed: January 20, 2022
    Publication date: June 23, 2022
    Inventors: Stephen FELIX, Simon Christian KNOWLES
  • Publication number: 20220197645
    Abstract: A processor is disclosed including: a barrel-threaded execution unit for executing concurrent threads, and a repeat cache shared between the concurrent threads. The processor's instruction set includes a repeat instruction which takes a repeat count operand. When the repeat cache is not claimed and the repeat instruction is executed in a first thread, a portion of code is cached from the first thread into the repeat cache, the state of the repeat cache is changed to record it as claimed, and the cached code is executed a number of times. When the repeat instruction is then executed in a further thread, then the already-cached portion of code is again executed a respective number of times, each time from the repeat cache. For each of the first and further instructions, the repeat count operand in the respective instruction specifies the number of times to execute the cached code.
    Type: Application
    Filed: March 11, 2022
    Publication date: June 23, 2022
    Inventors: Alan Graham ALEXANDER, Simon Christian KNOWLES, Mrudula Chidambar GORE, Jonathan FERGUSON
  • Patent number: 11321272
    Abstract: The invention relates to a computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising one or more computer executable instructions which, when executed, implement: a send function which causes a data packet destined for a recipient processing unit to be transmitted on a set of connection wires connected to the processing unit, the data packet having no destination identifier but being transmitted at a predetermined transmit time; and a switch control function which causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined receive time.
    Type: Grant
    Filed: February 1, 2018
    Date of Patent: May 3, 2022
    Assignee: Graphcore Limited
    Inventors: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey
  • Patent number: 11269806
    Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets.
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: March 8, 2022
    Assignee: Graphcore Limited
    Inventors: Stephen Felix, Simon Christian Knowles
  • Patent number: 11262787
    Abstract: The invention relates to a computer implemented method of generating multiple programs to deliver a computerised function, each program to be executed in a processing unit of a computer comprising a plurality of processing units each having instruction storage for holding a local program, an execution unit for executing the local program and data storage for holding data, a switching fabric connected to an output interface of each processing unit and connectable to an input interface of each processing unit by switching circuitry controllable by each processing unit, and a synchronisation module operable to generate a synchronisation signal, the method comprising: generating a local program for each processing unit comprising a sequence of executable instructions; determining for each processing unit a relative time of execution of instructions of each local program whereby a local program allocated to one processing unit is scheduled to execute with a predetermined delay relative to a synchronisation signal
    Type: Grant
    Filed: January 16, 2020
    Date of Patent: March 1, 2022
    Assignee: GRAPHCORE LIMITED
    Inventors: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey
  • Patent number: 11113060
    Abstract: A processing apparatus comprising one or more processing modules, each comprising an execution unit. The one or more processing modules are operable to run a plurality of parallel or concurrent threads, and the processing apparatus further comprises a storage location for storing an aggregated exit state of the plurality of threads. An instruction set of the processing apparatus comprises an exit instruction for inclusion in each of the plurality of threads, the exit instruction taking an individual exit state of the respective thread as an operand. The exit instruction terminates the respective thread and also causes the individual exit state specified in the operand to contribute to the aggregated exit state.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: September 7, 2021
    Assignee: GRAPHCORE LIMITED
    Inventor: Simon Christian Knowles
  • Publication number: 20210271527
    Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction causes a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgment being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.
    Type: Application
    Filed: May 14, 2021
    Publication date: September 2, 2021
    Inventors: Daniel John Pelham Wilkinson, Simon Christian Knowles, Matthew David Fyles, Alan Graham Alexander, Stephen Felix
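
As a rough illustration only, the barrier synchronization described in the abstract of publication 20210271527 (and patent 11586483) can be pictured with a toy software model. This is a sketch under stated assumptions, not the patented hardware or any Graphcore API: the `GROUPS` table, the mode names, the four-tile layout and the `SyncLogic` class are all hypothetical names invented for this example. Each tile sends a request carrying a mode operand, and the logic acknowledges the group only once every member of that group has requested.

```python
from dataclasses import dataclass, field

# Hypothetical modes: each mode names a different membership of the group.
GROUPS = {
    "all": frozenset({0, 1, 2, 3}),
    "left": frozenset({0, 1}),
    "right": frozenset({2, 3}),
}


@dataclass
class SyncLogic:
    """Toy stand-in for the synchronization logic in the interconnect."""

    # Maps a mode to the set of tiles that have requested and are suspended.
    pending: dict = field(default_factory=dict)

    def sync(self, tile: int, mode: str) -> frozenset:
        """Record a sync request from a tile.

        Returns the set of tiles that receive an acknowledgment. The set is
        empty while the group is still incomplete, which models the
        requesting tile staying suspended.
        """
        members = GROUPS[mode]
        if tile not in members:
            raise ValueError(f"tile {tile} is not a member of group {mode!r}")
        waiting = self.pending.setdefault(mode, set())
        waiting.add(tile)
        if waiting == members:
            # Every member has requested: release the whole group.
            del self.pending[mode]
            return members
        return frozenset()
```

In this model a call that returns an empty set corresponds to a tile whose instruction issue remains suspended, and the call that completes the group returns the membership that would receive the acknowledgment.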