Patents by Inventor Stephen Felix
Stephen Felix has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230214255
Abstract: A processing device comprising: at least one execution unit configured to interleave execution of a plurality of worker threads, wherein each of the worker threads is configured to execute a same set of code to perform operations on a different set of data held in an input buffer of a memory of the processing device and output the results data to an output buffer. An instruction is executed so as to cause a plurality of operand registers, each of which is associated with one of the worker threads, to be populated with one or more variables enabling each worker to determine where in the input buffer is located its set of input data and where to store its results data.
Type: Application
Filed: October 28, 2022
Publication date: July 6, 2023
Inventors: Alan ALEXANDER, Stephen FELIX, Edward ANDREWS, Godfrey DA COSTA
-
Patent number: 11680965
Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly and accurately detecting this droop so as to reduce the probability of circuit timing failures. The droop detector described herein uses a tap sampled delay line in which a clock signal is split along two separate paths. Each of the taps in the paths are separated by two inverter delays such that the set of samples produced represent sample values of the clock signal that are each separated by a single inverter delay without inversion of the first clock signal between the samples.
Type: Grant
Filed: July 22, 2022
Date of Patent: June 20, 2023
Assignee: GRAPHCORE LIMITED
Inventors: Stephen Felix, Daniel John Pelham Wilkinson
-
Patent number: 11644884
Abstract: There is disclosed a method of controlling the frequency of a clock signal in a processor. The method selects a first clock generator to provide a processor clock signal for executing an application. If a threshold event is detected, a second clock generator is selected. The method reduces the frequency of a clock signal generated by the first clock generator while a processor clock signal is being provided for execution of an application from the second clock generator. The second clock generator generates a clock at a lower speed than the first clock generator. After a predetermined time, the first clock generator is reselected to provide the processor clock signal. The threshold detection is repeated until an optimum clock frequency is discovered.
Type: Grant
Filed: August 17, 2021
Date of Patent: May 9, 2023
Assignee: GRAPHCORE LIMITED
Inventors: Stephen Felix, Mrudula Gore
-
Publication number: 20230116320
Abstract: The first logic wafer is attached to a supporting wafer, which adds sufficient depth to this bonded structure such that the first logic wafer may be thinned during the manufacturing process. The first logic wafer is thinned such that the through silicon vias may be etched in the substrate of the first logic wafer so as to provide adequate connectivity to a second logic wafer, which is bonded to the first logic wafer. The second logic wafer adds sufficient depth to this bonded structure to allow the supporting wafer to then be thinned to enable through silicon vias to be added to the supporting wafer so as to provide appropriate connectivity for the entire stacked structure. The thinned supporting wafer is retained in the finished stacked wafer structure and may comprise additional components (e.g. capacitors) supporting the operation of the processing circuitry in the logic wafers.
Type: Application
Filed: October 5, 2022
Publication date: April 13, 2023
Inventors: Stephen FELIX, Phillip HORSFIELD, Simon Jonathan STACEY
-
Publication number: 20230114044
Abstract: A method for testing a stacked integrated circuit device comprising a first die and a second die, the method comprising: sending from testing logic of the first die, first testing control signals to first testing apparatus on the first die; in response to the first testing control signals, the first testing apparatus running a first one or more tests for testing functional logic or memory of the first die; sending from the testing logic of the first die, second testing control signals to the second die via through silicon vias formed in a substrate of the first die; and in dependence upon the second testing control signals from the first die, running a second one or more tests for testing the stacked integrated circuit device.
Type: Application
Filed: September 22, 2022
Publication date: April 13, 2023
Inventors: Stephen FELIX, Phillip HORSFIELD
-
Patent number: 11625061
Abstract: Two clocks, a fast clock and a slow clock, are provided for clocking a processing unit. A plurality of frequency settings, referred to as gears, are defined for the two clocks. Each of these gears indicates a maximum frequency for the fast clock and a minimum frequency for the slow clock, such that the gap between the two frequencies may be kept to a manageable level so as to reduce transients upon switching between the two clocks. The system switches between the gears as required. In response to a determination to increase the frequency of the clock signal, a higher gear is selected at which the maximum and minimum frequencies defined for that gear are higher than those of the previously selected gear.
Type: Grant
Filed: June 16, 2021
Date of Patent: April 11, 2023
Assignee: GRAPHCORE LIMITED
Inventors: Simon Douglas Chambers, Stephen Felix, Ian Malcolm King
-
Publication number: 20230079541
Abstract: A processor comprises at least one delay stage for each processing circuit and switching circuitry for selectively switching the delay stage into or out of a communication path involved in message exchange. For processing circuits up to a defective processing circuit in the column, the delay stage is switched into the communication path, and for processing circuits above the defective processing circuit in the column, including a repairing processing circuit which repairs the defective processing circuit, the delay stage is switched out of the communication path, whereby the fixed transmission time of processing circuits is preserved in the event of a repair of the column.
Type: Application
Filed: September 9, 2022
Publication date: March 16, 2023
Inventor: Stephen FELIX
-
Patent number: 11586483
Abstract: A processing system comprising an arrangement of tiles and an interconnect between the tiles. The interconnect comprises synchronization logic for coordinating a barrier synchronization to be performed between a group of the tiles. The instruction set comprises a synchronization instruction taking an operand which selects one of a plurality of available modes each specifying a different membership of the group. Execution of the synchronization instruction causes a synchronization request to be transmitted from the respective tile to the synchronization logic, and instruction issue to be suspended on the respective tile pending a synchronization acknowledgment being received back from the synchronization logic. In response to receiving the synchronization request from all the tiles in the group as specified by the operand of the synchronization instruction, the synchronization logic returns the synchronization acknowledgment to the tiles in the specified group.
Type: Grant
Filed: May 14, 2021
Date of Patent: February 21, 2023
Assignee: GRAPHCORE LIMITED
Inventors: Daniel John Pelham Wilkinson, Simon Christian Knowles, Matthew David Fyles, Alan Graham Alexander, Stephen Felix
-
Publication number: 20230036665
Abstract: A method for repairing a processor. The processor comprises a plurality of processing units and an exchange comprising a plurality of exchange paths for transmitting data between the processing units. Each processing unit is connected to output data to a respective exchange path. An exchange path functional test of at least a portion of the exchange paths is carried out. Based on the exchange path functional test, it is identified that one or more of the exchange paths is defective, and the processing units connected to the one or more defective exchange paths are identified. The identified processing units are switched out of functional operation of the processor, and at least one repair processing unit connected to a non-defective exchange path is switched in for functional operation of the processor.
Type: Application
Filed: July 22, 2022
Publication date: February 2, 2023
Inventors: Stephen FELIX, Natalie NARKONSKI, Philip HORSFIELD
-
Publication number: 20230029217
Abstract: A multi-tile processing unit in which the tiles in the processing unit may be divided between two or more different external sync groups for performing barrier synchronisations. In this way, different sets of tiles of the same processing unit each sync with different sets of tiles external to that processing unit.
Type: Application
Filed: September 1, 2021
Publication date: January 26, 2023
Inventors: Simon KNOWLES, Daniel John Pelham WILKINSON, Alan ALEXANDER, Stephen FELIX, Richard OSBORNE, David LACEY, Lars Paul HUSE
-
Publication number: 20230023957
Abstract: In a stacked integrated circuit device, there are two components, one in a first of the die and another in a second of the die. Each of the components is provided with two output connections, one leading above and one leading below the die, and two input connections, one leading above and one leading below the die, either of the two die. As a result of the redundancy, both die may be used in either position in the stacked structure. If either of the die is used as the top die, it sends data on its second output path and receives data on its second input path. On the other hand, when one of the die is used as the bottom die, it sends data on its first output path and receives data on its first input path. In this way, the same design may be used for the connections between each of the die.
Type: Application
Filed: September 28, 2022
Publication date: January 26, 2023
Inventors: Alexander MACFADEN, Stephen FELIX
-
Patent number: 11561926
Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.
Type: Grant
Filed: January 20, 2022
Date of Patent: January 24, 2023
Assignee: GRAPHCORE LIMITED
Inventors: Stephen Felix, Simon Christian Knowles
-
Publication number: 20230016049
Abstract: A set of configurable sync groupings (which may be referred to as sync zones) are defined. Any of the processors may belong to any of the sync zones. Each of the processors comprises a register indicating to which of the sync zones it belongs. If a processor does not belong to a sync zone, it continually asserts a sync request for that sync zone to the sync controller. If a processor does belong to a sync zone, it will only assert its sync request for that sync zone upon arriving at a synchronisation point for that sync zone indicated in its compiled code set.
Type: Application
Filed: July 11, 2022
Publication date: January 19, 2023
Inventors: Stephen FELIX, Richard OSBORNE
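The sync-zone behaviour in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the `Processor` class, its `arrive` method, and the `zone_ready` helper are hypothetical names, and the idea that the sync controller treats a zone as ready once every processor asserts its request is an inference from the abstract, not something it states.

```python
class Processor:
    """Hypothetical model of one processor and its sync-zone membership register."""

    def __init__(self, member_zones):
        # Stands in for the register recording which sync zones this processor belongs to.
        self.member_zones = set(member_zones)
        # Zones for which this processor has reached its synchronisation point.
        self.arrived = set()

    def arrive(self, zone):
        """Record that the compiled code reached a sync point for `zone`."""
        self.arrived.add(zone)

    def sync_request(self, zone):
        """Return True if this processor is asserting a sync request for `zone`."""
        if zone not in self.member_zones:
            # Non-members continually assert the request, so they never block the zone.
            return True
        # Members assert only once they have arrived at the sync point.
        return zone in self.arrived


def zone_ready(processors, zone):
    """Assumed controller rule: a zone is ready when every request line is asserted."""
    return all(p.sync_request(zone) for p in processors)
```

Under this model, a processor outside a zone is invisible to that zone's barrier, which is why the same request wiring can serve any configured grouping.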
-
Publication number: 20220414040
Abstract: A method for controlling the sending of data by a plurality of processors belonging to a device, the method comprising: sending a first message to a first processor of the plurality of processors to grant permission to the first processor of the plurality of processors to send a first set of data packets over at least one external interface of the device; receiving from the first processor, an identifier of a second processor of the plurality of processors; and in response to receipt of the identifier of the second processor, sending a second message to the second processor to grant permission to the second processor to send a second set of data packets over the at least one external interface.
Type: Application
Filed: September 16, 2021
Publication date: December 29, 2022
Inventors: Graham Bernard CUNNINGHAM, Stephen FELIX
-
Publication number: 20220413961
Abstract: Signature generation circuitry is configured to update a signature in response to each of a plurality of writes to memory. The signature is updated by performing bitwise operations between current bit values of the signature and at least some of the bits written to memory in response to a write. The bitwise operations are order-independent such that the resulting signature is the same irrespective of the order in which the writes are used to update the signature. The signatures are formed in an order-independent manner such that, if no errors have occurred in generating the data to be written to memory, the signatures will match. In this way, a compact signature is developed that is suitable for export from the data processing device for checking against a corresponding data processing device of a machine running a duplicate application.
Type: Application
Filed: August 30, 2022
Publication date: December 29, 2022
Inventors: Stephen FELIX, Daniel WILKINSON, Graham Bernard CUNNINGHAM
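As a rough illustration of an order-independent signature, the sketch below folds each written value into the signature with XOR. The abstract only says "bitwise operations", so the choice of XOR, the 32-bit width, and the function names `update_signature` and `signature_of` are all assumptions made for the example, not details taken from the patent.

```python
def update_signature(signature: int, written_value: int, width: int = 32) -> int:
    """Fold one memory write into the signature (assumed operation: XOR)."""
    mask = (1 << width) - 1
    # XOR is commutative and associative, so the order of updates does not matter.
    return (signature ^ (written_value & mask)) & mask


def signature_of(writes, width: int = 32) -> int:
    """Compute the signature of a sequence of written values, starting from zero."""
    signature = 0
    for value in writes:
        signature = update_signature(signature, value, width)
    return signature
```

Two machines running a duplicate application could then compare the compact signatures instead of the full write streams: `signature_of([1, 2, 4])` and `signature_of([4, 1, 2])` agree, while a corrupted write such as `[1, 2, 5]` yields a different value. A plain XOR fold cannot detect every error pattern (two identical corruptions cancel), which is one reason this is only a sketch.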
-
Publication number: 20220365116
Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly and accurately detecting this droop so as to reduce the probability of circuit timing failures. The droop detector described herein uses a tap sampled delay line in which a clock signal is split along two separate paths. Each of the taps in the paths are separated by two inverter delays such that the set of samples produced represent sample values of the clock signal that are each separated by a single inverter delay without inversion of the first clock signal between the samples.
Type: Application
Filed: July 22, 2022
Publication date: November 17, 2022
Inventors: Stephen FELIX, Daniel John Pelham WILKINSON
-
Patent number: 11461175
Abstract: Signature generation circuitry is configured to update a signature in response to each of a plurality of writes to memory. The signature is updated by performing bitwise operations between current bit values of the signature and at least some of the bits written to memory in response to a write. The bitwise operations are order-independent such that the resulting signature is the same irrespective of the order in which the writes are used to update the signature. The signatures are formed in an order-independent manner such that, if no errors have occurred in generating the data to be written to memory, the signatures will match. In this way, a compact signature is developed that is suitable for export from the data processing device for checking against a corresponding data processing device of a machine running a duplicate application.
Type: Grant
Filed: September 17, 2021
Date of Patent: October 4, 2022
Assignee: GRAPHCORE LIMITED
Inventors: Stephen Felix, Daniel Wilkinson, Graham Bernard Cunningham
-
Patent number: 11462293
Abstract: A memory controller is provided for reading and writing to and from a memory module. The memory controller implements an error correction algorithm, which calculates error correction code for message data to be written to the memory module and checks the error correction code against the message data when the data is read out of the memory module. The memory controller spreads each codeword over at least four different beats sent over the interface with the memory module, with each beat comprising a symbol of error correction code. Bits of a particular symbol of message data occupy the same positions in different beats. Since the bits of the symbols occupy the same positions in different beats, the number of bits affected by a hardware error is minimised. With four symbols of error correction code available for use in the codeword.
Type: Grant
Filed: July 20, 2021
Date of Patent: October 4, 2022
Assignee: GRAPHCORE LIMITED
Inventors: Graham Bernard Cunningham, Stephen Felix
-
Patent number: 11449117
Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly addressing this droop so as to reduce the probability of circuit timing failures. This problem is addressed by providing an apparatus that is configured to detect the droop and react to mitigate the droop. The apparatus includes a frequency divider that is configured to receive an output of a clock signal generator (e.g. a phase locked loop) and produce an output signal in which a predefined fraction of the clock pulses in the output of the clock signal generator are removed from the output signal. By reducing the frequency of the clock signal in this way (as may be understood by examining equation 3) VDD is increased, hence mitigating the voltage droop. This technique provides a fast throttling mechanism that prevents excessive VDD droop across the processor.
Type: Grant
Filed: April 8, 2020
Date of Patent: September 20, 2022
Assignee: GRAPHCORE LIMITED
Inventors: Stephen Felix, Daniel Wilkinson
-
Patent number: 11449309
Abstract: A hardware module comprising circuitry configured to: store a sequence of n bits in a register of the hardware module; generate a signed integer comprising a magnitude component and a sign bit by: if the most significant bit of the sequence of n bits is equal to one: set each of the n-1 of the most significant bits of the magnitude component to be equal to the corresponding bit of the n-1 least significant bits of the sequence of n bits; and set the sign bit to be zero; if the most significant bit of the sequence of n bits is equal to zero: set each of the n-1 of the most significant bits of the magnitude component to be equal to the inverse of the corresponding bit of the n-1 least significant bits of the sequence of n bits; and set the sign bit to be one.
Type: Grant
Filed: June 21, 2019
Date of Patent: September 20, 2022
Assignee: GRAPHCORE LIMITED
Inventors: Stephen Felix, Mrudula Gore
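The conversion in this abstract can be modelled in software as follows. This is a minimal sketch, assuming the magnitude component is exactly n-1 bits wide (the abstract only describes its n-1 most significant bits, so a wider magnitude is possible); the function name `to_sign_magnitude` is hypothetical.

```python
from typing import Tuple


def to_sign_magnitude(bits: int, n: int) -> Tuple[int, int]:
    """Map an n-bit sequence to (sign_bit, magnitude) as described in the abstract.

    Assumption: the magnitude has exactly n-1 bits.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not 0 <= bits < (1 << n):
        raise ValueError("bits must fit in n bits")

    low_mask = (1 << (n - 1)) - 1
    low_bits = bits & low_mask        # the n-1 least significant bits
    msb = bits >> (n - 1)             # the most significant bit

    if msb == 1:
        # MSB set: copy the low bits into the magnitude, sign bit is zero.
        return 0, low_bits
    # MSB clear: magnitude is the bitwise inverse of the low bits, sign bit is one.
    return 1, low_bits ^ low_mask
```

For n = 4, the input 0b1011 maps to sign 0 and magnitude 0b011, while 0b0011 maps to sign 1 and magnitude 0b100.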