Patents by Inventor Stephen Felix
Stephen Felix has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11442082Abstract: During normal operation of a processor, voltage droop is likely to occur and there is, therefore, a need for techniques for rapidly and accurately detecting this droop so as to reduce the probability of circuit timing failures. The droop detector described herein uses a tap sampled delay line in which a clock signal is split along two separate paths. Each of the taps in the paths are separated by two inverter delays such that the set of samples produced represent sample values of the clock signal that are each separated by a single inverter delay without inversion of the first clock signal between the samples.Type: GrantFiled: October 28, 2020Date of Patent: September 13, 2022Assignee: GRAPHCORE LIMITEDInventors: Stephen Felix, Daniel John Pelham Wilkinson
-
Patent number: 11416440Abstract: A computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising: a switch control instruction which when executed causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined received time, the switch control instruction comprising a delay control field which holds a value defining a delay between issuance of the instruction in the sequence of instructions and its execution by the execution unit.Type: GrantFiled: February 13, 2020Date of Patent: August 16, 2022Assignee: GRAPHCORE LIMITEDInventors: Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix
-
Publication number: 20220253279Abstract: An execution unit for a processor, the execution unit comprising: a look up table having a plurality of entries, each of the plurality of entries comprising an initial estimate for a result of an operation; a preparatory circuit configured to search the look up table using an index value dependent upon the operand to locate an entry comprising a first initial estimate for a result of the operation; a plurality of processing circuits comprising at least one multiplier circuit; and control circuitry configured to provide the first initial estimate to the at least one multiplier circuit of the plurality of processing circuits so as perform processing, by the plurality of processing units, of the first initial estimate to generate the function result, said processing comprising applying one or more Newton Raphson iterations to the first initial estimate.Type: ApplicationFiled: April 26, 2022Publication date: August 11, 2022Inventors: Jonathan Mangnall, Stephen Felix
-
Publication number: 20220253399Abstract: The invention relates to a computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising one or more computer executable instruction which, when executed, implements: a send function which causes a data packet destined for a recipient processing unit to be transmitted on a set of connection wires connected to the processing unit, the data packet having no destination identifier but being transmitted at a predetermined transmit time; and a switch control function which causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined receive time.Type: ApplicationFiled: April 6, 2022Publication date: August 11, 2022Inventors: Simon Christian KNOWLES, Daniel John Pelham WILKINSON, Richard Luke Southwell OSBORNE, Alan Graham ALEXANDER, Stephen FELIX, Jonathan MANGNALL, David LACEY
-
Publication number: 20220229802Abstract: Two or more die are stacked together in a stacked integrated circuit device. Each of the processors on these die is able to communicate with other processors on its die by sending data over the switching fabric of its respective die. The mechanism for sending data between processors on the same die (i.e. intradie communication) is reused for sending data between processors on different die (i.e. interdie communication). The reuse of the mechanism is enabled by assigning each processor a vertical neighbour on its opposing die. Each processor has an interdie connection that connects it to the output exchange bus of its neighbour. A processor is able to borrow the output exchange bus of its neighbour by sending data along the output exchange bus of its neighbour.Type: ApplicationFiled: October 19, 2021Publication date: July 21, 2022Inventors: Stephen FELIX, Richard Luke Southwell OSBORNE, Alan Graham ALEXANDER
-
Publication number: 20220229871Abstract: A function approximation system is disclosed for determining output floating point values of functions calculated using floating point numbers. Complex functions have different shapes in different subsets of their input domain, making them difficult to predict for different values of the input variable. The function approximation system comprises an execution unit configured to determine corresponding values of a given function given a floating point input to the function; a plurality of look up tables for each function type; a correction table of values which determines if corrections to the output value are required; and a table selector for finding an appropriate table for a given function.Type: ApplicationFiled: April 5, 2022Publication date: July 21, 2022Inventors: Jonathan MANGNALL, Stephen FELIX
-
Publication number: 20220197857Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires: a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.Type: ApplicationFiled: January 20, 2022Publication date: June 23, 2022Inventors: Stephen Felix, Simon Christian KNOWLES
-
Patent number: 11340868Abstract: An execution unit for a processor, the execution unit comprising: a look up table having a plurality of entries, each of the plurality of entries comprising an initial estimate for a result of an operation; a preparatory circuit configured to search the look up table using an index value dependent upon the operand to locate an entry comprising a first initial estimate for a result of the operation; a plurality of processing circuits comprising at least one multiplier circuit; and control circuitry configured to provide the first initial estimate to the at least one multiplier circuit of the plurality of processing circuits so as perform processing, by the plurality of processing units, of the first initial estimate to generate the function result, said processing comprising applying one or more Newton Raphson iterations to the first initial estimate.Type: GrantFiled: April 26, 2019Date of Patent: May 24, 2022Assignee: Graphcore LimitedInventors: Jonathan Mangnall, Stephen Felix
-
Publication number: 20220157398Abstract: A memory controller is provided for reading and writing to and from a memory module. The memory controller implements an error correction algorithm, which calculates error correction code for message data to be written to the memory module and checks the error correction code against the message data when the data is read out of the memory module. The memory controller spreads each codeword over at least four different beats sent over the interface with the memory module, with each beat comprising a symbol of error correction code. Bits of a particular symbol of message data occupy the same positions in different beats. Since the bits of the symbols occupy the same positions in different beat, the number of bits affected by a hardware error is minimised. With four symbols of error correction code available for use in the codeword.Type: ApplicationFiled: July 20, 2021Publication date: May 19, 2022Inventors: Graham Bernard CUNNINGHAM, Stephen FELIX
-
Patent number: 11334320Abstract: An execution unit configured to execute a computer program instruction to generate random numbers based on a predetermined probability distribution. The execution unit comprises a hardware pseudorandom number generator configured to generate at least randomised bit string on execution of the instruction and adding circuitry which is configured to receive a number of bit sequences of a predetermined bit length selected from the randomised bit string and to sum them to produce a result.Type: GrantFiled: February 21, 2020Date of Patent: May 17, 2022Assignee: GRAPHCORE LIMITEDInventors: Stephen Felix, Godfrey Da Costa
-
Patent number: 11328015Abstract: A function approximation system is disclosed for determining output floating point values of functions calculated using floating point numbers. Complex functions have different shapes in different subsets of their input domain, making them difficult to predict for different values of the input variable. The function approximation system comprises an execution unit configured to determine corresponding values of a given function given a floating point input to the function; a plurality of look up tables for each function type; a correction table of values which determines if corrections to the output value are required; and a table selector for finding an appropriate table for a given function.Type: GrantFiled: April 26, 2019Date of Patent: May 10, 2022Assignee: Graphcore LimitedInventors: Jonathan Mangnall, Stephen Felix
-
Patent number: 11321272Abstract: The invention relates to a computer program comprising a sequence of instructions for execution on a processing unit having instruction storage for holding the computer program, an execution unit for executing the computer program and data storage for holding data, the computer program comprising one or more computer executable instruction which, when executed, implements: a send function which causes a data packet destined for a recipient processing unit to be transmitted on a set of connection wires connected to the processing unit, the data packet having no destination identifier but being transmitted at a predetermined transmit time; and a switch control function which causes the processing unit to control switching circuitry to connect a set of connection wires of the processing unit to a switching fabric to receive a data packet at a predetermined receive time.Type: GrantFiled: February 1, 2018Date of Patent: May 3, 2022Assignee: Graphcore LimitedInventors: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey
-
Patent number: 11294635Abstract: A pseudo random number generator implemented in hardware. The pseudo random number generator comprises a state post processing circuit for processing two state values to produce a random number. The circuit having a first combinatorial logic comprising a XOR or XNOR gate configured to process a first pair of bits from the state values, a second combinatorial logic comprising an OR or AND gate configured to process a second pair of bits from the state value, and third combinatorial logic comprising an OR or AND gate configured or process a third pair of bits from the state value. The circuit has fourth combinatorial logic configured to process the outputs of the first three set of combinatorial logic so as to provide a result bit of the random number. The fourth combinatorial logic comprises an AND or OR gate and a XOR or XNOR gate.Type: GrantFiled: April 26, 2019Date of Patent: April 5, 2022Assignee: Graphcore LimitedInventors: Stephen Felix, James William Hanlon
-
Patent number: 11269806Abstract: A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets.Type: GrantFiled: May 22, 2019Date of Patent: March 8, 2022Assignee: Graphcore LimitedInventors: Stephen Felix, Simon Christian Knowles
-
Patent number: 11262787Abstract: The invention relates to a computer implemented method of generating multiple programs to deliver a computerised function, each program to be executed in a processing unit of a computer comprising a plurality of processing units each having instruction storage for holding a local program, an execution unit for executing the local program and data storage for holding data, a switching fabric connected to an output interface of each processing unit and connectable to an input interface of each processing unit by switching circuitry controllable by each processing unit, and a synchronisation module operable to generate a synchronisation signal, the method comprising: generating a local program for each processing unit comprising a sequence of executable instructions; determining for each processing unit a relative time of execution of instructions of each local program whereby a local program allocated to one processing unit is scheduled to execute with a predetermined delay relative to a synchronisation signalType: GrantFiled: January 16, 2020Date of Patent: March 1, 2022Assignee: GRAPHCORE LIMITEDInventors: Simon Christian Knowles, Daniel John Pelham Wilkinson, Richard Luke Southwell Osborne, Alan Graham Alexander, Stephen Felix, Jonathan Mangnall, David Lacey
-
Publication number: 20220019257Abstract: Two clocks, a fast clock and a slow clock are provided for clocking a processing unit. A plurality of frequency settings, referred to as gears, are defined for the two clock. Each of these gears indicates a maximum frequency for the fast clock and a minimum frequency for the slow clock, such that the gap between the two frequencies may be kept to a manageable level so as to reduce transients upon switching between the two clocks. The system switches between the gears as required. In response to a determination to increase the frequency of the clock signal, a higher gear is selected at which the maximum and minimum frequencies defined for that gear are higher than the previous selected gear.Type: ApplicationFiled: June 16, 2021Publication date: January 20, 2022Inventors: Simon Douglas CHAMBERS, Stephen FELIX, Ian Malcolm KING
-
Publication number: 20210406115Abstract: A processor comprises a plurality of processing units, wherein there is a fixed transmission time for transmitting a message from a sending processing unit to a receiving processing unit, based on the physical positions of the sending and receiving processing units in the processor. The processing units are arranged in a column, and the fixed transmission time depends on the position of a processing circuit in the column. An exchange fabric is provided for exchanging messages between sending and receiving processing units, the columns being arranged with respect to the exchange fabric such that the fixed transmission time depends on the distances of the processing circuits with respect to the exchange fabric.Type: ApplicationFiled: September 10, 2021Publication date: December 30, 2021Inventor: Stephen FELIX
-
Publication number: 20210373637Abstract: There is disclosed a method of controlling the frequency of a clock signal in a processor. The method selects a first clock generator to provide a processor clock signal for executing an application. If a threshold event is detected, a second clock generator is selected. The method reduces the frequency of a clock signal generated by the first clock generator while a processor clock signal is being provided for execution of an application from the second clock generator. The second clock generator generates a clock at a lower speed than the first clock generator. After a predetermined time, the first clock generator is reselected to provide the processor clock signal. The threshold detection is repeated until an optimum clock frequency is discovered.Type: ApplicationFiled: August 17, 2021Publication date: December 2, 2021Inventors: Stephen FELIX, Mrudula GORE
-
Patent number: 11176066Abstract: The present disclosure relates to a method of scheduling messages to be exchanged between tiles in a computer where there is a fixed transmission time between sending and receiving tiles. According to the method a total size of message data to be sent or received by each tile is determined. One of the tiles is selected based at least on the size of the message data to schedule a first message. The first message to be scheduled is selected from the set of messages on that tile. In order to schedule the message the other end points of this selected message are determined, and then respective time slots are allocated at the sending and receiving tiles for that message. The size of the selected message is then deducted from each of the tiles acting as end points for the message, and then the sequence is carried out again until all messages have been scheduled. This technique optimises message exchange in an exchange phase of a BSP system.Type: GrantFiled: April 26, 2019Date of Patent: November 16, 2021Assignee: Graphcore LimitedInventors: Richard Luke Southwell Osborne, Stephen Felix
-
Patent number: 11169778Abstract: A hardware module comprising at least one of: one or more field programmable gate arrays and one or more application specific integrated circuits configured to: receive a number in floating-point representation at a first precision level, the number comprising an exponent and a first mantissa; apply a first random number to the first mantissa to generate a first carry; truncate the first mantissa to a level specified by a second precision level; add the first carry to the least significant bit of the mantissa truncated to the level specified by the second precision level to form a mantissa for the number in floating-point representation at the second precision level.Type: GrantFiled: July 26, 2019Date of Patent: November 9, 2021Assignee: GRAPHCORE LIMITEDInventors: Stephen Felix, Mrudula Gore, Alan Graham Alexander