Patents by Inventor Nicolas Brunie

Nicolas Brunie has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11550544
    Abstract: A fused multiply-add hardware operator comprising a multiplier receiving two multiplicands as floating-point numbers encoded in a first precision format; an alignment circuit associated with the multiplier configured to convert the result of the multiplication into a first fixed-point number; and an adder configured to add the first fixed-point number and an addition operand. The addition operand is a floating-point number encoded in a second precision format, and the operator comprises an alignment circuit associated with the addition operand, configured to convert the addition operand into a second fixed-point number of reduced dynamic range relative to the dynamic range of the addition operand, having a number of bits equal to the number of bits of the first fixed-point number, extended on both sides by at least the size of the mantissa of the addition operand; the adder configured to add the first and second fixed-point numbers without loss.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: January 10, 2023
    Assignee: Kalray
    Inventor: Nicolas Brunie
  • Patent number: 11489544
    Abstract: A circuit for generating an N-bit cyclic redundancy code of a k-bit digit d, the code based on a reconfigurable generator polynomial P of degree N, the circuit including a dynamic table comprising a multiplication sub-table storing products resulting from multiplication by the polynomial P of each element definable over k bits, in the order of the scalar values of the k-bit elements; a division sub-table storing quotients resulting from Euclidean division by the polynomial P of each k-bit element shifted by N bits to the left, in the order of the scalar values of the k-bit elements; and a group of first multiplexers, each multiplexer connected to be indexed by a respective cell of the division table to transmit the contents of a corresponding cell of the multiplication table to an output of the dynamic table, of same rank as the respective cell of the division table.
    Type: Grant
    Filed: March 30, 2021
    Date of Patent: November 1, 2022
    Assignee: Kalray
    Inventor: Nicolas Brunie
  • Publication number: 20220207108
    Abstract: A method is disclosed for block processing two matrices stored in a same shared memory, one being stored by rows and the other being stored by columns, using a plurality of processing elements (PE), where each processing element is connected to the shared memory by a respective N-bit access and to a first adjacent processing element by a bidirectional N-bit point-to-point link. The method comprising the following steps carried out in one processor instruction cycle: receiving in the processing elements respective different N-bit segments of a same one of the two matrices by the respective memory accesses; and exchanging with the first adjacent processing element, by means of the point-to-point link, N-bit segments of a first of the two matrices which were received in the adjacent processing elements in a previous instruction cycle.
    Type: Application
    Filed: December 30, 2021
    Publication date: June 30, 2022
    Inventors: Benoit Dupont de Dinechin, Julien Le Maire, Nicolas Brunie
  • Patent number: 11294627
    Abstract: The disclosure relates to a hardware operator for dot-product computation, comprising a plurality of multipliers each receiving two multiplicands in the form of floating-point numbers encoded in a first precision format; an alignment circuit associated with each multiplier, configured to, based on the exponents of the corresponding multiplicands, convert the result of the multiplication into a respective fixed-point number having a sufficient number of bits to cover the full dynamic range of the multiplication; and a multi-adder configured to add without loss the fixed-point numbers provided by the multipliers, providing a sum in the form of a fixed-point number.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: April 5, 2022
    Assignee: Kalray
    Inventor: Nicolas Brunie
  • Patent number: 11169808
    Abstract: The disclosure relates to a processor including an N-bit data bus configured to access a memory; a central processing unit CPU connected to the data bus; a coprocessor coupled to the CPU, including a register file with N-bit registers; an instruction processing unit in the CPU, configured to, in response to a load-scatter machine instruction received by the CPU, read accessing a memory address and delegating to the coprocessor the processing of the corresponding N-bit word presented on the data bus; and a register control unit in the coprocessor, configured by the CPU in response to the load-scatter instruction, to divide the word presented on the data bus into K segments and writing the K segments at the same position in K respective registers, the position and the registers being designated by the load-scatter instruction.
    Type: Grant
    Filed: December 20, 2019
    Date of Patent: November 9, 2021
    Assignee: Kalray
    Inventors: Benoit Dupont de Dinechin, Julien Le Maire, Nicolas Brunie
  • Publication number: 20210306002
    Abstract: A circuit for generating an N-bit cyclic redundancy code of a k-bit digit d, the code based on a reconfigurable generator polynomial P of degree N, the circuit including a dynamic table comprising a multiplication sub-table storing products resulting from multiplication by the polynomial P of each element definable over k bits, in the order of the scalar values of the k-bit elements; a division sub-table storing quotients resulting from Euclidean division by the polynomial P of each k-bit element shifted by N bits to the left, in the order of the scalar values of the k-bit elements; and a group of first multiplexers, each multiplexer connected to be indexed by a respective cell of the division table to transmit the contents of a corresponding cell of the multiplication table to an output of the dynamic table, of same rank as the respective cell of the division table.
    Type: Application
    Filed: March 30, 2021
    Publication date: September 30, 2021
    Inventor: Nicolas Brunie
  • Publication number: 20200409661
    Abstract: The disclosure relates to a hardware operator for dot-product computation, comprising a plurality of multipliers each receiving two multiplicands in the form of floating-point numbers encoded in a first precision format; an alignment circuit associated with each multiplier, configured to, based on the exponents of the corresponding multiplicands, convert the result of the multiplication into a respective fixed-point number having a sufficient number of bits to cover the full dynamic range of the multiplication; and a multi-adder configured to add without loss the fixed-point numbers provided by the multipliers, providing a sum in the form of a fixed-point number.
    Type: Application
    Filed: June 25, 2020
    Publication date: December 31, 2020
    Inventor: Nicolas Brunie
  • Publication number: 20200409659
    Abstract: A fused multiply-add hardware operator comprising a multiplier receiving two multiplicands as floating-point numbers encoded in a first precision format; an alignment circuit associated with the multiplier configured to convert the result of the multiplication into a first fixed-point number ; and an adder configured to add the first fixed-point number and an addition operand. The addition operand is a floating-point number encoded in a second precision format , and the operator comprises an alignment circuit associated with the addition operand, configured to convert the addition operand into a second fixed-point number of reduced dynamic range relative to the dynamic range of the addition operand, having a number of bits equal to the number of bits of the first fixed-point number, extended on both sides by at least the size of the mantissa of the addition operand; the adder configured to add the first and second fixed-point numbers without loss.
    Type: Application
    Filed: June 25, 2020
    Publication date: December 31, 2020
    Inventor: Nicolas Brunie
  • Publication number: 20200201642
    Abstract: The disclosure relates to a processor including an N-bit data bus configured to access a memory; a central processing unit CPU connected to the data bus; a coprocessor coupled to the CPU, including a register file with N-bit registers; an instruction processing unit in the CPU, configured to, in response to a load-scatter machine instruction received by the CPU, read accessing a memory address and delegating to the coprocessor the processing of the corresponding N-bit word presented on the data bus; and a register control unit in the coprocessor, configured by the CPU in response to the load-scatter instruction, to divide the word presented on the data bus into K segments and writing the K segments at the same position in K respective registers, the position and the registers being designated by the load-scatter instruction.
    Type: Application
    Filed: December 20, 2019
    Publication date: June 25, 2020
    Inventors: Benoit Dupont de Dinechin, Julien Le Maire, Nicolas Brunie
  • Patent number: 9851977
    Abstract: A method for executing instructions on a single-program, multiple-data processor system having a fixed number of execution lanes, including: scheduling a primary instruction for execution with a first wave of multiple data; assigning the first wave to a corresponding primary subset of the execution lanes; scheduling a secondary instruction having a second wave of multiple data, such that the second wave fits in lanes that are unused by the primary subset of lanes; assigning the second wave to a corresponding secondary subset of the lanes; fetching the primary and secondary instructions; configuring the execution lanes such that the primary subset is responsive to the primary instruction and the secondary subset is simultaneously responsive to the secondary instruction; and simultaneously executing the primary and secondary instructions in the execution lanes.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: December 26, 2017
    Assignee: KALRAY
    Inventors: Nicolas Brunie, Sylvain Collange
  • Patent number: 9367287
    Abstract: A circuit for calculating the fused sum of an addend and product of two multiplication operands, the addend and multiplication operands being binary floating-point numbers represented in a standardized format as a mantissa and an exponent is provided. The multiplication operands are in a lower precision format than the addend, with q>2p, where p and q are the mantissa size of the multiplication operand and addend precision formats. The circuit includes a p-bit multiplier receiving the mantissas of the multiplication operands; a shift circuit aligning the mantissa of the addend with the product output by the multiplier based on the exponent values of the addend and multiplication operands; and an adder processing q-bit mantissas, receiving the aligned mantissa of the addend and the product, the input lines of the adder corresponding to the product being completed to the right by lines at 0 to form a q-bit mantissa.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: June 14, 2016
    Assignee: KALRAY
    Inventors: Florent Dupont De Dinechin, Nicolas Brunie, Benoit Dupont De Dinechin
  • Publication number: 20140164737
    Abstract: A method for executing instructions on a single-program, multiple-data processor system having a fixed number of execution lanes, including: scheduling a primary instruction for execution with a first wave of multiple data; assigning the first wave to a corresponding primary subset of the execution lanes; scheduling a secondary instruction having a second wave of multiple data, such that the second wave fits in lanes that are unused by the primary subset of lanes; assigning the second wave to a corresponding secondary subset of the lanes; fetching the primary and secondary instructions; configuring the execution lanes such that the primary subset is responsive to the primary instruction and the secondary subset is simultaneously responsive to the secondary instruction; and simultaneously executing the primary and secondary instructions in the execution lanes.
    Type: Application
    Filed: December 6, 2012
    Publication date: June 12, 2014
    Applicant: KALRAY
    Inventors: Sylvain COLLANGE, Nicolas BRUNIE
  • Publication number: 20140089371
    Abstract: A circuit for calculating the fused sum of an addend and product of two multiplicands, the addend and multiplicands being binary floating-point numbers represented in a standardized format as a mantissa and an exponent is provided. The multiplicands are in a lower precision format than the addend, with q>2p, where p and q are respectively the mantissa size of the multiplicand precision format and the addend precision format. The circuit includes a p-bit multiplier receiving the mantissas of the multiplicands; a shift circuit that aligns the mantissa of the addend with the product output by the multiplier based on the exponent values of the addend and multiplicands; and an adder that processes q-bit mantissas, receiving the aligned mantissa of the addend and the product, the input lines of the adder corresponding to the product being completed to the right by lines at 0 to form a q-bit mantissa.
    Type: Application
    Filed: April 19, 2012
    Publication date: March 27, 2014
    Applicant: KALRAY
    Inventors: Florent Dupont De Dinechin, Nicolas Brunie, Benoit Dupont De Dinechin