Patents by Inventor Bogdan Pasca
Bogdan Pasca has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240289168Abstract: Systems, apparatuses and methods may provide for technology that identifies a type of a first activation function, identifies a derivative level of the first activation function, and generates a first instruction based on the type of the first activation function and the derivative level of the first activation function. The technology also includes an accelerator having logic coupled to one or more substrates, the logic including a compute engine including a plurality of arithmetic operators, a multiplexer network coupled to the compute engine, and a controller coupled to the multiplexer network, the controller to detect the first instruction, decode the first instruction to identify the first activation function, and drive the multiplexer network to form first connections between two or more of the plurality of arithmetic operators in accordance with the first activation function, wherein the first connections are to cause the compute engine to conduct the first activation function.Type: ApplicationFiled: November 10, 2023Publication date: August 29, 2024Inventors: Krishnan Ananthanarayanan, Martin Langhammer, Om Ji Omer, Bogdan Pasca, Kamlesh Pillai, Pramod Udupa
-
Publication number: 20230273770Abstract: Integrated circuit devices, methods, and circuitry for implementing and using an iterative multiplicative modular reduction circuit are provided. Such circuitry may include polynomial multiplication circuitry and modular reduction circuitry that may operate concurrently. The polynomial multiplication circuitry may multiply a first input value to a second input value to compute a product. The modular reduction circuitry may perform modular reduction on a first component of the product while the polynomial multiplication circuitry is still generating other components of the product.Type: ApplicationFiled: March 16, 2023Publication date: August 31, 2023Inventors: Sergey Vladimirovich Gribok, Martin Langhammer, Bogdan Pasca
-
Publication number: 20230239136Abstract: Integrated circuits, methods, and circuitry are provided for performing multiplication such as that used in Galois field counter mode (GCM) hash computations. An integrated circuit may include selection circuitry to provide one of several powers of a hash key. A Galois field multiplier may receive the one of the powers of the hash key and a hash sequence and generate one or more values. The Galois field multiplier may include multiple levels of pipeline stages. An adder may receive the one or more values and provide a summation of the one or more values in computing a GCM hash.Type: ApplicationFiled: March 31, 2023Publication date: July 27, 2023Inventors: Sergey Vladimirovich Gribok, Gregg William Baeckler, Bogdan Pasca, Martin Langhammer
-
Publication number: 20230021396Abstract: A method for implementing an artificial neural network in a computing system that comprises performing a compute operation using an input activation and a weight to generate an output activation, and modifying the output activation using a noise value to increase activation sparsity.Type: ApplicationFiled: September 27, 2022Publication date: January 26, 2023Applicant: Intel CorporationInventors: Nihat Tunali, Arnab Raha, Bogdan Pasca, Martin Langhammer, Michael Wu, Deepak Mathaikutty
-
Publication number: 20230026331Abstract: A circuit system for performing modular reduction of a modular multiplication includes multiplier circuits that receive a first subset of coefficients that are generated by summing partial products of a multiplication operation that is part of the modular multiplication. The multiplier circuits multiply the coefficients in the first subset by constants that equal remainders of divisions to generate products. Adder circuits add a second subset of the coefficients and segments of bits of the products that are aligned with respective ones of the second subset of the coefficients to generate sums.Type: ApplicationFiled: September 23, 2022Publication date: January 26, 2023Applicant: Intel CorporationInventors: Sergey Gribok, Bogdan Pasca, Martin Langhammer
-
Publication number: 20220230057Abstract: The present disclosure relates generally to techniques for enhancing recurrent neural networks (RNNs) implemented on an integrated circuit. In particular, approximations of activation functions used in an RNN, such as sigmoid and hyperbolic tangent, may be implemented in an integrated circuit, which may result in increased efficiencies, reduced latency, increased accuracy, and reduced resource consumption involved with implementing machine learning.Type: ApplicationFiled: February 22, 2022Publication date: July 21, 2022Inventors: Bogdan Pasca, Martin Langhammer
-
Publication number: 20220188072Abstract: This disclosure is directed to multiplier circuitry that includes a multiplier that is configurable to generate a plurality of subproducts by performing a plurality of multiplication operations involving values having a first precision using a recursive multiplication process in which a second multiplier of the multiplier performs a second plurality of multiplication operations involving values having a second precision that are derived from the values having the first precision.Type: ApplicationFiled: December 23, 2021Publication date: June 16, 2022Inventors: Martin Langhammer, Bogdan Pasca
-
Patent number: 11301213Abstract: An integrated circuit with a large multiplier is provided. The multiplier may be configured to receive large input operands with thousands of bits. The multiplier may be implemented using a multiplier decomposition scheme that is recursively flattened into multiple decomposition levels to expose a tree of adders. The adders may be collapsed into a merged pipelined structure, where partial sums are forwarded from one level to the next while bypassing intervening prefix networks. The final correct sum is not calculated until later. In accordance with the decomposition technique, the partial sums are successively halved, which allows the prefix networks to be smaller from one level to the next. This allows all sums to be calculated at approximately the same pipeline depth, which significantly reduces latency with no or limited pipeline balancing.Type: GrantFiled: June 24, 2019Date of Patent: April 12, 2022Assignee: Intel CorporationInventors: Martin Langhammer, Bogdan Pasca
-
Publication number: 20220107783Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry. In some embodiments, the multiplication is implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry.Type: ApplicationFiled: December 16, 2021Publication date: April 7, 2022Inventors: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu
-
Patent number: 11256978Abstract: The present disclosure relates generally to techniques for enhancing recurrent neural networks (RNNs) implemented on an integrated circuit. In particular, approximations of activation functions used in an RNN, such as sigmoid and hyperbolic tangent, may be implemented in an integrated circuit, which may result in increased efficiencies, reduced latency, increased accuracy, and reduced resource consumption involved with implementing machine learning.Type: GrantFiled: January 5, 2018Date of Patent: February 22, 2022Assignee: Intel CorporationInventors: Bogdan Pasca, Martin Langhammer
-
Patent number: 11249726Abstract: An integrated circuit is provided with a modular multiplication circuit. The modular multiplication circuit includes an input multiplier for computing the product of two input signals, truncated multipliers for computing another product based on a modulus value and the product, and a subtraction circuit for computing a difference between the two products. An error correction circuit uses the difference to look up an estimated quotient value and to subtract out an integer multiple of the modulus value from the difference in a single step, wherein the integer multiple is equal to the estimated quotient value. A final adjustment stage is used to remove any remaining residual estimation error.Type: GrantFiled: September 10, 2019Date of Patent: February 15, 2022Assignee: Intel CorporationInventors: Martin Langhammer, Bogdan Pasca
-
Patent number: 11210063Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry. The hybrid dot-product circuitry has a hard data path that uses digital signal processing (DSP) blocks operating in floating-point mode and a hard/soft data path that uses DSP blocks operating in fixed-point mode operated in conjunction with general purpose soft logic. The hard/soft data path includes 2-element dot-product circuits that feed an adder tree. Results from the hard data path are combined with the adder tree using format conversion and normalization circuitry. Inputs to the hybrid dot-product circuitry may be in the BFLOAT16 format. The hard data path may be in the single precision format. The hard/soft data path uses a custom format that is similar to but different than BFLOAT16.Type: GrantFiled: September 27, 2019Date of Patent: December 28, 2021Assignee: Intel CorporationInventors: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu
-
Patent number: 11010131Abstract: An integrated circuit may include a floating-point adder. The adder may be implemented using a dual-path adder architecture having a near path and a far path. The near path may include a leading zero anticipator (LZA), a comparison circuit for comparing an exponent value to an LZA count, and associated circuitry for handling subnormal numbers. The far path may include a subtraction circuit for computing the difference between a received exponent value and a minimum exponent value, at least two shifters for shifting far greater and far lesser mantissa values in parallel, and associated circuitry for handling subnormal numbers. The adder may be dynamically configured to support a first mode that processes FP16 at inputs and outputs, a second mode that processes modified FP16? inputs, and a third mode that processes FP16? at inputs and outputs.Type: GrantFiled: September 14, 2017Date of Patent: May 18, 2021Assignee: Intel CorporationInventors: Martin Langhammer, Bogdan Pasca
-
Patent number: 11003446Abstract: Adder trees may be constructed for efficient packing of arithmetic operators into an integrated circuit. The operands of the trees may be truncated to pack an integer number of nodes per logic array block. As a result, arithmetic operations may pack more efficiently onto the integrated circuit while providing increased precision and performance.Type: GrantFiled: December 14, 2017Date of Patent: May 11, 2021Assignee: Intel CorporationInventors: Martin Langhammer, Gregg William Baeckler, Bogdan Pasca
-
Patent number: 10942706Abstract: The present embodiments relate to integrated circuits with circuitry that implements floating-point trigonometric functions. The circuitry may include an approximation circuit that generates an approximation of the output of the trigonometric functions, a storage circuit that stores predetermined output values of the trigonometric functions, and a selector circuit that selects between different possible output values based on a control signal from a control circuit. In some embodiments, the circuitry may include a mapping circuit and a restoration circuit. The mapping circuit may map an input value from an original quadrant of the trigonometric circle to a predetermined input interval, and the restoration circuit may map the output value selected by the selection circuit back to the original quadrant of the trigonometric circle. If desired, the circuitry may be implemented in specialized processing blocks.Type: GrantFiled: June 27, 2017Date of Patent: March 9, 2021Assignee: Intel CorporationInventors: Martin Langhammer, Bogdan Pasca
-
Patent number: 10871946Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit (e.g., an 18×18 or 18×19 multiplier circuit) may be used to support two or more smaller multiplication operations sharing one or two sets of multiplier operands, a complex multiplication, and a sum of two multiplications. If the multiplier products overflow and interfere with one another, correction operations can be performed. Partial products from two or more larger multiplier circuits can be used to combine decomposed partial products. A large multiplier circuit can also be used to support two floating-point mantissa multipliers.Type: GrantFiled: September 27, 2018Date of Patent: December 22, 2020Assignee: Intel CorporationInventors: Martin Langhammer, Gregg William Baeckler, Sergey Gribok, Dmitry N. Denisenko, Bogdan Pasca
-
Patent number: 10732932Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit such as an 18×18 multiplier circuit may be used to support two or more smaller multiplication operations such as two 8×8 integer multiplications or two 9×9 integer multiplications. To implement the two 8×8 or 9×9 unsigned/signed multiplications, the 18×18 multiplier may be configured to support two 8×8 multiplications with one shared operand, two 6×6 multiplications without any shared operand, or two 7×7 multiplications without any shared operand. Any potential overlap of partial product terms may be subtracted out using correction logic. The multiplication of the remaining most significant bits can be computed using associated multiplier extension logic and appended to the other least significant bits using merging logic.Type: GrantFiled: December 21, 2018Date of Patent: August 4, 2020Assignee: Intel CorporationInventors: Bogdan Pasca, Martin Langhammer, Sergey Gribok, Gregg William Baeckler
-
Patent number: 10671345Abstract: An integrated circuit may include normalization circuitry that can be used when converting a fixed-point number to a floating-point number. The normalization circuitry may include at least a floating-point generation circuit that receives the fixed-point number and that creates a corresponding floating-point number. The normalization circuitry may then leverage an embedded digital signal processing (DSP) block on the integrated circuit to perform an arithmetic operation by removing the leading one from the created floating-point number. The resulting number may have a fractional component and an exponent value, which can then be used to derive the final normalized value.Type: GrantFiled: February 2, 2017Date of Patent: June 2, 2020Assignee: Intel CorporationInventor: Bogdan Pasca
-
Publication number: 20200142671Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit such as an 18×18 multiplier circuit may be used to support two or more smaller multiplication operations such as two 8×8 integer multiplications or two 9×9 integer multiplications. To implement the two 8×8 or 9×9 unsigned/signed multiplications, the 18×18 multiplier may be configured to support two 8×8 multiplications with one shared operand, two 6×6 multiplications without any shared operand, or two 7×7 multiplications without any shared operand. Any potential overlap of partial product terms may be subtracted out using correction logic. The multiplication of the remaining most significant bits can be computed using associated multiplier extension logic and appended to the other least significant bits using merging logic.Type: ApplicationFiled: December 21, 2018Publication date: May 7, 2020Applicant: Intel CorporationInventors: Bogdan Pasca, Martin Langhammer, Sergey Gribok, Gregg William Baeckler
-
Publication number: 20200026494Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry. The hybrid dot-product circuitry has a hard data path that uses digital signal processing (DSP) blocks operating in floating-point mode and a hard/soft data path that uses DSP blocks operating in fixed-point mode operated in conjunction with general purpose soft logic. The hard/soft data path includes 2-element dot-product circuits that feed an adder tree. Results from the hard data path are combined with the adder tree using format conversion and normalization circuitry. Inputs to the hybrid dot-product circuitry may be in the BFLOAT16 format. The hard data path may be in the single precision format. The hard/soft data path uses a custom format that is similar to but different than BFLOAT16.Type: ApplicationFiled: September 27, 2019Publication date: January 23, 2020Applicant: Intel CorporationInventors: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu