Patents by Inventor Bogdan Pasca

Bogdan Pasca has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Multiplier Circuit with Carry-Based Partial Product Encoding

Publication number: 20250251910

Abstract: Integrated circuit devices, methods, and circuitry for an efficient multiplier are provided. Multiplier circuitry to multiply a multiplicand value with a multiplier value may include, among other things, input circuitry and carry-based coding circuitry. The input circuitry may receive the multiplicand value and the multiplier value. The carry-based coding circuitry may receive bits of the multiplier value and generate multiplication codes using a carry-based coding scheme that includes multiplication codes according to a Booth's coding scheme but with at least one multiplication code that is removed and replaced with another at least one multiplication code with a different value. A first encoder of the carry-based coding circuitry may receive a carry signal to adjust a multiplication code value of the first encoder based on a second encoder of the carry-based coding circuitry encoding the multiplication code with the different value.

Type: Application

Filed: March 28, 2024

Publication date: August 7, 2025

Inventors: Igor Viktorovich Kucherenko, Bogdan Pasca, Martin Langhammer
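
For reference, the abstract of publication 20250251910 builds on Booth's coding scheme. The sketch below shows only the standard radix-4 Booth recoding that the application takes as its starting point, not the carry-based variant it claims. The function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the multiplier is assumed to be a signed integer that fits in `width` bits.

```python
def booth_radix4_digits(multiplier: int, width: int) -> list[int]:
    """Recode a signed multiplier into radix-4 Booth digits, each in {-2, -1, 0, 1, 2}.

    Digits are returned least significant first, so that
    multiplier == sum(d * 4**i for i, d in enumerate(digits)).
    """
    if width % 2:
        width += 1  # sign-extend to an even number of bits
    # Masking a Python int yields its two's-complement bit pattern.
    bits = multiplier & ((1 << width) - 1)
    digits = []
    prev = 0  # implicit zero bit to the right of bit 0
    for i in range(0, width, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int) -> int:
    """Sum the partial products selected by the Booth digits."""
    total = 0
    for i, digit in enumerate(booth_radix4_digits(multiplier, width)):
        total += (digit * multiplicand) * (4 ** i)
    return total
```

Each digit selects a partial product of 0, ±1, or ±2 times the multiplicand, which halves the number of partial products relative to bit-by-bit multiplication.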
MACHINE LEARNING TRAINING ARCHITECTURE FOR PROGRAMMABLE DEVICES

Publication number: 20250199762

Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry. In some embodiments, the multiplication is implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry.

Type: Application

Filed: March 3, 2025

Publication date: June 19, 2025

Inventors: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu
LUT-FREE HARDWARE BASED SOFTMAX ACCELERATOR

Publication number: 20250190523

Abstract: SoftMax operation is one part of a deep neural network (DNN). Because computing SoftMax is complex and time-consuming, the SoftMax operation can limit the overall execution latency of the DNN. To address this issue, an in-line data path is added to pass output data from a matrix-to-matrix multiplication core to a hardware SoftMax accelerator. During a denominator phase of the SoftMax operation, the SoftMax accelerator can operate in-line to produce a denominator value using output values generated by the matrix-to-matrix multiplication core and received over the in-line data path. During a numerator phase of the SoftMax operation, the SoftMax accelerator can calculate SoftMax outputs using output values generated by the matrix-to-matrix multiplication core and retrieved from a memory. In other words, the SoftMax accelerator can produce partial results while the matrix-to-matrix multiplication is in-flight to cut down overall latency and reduce memory transactions.

Type: Application

Filed: February 18, 2025

Publication date: June 12, 2025

Applicant: Intel Corporation

Inventors: Kamlesh Pillai, Bogdan Pasca, Martin Langhammer
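
The two phases named in the abstract of publication 20250190523 can be illustrated in software. This is a minimal sketch under stated assumptions, not the accelerator's design: the function name `softmax_two_phase` and the `chunks` input are hypothetical, and the running-maximum rescaling used for numerical stability is a common software technique that the abstract does not mention.

```python
import math


def softmax_two_phase(chunks):
    """Compute SoftMax over scores that arrive in chunks.

    chunks: iterable of lists of floats, in the order a producer emits them.
    """
    stored = []              # stands in for the memory holding the producer's outputs
    running_max = -math.inf
    denom = 0.0

    # Denominator phase: accumulate the sum of exponentials as values arrive.
    for chunk in chunks:
        for x in chunk:
            stored.append(x)
            if x > running_max:
                # Rescale the partial sum to the new maximum (assumed technique).
                denom *= math.exp(running_max - x)
                running_max = x
            denom += math.exp(x - running_max)

    # Numerator phase: re-read the stored values and divide by the denominator.
    return [math.exp(x - running_max) / denom for x in stored]
```

Because the denominator is complete when the last chunk arrives, only the numerator pass needs to read the values back.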
SUMMATION AND FLOATING POINT CONVERSION OF TENSOR RESULTS

Publication number: 20250045017

Abstract: Integrated circuit devices and circuitry for implementing and using efficient circuitry for summation of tensors having shared exponents and conversion into a floating-point format are provided. Such circuitry may include first input circuitry to receive a first tensor in a fixed-point format having a first shared exponent and second input circuitry to receive a second tensor in the fixed-point format with a second shared exponent. Addition circuitry may add the first tensor and the second tensor, without first converting the first tensor and the second tensor to a floating-point format, to obtain a result in the floating-point format.

Type: Application

Filed: September 27, 2024

Publication date: February 6, 2025

Inventors: Martin Langhammer, Bogdan Pasca, Dongdong Chen, Ilya Ganusov
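
The idea in the abstract of publication 20250045017, adding two shared-exponent tensors in fixed point and converting only the result to floating point, can be sketched as follows. The representation is an assumption made for illustration: each tensor is a list of integer mantissas plus one shared exponent, with element value `mantissa * 2**exponent`. The function name `add_shared_exponent` is hypothetical.

```python
import math


def add_shared_exponent(mant1, exp1, mant2, exp2):
    """Add two shared-exponent tensors element-wise and return Python floats.

    mant1, mant2: lists of integer mantissas of equal length.
    exp1, exp2:   the shared exponent of each tensor.
    """
    # Align both tensors to the smaller exponent; left shifts keep the sum exact.
    common = min(exp1, exp2)
    sums = [
        (a << (exp1 - common)) + (b << (exp2 - common))
        for a, b in zip(mant1, mant2)
    ]
    # Only the final integer sums are converted to floating point.
    return [math.ldexp(s, common) for s in sums]
```

For example, `add_shared_exponent([3, -4], -2, [1, 2], 0)` aligns both tensors to exponent -2 before adding.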
Filtering with Tensor Structures

Publication number: 20250021305

Abstract: Integrated circuit devices, methods, and circuitry for implementing filters based on multipliers in tensor circuits are provided. Integrated circuitry may include a first tensor circuit with a first set of multipliers of a first precision and first summation circuitry and a second tensor circuit with a second set of multipliers of a second precision and second summation circuitry. The first tensor circuit and the second tensor circuit may collectively perform a multiplication operation at a third precision higher than the first precision and the second precision.

Type: Application

Filed: September 27, 2024

Publication date: January 16, 2025

Inventors: Martin Langhammer, Volker Mauer, Gregory Ives, Dongdong Chen, Bogdan Pasca
Modular Multipliers using Hybrid Reduction Techniques

Publication number: 20250013431

Abstract: Integrated circuit devices, methods, and circuitry for implementing and using a hybrid modular multiplier circuit using a number of different modular reduction techniques are provided. Integrated circuitry may include multiplication circuitry to multiply an input multiplicand value with an input multiplier value to obtain a product, first coarse-grain modular reduction circuitry to partially reduce the product based on a modulus value using a first type of modular reduction, second coarse-grain modular reduction circuitry to further reduce the product based on the modulus value using a second type of modular reduction, and fine-grain modular reduction circuitry to finally reduce the product based on the modulus value using a third type of modular reduction to produce a final modular reduction result.

Type: Application

Filed: September 26, 2024

Publication date: January 9, 2025

Inventors: Sergey Vladimirovich Gribok, Martin Langhammer, Bogdan Pasca
PROGRAMMABLE LOOK UP TABLE FREE HARDWARE ACCELERATOR AND INSTRUCTION SET ARCHITECTURE FOR ACTIVATION FUNCTIONS

Publication number: 20240289168

Abstract: Systems, apparatuses and methods may provide for technology that identifies a type of a first activation function, identifies a derivative level of the first activation function, and generates a first instruction based on the type of the first activation function and the derivative level of the first activation function. The technology also includes an accelerator having logic coupled to one or more substrates, the logic including a compute engine including a plurality of arithmetic operators, a multiplexer network coupled to the compute engine, and a controller coupled to the multiplexer network, the controller to detect the first instruction, decode the first instruction to identify the first activation function, and drive the multiplexer network to form first connections between two or more of the plurality of arithmetic operators in accordance with the first activation function, wherein the first connections are to cause the compute engine to conduct the first activation function.

Type: Application

Filed: November 10, 2023

Publication date: August 29, 2024

Inventors: Krishnan Ananthanarayanan, Martin Langhammer, Om Ji Omer, Bogdan Pasca, Kamlesh Pillai, Pramod Udupa
Iterative Multiplicative Reduction Circuit

Publication number: 20230273770

Abstract: Integrated circuit devices, methods, and circuitry for implementing and using an iterative multiplicative modular reduction circuit are provided. Such circuitry may include polynomial multiplication circuitry and modular reduction circuitry that may operate concurrently. The polynomial multiplication circuitry may multiply a first input value to a second input value to compute a product. The modular reduction circuitry may perform modular reduction on a first component of the product while the polynomial multiplication circuitry is still generating other components of the product.

Type: Application

Filed: March 16, 2023

Publication date: August 31, 2023

Inventors: Sergey Vladimirovich Gribok, Martin Langhammer, Bogdan Pasca
Pipelined Galois Counter Mode Hash Circuit

Publication number: 20230239136

Abstract: Integrated circuits, methods, and circuitry are provided for performing multiplication such as that used in Galois field counter mode (GCM) hash computations. An integrated circuit may include selection circuitry to provide one of several powers of a hash key. A Galois field multiplier may receive the one of the powers of the hash key and a hash sequence and generate one or more values. The Galois field multiplier may include multiple levels of pipeline stages. An adder may receive the one or more values and provide a summation of the one or more values in computing a GCM hash.

Type: Application

Filed: March 31, 2023

Publication date: July 27, 2023

Inventors: Sergey Vladimirovich Gribok, Gregg William Baeckler, Bogdan Pasca, Martin Langhammer
High Performance Systems And Methods For Modular Multiplication

Publication number: 20230026331

Abstract: A circuit system for performing modular reduction of a modular multiplication includes multiplier circuits that receive a first subset of coefficients that are generated by summing partial products of a multiplication operation that is part of the modular multiplication. The multiplier circuits multiply the coefficients in the first subset by constants that equal remainders of divisions to generate products. Adder circuits add a second subset of the coefficients and segments of bits of the products that are aligned with respective ones of the second subset of the coefficients to generate sums.

Type: Application

Filed: September 23, 2022

Publication date: January 26, 2023

Applicant: Intel Corporation

Inventors: Sergey Gribok, Bogdan Pasca, Martin Langhammer
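
The reduction step in the abstract of publication 20230026331 multiplies high-order coefficients by constants equal to remainders and adds the results to the low-order coefficients. The sketch below is a simplified software analogue, not the claimed circuit. The limb width, the split point `low_limbs`, and the function name `partial_reduce` are all assumptions for illustration.

```python
def partial_reduce(product: int, modulus: int, limb_bits: int, low_limbs: int) -> int:
    """Fold the high limbs of product into its low limbs.

    The return value is congruent to product modulo modulus, but is usually
    not yet fully reduced.
    """
    limb_mask = (1 << limb_bits) - 1
    split = limb_bits * low_limbs
    acc = product & ((1 << split) - 1)   # low coefficients pass through unchanged
    high = product >> split
    position = low_limbs
    while high:
        coeff = high & limb_mask
        # Remainder of 2**(limb_bits * position) divided by modulus.
        # For a fixed modulus these constants can be precomputed.
        const = pow(2, limb_bits * position, modulus)
        acc += coeff * const
        high >>= limb_bits
        position += 1
    return acc
```

A final small reduction (here it would be `partial_reduce(...) % modulus`) is still needed to bring the value into the range `[0, modulus)`.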
Techniques For Increasing Activation Sparsity In Artificial Neural Networks

Publication number: 20230021396

Abstract: A method for implementing an artificial neural network in a computing system that comprises performing a compute operation using an input activation and a weight to generate an output activation, and modifying the output activation using a noise value to increase activation sparsity.

Type: Application

Filed: September 27, 2022

Publication date: January 26, 2023

Applicant: Intel Corporation

Inventors: Nihat Tunali, Arnab Raha, Bogdan Pasca, Martin Langhammer, Michael Wu, Deepak Mathaikutty
HYPERBOLIC FUNCTIONS FOR MACHINE LEARNING ACCELERATION

Publication number: 20220230057

Abstract: The present disclosure relates generally to techniques for enhancing recurrent neural networks (RNNs) implemented on an integrated circuit. In particular, approximations of activation functions used in an RNN, such as sigmoid and hyperbolic tangent, may be implemented in an integrated circuit, which may result in increased efficiencies, reduced latency, increased accuracy, and reduced resource consumption involved with implementing machine learning.

Type: Application

Filed: February 22, 2022

Publication date: July 21, 2022

Inventors: Bogdan Pasca, Martin Langhammer
SYSTEMS AND METHODS FOR CALCULATING LARGE POLYNOMIAL MULTIPLICATIONS

Publication number: 20220188072

Abstract: This disclosure is directed to multiplier circuitry that includes a multiplier that is configurable to generate a plurality of subproducts by performing a plurality of multiplication operations involving values having a first precision using a recursive multiplication process in which a second multiplier of the multiplier performs a second plurality of multiplication operations involving values having a second precision that are derived from the values having the first precision.

Type: Application

Filed: December 23, 2021

Publication date: June 16, 2022

Inventors: Martin Langhammer, Bogdan Pasca
Reduced latency multiplier circuitry for very large numbers

Patent number: 11301213

Abstract: An integrated circuit with a large multiplier is provided. The multiplier may be configured to receive large input operands with thousands of bits. The multiplier may be implemented using a multiplier decomposition scheme that is recursively flattened into multiple decomposition levels to expose a tree of adders. The adders may be collapsed into a merged pipelined structure, where partial sums are forwarded from one level to the next while bypassing intervening prefix networks. The final correct sum is not calculated until later. In accordance with the decomposition technique, the partial sums are successively halved, which allows the prefix networks to be smaller from one level to the next. This allows all sums to be calculated at approximately the same pipeline depth, which significantly reduces latency with no or limited pipeline balancing.

Type: Grant

Filed: June 24, 2019

Date of Patent: April 12, 2022

Assignee: Intel Corporation

Inventors: Martin Langhammer, Bogdan Pasca
MACHINE LEARNING TRAINING ARCHITECTURE FOR PROGRAMMABLE DEVICES

Publication number: 20220107783

Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry. In some embodiments, the multiplication is implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry.

Type: Application

Filed: December 16, 2021

Publication date: April 7, 2022

Inventors: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu
Hyperbolic functions for machine learning acceleration

Patent number: 11256978

Abstract: The present disclosure relates generally to techniques for enhancing recurrent neural networks (RNNs) implemented on an integrated circuit. In particular, approximations of activation functions used in an RNN, such as sigmoid and hyperbolic tangent, may be implemented in an integrated circuit, which may result in increased efficiencies, reduced latency, increased accuracy, and reduced resource consumption involved with implementing machine learning.

Type: Grant

Filed: January 5, 2018

Date of Patent: February 22, 2022

Assignee: Intel Corporation

Inventors: Bogdan Pasca, Martin Langhammer
Integrated circuits with modular multiplication circuitry

Patent number: 11249726

Abstract: An integrated circuit is provided with a modular multiplication circuit. The modular multiplication circuit includes an input multiplier for computing the product of two input signals, truncated multipliers for computing another product based on a modulus value and the product, and a subtraction circuit for computing a difference between the two products. An error correction circuit uses the difference to look up an estimated quotient value and to subtract out an integer multiple of the modulus value from the difference in a single step, wherein the integer multiple is equal to the estimated quotient value. A final adjustment stage is used to remove any remaining residual estimation error.

Type: Grant

Filed: September 10, 2019

Date of Patent: February 15, 2022

Assignee: Intel Corporation

Inventors: Martin Langhammer, Bogdan Pasca
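
The structure in the abstract of patent 11249726 (a full product, a second product derived from the modulus, a subtraction, and a final adjustment) resembles Barrett reduction. The sketch below shows classic Barrett reduction as an assumed analogue. It does not reproduce the patented truncated multipliers or the lookup-based error correction. The function name `barrett_mulmod` is hypothetical, and both inputs are assumed to lie in `[0, modulus)`.

```python
def barrett_mulmod(a: int, b: int, modulus: int) -> int:
    """Return (a * b) % modulus using a precomputed reciprocal estimate."""
    k = modulus.bit_length()
    mu = (1 << (2 * k)) // modulus      # depends only on the modulus; precomputable
    product = a * b                     # input multiplication
    q_est = (product * mu) >> (2 * k)   # quotient estimate, never larger than the true quotient
    r = product - q_est * modulus       # difference between the two products
    while r >= modulus:                 # final adjustment removes the residual error
        r -= modulus
    return r
```

With inputs below the modulus, the estimate is off by only a small amount, so the adjustment loop runs at most a few times.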
Machine learning training architecture for programmable devices

Patent number: 11210063

Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry. The hybrid dot-product circuitry has a hard data path that uses digital signal processing (DSP) blocks operating in floating-point mode and a hard/soft data path that uses DSP blocks operating in fixed-point mode operated in conjunction with general purpose soft logic. The hard/soft data path includes 2-element dot-product circuits that feed an adder tree. Results from the hard data path are combined with the adder tree using format conversion and normalization circuitry. Inputs to the hybrid dot-product circuitry may be in the BFLOAT16 format. The hard data path may be in the single precision format. The hard/soft data path uses a custom format that is similar to but different than BFLOAT16.

Type: Grant

Filed: September 27, 2019

Date of Patent: December 28, 2021

Assignee: Intel Corporation

Inventors: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu
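
The abstract of patent 11210063 refers to the BFLOAT16 and single precision formats. BFLOAT16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of a single precision value. The sketch below illustrates that relationship only. The function names are hypothetical, and truncation is used for simplicity where hardware may round instead.

```python
import struct


def float_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit BFLOAT16 pattern for x, truncating the low mantissa bits."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit BFLOAT16 pattern back to a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value
```

Because the exponent field is the same width in both formats, the conversion needs no exponent adjustment.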
Floating-point adder circuitry with subnormal support

Patent number: 11010131

Abstract: An integrated circuit may include a floating-point adder. The adder may be implemented using a dual-path adder architecture having a near path and a far path. The near path may include a leading zero anticipator (LZA), a comparison circuit for comparing an exponent value to an LZA count, and associated circuitry for handling subnormal numbers. The far path may include a subtraction circuit for computing the difference between a received exponent value and a minimum exponent value, at least two shifters for shifting far greater and far lesser mantissa values in parallel, and associated circuitry for handling subnormal numbers. The adder may be dynamically configured to support a first mode that processes FP16 at inputs and outputs, a second mode that processes modified FP16′ inputs, and a third mode that processes FP16′ at inputs and outputs.

Type: Grant

Filed: September 14, 2017

Date of Patent: May 18, 2021

Assignee: Intel Corporation

Inventors: Martin Langhammer, Bogdan Pasca
Reduction operation mapping systems and methods

Patent number: 11003446

Abstract: Adder trees may be constructed for efficient packing of arithmetic operators into an integrated circuit. The operands of the trees may be truncated to pack an integer number of nodes per logic array block. As a result, arithmetic operations may pack more efficiently onto the integrated circuit while providing increased precision and performance.

Type: Grant

Filed: December 14, 2017

Date of Patent: May 11, 2021

Assignee: Intel Corporation

Inventors: Martin Langhammer, Gregg William Baeckler, Bogdan Pasca
