Patents Examined by Michael D Yaary

Processor for fine-grain sparse integer and floating-point operations

Patent number: 11861328

Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products, and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a second multiplier, and a third multiplier, the first activation value by a first least significant sub-word, a second least significant sub-word, and a most significant sub-word; and adding a first resulting partial product and a second resulting partial product. The forming of the second set of products may include forming a first floating point product, the forming of the first floating point product including multiplying, in the first multiplier, a first sub-word of a mantissa of an activation value by a first sub-word of a mantissa of a weight, to form a third partial product.
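The integer path described in the abstract above can be pictured as a wide multiplication carried out by three narrow multipliers whose partial products are shifted and added. The following is a minimal sketch of that idea only; the 4-bit sub-word width, the unsigned operands, and the function names are assumptions for illustration and are not taken from the patent.

```python
def split_subwords(weight: int, width: int = 4, count: int = 3) -> list[int]:
    """Split an unsigned weight into `count` sub-words, least significant first."""
    mask = (1 << width) - 1
    return [(weight >> (i * width)) & mask for i in range(count)]


def multiply_by_subwords(activation: int, weight: int, width: int = 4) -> int:
    """Multiply using three narrow products, then shift and add them."""
    lsw0, lsw1, msw = split_subwords(weight, width, 3)
    p0 = activation * lsw0  # first multiplier: first least significant sub-word
    p1 = activation * lsw1  # second multiplier: second least significant sub-word
    p2 = activation * msw   # third multiplier: most significant sub-word
    # Each partial product is weighted by the position of its sub-word.
    return p0 + (p1 << width) + (p2 << (2 * width))
```

With these assumed widths, any weight below 2**12 gives the same result as a direct multiplication.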

Type: Grant

Filed: December 23, 2020

Date of Patent: January 2, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Ali Shafiee Ardestani, Joseph H. Hassoun

Floating-point computation with threshold prediction for artificial intelligence system

Patent number: 11853715

Abstract: A system comprises a floating-point computation unit configured to perform a dot-product operation in accordance with a first floating-point value and a second floating-point value, and detection logic operatively coupled to the floating-point computation unit. The detection logic is configured to compute a difference between fixed-point summations of exponent parts of the first floating-point value and the second floating-point value and, based on the computed difference, detect the presence of a condition prior to completion of the dot-product operation by the floating-point computation unit. In response to detection of the presence of the condition, the detection logic is further configured to cause the floating-point computation unit to avoid performing a subset of computations otherwise performed as part of the dot-product operation.
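One way to read the abstract above is that exponent sums, which are cheap integer additions, can reveal that a product is too small to affect the result before any mantissa multiplication is done. The sketch below illustrates that reading under stated assumptions: the skip rule (compare each exponent sum with the largest one), the threshold of 24, and the function names are hypothetical and are not taken from the patent.

```python
import math


def exponent_of(x: float) -> int:
    """Binary exponent of x as reported by math.frexp; zero maps to a very small value."""
    return math.frexp(x)[1] if x != 0.0 else -10**6


def dot_with_skip(a, b, threshold=24):
    """Dot product of two non-empty sequences that skips negligible products.

    A product is skipped when its exponent sum is more than `threshold`
    below the largest exponent sum (assumed rule, for illustration only).
    """
    exp_sums = [exponent_of(x) + exponent_of(y) for x, y in zip(a, b)]
    largest = max(exp_sums)
    total, skipped = 0.0, 0
    for x, y, e in zip(a, b, exp_sums):
        if largest - e > threshold:
            skipped += 1  # decided from exponents alone, no multiplication performed
            continue
        total += x * y
    return total, skipped
```

The function returns both the result and the number of skipped products so the saving can be inspected.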

Type: Grant

Filed: November 23, 2020

Date of Patent: December 26, 2023

Assignee: International Business Machines Corporation

Inventors: Mingu Kang, Seonghoon Woo, Eun Kyung Lee

Processing with compact arithmetic processing element

Patent number: 11842166

Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).

Type: Grant

Filed: December 2, 2022

Date of Patent: December 12, 2023

Assignee: Singular Computing LLC

Inventor: Joseph Bates

Matrix calculation device and operation method thereof

Patent number: 11836214

Abstract: A matrix calculation device including a storing unit, a multiply accumulate (MAC) circuit, a pre-fetch circuit, and a control circuit, and an operation method thereof are provided. The storing unit stores a first matrix and a second matrix. The MAC circuit is configured to execute MAC calculation. The pre-fetch circuit pre-fetches at least one column of the first matrix from the storing unit to act as pre-fetch data, pre-fetches at least one row of the second matrix from the storing unit to act as the pre-fetch data, or pre-fetches at least one column of the first matrix and at least one row of the second matrix from the storing unit to act as the pre-fetch data. The control circuit decides whether to perform the MAC calculation on a current column of the first matrix and a current row of the second matrix through the MAC circuit according to the pre-fetch data.
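The abstract above does not say what the control circuit looks for in the pre-fetch data. A plausible criterion, assumed here purely for illustration, is that a column of the first matrix or a row of the second matrix that is entirely zero contributes nothing, so its MAC step can be skipped. The function name and the zero test are hypothetical.

```python
def matmul_skip_zero(a, b):
    """Column-times-row matrix product that skips all-zero columns or rows.

    `a` is a list of rows of the first matrix, `b` a list of rows of the second.
    Returns the product and the number of skipped MAC steps.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    skipped = 0
    for k in range(inner):
        col = [a[i][k] for i in range(rows)]  # pre-fetched column of the first matrix
        row = b[k]                            # pre-fetched row of the second matrix
        if not any(col) or not any(row):
            skipped += 1                      # contribution is zero, skip the MAC step
            continue
        for i in range(rows):
            for j in range(cols):
                out[i][j] += col[i] * row[j]
    return out, skipped
```

Skipping is exact in this sketch because an all-zero column or row adds nothing to any output element.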

Type: Grant

Filed: September 28, 2020

Date of Patent: December 5, 2023

Assignee: NEUCHIPS CORPORATION

Inventors: Chiung-Liang Lin, Chao-Yang Kao

Methods and apparatus for a vector memory subsystem for use with a programmable mixed-radix DFT/IDFT processor

Patent number: 11829322

Abstract: A vector memory subsystem for use with a programmable mixed-radix vector processor (“PVP”) capable of calculating discrete Fourier transform (“DFT/IDFT”) values. In an exemplary embodiment, an apparatus includes a vector memory bank and a vector memory system (VMS) that generates input memory addresses that are used to store input data into the vector memory bank. The VMS also generates output memory addresses that are used to unload vector data from the memory bank. The input memory addresses are used to shuffle the input data in the memory bank based on a radix factorization associated with an N-point DFT, and the output memory addresses are used to unload the vector data from the memory bank to compute radix factors of the radix factorization.
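The abstract above does not give the addressing scheme itself. A common way to shuffle input data according to a radix factorization is mixed-radix digit reversal, sketched below as an assumption rather than as the patented method; the function names are hypothetical.

```python
def digit_reverse(index: int, radices: list[int]) -> int:
    """Reverse the mixed-radix digits of `index`.

    radices[0] is the radix of the least significant digit of `index`;
    in the result that digit becomes the most significant one.
    """
    digits = []
    for r in radices:
        digits.append(index % r)
        index //= r
    result = 0
    for d, r in zip(digits, radices):
        result = result * r + d
    return result


def shuffle_input(data: list, radices: list[int]) -> list:
    """Store data[n] at the digit-reversed address for an N-point transform."""
    memory = [None] * len(data)
    for n, value in enumerate(data):
        memory[digit_reverse(n, radices)] = value
    return memory
```

For a factorization made only of radix-2 stages this reduces to the familiar bit-reversal permutation.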

Type: Grant

Filed: December 16, 2020

Date of Patent: November 28, 2023

Assignee: Marvell Asia Pte, Ltd.

Inventors: Yuanbin Guo, Hong Jik Kim

Compressing like-magnitude partial products in multiply accumulation

Patent number: 11816448

Abstract: An ALU is capable of generating a multiply accumulation by compressing like-magnitude partial products. Given N pairs of multiplier and multiplicand, Booth encoding is used to encode the multipliers into M digits, and M partial products are produced for each pair, with each partial product in a smaller precision than a final product. The partial products resulting from the same encoded multiplier digit position are summed across all the multiplies to produce a summed partial product. In this manner, the partial product summation operations can be advantageously performed in the smaller precision. The M summed partial products are then summed together with an aggregated fixup vector for sign extension. If the N multipliers are equal to a constant, a preliminary fixup vector can be generated based on a predetermined value with adjustment on particular bits, where the predetermined value is determined by the signs of the encoded multiplier digits.
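The summation order described in the abstract above can be sketched in software: Booth-encode each multiplier, add up the partial products that share a digit position across all pairs, and only then apply the position shift. The sketch assumes radix-4 Booth encoding and 8-bit two's complement multipliers, neither of which is stated in the abstract, and it leaves out the fixup vector because Python integers need no explicit sign extension. Function names are hypothetical.

```python
def booth_radix4_digits(multiplier: int, bits: int = 8) -> list[int]:
    """Radix-4 Booth digits (each in -2..2), least significant first."""
    pattern = multiplier & ((1 << bits) - 1)  # two's complement bit pattern

    def bit(k: int) -> int:
        return (pattern >> k) & 1 if k >= 0 else 0

    return [bit(2 * i) + bit(2 * i - 1) - 2 * bit(2 * i + 1)
            for i in range(bits // 2)]


def mac_like_magnitude(multipliers, multiplicands, bits: int = 8) -> int:
    """Multiply-accumulate that sums like-magnitude partial products first."""
    digit_rows = [booth_radix4_digits(m, bits) for m in multipliers]
    total = 0
    for pos in range(bits // 2):
        # All partial products at this digit position share the weight 4**pos.
        summed = sum(row[pos] * x for row, x in zip(digit_rows, multiplicands))
        total += summed << (2 * pos)
    return total
```

Each inner sum involves only small digit-times-multiplicand terms, which is what allows the hardware to work in a narrower precision before the final combination.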

Type: Grant

Filed: January 27, 2021

Date of Patent: November 14, 2023

Assignee: Marvell Asia Pte, Ltd.

Inventor: David Carlson

Implementing fixed-point polynomials in hardware logic

Patent number: 11809795

Abstract: A method implements fixed-point polynomials in hardware logic. In an embodiment the method comprises distributing a defined error bound for the whole polynomial between operators in a data-flow graph for the polynomial and optimizing each operator to satisfy the part of the error bound allocated to that operator. The distribution of errors between operators is updated in an iterative process until a stop condition (such as a maximum number of iterations) is reached.

Type: Grant

Filed: May 18, 2021

Date of Patent: November 7, 2023

Assignee: Imagination Technologies Limited

Inventor: Theo Alan Drane

Reduced dot product computation circuit

Patent number: 11809515

Abstract: Some embodiments provide an IC for implementing a machine-trained network with multiple layers. The IC includes a set of circuits to compute a dot product of (i) a first number of input values computed by other circuits of the IC and (ii) a set of predefined weight values, several of which are zero, with a weight value for each of the input values. The set of circuits includes (i) a dot product computation circuit to compute the dot product based on a second number of inputs and (ii) for each input value, at least two sets of wires for providing the input value to at least two of the dot product computation circuit inputs. The second number is less than the first number. Each input value with a corresponding weight value that is not equal to zero is provided to a different one of the dot product computation circuit inputs.
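The core idea in the abstract above is that a dot product over many inputs needs only as many circuit inputs as there are nonzero weights. The sketch below models that routing in software under stated assumptions: the function names are hypothetical, unused circuit inputs are padded with zeros, and the multiple wire sets per input value are not modeled.

```python
def route_nonzero(inputs, weights, num_circuit_inputs):
    """Give each input with a nonzero weight its own dot-product circuit input."""
    active = [(x, w) for x, w in zip(inputs, weights) if w != 0]
    if len(active) > num_circuit_inputs:
        raise ValueError("more nonzero weights than circuit inputs")
    padding = [(0, 0)] * (num_circuit_inputs - len(active))
    return active + padding


def reduced_dot(inputs, weights, num_circuit_inputs):
    """Dot product computed by a circuit with fewer inputs than input values."""
    routed = route_nonzero(inputs, weights, num_circuit_inputs)
    return sum(x * w for x, w in routed)
```

For example, six input values with two nonzero weights fit a three-input circuit, and the result matches the full six-term dot product.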

Type: Grant

Filed: May 10, 2021

Date of Patent: November 7, 2023

Assignee: PERCEIVE CORPORATION

Inventors: Kenneth Duong, Jung Ko, Steven L. Teig

Method and apparatus for data processing operation

Patent number: 11809836

Abstract: A system includes a fixed-point accumulator for storing numbers in an anchored fixed-point number format, a data interface arranged to receive a plurality of weight values and a plurality of data values represented in a floating-point number format, and logic circuitry. The logic circuitry is configured to: determine an anchor value indicative of a value of a lowest significant bit of the anchored fixed-point number format; convert at least a portion of the plurality of data values to the anchored fixed-point number format; perform MAC operations between the converted portion and respective weight values, using fixed-point arithmetic, to generate an accumulation value in the anchored fixed-point number format; and determine an output element of a layer of a neural network in dependence on the accumulation value.
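In an anchored fixed-point format, the anchor fixes the weight of the lowest bit, so a value is stored as an integer count of units of 2**anchor and accumulation becomes plain integer arithmetic. The sketch below illustrates this under assumptions not found in the abstract: weights are treated as already-quantized integers, conversion rounds to nearest, and the function names are hypothetical.

```python
def to_anchored(value: float, anchor: int) -> int:
    """Convert a float to an integer count of units of 2**anchor."""
    return round(value / 2.0 ** anchor)


def anchored_mac(data, weights, anchor: int) -> int:
    """Accumulate data[i] * weights[i] using integer arithmetic only."""
    acc = 0
    for x, w in zip(data, weights):
        acc += to_anchored(x, anchor) * w  # integer multiply-accumulate
    return acc


def from_anchored(acc: int, anchor: int) -> float:
    """Interpret an anchored accumulator value as a float."""
    return acc * 2.0 ** anchor
```

Because the accumulator is an integer, the order of the additions does not change the result, unlike floating-point accumulation.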

Type: Grant

Filed: August 27, 2020

Date of Patent: November 7, 2023

Assignee: Arm Limited

Inventors: Daren Croxford, Guy Larri

Product-sum operation device, neuromorphic device, and method for using product-sum operation device

Patent number: 11797829

Abstract: The product-sum operation device includes a product operator and a sum operator. The product operator includes a plurality of product operation elements, and an alternative element that, when any of the plurality of product operation elements has malfunctioned, is used instead of the malfunctioning product operation element. Each of the plurality of product operation elements and the alternative element is a resistance change element. The sum operator includes an output detector which detects a sum of outputs from the plurality of product operation elements when the alternative element is not used.

Type: Grant

Filed: December 12, 2018

Date of Patent: October 24, 2023

Assignee: TDK CORPORATION

Inventors: Tatsuo Shibata, Tomoyuki Sasaki

LSTM circuit with selective input computation

Patent number: 11790217

Abstract: An apparatus is described. The apparatus includes a long short term memory (LSTM) circuit having a multiply accumulate circuit (MAC). The MAC circuit has circuitry to rely on a stored product term rather than explicitly perform a multiplication operation to determine the product term if an accumulation of differences between consecutive, preceding input values has not reached a threshold.
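The reuse rule in the abstract above can be sketched for a single weight: keep the last computed product, track how far the input has drifted since then, and multiply again only once the accumulated drift reaches a threshold. The class name, the use of absolute differences, and the reset of the accumulation after each real multiplication are assumptions for illustration.

```python
class ReusingMultiplier:
    """Returns a stored product while the input has drifted less than a threshold."""

    def __init__(self, weight: float, threshold: float):
        self.weight = weight
        self.threshold = threshold
        self.stored_product = None
        self.last_input = None
        self.accumulated_diff = 0.0
        self.multiplications = 0  # counts real multiplications, for inspection

    def product(self, x: float) -> float:
        if self.stored_product is not None:
            # Accumulate the difference between consecutive input values.
            self.accumulated_diff += abs(x - self.last_input)
        self.last_input = x
        if self.stored_product is not None and self.accumulated_diff < self.threshold:
            return self.stored_product  # reuse, no multiplication performed
        self.stored_product = self.weight * x
        self.accumulated_diff = 0.0
        self.multiplications += 1
        return self.stored_product
```

The returned value is approximate whenever the stored product is reused, so the threshold trades accuracy for fewer multiplications.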

Type: Grant

Filed: September 25, 2019

Date of Patent: October 17, 2023

Assignee: Intel Corporation

Inventors: Ram Krishnamurthy, Gregory K. Chen, Raghavan Kumar, Phil Knag, Huseyin Ekin Sumbul

Electronic device performing outlier-aware approximation coding and method thereof

Patent number: 11782498

Abstract: An electronic device includes a coding module that determines whether a parameter of an artificial neural network is an outlier, depending on a value of the parameter and compresses the parameter by truncating a first bit of the parameter when the parameter is a non-outlier and truncating a second bit of the parameter when the parameter is the outlier, and a decoding module that decodes a compressed parameter.

Type: Grant

Filed: February 17, 2020

Date of Patent: October 10, 2023

Assignee: University-Industry Cooperation Group of Kyung Hee University

Inventors: Ik Joon Chang, Ho Nguyen Dong, Minhson Le

Enhanced low precision binary floating-point formatting

Patent number: 11775257

Abstract: Techniques for operating on and calculating binary floating-point numbers using an enhanced floating-point number format are presented. The enhanced format can comprise a single sign bit, six bits for the exponent, and nine bits for the fraction. Using six bits for the exponent can provide an enhanced exponent range that facilitates desirably fast convergence of computing-intensive algorithms and low error rates for computing-intensive applications. The enhanced format can employ a specified definition for the lowest binade that enables the lowest binade to be used for zero and normal numbers; and a specified definition for the highest binade that enables it to be structured to have one data point used for a merged Not-a-Number (NaN)/infinity symbol and remaining data points used for finite numbers. The signs of zero and merged NaN/infinity can be “don't care” terms. The enhanced format employs only one rounding mode, which is for rounding toward nearest up.
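The bit layout in the abstract above (one sign bit, six exponent bits, nine fraction bits) can be decoded as sketched below. Several details are assumptions not given in the abstract: the exponent bias of 31, the choice of the all-zero exponent and fraction as the zero code point, and the choice of the all-ones exponent and fraction as the merged NaN/infinity code point. The function name is hypothetical.

```python
import math

EXP_BITS, FRAC_BITS = 6, 9
BIAS = 31  # assumed bias, not stated in the abstract


def decode(word: int) -> float:
    """Decode a 16-bit word laid out as sign(1) | exponent(6) | fraction(9)."""
    sign = -1.0 if (word >> (EXP_BITS + FRAC_BITS)) & 1 else 1.0
    exponent = (word >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    fraction = word & ((1 << FRAC_BITS) - 1)
    if exponent == 0 and fraction == 0:
        return 0.0       # zero; its sign is a "don't care" (assumed code point)
    if exponent == 0x3F and fraction == 0x1FF:
        return math.nan  # merged NaN/infinity (assumed code point)
    # No subnormals: every other code point is a normal number.
    return sign * (1 + fraction / 2 ** FRAC_BITS) * 2.0 ** (exponent - BIAS)
```

Under these assumptions the lowest binade holds zero plus 511 normal numbers, and the highest binade holds 511 finite numbers plus the merged symbol.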

Type: Grant

Filed: April 6, 2020

Date of Patent: October 3, 2023

Assignee: International Business Machines Corporation

Inventors: Silvia Melitta Mueller, Ankur Agrawal, Bruce Fleischer, Kailash Gopalakrishnan, Dongsoo Lee

Elimination of rounding error accumulation

Patent number: 11775258

Abstract: The present invention extends to methods, systems, and computing system program products for elimination of rounding error accumulation in iterative calculations for Big Data or streamed data. Embodiments of the invention include iteratively calculating a function for a primary computation window of a pre-defined size while incrementally calculating the function for one or more backup computation windows started at different time points and whenever one of the backup computation windows reaches a size of the pre-defined size, swapping the primary computation window and the backup computation window. The result(s) of the function is/are generated by either the iterative calculation performed for the primary computation window or the incremental calculation performed for a backup computation window which reaches the pre-defined size.
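The scheme in the abstract above can be sketched for a sliding-window sum. The primary window is updated iteratively (add the new value, subtract the oldest), which lets rounding error build up over time. A backup window that only ever adds values is started from scratch, and once it holds a full window of data its result replaces the primary one. The sketch uses a single backup window and a sum as the function, both simplifications; the function name is hypothetical.

```python
from collections import deque


def windowed_sums(stream, size: int) -> list[float]:
    """Sliding-window sums with periodic refresh from a backup window."""
    window = deque()
    primary = 0.0
    backup, backup_count = 0.0, 0
    results = []
    for x in stream:
        window.append(x)
        if len(window) <= size:
            primary += x                     # window still filling
        else:
            primary += x - window.popleft()  # iterative update, error can accumulate
        backup += x                          # incremental calculation, additions only
        backup_count += 1
        if backup_count == size:
            primary = backup                 # swap in the freshly computed window
            backup, backup_count = 0.0, 0    # start a new backup window
        if len(window) == size:
            results.append(primary)
    return results
```

In this sketch the primary result is replaced every `size` elements, so error from the subtract-and-add updates cannot build up over more than one window length.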

Type: Grant

Filed: September 13, 2021

Date of Patent: October 3, 2023

Assignee: CLOUD & STREAM GEARS LLC

Inventors: Jizhu Lu, Lihang Lu

Processing unit with mixed precision operations

Patent number: 11768664

Abstract: A graphics processing unit (GPU) implements operations, with associated op codes, to perform mixed precision mathematical operations. The GPU includes an arithmetic logic unit (ALU) with different execution paths, wherein each execution path executes a different mixed precision operation. By implementing mixed precision operations at the ALU in response to designated op codes that delineate the operations, the GPU efficiently increases the precision of specified mathematical operations while reducing execution overhead.
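The abstract above does not list the specific mixed precision operations. A typical example, assumed here for illustration, is a dot product that takes half-precision inputs but accumulates in a wider format. The sketch uses the standard library's binary16 packing to emulate half precision and a Python float as a stand-in for the wider accumulator; the function names are hypothetical.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dot_half_accumulate(a, b) -> float:
    """Half-precision inputs, with products and the running sum rounded to half."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_half(acc + to_half(to_half(x) * to_half(y)))
    return acc


def dot_mixed(a, b) -> float:
    """Half-precision inputs accumulated in a wider format."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += to_half(x) * to_half(y)
    return acc
```

Summing 4096 products of 1.0 shows the difference: the half-precision accumulator stalls at 2048 because 2049 is not representable in half precision, while the wider accumulator reaches 4096.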

Type: Grant

Filed: October 2, 2019

Date of Patent: September 26, 2023

Assignee: Advanced Micro Devices, Inc.

Inventors: Bin He, Michael Mantor, Jiasheng Chen

Processing with compact arithmetic processing element

Patent number: 11768659

Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).

Type: Grant

Filed: September 23, 2020

Date of Patent: September 26, 2023

Assignee: SINGULAR COMPUTING LLC

Inventor: Joseph Bates

Processing with compact arithmetic processing element

Patent number: 11768660

Abstract: A processor or other device, such as a programmable and/or massively parallel processor or other device, includes processing elements designed to perform arithmetic operations (possibly but not necessarily including, for example, one or more of addition, multiplication, subtraction, and division) on numerical values of low precision but high dynamic range (“LPHDR arithmetic”). Such a processor or other device may, for example, be implemented on a single chip. Whether or not implemented on a single chip, the number of LPHDR arithmetic elements in the processor or other device in certain embodiments of the present invention significantly exceeds (e.g., by at least 20 more than three times) the number of arithmetic elements, if any, in the processor or other device which are designed to perform high dynamic range arithmetic of traditional precision (such as 32 bit or 64 bit floating point arithmetic).

Type: Grant

Filed: January 26, 2023

Date of Patent: September 26, 2023

Assignee: SINGULAR COMPUTING LLC

Inventor: Joseph Bates

Low latency long short-term memory inference with sequence interleaving

Patent number: 11769041

Abstract: Systems, apparatuses, and methods for implementing a low latency long short-term memory (LSTM) machine learning engine using sequence interleaving techniques are disclosed. A computing system includes at least a host processing unit, a machine learning engine, and a memory. The host processing unit detects a plurality of sequences which will be processed by the machine learning engine. The host processing unit interleaves the sequences into data blocks and stores the data blocks in the memory. When the machine learning engine receives a given data block, the machine learning engine performs, in parallel, a plurality of matrix multiplication operations on the plurality of sequences in the given data block and a plurality of coefficients. Then, the outputs of the matrix multiplication operations are coupled to one or more LSTM layers.
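The interleaving step in the abstract above can be sketched as packing the same time steps of several sequences into one data block, so that a block carries work for every sequence at once. The block layout, the equal-length requirement, and the function name are assumptions for illustration.

```python
def interleave(sequences, steps_per_block: int):
    """Pack matching time steps of all sequences into shared data blocks."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    blocks = []
    for start in range(0, length, steps_per_block):
        block = []
        for t in range(start, min(start + steps_per_block, length)):
            # One element from every sequence for time step t.
            block.extend(seq[t] for seq in sequences)
        blocks.append(block)
    return blocks
```

With two sequences and two time steps per block, each block holds four values ordered by time step and then by sequence.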

Type: Grant

Filed: October 31, 2018

Date of Patent: September 26, 2023

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Sateesh Lagudu, Lei Zhang, Allen H. Rush

Device and method for filtering signal

Patent number: 11764757

Abstract: This application relates to a signal filtering device. The device includes a memory and a processor. The processor may generate one or more matrices based on a size of a digital filter bank that generates an output signal by dividing an input signal into a plurality of channels and store in the memory each of the generated one or more matrices to which a plurality of digital filter bank coefficients or a plurality of input signals are assigned. The processor may also partially calculate the stored plurality of digital filter bank coefficients and the plurality of signals based on a number of at least some of the plurality of channels, and calculate the calculated digital filter bank coefficients and signals. The processor may further perform a discrete Fourier transform (DFT) on the calculated signal and compensate for a phase of the discrete Fourier transformed signal according to a preset reference.

Type: Grant

Filed: February 9, 2021

Date of Patent: September 19, 2023

Assignee: AGENCY FOR DEFENSE DEVELOPMENT

Inventors: Yeonsoo Jang, Jintae Park, Beomjun Park, Insun Kim, Ghiback Kim

Dynamic gradient calibration method for computing-in-memory neural network and system thereof

Patent number: 11763162

Abstract: A dynamic gradient calibration method for a computing-in-memory neural network is performed to update a plurality of weights in a computing-in-memory circuit according to a plurality of inputs corresponding to a correct answer. A forward operating step includes performing a bit wise multiply-accumulate operation on a plurality of divided inputs and a plurality of divided weights to generate a plurality of multiply-accumulate values, and performing a clamping function on the multiply-accumulate values to generate a plurality of clamped multiply-accumulate values according to a predetermined upper bound value, and comparing the clamped multiply-accumulate values with the correct answer to generate a plurality of loss values. A backward operating step includes performing a partial differential operation on the loss values relative to the weights to generate a weight-based gradient. The weights are updated according to the weight-based gradient.

Type: Grant

Filed: June 16, 2020

Date of Patent: September 19, 2023

Assignee: NATIONAL TSING HUA UNIVERSITY

Inventors: Meng-Fan Chang, Wei-Hsing Huang, Ta-Wei Liu
