Multiplication Patents (Class 708/620)
  • Patent number: 11886377
    Abstract: A representative reconfigurable processing circuit and a reconfigurable arithmetic circuit are disclosed, each of which may include input reordering queues; a multiplier shifter and combiner network coupled to the input reordering queues; an accumulator circuit; and a control logic circuit, along with a processor and various interconnection networks. A representative reconfigurable arithmetic circuit has a plurality of operating modes, such as floating point and integer arithmetic modes, logical manipulation modes, Boolean logic, shift, rotate, conditional operations, and format conversion, and is configurable for a wide variety of multiplication modes. Dedicated routing connecting multiplier adder trees allows multiple reconfigurable arithmetic circuits to be reconfigurably combined, in pair or quad configurations, for larger adders, complex multiplies and general sum of products use, for example.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: January 30, 2024
    Assignee: Cornami, Inc.
    Inventor: Raymond J. Andraka
  • Patent number: 11853715
    Abstract: A system comprises a floating-point computation unit configured to perform a dot-product operation in accordance with a first floating-point value and a second floating-point value, and detection logic operatively coupled to the floating-point computation unit. The detection logic is configured to compute a difference between fixed-point summations of exponent parts of the first floating-point value and the second floating-point value and, based on the computed difference, detect the presence of a condition prior to completion of the dot-product operation by the floating-point computation unit. In response to detection of the presence of the condition, the detection logic is further configured to cause the floating-point computation unit to avoid performing a subset of computations otherwise performed as part of the dot-product operation.
    Type: Grant
    Filed: November 23, 2020
    Date of Patent: December 26, 2023
    Assignee: International Business Machines Corporation
    Inventors: Mingu Kang, Seonghoon Woo, Eun Kyung Lee
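    Illustrative sketch: one plausible reading of the exponent-based detection above is that product terms whose exponent sums fall far enough below the largest exponent sum cannot affect the rounded dot product, so their mantissa multiplications can be skipped. The Python below models only that reading; the function name, the mant_bits + 2 threshold, and the use of math.frexp are assumptions, not the patented detection logic.
      import math

      def dot_with_exponent_skip(xs, ys, mant_bits=24):
          # The exponent of each product is roughly the sum of the operand exponents;
          # comparing these fixed-point sums needs no mantissa multiplier.
          exp_sums = [math.frexp(a)[1] + math.frexp(b)[1] for a, b in zip(xs, ys)]
          max_exp = max(exp_sums)
          acc, skipped = 0.0, 0
          for (a, b), e in zip(zip(xs, ys), exp_sums):
              if max_exp - e > mant_bits + 2:   # term falls below the rounding threshold
                  skipped += 1                  # the full multiply is never performed
                  continue
              acc += a * b
          return acc, skipped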
  • Patent number: 11768661
    Abstract: An integrated circuit includes a logic block configured to perform multiplication operations. The logic block includes a plurality of lookup tables configured to receive a plurality of inputs and generate a first plurality of outputs. Additionally, the logic block includes adding circuitry configured to receive the first plurality of outputs and generate a second plurality of outputs. Furthermore, the logic block includes circuitry configured to receive a portion of the plurality of inputs, determine one or more partial products, and generate a third plurality of outputs.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: September 26, 2023
    Assignee: Intel Corporation
    Inventors: Sadegh Yazdanshenas, Tim Vanderhoek
  • Patent number: 11769478
    Abstract: A digital signal processing system for multiplying a digital value and a digital signal. The digital signal processing system receives the digital value in an encoded format, and multiplies the digital value with the digital signal. The digital value in the encoded format has an offset, which is encoded as a floating point. The disclosure provides a digital processing system that can carry out a multiplication operation with a smaller area, less complexity and/or reduced power usage compared with known multipliers.
    Type: Grant
    Filed: July 15, 2021
    Date of Patent: September 26, 2023
    Assignee: Dialog Semiconductor B.V.
    Inventors: Wessel Harm Lubberhuizen, Johannes Steensma
  • Patent number: 11755903
    Abstract: The present disclosure relates to systems and methods for providing block-wise sparsity in neural networks. In one implementation, a system for providing block-wise sparsity in a neural network may include at least one memory storing instructions and at least one processor configured to execute the instructions to: divide a matrix of weights associated with a neural network into a plurality of blocks; extract non-zero elements from one or more of the plurality of blocks; re-encode the extracted non-zero elements as vectors with associated coordinates of the extracted non-zero elements within the one or more blocks; enforce input sparsity in the neural network corresponding to the associated coordinates; and execute the neural network using the vectors and the enforced input sparsity.
    Type: Grant
    Filed: July 24, 2019
    Date of Patent: September 12, 2023
    Assignee: Alibaba Group Holding Limited
    Inventors: Maohua Zhu, Zhenyu Gu, Yuan Xie
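    Illustrative sketch: a NumPy rendering of the pipeline in the abstract above, dividing the weight matrix into tiles, re-encoding each tile's non-zeros as (value, row, column) entries, and reading the input only at the surviving coordinates. Function names, the tile size, and the dictionary layout are assumptions made for the example.
      import numpy as np

      def blockwise_sparse_encode(W, block=4):
          # Divide W into block x block tiles; keep each tile's non-zeros as (value, row, col).
          encoded = {}
          for bi in range(0, W.shape[0], block):
              for bj in range(0, W.shape[1], block):
                  tile = W[bi:bi + block, bj:bj + block]
                  rows, cols = np.nonzero(tile)
                  encoded[(bi, bj)] = list(zip(tile[rows, cols], rows, cols))
          return encoded

      def blockwise_sparse_matvec(encoded, x, out_dim):
          # Multiply using only stored non-zeros; the input is read only at the
          # coordinates recorded during encoding (the enforced input sparsity).
          y = np.zeros(out_dim)
          for (bi, bj), entries in encoded.items():
              for val, r, c in entries:
                  y[bi + r] += val * x[bj + c]
          return y

      W = np.array([[0., 2., 0., 0.], [0., 0., 0., 0.], [1., 0., 0., 3.], [0., 0., 0., 0.]])
      x = np.array([1., 2., 3., 4.])
      enc = blockwise_sparse_encode(W, block=2)
      assert np.allclose(blockwise_sparse_matvec(enc, x, W.shape[0]), W @ x)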
  • Patent number: 11668748
    Abstract: A test apparatus for testing electrical parameters of a target chip includes: a function generator; a switch matrix module; a plurality of source measurement units (SMUs); at least one of the SMUs is configured to provide power supply for the target chip; at least one of the SMUs is coupled to the switch matrix module; and at least two of said SMUs are test SMUs coupled to ports of the target chip and the function generator.
    Type: Grant
    Filed: January 25, 2022
    Date of Patent: June 6, 2023
    Assignee: SEMITRONIX CORPORATION
    Inventors: Fan Lan, Weiwei Pan, Shenzhi Yang, Yongjun Zheng
  • Patent number: 11669344
    Abstract: Apparatuses and methods are disclosed for an FPGA architecture that may improve processing speed and efficiency in processing less complex operands. Some applications may utilize operands that are less complex, such as operands that are 1, 2, or 4 bits, for example. In some examples, the DSP architecture may skip or avoid processing all received operands or may process a common operand more frequently than other operands. An example apparatus may include a first configurable logic unit configured to receive a first operand and a second operand; a second configurable logic unit configured to receive a third operand and the first calculated operand; a first switch configured to receive the first operand and a fourth operand and to output a first selected operand; and a second switch configured to receive the second calculated operand and the first selected operand.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: June 6, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Gregory Edvenson, Jeremy Chritz, David Hulton
  • Patent number: 11662980
    Abstract: In-memory arithmetic processors for “n-bit” by “n-bit” multiplication, addition, and subtraction operations are disclosed. The in-memory arithmetic processors of the invention obtain the binary result for two input integers, each represented by an “n-bit” binary code, in a single processing step, without the sequential multi-step operations required by conventional binary arithmetic processors. The in-memory arithmetic processors are implemented by a 2-dimensional memory array with X and Y decoding of the two input operand integers.
    Type: Grant
    Filed: November 6, 2019
    Date of Patent: May 30, 2023
    Assignee: FLASHSILICON INCORPORATION
    Inventor: Lee Wang
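    Illustrative sketch: a software model of the one-step idea, with every n-bit by n-bit product pre-stored in a 2-dimensional array so that a multiplication becomes a single lookup addressed by the two operands (standing in for the X and Y decode). The class name and table layout are assumptions; the addition and subtraction modes and the memory-cell implementation are not modeled.
      class InMemoryMultiplier:
          def __init__(self, n_bits=4):
              size = 1 << n_bits
              # 2**(2 * n_bits) stored products: the hardware cost of one-step operation.
              self.table = [[x * y for y in range(size)] for x in range(size)]

          def multiply(self, x, y):
              return self.table[x][y]   # one access, no iterative partial products

      assert InMemoryMultiplier().multiply(13, 11) == 143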
  • Patent number: 11656846
    Abstract: In an example, an apparatus comprises a plurality of execution units and logic, at least partially including hardware logic, to gate at least one of a multiply unit or an accumulate unit in response to an input of value zero. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: May 23, 2023
    Assignee: INTEL CORPORATION
    Inventors: Yaniv Fais, Tomer Bar-On, Jacob Subag, Jeremie Dreyfuss, Lev Faivishevsky, Michael Behar, Amit Bleiweiss, Guy Jacob, Gal Leibovich, Itamar Ben-Ari, Galina Ryvchin, Eyal Yaacoby
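    Illustrative sketch: a software stand-in for the zero-value gating described above; in hardware the multiply and accumulate units would be clock- or power-gated rather than branched around. The function name is an assumption.
      def gated_mac(acc, a, b):
          # When either input is zero the product contributes nothing,
          # so the multiplier and accumulator can stay idle.
          if a == 0 or b == 0:
              return acc
          return acc + a * b

      acc = 0
      for a, b in [(3, 4), (0, 9), (5, 0), (2, 2)]:
          acc = gated_mac(acc, a, b)
      assert acc == 16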
  • Patent number: 11604646
    Abstract: A method of processing data by a processor, the method comprising the steps of: receiving, by the processor, an instruction including an operator code associated with three register references designating registers configured to contain pairs of multiplication operands, an addition operand, and a result register configured to receive an operator result, the operator code designating an operator configured to compute products of the pairs of multiplication operands and add the products with the addition operand; decoding the instruction by an instruction decoder of the processor, to determine the operator to be executed, and the registers containing the operands to be supplied to the operator and the result of the operator; actuating the operator by an arithmetic circuit of the processor, consuming the operands in the registers designated by the register references; and storing the result of the operator in the designated result register.
    Type: Grant
    Filed: December 29, 2021
    Date of Patent: March 14, 2023
    Assignee: Kalray
    Inventor: Benoit Dupont de Dinechin
  • Patent number: 11599341
    Abstract: A program rewrite method executed by a computer, the method includes rewriting a program to output a first output group by performing operations for a first variable among a plurality of variables with a plurality of data types; rewriting the program to output a second output group by performing operations for a second variable among the plurality of variables with a plurality of data types; identifying, from the first output group and the second output group, a third output group that satisfied a predetermined criterion as a result of executing the rewritten programs; determining a data type that corresponds to the third output group as a use data type; and outputting a program in which the use data type is set for each of the plurality of variables.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: March 7, 2023
    Assignee: FUJITSU LIMITED
    Inventor: Masaki Arai
  • Patent number: 11593689
    Abstract: According to one embodiment, a calculating device includes a processor repeating a processing procedure. The processing procedure includes first, second, and third variable updates. The first variable update includes updating an ith entry of a first variable xi by adding an ith entry of a first function to the first variable xi. The second variable update includes updating a second variable yi by adding, to the second variable yi, an arithmetic result of an ith entry of a second function, an ith entry of a third function, and an ith entry of a first element function. The third variable update includes updating a third variable z by adding an ith entry of a second element function to the third variable z. The processor outputs at least one of the first variable xi or a function of the first variable xi.
    Type: Grant
    Filed: March 1, 2019
    Date of Patent: February 28, 2023
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Taro Kanao, Hayato Goto, Kosuke Tatsumura
  • Patent number: 11556311
    Abstract: Technology for reconfigurable input precision in-memory computing is disclosed herein. Reconfigurable input precision allows the bit resolution of input data to be changed to meet the requirements of in-memory computing operations. Voltage sources (that may include DACs) provide voltages that represent input data to memory cell nodes. The resolution of the voltage sources may be reconfigured to change the precision of the input data. In one parallel mode, the number of DACs in a DAC node is used to configure the resolution. In one serial mode, the number of cycles over which a DAC provides voltages is used to configure the resolution. The memory system may include relatively low resolution voltage sources, which avoids the need to have complex high resolution voltage sources (e.g., high resolution DACs). Lower resolution voltage sources can take up less area and/or use less power than higher resolution voltage sources.
    Type: Grant
    Filed: April 16, 2020
    Date of Patent: January 17, 2023
    Assignee: SanDisk Technologies LLC
    Inventors: Wen Ma, Pi-Feng Chiu, Won Ho Choi, Martin Lueker-Boden
  • Patent number: 11474786
    Abstract: Certain aspects provide methods and apparatus for multiplication of digital signals. In accordance with certain aspects, a multiplication circuit may be used to multiply a portion of a first digital input signal with a portion of a second digital input signal via a first multiplier circuit to generate a first multiplication signal, and multiply another portion of the first digital input signal with another portion of the second digital input signal via a second multiplier circuit to generate a second multiplication signal. A third multiplier circuit and multiple adder circuits may be used to generate an output of the multiplication circuit based on the first and second multiplication signals.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: October 18, 2022
    Assignee: QUALCOMM Incorporated
    Inventors: Xia Li, Zhongze Wang, Periannan Chidambaram
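    Illustrative sketch: one well-known way to combine two half-width products with a third multiplier and adders is Karatsuba decomposition, which matches the shape of the abstract above; whether the patented circuit uses exactly this combination is an assumption. Names and the operand width are illustrative.
      def mul_three_multipliers(a, b, half_bits=8):
          mask = (1 << half_bits) - 1
          a_lo, a_hi = a & mask, a >> half_bits
          b_lo, b_hi = b & mask, b >> half_bits
          p_lo = a_lo * b_lo                     # first multiplier circuit
          p_hi = a_hi * b_hi                     # second multiplier circuit
          p_mid = (a_lo + a_hi) * (b_lo + b_hi)  # third multiplier circuit
          cross = p_mid - p_hi - p_lo            # recovered with adder circuits
          return (p_hi << (2 * half_bits)) + (cross << half_bits) + p_lo

      assert mul_three_multipliers(0xBEEF, 0x1234) == 0xBEEF * 0x1234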
  • Patent number: 11461074
    Abstract: Multi-digit binary in-memory multiplication devices are disclosed. The multi-digit binary in-memory multiplication devices of the invention can dramatically reduce the number of operational steps in comparison with conventional binary multiplier devices. In one embodiment, at the expense of additional hardware, the in-memory multiplication device can complete a multiplication in a single step. Consequently, the multi-digit binary in-memory multiplication device improves computation efficiency and saves computation power by eliminating data transport between the Arithmetic Logic Unit (ALU), registers, and memory units.
    Type: Grant
    Filed: July 10, 2020
    Date of Patent: October 4, 2022
    Assignee: FLASHSILICON INCORPORATION
    Inventor: Lee Wang
  • Patent number: 11403431
    Abstract: A cryptographic processing device for cryptographically processing data, having a memory configured to store a first operand and a second operand represented by the data to be cryptographically processed, wherein the first operand and the second operand each correspond to an indexed array of data words, and a cryptographic processor configured to determine, for cryptographically processing the data, a product of the first operand with the second operand by accumulating results of partial multiplications, each partial multiplication comprising the multiplication of a data word of the first operand with a data word of the second operand, wherein the cryptographic processor is configured to perform the partial multiplications in successive blocks of partial multiplications, each block being associated with a result index range and a first operand index range, and each block comprising all partial multiplications between data words of the first operand within the first operand index range and data words of the second operand.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: August 2, 2022
    Assignee: Infineon Technologies AG
    Inventor: Erich Wenger
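    Illustrative sketch: the arithmetic underneath the abstract above is ordinary multi-precision multiplication, where every partial multiplication is one data-word by data-word product accumulated at result index i + j. The patent's contribution is the block-by-block schedule of those products; the straightforward row-by-row schedule below is shown only to make the partial multiplications concrete.
      def multiprecision_mul(a_words, b_words, word_bits=32):
          # a_words, b_words: little-endian arrays of word_bits-wide data words.
          mask = (1 << word_bits) - 1
          result = [0] * (len(a_words) + len(b_words))
          for i, aw in enumerate(a_words):
              carry = 0
              for j, bw in enumerate(b_words):
                  t = result[i + j] + aw * bw + carry   # one partial multiplication
                  result[i + j] = t & mask
                  carry = t >> word_bits
              result[i + len(b_words)] += carry
          return result

      # (2**32 + 1) * (2**32 + 2) as two-word operands:
      assert multiprecision_mul([1, 1], [2, 1]) == [2, 3, 1, 0]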
  • Patent number: 11314483
    Abstract: A system is provided for error resiliency in a bit serial computation. A delay monitor enforces an overall processing duration threshold for bit-serial processing of all iterations of the bit serial computation, while determining a threshold for processing each iteration. At least some iterations correspond to a respective bit in an input bit sequence. A clock generator generates a clock signal for controlling performance of the iterations. Each of the iteration units performs a particular iteration, starting with the Most Significant Bit (MSB) of the input bit sequence and continuing in descending order of bit significance, and, responsive to at least one iteration requiring more time to complete than the current value of the threshold, selectively increases the threshold for that iteration while skipping processing of at least one subsequent iteration whose iteration-level processing duration exceeds the remaining amount of the overall processing duration for all iterations.
    Type: Grant
    Filed: January 8, 2020
    Date of Patent: April 26, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Mingu Kang, Seyoung Kim, Kyu-hyoun Kim, Eun Kyung Lee
  • Patent number: 11301264
    Abstract: Processing cores with the ability to suppress operations based on a contribution estimate for those operations for purposes of increasing the overall performance of the core are disclosed. Associated methods that can be conducted by such processing cores are also disclosed. One such method includes generating a reference value for a composite computation. A complete execution of the composite computation generates a precise output and requires execution of a set of component computations. The method also includes generating a component computation approximation. The method also includes evaluating the component computation approximation with the reference value. The method also includes executing a partial execution of the composite computation using the component computation approximation to produce an estimated output.
    Type: Grant
    Filed: February 11, 2020
    Date of Patent: April 12, 2022
    Assignee: Tenstorrent Inc.
    Inventors: Ljubisa Bajic, Milos Trajkovic, Ivan Hamer, Syed Gilani
  • Patent number: 11277255
    Abstract: This disclosure describes systems on a chip (SOCs) that prevent side channel attacks (SCAs). The SoCs of this disclosure concurrently operate multi-round encryption and decryption datapaths according to a combined sequence of encryption rounds and decryption rounds. An example SoC of this disclosure includes an engine configured to encrypt transmission (Tx) channel data using a multi-round encryption datapath, and to decrypt encrypted received (Rx) channel data using a multi-round decryption datapath. The SoC further includes a security processor configured to multiplex the multi-round encryption datapath against the multi-round decryption datapath on a round-by-round basis to generate a mixed sequence of encryption rounds and decryption rounds, and to control the engine to encrypt the Tx channel data and decrypt the encrypted Rx channel data according to the mixed sequence of encryption rounds and decryption rounds.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: March 15, 2022
    Assignee: Facebook Technologies, LLC
    Inventors: Sudhir Satpathy, Wojciech Stefan Powiertowski, Neeraj Upasani
  • Patent number: 11263353
    Abstract: This disclosure describes systems on a chip (SOCs) that prevent side channel attacks (SCAs). An example SoC of this disclosure includes an engine configured to encrypt transmission (Tx) channel data using an encryption operation set configured with a first polynomial, and to decrypt encrypted received (Rx) channel data using a decryption operation set configured with a second polynomial different from the first polynomial. The SoC further includes a security processor configured to multiplex the encryption operation set against the decryption operation set with a varied sequence of selection inputs on a round-by-round basis to generate a mixed sequence of encryption rounds and decryption rounds, and to control the engine to encrypt the Tx channel data and decrypt the encrypted Rx channel data in a combined datapath according to the mixed sequence of encryption rounds and decryption rounds.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: March 1, 2022
    Assignee: Facebook Technologies, LLC
    Inventors: Sudhir Satpathy, Wojciech Stefan Powiertowski, Neeraj Upasani
  • Patent number: 11262980
    Abstract: A computing accelerator using a lookup table. The accelerator may accelerate floating point multiplications by retrieving the fraction portion of the product of two floating-point operands from a lookup table, or by retrieving the product of two floating-point operands from a lookup table, or it may retrieve dot products of floating point vectors from a lookup table. The accelerator may be implemented in a three-dimensional memory assembly. It may use approximation, the symmetry of a multiplication lookup table, and zero-skipping to improve performance.
    Type: Grant
    Filed: July 1, 2020
    Date of Patent: March 1, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Krishna T. Malladi, Peng Gu, Hongzhong Zheng, Robert Brennan
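    Illustrative sketch: a toy version of two of the techniques the abstract above names, exploiting the symmetry of the multiplication table (store only entries with a <= b) and zero-skipping. The fraction-only lookup for floating-point operands and the three-dimensional memory packaging are not modeled; names and the 4-bit width are assumptions.
      def build_mul_table(bits=4):
          # Symmetry a*b == b*a roughly halves the stored entries.
          return {(a, b): a * b for a in range(1 << bits) for b in range(a, 1 << bits)}

      def lut_multiply(table, a, b):
          if a == 0 or b == 0:      # zero-skipping: no lookup at all
              return 0
          return table[(a, b)] if a <= b else table[(b, a)]

      table = build_mul_table()
      assert lut_multiply(table, 9, 6) == 54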
  • Patent number: 11196544
    Abstract: Systems and methods generate reasonably secure hash values at relatively few CPU cycles per byte. An example method includes, for each of a plurality of packets, injecting the packet into an internal state that represents an internal hash sum, mixing the internal state using multiplication, and shuffling the result of the multiplication so that bytes with highest quality are moved to locations that will propagate most widely in a next multiplication operation. Each of the plurality of packets includes data from an input to be hashed. In some implementations, a last packet for the input is padded. The method may also include further mixing the internal state using multiplication after processing the plurality of packets and providing, to a requesting process, a portion of the final internal state as a hash of the input.
    Type: Grant
    Filed: November 11, 2019
    Date of Patent: December 7, 2021
    Assignee: GOOGLE LLC
    Inventors: Jyrki Antero Alakuijala, Jan Wassenberg
  • Patent number: 11188842
    Abstract: Examples are disclosed relating to obtaining a solution to a multiproduct formula of order m to solve a quantum computing problem comprising a product formula. One example provides a method comprising selecting a set of exponents kj, wherein each kj is a real number and is an exponent in a linear combination of product formulas. Based on the set of exponents kj, a set of pre-factors aj is determined based on an underdetermined solution to an m×M system of linear equations, where M is a number of lower-order product formulas in the linear combination of product formulas. The set of exponents kj and the set of pre-factors aj are used to solve the quantum computing problem comprising the product formula. By minimizing the set of exponents kj and the set of pre-factors aj, sparse solutions to the multiproduct formula are generated, reducing computational time and scaling.
    Type: Grant
    Filed: June 14, 2019
    Date of Patent: November 30, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Vadym Kliuchnikov, Guang Hao Low, Nathan Wiebe
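    Illustrative sketch: a small NumPy model of determining the pre-factors a_j from an underdetermined m x M linear system once the exponents k_j are fixed. The particular equations used here (the pre-factors sum to one and cancel the low-order error terms of a symmetric base formula) are an assumption about the system the abstract refers to, and np.linalg.lstsq simply returns the minimum-norm solution when M exceeds m.
      import numpy as np

      def multiproduct_prefactors(ks, m):
          ks = np.asarray(ks, dtype=float)
          A = np.zeros((m, len(ks)))
          b = np.zeros(m)
          A[0, :] = 1.0                        # sum_j a_j = 1
          b[0] = 1.0
          for q in range(1, m):
              A[q, :] = ks ** (-2 * q)         # cancel the 2q-th order error term (assumed form)
          a, *_ = np.linalg.lstsq(A, b, rcond=None)   # minimum-norm (sparse-leaning) solution
          return a

      a = multiproduct_prefactors([1, 2, 3, 4, 5], m=3)
      assert np.allclose(a.sum(), 1.0)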
  • Patent number: 11163530
    Abstract: Multiplier circuitry includes first combinatorial circuitry configured to perform a combinatorial function, based at least in part on redundant form arithmetic, to generate a first subset of two or more partial products. The two or more partial products are based at least in part on a first input to the multiplier circuitry and a second input to the multiplier circuitry. The multiplier circuitry also includes a carry chain that includes a second combinatorial circuitry configured to generate a second subset of the two or more partial products based at least in part on the first input and the second input. Furthermore, the carry chain includes one or more binary ripple-carry adders configured to generate a product of the multiplier circuitry based at least in part on a sum of the two or more partial products.
    Type: Grant
    Filed: March 22, 2018
    Date of Patent: November 2, 2021
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Gregg William Baeckler
  • Patent number: 11140141
    Abstract: This invention pertains to secure communications between multiple parties and/or secure computation or data transmission between multiple computers or multiple vehicles. This invention provides a secure method for three or more parties to establish one or more shared secrets between all parties. In some embodiments, there are less than 40 parties and in other embodiments there are more than 1 million parties that establish a shared secret. In some embodiments, establishing a shared secret among multiple parties provides a method for a secure conference call. In some embodiments, a shared secret is established with multiple computer nodes across the whole earth to help provide a secure Internet infrastructure that can reliably and securely route Internet traffic. In some embodiments, a shared secret is established so that self-driving vehicles may securely communicate and securely coordinate their motion to avoid collisions.
    Type: Grant
    Filed: September 17, 2018
    Date of Patent: October 5, 2021
    Assignee: Fiske Software LLC
    Inventor: Michael Stephen Fiske
  • Patent number: 11139971
    Abstract: A method for checking results, including (a) determining a first result by conducting an operation g( ) based on test data; (b) determining combined data by performing a combining operation based on the test data and user data; (c) determining a second result by conducting the operation g( ) based on the combined data; and (d) determining whether the second result is indicative of the first result.
    Type: Grant
    Filed: July 20, 2018
    Date of Patent: October 5, 2021
    Assignee: Infineon Technologies AG
    Inventor: Thomas Poeppelmann
  • Patent number: 11119772
    Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one processor having a main register file, the main register file having a plurality of entries for storing data; one or more execution units including a dense math execution unit; and at least one accumulator register file, the at least one accumulator register file associated with the dense math execution unit. The processor in an embodiment is configured to process data in the dense math execution unit where the results of the dense math execution unit are written to a first group of one or more accumulator register file entries, and after a checkpoint boundary is crossed based upon, for example, the number “N” of instructions dispatched after the start of the checkpoint, the results of the dense math execution unit are written to a second group of one or more accumulator register file entries.
    Type: Grant
    Filed: December 6, 2019
    Date of Patent: September 14, 2021
    Assignee: International Business Machines Corporation
    Inventors: Steven J Battle, Brian D. Barrick, Susan E. Eisen, Andreas Wagner, Dung Q. Nguyen, Brian W. Thompto, Hung Q. Le, Kenneth L. Ward
  • Patent number: 11055062
    Abstract: Methods and apparatuses enable a general-purpose low power analog vector-matrix multiplier. A switched capacitor matrix multiplier may comprise a plurality of successive approximate registers (SAR) operating in parallel, each SAR having a SAR digital output; and a plurality of Analog Multiply-and-Accumulate (MAC) units for multiplying and accumulating and scaling bit-wise products of a digital weight matrix with a digital input vector, wherein each MAC unit is connected in series to a SAR of the plurality of SARs.
    Type: Grant
    Filed: February 11, 2019
    Date of Patent: July 6, 2021
    Assignee: AREANNA INC.
    Inventor: Behdad Youssefi
  • Patent number: 11042360
    Abstract: In one embodiment, in a first mode, first and second input operands having a first data type are multiplied using one or more of a plurality of multipliers, and in a second mode, a plurality of input operands having a second data type are multiplied using the plurality of multipliers. Accordingly, multiplier circuitry may process different input data types and share circuitry across the different modes. In some embodiments, in the first mode, products may be converted to a third data type, and in the second mode, multiple products may be concatenated. Values in the third data type, in the first mode, and concatenated values having the second data type, in the second mode, may be added across different multimodal multipliers to form a multiply-accumulator. In some embodiments, the plurality of multiply-accumulators may be configured in series.
    Type: Grant
    Filed: August 5, 2020
    Date of Patent: June 22, 2021
    Assignee: Groq, Inc.
    Inventors: Christopher Aaron Clark, Jonathan Ross
  • Patent number: 10963220
    Abstract: An N×N multiplier may include an N/2×N first multiplier, an N/2×N/2 second multiplier, and an N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If the operands are both less than 2^(N/2), the second or the third multiplier is used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), the first multiplier is used or the second and third multipliers are used to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second and third multipliers are used to multiply the operands.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: March 30, 2021
    Inventors: Ilia Ovsiannikov, Ali Shafiee Ardestani, Joseph Hassoun, Lei Wang
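    Illustrative sketch: a software model of the dispatch rules in the abstract above, choosing which sub-multipliers would be active for a given operand pair; the unused ones would be disabled in hardware rather than simply not called. The mapping of the three Python products onto the N/2×N first multiplier and the two N/2×N/2 multipliers in the "both large" case is an assumption.
      def nxn_multiply(a, b, n=16):
          half = 1 << (n // 2)
          if a == 0 or b == 0:
              return 0                      # all sub-multipliers disabled
          if a < half and b < half:
              return a * b                  # a single N/2 x N/2 multiplier suffices
          if a < half or b < half:
              small, big = (a, b) if a < half else (b, a)
              return small * big            # the N/2 x N multiplier path
          a_lo, a_hi = a % half, a // half
          b_lo, b_hi = b % half, b // half
          p1 = a_hi * b                     # N/2 x N   first multiplier
          p2 = a_lo * b_lo                  # N/2 x N/2 second multiplier
          p3 = a_lo * b_hi                  # N/2 x N/2 third multiplier
          return p1 * half + p3 * half + p2

      assert nxn_multiply(40000, 50000) == 40000 * 50000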
  • Patent number: 10936312
    Abstract: A processor includes a decode unit to decode a packed data alignment plus compute instruction. The instruction is to indicate a first set of one or more source packed data operands that is to include first data elements, a second set of one or more source packed data operands that is to include second data elements, and at least one data element offset. An execution unit, in response to the instruction, is to store a result packed data operand that is to include result data elements that each have a value of an operation performed with a pair of a data element of the first set of source packed data operands and a data element of the second set of source packed data operands. The execution unit is to apply the at least one data element offset to at least a corresponding one of the first and second sets of source packed data operands. The at least one data element offset is to counteract any lack of correspondence between the data elements of each pair in the first and second sets of source packed data operands.
    Type: Grant
    Filed: April 6, 2018
    Date of Patent: March 2, 2021
    Assignee: Intel Corporation
    Inventors: Edwin Jan Van Dalen, Alexander Augusteijn, Martinus C. Wezelenburg, Steven Roos
  • Patent number: 10901694
    Abstract: An arithmetic logic unit (ALU) including a binary, parallel adder and multiplier to perform arithmetic operations is described. The ALU includes an adder circuit coupled to a multiplexer to receive input operands that are directed to either an addition operation or a multiplication operation. During the multiplication operation, the ALU is configured to determine partial product operands based on first and second operands and provide the partial product operands to the adder circuit via the multiplexer, and the adder circuit is configured to provide an output having a value equal to a product of the first and second operands. During an addition operation, the ALU is configured to provide the first and second operands to the adder circuit via the multiplexer, and the adder circuit is configured to provide the output having a value equal to a sum of the first and second operands.
    Type: Grant
    Filed: December 31, 2018
    Date of Patent: January 26, 2021
    Assignee: Micron Technology, Inc.
    Inventor: Fabio Indelicato
  • Patent number: 10871946
    Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit (e.g., an 18×18 or 18×19 multiplier circuit) may be used to support two or more smaller multiplication operations sharing one or two sets of multiplier operands, a complex multiplication, and a sum of two multiplications. If the multiplier products overflow and interfere with one another, correction operations can be performed. Partial products from two or more larger multiplier circuits can be used to combine decomposed partial products. A large multiplier circuit can also be used to support two floating-point mantissa multipliers.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: December 22, 2020
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Gregg William Baeckler, Sergey Gribok, Dmitry N. Denisenko, Bogdan Pasca
  • Patent number: 10853035
    Abstract: In an example, an apparatus comprises a plurality of execution units and logic, at least partially including hardware logic, to gate at least one of a multiply unit or an accumulate unit in response to an input of value zero. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: March 27, 2020
    Date of Patent: December 1, 2020
    Assignee: INTEL CORPORATION
    Inventors: Yaniv Fais, Tomer Bar-On, Jacob Subag, Jeremie Dreyfuss, Lev Faivishevsky, Michael Behar, Amit Bleiweiss, Guy Jacob, Gal Leibovich, Itamar Ben-Ari, Galina Ryvchin, Eyal Yaacoby
  • Patent number: 10853034
    Abstract: An integrated circuit that includes common factor mass multiplier (CFMM) circuitry is provided that multiplies a common factor operand by a large number of multiplier operands. The CFMM circuitry may be implemented as an instance specific version or a non-instance specific version. The instance specific version might also be fully enumerated so that the hardware doesn't have to be redesigned assuming all possible unique multiplier values are implemented. Either version can be formed on a programmable integrated circuit or an application-specific integrated circuit. CFMM circuitry configured in this way can be used to support convolution neural networks or any operation that requires a straight common factor multiply. Any adder component with the CFMM circuitry may be implemented using bit-serial adders. The bit-serial adders may be further connected in a tree in CNN applications to sum together many input streams.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: December 1, 2020
    Assignee: Intel Corporation
    Inventors: Thiam Khean Hah, Jason Gee Hock Ong, Yeong Tat Liew, Carl Ebeling, Vamsi Nalluri
  • Patent number: 10833847
    Abstract: A fast cryptographic hash of an input file using multiplication and permutation operations in a parallel processing environment. An example method includes updating an internal state for each of a plurality of packets, the packets being read from an input file. Updating the state for a packet can include injecting the packet into an internal state, mixing the bits of the internal state using multiplication, and shuffling the result of the multiplication so that bits with highest quality are permuted to locations that will propagate most widely in a next multiplication operation. The method also includes performing a reduction on the internal state and repeating the update of the internal state, the reduction, and the injecting a second time. The method may further include finalizing the internal state and storing a portion of the final internal state as a cryptographic hash of the input file.
    Type: Grant
    Filed: February 22, 2018
    Date of Patent: November 10, 2020
    Assignee: GOOGLE LLC
    Inventors: Jan Wassenberg, Jyrki Antero Alakuijala
  • Patent number: 10795967
    Abstract: A computer-implemented method, computer program product, and apparatus are provided. The method includes substituting N×N first integer elements, among a plurality of first integer elements obtained by dividing first integer data expressing a first integer in a first digit direction, into a first matrix having N rows and N columns. The method further includes substituting each of one or more second integer elements, among a plurality of second integer elements obtained by dividing second integer data expressing a second integer in a second digit direction, into at least one matrix element of a second matrix having N rows and N columns. The method also includes calculating a third matrix that is a product of the first matrix and the second matrix. The method includes outputting each matrix element of the third matrix as a partial product in a calculation of a product of the first integer and the second integer.
    Type: Grant
    Filed: November 7, 2019
    Date of Patent: October 6, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Jun Doi
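    Illustrative sketch: a deliberately simplified illustration of computing a big-integer product with matrix arithmetic, using a single outer product (itself a matrix product of an N×1 and a 1×N matrix) to obtain every pairwise digit partial product at once, followed by a positional carry pass. The patent arranges the digit elements into N×N matrices to feed larger matrix multiplications; this rank-1 version only conveys the idea that matrix elements become partial products.
      import numpy as np

      def multiply_via_matrix(a_digits, b_digits, base=10):
          # Digits are least-significant first.
          a = np.asarray(a_digits, dtype=object)
          b = np.asarray(b_digits, dtype=object)
          partials = np.outer(a, b)        # partials[i, j] = a_i * b_j
          total = 0
          for i in range(len(a_digits)):
              for j in range(len(b_digits)):
                  total += int(partials[i, j]) * base ** (i + j)
          return total

      assert multiply_via_matrix([4, 3, 2, 1], [8, 7, 6, 5]) == 1234 * 5678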
  • Patent number: 10776078
    Abstract: In one embodiment, in a first mode, first and second input operands having a first data type are multiplied using one or more of a plurality of multipliers, and in a second mode, a plurality of input operands having a second data type are multiplied using the plurality of multipliers. Accordingly, multiplier circuitry may process different input data types and share circuitry across the different modes. In some embodiments, in the first mode, products may be converted to a third data type, and in the second mode, multiple products may be concatenated. Values in the third data type, in the first mode, and concatenated values having the second data type, in the second mode, may be added across different multimodal multipliers to form a multiply-accumulator. In some embodiments, the plurality of multiply-accumulators may be configured in series.
    Type: Grant
    Filed: September 23, 2018
    Date of Patent: September 15, 2020
    Assignee: Groq, Inc.
    Inventors: Christopher Aaron Clark, Jonathan Ross
  • Patent number: 10732929
    Abstract: A computing accelerator using a lookup table. The accelerator may accelerate floating point multiplications by retrieving the fraction portion of the product of two floating-point operands from a lookup table, or by retrieving the product of two floating-point operands from a lookup table, or it may retrieve dot products of floating point vectors from a lookup table. The accelerator may be implemented in a three-dimensional memory assembly. It may use approximation, the symmetry of a multiplication lookup table, and zero-skipping to improve performance.
    Type: Grant
    Filed: March 8, 2018
    Date of Patent: August 4, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Krishna T. Malladi, Peng Gu, Hongzhong Zheng, Robert Brennan
  • Patent number: 10528323
    Abstract: A circuit is provided for addition of multiple binary numbers. The circuit includes a 4-to-2-compressor configured for calculating a compressed representation from four binary numbers received via operand inputs of the 4-to-2-compressor. The 4-to-2-compressor includes a first sub-circuit and a second sub-circuit. Each of the first sub-circuit and the second sub-circuit is configured for transmitting a bitwise inverted representation of a compressed representation of three binary numbers.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: January 7, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Manuel Beck, Wilhelm Haller, Ulrich Krauch, Kurt Lind, Friedrich Schroeder
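    Illustrative sketch: the textbook 4-to-2 compressor built from two cascaded full adders, which is the structure the two sub-circuits above refine (the patented sub-circuits transmit bitwise-inverted compressed representations, an optimization not modeled here). Four operand bits plus a carry-in reduce to one sum bit and two carry bits, so each column of a multi-operand addition shrinks from four rows to two.
      def full_adder(a, b, c):
          return a ^ b ^ c, (a & b) | (a & c) | (b & c)

      def compressor_4to2(a, b, c, d, cin):
          s1, cout = full_adder(a, b, c)        # first sub-circuit
          s, carry = full_adder(s1, d, cin)     # second sub-circuit
          return s, carry, cout

      # Defining property: the weighted outputs always equal the weighted inputs.
      for bits in range(32):
          a, b, c, d, cin = [(bits >> i) & 1 for i in range(5)]
          s, carry, cout = compressor_4to2(a, b, c, d, cin)
          assert a + b + c + d + cin == s + 2 * (carry + cout)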
  • Patent number: 10491377
    Abstract: Systems and methods generate reasonably secure hash values at relatively few CPU cycles per byte. An example method includes, for each of a plurality of packets, injecting the packet into an internal state that represents an internal hash sum, mixing the internal state using multiplication, and shuffling the result of the multiplication so that bytes with highest quality are moved to locations that will propagate most widely in a next multiplication operation. Each of the plurality of packets includes data from an input to be hashed. In some implementations, a last packet for the input is padded. The method may also include further mixing the internal state using multiplication after processing the plurality of packets and providing, to a requesting process, a portion of the final internal state as a hash of the input.
    Type: Grant
    Filed: February 28, 2017
    Date of Patent: November 26, 2019
    Assignee: GOOGLE LLC
    Inventors: Jyrki Antero Alakuijala, Jan Wassenberg
  • Patent number: 10445638
    Abstract: Disclosed herein are techniques for performing neural network computations. In one embodiment, an apparatus may include an array of processing elements, the array having a configurable first effective dimension and a configurable second effective dimension. The apparatus may also include a controller configured to determine at least one of: a first number of input data sets to be provided to the array at the first time or a second number of output data sets to be generated by the array at the second time, and to configure, based on at least one of the first number or the second number, at least one of the first effective dimension or the second effective dimension of the array.
    Type: Grant
    Filed: February 28, 2018
    Date of Patent: October 15, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Sundeep Amirineni, Ron Diamant, Randy Huang, Thomas A. Volpe
  • Patent number: 10372416
    Abstract: In an example, an apparatus comprises a plurality of execution units and logic, at least partially including hardware logic, to gate at least one of a multiply unit or an accumulate unit in response to an input of value zero. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: August 6, 2019
    Assignee: INTEL CORPORATION
    Inventors: Yaniv Fais, Tomer Bar-On, Jacob Subag, Jeremie Dreyfuss, Lev Faivishevsky, Michael Behar, Amit Bleiweiss, Guy Jacob, Gal Leibovich, Itamar Ben-Ari, Galina Ryvchin, Eyal Yaacoby
  • Patent number: 10372415
    Abstract: A multiplier circuit includes a partial product generation circuit, a truncation circuit, and a summation circuit. The partial product generation circuit is configured to generate a plurality of partial products for multiplying two values. The truncation circuit is coupled to the partial product generation circuit. The truncation circuit is configured to shorten at least some of the partial products by removing a least significant bit from the at least some of the partial products. The summation circuit is coupled to the truncation circuit. The summation circuit is configured to sum the truncated partial products produced by the truncation circuit.
    Type: Grant
    Filed: May 4, 2017
    Date of Patent: August 6, 2019
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jawaharlal Tangudu, Suvam Nandi, Pooja Sundar, Jaiganesh Balakrishnan
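    Illustrative sketch: a model of the truncation idea above, generating the shifted partial products of an unsigned multiply, clearing low-order bits from them (standing in for the truncation circuit), and summing what remains; the approximate product's error is bounded by the bits that were dropped. The drop count and widths are assumptions, and the patent removes bits from only some partial products.
      def truncated_multiply(a, b, bits=8, drop=2):
          partials = [a << i for i in range(bits) if (b >> i) & 1]
          truncated = [(p >> drop) << drop for p in partials]   # clear low-order bits
          return sum(truncated)

      exact = 173 * 201
      approx = truncated_multiply(173, 201)
      assert 0 <= exact - approx < 8 * (1 << 2)   # at most bits * 2**drop below exact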
  • Patent number: 10353860
    Abstract: A neural network unit. A register holds an indicator that specifies narrow and wide configurations. A first memory holds rows of 2N/N narrow/wide weight words in the narrow/wide configuration. A second memory holds rows of 2N/N narrow/wide data words in the narrow/wide configuration. An array of neural processing units (NPU) is configured as 2N/N narrow/wide NPUs and to receive the 2N/N narrow/wide weight words of rows from the first memory and to receive the 2N/N narrow/wide data words of rows from the second memory in the narrow/wide configuration. In the narrow configuration, the 2N NPUs perform narrow arithmetic operations on the 2N narrow weight words and the 2N narrow data words received from the first and second memories. In the wide configuration, the N NPUs perform wide arithmetic operations on the N wide weight words and the N wide data words received from the first and second memories.
    Type: Grant
    Filed: April 5, 2016
    Date of Patent: July 16, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Terry Parks
  • Patent number: 10347306
    Abstract: A memory module includes a plurality of memory components, an in-memory power manager, and an interface to a host computer over a memory bus. The in-memory power manager is configured to control a transition of a power state of the memory module. The transition of the power state of the memory module includes a direct transition from a low power down state to a maximum power down state.
    Type: Grant
    Filed: August 8, 2016
    Date of Patent: July 9, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Mu-Tien Chang, Dimin Niu, Hongzhong Zheng, Craig Hanson, Sun Young Lim, Indong Kim, Jangseok Choi
  • Patent number: 10338919
    Abstract: A method, computer readable medium, and processor are disclosed for performing matrix multiply and accumulate (MMA) operations. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the steps of: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.
    Type: Grant
    Filed: November 29, 2017
    Date of Patent: July 2, 2019
    Assignee: NVIDIA Corporation
    Inventors: Brent Ralph Boswell, Ming Y. Siu, Jack H. Choquette, Jonah M. Alben, Stuart Oberman
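    Illustrative sketch: the three dot-product steps the abstract above lists (partial products, exponent-based alignment, accumulation), modeled in integer arithmetic. Aligning every term to the minimum exponent keeps this sketch exact; real datapaths align into a fixed-width accumulator and round. The mantissa width and helper names are assumptions.
      import math

      def aligned_dot(xs, ys, mant_bits=24):
          terms = []
          for a, b in zip(xs, ys):
              (ma, ea), (mb, eb) = math.frexp(a), math.frexp(b)
              sig = int(ma * (1 << mant_bits)) * int(mb * (1 << mant_bits))  # partial product
              terms.append((sig, ea + eb - 2 * mant_bits))
          if not terms:
              return 0.0
          e_min = min(e for _, e in terms)
          acc = sum(sig << (e - e_min) for sig, e in terms)   # align, then integer adds
          return math.ldexp(acc, e_min)

      assert aligned_dot([1.5, 2.0], [4.0, 0.25]) == 6.5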
  • Patent number: 10320392
    Abstract: Aspects of the disclosure are directed to sequencing. In accordance with one aspect, sequencing includes creating a one hot list; selecting a current word of the one hot list as a one hot list output; comparing the one hot list output with a current accumulation register value of an accumulation register to produce a logical comparison; inputting the logical comparison to the accumulation register to generate an updated accumulation register value; and outputting the updated accumulation register value to a client unit to enable or disable the client unit.
    Type: Grant
    Filed: August 2, 2018
    Date of Patent: June 11, 2019
    Assignee: QUALCOMM Incorporated
    Inventor: Kelly Wong Hagen
  • Patent number: 10310816
    Abstract: A hardware logic representation of a circuit to implement an operation to perform multiplication by an invariant rational is generated by truncating an infinite single summation array (which is represented in a finite way). The truncation is performed by identifying a repeating section and then discarding all but a finite number of the repeating sections whilst still satisfying a defined error bound. To further reduce the size of the summation array, the binary representation of the invariant rational is converted into canonical signed digit notation prior to creating the finite representation of the infinite array.
    Type: Grant
    Filed: June 27, 2017
    Date of Patent: June 4, 2019
    Assignee: Imagination Technologies Limited
    Inventor: Theo Alan Drane
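    Illustrative sketch: the final step in the abstract above, converting the binary representation of the invariant constant into canonical signed digit (CSD) form, can be written as the standard non-adjacent-form recoding below; fewer non-zero digits mean fewer rows left in the truncated summation array. The function name is an assumption, and the truncation of the infinite array for non-integer rational constants is not modeled.
      def to_csd(n):
          # Returns digits in {-1, 0, +1}, least-significant first,
          # with no two adjacent non-zero digits.
          digits = []
          while n:
              if n & 1:
                  d = 2 - (n & 3)   # +1 if n % 4 == 1, -1 if n % 4 == 3
                  n -= d
              else:
                  d = 0
              digits.append(d)
              n >>= 1
          return digits

      digits = to_csd(0b111011)                                   # 59
      assert sum(d << i for i, d in enumerate(digits)) == 59      # value preserved
      assert sum(map(abs, digits)) < bin(59).count("1")           # fewer add/subtract rows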
  • Patent number: 10297001
    Abstract: Systems and methods may provide a graphics processor that may identify operating conditions under which certain floating point instructions may apply power to fewer hardware resources than when the instructions execute under other operating conditions. The operating conditions may be determined by examining operands used in a given instruction, including the relative magnitudes of the operands and whether the operands may be taken as equal to certain defined values. The floating point instructions may include instructions for an addition operation, a multiplication operation, a compare operation, and/or a fused multiply-add operation.
    Type: Grant
    Filed: December 26, 2014
    Date of Patent: May 21, 2019
    Assignee: Intel Corporation
    Inventors: Subramaniam Maiyuran, Shubh B. Shah, Ashutosh Garg, Jin Xu, Thomas A. Piazza, Jorge F. Garcia Pabon, Michael K. Dwyer