Sum Of Products Generation Patents (Class 708/603)
  • Patent number: 12136032
    Abstract: A technique for stable and fast computation of a variance representing a confidence interval for an estimation result in an estimation apparatus using a neural network including an integrated layer that combines a dropout layer for dropping out part of input data and an FC layer for computing a weight is provided. When input data having a multivariate distribution is supplied to the integrated layer, a data analysis unit 30 determines, based on a numerical distribution of terms formed by respective products of each vector element of the input data and the weight, a data type of each vector element of output data from the integrated layer. An estimated confidence interval computation unit 20 applies an approximate computation method associated with the data type, to analytically compute a variance of each vector element of the output data from the integrated layer based on the input data to the integrated layer.
    Type: Grant
    Filed: November 14, 2017
    Date of Patent: November 5, 2024
    Assignee: DENSO IT LABORATORY, INC.
    Inventor: Jingo Adachi
  • Patent number: 12079733
    Abstract: A non-volatile memory structure capable of storing weights for layers of a deep neural network (DNN) and performing an inferencing operation within the structure is presented. An in-array multiplication can be performed between multi-bit valued inputs, or activations, for a layer of the DNN and multi-bit valued weights of the layer. Each bit of a weight value is stored in a binary valued memory cell of the memory array and each bit of the input is applied as a binary input to a word line of the array for the multiplication of the input with the weight. To perform a multiply and accumulate operation, the results of the multiplications are accumulated by adders connected to sense amplifiers along the bit lines of the array. The adders can be configured to multiple levels of precision, so that the same structure can accommodate weights and activations of 8-bit, 4-bit, and 2-bit precision.
    Type: Grant
    Filed: July 28, 2020
    Date of Patent: September 3, 2024
    Assignee: SanDisk Technologies LLC
    Inventors: Tung Thanh Hoang, Won Ho Choi, Martin Lueker-Boden
  • Patent number: 12008069
    Abstract: In a system with control logic and a processing element array, two modes of operation may be provided. In the first mode of operation, the control logic may configure the system to perform matrix multiplication or 1×1 convolution. In the second mode of operation, the control logic may configure the system to perform 3×3 convolution. The processing element array may include an array of processing elements. Each of the processing elements may be configured to compute the dot product of two vectors in a single clock cycle, and further may accumulate the dot products that are sequentially computed over time.
    Type: Grant
    Filed: November 29, 2023
    Date of Patent: June 11, 2024
    Assignee: Recogni Inc.
    Inventors: Jian hui Huang, Gary S. Goldman
  • Patent number: 12001953
    Abstract: Techniques for mixed-precision data manipulation for neural network data computation are disclosed. A first left group comprising eight bytes of data and a first right group of eight bytes of data are obtained for computation using a processor. A second left group comprising eight bytes of data and a second right group of eight bytes of data are obtained. A sum of products is performed between the first left and right groups and the second left and right groups. The sum of products is performed on bytes of 8-bit integer data. A first result is based on a summation of eight values that are products of the first group's left eight bytes and the second group's left eight bytes. A second result is based on the summation of eight values that are products of the first group's left eight bytes and the second group's right eight bytes. Results are output.
    Type: Grant
    Filed: February 24, 2023
    Date of Patent: June 4, 2024
    Assignee: MIPS Tech, LLC
    Inventors: James Hippisley Robinson, Sanjay Patel
  • Patent number: 11947929
    Abstract: An arithmetic device includes a comparison unit comparing voltage generated with charge stored in a storage unit with a threshold, and outputting an output signal at a timing when the voltage exceeds the threshold, and a timing extension unit extending an interval between timings at each of which the output signal is output.
    Type: Grant
    Filed: July 4, 2019
    Date of Patent: April 2, 2024
    Assignee: SONY CORPORATION
    Inventor: Hiroyuki Yamagishi
  • Patent number: 11900184
    Abstract: A multiply-accumulate device (10) includes: a comparison unit (18) that compares, with a threshold voltage, a voltage generated by an electric charge stored in a storage unit (14), and outputs an output signal at timing at which the voltage exceeds the threshold voltage; and a control circuit (110) that reduces, based on a predetermined set value, a charging current to the storage unit (14) from a plurality of input units (13) connected to the storage unit (14).
    Type: Grant
    Filed: July 5, 2019
    Date of Patent: February 13, 2024
    Assignee: Sony Group Corporation
    Inventors: Yasushi Fujinami, Hiroyuki Yamagishi
  • Patent number: 11625224
    Abstract: An apparatus includes a first holding unit and a second holding unit configured to hold first-type data and second-type data, respectively, a first operation unit configured to execute a first product-sum operation based on the first-type data, a branch unit configured to output an operation result of the first product-sum operation in parallel, a sampling unit configured to sample the operation result and to output a sampling result, and a second operation unit configured to execute a second product-sum operation based on the second-type data and the sampling result.
    Type: Grant
    Filed: April 17, 2019
    Date of Patent: April 11, 2023
    Assignee: CANON KABUSHIKI KAISHA
    Inventors: Tsewei Chen, Masami Kato, Masahiro Ariizumi
  • Patent number: 11614919
    Abstract: A circuit, comprising a first term operation circuit and a second term operation circuit, a third term operation circuit, and a second calculation circuit. Each of the first and the second term operation circuits comprises multiple higher bit operation circuits, a lowest bit operation circuit, and a first calculation circuit. Each of the higher bit operation circuits selectively left-shifts a multiplicand by different bits, outputs the shifted multiplicand, determines a sign of the shifted multiplicand, and left-shifts the shifted multiplicand. The lowest bit operation circuit outputs the multiplicand, and determines a sign of the multiplicand. The first calculation circuit generates a term operation result. The third term operation circuit generates a third term operation result. The second calculation circuit adds the term operation result of the first and second term operation circuits and the third term operation result to generate a total operation result.
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: March 28, 2023
    Assignee: REALTEK SEMICONDUCTOR CORPORATION
    Inventor: Szu-Chun Chang
  • Patent number: 11615307
    Abstract: Techniques for mixed-precision data manipulation for neural network data computation are disclosed. A first left group comprising eight bytes of data and a first right group of eight bytes of data are obtained for computation using a processor. A second left group comprising eight bytes of data and a second right group of eight bytes of data are obtained. A sum of products is performed between the first left and right groups and the second left and right groups. The sum of products is performed on bytes of 8-bit integer data. A first result is based on a summation of eight values that are products of the first group's left eight bytes and the second group's left eight bytes. A second result is based on the summation of eight values that are products of the first group's left eight bytes and the second group's right eight bytes. Results are output.
    Type: Grant
    Filed: August 5, 2020
    Date of Patent: March 28, 2023
    Assignee: MIPS Tech, LLC
    Inventors: James Hippisley Robinson, Sanjay Patel
  • Patent number: 11500629
    Abstract: A multiplying-and-accumulating (MAC) circuit includes a multiplying circuit and an adding circuit. The multiplying circuit includes a first multiplier and a second multiplier, and each of the first multiplier and the second multiplier performs a multiplying calculation for first input data with N bits and second input data with M bits to output multiplication result data with (N+M) bits (where, “N” and “M” are natural numbers which are equal to or greater than one). The adding circuit includes an adder which performs an adding calculation for the multiplication result data of the first multiplier and the multiplication result data of the second multiplier to output addition result data with (N+M) bits.
    Type: Grant
    Filed: January 8, 2021
    Date of Patent: November 15, 2022
    Assignee: SK hynix Inc.
    Inventor: Choung Ki Song
  • Patent number: 11487541
    Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand.
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: November 1, 2022
    Assignee: Intel Corporation
    Inventors: Jesus Corbal, Robert Valentine, Roman S. Dubtsov, Nikita A. Shustrov, Mark J. Charney, Dennis R. Bradford, Milind B. Girkar, Edward T. Grochowski, Thomas D. Fletcher, Warren E. Ferguson
  • Patent number: 11455143
    Abstract: A device (e.g., an integrated circuit chip) includes a dot product processing component, a data alignment component, and an accumulator. The dot product processing component is configured to calculate a dot product of a first group of elements stored in a first storage unit with a second group of elements, wherein: each element of the first group of elements is represented using a first number of bits, each value of a group of values stored in the first storage unit is represented using a second number of bits greater than the first number of bits, and each value of the group of values is stored as split segments across more than one element of the elements of the first group of elements. The data alignment component is configured to receive results of the dot product processing component and modify one or more of the results of the dot product processing component. The accumulator is configured to sum outputs of the data alignment component to at least in part determine a sum of the group of values.
    Type: Grant
    Filed: May 7, 2020
    Date of Patent: September 27, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Thomas Mark Ulrich, Krishnakumar Narayanan Nair, Ehsan Khish Ardestani Zadeh
  • Patent number: 11455142
    Abstract: Embodiments for implementing a fused multiply-multiply-accumulate (“FMMA”) unit by one or more processors in a computing system. Mantissas for two products, an exponent difference of the two products serving as an alignment shift amount for a product of the two products having a smallest exponent, and an alignment shift amount for an addend relative to an alternative product of the two products having a larger exponent may be determined in parallel. The addend may be aligned relative to the alternative product having the larger exponent. The product having the smallest exponent may be aligned relative to the alternative product having the larger exponent according to the alignment shift amount.
    Type: Grant
    Filed: June 5, 2019
    Date of Patent: September 27, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ankur Agrawal, Silvia Mueller, Kailash Gopalakrishnan, Bruce Fleischer, Balaram Sinharoy, Mingu Kang
  • Patent number: 11366663
    Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
    Type: Grant
    Filed: November 9, 2018
    Date of Patent: June 21, 2022
    Assignee: Intel Corporation
    Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich
  • Patent number: 11262982
    Abstract: A computation circuit includes a plurality of processing elements and a common accumulator. The plurality of processing elements are sequentially coupled in series, and perform a multiply and accumulate (MAC) operation on a weight signal and at least one of two or more input signals received in each unit cycle. The common accumulator is sequentially and cyclically coupled to first to Kth processing elements among the plurality of processing elements, and configured to receive a computation value outputted from a processing element coupled thereto among the first to Kth processing elements, and store computation information. K is decided based on values of the two or more input signals and the number of guard bits included in one processing element.
    Type: Grant
    Filed: July 22, 2019
    Date of Patent: March 1, 2022
    Assignees: SK hynix Inc., SK Telecom Co., Ltd.
    Inventors: Yong Sang Park, Seok Joong Hwang
  • Patent number: 11188305
    Abstract: A computation device includes: a data multiplexer configured to output first high-order data as first output data and fifth output data, output first low-order data as third output data and seventh output data, output second high-order data as second output data, output second low-order data as fourth output data, output third high-order data, which is high-order data having a second bit number out of third input data, as sixth output data, and output third low-order data, which is low-order data having the second bit number out of the third input data, as eighth output data when a mode signal indicates a second computation mode; and first to fourth multipliers each of which multiplies two output data.
    Type: Grant
    Filed: May 11, 2018
    Date of Patent: November 30, 2021
    Assignees: Preferred Networks, Inc., Riken
    Inventors: Junichiro Makino, Takayuki Muranushi, Miyuki Tsubouchi, Ken Namura
  • Patent number: 11163532
    Abstract: A method may include obtaining a set of multivariate quadratic polynomials associated with a multivariate quadratic problem and generating an Ising Model connection weight matrix “W” and an Ising Model bias vector “b” based on the multivariate quadratic polynomials. The method may also include providing the matrix “W” and the vector “b” to an annealing system configured to solve problems written according to the Ising Model and obtaining an output from the annealing system that represents a set of integers. The method may also include using the set of integers as a solution to the multivariate quadratic problem.
    Type: Grant
    Filed: January 18, 2019
    Date of Patent: November 2, 2021
    Assignee: FUJITSU LIMITED
    Inventors: Hart Montgomery, Arnab Roy, Ryuichi Ohori, Toshiya Shimizu, Takeshi Shimoyama, Jumpei Yamaguchi
  • Patent number: 10983755
    Abstract: A transcendental calculation unit includes a configuration table that stores a set of constants and provides a selected one of the constants, a power series multiplier that iteratively develops a power series, a coefficient series multiplier and accumulator that develops an accumulated product of the power series and the constant, and a round and normalize stage that rounds the accumulated product and normalizes the rounded product.
    Type: Grant
    Filed: July 22, 2020
    Date of Patent: April 20, 2021
    Inventor: Mitchell K. Alsup
  • Patent number: 10970044
    Abstract: A semiconductor device for performing a sum-of-product computation and an operating method thereof are provided. The semiconductor device includes an inputting circuit, a scaling circuit, a computing memory and an outputting circuit. The inputting circuit is used for receiving a plurality of inputting signals. The inputting signals are voltages or currents. The scaling circuit is connected to the inputting circuit for transforming the inputting signals to be a plurality of compensated signals respectively. The compensated signals are voltages or currents. The computing memory is connected to the scaling circuit. The computing memory includes a plurality of computing cells and the compensated signals are applied to the computing cells respectively. The outputting circuit is connected to the computing memory for reading an outputting signal of the computing cells. The outputting signal is a voltage or a current.
    Type: Grant
    Filed: May 9, 2019
    Date of Patent: April 6, 2021
    Assignee: MACRONIX INTERNATIONAL CO., LTD.
    Inventors: Ming-Hsiu Lee, Chao-Hung Wang
  • Patent number: 10963265
    Abstract: Examples described herein include systems and methods which include an apparatus comprising a plurality of configurable logic units and a plurality of switches, with each switch being coupled to at least one configurable logic unit of the plurality of configurable logic units. The apparatus further includes an instruction register configured to provide respective switch instructions of a plurality of switch instructions to each switch based on a computation to be implemented among the plurality of configurable logic units. For example, the switch instructions may include allocating the plurality of configurable logic units to perform the computation and activating an input of the switch and an output of the switch to couple at least a first configurable logic unit and a second configurable logic unit. In various embodiments, configurable logic units can include arithmetic logic units (ALUs), bit manipulation units (BMUs), and multiplier-accumulator units (MACs).
    Type: Grant
    Filed: April 21, 2017
    Date of Patent: March 30, 2021
    Assignee: Micron Technology, Inc.
    Inventors: Fa-Long Luo, Tamara Schmitz, Jeremy Chritz, Jaime Cummins
  • Patent number: 10936939
    Abstract: An operation processing apparatus includes a memory and a processor coupled to the memory. The processor executes an operation according to an operation instruction, acquires statistical information for a distribution of bits in fixed point data after an execution of an operation for the fixed point data according to an acquisition instruction, and outputs the statistical information to a register designated by the acquisition instruction.
    Type: Grant
    Filed: February 14, 2019
    Date of Patent: March 2, 2021
    Assignee: FUJITSU LIMITED
    Inventors: Mitsuru Tomono, Makiko Ito
  • Patent number: 10929134
    Abstract: A processor to facilitate acceleration of instruction execution is disclosed. The processor includes a plurality of execution units (EUs), each including an instruction decode unit to decode an instruction into one or more operands and opcode defining an operation to be performed at an accelerator, a register file having a plurality of registers to store the one or more operands and an accelerator having programmable hardware to retrieve the one or more operands from the register file and perform the operation on the one or more operands.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: February 23, 2021
    Assignee: Intel Corporation
    Inventors: Radhakrishna Sripada, Peter Yiannacouras, Josh Triplett, Nagabhushan Chitlur, Kalyan Kondapally
  • Patent number: 10802826
    Abstract: An apparatus and method for performing dual concurrent multiplications of packed data elements.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: October 13, 2020
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
  • Patent number: 10747501
    Abstract: Providing efficient floating-point operations using matrix processors in processor-based systems is disclosed. In this regard, a matrix-processor-based device provides a matrix processor comprising a positive partial sum accumulator and a negative partial sum accumulator. As the matrix processor processes pairs of floating-point operands, the matrix processor calculates an intermediate product based on a first floating-point operand and a second floating-point operand and determines a sign of the intermediate product. Based on the sign, the matrix processor normalizes the intermediate product with a partial sum fraction of the positive partial sum accumulator or the negative partial sum accumulator, then adds the intermediate product to the positive sum accumulator or the negative sum accumulator.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: August 18, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Mattheus Cornelis Antonius Adrianus Heddes, Natarajan Vaidhyanathan, Robert Dreyer, Colin Beaton Verrilli, Koustav Bhattacharya
  • Patent number: 10719296
    Abstract: A device for generating sum-of-products data includes an array of variable resistance cells, variable resistance cells in the array each comprising a programmable threshold transistor and a resistor connected in parallel, the array including n columns of cells including strings of series-connected cells and m rows of cells. Control and bias circuitry are coupled to the array, including logic for programming the programmable threshold transistors in the array with thresholds corresponding to values of a weight factor Wmn for the corresponding cell. Input drivers are coupled to corresponding ones of the m rows of cells, the input drivers selectively applying inputs Xm to rows m. Column drivers are configured to apply currents In to corresponding ones of the n columns of cells. Voltage sensing circuits are operatively coupled to the columns of cells.
    Type: Grant
    Filed: January 17, 2018
    Date of Patent: July 21, 2020
    Assignee: MACRONIX INTERNATIONAL CO., LTD.
    Inventors: Feng-Min Lee, Yu-Yu Lin
  • Patent number: 10705839
    Abstract: A processor having a decoder to decode an instruction to generate a decoded instruction; a first source register to store a first plurality of packed signed bytes; a second source register to store a second plurality of packed signed bytes; execution circuitry to execute the decoded instruction, the execution circuitry including: multiplier circuitry to multiply each packed signed byte from the first source register with a corresponding packed signed byte from the second source register to generate temporary products, adder circuitry to add a plurality of sets of the temporary products to generate a plurality of temporary sums; negation and extension circuitry to negate and extend each of the temporary sums to doubleword sums; and accumulation circuitry to add each of the doubleword sums to a doubleword from a third source register to generate final doubleword results; and a packed data destination register to store the final doubleword results.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: July 7, 2020
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney, Jesus Corbal
  • Patent number: 10552154
    Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers. A method comprises: multiplying selected imaginary and real data elements in a first and second source registers to generate a plurality of imaginary products; adding a first subset of the plurality of imaginary products to generate a first temporary result and adding a second subset of the plurality of imaginary products to generate a second temporary result; negating the first temporary result to generate a third temporary result and the second temporary result to generate a fourth temporary result; accumulating the third temporary result with first data to generate a first final result and accumulating the fourth temporary result with second data to generate a second final result; and storing the first final result and second final result.
    Type: Grant
    Filed: September 29, 2017
    Date of Patent: February 4, 2020
    Assignee: Intel Corporation
    Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Mark Charney, Robert Valentine, Binwei Yang
  • Patent number: 10482156
    Abstract: A special-purpose, hardware-based accelerator may include an input subsystem configured to receive first and second vectors as operands of a full dot-product operation. The accelerator may also include a sparsity-aware dot-product engine communicatively coupled to the input subsystem and configured to perform adaptive dot-product processing by: (1) identifying, within the first and second vectors, at least one zero-value element and (2) executing, in response to identifying the zero-value element, a reduced dot-product operation that excludes, relative to the full dot-product operation, at least one mathematical operation in which the zero-value element is an operand. The accelerator may also include an output subsystem that is communicatively coupled to the sparsity-aware dot-product engine and configured to send a result of the reduced dot-product operation to a storage subsystem. Various other accelerators, computing systems, and methods are also disclosed.
    Type: Grant
    Filed: December 29, 2017
    Date of Patent: November 19, 2019
    Assignee: Facebook, Inc.
    Inventors: Abdulkadir Utku Diril, Jong Soo Park, Nadav Rotem, Mikhail Smelyanskiy
  • Patent number: 10459876
    Abstract: A processing element (PE) of a systolic array can perform neural network computations in parallel on two or more sequential data elements of an input data set using the same weight. Thus, two or more output data elements corresponding to an output data set may be generated in parallel. Based on the size of the input data set and an input data type, the systolic array can process a single data element or multiple data elements in parallel.
    Type: Grant
    Filed: January 31, 2018
    Date of Patent: October 29, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Dana Michelle Vantrease, Ron Diamant
  • Patent number: 10249356
    Abstract: A method of obtaining a dot product includes applying a programming signal to a number of capacitive memory devices coupled at a number of junctions formed between a number of row lines and a number of column lines. The programming signal defines a number of values within a matrix. The method further includes applying a vector signal. The vector signal defines a number of vector values to be applied to the capacitive memory devices.
    Type: Grant
    Filed: October 28, 2014
    Date of Patent: April 2, 2019
    Assignee: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
    Inventors: Ning Ge, John Paul Strachan, Jianhua Yang, Miao Hu
  • Patent number: 10228911
    Abstract: An apparatus includes a plurality of arithmetic logic units each having an accumulator and an integer arithmetic unit that receives and performs integer arithmetic operations on integer inputs and accumulates integer results of a series of the integer arithmetic operations into the accumulator as an integer accumulated value. A register is programmable with an indication of a number of fractional bits of the integer accumulated values and an indication of a number of fractional bits of integer outputs. A first bit width of the accumulator is greater than twice a second bit width of the integer outputs. A plurality of adjustment units scale and saturate the first bit width integer accumulated values to generate the second bit width integer outputs based on the indications of the number of fractional bits of the integer accumulated values and outputs programmed into the register.
    Type: Grant
    Filed: April 5, 2016
    Date of Patent: March 12, 2019
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventors: G. Glenn Henry, Terry Parks
  • Patent number: 10055383
    Abstract: A circuit is provided. In an example, the circuit includes a memory array that includes a plurality of memory cells to store a matrix and a plurality of data lines coupled to the plurality of memory cells to provide a first set of values of the matrix. The circuit includes a multiplier coupled to the plurality of data lines to multiply the first set of values by a second set of values to produce a third set of values. A summing unit is included that is coupled to the multiplier to sum the third set of values to produce a sum. The circuit includes a shifting unit coupled to the summing unit to shift the sum and to add the shifted sum to a running total.
    Type: Grant
    Filed: April 28, 2017
    Date of Patent: August 21, 2018
    Assignee: Hewlett Packard Enterprise Development LP
    Inventors: Ali Shafiee Ardestani, Naveen Muralimanohar
  • Patent number: 9893742
    Abstract: An information processing method for a computer for data compression, the method including: performing projective transformation of a first numeric string corresponding to an input signal into a second numeric string which contains more components than the first numeric string and has a sum of squares of components as a predetermined value by using a plurality of projective parameters; and generating a bit string in which bits indicating positive and negative signs of the respective components of an operation result obtained by a vector product operation of the second numeric string obtained by the projective transformation and an observation matrix are arranged.
    Type: Grant
    Filed: September 12, 2017
    Date of Patent: February 13, 2018
    Assignee: FUJITSU LIMITED
    Inventor: Yui Noma
  • Patent number: 9710228
    Abstract: Embodiments disclosed pertain to apparatuses, systems, and methods for performing multi-precision single instruction multiple data (SIMD) operations on integer, fixed point and floating point operands. Disclosed embodiments pertain to a circuit that is capable of performing concurrent multiply, fused multiply-add, rounding, saturation, and dot products on the above operand types. In addition, the circuit may facilitate 64-bit multiplication when Newton-Raphson, divide and square root operations are performed.
    Type: Grant
    Filed: December 29, 2014
    Date of Patent: July 18, 2017
    Assignee: Imagination Technologies Limited
    Inventor: Leonard Rarick
  • Patent number: 9639362
    Abstract: An integrated circuit device comprising at least one instruction processing module arranged to receive a bit-manipulation instruction, and in response to receiving the bit-manipulation instruction to select at least one bit from at least one source data register in accordance with a value of at least one control bit, select from candidate values a manipulation value for the at least one selected bit in accordance with a value of at least one further control bit, and store the selected manipulation value for the at least one selected bit in at least one output data register.
    Type: Grant
    Filed: March 30, 2011
    Date of Patent: May 2, 2017
    Assignee: NXP USA, INC.
    Inventors: Noam Eshel-Goldman, Aviram Amir, Itzhak Barak, Amir Kleen
  • Patent number: 9513912
    Abstract: Methods and controllers for executing an instruction set are provided. In one such method, executing an instruction set includes executing an instruction of one type in the instruction set, executing a context switch instruction, and executing an instruction of a second type in the instruction set. In one such controller, a single machine executes instructions in an instruction set with instructions having an operational code, and instructions that do not have an operational code.
    Type: Grant
    Filed: July 27, 2012
    Date of Patent: December 6, 2016
    Assignee: Micron Technology, Inc.
    Inventors: Luca De Santis, Maria-Luisa Gallese, Emanuele Sirizotti, Walter Di-Francesco
  • Patent number: 9411554
    Abstract: A signed multiplier circuit includes a two-dimensional array of substantially similar logic blocks. Each of the logic blocks is programmable to implement any of four multiply functions of first and second inputs, in which: the first and second inputs are both signed; the first and second inputs are both unsigned; the first input is signed and the second input is unsigned; and the first input is unsigned and the second input is signed. Each logic block includes rows and columns of sub-circuits, e.g., logical AND gates and full adders. One row and one column of each logic block include a programmably invertible AND gate, with the row and column being independently controlled. The ability to program the logic block to perform all four of these functions enables the combination of rows and columns of the logic blocks to build large signed multipliers of virtually any size.
    Type: Grant
    Filed: April 2, 2009
    Date of Patent: August 9, 2016
    Assignee: XILINX, INC.
    Inventors: Steven P. Young, Brian C. Gaide
  • Patent number: 9280633
    Abstract: A method of designing a content-addressable memory (CAM) includes associating CAM cells with a summary circuit. The summary circuit includes a first level of logic gates and a second level of logic gates. The first level of logic gates have inputs each configured to receive an output of a corresponding one of the plurality of CAM cells. The second level of logic gates have inputs each configured to receive an output of a corresponding one of the first level of logic gates. Logic gates in at least one of the first level of logic gates or the second level of logic gates are selected to have an odd number of input pins so that an input pin and an output pin share a layout sub-slot.
    Type: Grant
    Filed: May 16, 2014
    Date of Patent: March 8, 2016
    Assignee: TAIWAN SEMICONDUCTOR MANUFACTURING COMPANY, LTD.
    Inventors: Young Seog Kim, Kuoyuan Hsu, Jacklyn Chang
  • Patent number: 9207909
    Abstract: Polynomial circuitry includes a respective partial product generator for each bit position of each term of a plurality of terms of a polynomial to be evaluated. A respective plurality of adders for each bit position adds partial products of a respective bit position across all of the plurality of terms to provide a respective bit-slice sum. Resulting bit-slice sums are offset from one another according to their respective bit positions. A final adder adds together the respective offset bit-slice sums to provide a result.
    Type: Grant
    Filed: March 8, 2013
    Date of Patent: December 8, 2015
    Assignee: Altera Corporation
    Inventor: Martin Langhammer
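    The bit-slice arrangement described in this abstract can be illustrated in software. The following is a minimal sketch, not the patented circuit: it assumes unsigned operands, treats each polynomial term as a product a * b, and uses a hypothetical function name and `width` parameter chosen only for this example.

```python
def bit_slice_sum_of_products(pairs, width=8):
    """Evaluate sum(a * b for a, b in pairs), one multiplier bit position at a time.

    Assumes unsigned integers with every b < 2**width (illustrative restriction).
    """
    slice_sums = []
    for bit in range(width):
        # Partial product of each term at this bit position:
        # a if the bit of b is set, otherwise 0. Add them across all terms.
        slice_sums.append(sum(a for a, b in pairs if (b >> bit) & 1))
    # Offset each bit-slice sum by its bit position, then do the final addition.
    return sum(s << bit for bit, s in enumerate(slice_sums))
```

    Summing across terms first and shifting afterwards means only one wide addition is needed at the end, which is the point of the offset bit-slice sums.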
  • Patent number: 9170775
    Abstract: A multiplier-accumulator (MAC) block can be programmed to operate in one or more modes. When the MAC block implements at least one multiply-and-accumulate operation, the accumulator value can be zeroed without introducing clock latency or initialized in one clock cycle. To zero the accumulator value, the most significant bits (MSBs) of data representing zero can be input to the MAC block and sent directly to the add-subtract-accumulate unit. Alternatively, dedicated configuration bits can be set to clear the contents of a pipeline register for input to the add-subtract-accumulate unit.
    Type: Grant
    Filed: January 7, 2010
    Date of Patent: October 27, 2015
    Assignee: Altera Corporation
    Inventors: Leon Zheng, Martin Langhammer, Nitin Prasad, Greg Starr, Chiao Kai Hwang, Kumara Tharmalingam
  • Patent number: 8930435
    Abstract: A method for computation, including defining a sequence of n bits that encodes an exponent d, such that no more than a specified number of successive bits in the sequence are the same, initializing first and second registers using a value of a base x that is to be exponentiated, whereby the first and second registers hold respective first and second values, which are successively updated during the computation, successively, for each bit in the sequence computing a product of the first and second values, depending on whether the bit is one or zero, selecting one of the first and second registers, and storing the product in the selected one of the registers, whereby the first and second registers hold respective first and second final values upon completion of the sequence, and returning x^d based on the first and second final values. Related apparatus and methods are also described.
    Type: Grant
    Filed: September 21, 2010
    Date of Patent: January 6, 2015
    Assignee: Cisco Technology Inc.
    Inventors: Yaacov Belenky, Zeev Geyzel
  • Patent number: 8805916
    Abstract: Digital signal processing (“DSP”) circuit blocks are provided that can more easily work together to perform larger (e.g., more complex and/or more arithmetically precise) DSP operations if desired. These DSP blocks may also include redundancy circuitry that facilitates stitching together multiple such blocks despite an inability to use some block (e.g., because of a circuit defect).
    Type: Grant
    Filed: March 3, 2009
    Date of Patent: August 12, 2014
    Assignee: Altera Corporation
    Inventors: Martin Langhammer, Yi-Wen Lin, Keone Streicher
  • Patent number: 8793300
    Abstract: A circuit for calculating a sum of products, each product having a q-bit binary operand and a k-bit binary operand, where k is a multiple of q, includes a q-input carry-save adder (CSA); a multiplexer (10) by input of the adder, having four k-bit channels respectively receiving the value 0, a first (Yi) of the k-bit operands, the second k-bit operand (M[63:0], mi), and the sum of the two k-bit operands, the output of a multiplexer of rank t (where t is between 0 and q-1) being taken into account by the adder with a t-bit left shift; and each multiplexer having first and second path selection inputs, the bits of a first of the q-bit operands being respectively supplied to the first selection inputs, and the bits of the second q-bit operand being respectively supplied to the second selection inputs.
    Type: Grant
    Filed: April 11, 2012
    Date of Patent: July 29, 2014
    Assignee: INSIDE Secure
    Inventor: Michael Niel
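    The multiplexer scheme in this abstract can be modeled in a few lines. This is a sketch under stated assumptions: operands are unsigned, the carry-save adder is replaced by ordinary integer addition, and the names `mux_sum_of_products`, `a`, `b`, `y`, `m`, and `q` are hypothetical labels for the two q-bit operands and the two k-bit operands.

```python
def mux_sum_of_products(a, y, b, m, q=8):
    """Compute a*y + b*m using one four-channel selection per bit rank t.

    a and b are the q-bit operands; y and m are the wider operands.
    """
    channels = (0, y, m, y + m)  # the four multiplexer inputs
    total = 0
    for t in range(q):
        a_t = (a >> t) & 1  # first path selection input for rank t
        b_t = (b >> t) & 1  # second path selection input for rank t
        selected = channels[a_t + 2 * b_t]
        # The output of the rank-t multiplexer enters the adder shifted left by t bits.
        total += selected << t
    return total
```

    Because y + m is available as a precomputed channel, both products advance with a single selection per bit rank rather than two separate partial-product rows.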
  • Patent number: 8649508
    Abstract: A system and method for implementing the Elliptic Curve scalar multiplication method in cryptography, where the Double Base Number System is expressed in decreasing order of exponents and further on using it to determine Elliptic curve scalar multiplication over a finite elliptic curve.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: February 11, 2014
    Assignee: Tata Consultancy Services Ltd.
    Inventor: Natarajan Vijayarangan
  • Patent number: 8645450
    Abstract: Multiplier-accumulator circuitry includes circuitry for forming a plurality of partial products of multiplier and multiplicand inputs, carry-save adder circuitry for adding together the partial products and another input to produce intermediate sum and carry outputs, final adder circuitry for adding together the intermediate sum and carry outputs to produce a final output, and feedback circuitry for applying the final output (typically after some delay, e.g., due to registration of the final output) to the carry-save adder circuitry as said another input. The above circuitry may be implemented in so-called “hard IP” (intellectual property) of a field-programmable gate array (“FPGA”) integrated circuit device. If desired, any overflow from the accumulation performed by the above circuitry may be accumulated in “soft” accumulator-overflow circuitry that is implemented in the general-purpose programmable logic of the FPGA.
    Type: Grant
    Filed: March 2, 2007
    Date of Patent: February 4, 2014
    Assignee: Altera Corporation
    Inventors: Kok Heng Choe, Tony K Ngai, Henry Y. Lui
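    A software model can make the data flow of this abstract concrete: partial products and the fed-back accumulator value are compressed in carry-save form, and a final adder resolves the sum and carry words. The sketch below assumes non-negative integers and omits the overflow handling and pipeline registers; the function names and the `width` parameter are hypothetical.

```python
def carry_save_add(x, y, z):
    """3:2 compression: return (sum_word, carry_word) with sum_word + carry_word == x + y + z."""
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (x & z) | (y & z)) << 1
    return sum_word, carry_word


def multiply_accumulate(multiplicand, multiplier, accumulator, width=16):
    """Return accumulator + multiplicand * multiplier for non-negative inputs."""
    # The fed-back accumulator enters the carry-save stage as the extra input.
    sum_word, carry_word = accumulator, 0
    for bit in range(width):
        if (multiplier >> bit) & 1:
            partial_product = multiplicand << bit
            sum_word, carry_word = carry_save_add(sum_word, carry_word, partial_product)
    # Final adder: the only carry-propagating addition in the loop.
    return sum_word + carry_word
```

    A typical use feeds each result back in as the next accumulator value, for example `acc = multiply_accumulate(a, b, acc)` over a stream of operand pairs.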
  • Patent number: 8543634
    Abstract: A specialized processing block such as a DSP block may be enhanced by including direct connections that allow the block output to be directly connected to either the multiplier inputs or the adder inputs of another such block. A programmable integrated circuit device may include a plurality of such specialized processing blocks. The specialized processing block includes a multiplier having two multiplicand inputs and a product output, an adder having as one adder input the product output of the multiplier, and having a second adder input and an adder output, a direct-connect output of the adder output to a first other one of the specialized processing blocks, and a direct-connect input from a second other one of the specialized processing blocks. The direct-connect input connects a direct-connect output of that second other one of the specialized processing blocks to a first one of the multiplicand inputs.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: September 24, 2013
    Assignee: Altera Corporation
    Inventors: Lei Xu, Volker Mauer, Steven Perry
  • Patent number: 8463837
    Abstract: A method and apparatus for performing bi-linear interpolation and motion compensation including multiply-add operations and byte shuffle operations on packed data in a processor. In one embodiment, two or more lines of 2n+1 content byte elements may be shuffled to generate a first and second packed data respectively including at least a first and a second 4n byte elements including 2n-1 duplicated elements. A third packed data including sums of products is generated from the first packed data and packed byte coefficients by a multiply-add instruction. A fourth packed data including sums of products is generated from the second packed data and elements and packed byte coefficients by another multiply-add instruction. Corresponding sums of products of the third and fourth packed data are then summed, and may be rounded and averaged.
    Type: Grant
    Filed: October 17, 2003
    Date of Patent: June 11, 2013
    Assignee: Intel Corporation
    Inventors: Yen-Kuang Chen, Minerva M. Yeung
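    The shuffle-then-multiply-add pattern in this abstract can be sketched with plain lists standing in for packed registers. This is an illustration only: the weight layout, the `scale` of 8 fractional steps, and all function names are assumptions made for the example, not details taken from the patent.

```python
def shuffle_adjacent(line):
    """[p0, p1, p2, ...] -> [p0, p1, p1, p2, ...]; interior elements are duplicated."""
    out = []
    for left, right in zip(line, line[1:]):
        out.extend((left, right))
    return out


def multiply_add(packed, coeffs):
    """Multiply element-wise, then add each adjacent pair of products."""
    return [packed[i] * coeffs[i] + packed[i + 1] * coeffs[i + 1]
            for i in range(0, len(packed), 2)]


def bilinear_row(top, bottom, fx, fy, scale=8):
    """Interpolate between two lines at fractional offset (fx/scale, fy/scale)."""
    pairs = len(top) - 1
    top_sums = multiply_add(shuffle_adjacent(top),
                            [(scale - fx) * (scale - fy), fx * (scale - fy)] * pairs)
    bottom_sums = multiply_add(shuffle_adjacent(bottom),
                               [(scale - fx) * fy, fx * fy] * pairs)
    total = scale * scale
    # Sum the corresponding sums of products, then round and normalize.
    return [(t + b + total // 2) // total for t, b in zip(top_sums, bottom_sums)]
```

    A line of 2n+1 elements shuffles into 2n adjacent pairs, that is, 4n elements of which 2n-1 are duplicates, which matches the counts given in the abstract.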
  • Patent number: 8457309
    Abstract: Apparatus for ciphering, including a non-volatile memory, which stores a number from which a private cryptographic key, having a complementary public cryptographic key, is derivable, wherein the number is shorter than the private cryptographic key, and a processor, which is configured to receive an instruction indicating that the private cryptographic key is to be applied to data and, responsively to the instruction, to compute the private cryptographic key using the stored number and to perform a cryptographic operation on the data using the private cryptographic key. Related apparatus and methods are also described.
    Type: Grant
    Filed: June 28, 2010
    Date of Patent: June 4, 2013
    Assignee: Cisco Technology, Inc.
    Inventors: Yaacov Belenky, Yaakov (Jordan) Levy
  • Publication number: 20130097212
    Abstract: Disclosed are new approaches to multi-dimensional filtering with a reduced number of memory reads and writes. In one embodiment, a filter includes first and second coefficients. A block of data having width and height each equal to the number of one of the first or second coefficients is read from a memory device. Arrays of values from the block are filtered using the first filter coefficients and the results filtered using the second coefficients. The final result may be optionally blended with another data value and written to a memory device. Registers store results of filtering with the first coefficients. The block of data may be read from a location including a source coordinate. The final result of filtering may be written to a destination coordinate obtained by rotating and/or mirroring the source coordinate. The orientation of arrays filtered using the first coefficients varies according to a rotation mode.
    Type: Application
    Filed: October 14, 2011
    Publication date: April 18, 2013
    Applicant: Vivante Corporation
    Inventors: Mike M. Cai, Huiming Zhang
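    The two-pass filtering described in this abstract can be sketched for a single output value. The example assumes the block width equals the number of first coefficients and the block height equals the number of second coefficients; blending, rotation, and mirroring are omitted, and the function name is hypothetical.

```python
def filter_block(block, first_coeffs, second_coeffs):
    """Filter each row with first_coeffs, then filter those results with second_coeffs."""
    # First pass: one sum of products per row; these play the role of the
    # register-held intermediate results.
    row_results = [sum(c * v for c, v in zip(first_coeffs, row)) for row in block]
    # Second pass: combine the intermediate results into the final value.
    return sum(c * r for c, r in zip(second_coeffs, row_results))
```

    Holding the first-pass results locally means the block is read from memory once and only the final value is written back.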
  • Publication number: 20130054666
    Abstract: A method for predicting a value for a length of a future time interval in which a physical variable changes is described, in which at least one measured value for the length of a past time interval and an instantaneously measured value for a length of an instantaneous time interval are taken into account, m values for lengths of past time intervals being added. A first value precedes the instantaneously measured value by k-1, and an mth value precedes the instantaneously measured value by k-m. The m added values are divided by a value for a length of a past time interval which precedes the instantaneously measured value by k. A ratio of the mentioned values is formed. For determining the value to be predicted, an average error is initially added to the instantaneously measured value, forming a sum. The formed ratio is subsequently applied to this sum.
    Type: Application
    Filed: January 12, 2011
    Publication date: February 28, 2013
    Inventors: Eberhard Boehl, Bernd Becker, Bernard Pawlok
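    One possible reading of the arithmetic in this abstract is sketched below. It is an interpretation, not the applicants' implementation: it assumes `history` lists past interval lengths oldest first, that "precedes by j" means j intervals before the current one, and that "applied" means multiplied. All names are hypothetical.

```python
def predict_interval(history, current, average_error, k, m):
    """Predict a future interval length from past intervals and the current one.

    history[-1] is taken to be the interval immediately before the current one.
    """
    if not (1 <= m < k <= len(history)):
        raise ValueError("need 1 <= m < k <= len(history)")

    def preceding(j):
        # Interval length measured j intervals before the current one.
        return history[-j]

    # Add the m values that precede the current value by k-1, ..., k-m.
    added = sum(preceding(k - i) for i in range(1, m + 1))
    # Divide by the value that precedes the current value by k.
    ratio = added / preceding(k)
    # Add the average error to the current value, then apply the ratio.
    return (current + average_error) * ratio
```

    Under this reading, the ratio captures how the intervals evolved after the reference interval k steps back, and the same proportion is projected forward from the error-corrected current measurement.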